library_name: transformers
tags:

  • OCR
  • handwritten-text-recognition
  • multilingual
  • Arabic
  • English

Model Card for TR-OCR Large AR/EN Handwritten

This model is a fine-tuned version of TrOCR Large that specializes in handwritten text recognition for Arabic and English.

Model Details

Model Description

This model is a fine-tuned version of Microsoft's TrOCR Large, adapted for handwritten text recognition in Arabic and English using the KHATT and IAM Handwriting datasets.

  • Developed by: Me and my colleague Ahmed Wahdan
  • Model type: OCR (Optical Character Recognition)
  • Language(s) (NLP): Arabic, English
  • Fine-tuned from model: Microsoft TrOCR Large

Uses

Direct Use

This model is intended for handwritten text recognition in Arabic and English documents.

Out-of-Scope Use

The model should not be used for:

  • Languages other than Arabic and English
  • Printed text recognition
  • Non-text image analysis

Bias, Risks, and Limitations

Limitations

  • Only supports Arabic and English languages
  • Performance may vary with different handwriting styles
  • Not tested on all possible handwriting variations

Recommendations

The model was trained only on Arabic and English handwriting, so users should expect degraded accuracy on other languages and on printed text.
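Because accuracy may vary with handwriting style, it can help to measure it on a small labelled sample of your own data before relying on the model. The sketch below computes character error rate (CER), a common OCR metric, in plain Python. The function names `edit_distance` and `cer` are illustrative and not part of this model or of transformers, and the NFC normalization step is an assumption about how Arabic text should be compared.

```python
import unicodedata


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses) -> float:
    """Total character edits divided by total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    # Assumption: compare NFC-normalized text so that equivalent Unicode
    # encodings of the same Arabic characters are not counted as errors.
    refs = [unicodedata.normalize("NFC", r) for r in references]
    hyps = [unicodedata.normalize("NFC", h) for h in hypotheses]
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_chars
```

Here `references` would be the ground-truth transcriptions and `hypotheses` the model's outputs for the same images; a lower value means fewer character-level mistakes.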

How to Get Started with the Model

# Sample code to load the model
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

processor = TrOCRProcessor.from_pretrained("David-Magdy/TR_OCR_LARGE")
model = VisionEncoderDecoderModel.from_pretrained("David-Magdy/TR_OCR_LARGE")
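The snippet above only loads the processor and model. A minimal sketch of running recognition on one image might look like the following. The helper names `load_trocr` and `recognize_line` and the `max_new_tokens` default are assumptions made for illustration, not part of this model card or of the transformers API. The image is assumed to be a PIL image of a single cropped text line, which is the input TrOCR models are typically used with.

```python
def load_trocr(model_id="David-Magdy/TR_OCR_LARGE"):
    """Load the processor and model (downloads weights on first use)."""
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return processor, model


def recognize_line(image, processor, model, max_new_tokens=64):
    """Return the predicted text for one handwritten text-line image.

    `image` is assumed to be a PIL.Image; `max_new_tokens` is an
    illustrative cap on the output length, not a tuned value.
    """
    # The image processor expects 3-channel input, so convert
    # grayscale or RGBA scans to RGB first.
    rgb = image.convert("RGB")
    pixel_values = processor(images=rgb, return_tensors="pt").pixel_values
    generated_ids = model.generate(pixel_values, max_new_tokens=max_new_tokens)
    # batch_decode returns one string per image in the batch.
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

With Pillow installed, calling `processor, model = load_trocr()` and then `recognize_line(Image.open("line.png"), processor, model)` would return the transcription as a string; `line.png` is a hypothetical file name. Full-page scans would likely need to be segmented into lines first.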