library_name: transformers
tags:
- OCR
- handwritten-text-recognition
- multilingual
- Arabic
- English
Model Card for TR-OCR Large AR/EN Handwritten
This is a fine-tuned version of TrOCR Large specialized in handwritten text recognition for Arabic and English.
Model Details
Model Description
The model is a fine-tuned version of Microsoft's TrOCR Large, adapted for handwritten text recognition in Arabic and English using the KHATT and IAM Handwriting datasets.
- Developed by: Me and my colleague Ahmed Wahdan
- Model type: OCR (Optical Character Recognition)
- Language(s) (NLP): Arabic, English
- Finetuned from model: microsoft/trocr-large-handwritten (Microsoft TrOCR Large)
Model Sources
- Repository: Kaggle notebook (yet to be provided)
- Original Model Paper: TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models
Uses
Direct Use
This model is intended for handwritten text recognition in Arabic and English documents.
Out-of-Scope Use
The model should not be used for:
- Languages other than Arabic and English
- Printed text recognition
- Non-text image analysis
Bias, Risks, and Limitations
Limitations
- Only supports Arabic and English languages
- Performance may vary with different handwriting styles
- Not tested on all possible handwriting variations
Recommendations
Users should be aware that the model is specifically trained for Arabic and English handwritten text and may not perform well on other languages or printed text.
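One way to act on this recommendation is to measure the character error rate (CER) on a handful of your own labelled handwriting samples before relying on the model. The sketch below is not taken from this model card, which reports no metrics; the function names `edit_distance` and `cer` are illustrative, and the code assumes you already have reference transcriptions and model outputs as plain strings.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER: total character edits / total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars
```

Because the distance is computed over Unicode code points, Arabic diacritics and different normalization forms count as separate characters; normalizing both strings the same way before scoring is probably worthwhile.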
How to Get Started with the Model
# Sample code to load the model
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
processor = TrOCRProcessor.from_pretrained("David-Magdy/TR_OCR_LARGE")
model = VisionEncoderDecoderModel.from_pretrained("David-Magdy/TR_OCR_LARGE")
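Once the processor and model are loaded, a transcription step might look like the sketch below. It is not code from the model's authors: the helper name `recognize_line`, the `max_new_tokens` default, and the file name in the usage comment are assumptions, and it assumes the input is an RGB image cropped to a single line of text, which is what TrOCR checkpoints generally expect.

```python
def recognize_line(image, processor, model, max_new_tokens=64):
    """Transcribe one cropped text-line image (an RGB PIL image).

    `processor` and `model` are the TrOCRProcessor and
    VisionEncoderDecoderModel objects loaded above.
    """
    # The processor resizes and normalizes the image into a pixel_values tensor.
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    # The decoder generates token ids autoregressively from the encoded image.
    generated_ids = model.generate(pixel_values, max_new_tokens=max_new_tokens)
    # Decode ids back to text, dropping special tokens such as BOS/EOS.
    text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return text.strip()


# Usage (requires Pillow; "line.png" is a hypothetical file name):
#   from PIL import Image
#   image = Image.open("line.png").convert("RGB")
#   print(recognize_line(image, processor, model))
```

Full-page scans would first need to be segmented into lines by a separate step, which this sketch does not cover.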
Model tree for David-Magdy/TR_OCR_LARGE
Base model: microsoft/trocr-large-handwritten