RWTH-OCR - Arabic Handwriting Recognition
The RWTH-OCR system is based on the open-source speech recognition framework RWTH-ASR - The RWTH Aachen University Speech Recognition System, which has been extended by video and image processing methods.
Arabic handwriting recognition: Because Arabic words are written as Pieces of Arabic Words (PAWs), white-space models and low loop transitions are important in Arabic handwriting recognition.
The visualization shows a training alignment of an Arabic word to its corresponding HMM states, trained with an HMM-based system. We use red, green, and blue background colors for HMM states 0, 1, and 2, respectively, from right to left. The position-dependent character model names are written in the upper line, where the white-space models are annotated by 'si' for 'silence'; the state numbers are written in the bottom line. Thus, an HMM state loop appears as no color change, and a state transition appears as a color change.
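The color coding described above can be illustrated with a minimal Python sketch. It is not taken from RWTH-OCR: the alignment format (one `(model_name, state_index)` pair per frame, listed in right-to-left reading order), the function names, and the character model names `ba_B` and `alif_E` are assumptions made for illustration. Only the 'si' label and the red-green-blue mapping for states 0-1-2 come from the description.

```python
# Illustrative sketch only; the data layout and all names except 'si' are hypothetical.

# Background color per HMM state index, as in the description: 0 -> R, 1 -> G, 2 -> B.
STATE_COLORS = {0: "red", 1: "green", 2: "blue"}


def color_alignment(alignment):
    """Return one background color per frame, chosen by the HMM state index."""
    return [STATE_COLORS[state] for _model, state in alignment]


def classify_steps(alignment):
    """Label each step between consecutive frames as 'loop' or 'transition'.

    A step is a loop when model and state are both unchanged,
    otherwise it is a transition to another state.
    """
    return [
        "loop" if prev == curr else "transition"
        for prev, curr in zip(alignment, alignment[1:])
    ]


# Hypothetical alignment in right-to-left reading order.
alignment = [
    ("si", 0), ("si", 0),
    ("ba_B", 0), ("ba_B", 0), ("ba_B", 1), ("ba_B", 2), ("ba_B", 2),
    ("alif_E", 0), ("alif_E", 1), ("alif_E", 2),
]

colors = color_alignment(alignment)
steps = classify_steps(alignment)
print(colors)
print(steps.count("loop"), "loops,", steps.count("transition"), "transitions")
```

In this sketch a step counts as a loop only when both the model and the state index are unchanged, so a model boundary between two frames with the same state index is still reported as a transition even though the background color would stay the same.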
Published papers related to this work:
- P. Dreuw, G. Heigold, and H. Ney. Confidence-Based Discriminative Training for Model Adaptation in Offline Arabic Handwriting Recognition. In International Conference on Document Analysis and Recognition (ICDAR), pages 596-600, Barcelona, Spain, July 2009.
- P. Dreuw, D. Rybach, C. Gollan, and H. Ney. Writer Adaptive Training and Writing Variant Model Refinement for Offline Arabic Handwriting Recognition. In International Conference on Document Analysis and Recognition (ICDAR), pages 21-25, Barcelona, Spain, July 2009.
- P. Dreuw, S. Jonas, and H. Ney. White-Space Models for Offline Arabic Handwriting Recognition. In International Conference on Pattern Recognition (ICPR), Tampa, Florida, USA, December 2008.
Some interesting links:
- Databases:
- Arabic, "A New Comprehensive Database of Handwritten Arabic Words, Numbers, and Signatures used for OCR Testing", by Nawwaf Kharma, Maher Ahmed, and Rabab Ward, 1999, http://users.encs.concordia.ca/~kharma/ExchangeWeb/Databases/ArabicDBases/
- Arabic, "IFN/ENIT-Database of Handwritten Arabic Words", by M. Pechwitz, S. Snoussi Maddouri, V. Märgner, N. Ellouze, and H. Amiri, 2002, http://www.ifnenit.com
- Arabic, "Data-Base for Arabic Handwritten Text Recognition Research", by S. Al-Ma'adeed, D. Elliman, and C. Higgins, 2004, http://www.cs.nott.ac.uk/~cah/Databases.htm
- Farsi, "Isolated Farsi/Arabic Handwritten Character DataBase (IFHCDB)", http://ele.aut.ac.ir/imageproc/downloads/IFHCDB.htm
- Writing:
- Ligature UTF-8 problems: http://homepage2.nifty.com/PAF00305/lib/arabic-lig-alpha.html
- Arabic script: http://www.omniglot.com/writing/arabic.htm
- Nastaliq symbols: http://en.wikipedia.org/wiki/Nastaliq
- Tools:
Philippe Dreuw. Last modified: Wed Aug 5 15:42:14 CEST 2008. Created: Tue Sep 22 18:04:32 CET 2007.

