Handwritten Arabic text line segmentation using affinity propagation
Title | Handwritten Arabic text line segmentation using affinity propagation |
Publication Type | Conference Papers |
Year of Publication | 2010 |
Authors | Kumar J, Abd-Almageed W, Kang L, Doermann D |
Conference Name | Proceedings of the 9th IAPR International Workshop on Document Analysis Systems |
Date Published | 2010 |
Publisher | ACM |
Conference Location | New York, NY, USA |
ISBN Number | 978-1-60558-773-8 |
Keywords | affinity propagation, arabic, arabic documents, breadth-first search, clustering, dijkstra's shortest path algorithm, handwritten documents, line detection, text line segmentation |
Abstract | In this paper, we present a novel graph-based method for extracting handwritten text lines in monochromatic Arabic document images. Our approach consists of two steps: coarse text line estimation using primary components, which define the line, and assignment of diacritic components, which are more difficult to associate with a given line. We first estimate the local orientation at each primary component to build a sparse similarity graph. We then use a shortest path algorithm to compute similarities between non-neighboring components. From this graph, we obtain coarse text lines using two estimates obtained from affinity propagation and breadth-first search. In the second step, we assign secondary components to each text line. The proposed method is very fast and robust to non-uniform skew and character size variations, normally present in handwritten text lines. We evaluate our method using a pixel-matching criterion and report 96% accuracy on a dataset of 125 Arabic document images. We also present a proximity analysis on datasets generated by artificially decreasing the spacing between text lines to demonstrate the robustness of our approach. |
URL | http://doi.acm.org/10.1145/1815330.1815348 |
DOI | 10.1145/1815330.1815348 |
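
The graph stage summarized in the abstract (a sparse similarity graph built from local orientations, shortest paths between non-neighboring components, and a breadth-first grouping) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `max_dist` and `max_dev` thresholds, and the edge-weight formula are all assumptions made for illustration, components are reduced to centroids with a given orientation, and the affinity propagation estimate and the diacritic assignment step are omitted.

```python
import heapq
import math
from collections import deque


def angular_dev(a, b):
    """Smallest angle between two undirected directions, in [0, pi/2]."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)


def build_sparse_graph(centroids, angles, max_dist=3.0, max_dev=0.35):
    """Link nearby components whose connecting segment follows both local
    orientations. Thresholds and the weight formula are hypothetical.
    Edge weights are dissimilarities: lower means more similar."""
    graph = {i: {} for i in range(len(centroids))}
    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            dx = centroids[j][0] - centroids[i][0]
            dy = centroids[j][1] - centroids[i][1]
            dist = math.hypot(dx, dy)
            if dist > max_dist:
                continue  # keep the graph sparse
            link = math.atan2(dy, dx)
            dev = max(angular_dev(link, angles[i]), angular_dev(link, angles[j]))
            if dev <= max_dev:
                graph[i][j] = graph[j][i] = dist * (1.0 + dev)
    return graph


def dijkstra(graph, source):
    """Path cost from source to every reachable component, which extends
    the pairwise dissimilarity to components that are not direct neighbors."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def bfs_lines(graph):
    """Group components into coarse lines as connected components (BFS)."""
    seen, lines = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), []
        while queue:
            u = queue.popleft()
            comp.append(u)
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        lines.append(sorted(comp))
    return lines


# Two horizontal toy "text lines", 2.5 units apart, all orientations 0 rad.
centroids = [(0, 0), (2, 0), (4, 0), (0, 2.5), (2, 2.5), (4, 2.5)]
angles = [0.0] * len(centroids)
graph = build_sparse_graph(centroids, angles)
print(bfs_lines(graph))
print(dijkstra(graph, 0))
```

In this toy input, vertically adjacent components are within `max_dist` of each other but are rejected by the orientation check, so the two lines stay separate. In the paper's setting the shortest-path costs would feed a clustering step such as affinity propagation; here they are only computed and printed.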