Arvind, KR and Pati, Peeta Basa and Ramakrishnan, AG (2006) Automatic text block separation in document images. In: 4th International Conference on Intelligent Sensing and Information Processing, Dec 15-18, 2006, Bangalore, India, pp. 53-58.
Separation of printed text blocks from non-text areas, such as signatures, handwritten text, logos and other symbols, is a necessary first step for an OCR system that recognizes printed text. In the present work, we compare the efficacy of several feature-classifier combinations for this separation task. We use the length-normalized horizontal projection profile (HPP) as the starting point, on the assumption that printed text blocks contain lines of text which generate HPPs with some regularity. This assumption is demonstrated to be valid. Our features are the HPP and two transformed versions of it, namely the eigen and Fisher profiles. Four well-known classifiers, namely nearest neighbor, linear discriminant function, SVMs and artificial neural networks, are considered, and the performance of each classifier in combination with the above features is compared. A sequential floating feature selection technique is adopted to improve the performance of the separation task. The results give an average accuracy of about 96%.
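As an illustration of the feature described above, the following is a minimal sketch of a horizontal projection profile for a binarized block. The abstract does not define "length-normalized", so two readings are assumed here: dividing each row sum by the block width, and averaging the profile into a fixed number of bins so that blocks of different heights yield vectors of the same size. The function names `horizontal_projection_profile` and `resample_profile` are hypothetical and are not taken from the paper.

```python
def horizontal_projection_profile(block):
    """Return the fraction of ink pixels in each row of a binary block.

    `block` is a list of equal-length rows; 1 marks ink, 0 marks background.
    Dividing by the row width is an assumed reading of "length-normalized".
    """
    if not block or not block[0]:
        raise ValueError("block must be a non-empty 2-D list")
    width = len(block[0])
    if any(len(row) != width for row in block):
        raise ValueError("all rows must have the same width")
    return [sum(row) / width for row in block]


def resample_profile(profile, n_bins):
    """Average the profile over n_bins consecutive spans (assumed step)."""
    if n_bins <= 0 or not profile:
        raise ValueError("need a non-empty profile and n_bins > 0")
    n = len(profile)
    bins = []
    for b in range(n_bins):
        start = b * n // n_bins
        # Guarantee at least one row per bin when n < n_bins.
        end = max((b + 1) * n // n_bins, start + 1)
        span = profile[start:end]
        bins.append(sum(span) / len(span))
    return bins


# Toy block: two "text lines" separated by two blank rows.
block = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 1, 1, 1],
]
hpp = horizontal_projection_profile(block)
features = resample_profile(hpp, 3)
print(hpp)
print(features)
```

In this toy block the profile alternates between high values on text rows and zeros in the gap, which is the kind of regularity the abstract attributes to printed text. The eigen and Fisher profiles would be obtained by projecting such vectors with PCA-style and Fisher-discriminant-style transforms, which are not shown here.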
|Item Type:||Conference Paper|
|Additional Information:||Copyright 2006 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.|
|Keywords:||Eigen profiles;Fisher profiles;horizontal projection profile.|
|Department/Centre:||Division of Electrical Sciences > Electrical Engineering|
|Date Deposited:||02 Sep 2010 05:41|
|Last Modified:||19 Sep 2010 06:12|