A fast recognition system for isolated Arabic characters

J. Cowell, F. Hussain

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

18 Citations (Scopus)

Abstract

This paper presents a very fast multi-stage algorithm for the recognition of non-Latin script. Although the examples use Arabic script, the system could be adapted in minutes to deal with any character set, in particular non-Latin characters for which no commercial OCR systems are available. The approach normalises isolated characters for size, extracts an image signature based on the number of black pixels in the rows and columns of each character, and compares these values with a set of signatures for typical characters of the set. This technique identifies not only the closest match but also the closeness of match to all other characters in the set, which is expressed in a triangular confusion matrix.
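The pipeline described in the abstract (size normalisation, row and column black-pixel counts as a signature, comparison against reference signatures) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the nearest-neighbour rescaling, the default grid size, and the use of a sum of absolute differences as the closeness measure are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # binary image: 1 = black pixel, 0 = white pixel


def normalise(img: Image, size: int = 16) -> Image:
    """Crop to the bounding box of the black pixels, then rescale to size x size.

    Nearest-neighbour sampling is an assumption; the abstract only says that
    characters are normalised for size.
    """
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    if not rows:  # blank image: nothing to crop
        return [[0] * size for _ in range(size)]
    top, left = rows[0], cols[0]
    h, w = rows[-1] - top + 1, cols[-1] - left + 1
    return [
        [img[top + (y * h) // size][left + (x * w) // size] for x in range(size)]
        for y in range(size)
    ]


def signature(img: Image) -> List[int]:
    """Black-pixel count of every row, followed by that of every column."""
    row_counts = [sum(row) for row in img]
    col_counts = [sum(col) for col in zip(*img)]
    return row_counts + col_counts


def distance(a: List[int], b: List[int]) -> int:
    """Assumed closeness measure: sum of absolute differences (0 = identical)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_matches(
    img: Image, references: Dict[str, List[int]], size: int = 16
) -> List[Tuple[str, int]]:
    """Return every reference character with its distance, closest first."""
    sig = signature(normalise(img, size))
    scored = [(name, distance(sig, ref)) for name, ref in references.items()]
    return sorted(scored, key=lambda pair: pair[1])


def triangular_confusion(
    references: Dict[str, List[int]]
) -> Dict[Tuple[str, str], int]:
    """Distance for each unordered pair of reference characters.

    Distance is symmetric, so only the upper triangle is stored.
    """
    names = sorted(references)
    return {
        (a, b): distance(references[a], references[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
```

Reference signatures would be built once by passing a typical image of each character through `normalise` and `signature` with the same `size` used at recognition time. Because `rank_matches` returns the distance to every reference character rather than only the best one, the same data can populate the pairwise table produced by `triangular_confusion`.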

Original language: English
Title of host publication: Proceedings - 6th International Conference on Information Visualisation, IV 2002
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 650-654
Number of pages: 5
ISBN (Electronic): 0769516564
DOIs
Publication status: Published - 2002
Externally published: Yes
Event: 6th International Conference on Information Visualisation, IV 2002 - London, United Kingdom
Duration: 10 Jul 2002 to 12 Jul 2002

Publication series

Name: Proceedings of the International Conference on Information Visualisation
Volume: 2002-January
ISSN (Print): 1093-9547

Conference

Conference: 6th International Conference on Information Visualisation, IV 2002
Country/Territory: United Kingdom
City: London
Period: 10/07/02 to 12/07/02

Keywords

  • Arabic
  • OCR
  • confusion matrix
  • fonts
  • image signatures
  • normalisation
  • pattern recognition
