Amharic character recognition using a fast signature based algorithm

John Cowell, Fiaz Hussain

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

19 Citations (Scopus)

Abstract

The Amharic language is the principal language of over 20 million people, mainly in Ethiopia. An extensive literature survey reveals no journal or conference papers on Amharic character recognition. The Amharic script has 33 basic characters, each with seven orders, giving 231 distinct characters, not including numbers and punctuation symbols. The characters are cursive but not connected and, unlike other cursive scripts, do not use dots. This paper describes the Amharic script and discusses the difficulties of applying conventional structural and syntactic recognition processes. Two statistical algorithms for identifying Amharic characters are described. In both, the characters are normalised for size and orientation. The first compares the character against a series of templates. The second derives a characteristic signature from the character and compares this against a set of signature templates. The signatures used are fifty times smaller than the original character, and the recognition process is correspondingly faster but with some loss of accuracy. The statistical techniques described have been fully implemented, and the resulting performance is outlined.
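The two statistical approaches named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the signature is computed, so the block-count signature below is an assumption chosen only for illustration, as are all function names, the distance measures, and the `block` parameter. Bitmaps are assumed to be already normalised to a common size.

```python
from typing import Callable, Dict, List, TypeVar

Bitmap = List[List[int]]  # rows of 0/1 pixels, already size-normalised
T = TypeVar("T")


def template_distance(a: Bitmap, b: Bitmap) -> int:
    """Full template comparison: number of pixels that differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def signature(bitmap: Bitmap, block: int) -> List[int]:
    """Hypothetical signature: ink-pixel count in each block x block tile.

    This is an assumed stand-in, not the signature defined in the paper.
    It shrinks the data by roughly a factor of block * block.
    """
    rows, cols = len(bitmap), len(bitmap[0])
    return [
        sum(
            bitmap[r][c]
            for r in range(top, min(top + block, rows))
            for c in range(left, min(left + block, cols))
        )
        for top in range(0, rows, block)
        for left in range(0, cols, block)
    ]


def signature_distance(a: List[int], b: List[int]) -> int:
    """Sum of absolute differences between two signatures."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest(sample: T, templates: Dict[str, T],
            distance: Callable[[T, T], int]) -> str:
    """Return the label of the template closest to the sample."""
    return min(templates, key=lambda label: distance(sample, templates[label]))
```

The same `nearest` function serves both methods: pass bitmaps with `template_distance` for the first, or precomputed signatures with `signature_distance` for the second. The second does fewer comparisons per template, at the cost of discarding detail inside each tile, which is consistent with the speed and accuracy trade-off the abstract reports.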

Original language: English
Title of host publication: Proceedings - 7th International Conference on Information Visualization
Subtitle of host publication: An International Conference on Computer Visualization and Graphics Applications, IV 2003
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 384-389
Number of pages: 6
ISBN (Electronic): 0769519881
DOIs
Publication status: Published - 2003
Externally published: Yes
Event: 7th International Conference on Information Visualization, IV 2003 - London, United Kingdom
Duration: 16 Jul 2003 → 18 Jul 2003

Publication series

Name: Proceedings of the International Conference on Information Visualisation
Volume: 2003-January
ISSN (Print): 1093-9547

Conference

Conference: 7th International Conference on Information Visualization, IV 2003
Country/Territory: United Kingdom
City: London
Period: 16/07/03 → 18/07/03

Keywords

  • Amharic character recognition
  • Character signature
  • Confusion matrix
  • OCR
  • Optical character recognition
  • Structural recognition
