TY - GEN
T1 - Amharic character recognition using a fast signature based algorithm
AU - Cowell, John
AU - Hussain, Fiaz
N1 - Publisher Copyright: © 2003 IEEE.
PY - 2003
Y1 - 2003
AB - The Amharic language is the principal language of over 20 million people, mainly in Ethiopia. An extensive literature survey reveals no journal or conference papers on Amharic character recognition. The Amharic script has 33 basic characters, each with seven orders, giving 231 distinct characters, not including numbers and punctuation symbols. The characters are cursive but not connected and, unlike other cursive scripts, do not use dots. This paper describes the Amharic script and discusses the difficulties of applying conventional structural and syntactic recognition processes. Two statistical algorithms for identifying Amharic characters are described. In both, the characters are normalised for size and orientation. The first compares the character against a series of templates. The second derives a characteristic signature from the character and compares this against a set of signature templates. The signatures used are fifty times smaller than the original character, and the recognition process is correspondingly faster but with some loss of accuracy. The statistical techniques described have been fully implemented and the resulting performance is outlined.
KW - Amharic character recognition
KW - Character signature
KW - Confusion matrix
KW - OCR
KW - Optical character recognition
KW - Structural recognition
UR - http://www.scopus.com/inward/record.url?scp=18844429352&partnerID=8YFLogxK
U2 - 10.1109/IV.2003.1218014
DO - 10.1109/IV.2003.1218014
M3 - Conference contribution
AN - SCOPUS:18844429352
T3 - Proceedings of the International Conference on Information Visualisation
SP - 384
EP - 389
BT - Proceedings - 7th International Conference on Information Visualization
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 7th International Conference on Information Visualization, IV 2003
Y2 - 16 July 2003 through 18 July 2003
ER -