An experimental study for Arabic text classification techniques

Bassam Al-Shargabi*, Fekry Olayah

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Several algorithms have been implemented to address the problem of text categorization. Most of the work in this area has been geared toward English text, whereas little research has been conducted on Arabic text. The nature of Arabic text differs from that of English text, and pre-processing Arabic text is more challenging. In this paper, an experimental study was conducted on three techniques for Arabic text classification: Discriminative Multinomial Naïve Bayes (DMNB), Naïve Bayesian (NB), and the IBK algorithm. The paper aimed to assess the accuracy of each classifier and to determine which classifier is more accurate for Arabic text classification based on stop-word elimination. The accuracy of each classifier is measured by the percentage split (holdout) method and the K-fold cross-validation method, along with the time needed to classify Arabic text.
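The evaluation protocol named in the abstract (stop-word elimination, a percentage split, K-fold cross-validation, and timing) can be sketched roughly as follows. This is a minimal illustration and not the authors' setup: the function names, the 66% training fraction, the fold count of 10, and the fixed random seed are all assumptions, and any classifier can be plugged in through the hypothetical `fit` callable.

```python
import random
import time


def remove_stop_words(tokens, stop_words):
    """Drop every token that appears in the stop-word set."""
    return [t for t in tokens if t not in stop_words]


def holdout_split(n_items, train_fraction=0.66, seed=0):
    """Percentage split: shuffle indices once, then cut into train and test.

    The 0.66 default is an assumption, not a value taken from the paper.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    cut = round(n_items * train_fraction)
    return indices[:cut], indices[cut:]


def k_fold_splits(n_items, k=10, seed=0):
    """K-fold cross-validation: each fold serves as the test set exactly once."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def accuracy(predicted, actual):
    """Fraction of test documents whose predicted label matches the true one."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def evaluate(fit, docs, labels, splits):
    """Return (mean accuracy, elapsed seconds) over the given splits.

    `fit(train_docs, train_labels)` is a hypothetical interface: it must
    return a `predict(doc)` callable for the trained classifier.
    """
    scores = []
    start = time.perf_counter()
    for train_idx, test_idx in splits:
        predict = fit([docs[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        predicted = [predict(docs[i]) for i in test_idx]
        scores.append(accuracy(predicted, [labels[i] for i in test_idx]))
    return sum(scores) / len(scores), time.perf_counter() - start
```

For a holdout run, pass `[holdout_split(len(docs))]` as `splits`; for cross-validation, pass `k_fold_splits(len(docs), k=10)`. Running both on the same pre-processed corpus gives accuracy and timing figures for each classifier that can be compared directly.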

Original language: English
Title of host publication: Fourth International Conference on Digital Image Processing, ICDIP 2012
Publication status: Published - 2 Jun 2012
Externally published: Yes
Event: 4th International Conference on Digital Image Processing, ICDIP 2012 - Kuala Lumpur, Malaysia
Duration: 7 Apr 2012 to 8 Apr 2012

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 8334
ISSN (Print): 0277-786X

Conference

Conference: 4th International Conference on Digital Image Processing, ICDIP 2012
Country/Territory: Malaysia
City: Kuala Lumpur
Period: 7/04/12 to 8/04/12

Keywords

  • Accuracy
  • Arabic text classification
  • categorizations algorithms
  • error rate
