A comparative study for Arabic text classification algorithms based on stop words elimination

Bassam Al-Shargabi*, Waseem Al-Romimah, Fekry Olayah

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

33 Citations (Scopus)

Abstract

This paper compares three techniques for Arabic text classification: Support Vector Machine (SVM) trained with Sequential Minimal Optimization (SMO), Naïve Bayesian (NB), and J48. The main objective is to measure the accuracy of each classifier and to determine which one is most accurate for Arabic text classification based on stop word elimination. Classifier accuracy is measured with the percentage split (holdout) method and with K-fold cross-validation. The results show that the SMO classifier achieves the highest accuracy and the lowest error rate, and that the SMO model takes the least time to build.
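The evaluation protocol named in the abstract (stop word elimination, then holdout and K-fold accuracy) can be sketched in a few lines. This is only an illustration, not the authors' setup: the stop word list, the toy corpus, the 66% split and all function names are assumptions made here, and a small hand-written multinomial Naive Bayes stands in for the three classifiers the paper compares.

```python
import math
from collections import Counter, defaultdict

# Hypothetical, tiny stop word list for illustration (not the paper's list).
STOP_WORDS = {"في", "من", "على", "عن"}

def tokenize(text, remove_stop_words=True):
    """Split on whitespace and optionally drop stop words."""
    tokens = text.split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens

def train_nb(docs, labels):
    """Collect the counts needed by a multinomial Naive Bayes classifier."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, tokens):
    """Return the label with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for t in tokens:
            if t in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][t] + 1)
                                  / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(train_idx, test_idx, docs, labels):
    """Train on train_idx, return the fraction of test_idx classified correctly."""
    model = train_nb([docs[i] for i in train_idx], [labels[i] for i in train_idx])
    hits = sum(predict_nb(model, docs[i]) == labels[i] for i in test_idx)
    return hits / len(test_idx)

def holdout_accuracy(docs, labels, train_fraction=0.66):
    """Percentage split: first part trains, the rest tests (no shuffling here)."""
    cut = int(len(docs) * train_fraction)
    idx = list(range(len(docs)))
    return accuracy(idx[:cut], idx[cut:], docs, labels)

def kfold_accuracy(docs, labels, k=3):
    """K-fold cross-validation: each fold is the test set exactly once."""
    n = len(docs)
    scores = []
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        scores.append(accuracy(train_idx, test_idx, docs, labels))
    return sum(scores) / k

# Toy corpus invented for this sketch: sports vs. economy sentences.
RAW = [
    "الفريق فاز في المباراة",
    "البنك رفع من الفائدة",
    "اللاعب سجل في المباراة",
    "البنك خفض من الأسعار",
    "المباراة انتهت على التعادل",
    "أعلن البنك عن الأرباح",
]
LABELS = ["sport", "economy", "sport", "economy", "sport", "economy"]
DOCS = [tokenize(text) for text in RAW]

if __name__ == "__main__":
    print("holdout accuracy:", holdout_accuracy(DOCS, LABELS))
    print("3-fold accuracy:", kfold_accuracy(DOCS, LABELS, k=3))
```

On a real corpus the documents would normally be shuffled before splitting, and the same two functions would be run once per classifier so that the accuracies are comparable.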

Original language: English
Title of host publication: Proceedings of the 2nd International Conference on Intelligent Semantic Web-Services and Applications, ISWSA'11
Publication status: Published - 18 Apr 2011
Externally published: Yes
Event: 2nd International Conference on Intelligent Semantic Web-Services and Applications, ISWSA 2011 - Amman, Jordan
Duration: 18 Apr 2011 → 20 Apr 2011

Conference

Conference: 2nd International Conference on Intelligent Semantic Web-Services and Applications, ISWSA 2011
Country/Territory: Jordan
City: Amman
Period: 18/04/11 → 20/04/11

Keywords

  • Arabic text classification
  • Naive Bayesian
  • Stop word elimination
  • Support vector machine
