AEGD: Arabic essay grading dataset for machine learning

Bassam Al-Shargabi, Rawan Alzyadat, Fadi Hamad

Research output: Contribution to journal › Article › peer-review

Abstract

Developing Automatic Essay Grading (AEG) systems has recently become an attractive topic in industry and academia. Most grading systems rely on machine learning to grade essays based on a predetermined dataset. However, while English essays are scored based on the Automated Student Assessment Prize (ASAP) dataset, the absence of such a dataset for Arabic essays is a major obstacle. Therefore, in this paper, we establish the Arabic Essay Grading Dataset (AEGD), which is suitable for machine learning and can be used to develop an Arabic AEG system. The dataset comprises a collection of essay questions along with their graded model answers on several topics that cover various school levels. We used Naive Bayes (NB), decision tree (J48), and a meta classifier, all well-known machine learning algorithms, to evaluate and test the established AEGD. The results show that the accuracy rates of the three classifiers reached 79%, 81%, and 86% on the established AEGD.

Original language: English
Pages (from-to): 1329-1338
Number of pages: 10
Journal: Journal of Theoretical and Applied Information Technology
Volume: 99
Issue number: 6
Publication status: Published - 31 Mar 2021
Externally published: Yes

Keywords

  • Arabic essay grading
  • Automated essay grading
  • Classification algorithm
  • Dataset
  • Machine learning
