TY - GEN
T1 - An Algorithm for Selecting a Data Mining Technique
AU - Chikohora, Teressa Tjwakinna
AU - Chikohora, Edmore
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/11/25
Y1 - 2021/11/25
N2 - Selecting a data mining technique is an important step in the data mining process. Techniques such as association, clustering, regression, naïve Bayes, and time series may be used for data mining. However, the various tools available in the market do not give the user a chance to verify whether the tool is appropriate for their data. This study presents an algorithm that may be used to select a technique based on the structure of the data to be mined. Literature was reviewed to identify the factors that may be considered in selecting a technique. The spiral model was adopted for development of the algorithm. The algorithm compares the data source to the defined criteria, which have weights assigned, to determine the suitable technique. A score is allocated to each evaluated technique, and the technique with the highest score is recommended. The scoring and weighting details are described in pseudocode and flowcharts, while the Java programming language was used to implement the algorithm. The resultant artefact suggests a data mining technique after analysing the structure of a given data set.
KW - Algorithm
KW - Data mining
KW - Data mining techniques
UR - http://www.scopus.com/inward/record.url?scp=85126603212&partnerID=8YFLogxK
U2 - 10.1109/IMITEC52926.2021.9714525
DO - 10.1109/IMITEC52926.2021.9714525
M3 - Conference contribution
AN - SCOPUS:85126603212
T3 - 2021 3rd International Multidisciplinary Information Technology and Engineering Conference, IMITEC 2021
BT - 2021 3rd International Multidisciplinary Information Technology and Engineering Conference, IMITEC 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd International Multidisciplinary Information Technology and Engineering Conference, IMITEC 2021
Y2 - 23 November 2021 through 25 November 2021
ER -