TY - JOUR
T1 - A Cooperative Binary-Clustering Framework Based on Majority Voting for Twitter Sentiment Analysis
AU - Bibi, Maryum
AU - Aziz, Wajid
AU - Almaraashi, Majid
AU - Khan, Imtiaz Hussain
AU - Nadeem, Malik Sajjad Ahmed
AU - Habib, Nazneen
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/3/27
Y1 - 2020/3/27
AB - Twitter sentiment analysis is a challenging problem in natural language processing. Supervised learning techniques have mostly been employed for this purpose, but they require labeled training data, and labeling large datasets is very time consuming. To address this issue, unsupervised learning techniques such as clustering can be used. In this study, we explore the possibility of using hierarchical clustering for Twitter sentiment analysis. Three hierarchical-clustering techniques, namely single linkage (SL), complete linkage (CL) and average linkage (AL), are examined. A cooperative framework of SL, CL and AL is built to select the optimal cluster for tweets, with optimal-cluster selection operationalized by majority voting. The hierarchical clustering techniques are also compared with k-means and two state-of-the-art classifiers (SVM and Naïve Bayes). The performance of clustering and classification is measured in terms of accuracy and time efficiency. The experimental results indicate that cooperative clustering based on majority voting is robust, yielding good-quality clusters at the cost of poor time efficiency. The results also suggest that the accuracy of the proposed clustering framework is comparable to that of the classifiers, which is encouraging.
KW - Cooperative clustering
KW - majority voting
KW - sentiment analysis
KW - Twitter sentiment analysis
UR - http://www.scopus.com/inward/record.url?scp=85084111546&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2020.2983859
DO - 10.1109/ACCESS.2020.2983859
M3 - Article
AN - SCOPUS:85084111546
SN - 2169-3536
VL - 8
SP - 68580
EP - 68592
JO - IEEE Access
JF - IEEE Access
M1 - 9049112
ER -