An interpretable and balanced machine learning framework for Parkinson’s disease prediction using feature engineering and explainable AI

Nasim Mahmud Nayan, Al Mamun Rana, Md. Monirul Islam, Jia Uddin, Tahmina Yasmin, Jasim Uddin*, Syed Nisar Hussain Bukhari (Editor)

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Parkinson’s disease (PD) is a progressive neurological disorder that affects millions globally, posing significant challenges in early and accurate diagnosis. Recent advancements in machine learning (ML) offer promising approaches for addressing these challenges by enabling more precise and efficient PD predictions. This paper proposes an enhanced ML framework for PD prediction, integrating data balancing, feature selection, and explainable AI techniques. We evaluate nine different ML algorithms using a dataset of clinical and voice features. To address class imbalance, we employ the Synthetic Minority Oversampling Technique (SMOTE) and NearMiss, comparing results to an imbalanced baseline. Feature engineering approaches, including Featurewiz, tree-based feature importance, and the chi-square test, are used to identify key predictive features such as Pitch Period Entropy (PPE), Noise-to-Harmonic Ratio (NHR), and other voice biomarkers. Explainable AI (XAI) techniques, namely SHAP and LIME, are used to interpret model decisions and highlight influential features. The best-performing model, KNN with SMOTE, achieved 92% accuracy, an F1-score of 0.94, and a G-Mean of 0.95, demonstrating balanced, reliable PD detection. While some models achieved higher accuracy on imbalanced data (up to 97%), their performance lacked sensitivity and balance. Our findings suggest that combining SMOTE with feature engineering and XAI substantially enhances model fairness, performance, and interpretability. This research advances PD prediction by providing an accurate and interpretable ML-based diagnostic tool to support early diagnosis and better patient management.
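
The contrast the abstract draws between accuracy and balance can be illustrated with a minimal sketch. It assumes the standard definition of the G-Mean as the geometric mean of sensitivity and specificity; the abstract does not state the formula the authors used. The confusion-matrix counts and the helper names `accuracy` and `g_mean` are hypothetical and are not taken from the paper.

```python
import math


def accuracy(tp, fn, tn, fp):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity (standard definition)."""
    sensitivity = tp / (tp + fn)  # true-positive rate
    specificity = tn / (tn + fp)  # true-negative rate
    return math.sqrt(sensitivity * specificity)


# Hypothetical counts, NOT from the paper: 10 positive and 90 negative cases.
# A model that mostly predicts the majority class scores well on accuracy
# even though it misses most positive cases.
skewed = dict(tp=2, fn=8, tn=90, fp=0)

# A model that treats both classes evenly gives up a little accuracy
# but keeps sensitivity and specificity in step.
balanced = dict(tp=9, fn=1, tn=81, fp=9)

for name, counts in (("skewed", skewed), ("balanced", balanced)):
    print(name, round(accuracy(**counts), 3), round(g_mean(**counts), 3))
```

In this toy case the skewed model has the higher accuracy but the much lower G-Mean, which is the kind of gap that reporting the G-Mean alongside accuracy is meant to expose.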
Original language: English
Article number: e0333418
Journal: PLoS ONE
Volume: 20
Issue number: 10
Early online date: 31 Oct 2025
Publication status: Published - 31 Oct 2025

Keywords

  • Algorithms
  • Humans
  • Machine Learning
  • Male
  • Parkinson Disease/diagnosis
