Predicting Hospital Readmission for Campylobacteriosis from Electronic Health Records: A Machine Learning and Text Mining Perspective

Shang Ming Zhou*, Ronan A. Lyons, Muhammad A. Rahman, Alexander Holborow, Sinead Brophy

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

7 Citations (Scopus)

Abstract

(1) Background: This study investigates influential risk factors for predicting 30-day readmission to hospital for Campylobacter infections (CI). (2) Methods: We linked general practitioner and hospital admission records of 13,006 patients with CI in Wales (1990–2015). An approach called TF-zR (term frequency-zRelevance) technique was presented to evaluates how relevant a clinical term is to a patient in a cohort characterized by coded health records. The zR is a supervised term-weighting metric to assign weight to a term based on relative frequencies of the term across different classes. Cost-sensitive classifier with swarm optimization and weighted subset learning was integrated to identify influential clinical signals as predictors and optimal model for readmission prediction. (3) Results: From a pool of up to 17,506 variables, 33 most predictive factors were identified, including age, gender, Townsend deprivation quintiles, comorbidities, medications, and procedures. The predictive model predicted readmission with 73% sensitivity and 54% specificity. Variables associated with readmission included male gender, recurrent tonsillitis, non-healing open wounds, operation for in-gown toenails. Cystitis, paracetamol/codeine use, age (21–25), and heliclear triple pack use, were associated with a lower risk of readmission. (4) Conclusions: This study gives a profile of clustered variables that are predictive of readmission associated with campylobacteriosis.

Original languageEnglish
Article number86
Pages (from-to)86
Number of pages1
JournalJournal of Personalized Medicine
Volume12
Issue number1
Early online date10 Jan 2022
DOIs
Publication statusPublished - 10 Jan 2022

Keywords

  • Campylobacter infections
  • Electronic health records
  • Feature selection
  • Hospitalisation
  • Machine learning
  • Readmission
  • Text mining

Cite this