Publication Date

2024

Document Type

Thesis

Committee Members

Tanvi Banerjee, Ph.D. (Advisor); Michael L. Raymer, Ph.D. (Committee Member); Wen Zhang, Ph.D. (Committee Member)

Degree Name

Master of Science (MS)

Abstract

Sickle Cell Disease (SCD) is one of the most prevalent genetic blood disorders, affecting millions of people worldwide. It is often accompanied by acute and/or chronic pain, leading to increased healthcare costs and adverse outcomes. Effective management of SCD requires an understanding of patients' diverse physiological profiles. This study employs unsupervised machine learning, specifically K-means clustering, to group patients with SCD into clusters based on their vital signs. The main aim is to identify groups that share similar physiological and pain profiles, allowing an in-depth analysis of the features that distinguish the patient clusters. The project pipeline involved data collection, preprocessing, clustering, cluster validation, and statistical analysis of the clusters. Cluster validity measures indicated that four clusters best fit the patient cohort, with the following physiological profiles: (i) a combination of low blood pressure, high respiration, and elevated heart rates; (ii) low blood pressure and slightly high oxygen saturation; (iii) high blood pressure; and (iv) elevated heart rates. Analysis of Variance (ANOVA) and effect size calculations were performed to validate the obtained clusters and to assess the significance and magnitude of feature differences across the clusters. The findings demonstrate the effectiveness of unsupervised learning in revealing patient heterogeneity within the SCD population. The study concludes that clustering can play a vital role in giving healthcare providers a better understanding of patient-specific needs.
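The pipeline described in the abstract (standardize vital signs, cluster with K-means, then compare each feature across clusters with ANOVA and an effect size) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not state the thesis's tooling, feature set, or validity indices, the function names are hypothetical, the toy readings are invented, and eta squared is used here only as one common effect size for one-way ANOVA.

```python
import random
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column so no single vital sign dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(r, centroids[j]))
                      for r in rows]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [mean(c) for c in zip(*members)]
    return labels, centroids

def anova_eta_squared(values, labels):
    """One-way ANOVA F statistic and eta squared for one feature."""
    grand = mean(values)
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_sq = ss_between / (ss_between + ss_within)
    return f_stat, eta_sq

# Invented toy readings (heart rate, systolic blood pressure); not study data.
toy = [[72, 118], [75, 121], [70, 115], [74, 119],
       [110, 95], [108, 92], [112, 97], [109, 94]]

toy_labels, _ = kmeans(standardize(toy), k=2)
heart_rates = [row[0] for row in toy]
f_stat, eta_sq = anova_eta_squared(heart_rates, toy_labels)
print(f"labels={toy_labels} F={f_stat:.1f} eta^2={eta_sq:.3f}")
```

In this sketch the ANOVA is run on the original (unstandardized) feature values so that the group means stay interpretable in clinical units, while the clustering itself operates on z-scores. Choosing the number of clusters with validity measures, as the abstract describes, is not shown.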

Page Count

62

Department or Program

Department of Computer Science and Engineering

Year Degree Awarded

2024

