136361489 (Orcid)


Background: Sickle cell disease (SCD) is the most common inherited blood disorder, affecting millions of people worldwide. Most patients with SCD experience repeated, unpredictable episodes of severe pain. These pain episodes are the leading cause of emergency department visits among patients with SCD and may last for several weeks. Arguably, the most challenging aspect of treating pain episodes in SCD is assessing and interpreting a patient's pain intensity level.

Objective: This study aims to learn deep feature representations of subjective pain trajectories using objective physiological signals collected from electronic health records.

Methods: This study used electronic health record data collected from 496 Duke University Medical Center participants over 5 consecutive years. Each record contained measures for 6 vital signs and the patient's self-reported pain score, with an ordinal range from 0 (no pain) to 10 (severe and unbearable pain). We also extracted 3 features related to medication: medication type, medication status (given or applied, or missed or removed or due), and total medication dosage (mg/mL). We used variational autoencoders for representation learning and designed machine learning classification algorithms to build pain prediction models. We evaluated our results using accuracy and confusion matrices and visualized the learned data representations qualitatively.

Results: We designed a classification model using raw data and deep representational learning to predict subjective pain scores with average accuracies of 82.8%, 70.6%, 49.3%, and 47.4% for 2-point, 4-point, 6-point, and 11-point pain ratings, respectively. Random forest classification models trained on the deep learned feature representations outperformed models trained on the raw (unrepresented) data for all pain rating scales. Across the Likert scales, our models performed better when provided with medication data along with vital signs data. We also visualized the learned latent representations, which showed neighboring representations for similar pain scores at higher resolutions of the pain rating.

Conclusions: Our results demonstrate that medication information (the type of medication, total medication dosage, and whether the medication was given or missed) can significantly improve subjective pain prediction modeling compared with modeling with only vital signs. This study shows the promise of data-driven pain score estimates, which can give clinicians additional information about the patient's condition alongside the patient's self-reported pain scores.
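The evaluation described in the abstract (coarsening the 0 to 10 pain score into 2-point, 4-point, 6-point, and 11-point ratings, then scoring predictions with accuracy and a confusion matrix) can be illustrated with a minimal Python sketch. This is not the authors' code. The abstract does not state the cut points used for each scale, so the roughly equal-width binning below is an assumption, and the function names (coarsen_pain_score, confusion_matrix, accuracy) and the demo values are hypothetical.

```python
from typing import List, Sequence


def coarsen_pain_score(score: int, n_levels: int) -> int:
    """Map a 0-10 self-reported pain score onto a scale with n_levels classes.

    ASSUMPTION: roughly equal-width bins. The study's actual cut points
    are not given in the abstract.
    """
    if not 0 <= score <= 10:
        raise ValueError("pain score must be between 0 and 10")
    if not 2 <= n_levels <= 11:
        raise ValueError("n_levels must be between 2 and 11")
    # 11 possible scores (0..10) are spread over n_levels ordinal classes.
    return score * n_levels // 11


def confusion_matrix(
    y_true: Sequence[int], y_pred: Sequence[int], n_levels: int
) -> List[List[int]]:
    """Count predictions: rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_levels for _ in range(n_levels)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true class exactly."""
    if len(y_true) == 0:
        raise ValueError("need at least one label")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


if __name__ == "__main__":
    # Made-up scores for illustration only; these are NOT study data.
    reported = [0, 3, 7, 10, 5, 8]
    predicted = [1, 3, 6, 9, 6, 8]
    for n_levels in (2, 4, 6, 11):
        y_true = [coarsen_pain_score(s, n_levels) for s in reported]
        y_pred = [coarsen_pain_score(s, n_levels) for s in predicted]
        print(n_levels, accuracy(y_true, y_pred))
        print(confusion_matrix(y_true, y_pred, n_levels))
```

Under this binning, coarser scales merge neighboring scores into one class, so a prediction that is off by one point can still count as correct. That is one plausible reading of why the reported accuracy rises as the number of rating levels falls.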


This work is licensed under CC BY 4.0