Resident, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, United States
Introduction: Patient-reported outcome measures (PROMs) play a crucial role in spine surgery research. However, maintaining a robust PROM database presents logistical challenges, including the need for data vendors and research coordinators, and often results in missed assessments. Artificial intelligence (AI) chatbots, such as Microsoft Copilot, may offer a free solution by predicting patient outcomes from clinical history. This study evaluates Microsoft Copilot's ability to predict a patient's preoperative quality of life using provider-documented patient history.
Methods: Copilot was given de-identified preoperative notes from patients with spinal deformity (n=29) who underwent surgical correction and was prompted to predict five SRS-22r subscores. Paired t-tests were then used to compare these AI-generated subscores against the patients' actual SRS-22r subscores to assess for differences.
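The paired comparison described above can be sketched as follows. This is a minimal illustration using hypothetical subscores, not the study's actual data; the variable names are assumptions for demonstration only.

```python
# Minimal sketch of a paired t-test comparing patient-reported vs.
# chatbot-generated subscores (hypothetical example values).
from scipy import stats

# Hypothetical paired subscores for the same five patients
patient_scores = [2.4, 3.1, 2.0, 2.8, 3.5]
copilot_scores = [2.2, 3.0, 2.3, 2.6, 3.1]

# Paired t-test: tests whether the mean pairwise difference is zero
t_stat, p_value = stats.ttest_rel(patient_scores, copilot_scores)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

A non-significant p-value here would indicate only a failure to detect a difference at the chosen sample size, not equivalence between the two sets of scores.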
Results: Of the five subcategories, the analysis failed to detect a significant difference between the patient-reported pain score (2.39 +/- 0.63) and the Copilot-generated pain score (2.21 +/- 0.56; p = 0.39). The paired t-test was also unable to detect a significant difference between the patient-reported management satisfaction score (2.9 +/- 1.0) and the Copilot-generated management satisfaction score (2.90 +/- 0.67; p = 0.9845). The remaining SRS-22r subcategories (mental health, function, and self-image) showed significant differences between patient-reported and Copilot-generated scores (p < 0.05), suggesting that Copilot is unable to accurately predict these subscores.
Conclusion: No significant differences were detected between patient-reported and Copilot-generated SRS-22r pain and management satisfaction subscores, suggesting that chatbots may offer a feasible retrospective method of PROM collection. This approach could be especially beneficial for low-resource institutions looking to develop a research enterprise.