Medical Student, Drexel University College of Medicine
Introduction: Bone health is a critical consideration in preoperative spine surgery assessments, as poor bone quality increases the risk of adverse surgical outcomes. The vertebral bone quality (VBQ) score, derived from T1-weighted MRI scans, offers an estimate of bone quality when a DEXA or CT scan is unavailable. However, calculating VBQ manually is time-intensive and subject to interrater variability. This study aims to develop a deep learning algorithm that automates VBQ calculation from lumbar spine MRIs, improving efficiency and clinical applicability.
Methods: We trained a YOLOv8 object detection model on T1-weighted lumbar MRI scans from the SPIDER dataset, which included 257 patient scans with vertebral body segmentations. The dataset was divided into 70% training, 20% validation, and 10% test sets. The model was designed to detect bounding boxes around the vertebral bodies from L1 to L4 and to place regions of interest (ROIs) within each vertebral body and in the cerebrospinal fluid (CSF) posterior to L3. The average signal intensity of these ROIs was used to calculate the VBQ score by normalizing vertebral body intensity to the CSF intensity. AI-derived VBQ scores were then validated on MRI scans from 47 patients against VBQ scores annotated by two human raters.
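The normalization step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's implementation: the abstract does not say how the four vertebral intensities are combined, so the sketch assumes the median of the L1 to L4 ROI means (as in the commonly cited manual VBQ definition), and the function name, the input layout, and the numbers in the usage example are hypothetical.

```python
from statistics import mean, median

LEVELS = ("L1", "L2", "L3", "L4")


def vbq_score(vertebra_rois, csf_roi):
    """Return a VBQ-style ratio from ROI pixel intensities.

    vertebra_rois: dict mapping "L1".."L4" to a list of pixel intensities
                   sampled inside that vertebral body's ROI (hypothetical layout).
    csf_roi:       list of pixel intensities from the CSF ROI posterior to L3.
    """
    missing = [lvl for lvl in LEVELS if lvl not in vertebra_rois]
    if missing:
        raise ValueError(f"missing vertebral ROIs: {missing}")

    # Average signal intensity inside each ROI.
    vertebra_means = [mean(vertebra_rois[lvl]) for lvl in LEVELS]
    csf_mean = mean(csf_roi)
    if csf_mean <= 0:
        raise ValueError("CSF intensity must be positive")

    # ASSUMPTION: combine L1-L4 with the median; the abstract does not
    # specify the aggregation, and a mean could be substituted here.
    return median(vertebra_means) / csf_mean


# Usage with made-up intensities (not data from the study).
example_rois = {
    "L1": [300, 320],
    "L2": [340, 360],
    "L3": [380, 400],
    "L4": [420, 440],
}
example_vbq = vbq_score(example_rois, csf_roi=[100, 120, 140])
```

In a full pipeline the pixel lists would come from ROIs placed inside the detected bounding boxes; that placement step is not shown here because the abstract does not describe it in detail.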
Results: The AI model achieved a precision of 0.9429, a recall of 0.9076, and a mean Average Precision (mAP) of 0.9403 at an intersection-over-union threshold of 0.5, demonstrating high accuracy in vertebral body detection. The Intraclass Correlation Coefficient (ICC) between AI-derived and human-derived VBQ scores was 0.88–0.93, with Pearson correlation coefficients above 0.85, indicating strong agreement. Root Mean Square Error (RMSE) values were 0.58 and 0.42 for the two raters, supporting the model’s accuracy.
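For readers who want to reproduce this kind of agreement analysis, the following is a minimal sketch of two of the reported metrics, RMSE and the Pearson correlation coefficient, computed between AI-derived and rater-derived scores. The function names and the sample values are illustrative assumptions; the ICC is left out because the abstract does not state which ICC form was used.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two equal-length score lists."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up scores for illustration only (not data from the study).
ai_scores = [1.0, 2.0, 3.0]
rater_scores = [2.0, 4.0, 6.0]
example_rmse = rmse(ai_scores, rater_scores)
example_r = pearson_r(ai_scores, rater_scores)
```

The two metrics answer different questions: the correlation captures whether the AI and a rater rank patients similarly, while RMSE captures how far apart the actual scores are, so a high correlation can coexist with a sizeable RMSE, as in the toy values above.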
Conclusion: This deep learning algorithm provides a reliable, automated method for calculating VBQ scores from MRI scans, enabling efficient bone quality assessment without the need for manual calculation. By streamlining VBQ score derivation, this AI model supports early detection of patients at risk for poor surgical outcomes, potentially improving preoperative planning and clinical decision-making. Further evaluation on diverse datasets is needed to assess and improve its generalizability.