Medical Student, Drexel University College of Medicine
Introduction: Compression fractures in osteoporotic patients are often underdiagnosed, with approximately one in four women remaining undiagnosed. This study aims to automate the diagnosis of compression fractures using an artificial intelligence (AI) model designed to predict the Genant classification, a widely used system for assessing the severity of vertebral compression fractures. By leveraging AI to identify vertebral keypoints on spinal X-rays, this approach seeks to improve detection accuracy, standardize fracture classification, and reduce the rate of missed diagnoses in clinical practice.
Methods: We trained a YOLOv11 pose-estimation model on the BUU-LSPINE dataset, which contains spinal X-ray images, including lateral (LA) views, from 3,600 anonymized patients. Each patient contributes two views, for a total of 7,200 images. The model was trained to locate the vertebral keypoints needed to calculate wedge ratios and to grade compression fracture severity according to the Genant classification criteria. Validation was conducted on an independent test set from the Mayo Clinic, comprising 25 patients representing each Genant grade for anterior and posterior wedge compression fractures, so that both mild and severe cases that might be missed in standard X-ray evaluation were covered.
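The step from keypoints to a Genant grade can be illustrated with a minimal sketch. It assumes four corner keypoints per vertebra (the key names below are hypothetical, not the dataset's annotation schema), takes the taller of the anterior and posterior heights within the same vertebra as the reference, and applies the commonly cited Genant height-loss cut-offs of 20%, 25%, and 40%. The exact keypoint layout and boundary handling used in the study are not stated, so these are assumptions.

```python
from math import dist


def wedge_ratio(corners):
    """Ratio of the shorter to the taller vertebral wall height.

    `corners` maps hypothetical keypoint names to (x, y) pixel coordinates:
    'ant_sup', 'ant_inf', 'post_sup', 'post_inf'.
    """
    h_ant = dist(corners["ant_sup"], corners["ant_inf"])
    h_post = dist(corners["post_sup"], corners["post_inf"])
    taller = max(h_ant, h_post)
    if taller == 0:
        raise ValueError("degenerate vertebra: both heights are zero")
    return min(h_ant, h_post) / taller


def genant_grade(ratio):
    """Map a wedge ratio to a Genant grade (0 to 3) by fractional height loss.

    Assumed cut-offs: <20% grade 0, 20-25% grade 1, 25-40% grade 2, >40% grade 3.
    """
    loss = 1.0 - ratio
    if loss < 0.20:
        return 0
    if loss < 0.25:
        return 1
    if loss <= 0.40:
        return 2
    return 3


# Illustrative vertebra: anterior wall 7 px, posterior wall 10 px (30% loss).
example = {
    "ant_sup": (0.0, 0.0),
    "ant_inf": (0.0, 7.0),
    "post_sup": (10.0, 0.0),
    "post_inf": (10.0, 10.0),
}
grade = genant_grade(wedge_ratio(example))
```

Because the ratio always divides the shorter wall by the taller one, the same function covers both anterior and posterior wedging. Comparing against adjacent vertebrae, as the Genant method also allows, would need an extra reference height.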
Results: The model achieved a precision of 0.96 in locating vertebral keypoints. For classification, sensitivity for the low-grade group (Genant grades 0 and 1) was 92%, with a specificity of 88%. For the higher-grade group (Genant grades 2 and 3), sensitivity was 94%, with a specificity of 89%. Area under the curve (AUC) values exceeded 0.90 for all grades, indicating consistent classification performance across severity levels, including detection of subtle fractures that are often missed on X-ray.
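For reference, sensitivity and specificity for a grade group can be computed from true and predicted grades as in the sketch below. The abstract does not state how grades were grouped for each figure, so the one-group-versus-rest split is an assumption, and the labels are made-up illustrative values, not study data.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """Sensitivity and specificity when grades in `positive` count as positive."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        truth_pos = truth in positive
        pred_pos = pred in positive
        if truth_pos and pred_pos:
            tp += 1
        elif truth_pos:
            fn += 1
        elif pred_pos:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Illustrative grades only (not from the study).
true_grades = [0, 1, 2, 3, 2, 0]
pred_grades = [0, 2, 2, 3, 1, 1]
sens, spec = sensitivity_specificity(true_grades, pred_grades, positive={2, 3})
```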
Conclusion: The proposed AI-based approach demonstrates classification accuracy on par with manual assessment and could improve clinical workflows by standardizing fracture detection and reducing grading variability. Its ability to detect fractures that might be missed in conventional X-ray reading highlights its utility as a supportive tool in clinical settings. Further validation on additional external datasets is recommended to evaluate the model’s generalizability across diverse patient populations and fracture presentations.