Comparative Analysis of Machine Learning Models for Predicting Operative Time in Neurosurgical Procedures: Institutional Data vs. Nationwide Dataset Approaches
Research Fellow, Mayo Clinic, Rochester, Minnesota, United States
Introduction: Operative time for planned surgeries is typically estimated from the surgical team's experience or from retrospective case reviews. Efficient scheduling and operating room (OR) management are crucial for hospitals to deliver timely, cost-effective care. This study aimed to develop machine learning models on both institutional data and a nationwide database and to compare how accurately each predicts operative time for elective neurosurgical procedures.
Methods: We analyzed two datasets, one from an institutional cohort and one from the ACS-NSQIP database, each containing patient and procedure details for neurosurgical cases. Because operative times were positively skewed, the longest 5% of cases were removed as outliers from both datasets. We trained four machine learning models: Linear Regression (LR), Support Vector Regression (SVR), Deep Neural Network (DNN), and Extreme Gradient Boosting Regression (XGBR), each with extensive hyperparameter tuning. Model performance was assessed using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and the coefficient of determination (R^2).
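The preprocessing and evaluation steps described in the Methods can be illustrated with a minimal sketch. It is not the study's code: the function names (`trim_top_fraction`, `rmse`, `mae`, `r2`) are hypothetical, and the cutoff rule (drop the longest 5% of cases by rank) is an assumption about how the outlier removal was done. The metrics follow their standard definitions.

```python
import math


def trim_top_fraction(times, fraction=0.05):
    """Drop the longest `fraction` of operative times.

    Assumption: outliers are removed by rank (the largest values),
    which is one plausible reading of "top 5%" for a right-skewed target.
    """
    ordered = sorted(times)
    n_drop = round(len(ordered) * fraction)
    return ordered[: len(ordered) - n_drop]


def rmse(y_true, y_pred):
    """Root Mean Squared Error: penalizes large misses more heavily."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def mae(y_true, y_pred):
    """Mean Absolute Error: average size of the miss, in the target's units."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Reporting all three metrics together is useful because RMSE and MAE are expressed in the units of the target, while R^2 is relative to the variance of each dataset, so two datasets with similar absolute errors can still differ widely in R^2.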
Results: Models trained on the institutional dataset generally yielded more accurate predictions (DNN RMSE: 53.44, MAE: 38.33, R^2: 0.74; XGBR RMSE: 52.24, MAE: 37.25, R^2: 0.76) than those trained on the ACS-NSQIP dataset (XGBR RMSE: 61.98, MAE: 47.90, R^2: 0.38). The gap is widest in R^2 (0.76 vs. 0.38 for XGBR), underscoring the impact of dataset characteristics on predictive accuracy. The XGBR model performed best on both datasets, suggesting robustness to differing data features.
Conclusion: Our findings suggest that while machine learning models offer valuable support in predicting operative times, the choice of dataset substantially influences their effectiveness. Institutional data, which may capture finer clinical details, appear to enable more precise predictions. These results imply that institution-specific models may be preferable for optimizing OR scheduling.