Associate Professor of Clinical Anesthesiology, UCLA
Disclosure information not submitted.
Introduction: Invasive mechanical ventilation (IMV) is vital in intensive care. Extubation failure (EF), although rare, is associated with worse outcomes. Minimizing IMV duration and avoiding EF are key priorities. We hypothesize that machine learning (ML) models trained on high-resolution ventilator and clinical data can accurately predict EF within 48 hours of extubation across diverse ICU settings.
Methods: Forty-eight-hour EF occurs 1.3% of the time. We trained various ML models on a dataset of 3,642 instances of IMV drawn from both pediatric and adult patients. Of the 3,642 instances, 1,821 experienced EF within 48 hours and were matched 1:1 with 1,821 controls. The dataset was then split into 90% for training and 10% for testing. We evaluated 45 candidate input variables, each a summary statistic of readily available hemodynamic and ventilator flowsheet metrics. We performed 5-fold cross-validation on the training set to assess 10 different ML classifiers, including Weighted Light Gradient Boosting Machine (LGBM), Extreme Gradient Boosting, and a Tuned Random Forest Classifier. Hyperparameter tuning was also conducted during cross-validation. The model with the highest mean AUC across cross-validation was subsequently evaluated on the test set.
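The data partitioning described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not say how controls were matched, whether the 90/10 split was stratified, or how folds were assigned, so random control sampling, a class-stratified split, the ID ranges, and all function names here are assumptions. Model fitting and hyperparameter tuning are omitted.

```python
import random

def match_one_to_one(case_ids, control_pool, rng):
    """Draw one control per case at random (assumption: no matching covariates)."""
    controls = rng.sample(control_pool, k=len(case_ids))
    return [(i, 1) for i in case_ids] + [(i, 0) for i in controls]

def stratified_split(rows, test_fraction, rng):
    """Hold out test_fraction of each class (assumption: stratified split)."""
    train, test = [], []
    for label in (0, 1):
        group = [r for r in rows if r[1] == label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test

def k_folds(rows, k, rng):
    """Yield (fit, validate) pairs for k-fold cross-validation."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        fit = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield fit, folds[i]

rng = random.Random(0)
case_ids = list(range(1821))                 # EF within 48 hours
control_pool = list(range(10_000, 150_000))  # hypothetical non-EF episode IDs
rows = match_one_to_one(case_ids, control_pool, rng)
train, test = stratified_split(rows, 0.10, rng)
fold_sizes = [(len(fit), len(val)) for fit, val in k_folds(train, 5, rng)]
print(len(rows), len(train), len(test), fold_sizes)
```

In a full workflow, each candidate classifier would be fit on the `fit` portion and scored by AUC on the `validate` portion of every fold, and the classifier with the highest mean AUC would be evaluated once on `test`.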
Results: Among the 10 ML classifiers, the weighted LGBM achieved the highest mean AUC of 0.83 across 5-fold cross-validation. Of the 45 variables evaluated, 39 features were used for model training. When evaluated on the test set, the model achieved an AUC of 0.85 (95% CI: 0.81–0.89), precision of 0.75, recall of 0.87, and F1-score of 0.80.
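As a quick consistency check on the reported test-set metrics, the F1-score is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded values in the Results; the half-unit rounding bounds are an assumption about how the figures were reported, and the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# F1 from the rounded values reported in the abstract.
f1 = f1_score(0.75, 0.87)

# Range of F1 values compatible with precision and recall each having
# been rounded to two decimals (assumption about the reporting).
f1_low = f1_score(0.745, 0.865)
f1_high = f1_score(0.755, 0.875)

print(f"F1 from rounded inputs: {f1:.3f}")
print(f"compatible range: {f1_low:.3f} to {f1_high:.3f}")
```

The rounded inputs give an F1 of about 0.806, and the compatible range extends down to about 0.801, so the reported F1-score of 0.80 agrees with the reported precision and recall to within rounding.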
Conclusions: We developed and evaluated an ML model capable of predicting EF within 48 hours. The model showed strong performance across ICU datasets and identified high-risk patients while limiting false alarms. These results suggest that the model may outperform traditional clinician judgment, providing the basis for an implementable model to predict EF in both pediatric and adult ICUs. While our initial findings show promise for a weighted LGBM model for EF prediction, rigorous external validation on larger datasets is needed to evaluate the model's generalizability.