Introduction: Post-extubation failure (PEF)—reintubation within 48 h of extubation—affects 5 %–10 % of ICU patients and portends higher mortality, longer stays, and greater resource use. Accurate risk stratification remains elusive. We sought to develop and validate machine-learning (ML) models leveraging high-resolution temporal data from the MIMIC-IV Respiratory Support benchmark to predict PEF and elucidate key predictors.
Methods: In this retrospective cohort study, we identified 17 476 adult ICU patients in MIMIC-IV who underwent invasive mechanical ventilation followed by an extubation event. For each patient, we extracted hourly ventilator settings, vital signs, and laboratory values over the 12 h preceding the first extubation. Missing data were addressed via simple imputation, k-nearest neighbors (KNN), and multiple imputation by chained equations (MICE); KNN and MICE were compared using Kolmogorov–Smirnov tests and kernel density estimation. To mitigate class imbalance (5 % PEF), we applied SMOTE oversampling. Four classifiers (elastic-net logistic regression, Random Forest, XGBoost, and LightGBM) were trained on 80 % of the data, with 5-fold cross-validation for hyperparameter tuning, and evaluated on a held-out 20 % test set. Model discrimination was assessed by the area under the receiver-operating characteristic curve (AUC); calibration and learning curves were also examined. SHAP (SHapley Additive exPlanations) values were used to interpret model outputs.
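The oversampling step can be illustrated with a minimal sketch of the interpolation idea behind SMOTE. This is not the study's implementation: the function name `smote_oversample`, the toy feature values, and the choice to pick base samples at random are assumptions made for illustration, and the sketch assumes synthetic samples are generated from training-set PEF cases only.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    minority: list of equal-length feature vectors (lists of floats).
    Simplified variant: each synthetic point starts from a randomly chosen
    minority sample rather than cycling through all of them.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base sample.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority-class rows with two features (hypothetical values, not MIMIC-IV data).
pef_train = [[92.0, 110.0], [90.0, 118.0], [94.0, 105.0], [91.0, 121.0]]
synthetic_pef = smote_oversample(pef_train, n_new=6, k=2)
```

Each synthetic row lies on the line segment between a real minority sample and one of its nearest minority neighbours, so it stays within the range of the observed PEF cases. In practice a maintained library implementation would normally be preferred over a hand-written one.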
Results: PEF occurred in 4.97 % of patients. After SMOTE, the training set comprised 313 100 samples (50 % PEF). On the test set, logistic regression achieved AUC = 0.95; Random Forest, AUC = 0.98; LightGBM, AUC = 0.997; and XGBoost, AUC = 0.998. Learning curves plateaued at ~15 000 samples, indicating model stability. SHAP analysis identified length of stay, heart rate variability, and peripheral oxygen saturation as the strongest predictors of PEF.
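As a reminder of what the reported AUC values measure, the following sketch computes AUC from its pairwise definition: the probability that a randomly chosen PEF case receives a higher predicted score than a randomly chosen non-PEF case, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(y_true, y_score):
    """AUC via the pairwise (Mann-Whitney) definition; ties count as 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined unless both classes are present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels (1 = PEF) and predicted probabilities.
y_true = [0, 0, 0, 0, 1, 1]
y_score = [0.10, 0.35, 0.20, 0.80, 0.70, 0.90]
auc = roc_auc(y_true, y_score)
print(auc)  # 0.875: 7 of the 8 positive-negative pairs are ranked correctly
```

The double loop is quadratic in the number of samples and is meant only to make the definition concrete; a rank-based library routine would be the usual choice for a test set of realistic size.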
Conclusions: ML models, particularly gradient-boosted tree models, can accurately predict post-extubation failure using temporal ICU data. SHAP-derived insights highlight modifiable clinical parameters that may guide extubation readiness and resource allocation. External validation is warranted to confirm generalizability.