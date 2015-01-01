Abstract

BACKGROUND: Alcohol-related road-traffic injury (RTI) is the leading cause of premature death in low- and middle-income countries, including Thailand. Machine-learning algorithms can improve the effectiveness of driver-impairment screening strategies that are based on legal alcohol limits.



METHODS: Using secondary cross-sectional data on 4794 RTI drivers from the Thai Governmental Road Safety Evaluation project (2002-2004), four machine-learning models (Gradient Boosting Classifier, GBC; Multilayer Perceptron, MLP; Random Forest, RF; K-Nearest Neighbors, KNN) and a parsimonious logistic regression (Logit) were developed to predict the mortality risk from road-traffic injury in drunk drivers. The predictors included blood or breath alcohol concentration, driver characteristics, and environmental factors.



RESULTS: Of 4794 drivers in the derived dataset, 4365 (92%) survived and 429 (8%) died. The class imbalance was corrected to a 1:1 ratio with the Synthetic Minority Oversampling Technique (SMOTE). All models achieved good-to-excellent discrimination. The AUCs of the GBC, RF, KNN, MLP, and Logit models were 0.95 (95% CI 0.90 to 1.00), 0.92 (95% CI 0.87 to 0.97), 0.86 (95% CI 0.83 to 0.89), 0.83 (95% CI 0.78 to 0.88), and 0.81 (95% CI 0.75 to 0.87), respectively. The MLP and GBC models were also well calibrated, as shown by their calibration plots.
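The rebalancing step can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority case is placed at a random point on the line segment between a minority case and one of its nearest minority neighbours. This is a simplified illustration, not the study's implementation. The function name `smote_sketch`, the toy data, the choice of `k`, and the assumption that all features are numeric are hypothetical.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Simplified: a base point is drawn at random for each synthetic sample,
    and all features are assumed to be numeric (an assumption of this sketch).
    """
    rng = random.Random(seed)

    def sq_dist(a, b):
        # Squared Euclidean distance between two feature tuples.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sq_dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical toy data: 10 majority cases and 3 minority cases.
majority = [(float(i), float(i % 3)) for i in range(10)]
minority = [(0.0, 5.0), (1.0, 6.0), (2.0, 5.5)]

# Oversample the minority class until the classes reach a 1:1 ratio.
new_points = smote_sketch(minority, len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

Because every synthetic point is an interpolation of two real minority cases, it always lies inside the range spanned by the observed minority data. Oversampling of this kind is normally applied to the training portion only, so that evaluation data remain untouched.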



CONCLUSIONS: Our machine-learning models can predict road-traffic mortality risk with good model discrimination and calibration. External validation using current data is recommended for future implementation.
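For readers less familiar with the discrimination metric, the sketch below shows how an AUC and a 95% confidence interval of the kind reported above can be computed. The AUC is the probability that a randomly chosen fatal case receives a higher predicted risk than a randomly chosen survivor. The abstract does not state how the intervals were obtained, so the percentile bootstrap used here is an assumption, and the function names and toy data are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class: AUC is undefined there.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy data: 1 = died, 0 = survived, with predicted risks.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.35, 0.6, 0.7, 0.9]

point = auc(labels, scores)
low, high = bootstrap_ci(labels, scores)
```

The pairwise comparison is quadratic in the number of cases, which is acceptable for a sketch; a rank-based formula gives the same value more efficiently on large datasets.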

