
Journal Article

Citation

Tahfim SAS, Yan C. Safety (Basel) 2021; 7(2): e32.

Copyright

(Copyright © 2021, MDPI: Multidisciplinary Digital Publishing Institute)

DOI

10.3390/safety7020032

PMID

unavailable

Abstract

Unobserved heterogeneity in traffic crash data hides certain relationships between contributory factors and injury severity. Few studies have explored different clustering methods for analyzing injury severity in crashes involving large trucks. Additionally, the mix of data types in traffic crash data has rarely been addressed. This study explored the application of the k-prototypes clustering method to address the unobserved heterogeneity in large truck-involved crashes that occurred in the United States between 2016 and 2019. The study segmented the entire dataset (EDS) into three homogeneous clusters. Four gradient boosted decision trees (GBDT) models were developed, one on the EDS and one on each cluster, to predict injury severity in crashes involving large trucks. The input features included crash characteristics, truck characteristics, roadway attributes, time and location of the crash, and environmental factors. Each cluster-based GBDT model was compared with the EDS-based model. Two of the three cluster-based models showed significant improvement in predictive performance. Additionally, feature analysis using the SHAP (Shapley additive explanations) method identified a few new important features in each cluster and showed that some features have different degrees of effect on severe injuries in the individual clusters. The study concluded that the k-prototypes clustering-based GBDT model is a promising approach for revealing hidden insights, which can be used to improve safety measures, roadway conditions, and policies for the prevention of severe injuries in crashes involving large trucks.
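The clustering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a small pure-Python version of the general k-prototypes idea, which scores numeric fields by squared Euclidean distance and categorical fields by a weighted mismatch count. The toy records, the field meanings (a scaled speed limit, light condition, weather), the weight `gamma`, and the function names are all assumptions made for illustration. The later steps in the abstract (fitting a GBDT model per cluster and explaining it with SHAP) are not shown.

```python
from collections import Counter

def mixed_distance(record, prototype, num_idx, cat_idx, gamma):
    """k-prototypes dissimilarity: squared Euclidean distance on numeric
    fields plus gamma times the number of mismatched categorical fields."""
    numeric = sum((record[i] - prototype[i]) ** 2 for i in num_idx)
    mismatches = sum(record[i] != prototype[i] for i in cat_idx)
    return numeric + gamma * mismatches

def k_prototypes(records, init_prototypes, num_idx, cat_idx, gamma=1.0, max_iter=20):
    """Alternate between assigning records to the nearest prototype and
    updating prototypes (mean for numeric fields, mode for categorical)."""
    prototypes = [list(p) for p in init_prototypes]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(prototypes)),
                key=lambda k: mixed_distance(r, prototypes[k], num_idx, cat_idx, gamma))
            for r in records
        ]
        if new_labels == labels:
            break  # assignments are stable, so the procedure has converged
        labels = new_labels
        for k in range(len(prototypes)):
            members = [r for r, lab in zip(records, labels) if lab == k]
            if not members:
                continue  # keep the old prototype for an empty cluster
            for i in num_idx:
                prototypes[k][i] = sum(m[i] for m in members) / len(members)
            for i in cat_idx:
                prototypes[k][i] = Counter(m[i] for m in members).most_common(1)[0][0]
    return labels, prototypes

# Hypothetical toy records: (scaled speed limit, light condition, weather).
records = [
    (0.90, "dark", "rain"),
    (0.80, "dark", "clear"),
    (0.85, "dark", "rain"),
    (0.20, "daylight", "clear"),
    (0.30, "daylight", "clear"),
    (0.25, "daylight", "rain"),
]

labels, prototypes = k_prototypes(
    records,
    init_prototypes=[records[0], records[3]],  # assumed starting prototypes
    num_idx=[0],
    cat_idx=[1, 2],
    gamma=0.5,  # assumed weight on categorical mismatches
)
print(labels)
print(prototypes)
```

In practice, a maintained implementation and a principled choice of `gamma` and the number of clusters would be preferable; the sketch only shows why a mixed-type distance lets numeric and categorical crash attributes be clustered together.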


Language: en

Keywords

clustering; GBDT; heterogeneity; k-prototypes; large trucks; machine learning; severe injuries
