TY - JOUR
PY - 2019//
TI - Prediction and factor identification for crash severity: comparison of discrete choice and tree-based models
JO - Transportation research record
A1 - Wang, Xinyi
A1 - Kim, Sung Hoo
SP - 640
EP - 653
VL - 2673
IS - 9
N2 - Crash severity is one of the most widely studied topics in the traffic safety area. Scholars have studied crash severity through various types of models. Using the publicly available 2017 Maryland crash data from the Department of Maryland State Police, the authors develop a multinomial logit (MNL) model and a random forest (RF) model, which belong to discrete choice and tree-based models, respectively, to (1) identify factors contributing to crash severity and (2) compare prediction performance and interpretability between the two models. Based on the model results, major contributing factors of crash severity are identified, including collision type, occupant age, and speed limit. For the given dataset, RF has a higher prediction accuracy than MNL based on multiple measures (precision, recall, and F1 score), even though the differences are not dramatic. Sensitivity analysis results show that RF is less sensitive than MNL. RF can automatically capture the non-linear effects of continuous variables and reduce the influence of collinearity among explanatory variables. This study shows the possibility of conducting sensitivity analysis to enhance understanding of MNL and RF results, and uncovers unique characteristics of the discrete choice and tree-based models.


LA - en
SN - 0361-1981
UR - http://dx.doi.org/10.1177/0361198119844456
ID - ref1
ER -