Abstract

Road crashes cause significant traffic delays, which can bring unnecessary financial losses. The objective of this study is to predict the level of delay caused by crashes (LDC) and to discuss the significant risk factors. To ensure efficient and accurate prediction, an improved stacking model was developed using 2020 Texas crash data. The first layer integrates seven base classifiers, and the second layer tests three classifiers with different advantages. To improve and simplify the stacking model, three state-of-the-art methods--Bayesian hyperparameter optimization (BO), multiobjective feature selection (FS), and ensemble selection (ES)--were used. First, for each base classifier, the hyperparameters were tuned by BO and the smallest, most effective feature subset was selected by FS. Then ES, which considers both diversity and performance, selected the fewest base classifiers needed, reducing the input to the second layer. Finally, permutation feature importance was used to interpret the best stacking model. The results indicate that the stacking model achieves superior performance on four indicators: recall,