TY - JOUR
PY - 2021//
TI - Spatio-temporal crash prediction: effects of negative sampling on understanding network-level crash occurrence
JO - Transportation research record
A1 - Way, Peter
A1 - Roland, Jeremiah
A1 - Sartipi, Mina
A1 - Osman, Osama
SP - 225
EP - 234
VL - 2675
IS - 6
N2 - In projects centered around rare event case data, the challenge of data comprehension is greatly increased because of insufficient data for deriving insight and analysis. This is particularly the case with traffic crash occurrence, where positive events (crashes) are rare and, in most cases, no data set exists for negative events (non-crashes). One method to increase available data is negative sampling, which is the process of creating a negative event based on the absence of a positive event. In this work, four negative sampling techniques are presented with varying ratios of negative to positive data. These types of techniques are based on spatial data, temporal data, and a mixture of the two, with the data ratios acting as class balancing tools. The best performing model found was with a negative sampling technique that shifted temporal information and had an even 50/50 data split, with an F-1 score, a formulaic combination of precision and recall, of 93.68. These results are promising for Intelligent Transportation Systems (ITS) applications to inform of potential crash locations in an entire area for proactive measures to be put in place.
LA - en
SN - 0361-1981
UR - http://dx.doi.org/10.1177/0361198121991836
ID - ref1
ER -