Date: 2015-01-01

Abstract

Crash data are often characterized by numerous zero observations. In many cases, the number of zero observations depends directly on the spatial and/or temporal scales selected for data aggregation. Finding a balance in aggregation is therefore a critical task in data preparation. On the one hand, using disaggregated data may produce excessive zero observations, in which case the popular negative binomial model may not be adequate for safety analysis. On the other hand, too much aggregation may result in loss of information. This paper documents a simulation study aimed at determining criteria for deciding when data aggregation is needed. The simulation study explores the information loss due to aggregation, measured by the precision and accuracy with which model coefficients are estimated. The simulation results indicate that the reduction in variability, i.e., the coefficient of variation, of the independent variables after aggregation is an important criterion for deciding on the aggregation level.
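
The trade-off described above can be illustrated with a minimal sketch. It is not the simulation design used in this paper: the Poisson count process, the coefficient values `beta0` and `beta1`, the numbers of sites and periods, and the function names are all assumptions chosen only for illustration. The sketch generates period-level counts for a set of sites, aggregates them over time, and reports the share of zero observations and the coefficient of variation of the independent variable before and after aggregation.

```python
import math
import random
import statistics


def poisson(rng, lam):
    """Draw one Poisson variate with Knuth's multiplication method (suitable for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


def simulate(n_sites=500, n_periods=12, beta0=-2.0, beta1=0.5, seed=1):
    """Hypothetical setup: all parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    fine_x, fine_y, agg_x, agg_y = [], [], [], []
    for _ in range(n_sites):
        # Site-specific level of the independent variable plus period-to-period noise.
        site_level = rng.uniform(1.0, 3.0)
        xs = [site_level + rng.uniform(-1.0, 1.0) for _ in range(n_periods)]
        # Period-level crash counts with a log-linear mean.
        ys = [poisson(rng, math.exp(beta0 + beta1 * x)) for x in xs]
        fine_x.extend(xs)
        fine_y.extend(ys)
        # Temporal aggregation: average the covariate, sum the counts.
        agg_x.append(statistics.mean(xs))
        agg_y.append(sum(ys))
    return {
        "zero_share_fine": sum(y == 0 for y in fine_y) / len(fine_y),
        "zero_share_agg": sum(y == 0 for y in agg_y) / len(agg_y),
        "cv_x_fine": coefficient_of_variation(fine_x),
        "cv_x_agg": coefficient_of_variation(agg_x),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumed settings, aggregation lowers the share of zero counts but also lowers the coefficient of variation of the independent variable, which is the quantity the abstract identifies as relevant to the aggregation decision. The sketch does not fit a model, so it says nothing about the precision or accuracy of the estimated coefficients.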

Language: en