Abstract

An abundance of GPS data on walking and cycling requires substantial data processing, a primary component of which is trip identification: distinguishing data recorded during travel from data recorded during activities. The objective of this paper is to systematically evaluate trip identification algorithms in the literature and provide recommendations to improve performance for walking and cycling trips. Fourteen algorithms are applied to 1685 GPS trajectories from Vancouver, Canada, and evaluated on the basis of their agreement, distinction of trip and activity data characteristics, processing time, and accuracy (based on a labeled subset). Error sources are identified in relation to trajectory, network, and weather factors.

Results indicate poor concordance and widely varying performance, with no algorithm best across all measures. Four high-performing algorithms are identified with at least 90% record-level accuracy; other considerations include accuracy of the inferred number and duration of trips, precision of identified trip end points, and computational resource requirements. Density is a key variable for trip identification in the best-performing algorithms. Proximity to tree canopy, buildings, bridges, and tunnels affects the accuracy of some algorithms more than others. Most algorithms err almost entirely in one direction, producing either false trips or false activities, which is a bias of concern for analysis. The importance of trip identification decisions should motivate more thorough reporting to enhance reproducibility and reliability.

