TY - JOUR
PY - 2021//
TI - Inferring Twitters' socio-demographics to correct sampling bias of social media data for augmenting travel behavior analysis
JO - Journal of big data analytics in transportation
A1 - Cui, Yu
A1 - He, Qing
SP - 159
EP - 174
VL - 3
IS - 2
N2 - Many studies demonstrated that social media data, especially Twitter data, have significant potentials to develop models for estimating travel demand, managing operation, and conducting long-term planning purposes. However, it is well known that research with social media data is facing a looming challenge in sampling bias. The Twitter user's population has huge discrepancies compared with the overall population. Therefore, social media data, when it is directly used for travel behavior analysis, contains biases and errors to some degree. The objective of this study is to correct sampling bias of Twitter data for travel behavior analysis by inferring Twitter users' socio-demographics. This study first links travelers' Twitter account with their Facebook account, and verifies their socio-demographics by Facebook data, assuming that one's Facebook information is real. Second, several models are proposed for predicting socio-demographics, including gender, age, ethnicity, and education levels. Afterward, this paper resamples social media data and compares it to the 2009 California Household Travel Survey data. The resampled data show comparable characteristics to the survey data. This research shed light on tackling sampling bias issues when social media data are incorporated for augmenting travel behavior analysis and urban planning.
LA - en
SN - 2523-3556
UR - http://dx.doi.org/10.1007/s42421-021-00037-0
ID - ref1
ER -