Detecting binge drinking and alcohol-related risky behaviours from Twitter's users: an exploratory content- and topology-based analysis

Crocamo, Cristina; Viviani, Marco; Bartoli, Francesco; Carrà, Giuseppe; Pasi, Gabriella

doi:10.3390/ijerph17051510

Journal Article

Citation	Crocamo C, Viviani M, Bartoli F, Carrà G, Pasi G. Int. J. Environ. Res. Public Health 2020; 17(5): e1510.
Affiliation	Department of Informatics, Systems, and Communication, University of Milano-Bicocca, 20126 Milan, Italy.
Copyright	(Copyright © 2020, MDPI: Multidisciplinary Digital Publishing Institute)
DOI	10.3390/ijerph17051510
PMID	32111047
Abstract	Binge Drinking (BD) is a common risky behaviour that people rarely report to healthcare professionals, although personal communications about alcohol-related behaviours are often found on social media. Following a data-driven approach focused on User-Generated Content, we aimed to detect potential binge drinkers by investigating their language and shared topics. First, we gathered Twitter threads mentioning BD and alcohol-related behaviours, using unequivocal keywords identified by experts from previous evidence on BD. Subsequently, a random sample of the gathered tweets was manually labelled, and two supervised learning classifiers were trained on both linguistic and metadata features to distinguish tweets of genuine unique users from those of media, bot, and commercial accounts. Based on this classification, approximately 55% of the 1 million collected alcohol-related tweets were automatically identified as belonging to non-genuine users. A third classifier was then trained on a subset of manually labelled tweets, drawn from those previously identified as belonging to genuine accounts, to automatically identify potential binge drinkers based only on linguistic features. On average, users classified as binge drinkers were quite similar to the standard genuine Twitter users in our sample. Nonetheless, the analysis of social media content from genuine users reporting risky behaviours remains a promising source for informed preventive programs.
Language	en
Keywords	binge drinking; data science; risky health behaviour; social media analytics; supervised machine learning; user-generated content; vulnerability