https://doi.org/10.1140/epjds/s13688-023-00401-w
Regular Article
Design and analysis of tweet-based election models for the 2021 Mexican legislative election
1 Niels Bohr International Academy, Niels Bohr Institute, Blegdamsvej 17, DK-2100, Copenhagen, Denmark
2 The Aspen Institute México, Av. Cd. Universitaria 298, Jardines del Pedregal, Álvaro Obregón, 01900, Mexico City, Mexico
3 Max-Planck-Institut für Astrophysik, Karl-Schwarzschild-Str. 1, D-85748, Garching, Germany
4 Ciencia de Datos & Tecnología, Metrics, Cd. Satélite, Naucalpan de Juárez, Juan Escutia 7, 53100, Estado de México, Mexico
5 Facultad de Ciencias, Universidad Nacional Autónoma de México, Investigación Científica, Ciudad Universitaria, Coyoacan, 04510, Mexico City, Mexico
6 Facultad de Negocios, Universidad La Salle México, Benjamín Franklin 45 Col. Condesa, Del. Cuauhtémoc, 06140, Mexico City, Mexico
7 Department of Mathematics, Imperial College London, London, United Kingdom
Received: 3 January 2023
Accepted: 21 June 2023
Published online: 7 July 2023
Modelling and forecasting real-life human behaviour using online social media is an active endeavour of interest in politics, government, academia, and industry. Since its creation in 2006, Twitter has been proposed as a potential laboratory for gauging and predicting social behaviour. Over the last decade, the user base of Twitter has grown and become more representative of the general population. Here we analyse this user base in the context of the 2021 Mexican Legislative Election. To do so, we use a dataset of 15 million election-related tweets posted in the six months preceding election day. We explore different election models that assign political preference to either the ruling parties or the opposition. We find that models using data with geographical attributes determine the results of the election with better precision and accuracy than conventional polling methods. These results demonstrate that analysis of public online data can outperform conventional polling methods, and that political analysis and general forecasting would likely benefit from incorporating such data in the immediate future. Moreover, the same Twitter dataset with geographical attributes is positively correlated with official census data on population and internet usage in Mexico. These findings suggest that we have reached a point in time when online activity, appropriately curated, can provide an accurate representation of offline behaviour.
Key words: Social media / Elections / Polling / Twitter
© The Author(s) 2023
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.