https://doi.org/10.1140/epjds/s13688-018-0163-7
Regular article
Big data would not lie: prediction of the 2016 Taiwan election via online heterogeneous information
1 Beijing Key Laboratory of Emergency Support Simulation Technologies for City Operations, School of Economics and Management, Beihang University, Beijing, China
2 School of Economics and Management, Beihang University, Beijing, China
3 Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, Beijing, China
4 Foster School of Business, University of Washington, Seattle, USA
* e-mail: liugn@buaa.edu.cn
Received: 19 April 2018
Accepted: 31 August 2018
Published online: 12 September 2018
The prevalence of online media has attracted researchers from various domains to explore human behavior and make predictions from it. In this research, we leverage heterogeneous data collected from various online platforms to predict Taiwan’s 2016 general election. In contrast to most existing research, we take a “signal” view of heterogeneous information and adopt the Kalman filter to fuse multiple signals into daily vote predictions for the candidates. We also quantify the effect of events that influenced the election, using the event study model that originated in financial research. We obtained the following findings. First, public opinion in online media outperforms traditional polls in predicting the Taiwan election, in terms of both predictive power and timeliness; offline polls, however, still help alleviate the sample bias of online opinions. Second, although online signals converge as election day approaches, the simple Facebook “Like” is consistently the strongest indicator of the election result. Third, most influential events are strongly connected to cross-strait relations, and the Chou Tzu-yu flag incident, followed by the apology video one day before the election, increased the vote share of Tsai Ing-Wen by 3.66%. This research supports the predictive power of online media in politics and the advantages of information fusion. The combined use of the Kalman filter and the event study method contributes to the data-driven political analytics paradigm for both prediction and attribution purposes.
Key words: Election prediction / Heterogeneous data / Kalman filter / Event study method / Big data
© The Author(s), 2018
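The signal-fusion idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a scalar random-walk state (a candidate's latent vote share) and independent measurement noise per signal, so each day's signals can be folded in one at a time with the standard scalar Kalman update. The function name `fuse_signals`, the parameter names, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence


def fuse_signals(
    observations: Sequence[Sequence[Optional[float]]],
    noise_vars: Sequence[float],
    process_var: float = 1e-4,
    x0: float = 0.5,
    p0: float = 1.0,
) -> List[float]:
    """Fuse several noisy daily signals into one daily estimate.

    observations[t][i] is signal i on day t (None if missing).
    noise_vars[i] is the assumed measurement-noise variance of signal i.
    Assumption: the latent vote share follows a random walk with
    variance process_var per day (an illustrative choice).
    """
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for day in observations:
        # Predict step: a random walk keeps x and inflates the uncertainty.
        p = p + process_var
        # Update step: apply each available signal sequentially.
        for z, r in zip(day, noise_vars):
            if z is None:
                continue  # skip missing signals
            k = p / (p + r)      # Kalman gain
            x = x + k * (z - x)  # move the estimate toward the observation
            p = (1.0 - k) * p    # shrink the variance after the update
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    # Hypothetical data: two signals (e.g. a "Like" share and a poll share).
    days = [[0.55, 0.52], [0.57, None], [0.56, 0.54]]
    print(fuse_signals(days, noise_vars=[0.01, 0.04]))
```

In this sketch, a signal with a smaller assumed noise variance receives a larger gain and therefore pulls the fused estimate more strongly, which is one simple way to express that some signals are stronger indicators than others.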