https://doi.org/10.1140/epjds/s13688-025-00544-y
Research
Unsupervised detection of coordinated information operations in the wild
1 School of Mathematical and Statistical Sciences, Clemson University, 220 Parkway Drive, 29634, Clemson, SC, USA
2 Research Computing and Data, Clemson University, 405 S. Palmetto Blvd., 29634, Clemson, SC, USA
3 Watt Family Innovation Center, Clemson University, 405 S. Palmetto Blvd., 29634, Clemson, SC, USA
4 John E. Walker Dept. of Economics, Clemson University, 309F Wilbur O. and Ann Powers College of Business, 29634, Clemson, SC, USA
a patrick.lee.warren@gmail.com
Received: 28 April 2024
Accepted: 20 March 2025
Published online: 27 March 2025
This paper introduces and tests an unsupervised method for detecting novel coordinated inauthentic information operations (CIOs) in realistic settings. This method uses Bayesian inference to identify groups of accounts that share similar account-level characteristics and target similar narratives. We solve the inferential problem using amortized variational inference, allowing us to efficiently infer group identities for millions of accounts. We validate this method using a set of five CIOs from three countries discussing four topics on Twitter. We demonstrate that our unsupervised approach detects CIO accounts much better than two existing unsupervised CIO detection methods across the four topics that we consider. Our approach increases detection power (area under the precision-recall curve) relative to a naive baseline (by a factor of 76 to 580) and relative to the use of simple flags or narratives on their own (by a factor of 1.3 to 4.8), and it comes quite close to a supervised benchmark. Our method is robust to observing only a small share of messaging on the topic, to having only weak markers of inauthenticity, and to the CIO accounts making up a tiny share of messages and accounts on the topic. Although we evaluate the results on Twitter, the method is general enough to be applied in many social-media settings.
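To make the detection-power comparison concrete, the following minimal sketch computes the area under the precision-recall curve using the step-wise average-precision estimate and expresses it as a multiple of a naive baseline. It is illustrative only: the function names `average_precision` and `lift_over_naive` are hypothetical, and it assumes the naive baseline is a random ranking, whose expected area is approximately the share of CIO accounts. The paper's own baseline and estimator may differ.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.

    labels: 1 for a known CIO account, 0 otherwise.
    scores: higher means more suspicious. Ties are broken by input order.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank accounts from most to least suspicious.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each positive
    return total / n_pos


def lift_over_naive(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Ratio of average precision to the prevalence of positives.

    Assumption: the naive baseline is a random ranking, whose expected
    average precision is approximately the positive share.
    """
    prevalence = sum(labels) / len(labels)
    return average_precision(labels, scores) / prevalence


if __name__ == "__main__":
    # Toy data, not from the paper: 2 positives among 10 accounts.
    toy_labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
    toy_scores = [0.9, 0.8, 0.1, 0.2, 0.7, 0.3, 0.05, 0.15, 0.25, 0.12]
    print(average_precision(toy_labels, toy_scores))
    print(lift_over_naive(toy_labels, toy_scores))
```

Because the baseline scales with prevalence, large multiples such as those reported above are possible when CIO accounts are a very small share of all accounts on a topic.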
Key words: Social media / Coordinated information operations / Unsupervised detection / Bayesian modeling
Supplementary Information The online version contains supplementary material available at https://doi.org/10.1140/epjds/s13688-025-00544-y.
© The Author(s) 2025
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.