https://doi.org/10.1140/epjds/s13688-025-00530-4
Research
Demographic disparity in Wikipedia coverage: a global perspective
1 School of Information, University of Michigan, 48109, Ann Arbor, MI, USA
2 Department of Electrical Engineering and Computer Science, University of Michigan, 48109, Ann Arbor, MI, USA
3 Center for the Study of Complex Systems, University of Michigan, 48109, Ann Arbor, MI, USA
Received: 25 April 2024
Accepted: 3 February 2025
Published online: 21 February 2025
Despite decades-long efforts to increase diversity, underrepresented social groups remain small minorities in many fields. Here, we ask whether disparities in global recognition exist for traditionally underrepresented demographic groups. We investigate whether a notable person’s demographic attributes are associated with their global recognition, considering both the global availability of public information about the person’s life and the consistency of that information. To track biographical information about notable people, we study Wikipedia, one of the most accessible knowledge bases on the Web. Using more than 1 million biographical articles spanning ten years and the 12 largest language editions of Wikipedia, we study global gender and citizenship disparities in Wikipedia coverage. We measure global coverage in several ways, including the number of languages in which a person appears, the length of a person’s articles, and the global consensus about the person, which measures content similarity among the person’s articles across languages. We find that while females have been broadly well represented in terms of coverage in multiple languages since 2015, disparities in the quantity of article content and in global consensus persist consistently from 2010 to 2020. Additionally, some traditionally underrepresented nationalities are still covered less than their majority counterparts. Finally, in one specific domain, the global appearance of Olympic medal winners, we observe an average improvement in coverage alongside a persistent gender disparity.
Key words: Gender disparity / Global disparity / Wikipedia coverage / Computational social science
© The Author(s) 2025
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.