https://doi.org/10.1140/epjds/s13688-026-00647-0
Research
Remembering unequally: global and disciplinary bias in LLM reconstruction of scholarly coauthor lists
1 School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran
2 Computing and Software Systems, University of Washington, Bothell, WA, USA
Received: 2 November 2025
Accepted: 16 March 2026
Published online: 20 April 2026
Abstract
Ongoing breakthroughs in large language models (LLMs) are reshaping scholarly search and discovery interfaces. While these systems offer new possibilities for navigating scientific knowledge, they also raise concerns about fairness and representational bias rooted in the models’ memorized training data. As LLMs are increasingly used to answer queries about researchers and research communities, their ability to accurately reconstruct scholarly coauthor lists becomes an important but underexamined issue. In this study, we investigate how memorization in LLMs affects the reconstruction of coauthor lists and whether this process reflects existing inequalities across academic disciplines and world regions. We evaluate three prominent models—DeepSeek R1, Llama 4 Scout, and Mixtral 8×7B—by comparing their generated coauthor lists against bibliographic reference data. Our analysis reveals a systematic advantage for highly cited researchers, indicating that LLM memorization disproportionately favors already visible scholars. However, this pattern is not uniform: certain disciplines, such as Clinical Medicine, and some regions, including parts of Africa, exhibit more balanced reconstruction outcomes. These findings highlight both the risks and limitations of relying on LLM-generated relational knowledge in scholarly discovery contexts and emphasize the need for careful auditing of memorization-driven biases in LLM-based systems.
Key words: Large language models / LLM memorization / Disciplinary and regional bias / Coauthor list reconstruction / Fairness and inclusion in scholarly discovery
Handling Editor: Alexander Gates
© The Author(s) 2026
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

