https://doi.org/10.1140/epjds/s13688-025-00550-0
Research
Uncovering large inconsistencies between machine learning derived gridded settlement datasets
1 Department of Computer Science, IT University of Copenhagen, Rued Langgaards Vej 7, DK-2300, Copenhagen, Denmark
2 Pioneer centre for Artificial Intelligence (P1), Copenhagen, Denmark
3 United Nations Children’s Fund, New York, NY, USA
Received: 18 September 2024
Accepted: 3 April 2025
Published online: 26 August 2025
High-resolution human settlement maps provide detailed delineations of where people live and are vital for scientific and practical purposes, such as rapid disaster response, allocation of humanitarian resources, and international development. The increased availability of high-resolution satellite imagery, combined with powerful techniques from machine learning and artificial intelligence (AI), has spurred the creation of a wealth of settlement datasets. The agreement and alignment between these datasets have not been studied in detail. We compare three settlement maps developed by Google (Open Buildings), Meta (High Resolution Population Density Maps) and Microsoft (Global Building Footprints), and uncover which factors drive mismatch. Our study focuses on 44 African countries. We build a global machine learning model to predict where datasets agree, and find that geographic and socio-economic factors considerably impact overlap. However, we also find great variability across countries, suggesting complex interactions between country morphology and dataset overlap. It is vital to understand the shortcomings of AI-derived settlement layers, as international organizations, governments, and NGOs are already experimenting with incorporating these into programmatic work. We anticipate that our work will be a starting point for more critical and detailed analyses of AI-derived datasets for humanitarian, policy, and scientific purposes.
Key words: Machine learning / Remote sensing / Population estimates / Human settlements
Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1140/epjds/s13688-025-00550-0.
© The Author(s) 2025
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

