https://doi.org/10.1140/epjds/s13688-025-00590-6
Research
Exploring the relationship between cancer incidence and the sustainable development goals through complex networks and machine learning
1 Dipartimento Interateneo di Fisica, Università degli Studi di Bari Aldo Moro, 70125, Bari, Italy
2 Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125, Bari, Italy
3 Dipartimento di Biomedicina Traslazionale e Neuroscienze (DiBraiN), Università degli Studi di Bari Aldo Moro, 70124, Bari, Italy
4 Department of Network and Data Science, Central European University, A-1100, Vienna, Austria
a loredana.bellantuono@uniba.it
Received: 11 May 2025
Accepted: 28 September 2025
Published online: 28 October 2025
It is increasingly recognized that socioeconomic factors play a significant role in the onset of cancer; however, a clear understanding of which factors are most influential is still lacking. In this study, we explore the relationship between cancer incidence, as recorded by the International Agency for Research on Cancer, and the environmental and socioeconomic well-being of countries, as measured by the Sustainable Development Goals (SDGs) indicators. To identify relevant predictors of cancer incidence, we construct a weighted complex network in which nodes represent SDG indicators and links correspond to statistically significant correlations between them. We apply community detection to identify a subset of indicators that captures the non-redundant information in the dataset, and use the selected features to predict cancer incidence rates with machine learning. Furthermore, we highlight the most influential SDG indicators by means of an eXplainable Artificial Intelligence (XAI) analysis. We find that health-related indicators are not the only ones that play a key role in explaining cancer incidence: factors related to agriculture, resource availability, and water cleanliness matter as well. These findings provide insights into the complex interplay between socioeconomic, environmental, and health factors. This study aims to expand knowledge of non-intuitive associations related to cancer onset and may contribute to the development of effective public prevention policies.
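The network-based feature selection summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the Pearson correlation test, the significance level `alpha`, the greedy modularity algorithm, the rule of keeping the highest-strength node in each community, and the synthetic indicator names are all assumptions made for illustration only.

```python
import numpy as np
import networkx as nx
from scipy.stats import pearsonr
from networkx.algorithms.community import greedy_modularity_communities


def correlation_network(X, names, alpha=0.05):
    """Weighted graph: nodes are indicators, links are significant correlations.

    Assumption: Pearson correlation with an uncorrected p-value threshold.
    """
    G = nx.Graph()
    G.add_nodes_from(names)
    for i in range(X.shape[1]):
        for j in range(i + 1, X.shape[1]):
            r, p = pearsonr(X[:, i], X[:, j])
            if p < alpha:
                G.add_edge(names[i], names[j], weight=abs(r))
    return G


def select_representatives(G):
    """Keep one indicator per community (assumed rule: highest weighted degree)."""
    communities = greedy_modularity_communities(G, weight="weight")
    strength = dict(G.degree(weight="weight"))
    # Ties are broken by name so the choice is reproducible.
    return sorted(max(c, key=lambda n: (strength[n], n)) for c in communities)


# Synthetic stand-in for country-level indicator data (hypothetical names):
# two latent factors, each driving three noisy indicators.
rng = np.random.default_rng(0)
n_countries = 200
factor_a = rng.normal(size=n_countries)
factor_b = rng.normal(size=n_countries)
names = ["a1", "a2", "a3", "b1", "b2", "b3"]
X = np.column_stack(
    [factor_a + 0.3 * rng.normal(size=n_countries) for _ in range(3)]
    + [factor_b + 0.3 * rng.normal(size=n_countries) for _ in range(3)]
)

G = correlation_network(X, names)
selected = select_representatives(G)
print("edges:", G.number_of_edges(), "selected indicators:", selected)
```

The selected columns could then be passed to any regression model and inspected with a feature-attribution method; which model and which XAI technique to use are likewise left open in this sketch.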
Key words: Network science / Machine learning / SDG / Cancer incidence / XAI
Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1140/epjds/s13688-025-00590-6.
© The Author(s) 2025
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

