Measuring the effect of node aggregation on community detection

Yérali Gandica; Adeline Decuyper; Christophe Cloquet; Isabelle Thomas; Jean-Charles Delvenne

doi:10.1140/epjds/s13688-020-00223-0

EPJ


Yérali Gandica¹,²*, Adeline Decuyper¹, Christophe Cloquet¹,²,³, Isabelle Thomas¹ and Jean-Charles Delvenne¹,²

¹ Center for Operations Research and Econometrics, Université catholique de Louvain, Louvain-la-Neuve, Belgium
² Institute of Information and Communication Technologies, Electronics and Applied Mathematics, Université catholique de Louvain, Louvain-la-Neuve, Belgium
³ Poppy, Jette, Belgium

Received: 3 June 2019
Accepted: 25 February 2020
Published online: 11 March 2020

Abstract

The nodes of a complex network are often aggregated, whether deliberately or not, because of technical, ethical, or legal limitations, or for privacy reasons. A common example is geographic position: one may uncover communities in a network of places, or of individuals identified with their typical geographical position, and then aggregate these places into larger entities, such as municipalities, thus obtaining another network. The communities found in the networks obtained at various levels of aggregation may exhibit various degrees of similarity, from full alignment to perfect independence. This is akin to the problem of ecological and atomic fallacies in statistics, or to the Modifiable Areal Unit Problem in geography.
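The aggregation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the paper: the network is assumed to be undirected and weighted, the function name `aggregate_network`, the place labels (`p1` to `p4`) and the municipality labels (`A`, `B`) are hypothetical, and edge weights between aggregated units are simply summed.

```python
from collections import defaultdict


def aggregate_network(edges, membership):
    """Build the aggregated network by summing edge weights between units.

    edges: iterable of (source, target, weight) between fine-level nodes.
    membership: dict mapping each fine-level node to its aggregate unit.
    Both names are hypothetical and used only for this illustration.
    """
    aggregated = defaultdict(float)
    for source, target, weight in edges:
        a, b = membership[source], membership[target]
        # Assumption: the network is undirected, so each pair of units is
        # stored under a canonical (sorted) key. Pairs (a, a) are self-loops
        # that collect the weight internal to one unit.
        key = (a, b) if a <= b else (b, a)
        aggregated[key] += weight
    return dict(aggregated)


# Hypothetical toy data: four places grouped into two municipalities.
edges = [
    ("p1", "p2", 3.0),
    ("p2", "p3", 1.0),
    ("p3", "p4", 2.0),
    ("p1", "p4", 0.5),
]
membership = {"p1": "A", "p2": "A", "p3": "B", "p4": "B"}

coarse = aggregate_network(edges, membership)
```

Community detection can then be run separately on the fine-level edge list and on `coarse`, and the two resulting partitions compared; how closely they agree is the question the abstract raises.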

We identify the class of community detection algorithms most suitable to cope with node aggregation, and develop an index for aggregability, capturing the extent to which the aggregation preserves the community structure. We illustrate its relevance on real-world examples (mobile phone and Twitter reply-to networks). Our main message is that any node-partitioning analysis performed on aggregated networks should be interpreted with caution, as the outcome may be strongly influenced by the level of aggregation.

Key words: Community detection / Data aggregation / Twitter data / Phone call data
