Uncovering nodes that spread information between communities in social networks
Department of Mathematics and Statistics, University of Strathclyde, 26 Richmond Street, Glasgow, G1 1XH, UK
* e-mail: firstname.lastname@example.org
Accepted: 17 September 2014
Published online: 22 October 2014
From many datasets gathered in online social networks, well defined community structures have been observed. A large number of users participate in these networks and the size of the resulting graphs poses computational challenges. There is a particular demand in identifying the nodes responsible for information flow between communities; for example, in temporal Twitter networks edges between communities play a key role in propagating spikes of activity when the connectivity between communities is sparse and few edges exist between different clusters of nodes. The new algorithm proposed here is aimed at revealing these key connections by measuring a node’s vicinity to nodes of another community. We look at the nodes which have edges in more than one community and the locality of nodes around them which influence the information received and broadcasted to them. The method relies on independent random walks of a chosen fixed number of steps, originating from nodes with edges in more than one community. For the large networks that we have in mind, existing measures such as betweenness centrality are difficult to compute, even with recent methods that approximate the large number of operations required. We therefore design an algorithm that scales up to the demand of current big data requirements and has the ability to harness parallel processing capabilities. The new algorithm is illustrated on synthetic data, where results can be judged carefully, and also on a real, large scale Twitter activity data, where new insights can be gained.
Key words: social network analysis / community analysis / Twitter / viral content / community connectivity / betweenness / information diffusion
© The Author(s), 2014