Computer Networks

Volume 51, Issue 16, 14 November 2007, Pages 4586-4595
A stochastic evolutionary growth model for social networks

https://doi.org/10.1016/j.comnet.2007.06.013

Abstract

We present a stochastic model for a social network, where new actors may join the network, existing actors may become inactive and, at a later stage, reactivate themselves. Our model captures the evolution of the network, assuming that actors attain new relations or become active according to the preferential attachment rule. We derive the mean-field equations for this stochastic model and show that, asymptotically, the distribution of actors obeys a power-law distribution. In particular, the model applies to social networks such as wireless local area networks, where users connect to access points, and peer-to-peer networks where users connect to each other. As a proof of concept, we demonstrate the validity of our model empirically by analysing a public log containing traces from a wireless network at Dartmouth College over a period of three years. Analysing the data processed according to our model, we demonstrate that the distribution of user accesses is asymptotically a power-law distribution.

Introduction

We present a stochastic model for a social network [23], where new actors may join the network, existing actors may become inactive and, at a later stage, may reactivate themselves. Our model captures the evolution of the network, assuming that actors attain new relations or become active according to the preferential attachment rule. The concept of preferential attachment, originating from [19], has become a common theme in stochastic models of networks [2], [16]. This behaviour often results in the “rich get richer” phenomenon, for example, where new relations to existing actors are formed in proportion to the number of relations those actors currently have.

The model presented incorporates the novel aspect of differentiating between active and inactive actors, and of allowing an actor’s status to change between active and inactive over time. This type of network dynamics is especially relevant to situations where actors may connect/disconnect or login/logout from the network, in particular when registration is required before an actor first connects to the network. The network models proposed so far either assume that all actors are active, or that when actors leave the network they do not rejoin it [3].

By deriving the mean-field equations for this model of a social network, we obtain the result that, asymptotically, the distribution of actors obeys a power law. Power-law distributions taking the form

$$f(i) = C\, i^{-\phi},$$

where $C$ and $\phi$ are positive constants, are abundant in nature [22]. The constant $\phi$ is called the exponent of the distribution. Examples of such distributions are: Zipf’s law, which states that the relative frequency of a word in a text is inversely proportional to its rank; Pareto’s law, which states that the number of people whose personal income is above a certain level follows a power-law distribution with an exponent between 1.5 and 2 (Pareto’s law is also known as the 80:20 law, stating that about 20% of the population earn 80% of the income); and Lotka’s law, which states that the number of authors publishing a prescribed number of papers is inversely proportional to the square of the number of publications.
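As an illustration of the power-law form, taking logarithms gives log f(i) = log C − ϕ log i, so a power law appears as a straight line with slope −ϕ on log-log axes. The following minimal Python sketch generates exact power-law values and recovers C and ϕ by least squares on the logarithms. The function names `power_law` and `fit_exponent` are hypothetical and introduced only for this example; the regression is a simple illustrative device rather than the fitting procedure used in the paper, and it can be unreliable on noisy empirical data.

```python
import math

def power_law(i, C, phi):
    """Evaluate f(i) = C * i**(-phi)."""
    return C * i ** (-phi)

def fit_exponent(xs, ys):
    """Least-squares line through (log x, log y); returns (C, phi).

    Illustrative only: assumes all xs and ys are strictly positive.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((x - mx) ** 2 for x in lx)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    slope = sxy / sxx
    intercept = my - slope * mx
    # The slope on log-log axes is -phi; the intercept is log C.
    return math.exp(intercept), -slope

# Lotka's law corresponds to an exponent of 2.
xs = list(range(1, 51))
ys = [power_law(i, 3.0, 2.0) for i in xs]
C_hat, phi_hat = fit_exponent(xs, ys)
```

On exact data such as this, the fitted values agree with the generating constants up to floating-point error.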

Recently, several researchers have detected power-law distributions in the topology of several networks such as the World-Wide-Web [4], e-mail networks [5], collaboration networks [8], [10] and peer-to-peer networks [20].

There are several examples of networks that can be modelled within our formalism. One example is that of a wireless network [12], where mobile users having, e.g., a laptop, PDA or mobile phone, connect to access points within a defined region (e.g., campus, building or airport). In this case the actors are the users and the relations are between users and access points. The user is active during a connection and inactive otherwise. Another example is that of a peer-to-peer network [17], where users (referred to as peers) connect to other peers in order to exchange information. Peer-to-peer networks are of prime importance to the future of the Internet, as networks such as Bittorrent [18], Kazaa [15] and Skype [11] are becoming increasingly popular and thus account for a sizeable amount of all Internet traffic.

Our stochastic model is based on the transfer of balls (representing actors) between urns (representing actor states), where we distinguish between active balls in regular, unstarred urns and inactive balls in starred urns. The relationships of a particular actor are represented as pins attached to the corresponding ball.

We note that our urn model is an extension of the stochastic model proposed by Simon in his visionary paper published in 1955 [24], which was couched in terms of word frequencies in a text. Previously, in [8], we considered an alternative extension of Simon’s model by adding a preferential mechanism for discarding balls from urns resulting in an exponential cutoff in the power-law distribution.

In the model we present here, at each step of the stochastic process, with probability p, one of two events happens: either a new active ball is added to the first unstarred urn with probability r, or with probability 1 − r an inactive ball is selected preferentially from a starred urn and is activated by moving it to the corresponding unstarred urn. Alternatively, with probability 1 − p, an active ball is selected preferentially from an unstarred urn and then one of two further events happens: it is either moved along to the next unstarred urn with probability q, or with probability 1 − q the selected ball becomes inactive by moving it to the corresponding starred urn. We assume that a ball in the ith urn has i pins attached to it (which represents an actor having i relations). Our main result is that the steady-state distribution of this model is an asymptotic power law, and, moreover, as a proof of concept we demonstrate the validity of our model by analysing a large data set from a real wireless network.
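The step described above can be sketched as a short simulation. The following Python code is a minimal illustration under stated assumptions, not the authors' implementation: the names `simulate` and `pick_urn` are hypothetical, the process is assumed to start with a single active ball in the first urn, and a step is assumed to do nothing when the required kind of ball (active or inactive) does not exist. Preferential selection is taken to mean that a ball is chosen with probability proportional to its number of pins, i.e. urn i is chosen with weight i times the number of balls it holds.

```python
import random
from collections import Counter

def pick_urn(urns, rng):
    """Return urn index i with probability proportional to i * urns[i].

    Returns None if all urns are empty.
    """
    indices = [i for i, n in urns.items() if n > 0]
    if not indices:
        return None
    weights = [i * urns[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]

def simulate(steps, p, r, q, seed=0):
    """Run the urn process; returns (unstarred, starred) ball counts per urn."""
    rng = random.Random(seed)
    active = Counter({1: 1})   # assumption: one active ball in urn 1 at the start
    inactive = Counter()       # starred urns start empty
    for _ in range(steps):
        if rng.random() < p:
            if rng.random() < r:
                active[1] += 1                 # new active ball joins urn 1
            else:
                i = pick_urn(inactive, rng)    # reactivate an inactive ball
                if i is not None:              # assumption: otherwise no-op
                    inactive[i] -= 1
                    active[i] += 1
        else:
            i = pick_urn(active, rng)          # choose an active ball
            if i is not None:                  # assumption: otherwise no-op
                active[i] -= 1
                if rng.random() < q:
                    active[i + 1] += 1         # gains a pin, moves to urn i+1
                else:
                    inactive[i] += 1           # becomes inactive in starred urn i
    return active, inactive

unstarred, starred = simulate(10_000, p=0.3, r=0.5, q=0.9, seed=1)
```

The parameter values in the last line are arbitrary and chosen only to exercise all four branches of the process.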

The rest of the paper is organised as follows. In Section 2 we present an urn transfer model allowing balls to be active or inactive by moving from starred urns to unstarred urns and vice versa. We then derive in Section 3 the steady-state distribution of the model, which, as stated earlier, follows an asymptotic power-law distribution. In Section 4 we show how we can fit the parameters of the model to data, and in Section 5 we demonstrate how our model can provide an explanation of the empirical distributions found in wireless networks. Finally, in Section 6 we give our concluding remarks.

An urn transfer model

We now present an urn transfer model for a stochastic process that emulates the situation where balls (which might represent actors) become inactive with a small probability, and can later become active again with some probability. We assume that a ball in the ith urn has i pins attached to it (which might represent the actors’ relations). The model is an extension of our previous model of exponential cutoff [6], where balls are discarded with a small probability.

We assume a countable number of

Derivation of the steady-state distribution

Following Simon [24], we now state the mean-field equations for the urn transfer model. For $i > 1$ we have

$$E_k(F_i(k+1)) = F_i(k) + \beta_k \bigl( q\,(i-1)\,F_{i-1}(k) - i\,F_i(k) \bigr) + \alpha_k (1-r)\, i\, F_i(k),$$

where $E_k(F_i(k+1))$ is the expected value of $F_i(k+1)$ given the state of the model at stage $k$, and

$$\beta_k = \frac{1-p}{\sum_{i=1}^{k} i\,F_i(k)}, \qquad \alpha_k = \frac{p}{\sum_{i=1}^{k} i\,F_i(k)}$$

are the normalising factors.

Eq. (8) gives the expected number of balls in urn$_i$ at stage $k + 1$. This is equal to the previous number of balls in urn$_i$ plus the probability of adding a ball to urn$_i$ minus the

Fitting the parameters of the model

In order to validate the model we use the equations we have derived in Section 3 to fit the parameters of the model. As a first step we validate the model through stochastic simulation, and then, in Section 5, we provide a proof of concept on a large data set of a real wireless network.

We note that the full set of parameters will, generally, be unknown for real data sets. The output from each simulation run is the set of unstarred and starred urns, from which we can infer ballsk and ballsk,

Real social networks

As a proof of concept we made use of a public log containing traces of the activity of users within a campus-wide WLAN, recorded by the Crawdad project (http://crawdad.cs.dartmouth.edu) at the Centre for Mobile Computing at Dartmouth College [13]. The data set we elected to work with was collected during 2001–2003 using the syslog system event logging facility available on the wireless access points. Each access point was configured so as to transmit a message logged at one of two

Concluding remarks

We have presented an extension of Simon’s classical stochastic process where each actor can be either in an active or an inactive state. Actors, chosen by preferential attachment, may attain a new relation, become inactive or later become active again. The system is closed in the sense that once an actor enters the system he remains within the system. We have shown in (24), (26) that, asymptotically, the number of active and inactive actors having the prescribed number of relations is a

References (24)

  • M. Goldstein et al., Problem with fitting to the power-law distribution, European Physical Journal B (2004)
  • J. Grossman, Patterns of collaboration in mathematical research, SIAM News 35...

    Trevor Fenner received his Ph.D. in Computer Science from Birkbeck College, University of London, in 1978, having previously gained a B.A. in Mathematics and the Diploma in Computer Science from the University of Cambridge. He is currently a Senior Lecturer in Computer Science at Birkbeck College, and is a member of the Information Management and Web Technologies and the Computational Intelligence research groups. He has research interests and journal publications in many areas including: algorithms and data structures; algorithms for problems in bio-informatics (in particular, evolutionary reconstruction); stochastic models of the web and other complex networks; game trees; random graphs; and graph reconstruction.

    Mark Levene received his PhD in Computer Science in 1990 from Birkbeck College, University of London, having previously been awarded a BSc in Computer Science from Auckland University New Zealand in 1982. He is currently Professor of Computer Science at Birkbeck College, where he is a member of the Information Management and Web Technologies research group. His main research interests are web search and navigation, web data mining, stochastic models for the evolution of the web and machine learning in games. He has published extensively in these areas, and has recently published a book called An Introduction to Search Engines and Web Navigation.

    George Loizou is Emeritus Professor of The Mathematics of Computation and was previously Head of the School of Computer Science and Information Systems at Birkbeck, University of London. He received the B.A. degree in Mathematics (first class honours), the Postgraduate Diploma in Numerical Analysis, and the Ph.D degree in Computational Methods, all from the University of London. He serves on the editorial board of International Journal of Computer Mathematics, Journal of Parallel, Emergent and Distributed Systems, and Annals of Mathematics, Computing and Teleinformatics.

    George Roussos is a Senior Lecturer at Birkbeck College, University of London where he conducts research in pervasive computing. In particular, he investigates the effects of social activity on system architectures, and explores mechanisms to support navigation and findability. He holds a B.Sc. from University of Athens, an M.Sc. from UMIST and a PhD from Imperial College.
