Preface
Transportation networks, communication networks and social networks form the
backbone of modern society. In recent years there has been a growing fascination
with the complex connectedness such networks provide. This connectedness
manifests itself in many ways: in the rapid growth of the Internet and the World-Wide Web, in the ease with which global communication takes place, in the
speed at which news and information travel around the world, and in the fast
spread of an epidemic or a financial crisis. These phenomena involve networks,
incentives, and the aggregate behaviour of groups of people. They are based on
the links that connect people and their decisions, with global consequences.
A network can be viewed as a graph. The vertices in the graph represent the
nodes of the network, and the edges connecting the vertices represent the links between the nodes. Neurons in the brain are connected by synapses, proteins
inside the cell by physical contacts, people within social groups by common interests, countries of the world by economic relationships and financial markets,
companies by trade, and computers in the Internet by cables transferring data.
Despite the very different origin and nature of these links, all systems share an
underlying networked structure representing their large-scale organisation. This
organisation in turn leads to self-organisation and other emergent properties,
which can only be understood by analysing the overall architecture of the system
rather than its constituent elements alone.
The Science of Complex Networks constitutes a young and active area of research, inspired by the empirical study of real-world networks (physical, chemical, biological, economic or social). Big Data are continuously being
recorded and stored into large data sets, from biological data resulting from
DNA sequencing and the investigation of protein interactions and function, via
financial data reporting the high-frequency behaviour of stock markets, to informatics data mapping the structure and dynamics of the Internet and the
World-Wide Web. Each such data set is the analogue of the outcome of a
large-scale experiment that is rather different from the experiments carried out
in a laboratory. We are therefore experiencing an unprecedented possibility of
analysing experimental data and using them to formulate and test theoretical
models of complex networks.
Most complex networks display non-trivial topological features, with patterns
of connection that are neither purely regular nor purely random. Such features
include a heavy tail in the empirical distribution of the number of edges incident to a vertex (scale freeness), insensitivity of this distribution to the size
of the network (sparseness), small distances between most vertices (small
world), the likelihood that two neighbours of a vertex are also neighbours of each
other (high clustering), positivity of the correlation coefficient between the
numbers of edges incident to two neighbouring vertices (assortativity), community structure and hierarchical structure. The challenge is to understand the
effect such features have on the performance of the network, via the study of
models that allow for computation, prediction and control.
The present document contains the notes for the course on Complex Networks
offered by the Departments of Mathematics, Physics and Computer Science of
the Faculty of Science at Leiden University, The Netherlands. This course is
intended for third-year bachelor students and first-year master students. Its aim
is to provide an introduction to the area, covering both theoretical principles
and practical applications from various directions (see the course schedule and the table of contents below). Complex Networks is a multi-disciplinary
course: it exposes views on the area from mathematics, physics and computer
science, and is open to students from all programs in these three disciplines.
At the same time it assumes basic knowledge at the bachelor level in each of
these disciplines, including key concepts from calculus (differentiation, integration, limits), probability theory (probability distributions, random variables,
stochastic processes), statistical physics (ensembles, entropy), and computer
programming (C, Java or Python). The course is both challenging in terms of
panorama and rewarding in terms of insight.
In the course we highlight some of the many fruitful ways in which mathematics,
physics and computer science come together in the study of complex networks.
The course is divided into two parts:
(I) Theory of Networks. In Chapter 1 we provide a general introduction
to complex networks by reporting on some of the empirically observed
properties of real-world networks, highlighting the universal behaviour
observed across many of them. In Chapters 2–3 we introduce some of the
most important mathematical models of networks, from simple models
to more difficult models aimed at reproducing the empirical properties of
real-world networks. In Chapters 4–5 we offer an empirical characterization of real-world networks and a statistical-physics description of some
of the aforementioned models, with the aim of providing tools to identify
structural patterns in real-world networks. In Chapters 6–7, finally, we
review various key contributions of computer science, including algorithms
to generate random graphs and measure their properties, as well as the use
of visualization tools to gain insight into the structure of random graphs.
(II) Applications of Networks. We exploit the theoretical and methodological tools introduced in (I) to illustrate important applications in Percolation (Chapter 8), Epidemiology (Chapter 9), Pattern detection (Chapter 10), Self-organisation (Chapter 11), Network dynamics and Network
properties (Chapter 12) and Real Networks (Chapter 13). Much of (II)
deals with the interplay between structure and functionality of random
graphs, i.e., with the question of how the topology of a network affects the
behaviour of a process taking place on it.
As a red thread through the course we use the so-called Configuration Model,
a random graph with a prescribed degree sequence. This allows us to link up
concepts and tools.
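The pairing step that defines the Configuration Model can be sketched in a few lines of Python. This is an illustrative sketch only (the function name and the example degree sequence are ours, not part of the course material); self-loops and multiple edges may occur, as in the model itself.

```python
import random

def configuration_model(degrees, seed=None):
    """Pair up half-edges (stubs) uniformly at random.

    Returns a list of edges on vertices 0..len(degrees)-1;
    self-loops and multi-edges are allowed, as in the model.
    """
    if sum(degrees) % 2 != 0:
        raise ValueError("the degree sequence must have even sum")
    rng = random.Random(seed)
    # One stub per unit of degree, then a uniformly random pairing.
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]

edges = configuration_model([3, 2, 2, 2, 1], seed=42)
```

Whatever the pairing, every vertex ends up with exactly its prescribed degree, which is the defining property used throughout the course.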
Course overview

  Chapter   Teacher   Topic
  1                   Introduction
  2–3       FdH       Theory
  4–5       DG        Theory
  6–7       AP        Theory
  8–9       FdH       Applications
  10        DG        Applications
  11        DG        self-organised networks
  12        AP        Applications
  13        AP        Applications
Contents

I Theory of Networks

1 Real-world Networks  10
  1.1 Complex networks  10
  1.2 Social Networks  11
    1.2.1 Acquaintance Networks  11
    1.2.2 Collaboration Networks  12
    1.2.3 World-Wide Web  15
  1.3 Technological Networks  18
    1.3.1 Internet  18
    1.3.2 Transportation Networks  21
    1.3.3 Energy Networks  22
  1.4 Economic Networks  22
    1.4.1 Financial Networks  22
    1.4.2 Shareholding Networks  23
    1.4.3 World Trade Web  23
  1.5 Biological Networks  25
    1.5.1 Metabolic Networks  25
    1.5.2 Protein Interaction Networks and Genetic Networks  25
    1.5.3 Neural Networks and Vascular Networks  25
    1.5.4 Food Webs  26
  1.6 Still other types of networks  26
    1.6.1 Semantic Networks  26
    1.6.2 Co-occurrence Networks  26

2 Random Graphs  31
  2.1 Graphs, random graphs, four scaling features  31
  2.2 Erdős–Rényi random graph  34
    2.2.1 Percolation transition  35
    2.2.2 Scaling features  38

3 Network Models  40
  3.1 The configuration model  40
    3.1.1 Motivation  40
    3.1.2 Construction  40
    3.1.3 Graphical degree sequences  41
    3.1.4 Percolation transition  44
    3.1.5 Scaling features  45
  3.2 Preferential attachment model  45
    3.2.1 Motivation  45
    3.2.2 Construction  46
    3.2.3 Scaling features  49
    3.2.4 Dynamic robustness  50

4 Network Topology  53
  4.1 Basic notions  53
  4.2 Empirical topological properties  56
    4.2.1 First-order properties  57
    4.2.2 Second-order properties  61
    4.2.3 Third-order properties  66
    4.2.4 Global properties  70

5 Network Ensembles  77
  5.1 Equiprobability in the Erdős–Rényi model  78
  5.2 Implementations of the Configuration Model  79
    5.2.1 Link stub reconnection  80
    5.2.2 The local rewiring algorithm  81
    5.2.3 The Chung–Lu model  82
    5.2.4 The Park–Newman model  84
  5.3 Maximum-entropy ensembles  88
    5.3.1 The Maximum Entropy Principle  89
    5.3.2 Simple undirected graphs  91
    5.3.3 Directed graphs  92
    5.3.4 Weighted graphs  93

6 …  95

7 …  102

II Applications of Networks  106

8 Percolation  107
  8.1 Ordinary percolation  107
  8.2 …
  8.3 …

9 Epidemiology  116
  9.1 The contact process on infinite lattices  116
    9.1.1 Construction  116
    9.1.2 Shift-invariance and attractiveness  117
    9.1.3 Convergence to equilibrium  118
    9.1.4 Critical infection threshold  118
  9.2 The contact process on large finite lattices  119
  9.3 The contact process on random graphs  120
  9.4 Spread of a rumour on random graphs  121

10 Pattern Detection  124

11 Self-Organized Networks  145
  11.1 Introduction  145
  11.2 Scale invariance and self-organization  146
    11.2.1 Geometric fractals  146
    11.2.2 Self-Organized Criticality  147
  11.3 The fitness model  151
    11.3.1 Particular cases  152
  11.4 A self-organized network model  153
    11.4.1 Motivation  154
    11.4.2 Definition  155
    11.4.3 Analytical solution  155
    11.4.4 Particular cases  159
  11.5 Conclusions  163

12 Network Dynamics and Network Properties  170

13 Real Networks
Part I
Theory of Networks
Chapter 1
Real-world Networks
1.1 Complex networks
The advent of the computer age has incited a mounting interest in the fundamental properties of real-world networks. Due to the vast computational power
that is presently available, large data sets can be easily stored and analysed.
This has had a profound impact on the empirical study of large networks. A
striking conclusion from this empirical work is that real-world networks share
fascinating features. Many are small worlds, which means that most nodes are
separated from each other by relatively short chains of links. Because networks
tend to operate efficiently, this property was perhaps to be expected. More surprisingly, however, many networks are sparse, which means that the empirical
distribution of the degree (= number of links to other nodes) of the nodes is
almost independent of the size of the network. In addition, they are scale-free,
which means that the fraction of nodes with degree k is approximately proportional to k^{-τ} for some τ > 1, i.e., many real-world networks appear to have
power-law degree distributions.1 The above observations have had fundamental
implications for scientific research on networks. The aim of this research is to
understand why networks share these features, and what the qualitative and the
quantitative aspects of these features are.
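As a first illustration of what a power-law degree distribution means in practice, the exponent τ is often crudely estimated as minus the slope of log N_k against log k, where N_k is the number of nodes of degree k. The sketch below is ours and is illustrative only; real degree data call for more careful estimators.

```python
import math

def estimate_exponent(degree_counts):
    """Least-squares slope of log N_k against log k; returns the tau
    estimate (a rough, illustrative estimator, not a rigorous one)."""
    xs = [math.log(k) for k in degree_counts]
    ys = [math.log(n) for n in degree_counts.values()]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope

# Synthetic counts following an exact power law N_k = 1000 * k^(-2.2),
# so the estimator should recover tau = 2.2.
counts = {k: 1000 * k ** -2.2 for k in range(1, 100)}
tau = estimate_exponent(counts)
```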
Complex networks play an increasingly important role in science. Examples
include electrical power grids, transportation and traffic networks, telephony
networks, Internet and the World-Wide Web, Facebook and Twitter, as well as
collaboration and citation networks of scientists. The structure of such networks
affects their performance. For instance, the topology of social networks affects
the spread of information and disease, while the topology of Internet affects
its success as a means of communication. See Barabasi [4], Watts [39] and
Newman, Watts and Barabasi [32] for expository accounts on the discovery of
network properties and the empirical measurement of these properties.
Networks are modelled as graphs, i.e., a set of vertices connected by a set
of edges. A common feature of real-world networks is that they are large and
complex. Consequently, a global description of their topology is impossible,
which is why researchers have turned to local properties: How many vertices
1 In Chapters 2, 3 and 4 we provide rigorous definitions of the above structural properties,
as well as extensive supporting empirical evidence.
does the network have? According to what rules are the vertices connected to
one another by edges? What cluster sizes and cluster shapes are most common?
What is the average distance between two vertices? What is the maximal distance between two vertices? These local properties are typically probabilistic,
which leads to the study of random graphs.
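Questions about distances of the kind listed above are answered algorithmically by breadth-first search. A minimal sketch (the function name and toy graph are ours):

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source in an undirected graph given as an
    adjacency list (dict: vertex -> list of neighbours)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

# A path 0-1-2-3 with an extra chord 0-2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
d = bfs_distances(adj, 0)
```

Averaging such distances over all pairs gives the average distance; the largest value over all pairs gives the diameter.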
The observation that many real-world networks share the properties mentioned above has incited a burst of activity in network modeling. In this course
we survey some of the proposals made for network models. Most models use
random graphs as a way to model the uncertainty and the lack of regularity
in real-world networks. These models can be divided into two distinct types:
(1) static, where the aim is to describe networks and their topology at a given
instant of time; (2) dynamic, where the aim is to explain how networks came
to be as they are. The goal is to explain the universal behaviour exhibited by
real-world networks. Dynamic explanations often focus on the growth of the
network as a way to explain power-law degree distributions by means of preferential attachment growth, where new vertices are more likely to be attached to
vertices that already have large degrees.
Most real-world networks can be classified into four broad classes:
(I) Social Networks: WWW, Facebook, Twitter, WhatsApp.
(II) Technological Networks: Internet, power grids, traffic, transportation.
(III) Economic Networks: trade, interbank, interfirm, input/output.
(IV) Biological Networks: metabolic, neural, protein interaction.
In Sections 1.2–1.5 we describe examples drawn from each of these classes. In
Section 1.6 we mention a few further examples that lie beyond.
Reviews on the subject can be found in Albert and Barabasi [2], Dorogovtsev
and Mendes [18], Newman [30], and van der Hofstad [22]. The exposition below
borrows from Chapter 1 of the latter reference.
1.2 Social Networks

1.2.1 Acquaintance Networks
John Guare popularised the phrase when he chose it as the title for his 1990
play. In this play, Ouisa, one of the main characters, says:
Everybody on this planet is separated only by six other people. Six degrees
of separation. Between us and everybody else on this planet. The president
of the United States. A gondolier in Venice ... It's not just the big names.
It's anyone. A native in the rain forest. (...) An Eskimo. I am bound to
everyone on this planet by a trail of six people. It is a profound thought.
The fact that, on average, people can be reached by a chain of at most 6 intermediaries is rather striking. It implies that any two people in remote areas
such as Greenland and the Amazon can be linked by a sequence of on average
6 intermediaries. This makes the phrase "It is a small world we live in!" very
appropriate indeed.
The idea of Milgram was taken up afresh in 2001, with the added possibilities
of the computer era. In 2001, Duncan Watts, a professor at Columbia University,
recreated Milgram's experiment using an e-mail message as the package that
needed to be delivered. Surprisingly, after reviewing the data collected by 48,000
senders and 19 targets in 157 different countries, Watts again found that the
average number of intermediaries was 6. The research of Watts and the advent
of the computer age have opened up new areas of inquiry related to Six Degrees
of Separation in diverse areas of network theory, such as electrical power grids,
disease transmission, corporate communication, and computer circuitry.
To put the idea of a small world into network language, we define the vertices
of the social graph to be the inhabitants of the world (n ≈ 7 × 10^9), and we
draw an edge between two people when they know each other. Of course, we
should make precise what the latter means. Possibilities are various: it could
mean that the two people involved have shaken hands at some point, or meet
regularly, or address each other on a first-name basis, etc. The precise choice
affects the connectivity of the social graph and hence the conclusions we may
draw about its topology.
One of the main difficulties with social networks is that they are notoriously
hard to measure. Questionnaires cannot always be trusted, because people have
different ideas about what a certain social relation is. Also, questionnaires take
time to fill out and to collect. As a result, researchers are interested in examples
of social networks that can be more easily measured, for instance, because they
are electronic. Examples are e-mail networks, or social networks such as Hyves
and Facebook (see Fig. 1.1).
1.2.2 Collaboration Networks
An interesting example of a complex network that has drawn attention is the collaboration graph in mathematics, which is popularized under the name Erdős Number Project. In this graph, the vertices are mathematicians, and there is
an edge between two mathematicians when they have co-authored a paper.
The Erdős number of a mathematician counts how many collaboration links that mathematician is away from the legendary mathematician Paul Erdős, who was extremely prolific and wrote around 1500 papers with 511 collaborators. Of all
the mathematicians who are connected to Erdős by a trail of collaborators, the
maximal Erdős number is claimed to be 15. On the above website, you can find
Figure 1.1: A map of the network of all friendships formed on Facebook across the
world (from https://www.facebook.com/zuck).
out how far your own professors are away from Erdos. Also, it is possible to
find the distance between any two mathematicians worldwide.
The distribution of the Erdős numbers is given in the following table (based
on data collected in July 2004):

  Erdős number              0    1     2      3      4      5      6      7     8    9   10  11  12  13
  number of mathematicians  1  504  6593  33605  83642  87760  40014  11591  3146  819  244  68  23   4
The median is 5, the mean is 4.65, and the standard deviation is 1.21. We
note that the Erdős number is finite if and only if the corresponding mathematician is in the largest connected component of the collaboration graph. See
Fig. 1.2 for an artistic impression of the collaboration graph in mathematics
taken from
http://www.orgnet.com/Erdos.html
and Fig. 1.3 for the degree distribution in the collaboration graph.
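The summary statistics quoted above can be reproduced directly from the table of Erdős numbers. In the sketch below the count for Erdős number 8 is read as 3146 (the table entry appears damaged in places; this reading reproduces the stated mean of 4.65).

```python
import math

# Counts per Erdos number, from the table above; the count for
# Erdos number 8 is read as 3146 (an assumption consistent with
# the stated mean).
counts = {0: 1, 1: 504, 2: 6593, 3: 33605, 4: 83642, 5: 87760,
          6: 40014, 7: 11591, 8: 3146, 9: 819, 10: 244, 11: 68,
          12: 23, 13: 4}

total = sum(counts.values())
mean = sum(k * n for k, n in counts.items()) / total

# Median: smallest k whose cumulative count reaches half the total.
cum = 0
for k in sorted(counts):
    cum += counts[k]
    if cum >= total / 2:
        median = k
        break

var = sum(n * (k - mean) ** 2 for k, n in counts.items()) / total
std = math.sqrt(var)
```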
  number of authors   1      2      3     4     5     6+
  percentage          62.4%  27.4%  8.0%  1.7%  0.4%  0.1%
The largest number of authors shown for a single item lies in the 20s. Sometimes
the author list includes et al., in which case the number of co-authors is not
known precisely.
The fraction of items authored by just one person has steadily decreased
over time, starting out above 90% in the 1940s and currently standing at under
50%. The entire graph has about 676,000 edges, so that the average number of
collaborators per person is 3.36. See
http://www.oakland.edu/enp
Figure 1.3: Degree distribution in the collaboration graph on a log-log scale.
In the collaboration graph, the average number of collaborators for people who
have collaborated is 4.25. There are only 5 mathematicians with degree at least
200. The largest degree is for Erdős, who has 511 co-authors.
The clustering coefficient of a graph is equal to the number of ordered triples
of vertices a, b, c in which the edges ab, bc and ac are present, divided by the
number in which ab and bc are present. In other words, the clustering coefficient
describes how often two neighbors of a vertex are adjacent to each other. The
clustering coefficient of the collaboration graph is 1308045/9125801 = 0.14. The
relatively high value of this number, together with the fact that average path
lengths are small, indicates that the collaboration graph is a small-world graph.
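The definition of the clustering coefficient translates directly into code. The following sketch counts ordered paths a–b–c and checks which of them are closed; the toy graph is ours, chosen small enough to verify by hand.

```python
def clustering_coefficient(adj):
    """Number of ordered triples (a, b, c) with edges ab, bc and ac
    present, divided by the number with edges ab and bc present."""
    paths = closed = 0
    for b in adj:
        for a in adj[b]:
            for c in adj[b]:
                if a == c:
                    continue
                paths += 1          # a-b-c is a path of length 2
                if c in adj[a]:
                    closed += 1     # ...and a-c closes the triangle
    return closed / paths

# Triangle 0-1-2 with a pendant edge 2-3: 10 ordered paths, 6 closed.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cc = clustering_coefficient(adj)
```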
1.2.3 World-Wide Web
The vertices of the WWW are electronic web pages, the edges are hyperlinks
(or URLs) pointing from one web page to another. The WWW is therefore
a directed network, since hyperlinks are not necessarily reciprocated. The
properties of the WWW have been studied by a number of authors: see e.g.
Albert, Jeong and Barab
asi [3], Kleinberg, Kumar, Raghavan, Rajagopalan
and Tomkins [25], Broder, Kumar, Maghoul, Raghavan, Rajagopalan, Stata,
Tomkins and Wiener [7], and the reviews cited at the end of Section 1.1.
While Internet is physical, the WWW is virtual. With the rapid growth
of the WWW, the interest in its properties is growing as well. It is of great
practical importance to know what the structure of the WWW is, for example,
to allow search engines to explore it efficiently. Notorious is the PageRank
problem: to rank web pages in such a way that the most important pages come
up first. The PageRank algorithm is claimed to be the main reason behind the
success of Google, and its inventors were also the founders of Google (see Brin
and Page).
Albert, Jeong and Barabasi [3] studied the degree distribution of the WWW.
They found that the in-degrees obey a power-law distribution with exponent
τ_in ≈ 2.1, while the out-degrees obey a power-law distribution with exponent
τ_out ≈ 2.5. Their analysis was based on several Web domains, such as nd.edu,
mit.edu and whitehouse.gov (the Web domains of Notre Dame University, Massachusetts Institute of Technology and the White House). Furthermore, they
investigated the average distance d between the vertices in these domains, and
found it to grow linearly with the logarithm of the size n of the domain, with
an estimated dependence of the form
d = 0.35 + 2.06 log n.
Extrapolating this relation to the estimated size of the WWW at the time
(n = 8 × 10^8), they concluded that the diameter of the WWW was 19, which
prompted them to the following quote:
Fortunately, the surprisingly small diameter of the web means that all
information is just a few clicks away.
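The extrapolation can be checked directly, assuming (as the numbers suggest) that the logarithm in the fitted relation is taken to base 10:

```python
import math

def average_distance(n):
    """Empirical fit of Albert, Jeong and Barabasi for the average
    distance in a Web domain of n pages; the log is taken to base 10
    here, the choice that reproduces their extrapolated value."""
    return 0.35 + 2.06 * math.log10(n)

d = average_distance(8e8)   # n = 8 * 10^8 pages: about 18.7, i.e. ~19
```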
Kumar, Raghavan, Rajagopalan and Tomkins [27] were the first to observe that
the WWW has a power-law degree distribution (see Fig. 1.4).
The most extensive analysis of the WWW was performed by Broder, Kumar,
Maghoul, Raghavan, Rajagopalan, Stata, Tomkins and Wiener [7]. They divide
the WWW into four parts (see Fig. 1.5):
Figure 1.5: The bow-tie structure of the WWW: the SCC (56 million nodes), IN (44
million nodes), OUT (44 million nodes) and TENDRILS (44 million nodes), together
with tubes and disconnected components.
(a) The Strongly Connected Component (SCC): the central core consisting
of those pages that can reach each other along the directed links (28% of
the pages).
(b) The IN part, consisting of pages that can reach the SCC, but cannot be
reached from it (21% of the pages).
(c) The OUT part, consisting of pages that can be reached from the SCC,
but do not link back to it (21% of the pages).
(d) The TENDRILS and other components, consisting of pages that can neither reach the SCC, nor be reached from it (30% of the pages).
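Given a directed graph, this bow-tie decomposition can be computed with two reachability searches, one along the edges and one against them. The sketch below uses our own naming and a toy graph; it is not the algorithm of the cited study.

```python
from collections import deque

def reachable(adj, source):
    """Vertices reachable from source along directed edges."""
    seen = {source}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj.get(v, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

def bow_tie(adj, vertices, core_vertex):
    """Split vertices into SCC, IN, OUT and the rest, relative to the
    strongly connected component containing core_vertex."""
    radj = {}                      # the graph with all edges reversed
    for v, ws in adj.items():
        for w in ws:
            radj.setdefault(w, []).append(v)
    forward = reachable(adj, core_vertex)    # reachable from the core
    backward = reachable(radj, core_vertex)  # can reach the core
    scc = forward & backward
    return (scc, backward - scc, forward - scc,
            set(vertices) - forward - backward)

# Toy web: 1 -> 2 <-> 3 -> 4, with 5 disconnected.
adj = {1: [2], 2: [3], 3: [2, 4]}
scc, in_part, out_part, rest = bow_tie(adj, {1, 2, 3, 4, 5}, 2)
```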
It was found that the SCC has diameter at least 28, while the WWW as a whole
has diameter at least 500. The relatively high values of these numbers are due
in part to the fact that the graph for the WWW is directed. When the WWW
is considered as an undirected graph, the average distance between vertices
decreases to around 7. Furthermore, it was found that both the in-degrees and
the out-degrees in the WWW follow a power-law distribution, with exponents
τ_in ≈ 2.1 and τ_out ≈ 2.5, in accordance with the rough findings obtained earlier.
When the WWW is considered as a directed graph, the distances between
most pairs of vertices within the SCC are at most 7, similar to the Six Degrees
of Separation found in social networks. See Fig. 1.6 for a histogram of pairwise
distances in the sample.
Figure 1.6: Average distances in the Strongly Connected Component of the WWW
(Adamic [1]).
1.3 Technological Networks

1.3.1 Internet
Figure 1.8: Internet hopcount data and number of AS traversed in hopcount data.
Data courtesy Hongsuda Tangmunarunkit.
Figure 1.9: Degree distributions of AS domains in the months 11/1997 and 12/1998
on a log-log scale (Faloutsos [20]): power-law distribution with exponent 2.15–2.20.

Here N_k denotes the number of vertices of degree k. When N_k is proportional to an
inverse power of k, the exponent can be estimated by the slope of the line in the
log-log plot, which for the AS-data gives the estimate τ ≈ 2.15–2.20. Naturally, we
must have Σ_k N_k = n, so that it is reasonable to assume that τ > 1.

  data set   E[h_AS]   Var[h_AS]   alfa   # points
  RIPE        2.81       1.04      2.70   1163687
  AMSIX       3.13       1.06      2.95    366075
  LINX        2.91       0.98      2.97    168398

Figure 1.10: Number of AS traversed in various data sets. Data courtesy Piet van
Mieghem.

Interestingly, the AS-counts of various different data sets (focussing on different
parts of the Internet) yield roughly the same picture. As shown in Fig. 1.10,
the AS-count between ASs in North-America, respectively, between ASs in
Europe are quite close. This indicates that the AS-count is robust, and hints
at the fact that the AS-graph is homogeneous. In other words, the dependence
of the AS-count on the geometry of the network is fairly weak, even though a
priori we might expect geometry to play a role. As a result, most models for
the Internet, as well as for the AS-graph, ignore geometry altogether.
A topic of research that is receiving considerable attention is how the Internet
behaves under random breakdown or malicious attacks. The conclusion is that
the topology of the Internet is critical for its vulnerability. When vertices with
high degrees are taken out, the random graph models for the Internet cease to
have the necessary connectivity properties. See Cohen, Erez, ben Avraham and
Havlin [12, 13].
1.3.2
Transportation Networks
Transportation networks such as road, railway or airline networks (see Fig. 1.11)
tend to become increasingly complex. To allow these networks to function efficiently, traffic controllers need to deal with disruptions (for instance, due to bad
weather conditions). One objective is to develop robust scheduling algorithms
that take the random nature of traffic into account and can properly cope with
disturbances. Another objective is to be able to provide on-the-fly information
to travelers, so that they can adapt their travel plans to changing circumstances.
The most important issue in the Dutch railway network is that it is tight, due
to the scarce space that is available for extending the network where needed.
As a result, the timetable is not sufficiently robust against changes in circumstances caused by accidents, weather conditions, or signalling breakdowns.
Most of the scheduling and planning problems in railway and airline traffic
are very hard (so-called NP-complete), but in practice good approximation
algorithms may do a great deal. Also randomized algorithms can be very useful,
1.3.3
Energy Networks
Energy networks transport energy from providers to users. Examples are electricity grids. Because of their vital interest, these grids need to be designed to
achieve consistently high levels of performance and reliability, and yet need to
be cost-effective to operate. In order to prevent overflow of buffers, mechanisms
must be put in place to ensure that it is highly unlikely for the aggregate arrival rate to exceed the service rate for any length of time. It is critical that
the aggregate production rate of the energy sources is sufficient to meet the
consumption rate of the users with extremely high probability.
With the rising deployment of renewable resources such as wind farms and
solar panels, the generation of energy increasingly exhibits random fluctuations
over time. In addition, the production rate of conventional energy resources
and power plants is subject to uncertainty and variability, due to supply disruptions, technical failures or calamities. These phenomena give rise to very
distinct characteristics, rendering centralised operation impractical, and creating a strong need for distributed control mechanisms. At the same time, the
rapid advance of smart-grid technology offers growing opportunities for actively
controlling energy supply and demand.
1.4
Economic Networks
1.4.1
Financial Networks
these networks are extremely difficult to observe and analyse, because of the
high confidentiality of the data that are required as input.
What is much easier to obtain are the (publicly available) time series of price
increments2 of stocks. From a set of n synchronous time series, it is possible to
calculate the n × n matrix of pairwise correlation coefficients between each pair of stocks. The correlation coefficient ρ_ij between two time series x_i(t) and x_j(t) (where x_i(t) denotes the increment of the i-th time series at time t) is defined as
$$\rho_{ij} = \frac{\overline{x_i x_j} - \overline{x_i}\;\overline{x_j}}{\sqrt{\big(\overline{x_i^2} - \overline{x_i}^{\,2}\big)\big(\overline{x_j^2} - \overline{x_j}^{\,2}\big)}}, \qquad (1.3)$$
where, if f(t) is a time series defined for t = 1, . . . , T, the time average is defined as $\overline{f} \equiv \frac{1}{T} \sum_{t=1}^{T} f(t)$. After the empirical correlation matrix is calculated,
networks of financial correlations can be defined by representing the financial
entities (e.g. stocks) as vertices and the strongest correlations as links. The
strongest correlations are defined either as the set {ρ_ij} of correlations exceeding a given global threshold, or as the minimum set of correlations (taken in decreasing value) that ensures some global connectivity property in the output network (such as the existence of paths connecting all pairs of stocks, while avoiding the creation of loops), or finally as the set of correlations that exceed a reference value calculated under some null hypothesis. Networks of financial
correlations have been used to study the returns of assets in a stock market
[28, 5, 33] and interest rates [17]. See Fig. 1.12(a) for an example.
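As an illustration, the pipeline just described (increments, then the correlation matrix of (1.3), then thresholding) can be sketched in a few lines. The series, seed and threshold below are synthetic stand-ins, not the data behind Fig. 1.12.

```python
import numpy as np

def correlation_network(X, threshold):
    """Build an adjacency matrix from time series.

    X has shape (n, T): n series of T increments. An edge is placed
    between i and j when rho_ij of (1.3) exceeds the threshold.
    """
    rho = np.corrcoef(X)               # matrix of pairwise rho_ij
    A = (rho > threshold).astype(int)  # keep only the strongest correlations
    np.fill_diagonal(A, 0)             # no self-edges
    return A

# Toy example: three series sharing a common component (synthetic, not market data).
rng = np.random.default_rng(0)
common = rng.normal(size=500)
X = np.vstack([common + 0.1 * rng.normal(size=500) for _ in range(3)])
A = correlation_network(X, threshold=0.9)
print(A.sum() // 2)  # prints 3: all three pairs are strongly correlated
```
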
1.4.2
Shareholding Networks
1.4.3
Yet another important networked economic system is the World Trade Web, describing the trade relationships among the world countries (Serrano and Boguñá [38]).
2 In the simplest case, the t-th increment of a financial time series is defined as the difference between the price at time t and the price at time t − 1. However, for technical reasons that we
do not discuss here, an alternative and frequently used definition of increment is the difference
between the logarithms of the prices.
Figure 1.12: (a) Network formed by the strongest correlations among the stocks of the
S&P500 index, based on correlations between the log-returns of daily closing prices
from 2001 to 2011. Stocks are coloured according to their industrial classification
(from MacMahon and Garlaschelli 2014). (b) Snapshot of the shareholding network
in Italy in 2001. Vertices are companies and edges represent who owns whom, i.e.,
the ownership relations among companies (from Garlaschelli et al. 2005).
1.5
Biological Networks
1.5.1
Metabolic Networks
Figure 1.13: A functional network for a yeast cell of correlated genetic interaction
profiles. Genes sharing similar genetic interaction profiles are proximal to one another.
Less similar genes are positioned further apart. Colored genes are enriched for GO
biological processes as indicated (from Costanzo et al. 2010).
1.5.2
Examples at the cellular level include metabolic networks (Jeong, Tombor, Albert, Oltvai and Barabási [24]), where metabolic substrates are linked by directed edges when a known biochemical reaction exists between them, and protein interaction networks (Jeong, Mason, Barabási and Oltvai [23]), where
proteins are connected by an undirected edge when they interact by physical
contact. Similarly, genetic networks represent the correlations among the expression profiles of different genes in a cell (see Fig. 1.13 for an example on a
yeast cell).
1.5.3
1.5.4
Food Webs
Examples at the community level include food webs (Elton [19], Pimm [36],
Cohen, Briand and Newman [11]), where two biological species are connected
by a directed edge when a predator-prey relation exists between them.
1.6
The four classes (I)–(IV) in Section 1.1, of which examples were listed in Sections 1.2–1.5, are not exhaustive. We give two further examples.
1.6.1
Semantic Networks
In word networks, words are represented by vertices and edges are placed between words when some linguistic relation exists between them. Two examples
of undirected networks are word synonymy networks (Ravasz and Barabási [37]), where words are connected when they are listed as synonyms in a dictionary, and word co-occurrence networks (Ferrer i Cancho and Solé [21]), where words
are connected when they appear one or two words apart from each other in the
sentences of a given text.
Examples of directed networks are given by networks of dictionary terms,
where words are connected when a (directed) link between them is reported in
a given dictionary, and of free associations, reporting the outcomes of psychological experiments where people are asked to associate input words to freely
chosen output words.
1.6.2
Co-occurrence Networks
Co-occurrence networks, where nodes represent events and edges are established between events that co-occur (possibly with a weight that quantifies
the frequency of co-occurrence), form a huge class of networks. Examples include the aforementioned word co-occurrence networks, as well as the collaboration networks discussed in Section 1.2.2, viewed as examples of social networks.
For instance, examples in the field of scientometrics are co-authorship and co-citation networks, where nodes are scientific articles and edges indicate that two
articles have been co-authored by the same author, respectively, co-cited by the
same paper.
Yet another example is given by networks of co-purchased products, where
two products are linked when they have been frequently purchased together.
Such networks are at the basis of the automatic recommendation systems routinely used e.g. by online shops. In Fig. 1.14 we show The Political Books Network compiled by Valdis Krebs [43]. This network represents books about US
politics sold by Amazon.com. Edges represent frequent co-purchasing of books
by the same buyers, as indicated by the "customers who bought this book also bought these other books" feature on Amazon.
Bibliography
[1] L.A. Adamic, The small world web, in: Lecture Notes in Computer Science 1696, Springer, 1999, pp. 443–454.
[2] R. Albert and A.-L. Barabási, Rev. Mod. Phys. 74 (2002) 47.
[3] R. Albert, H. Jeong and A.-L. Barabási, Internet: Diameter of the world-wide web, Nature 401 (1999) 130–131.
[4] A.-L. Barabási, Linked: The New Science of Networks, Perseus Publishing, Cambridge, Massachusetts, 2002.
[5] G. Bonanno, F. Lillo and R.N. Mantegna, Quantitative Finance 1 (2001)
96.
[6] S. Brin and L. Page, The anatomy of a large-scale hypertextual web search engine, Computer Networks and ISDN Systems 33 (1998) 107–117.
[7] A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins and J. Wiener, Graph structure in the web, Computer Networks 33 (2000) 309–320.
[8] G. Caldarelli, S. Battiston, D. Garlaschelli and M. Catanzaro, book chapter in: Complex Networks (eds. E. Ben-Naim, H. Frauenfelder, Z. Toroczkai), Lecture Notes in Physics 650, Springer, 2004, pp. 399–423.
[9] G. Caldarelli, R. Marchetti and L. Pietronero, Europhys. Lett. 52 (2000)
386.
[10] Q. Chen, H. Chang, R. Govindan, S. Jamin, S.J. Shenker and W. Willinger,
Proceedings of the 21st Annual Joint Conference of the IEEE Computer
and Communications Societies, IEEE Computer Society, 2002.
[11] J.E. Cohen, F. Briand and C.M. Newman, Community Food Webs: Data
and Theory, Springer, Berlin, 1990.
[12] R. Cohen, K. Erez, D. ben Avraham and S. Havlin, Resilience of the internet to random breakdowns, Phys. Rev. Lett. 85 (2000) 4626.
[13] R. Cohen, K. Erez, D. ben Avraham and S. Havlin, Breakdown of the
internet under intentional attack, Phys. Rev. Lett. 86 (2001) 3682.
[14] G.F. Davis, M. Yoo and W.E. Baker, Strategic Organization 1 (2003) 301.
[15] R. De Castro and J.W. Grossman, Famous trails to Paul Erdős, Rev. Acad. Colombiana Cienc. Exact. Fís. Natur. 23 (1999) 563–582. Translated and revised from the English.
[16] R. De Castro and J.W. Grossman, Famous trails to Paul Erdős, Math. Intelligencer 21 (1999) 51–63. With a sidebar by P.M.B. Vitányi.
[17] T. Di Matteo, T. Aste, S.T. Hyde and S. Ramsden, Proceedings of the First Bonzenfreies Colloquium on Market Dynamics and Quantitative Economics, Physica A 355 (2005) 21–35.
[18] S.N. Dorogovtsev and J.F.F. Mendes, Advances in Physics 51 (2002) 1079.
[19] C.S. Elton, Animal Ecology, Sidgwick & Jackson, London, 1927.
[20] C. Faloutsos, P. Faloutsos and M. Faloutsos, On power-law relationships of the internet topology, Computer Communications Rev. 29 (1999) 251–262.
[21] R. Ferrer i Cancho and R.V. Solé, Proceedings of the Royal Society of London B 268 (2001) 2261.
[22] R. van der Hofstad, Random Graphs and Complex Networks, Volume I,
monograph in preparation. File can be downloaded from http://www.win.
tue.nl/~rhofstad/
[23] H. Jeong, S. Mason, A.-L. Barabasi and Z.N. Oltvai, Nature 411 (2001) 41.
[24] H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai and A.-L. Barabasi, Nature
407 (2000) 651.
[25] J.M. Kleinberg, S.R. Kumar, P. Raghavan, S. Rajagopalan and A. Tomkins, in: Proceedings of the International Conference on Combinatorics and Computing, Lecture Notes in Computer Science 1627, Springer, Berlin, 1999, pp. 1–18.
[26] B. Kogut and G. Walker, American Sociological Review 66 (2001) 317.
[27] R. Kumar, P. Raghavan, S. Rajagopalan and A. Tomkins, Trawling the web for emerging cyber communities, Computer Networks 31 (1999) 1481–1493.
[28] R.N. Mantegna, Eur. Phys. J. B 25 (1999) 193.
[29] S. Maslov, K. Sneppen and A. Zaliznyak, Physica A 333 (2004) 529540.
[30] M.E.J. Newman, SIAM Review 45 (2003) 167.
[31] M.E.J. Newman, S.H. Strogatz and D.J. Watts, Phys. Rev. E 64 (2001)
026118.
[32] M.E.J. Newman, D.J. Watts and A.-L. Barabási, The Structure and Dynamics of Networks, Princeton Studies in Complexity, Princeton University
Press, 2006.
[33] J.-P. Onnela, A. Chakraborti, K. Kaski and J. Kertész, Eur. Phys. J. B 30
(2002) 285.
Chapter 2
Random Graphs
In this chapter we describe some key concepts in graph theory. In Section 2.1
we introduce graphs and random graphs, and look at four particular scaling
features as these graphs become large. (More detailed scaling features will be
discussed in Chapter 4.) In Section 2.2 we analyse the simplest random graph
model, due to Erdős and Rényi, where edges occur randomly and independently.
Random graphs are models for complex networks (randomness is often synonymous with complexity). They are inspired by real-world networks, and are
used as null-models. They play an important role in analysing and explaining
the empirical properties observed in real-world networks. They can also be used
to make predictions.
2.1
A graph G = (V, E) consists of a set of vertices V (also called nodes or sites) and
a set of edges E (also called links or bonds) connecting pairs of vertices. A graph
is called simple when there are no self-edges (= no edges between a vertex and
itself) and no multiple edges (= at most one edge between a pair of vertices). A
graph that is not simple is called a multi-graph. Edges are undirected. Graphs
with directed edges are called directed graphs. See Fig. 2.1.
Figure 2.1: Examples 1–3 are complete graphs. Examples 1–3 and 5 are simple graphs, examples 4 and 6 are multi-graphs. Examples 1 and 5 contain isolated vertices.
Example 5 has two clusters.
(2.1)

with k_i the degree of vertex i. The degree distribution is the probability distribution
$$f_G = |V|^{-1} \sum_{i \in V} \delta_{k_i}, \qquad (2.2)$$
i.e.,
$$f_G(k) = |V|^{-1} \#\{i \in V \colon k_i = k\}, \qquad k \in \mathbb{N}_0. \qquad (2.4)$$
The clustering coefficient of G is the ratio
$$\frac{\Delta_G}{W_G} \in [0, 1], \qquad (2.5)$$
where
$$\Delta_G = \sum_{i_1,i_2,i_3 \in V} 1_{\{i_1 i_2,\, i_2 i_3,\, i_3 i_1 \text{ are present}\}}, \qquad W_G = \sum_{i_1,i_2,i_3 \in V} 1_{\{i_1 i_2,\, i_2 i_3 \text{ are present}\}}, \qquad (2.6)$$
i.e., Δ_G is 3! = 6 times the number of triangles in G and W_G is 2! = 2 times
wedge-triangle clustering coefficient. (In Section 4.2.3 a different definition will
be used, but with the same flavour.) A complete graph has clustering coefficient
1, a tree graph has clustering coefficient 0.
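For a small graph, the triangle count Δ_G and wedge count W_G of (2.6) can be obtained by brute force over ordered triples of distinct vertices; the following sketch is a direct transcription of that definition (cubic in the number of vertices, fine for toy examples).

```python
from itertools import permutations

def clustering_coefficient(edges, vertices):
    """Wedge-triangle clustering coefficient Delta_G / W_G of (2.5)-(2.6).

    Delta counts ordered triples forming a triangle (6 per triangle),
    W counts ordered triples forming a wedge (2 per wedge).
    """
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    delta = w = 0
    for i1, i2, i3 in permutations(vertices, 3):
        if i2 in adj[i1] and i3 in adj[i2]:
            w += 1
            if i1 in adj[i3]:
                delta += 1
    return delta / w if w else 0.0

# A triangle (a complete graph) has clustering coefficient 1, a path (a tree) has 0.
print(clustering_coefficient([(0, 1), (1, 2), (2, 0)], [0, 1, 2]))  # prints 1.0
print(clustering_coefficient([(0, 1), (1, 2)], [0, 1, 2]))          # prints 0.0
```
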
The typical distance in G is the ratio
$$H_G = \frac{\sum_{i,j \in V:\, i \leftrightarrow j,\, i \neq j} d(i,j)}{\sum_{i,j \in V:\, i \leftrightarrow j,\, i \neq j} 1} \in [1, \infty), \qquad (2.7)$$
where i ↔ j means that i and j are connected, and d(i, j) denotes the graph
distance between i and j (= the minimal number of edges in a path between
i and j). In words, H_G is the average distance between two vertices drawn uniformly
from all pairs of connected vertices. The complete graph has typical distance 1,
a linear graph has typical distance roughly one third of its length.
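On small graphs the typical distance (2.7) can be computed exactly with a breadth-first search from every vertex; a minimal sketch:

```python
from collections import deque

def typical_distance(adj):
    """Typical distance H_G of (2.7): the average of d(i, j) over all
    ordered pairs of distinct, connected vertices, computed by BFS."""
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for t, d in dist.items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs

# Complete graph on 4 vertices: every distance is 1, so H_G = 1.
K4 = {i: [j for j in range(4) if j != i] for i in range(4)}
print(typical_distance(K4))  # prints 1.0

# Path 0-1-2-3: the average distance is roughly one third of the length.
P4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(typical_distance(P4))
```
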
A random graph is a graph where the vertices and/or edges are chosen randomly. There are many possible ways in which this can be done, and various
different choices have been made with the aim to model real-world networks of
(1) G is called sparse when
$$\lim_{n \to \infty} f_{G_n}(k) = f(k), \qquad k \in \mathbb{N}_0, \qquad (2.9)$$
for some probability distribution f = (f(k))_{k ∈ N_0},
$$\sum_{k \in \mathbb{N}_0} f(k) = 1. \qquad (2.10)$$

(2) G is called scale free with exponent τ when it is sparse and the limit
$$\tau = \lim_{k \to \infty} \frac{\log f(k)}{\log(1/k)} \qquad (2.11)$$
exists.

(3) G is called highly clustered when
$$\lim_{n \to \infty} \frac{E(\Delta_{G_n})}{E(W_{G_n})} = C \qquad (2.12)$$
for some C ∈ (0, 1]. Not highly clustered means that locally the graph looks like a tree.

(4) G is called a small world when
$$\lim_{n \to \infty} P\big(H_{G_n} \leq K \log n\big) = 1 \qquad (2.13)$$
In asymptotic analysis, three symbols are used frequently: o, O and Θ. The symbol o stands for "is of smaller order than": a_n = o(b_n) when lim_{n→∞} a_n/b_n = 0. The symbol O stands for "is at most of the same order as": a_n = O(b_n) when limsup_{n→∞} a_n/b_n < ∞. The symbol Θ stands for "is of the same order as": a_n = Θ(b_n) when both a_n = O(b_n) and b_n = O(a_n). This is also written as a_n ≍ b_n.
[End intermezzo]
2.2
Erdős-Rényi random graph
Figure 2.2: Two realizations of Erdős-Rényi random graphs with 100 vertices and
edge probabilities 1/200, respectively, 3/200. The three largest clusters are ordered
by the darkness of their edge colors (dark blue, blue, light blue). The remaining edges
all have the lightest shade (grey). Courtesy Remco van der Hofstad.
Homework 2.1 Find the distribution of the number of edges in ER_n(p). Compute its mean and its variance, and show that it satisfies the law of large numbers and the central limit theorem in the limit as n → ∞ (look up on Wikipedia what this means). Hint: Use that the number of edges is $\sum_e Y_e$, where the sum runs over the $\binom{n}{2}$ edges of the complete graph K_n, and $Y_e = 1_{\{e \text{ is retained}\}}$ are i.i.d. (= independent and identically distributed) random variables taking the values 1 with probability p and 0 with probability 1 − p. Note that E(Y_e) = P(e is retained) = p.
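A quick simulation (with arbitrarily chosen parameters) illustrates the concentration of the number of edges around its mean that the homework asks you to prove:

```python
import random

def er_edge_count(n, p, rng):
    """Sample ER_n(p) by retaining each of the C(n,2) edges of K_n
    independently with probability p; return the number of edges."""
    return sum(1 for i in range(n) for j in range(i + 1, n) if rng.random() < p)

rng = random.Random(42)
n, p = 50, 0.1
m = n * (n - 1) // 2                 # number of edges of K_n
samples = [er_edge_count(n, p, rng) for _ in range(2000)]
mean = sum(samples) / len(samples)
print(mean, p * m)  # the empirical mean is close to p * C(n,2)
```
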
The Erdős-Rényi random graph is not really suitable as a model of a real-world network, for which typically neither the number of vertices is fixed nor the
edges are retained or removed independently. Yet, it captures a basic feature of
a real-world network: complexity.
2.2.1
Percolation transition
We follow the exposition in van der Hofstad [2, Chapter 4]. The Erdős-Rényi random graph exhibits an interesting phenomenon: ER_n(p) has a percolation transition when we pick p = λ/n with λ ∈ (0, ∞) and let n → ∞. Namely, the largest cluster has size

Θ(log n) when λ < 1,
Θ(n^{2/3}) when λ = 1,
Θ(n) when λ > 1.

Thus, there is a critical value λ_c = 1 such that ER_n(λ/n) consists of a large number of small disconnected components when λ < λ_c (subcritical regime), but has a large connected component containing a positive fraction of all the vertices when λ > λ_c (supercritical regime). At λ = λ_c there is a percolation transition: the small clusters coagulate into a large cluster. It can be shown that for λ > λ_c there is only one cluster of size Θ(n), while all the other clusters are of size Θ(log n). It can also be shown that for λ = λ_c there are multiple clusters of size Θ(n^{2/3}).
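The three regimes can be observed in a small experiment. The sketch below samples ER_n(λ/n) and reports the size of the largest cluster using a union-find structure; the sizes and seeds are arbitrary choices.

```python
import random

def largest_cluster(n, lam, rng):
    """Size of the largest cluster of ER_n(lam/n), via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    p = lam / n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                parent[find(i)] = find(j)
    sizes = {}
    for v in range(n):
        r = find(v)
        sizes[r] = sizes.get(r, 0) + 1
    return max(sizes.values())

rng = random.Random(1)
n = 2000
for lam in (0.5, 1.5):
    print(lam, largest_cluster(n, lam, rng))
# Subcritical (0.5): the largest cluster is tiny; supercritical (1.5): order n.
```
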
Before we explain the intuition behind the above percolation transition, we
make a brief digression into the mathematics of branching processes.
[Begin intermezzo]
A branching process is a simple model for a population evolving over time. Suppose that, in each generation, each individual in the population independently
gives birth to a random number of children, chosen according to a prescribed
probability distribution f = (f (k))kN0 called the offspring distribution, i.e.,
f (k) is the probability that an individual has k children. Let Zn denote the
number of individuals in the n-th generation, where for convenience we pick
Z0 = 1. Then Zn satisfies the recursion relation
$$Z_{n+1} = \sum_{i=1}^{Z_n} X_{i,n}, \qquad n \in \mathbb{N}_0, \qquad (2.14)$$
where $(X_{i,n})_{i \in \mathbb{N},\, n \in \mathbb{N}_0}$ is an array of i.i.d. random variables with common distribution f (i.e., X_{i,n} is the number of children of individual i in generation n).
Let
$$m = \sum_{k \in \mathbb{N}_0} k f(k). \qquad (2.15)$$
One of the key results for branching processes is that if m ≤ 1, then the population dies out with probability 1 (unless f = δ_1), while if m > 1, then the population has a strictly positive probability to survive forever. In fact, it turns out that the extinction probability
$$\eta = P(\exists\, n \in \mathbb{N} \colon Z_n = 0) \qquad (2.16)$$
is the smallest solution in [0, 1] of the equation η = G_f(η), where
$$G_f(x) = \sum_{k \in \mathbb{N}_0} x^k f(k), \qquad x \in [0, 1], \qquad (2.17)$$
is the generating function of the offspring distribution f.
Figure 2.3: Plot of the generating function x ↦ G_f(x) for the case where m = G_f'(1) > 1.
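The fixed-point picture of Fig. 2.3 can be checked numerically: iterating x → G_f(x) from 0 converges to the smallest solution of x = G_f(x), which can be compared with a Monte Carlo estimate obtained by simulating the recursion (2.14). The offspring distribution below is an arbitrary illustrative choice.

```python
import random

def extinction_prob(f, n_gen=60, trials=4000, seed=0):
    """Estimate eta = P(the branching process dies out) by simulating
    the recursion (2.14) with offspring distribution f = [f(0), f(1), ...]."""
    rng = random.Random(seed)
    support = list(range(len(f)))
    dead = 0
    for _ in range(trials):
        z = 1
        for _ in range(n_gen):
            if z == 0:
                break
            if z > 1000:   # once the population is large, survival is essentially certain
                break
            z = sum(rng.choices(support, weights=f, k=z))
        dead += (z == 0)
    return dead / trials

def fixed_point(f, iters=200):
    """Smallest solution of eta = G_f(eta), by iterating eta -> G_f(eta) from 0."""
    eta = 0.0
    for _ in range(iters):
        eta = sum(p * eta ** k for k, p in enumerate(f))
    return eta

f = [0.25, 0.25, 0.5]     # mean offspring m = 1.25 > 1: survival is possible
print(fixed_point(f))     # smallest root of eta = G_f(eta), here 0.5
print(extinction_prob(f)) # simulation agrees up to Monte Carlo error
```
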
(2.18)
that we can think of as an exploration process, counting the vertices that are
connected to ? at successive distances (with multiplicities) and painting them
green. The idea is that this exploration process is close to a branching process
when n is large, because the exploration process rarely creates loops.
(2.19)

$$e^{-\lambda}\, \frac{\lambda^k}{k!}, \qquad k \in \mathbb{N}_0. \qquad (2.20)$$
2.2.2
Scaling features
(2.21)
E(N_d) = λ^d, it follows that N_d = Θ(n) when d = Θ(K log n). Since there are not more than n vertices, the exploration process from vertex ? must stop after at most Θ(K log n) iterations.
Finally, it is possible to consider a generalised Erdős-Rényi random graph in which the parameter λ is chosen randomly according to a distribution with
a power law tail. In this way the random graph can be made to be scale free
and highly clustered as well. In Chapter 3 we look at more realistic models to
construct random graphs with these properties.
Bibliography
[1] P. Erdős and A. Rényi, On random graphs I, Publ. Math. Debrecen 6 (1959) 290–297.
[2] R. van der Hofstad, Random Graphs and Complex Networks, Volume I,
monograph in preparation. File can be downloaded from http://www.win.
tue.nl/~rhofstad/
Chapter 3
Network Models
In this chapter we describe two examples of random graphs that are more realistic models of real-world networks than the Erdős-Rényi random graph encountered in Chapter 2. In Section 3.1 we look at the configuration model, in Section 3.2 at the preferential attachment model. The former is a static realisation of a random graph (like the Erdős-Rényi random graph), the latter is a
dynamic realisation, i.e., it is the result of a growth process.
3.1
3.1.1
3.1.2
Construction
We follow van der Hofstad [9, Chapter 7]. Suppose that we take the degree sequence as the starting point of our model, i.e., for n ∈ N we associate with each vertex i ∈ V = {1, . . . , n} a pre-specified degree k_i ∈ N_0, forming a pre-specified degree sequence
$$\vec{k} = (k_1, \ldots, k_n), \qquad (3.1)$$
and we connect the vertices with edges in some way so as to realise these
degrees. To that end, we think of placing ki half-edges (stubs) incident to
vertex i, and matching the different half-edges in some way so as to form full
edges. One way to do this is to match the half-edges in a uniformly random
manner. This leads to what is called the configuration model (see Fig. 3.1). The
resulting random multi-graph is denoted by CMn (~k) and is referred to as the
configuration model. Chapter 7 describes algorithms to simulate CMn (~k).
It does not matter in which order the half-edges are paired in the pairing procedure. As long as, conditionally on the paired half-edges so far, the next half-edge is paired to any of the remaining half-edges with equal probability, the final outcome is the same in distribution. The total degree is $\sum_{i=1}^{n} k_i$, and the total number of edges is $\frac{1}{2} \sum_{i=1}^{n} k_i$, which is why the total degree must be even.
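The uniform pairing of half-edges takes only a few lines; a minimal sketch (self-loops and multiple edges are allowed, as in the construction above):

```python
import random

def configuration_model(degrees, rng):
    """Uniformly match half-edges into edges (the configuration model).

    degrees[i] half-edges are attached to vertex i; the total degree
    must be even. Self-loops and multiple edges may occur.
    """
    assert sum(degrees) % 2 == 0, "total degree must be even"
    half_edges = [i for i, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(half_edges)  # a uniform matching: pair consecutive entries
    return [(half_edges[2 * j], half_edges[2 * j + 1])
            for j in range(len(half_edges) // 2)]

rng = random.Random(7)
edges = configuration_model([3, 2, 2, 1], rng)
print(edges)  # 4 edges; vertex i appears degrees[i] times as an endpoint
```
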
Exercise 3.1 Show that there are $(2m-1)!! = (2m-1) \cdot (2m-3) \cdots 3 \cdot 1$ different ways of pairing 2m half-edges. Show that not all pairings give rise to
a different graph.
The degree distribution associated with $CM_n(\vec{k})$ is (recall (2.2))
$$f_{CM_n(\vec{k})} = n^{-1} \sum_{i=1}^{n} \delta_{k_i}. \qquad (3.2)$$
(3.3)
(3.4)
with Var(f ) the variance of f , then the resulting graph is simple with a strictly
positive probability. By conditioning on the graph being simple, we end up
with a random graph that has the pre-specified degree sequence. Sometimes
this is referred to as the repeated configuration model, since we may think of
the conditioning as repeatedly forming the graph until it is simple. Another
approach is to remove the self-edges and multiple edges afterwards, which is
referred to as the erased configuration model. It can be shown that when n → ∞, the degree distributions in these two models also converge to f. Hence, for large n the conditioning and the erasing do not alter the degrees by much, and they are completely harmless in the limit as n → ∞. To keep the computations
simple we stick to the original construction.
3.1.3
A natural question is: Which sequences of numbers can occur as the degree
sequence of a simple graph? A sequence $\vec{k} = (k_1, \ldots, k_n)$ with $k_1 \geq k_2 \geq \cdots$
Figure 3.1: Simulation of the configuration model with n = 7 vertices and degree
sequence ~k = (5, 5, 4, 5, 5, 3, 5). The pictures show how 16 pairs of half-edges are
randomly matched to become 16 edges. Courtesy Oliver Jovanovski.
is called graphical when it is the degree sequence of a simple graph; this happens if and only if $\sum_{i=1}^{n} k_i$ is even and
$$\sum_{i=1}^{l} k_i \leq l(l-1) + \sum_{i=l+1}^{n} \min(l, k_i), \qquad l = 1, \ldots, n-1. \qquad (3.5)$$
The necessity of this condition is easy to see. Indeed, the left-hand side is the
total degree of the first l vertices. The first term on the right-hand side is the
maximal total degree of the first l vertices coming from edges between them,
while the second term is a bound on the total degree of the first l vertices coming
from edges that connect to the other vertices. The sufficiency is harder to see,
and we refer to Choudum [7] for a proof.
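Condition (3.5), together with evenness of the total degree, gives a direct graphicality test; the following sketch is a straightforward transcription:

```python
def is_graphical(k):
    """Check whether a degree sequence is graphical, using the evenness
    of the total degree and condition (3.5) on the sorted sequence."""
    k = sorted(k, reverse=True)
    n = len(k)
    if sum(k) % 2 != 0:
        return False
    for l in range(1, n):
        lhs = sum(k[:l])
        rhs = l * (l - 1) + sum(min(l, ki) for ki in k[l:])
        if lhs > rhs:
            return False
    return True

print(is_graphical([3, 3, 2, 2]))  # True: e.g. the complete graph K4 minus one edge
print(is_graphical([4, 1, 1, 2]))  # False: a simple graph on 4 vertices has maximum degree 3
```
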
Exercise 3.2 Give an example of a non-graphical sequence ~k = (k1 , . . . , k4 )
for which k1 + . . . + k4 is even, and explain in a picture why it is non-graphical.
Arratia and Liggett [1] investigate the probability that an i.i.d. sequence
$$\vec{D} = (D_1, \ldots, D_n) \qquad (3.6)$$
is graphical. This becomes relevant when the degree sequence ~k in the configuration model is itself drawn as an i.i.d. sequence, say according to a pre-specified
probability distribution f on N0 . In that case automatically
$$\lim_{n \to \infty} E\big(\| f_{CM_n(\vec{D})} - f \|\big) = 0. \qquad (3.7)$$
It turns out that, under the assumption that $0 < \sum_{k \text{ even}} f(k) < 1$ (i.e., both even and odd degrees are possible),
$$\lim_{n \to \infty} P(\vec{D} \text{ is graphical}) = \begin{cases} 0, & \text{if } \lim_{n \to \infty} n F(n) = \infty, \\ \tfrac{1}{2}, & \text{if } \lim_{n \to \infty} n F(n) = 0, \end{cases} \qquad (3.8)$$
where $F(n) = \sum_{k \geq n,\, k \text{ even}} f(k)$.
In other words, by retaining only those realisations of $\vec{D}$ for which $\sum_{i=1}^{n} D_i$ is even, we make it possible for $CM_n(\vec{D})$ to be simple, i.e., the probability that
$CM_n(\vec{D})$ is simple is strictly positive. It can be shown that if f(0) = 0 and $\sum_{k \in \mathbb{N}_0} k^2 f(k) < \infty$, then
$$\lim_{n \to \infty} P\Big(CM_n(\vec{D}) \text{ is simple} \,\Big|\, \sum_{i=1}^{n} D_i \text{ even}\Big) = \exp\big[-\tfrac{1}{2}\nu - \tfrac{1}{4}\nu^2\big] \in (0, 1] \qquad (3.11)$$
with
$$\nu = \sum_{k \in \mathbb{N}} \frac{k(k-1)\, f(k)}{N}, \qquad (3.12)$$
where $N = \sum_{k \in \mathbb{N}} k f(k)$ is the normalisation constant. This shows that the repeated configuration model is a feasible way to generate simple random graphs with a prescribed degree distribution.
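The limit (3.11) can be probed by simulation. For a d-regular degree sequence one gets ν = d − 1, so the fraction of uniform pairings that yield a simple graph should approach exp[−ν/2 − ν²/4]; the sizes and seed below are arbitrary choices.

```python
import math
import random

def configuration_model_is_simple(degrees, rng):
    """Pair half-edges uniformly at random and report whether the
    resulting multi-graph is simple (no self-loops, no multiple edges)."""
    half = [i for i, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(half)
    seen = set()
    for j in range(len(half) // 2):
        u, v = half[2 * j], half[2 * j + 1]
        if u == v or (min(u, v), max(u, v)) in seen:
            return False
        seen.add((min(u, v), max(u, v)))
    return True

rng = random.Random(3)
n, d = 200, 3                      # 3-regular degrees: nu = d - 1 = 2
trials = 2000
frac = sum(configuration_model_is_simple([d] * n, rng)
           for _ in range(trials)) / trials
print(frac, math.exp(-2 / 2 - 2 ** 2 / 4))  # empirical fraction vs exp[-nu/2 - nu^2/4]
```
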
3.1.4
Percolation transition
$$\tilde{f}(k) = \frac{(k+1)\, f(k+1)}{N}, \qquad k \in \mathbb{N}_0, \qquad (3.13)$$

Figure 3.2: The vertex ? linked to a neighbour ] that has k neighbours not linked to ?.
Erdős-Rényi random graph). But for large n this effect is minor and so we can think of $\tilde{f}$ as the forward degree of vertices in the exploration process. Since (compare (3.12) and (3.13))
$$\nu = \sum_{k \in \mathbb{N}_0} k \tilde{f}(k) \qquad (3.14)$$
is the average forward degree, this explains why the percolation transition occurs
at ν = 1 (recall Homework 2.2 and the intermezzo on branching processes in Chapter 2).
Note that if f ≠ δ_1, then $\nu + 1 > \sum_{k \in \mathbb{N}} k f(k)$, which is the average degree. In the language of social networks this inequality can be expressed as:
This sounds paradoxical, but it is not. You are more likely to be friends with
a person who has many friends than with a person who has few friends. This
causes a bias, which is precisely what (3.13) captures.
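The bias is easy to exhibit numerically: picking a uniform vertex weights all vertices equally, while following an edge to a "friend" weights each vertex by its degree. A minimal sketch on a star graph (an extreme example chosen purely for illustration):

```python
def mean_degree_and_mean_friend_degree(adj):
    """Average degree vs. average degree of a 'friend': a friend is the
    endpoint of a uniformly chosen edge, so vertices are weighted by degree."""
    degrees = {v: len(nb) for v, nb in adj.items()}
    total = sum(degrees.values())
    mean_deg = total / len(degrees)
    friend_deg = sum(d * d for d in degrees.values()) / total  # size-biased mean
    return mean_deg, friend_deg

# Star graph: a hub of degree 4 and four leaves of degree 1.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
print(mean_degree_and_mean_friend_degree(star))  # prints (1.6, 2.5)
```

Your friend's number of friends is the forward degree plus one, which is exactly the degree-biased quantity computed above.
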
3.1.5
Scaling features
The configuration model can be made sparse and scale free by construction: since the degree distribution is prescribed, it can be chosen so as to satisfy the conditions in (2.9) and (2.11).
In van der Hofstad [10, Chapter 5] it is shown that the configuration model with i.i.d. degrees is small-world, namely, for any ν > 1,
$$\lim_{n \to \infty} P\big(H_{CM_n(\vec{D})} \leq K \log n\big) = 1, \qquad K > \nu/(\nu - 1), \qquad (3.15)$$
where we recall (2.7). The intuition behind this result is similar to that for the Erdős-Rényi random graph, with ν taking over the role of λ. If the degree distribution f has exponent τ ∈ (2, 3) (recall (2.11)), so that
$$\sum_{k \in \mathbb{N}_0} k f(k) < \infty, \qquad \sum_{k \in \mathbb{N}_0} k^2 f(k) = \infty, \qquad (3.16)$$
then the configuration model is even ultra small-world: distances are at most
of order log log n.
Homework 3.1 Is the configuration model with i.i.d. degrees highly clustered? Hint: Recall (2.12), and compute the probability that ? lies in a wedge, respectively, in a triangle in the limit as n → ∞. Use Fig. 3.2.
Homework 3.1 shows that CMn is locally tree-like, i.e., the number of triangles
grows much slower with n than the number of wedges.
3.2
3.2.1
3.2.2
Construction
We follow van der Hofstad [9, Chapter 8]. The preferential attachment model we consider depends on two parameters, m ∈ N and δ ∈ (−m, ∞), and produces a random multi-graph process, denoted by
$$\big(PA_n(m, \delta)\big)_{n \in \mathbb{N}}, \qquad (3.17)$$
such that for every n the graph has n vertices, mn edges and total degree 2mn (see Exercise 3.6 below).
We begin by defining the model for m = 1 (see Fig. 3.3). In this case, PA_1(1, δ) consists of a single vertex v_1 with a single self-loop (which has degree 2). Let
$$\{v_1, \ldots, v_n\} \qquad (3.18)$$
Figure 3.3: The first two iterations in the construction of PA_n(1, δ). The first iteration
is a single vertex v1 with a single self-loop. The second iteration adds a vertex v2 and
links this via a single edge either to itself or to v1 , with probabilities that depend on
the degree of v1 (which is 2 after the first iteration). Subsequent iterations involve
adding vertices one by one and linking them via a single edge to the already existing
vertices with probabilities that depend on the current degrees of these vertices.
(3.19)

$$P\big(v_{n+1} \to v_i \,\big|\, PA_n(1, \delta)\big) = \begin{cases} \dfrac{1+\delta}{n(2+\delta)+(1+\delta)}, & i = n+1, \\[2mm] \dfrac{D_i(n)+\delta}{n(2+\delta)+(1+\delta)}, & i = 1, \ldots, n. \end{cases} \qquad (3.21)$$
Note that the degrees in (3.19) are random and typically change as more vertices are added: D_i(n) depends on the vertex label i and the stage of the iteration n. Note that the parameter δ is added to the degrees, which amounts to a shift of the proportionality in the preferential attachment.
Exercise 3.3 Verify that $D_i(n) \geq 1$ for all $n \geq i$, so that $D_i(n) + \delta \geq 0$ for all $n \geq i$ because $\delta \geq -1$. Also verify that $\sum_{i=1}^{n} D_i(n) = 2n$ for all n.
Exercise 3.4 Verify that the attachment probabilities in (3.21) sum up to 1.
Exercise 3.5 Show that PA_n(1, −1) consists of a self-loop at vertex v_1, while each other vertex is connected to v_1 by precisely one edge.
Homework 3.2 Fix i ∈ N. Show that
$$P\Big(\lim_{n \to \infty} D_i(n) = \infty\Big) = 1. \qquad (3.22)$$
For δ = 0 the attachment probabilities in (3.21) reduce to
$$P\big(v_{n+1} \to v_i \,\big|\, PA_n(1, 0)\big) = \begin{cases} \dfrac{1}{2n+1}, & i = n+1, \\[2mm] \dfrac{D_i(n)}{2n+1}, & i = 1, \ldots, n, \end{cases} \qquad (3.23)$$
and for δ = −1 to
$$P\big(v_{n+1} \to v_i \,\big|\, PA_n(1, -1)\big) = \begin{cases} 0, & i = n+1, \\[2mm] \dfrac{D_i(n)-1}{n}, & i = 1, \ldots, n. \end{cases} \qquad (3.24)$$
(3.25)
Note that an edge in $PA_{mn}(1, \delta/m)$ is attached to vertex $v_i$ with a probability proportional to the weight of vertex $v_i$, which according to the second line of (3.21) is equal to the degree of vertex $v_i$ plus $\delta/m$. Since for each $j \in \{1, \ldots, n\}$ the vertices $\{v_{(j-1)m+1}, \ldots, v_{jm}\}$ in $PA_{mn}(1, \delta/m)$ are collapsed into a single vertex $v_j[m]$ in $PA_n(m, \delta)$, an edge in $PA_n(m, \delta)$ is attached to vertex $v_j[m]$ with a probability proportional to the total weight of the vertices $\{v_{(j-1)m+1}, \ldots, v_{jm}\}$. Since the sum of the degrees of these vertices is equal to the degree of vertex $v_j[m]$, this probability in turn is proportional to the degree of vertex $v_j[m]$ in $PA_n(m, \delta)$ plus $\delta$. Thus, also $PA_n(m, \delta)$ grows in an affine manner.
In the above construction the degrees are updated each time an edge is attached. This is referred to as intermediate updating of the degrees. It is possible to define the model with m ∈ N\{1} directly, without the help of the model with m = 1, but the construction is a bit more involved.
The model with δ = 0 is the Barabási-Albert model, which has received a lot of attention in the literature and was formally defined in Bollobás and Riordan [6]. The extra parameter δ was introduced by van der Hofstad [9, Chapter 8] and makes the model more flexible.
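For m = 1 the growth rule can be simulated directly from the attachment probabilities (3.21). The sketch below tracks only the degree sequence, not the edges themselves; its parameters and seed are illustrative choices.

```python
import random

def preferential_attachment(n, delta, rng):
    """Simulate the degrees of PA_n(1, delta): vertex t+1 attaches a single
    edge to itself with probability (1+delta)/(t(2+delta)+(1+delta)),
    otherwise to vertex i with probability proportional to D_i(t)+delta."""
    deg = [2.0]                        # v1 starts with a self-loop (degree 2)
    for t in range(1, n):
        total = t * (2 + delta) + (1 + delta)
        if rng.random() < (1 + delta) / total:
            deg.append(2.0)            # self-loop at the new vertex
        else:
            weights = [d + delta for d in deg]
            i = rng.choices(range(t), weights=weights)[0]
            deg[i] += 1
            deg.append(1.0)
    return deg

rng = random.Random(0)
deg = preferential_attachment(2000, 0.0, rng)
print(max(deg), sum(deg))  # old vertices dominate; the total degree is 2n
```
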
3.2.3
Scaling features
The following two results are taken from van der Hofstad [9, Chapter 8] and are valid for any m ∈ N and δ > −m.

(1) The random graph process $(PA_n(m, \delta))_{n \in \mathbb{N}}$ is sparse with limiting degree distribution $f_{PA}$ given by
$$f_{PA}(k) = \begin{cases} 0, & k = 0, \ldots, m-1, \\[2mm] \big(2 + \tfrac{\delta}{m}\big)\, \dfrac{\Gamma(k+\delta)\, \Gamma(m+2+\delta+(\delta/m))}{\Gamma(k+3+\delta+(\delta/m))\, \Gamma(m+\delta)}, & k \geq m, \end{cases}$$
where
$$\Gamma(t) = \int_0^{\infty} x^{t-1} e^{-x}\, dx, \qquad t > 0. \qquad (3.27)$$

(2) The limiting degree distribution satisfies
$$f_{PA}(k) = c_{m,\delta}\, k^{-\tau}\, [1 + O(1/k)], \qquad k \to \infty, \qquad (3.28)$$
with
$$\tau = 3 + (\delta/m), \qquad c_{m,\delta} = (\tau - 1)\, \frac{\Gamma(m+\delta+(\tau-1))}{\Gamma(m+\delta)}. \qquad (3.29)$$

Hence $(PA_n(m, \delta))_{n \in \mathbb{N}}$ is scale free with exponent τ.
Exercise 3.7 Look up the properties of $t \mapsto \Gamma(t)$ on Wikipedia. With the help of partial integration and induction it can be shown that $\Gamma(j) = (j-1)!$ for $j \in \mathbb{N}$.
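As a small numerical companion (ours, not part of the notes), the limiting degree distribution can be evaluated through the log-Gamma function to avoid overflow at large k. For m = 1, δ = 0 the formula reduces to fPA(k) = 4/(k(k+1)(k+2)), which sums to 1 over k ≥ 1 and decays like 4 k^{−3}, matching τ = 3:

```python
import math

def f_pa(k, m, delta):
    """Limiting degree distribution of PA_n(m, delta), evaluated via
    log-Gamma (math.lgamma) for numerical stability at large k."""
    if k < m:
        return 0.0
    a = delta / m
    log_p = (math.log(2 + a)
             + math.lgamma(k + delta) + math.lgamma(m + 2 + delta + a)
             - math.lgamma(k + 3 + delta + a) - math.lgamma(m + delta))
    return math.exp(log_p)

# sanity check for m = 1, delta = 0: f_pa(k) = 4 / (k (k+1) (k+2))
total = sum(f_pa(k, 1, 0) for k in range(1, 10001))
# total is very close to 1, and f_pa(k) * k**3 approaches c_{1,0} = 4
```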
Exercise 3.8 Why is the result in Homework 3.2 (= every vertex eventually
sees its degree tend to infinity) not in contradiction with the fact that the degree
distribution converges to fPA (= sparseness)?
For m = 1 the above formulas simplify to

fPA(k) = 0 for k = 0, and

fPA(k) = (2 + δ) Γ(k + δ) Γ(3 + 2δ) / [Γ(k + 3 + 2δ) Γ(1 + δ)],  k ≥ 1,   (3.30)

and

fPA(k) = c_{1,δ} k^{−τ} [1 + O(1/k)],  k → ∞,   (3.31)

with

τ = 3 + δ,  c_{1,δ} = (2 + δ) Γ(3 + 2δ) / Γ(1 + δ).   (3.32)

Figure 3.6 shows a realisation of the degree sequence of PA_n(2, 0) for n = 300,000 and n = 1,000,000. The horizontal axis is the degree k; the vertical axis is the number of vertices with degree k, corresponding to n fPA(k).
Figure 3.6: The degree sequence of a preferential attachment random graph with m = 2, δ = 0 and n = 300,000, respectively n = 1,000,000, on a log-log scale. Courtesy of Remco van der Hofstad.
In van der Hofstad [10, Chapter 7] it is shown that (PA_n(m, δ))_{n∈N} is small-world for any m ∈ N and δ > −m. Unfortunately, the proof is not easy and there is no good control on the constant K. For m ∈ N\{1} and δ ∈ (−m, 0) it is even ultra-small-world. It can also be shown that (PA_n(m, δ))_{n∈N} is not highly clustered, because the random graph is locally tree-like, i.e., the number of triangles grows much more slowly with n than the number of wedges.
3.2.4
Dynamic robustness
The important feature of the preferential attachment model is that, unlike in the configuration model, the power-law degree distribution is explained via a mechanism for the growth of the graph. Therefore, preferential attachment offers a possible explanation of why power-law degree distributions occur in real-world networks. As Barabási [2] puts it:
... the scale-free topology is evidence of organising principles acting at
each stage of the network formation. (...) No matter how large and complex a network becomes, as long as preferential attachment and growth are
present it will maintain its hub-dominated scale-free topology.
This is correct, but it overstates the point a bit, since power laws are intimately related to the affineness of the attachment probabilities. Indeed, it turns out that if the probability for a new vertex to attach itself to an old vertex with degree k is chosen proportional to k^γ with γ ∈ (0, 1), then fPA falls off like a stretched exponential and scale-freeness is lost (Krapivsky, Redner and Leyvraz [12]). On the other hand, if γ ∈ (1, ∞), then there is a single vertex that is connected to nearly all the other vertices (Krapivsky and Redner [11]). Moreover, if 1/(γ − 1) is non-integer, then there are finitely many vertices with degree > 1/(γ − 1) and infinitely many vertices with degree < 1/(γ − 1) (Oliveira and Spencer [13]).
Many more possible explanations have been given for why power laws occur in real-world networks, and many adaptations of the above simple preferential attachment model have been studied in the literature, all giving rise to power-law degree distributions.
While preferential attachment is most natural in social networks, some form of it is also likely to be present in other examples of real-world networks. For example, in the WWW, when a new webpage is created it is more likely to link to an already popular site, such as Google, than to the personal web page of a single individual. For the Internet it may be profitable
for new routers to be connected to highly connected routers, since these give
rise to short distances. Even in biological networks some form of preferential
attachment exists. In fact, the idea of preferential attachment in the context of
the evolution of species dates back to Yule [14] in 1925.
Bibliography
[1] R. Arratia and T.M. Liggett, How likely is an i.i.d. degree sequence to be graphical? Ann. Appl. Probab. 15 (2005) 652–670.
[2] A.-L. Barabási, Linked: The New Science of Networks, Perseus Publishing, Cambridge, Massachusetts, 2002.
[3] A.-L. Barabási and R. Albert, Emergence of scaling in random networks, Science 286 (1999) 509–512.
[4] E.A. Bender and E.R. Canfield, The asymptotic number of labelled graphs with given degree sequences, Journal of Combinatorial Theory (A) 24 (1978) 296–307.
[5] B. Bollobás, A probabilistic proof of an asymptotic formula for the number of labelled regular graphs, European J. Combin. 1 (1980) 311–316.
[6] B. Bollobás and O. Riordan, The diameter of a scale-free random graph, Combinatorica 24 (2004) 5–34.
[7] S.A. Choudum, A simple proof of the Erdős–Gallai theorem on graph sequences, Bull. Austral. Math. Soc. 33 (1986) 67–70.
[8] P. Erdős and T. Gallai, Graphs with points of prescribed degrees (in Hungarian), Mat. Lapok 11 (1960) 264–274.
[9] R. van der Hofstad, Random Graphs and Complex Networks, Volume I, monograph in preparation. File can be downloaded from http://www.win.tue.nl/~rhofstad/
[10] R. van der Hofstad, Random Graphs and Complex Networks, Volume II, monograph in preparation. File can be downloaded from http://www.win.tue.nl/~rhofstad/
[11] P.L. Krapivsky and S. Redner, Organization of growing random networks, Phys. Rev. E 63 (2001) 066123.
[12] P.L. Krapivsky, S. Redner and F. Leyvraz, Connectivity of growing random networks, Phys. Rev. Lett. 85 (2000) 4629.
[13] R. Oliveira and J. Spencer, Connectivity transitions in networks with superlinear preferential attachment, Internet Math. 2 (2005) 121–163.
[14] G.U. Yule, A mathematical theory of evolution, based on the conclusions of Dr. J.C. Willis, F.R.S., Phil. Trans. Roy. Soc. London B 213 (1925) 21–87.
Chapter 4
Network Topology
In this chapter we discuss some of the most important empirical properties
observed in real-world networks. To this end, we first introduce some basic
notions in Section 4.1 and then discuss various empirical properties in some
detail throughout Section 4.2. Most of these empirical properties are computed
on real-world examples of the type discussed in Chapter 1.
This chapter is meant as a general introduction to the structural characterisation of real-world networks, and also as a compact summary of the most commonly observed empirical properties. The chapter puts some of the definitions already introduced in Chapters 2 and 3 to work on empirical networks,
and at the same time it introduces a series of new definitions aimed at capturing
more structural details. The aim is to complement the mathematical definitions
of the previous chapters with a phenomenological basis and to provide a solid
empirical reference for the following chapters.
Other empirical introductions to networks from a rather general point of view can be found in the review articles [1, 2, 3, 4] and the books [5, 6, 7, 8, 9].
4.1
Basic notions
First we introduce some basic definitions. In some cases, these definitions (or the notation) are slightly different from the corresponding definitions we gave in Chapter 2 because, for instance, here we need to distinguish between directed and undirected networks, and between so-called local and global properties. This should not alarm readers: being aware of the existence of different quantitative expressions for the same abstract notion is actually an instructive exercise. These different expressions reflect the existing variety of definitions in the scientific literature about complex networks (and in most other fields as well).
While the terms graph and edge are preferred in the definition of abstract mathematical models, the terms network and link are more commonly used when referring to real-world objects. In this chapter we will therefore prefer the latter choice, even if we occasionally employ the former as well.
In general, the links of a network can be either directed, if an orientation (i.e.
an arrow) is specified along them, or undirected, if no orientation is specified.
Correspondingly, the whole network is denoted as directed (see Fig. 4.1a) or
undirected (see Fig. 4.1b). More precisely, undirected links are bidirectional
ones, since they allow transit in both directions. For this reason, an undirected
network can always be thought of as a directed one where each undirected link
is replaced by two directed ones pointing in opposite directions (see Fig. 4.1c).
A link in a directed network is said to be reciprocated if another link between the same pair of vertices, but with opposite direction, is also present. Therefore, an
undirected network can be regarded as a special case of a directed network where
all links are reciprocated.
In a real-world network, the identity of each vertex matters. For this reason,
if n is the number of vertices, each vertex is explicitly labelled with an integer
number i = 1, . . . , n. All the topological information can then be compactly
expressed by defining the n × n adjacency matrix of the network, whose
entries tell us whether a link is present between two vertices (this is what is
ordinarily done, for instance, to store network data in a computable form). For
directed networks, we denote the adjacency matrix elements by aij and define
them as follows:
a_ij = 1 if a directed link from vertex i to vertex j is present, and a_ij = 0 otherwise.   (4.1)

For undirected networks, we denote the adjacency matrix elements by b_ij, with b_ij = b_ji = 1 if an undirected link between i and j is present, and b_ij = b_ji = 0 otherwise.   (4.2)

When an undirected network is regarded as a directed one, with each undirected link replaced by two reciprocated directed links, the entries of the resulting directed adjacency matrix are
Figure 4.2: Examples of simple, familiar undirected networks. a) Periodic one-dimensional chain (ring) with first- and second-neighbour interactions. b) Two-dimensional lattice with only first-neighbour interactions. c) Fully connected network (mean-field approximation). All these networks are regular, and no disorder is introduced.
simply given by

a_ij ≡ b_ij ,   (4.3)

where b_ij is the corresponding entry of the adjacency matrix of the original undirected network. In this particular case,
aij is a symmetric matrix. Note that this mapping can be reversed in order to
recover the original undirected network: from Fig. 4.1b we can always obtain
Fig. 4.1c, and vice versa. Conversely, the mapping of a directed network onto an undirected one (where an undirected link is placed between vertices connected by at least one directed link) is also possible, even if in general it cannot be reversed, due to a loss of information. For instance, the network shown
in Fig. 4.1b is the undirected version of that shown in Fig. 4.1a. From Fig. 4.1a
we can obtain Fig. 4.1b, but from the latter we cannot go back to Fig. 4.1a
unless we are given more information.
Homework 4.1 Imagine a generic directed network where not all links are
reciprocated, and then consider its undirected projection. Find the mathematical relation between the entries {bij } of the adjacency matrix of the projected
undirected network and the entries {aij } of the adjacency matrix of the original
directed network. Test your relation on the networks shown in Figs. 4.1a and
4.1b (by assuming that the former is the original directed network), and then
for the networks shown in Figs. 4.1b and 4.1c (by assuming that the latter is
the original directed network).
Before introducing some specific real-world networks and presenting their
empirical properties, we briefly mention the simplest and most familiar kind of networks that scientists from different fields have traditionally been acquainted with, namely the class of (deterministic) regular networks. These are networks where all vertices are connected to the same number z of neighbours, in
a highly ordered fashion. A different class of regular networks is that of random
regular graphs, where all vertices still have the same number of neighbours, but
the connections are randomly established between vertices. It should be noted
that such graphs are another example of random graphs, different from the
56
Erdős–Rényi model discussed in Chapter 2. In fact, they are a particular case of the Configuration Model introduced in Chapter 3, where the degree sequence is now chosen to be the constant vector k = (z, . . . , z).
In Fig. 4.2 we show three examples of regular (undirected) networks: a periodic chain with first- and second-neighbour interactions (z = 4), a two-dimensional lattice with only first-neighbour interactions (z = 4) and a complete network (where each vertex is connected to all the others: z = n − 1).
Chains and square lattices are particular examples of the more general class of D-dimensional discrete lattices, used whenever a set of elements is assumed to be connected to its first, second, . . . and l-th neighbours (nearest-neighbour connections). In this case, each vertex is connected to z = 2Dl other vertices.
Complete networks are instead used when infinite-range connections are assumed, resulting in what is sometimes referred to as the mean-field scenario,
i.e. z = n − 1. The highly ordered structure of these networks translates into
certain regularities of their adjacency matrices.
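As an illustration (a sketch of ours, not from the text), the adjacency matrix of a periodic ring with connections up to the l-th neighbour can be built directly, and its regularity checked against z = 2Dl with D = 1:

```python
def ring_adjacency(n, l=1):
    """Adjacency matrix (list of lists) of a periodic one-dimensional chain
    in which each vertex is linked to its first, ..., l-th neighbours on
    both sides (D = 1). Assumes n > 2l so that neighbourhoods do not wrap."""
    b = [[0] * n for _ in range(n)]
    for i in range(n):
        for s in range(1, l + 1):
            b[i][(i + s) % n] = 1  # s-th neighbour clockwise
            b[i][(i - s) % n] = 1  # s-th neighbour counterclockwise
    return b

b = ring_adjacency(10, l=2)
degrees = [sum(row) for row in b]
# every vertex has degree z = 2 * D * l = 4 (here D = 1, l = 2)
```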
Exercise 4.2 Write the adjacency matrices for the three networks shown in Fig. 4.2. Before doing so, find the most convenient labelling of the vertices in each network and describe it.
The examples of regular networks considered above can be built deterministically, in other words without introducing randomness. They are among the simplest specifications of networks and represent only a small subset of the full range of possible topological configurations. The rest of this chapter aims at showing that real networks are consistent with neither ER random graphs nor regular networks. Therefore, traditional assumptions such as nearest-neighbour or mean-field connections cannot be considered good choices for most real-world networks, and neither can the predictions they yield for the dynamical behaviour of processes defined on them. This problem persists even
after introducing randomness in regular networks: random regular graphs, while
exhibiting a behaviour that is much richer than that of their deterministic counterparts, are still not good models of real-world networks. The failure of regular
networks motivates the introduction of more complex models, some of which
have been presented in Chapter 3 and some of which will be introduced in
Chapter 5.
4.2
4.2.1
First-order properties
In an undirected network, the degree of vertex i is defined as

k_i ≡ Σ_{j≠i} b_ij ,   (4.4)

while in a directed network each vertex i has both an in-degree k_i^in and an out-degree k_i^out,

k_i^in ≡ Σ_{j≠i} a_ji ,   k_i^out ≡ Σ_{j≠i} a_ij .   (4.5)

Many real-world networks are found to have an (approximately) power-law empirical degree distribution,

P(k) ∝ k^{−γ} ,   (4.6)

or, for directed networks,

P^in(k^in) ∝ (k^in)^{−γ_in} ,   P^out(k^out) ∝ (k^out)^{−γ_out} ,   (4.7)

where γ_in and γ_out have in general different values, both typically between 2 and 3.
For the practical purpose of plotting empirical degree distributions and estimating their exponents, the (empirical) cumulative distributions are commonly
used:
P_>(k) ≡ Σ_{k′≥k} P(k′) ,   P^in_>(k^in) ≡ Σ_{k′≥k^in} P^in(k′) ,   P^out_>(k^out) ≡ Σ_{k′≥k^out} P^out(k′) .   (4.8)

In this way, the statistical noise is reduced by summing over k′.
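A minimal way to compute P_>(k) from a degree sequence, sketched here with an illustrative helper name of ours:

```python
from collections import Counter

def cumulative_degree_distribution(degrees):
    """Empirical cumulative distribution P_>(k) = sum over k' >= k of P(k'),
    computed from a list of vertex degrees. Returns a dict degree -> P_>(k)."""
    n = len(degrees)
    counts = Counter(degrees)
    p_gt = {}
    tail = 0.0
    for k in sorted(counts, reverse=True):  # accumulate from the largest degree down
        tail += counts[k] / n
        p_gt[k] = tail
    return p_gt

# example: P_>(k) for a small degree sequence
p = cumulative_degree_distribution([1, 1, 1, 2, 2, 3])
# p[1] == 1.0, p[2] == 0.5, p[3] == 1/6
```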
Exercise 4.4 Plot the empirical cumulative (in- and out- where applicable)
degree distributions for the three networks shown in Fig. 4.1.
If the empirical degree distribution has the power-law behaviour of Eq. (4.6) or (4.7), then the empirical cumulative distributions are again power laws, but with an exponent reduced by one:

P_>(k) ∝ k^{−γ+1} ,   P^in_>(k^in) ∝ (k^in)^{−γ_in+1} ,   P^out_>(k^out) ∝ (k^out)^{−γ_out+1} .   (4.9)

A pure power-law distribution has the form

P(k) = a k^{−γ} ,   k ∈ N,   a > 0.   (4.10)
Figure 4.3: Empirical cumulative degree distribution for three different networks. a) P_>(k) for the Internet at the autonomous system level in 1999 [10]. b) P_>(k) for the protein interaction network of the yeast Saccharomyces cerevisiae [11]. c) P^in_>(k^in) for a 300 million vertex subset of the WWW in 1999 [12]. All curves are approximately straight lines on a log-log scale, indicating that they are power-law distributions (modified from ref. [4]).
Power-law distributions are characterised by the presence of so-called fat tails: compared to distributions that decay (at least) exponentially (e.g. Gaussian or Poisson distributions), power-law distributions decay much more slowly and assign a much larger probability to rare events (the outcomes in the tail of the distribution).
The empirical scale-free behaviour means that in real-world networks there are predominantly many low-degree vertices, but also a fair number of high-degree ones, which are connected to a significant fraction of the other vertices. The fraction of vertices with very large degrees (i.e. hubs) is not negligible and gives rise to a whole hierarchy of connectivities, from small to large. If we imagine that a dynamical process takes place on the network (see for instance Chapters 8, 9, 11, 12), the scale-free property has a remarkable effect: once the process reaches a high-degree vertex, it propagates to a large portion of the entire network, resulting in extremely fast dynamics.
By contrast, note that the regular networks introduced in Section 4.1 have a delta-like empirical degree distribution of the form P(k) = δ_{k,z}, where z is the degree of every vertex (see Fig. 4.2). For D-dimensional lattices with l-th-neighbour interactions, z = 2Dl and no vertex is connected to a significant fraction of the other vertices. For fully connected networks, z = n − 1 and every
vertex is connected to all the others. In all these cases, no hierarchy is present
and the network is perfectly homogeneous.
Average degree and number of links
It is possible to consider the average degree ⟨k⟩ as a single quantity characterising the overall first-order properties of a network, and then compare different networks with respect to it. In an undirected network the average degree can be expressed as

⟨k⟩ = (Σ_i k_i)/n = 2L_u/n ,   (4.11)

where

L_u ≡ (1/2) Σ_i Σ_{j≠i} b_ij = Σ_i Σ_{j<i} b_ij   (4.12)
is the total number of undirected links in the network, expressed in terms of the
entries of the adjacency matrix. Note that in principle the total number of links
also includes self-loops, which are links starting and ending at the same vertex
(corresponding to nonzero diagonal entries of the adjacency matrix). However,
here and in the following we assume that there are no self-loops in the network,
and this is reflected in the requirement i ≠ j in Eq. (4.12).
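Eqs. (4.11)-(4.12) translate directly into code; the following sketch (function name ours) computes L_u and ⟨k⟩ from a symmetric adjacency matrix with zero diagonal:

```python
def undirected_stats(b):
    """Average degree <k> and number of undirected links L_u, as in
    Eqs. (4.11)-(4.12), for a symmetric adjacency matrix b without self-loops."""
    n = len(b)
    l_u = sum(b[i][j] for i in range(n) for j in range(i))  # sum over j < i
    degrees = [sum(row) for row in b]
    avg_k = sum(degrees) / n
    return avg_k, l_u

# a triangle plus one pendant vertex attached to vertex 0
b = [[0, 1, 1, 1],
     [1, 0, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0]]
avg_k, l_u = undirected_stats(b)
# L_u = 4 links, <k> = 2 * 4 / 4 = 2.0, as Eq. (4.11) requires
```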
Exercise 4.5 Using the result of Exercise 4.3, calculate ⟨k⟩ and L_u for the undirected network in Fig. 4.1. Check that ⟨k⟩ = 2L_u/n as stated in Eq. (4.11).
For directed networks, it is easy to see that the average in-degree ⟨k^in⟩ equals the average out-degree ⟨k^out⟩, and both quantities can be expressed as

⟨k^in⟩ = ⟨k^out⟩ = (Σ_i k_i^in)/n = (Σ_i k_i^out)/n = L/n ,   (4.13)

where

L ≡ Σ_i Σ_{j≠i} a_ij   (4.14)

is the total number of directed links, expressed in terms of the adjacency matrix entries.
Exercise 4.6 Using the result of Exercise 4.3, calculate ⟨k^in⟩, ⟨k^out⟩ and L for the two directed networks in Fig. 4.1. For both networks, check that ⟨k^in⟩ = ⟨k^out⟩ = L/n as stated in Eq. (4.13).
It should be noted that we chose a different notation for Lu and L to avoid
confusion when an undirected network is regarded as directed, with two directed
links replacing each undirected one. In that case the mapping described by
Eq. (4.3) allows us to recover Eq. (4.14) consistently from Eq. (4.12), and our
notation yields L = 2Lu as expected.
Note that, in terms of the degree distribution, the average degree ⟨k⟩ reads

⟨k⟩ = Σ_{k′} k′ P(k′) ,   (4.15)

and similarly

⟨k^in⟩ = ⟨k^out⟩ = Σ_{k′} k′ P^in(k′) = Σ_{k′} k′ P^out(k′) .   (4.16)
Link density
The number of links is an interesting property in itself, being a measure of the
density of connections in the network. In order to compare networks with different numbers of vertices, the number of links is usually divided by its maximum
possible value, in order to obtain the link density. In an undirected network, the maximum possible number of links (with self-loops excluded) is given by the total number of vertex pairs, which is n(n − 1)/2 if the number of vertices is n. Therefore the link density is defined as

c_u ≡ 2L_u / [n(n − 1)] .   (4.17)

Similarly, in a directed network the maximum possible number of links (again excluding self-loops) is n(n − 1), and the link density is defined as

c ≡ L / [n(n − 1)] .   (4.18)
It is instructive to plot the values of the link density c for different real-world networks in a single figure, as a function of the number n of vertices. An example of such a plot is shown in Fig. 4.4. We see that networks of the same type tend to be clustered together, and that all points approximately follow the trend

c(n) ∼ n^{−1} .   (4.19)

There is however one important exception: the temporal snapshots of the World Trade Web lie off the curve and tend to have a constant link density, independent of n.
Homework 4.5 Show that, in the limit n → ∞ of large network size, D-dimensional lattices display c_u(n) ∼ n^{−1} → 0 (note that these networks are undirected, unlike those in Fig. 4.4).
In the limit n → ∞, the case c_u(n) → 0 is often referred to as the sparse network limit, while c_u(n) → c_u(∞) > 0 (where c_u(∞) is a finite constant) is the dense network limit. In graph theory, these limits can be defined rigorously for mathematical models (see for instance Chapter 2). Real networks are however of finite size, therefore in principle we should speak of a large-size regime rather than an infinite-size limit, the latter being only a formal extrapolation of Eq. (4.19). Bearing this warning in mind, we conclude that all the real-world networks in Fig. 4.4 are sparse, except the WTW, which is a dense network. Indeed, most real-world networks are found to be sparse.
4.2.2
Second-order properties
By second-order topological properties we denote those properties which depend not only on the direct connections between a vertex and its nearest neighbours, but also on the indirect connections from a vertex to the neighbours of its neighbours. Therefore the computation of these properties involves products of two adjacency matrix elements, b_ij b_jk. In Section 3.1.4 we encountered a second-order property when we looked at the distribution of vertices at distance 2 from a given vertex.
Figure 4.4: Link density c versus number of vertices n for several real-world directed networks. Except for the WTW, all points roughly follow the dashed line ∼ n^{−1}.
Degree-degree correlations
An important example of second-order structure is given by the degree correlations: is the degree of a vertex correlated with that of its first neighbours?
Statistically speaking, the most complete way to describe second-order topological properties is to consider the two-vertex conditional empirical degree distribution P(k′|k), specifying the probability that a vertex with degree k is connected to a vertex with degree k′. In the trivial case with no correlation between the degrees of connected vertices, the second-order properties can be obtained in terms of the first-order ones; in other words, the conditional probability is equal to the unconditional (marginal) probability that a vertex is connected to a vertex of degree k′:

P(k′|k) = k′ P(k′) / ⟨k⟩ .   (4.20)

However, as we will show in the following, real networks display a more complex behaviour and are characterised by nontrivial degree correlations which make the form of P(k′|k) deviate from Eq. (4.20).
Figure 4.5: Assortative and disassortative mixing in a generic network, as measured by the increasing or decreasing trend of the average nearest-neighbour degree k_nn(k) as a function of the degree k.
Figure 4.6: Plots of the average nearest-neighbour degree for two real networks. a) The k_nn(k) plot for the 1998 snapshot of the Internet (circles); the solid line is proportional to k^{−0.5} (modified from Ref. [16]). b) The three plots k^{nn,in}(k^in), k^{nn,out}(k^out) and k_nn(k) for a snapshot of the World Trade Web in 2000 (the solid line is again proportional to k^{−0.5}); the inset reports the k_nn(k) curve for the subset of the undirected network defined only by the reciprocated links (after Ref. [17]).
From the above expression we recover the expected constant trend for the uncorrelated networks described by Eq. (4.20), which inserted into Eq. (4.22) yields k_nn(k) = ⟨k²⟩/⟨k⟩, independently of k.
The k_nn(k)-curve is particularly interesting when it displays the empirical form

k_nn(k) ∝ k^{−ν} .   (4.23)

For instance, the Internet topology displays the above trend with ν = 0.5 (see Fig. 4.6a) [16] and is therefore a disassortative network, meaning that high-degree autonomous systems are on average connected to low-degree ones and vice versa.
Relations similar to (4.21) hold for directed networks as well. More specifically, it is possible to define the average nearest-neighbour in-degree k_i^{nn,in} and the average nearest-neighbour out-degree k_i^{nn,out} as

k_i^{nn,in} ≡ Σ_{j≠i} Σ_{k≠j} a_ji a_kj / Σ_{j≠i} a_ji ,   k_i^{nn,out} ≡ Σ_{j≠i} Σ_{k≠j} a_ij a_jk / Σ_{j≠i} a_ij ,   (4.24)
respectively, and correspondingly the k^{nn,in}(k^in)-curve and the k^{nn,out}(k^out)-curve. However, it is also possible to regard the directed network as undirected (using the mapping you have found in Homework 4.1) and then consider the undirected ANND defined in Eq. (4.21) and the corresponding k_nn(k)-curve. For instance, the quantities k^{nn,in}(k^in), k^{nn,out}(k^out) and k_nn(k) calculated on a snapshot of the World Trade Web in the year 2000 are reported in Fig. 4.6b [17]. The power-law scaling holds for all three of them. In particular, the undirected ANND obeys Eq. (4.23) with ν = 0.5, just like the Internet. The inset of the same figure shows the k_nn(k) curve computed on a subnetwork of the undirected WTW where pairs of vertices are connected only if in the original
directed network they are joined by two reciprocated directed links pointing in
opposite directions (see Section 4.1). The trend is similar to the other trends,
and the WTW is therefore a disassortative network in all the above representations. Another extensive analysis of the WTW [18], based on a more detailed
data set than that used in ref. [17], confirms the disassortative behaviour but
questions the actual occurrence of a scaling form as described by Eq. (4.23).
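Assuming the standard definition of the undirected ANND (the degree of each neighbour of a vertex i, averaged over i's neighbours, and then averaged over all vertices of the same degree k), the k_nn(k)-curve can be computed as in the following sketch; the function name is an illustrative choice of ours:

```python
def annd_curve(b):
    """Average nearest-neighbour degree curve k_nn(k) for an undirected
    network with symmetric adjacency matrix b: for each degree class k,
    the average over vertices i with k_i = k of the mean neighbour degree."""
    n = len(b)
    k = [sum(row) for row in b]
    pairs = [(k[i], sum(b[i][j] * k[j] for j in range(n)) / k[i])
             for i in range(n) if k[i] > 0]
    curve = {}
    for d in set(deg for deg, _ in pairs):
        vals = [knn for deg, knn in pairs if deg == d]
        curve[d] = sum(vals) / len(vals)
    return curve

# star graph on 4 vertices: the hub (degree 3) has only degree-1 neighbours,
# while each leaf (degree 1) has a single degree-3 neighbour
b = [[0, 1, 1, 1],
     [1, 0, 0, 0],
     [1, 0, 0, 0],
     [1, 0, 0, 0]]
curve = annd_curve(b)
# curve == {3: 1.0, 1: 3.0}: a decreasing, i.e. disassortative, pattern
```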
Assortativity coefficient
As for the first-order properties, it is possible to define single quantities characterising the overall second-order properties of the network as a whole. For instance, one can introduce the assortativity coefficient [19, 20] as the correlation coefficient between the degrees at either end of a link. To this end, let us define ⟨kk′⟩ as the average, over all links of an undirected network, of the product of the degrees of the vertices at the two ends of a link:

⟨kk′⟩ ≡ (1/L_u) Σ_i Σ_{j<i} b_ij k_i k_j .   (4.25)

The assortativity coefficient is then defined as

r = (⟨kk′⟩ − ⟨k⟩²) / (⟨k²⟩ − ⟨k⟩²) .   (4.26)
Reciprocity
We conclude our discussion of the second-order properties with the notion of
reciprocity, which is a characteristic of directed networks. As anticipated in
Section 4.1, a link from a vertex i to a vertex j is said to be reciprocated if the
link from j to i is also present. The number L of reciprocated links can be
defined in terms of the adjacency matrix as
L
n X
X
aij aji .
(4.27)
i=1 j6=i
L
,
L
(4.28)
66
so that 0 r 1.
The measured value of r allows us to assess whether the presence of reciprocated links in a network occurs completely by chance or not. To see this, note that r
represents the average probability of finding a link between two vertices already
connected by the reciprocal one. If reciprocated links occurred by chance, then
this probability would be simply equal to the average probability of finding a
link between any two vertices, which is the link density c. Therefore if r = c
the reciprocity structure is trivial, while if r > c (or r < c) reciprocated links
occur more (or less) often than predicted by chance.
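Eqs. (4.27), (4.28) and (4.18) can be combined into a short routine (an illustrative sketch of ours, not from the text) that computes both r and c for a directed adjacency matrix, so the two can be compared as described above:

```python
def reciprocity_and_density(a):
    """Reciprocity r = L_rec / L (Eq. 4.28) and link density c = L / (n(n-1))
    (Eq. 4.18) for a directed network with adjacency matrix a, no self-loops."""
    n = len(a)
    L = sum(a[i][j] for i in range(n) for j in range(n) if j != i)
    L_rec = sum(a[i][j] * a[j][i] for i in range(n) for j in range(n) if j != i)
    return L_rec / L, L / (n * (n - 1))

# three vertices: 0 <-> 1 reciprocated, 1 -> 2 unreciprocated
a = [[0, 1, 0],
     [1, 0, 1],
     [0, 0, 0]]
r, c = reciprocity_and_density(a)
# L = 3 directed links, 2 of them reciprocated: r = 2/3, while c = 3/6 = 0.5
```

Here r > c, indicating more reciprocation than expected by chance in this toy network.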
Homework 4.6 For the two directed networks in Fig. 4.1, calculate the reciprocity r and the link density c. Compare these two numbers and conclude
whether there is a tendency towards or against reciprocation in the two networks.
Real-world networks generally exhibit a nontrivial degree of reciprocity [21].
For instance, citation networks always display c > 0 and r = 0, since recent
papers can cite less recent ones while the opposite cannot occur. Food webs
and shareholding networks display 0 < r < c [21], while social networks [22],
email networks [23], the WWW [12], the World Trade Web [17, 21] and cellular
networks [21] generally display c < r < 1. Finally, the extreme case c <
r = 1 corresponds to (not fully connected) undirected networks, where all links
are reciprocated (such as the Internet, where information always travels both
ways along computer cables). In conclusion, real-world networks systematically
display a nontrivial degree of reciprocity.
4.2.3
Third-order properties
The third-order topological properties of a network are those which go the next
step beyond the second-order ones, since they regard the structure of the connections between a vertex and its first, second and third neighbours. The computation of third-order properties involves products of three adjacency matrix
elements, b_ij b_jk b_kl. In the general language of conditional degree distributions, the relevant quantity for an undirected network is now the three-vertex probability P(k′, k″|k) that a vertex with degree k is simultaneously connected to a vertex with degree k′ and to a vertex with degree k″. In this case too, the analysis of real networks reveals interesting properties, which we report below.
Local clustering coefficient
The most studied third-order property of a vertex i is the (local) clustering coefficient C_i, defined (for an undirected network) as the number of links connecting the neighbours of vertex i to each other, divided by the total number of pairs of neighbours of i (therefore 0 ≤ C_i ≤ 1). In other words, C_i is the link density (see Section 4.2.1) of the subnetwork defined by the neighbours of i, and can therefore be thought of as a local link density. It can also be regarded as the probability of finding a link between two randomly chosen neighbours of i.
The clustering coefficient is a third-order property since it measures the number of triangles a vertex belongs to, and is therefore related to the occurrence of (closed) paths of three links. Indeed, if b_ij denotes an entry of the adjacency matrix of the network, then the number of interconnections between the neighbours of i is given by Σ_{j≠i} Σ_{k≠i,j} b_ij b_jk b_ki / 2. The clustering coefficient C_i is then obtained by dividing this number by the number of possible pairs of neighbours of i, which equals k_i(k_i − 1)/2 if k_i is the degree of i. It follows that

C_i ≡ Σ_{j≠i} Σ_{k≠i,j} b_ij b_jk b_ki / [k_i (k_i − 1)] .   (4.29)
The above expression is a local (vertex-specific) version of Eq. (2.5).
Homework 4.7 Show that Eq. (4.29) can be rewritten as

C_i = Σ_{j≠i} Σ_{k≠i,j} b_ij b_jk b_ki / Σ_{j≠i} Σ_{k≠i,j} b_ij b_ki ,   (4.30)

where it becomes manifest that the numerator counts the number of triangles in which vertex i participates, and the denominator counts the number of wedges in which vertex i participates. [Hint: use the fact that b_ij² = b_ij.] Compute the value of C_i for each vertex of the network shown in Fig. 4.1b and compare the calculated values with the (single) value obtained using Eq. (2.5).
For directed networks, the computation of the clustering coefficient can be
carried out on the undirected version of the network. Therefore Eq. (4.29)
holds for directed networks as well, with bij given by the expression you found
in Homework 4.1.
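Eq. (4.29) translates directly into code; the following sketch (function name ours) computes C_i from a symmetric adjacency matrix:

```python
def local_clustering(b, i):
    """Local clustering coefficient C_i of vertex i, Eq. (4.29), for an
    undirected network with symmetric adjacency matrix b and no self-loops."""
    n = len(b)
    k_i = sum(b[i])
    if k_i < 2:
        return 0.0  # fewer than two neighbours: no pairs to connect
    # ordered pairs (j, k): each triangle through i is counted twice,
    # matching the k_i (k_i - 1) ordered-pair normalisation
    triangles = sum(b[i][j] * b[j][k] * b[k][i]
                    for j in range(n) if j != i
                    for k in range(n) if k not in (i, j))
    return triangles / (k_i * (k_i - 1))

# triangle plus a pendant vertex attached to vertex 0
b = [[0, 1, 1, 1],
     [1, 0, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0]]
# vertex 0 has 3 neighbours with one link among them: C_0 = 1 / 3
```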
Clustering coefficient versus degree
A statistical way to consider the clustering properties of real networks is similar
to that introduced for the degree correlations. By computing the average value
of C_i over all vertices with a given degree k and plotting it versus k, it is possible to obtain the average clustering of k-degree vertices as a function of k.
Remarkably, the analysis of real networks reveals that in many cases the average clustering of k-degree vertices decreases as k increases, and that this trend is sometimes consistent with a power-law behaviour of the form

C(k) ∝ k^{−β} .   (4.31)

For instance, the word network of English synonyms [24] and the aforementioned (incomplete) representation of the World Trade Web [17] display the above power-law trend with β = 1 and β = 0.7, respectively (see Fig. 4.7). For the WTW we note however that, as for the k_nn(k)-curve, the analysis of a more detailed data set [18] confirms the decreasing trend but questions the power-law form.
It is also possible to define the average clustering coefficient of the whole network as

C̄ ≡ (1/n) Σ_{i=1}^n C_i .   (4.32)
This quantity represents the average probability of finding a link between two randomly chosen neighbours of a vertex (clearly 0 ≤ C̄ ≤ 1). Note that it is different from the (also network-wide) definition of the clustering coefficient in Eq. (2.5) of Chapter 2.
The empirical analysis of most real networks reveals a large (i.e. finite for large n) value of C̄. An analysis of some real networks also reveals that the rescaled quantity C̄/c_u displays an approximate linear dependence on the number of vertices n:

C̄/c_u ∼ n .   (4.33)

This is shown in Fig. 4.8, reporting the data together with the best power-law fit C̄/c_u ∼ n^{0.96}.
Exercise 4.8 Show that, for regular D-dimensional lattices with up to l-th neighbour connections, C̄/c_u = 0 if l = 1 and

C̄/c_u = 3(n − 1)(z − 2D) / [4Dz(z − 1)] ∼ n if l > 1.
In conclusion, just like regular lattices with $l > 1$, most real networks are on average highly clustered. Both classes of networks display a qualitatively linear scaling of $\bar{C}/c_u$ with $n$.
Figure 4.8: Log-log plot of the ratio between the average clustering coefficient $\bar{C}$ and the link density $c_u$ as a function of the size $n$ of the network. Full circles represent data from the 18 networks summarized in ref. [2]: 2 food webs, the substrate network and the reaction network of the bacterium E. coli, the neural network of the nematode C. elegans, the collaboration network between movie actors, the power grid, 6 scientific coauthorship data sets, 2 maps of the Internet, the WWW, and the networks of word co-occurrence and word synonymy. Empty circles represent data from 16 additional food webs [25]. The solid line represents the best power-law fit to the data, having slope 0.96 (modified from ref. [25]).
4.2.4 Global properties
Global properties often have the most important effect on processes taking
place on networks, since they are responsible for the way information spreads
over the network and for the possible emergence of collective behaviour of vertices (some of these aspects will be covered in Part II). Here we consider two
(out of the many) examples of global network properties: connected components
and average distance, which are intimately related to each other.
Connected components
In Section 2.1 we have already mentioned that two vertices in an undirected
network are said to belong to the same connected component (or cluster) if
a path exists connecting them through a finite number of steps. The size of a
connected component is the number of vertices in it. Note that for each of the
regular networks shown in Fig. 4.2 all vertices belong to the same connected
component.
For directed networks, it is possible that a path going from a vertex $i$ to a vertex $j$ exists, while no path exists from $j$ to $i$. It is therefore possible to define the in-component of vertex $i$ as the set of vertices from which a path exists to $i$, and the out-component of $i$ as the set of vertices to which a path
exists from i. Finally, two vertices i and j are said to belong to the same strongly
connected component (SCC) if it is possible to go both from i to j and from j
to i. We have already encountered the SCC in our discussion of the WWW in
Subsection 1.2.3.
There is in principle no limit to the number and size of connected components
in a network. However, an empirical property of most real networks is the
presence of one very large component containing most of the vertices, plus a
number of much smaller components containing the few remaining vertices.
This means that the spread of information on real networks is efficient, since
starting from a vertex in the largest component it is possible to reach a large
number of other vertices in the same component. The presence of the largest
component is interesting also for theoretical reasons, since it is related to the
occurrence of a phase transition in models where links are drawn with a specified
probability (see Chapter 8).
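Both component notions reduce to reachability computations: the strongly connected component of a vertex is the intersection of its in- and out-components, each obtainable by a breadth-first search (one on the graph, one on its reverse). A minimal sketch on a hypothetical four-vertex digraph:

```python
from collections import deque

def reachable(adj, start):
    """BFS: the set of vertices reachable from `start` (start included)."""
    seen, q = {start}, deque([start])
    while q:
        u = q.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                q.append(v)
    return seen

def strongly_connected_component(adj, i):
    """SCC of i = (out-component of i) intersected with (in-component of i)."""
    radj = {u: set() for u in adj}           # build the reversed graph
    for u, nbrs in adj.items():
        for v in nbrs:
            radj.setdefault(v, set()).add(u)
    return reachable(adj, i) & reachable(radj, i)

# toy directed graph: the cycle 0 -> 1 -> 2 -> 0, with 3 hanging off vertex 2
adj = {0: {1}, 1: {2}, 2: {0, 3}, 3: set()}
scc = strongly_connected_component(adj, 0)   # {0, 1, 2}
# vertex 3 belongs to the out-component of 0 but not to its in-component
```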
Shortest distance
Another important property, which better characterizes the communication properties in a network, is the shortest distance between vertices. For each pair of vertices $i$ and $j$, the distance $d_{ij}$ is defined as the number of links in a shortest path connecting them. A convenient network-wide average is the harmonic mean of the distances:
$$d \equiv \left[\frac{1}{n(n-1)}\sum_{i}\sum_{j \ne i} d_{ij}^{-1}\right]^{-1}, \qquad (4.36)$$
where now i and j run over the entire set of n vertices. In such a way, d will
be finite even for networks where the (strongly) connected component does not
coincide with the whole network and its value will discriminate among different
topologies.
The empirical behaviour of $d$ is very important. It turns out that, even in a network with an extremely large number of vertices, the average distance is generally very small. This property, known as the small-world property, is shown in Fig. 4.9, where a plot of $d \ln \langle k \rangle$ versus $n$ is reported for a set of real networks. A rough logarithmic trend is observed, meaning that $d$ scales with $n$ according to the approximate law
$$d \sim \frac{\ln n}{\ln \langle k \rangle}. \qquad (4.37)$$
The above equation is usually taken as a quantitative statement of the small-world effect (see also Chapter 2, Eq. (2.13)). Its importance lies in the remarkable deviation from the behaviour of regular networks in any Euclidean dimension $D$, which instead display $d \sim n^{1/D}$ and are therefore characterized by a much larger average distance.
The small-world effect is sometimes defined (in a stronger sense) as the
simultaneous presence of a small average distance and a large average clustering
coefficient. As we mentioned above, both properties are typically observed in
real-world networks.
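The logarithmic law (4.37) can be checked numerically. A sketch on a sparse random graph of the Erdős–Rényi type (the values of n and p are arbitrary choices, and the comparison is only approximate):

```python
import math, random
from collections import deque

def distances_from(adj, s):
    """BFS distances from s to every vertex reachable from it."""
    d, q = {s: 0}, deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in d:
                d[v] = d[u] + 1
                q.append(v)
    return d

def average_distance(adj):
    """Mean of d_ij over all ordered pairs joined by a path."""
    total = count = 0
    for s in adj:
        for t, dist in distances_from(adj, s).items():
            if t != s:
                total += dist
                count += 1
    return total / count

random.seed(1)
n, p = 200, 0.05                      # a sparse random graph
adj = {i: set() for i in range(n)}
for i in range(n):
    for j in range(i + 1, n):
        if random.random() < p:
            adj[i].add(j)
            adj[j].add(i)

k_mean = sum(len(v) for v in adj.values()) / n
d = average_distance(adj)
# d stays close to ln(n) / ln(<k>), the small-world law of Eq. (4.37)
```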
Betweenness centrality
So far, we have seen various ways to define a measure of importance for the vertices in a network. Most of them are based on different versions of the degree.
Figure 4.9: Log-linear plot of the product between the average distance $d$ and the logarithm of the average degree $\ln \langle k \rangle$ as a function of the number $n$ of vertices for a set of real networks studied in Ref. [2] (see the cited reference for the symbol legend). The dashed line represents the curve $\ln n$, showing that real data approximately follow the law $d = \ln n / \ln \langle k \rangle$, even if with some exceptions (modified from Ref. [2]).
A different measure of importance is the betweenness centrality of vertex $i$, defined as
$$b_i \equiv \sum_{j \ne i}\sum_{k \ne i,j} \frac{N_{jk}(i)}{N_{jk}}, \qquad (4.38)$$
where the sums run from 1 to the total number $n$ of vertices and one must take care that $j$ is different from $k$ and that both $j$ and $k$ are different from $i$. The quantity $N_{jk}$ counts the total number of shortest paths between $j$ and $k$, while $N_{jk}(i)$ counts how many such paths pass through $i$. Whenever two or more shortest paths of equal length exist between the same two vertices, the contribution to the betweenness centrality of a third vertex $i$ is the number of shortest paths (between the two given vertices) that pass through $i$, divided by the total number of shortest paths between the two vertices.
As shown in the example in Fig. 4.10, the vertex that is crossed the most
times is also the most central in terms of its betweenness. As a result, vertices
with high betweenness play the role of bridges across regions of the network
that are highly connected internally, and more sparsely connected among each
other. In real-world networks, the presence of such regions is typically observed
(see next subsection). Correspondingly, a few bridging vertices with very high
betweenness are typically detected, along with several internal vertices with
Figure 4.10: The betweenness of the central black vertex is computed by considering all shortest paths (distances) between all the possible pairs of vertices.
Between the two grey vertices (A and B) in the figure there are two different
shortest paths, one of which contributes 1/2 to the betweenness of the black
vertex and one of which does not contribute to it.
lower betweenness.
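The definition just described translates almost literally into code. A brute-force sketch, suitable for small graphs only (the 4-cycle mirrors the situation of Fig. 4.10: a pair of vertices joined by two distinct shortest paths, each path contributing 1/2 to the intermediate vertex):

```python
from collections import deque
from itertools import permutations

def bfs_paths(adj, s):
    """Distances from s, and the number of distinct shortest paths to each vertex."""
    dist, sigma, q = {s: 0}, {s: 1}, deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v], sigma[v] = dist[u] + 1, 0
                q.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(adj):
    """b_i = sum over ordered pairs (j, k), with j, k != i, of N_jk(i) / N_jk."""
    sp = {s: bfs_paths(adj, s) for s in adj}
    b = {i: 0.0 for i in adj}
    for j, k in permutations(adj, 2):
        dj, sj = sp[j]
        if k not in dj:
            continue                   # no path from j to k
        for i in adj:
            if i in (j, k):
                continue
            di, si = sp[i]
            # i lies on a shortest j-k path iff d(j,i) + d(i,k) = d(j,k);
            # the number of such paths through i is N_ji * N_ik
            if i in dj and k in di and dj[i] + di[k] == dj[k]:
                b[i] += sj[i] * si[k] / sj[k]
    return b

# a 4-cycle: opposite corners are joined by two shortest paths, so each
# intermediate vertex receives a contribution of 1/2 per ordered pair
square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
b = betweenness(square)
# by symmetry, b[i] = 2 * (1/2) = 1 for every vertex of the cycle
```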
Community structure
Very important structures that can be identified in a network are communities of densely connected vertices. Communities are subsets of vertices whose internal link density is higher than the average density across the entire network, or higher than an expected value (obtained under certain null hypotheses). Detecting communities is a non-local task, as it typically involves quantities that must be computed via repeated iterations over the whole network. A community can consist of any number of vertices (from a few vertices up to a large fraction of the network), and a network can therefore be partitioned into heterogeneously sized communities.
There is no unique definition of a community, and even when a single definition is adopted, there are various methods to identify the communities of
a particular network [26]. For instance, some definitions allow for overlapping
communities that share one or more vertices, while others do not. Similarly,
some definitions allow for hierarchical communities that can be further resolved
into smaller sub-communities, while others do not.
A simple approach employs the concept of betweenness centrality (see previous subsection) to define and detect communities in large networks [26]. This
method starts by computing the betweenness of all nodes (or of all links, via an
appropriate modification of the definition (4.38)) and iteratively removing the
nodes (or links) with the largest betweenness, recalculating all the values of the
betweenness after each removal. In such a way, the bridges between communities are cut and the network gets partitioned hierarchically into smaller and
smaller communities.
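A minimal sketch of this divisive strategy in its node-removal variant (the two triangles joined by a bridge vertex form a hypothetical example, and the brute-force betweenness computation suits only small graphs): removing the highest-betweenness vertex splits the network into its two natural communities.

```python
from collections import deque
from itertools import permutations

def bfs_paths(adj, s):
    dist, sigma, q = {s: 0}, {s: 1}, deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v], sigma[v] = dist[u] + 1, 0
                q.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(adj):
    sp = {s: bfs_paths(adj, s) for s in adj}
    b = {i: 0.0 for i in adj}
    for j, k in permutations(adj, 2):
        dj, sj = sp[j]
        for i in adj:
            di, si = sp[i]
            if (i not in (j, k) and k in dj and i in dj and k in di
                    and dj[i] + di[k] == dj[k]):
                b[i] += sj[i] * si[k] / sj[k]
    return b

def components(adj):
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, q = {s}, deque([s])
        while q:
            for v in adj[q.popleft()]:
                if v not in comp:
                    comp.add(v)
                    q.append(v)
        seen |= comp
        comps.append(comp)
    return comps

# two triangles {0,1,2} and {4,5,6} joined through the bridge vertex 3
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4},
       4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
b = betweenness(adj)
bridge = max(b, key=b.get)                       # the bridge vertex 3
rest = {u: nbrs - {bridge} for u, nbrs in adj.items() if u != bridge}
parts = components(rest)                         # the two triangles
```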
Other methods are based on the comparison of the real network with a null
model, i.e. a mathematical model where some topological property is taken as
input from the data, but where communities are absent by construction [26].
The best partition of the network into communities is sought by maximizing a so-called modularity function, defined as a sort of difference between the real network and its null model. Some null models of networks will be introduced in the next chapter.
Bibliography
[1] S.H. Strogatz, Nature 410, 268 (2001).
[2] R. Albert and A.-L. Barabási, Rev. Mod. Phys. 74, 47 (2002).
[3] S.N. Dorogovtsev and J.F.F. Mendes, Advances in Physics 51, 1079 (2002).
[4] M.E.J. Newman, SIAM Review 45, 167 (2003).
[5] A.-L. Barabási, Linked: The New Science of Networks, Perseus, Cambridge, MA (2002).
[6] M. Buchanan, Nexus: Small Worlds and the Groundbreaking Science of Networks, Norton, New York (2002).
[7] D.J. Watts, Six Degrees: The Science of a Connected Age, Norton, New
York (2003).
[8] G. Caldarelli, Scale-Free Networks: Complex Webs in Nature and Technology, Oxford University Press (2007).
[9] M.E.J. Newman, Networks: An Introduction, Oxford University Press (2010).
[10] Q. Chen, H. Chang, R. Govindan, S. Jamin, S.J. Shenker and W. Willinger,
Proceedings of the 21st Annual Joint Conference of the IEEE Computer
and Communications Societies, IEEE Computer Society (2002).
[11] H. Jeong, S. Mason, A.-L. Barabasi and Z.N. Oltvai, Nature 411, 41 (2001).
[12] A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata,
A. Tomkins and J. Wiener, Computer Networks 33, 309 (2000).
[13] M.E.J. Newman, Power laws, Pareto distributions and Zipf's law, Contemporary Physics 46(5), 323-351 (2005).
[14] B.B. Mandelbrot, The Fractal Geometry of Nature, Freeman, San Francisco
(1983).
[15] S. Maslov, K. Sneppen, and A. Zaliznyak, Physica A 333, 529-540 (2004).
[16] R. Pastor-Satorras, A. Vázquez and A. Vespignani, Phys. Rev. Lett. 87, 258701 (2001).
[17] M.Á. Serrano and M. Boguñá, Phys. Rev. E 68, 015101(R) (2003).
[18] D. Garlaschelli and M.I. Loffredo, Phys. Rev. Lett. 93, 188701 (2004).
Chapter 5
Network Ensembles
In Chapters 2 and 3 we have already encountered various network models. All these models have one feature in common: they are stochastic, i.e. they are based on some degree of randomness. If we fix the parameters of a stochastic network model, all the possible realizations (i.e. graphs) of the model itself define a so-called ensemble of random graphs. Such an ensemble is a collection $\mathcal{G} \equiv \{G_1, \dots, G_M\}$ of $M$ graphs (i.e. adjacency matrices), where each graph $G_a$ is assigned a probability $P(G_a)$ such that
$$\sum_{G \in \mathcal{G}} P(G) = \sum_{a=1}^{M} P(G_a) = 1. \qquad (5.1)$$
5.1 The Erdős–Rényi model

In the Erdős–Rényi (ER) model, each pair of distinct vertices among the $n$ vertices is connected independently with the same probability $p$, so that the expected value of each entry of the adjacency matrix is
$$E(g_{ij}) \equiv \langle g_{ij} \rangle = \begin{cases} p & i \ne j \\ 0 & i = j \end{cases} \qquad (5.2)$$
where, here and in what follows, the expectation value $\sum_x x\, P(X = x)$ of a (discrete) random variable $X$ is denoted by $E(X)$ or $\langle X \rangle$. It therefore follows from Eq. (4.12) that the expected number of undirected links is
$$E(L_u) = p\, \frac{n(n-1)}{2}. \qquad (5.3)$$
Inverting this relation, the connection probability can be expressed as
$$p = \frac{2 E(L_u)}{n(n-1)}. \qquad (5.4)$$
This strategy is useful if, for instance, we want to compare the predictions
of the ER model with the observed properties of a real-world network with a
given number n of vertices and a given number Lu of undirected links. In this
perspective, the empirical values of n and Lu are treated as constraints, and
the model is fitted to these constraints by choosing the same n as in the real
network and $p$ as in Eq. (5.4), with $E(L_u) \equiv L_u$. Note that $n$ will be necessarily finite, and we cannot blindly use the results obtained for $n \to \infty$ in Chapter 2. However, since most real-world networks are large, many asymptotic results will hold at least approximately.
We already know, from the results of Chapter 4, that the comparison between the real network and the ER model will be unsuccessful: the ER model
is not able to reproduce many properties of most real-world networks, in particular their broad degree distribution and their large clustering. However, the
ER model has an important and desirable property: all the graphs with the
same value of n and Lu are generated with the same probability, i.e. they are
equiprobable. The proof of this result is the goal of the following series of homeworks.
Homework 5.1 Write the cardinality $M_n$ of the ER ensemble, when $n$ is the (fixed) number of vertices. Calculate the number $M_n(L_u)$ of simple and undirected graphs (without self-loops) with $n$ vertices and $L_u$ edges. Check that $\sum_{L_u} M_n(L_u) = M_n$, where the sum runs over all the possible values of $L_u$ in the ensemble of graphs with $n$ vertices.
5.2 The Configuration Model
With this section we return to the so-called Configuration Model [15, 1, 2] (CM
for short) introduced in Chapter 3. We first recall the abstract idea behind
the model and make general considerations. We then focus on the specific
implementations that have been proposed to realize this idea.
Let us consider undirected networks first. The idea of the CM model is
to assign each vertex i a desired degree ki and then generate the ensemble of
all graphs compatible with the resulting degree sequence $\{k_i\}_{i=1}^n$, by drawing
links at random between vertices in such a way that the desired degree sequence
is realized. All graphs compatible with the desired degree sequence should be
realized with the same probability. The degree sequence can be picked from any
desired degree distribution P (k), e.g. a scale-free one with desired exponent.
However, as discussed in Chapter 3 (Subsection 3.1.3), the degree sequence
must be graphical, i.e. realizable by at least one graph.
For directed networks, the CM is easily generalized by assigning each vertex
$i$ a given in-degree $k_i^{in}$ and a given out-degree $k_i^{out}$. Now the idea is to generate
an ensemble of random directed graphs in such a way that the desired in- and
out-degree sequences are simultaneously realized.
Note that, unlike the preferential attachment model (see Chapter 3), the
CM does not make explicit hypotheses on how networks organize themselves in
a given structure. It rather is a null model, generating a suitably randomized
ensemble of graphs once some low-level information is assumed as an input. As
we will see in detail in Chapter 10, the CM can also be used as a benchmark
for empirical data: a comparison between the CM and a real-world network
allows us to check whether some of the higher-order properties observed in the
real-world network are consistent with those generated by the CM using the
same degree sequence (which is a first-order property, see Chapter 4) as the real
network. If this is the case, then one can conclude that the observed higher-order
properties are a mere outcome of the specified form of the degree distribution,
being consistent with a random assignment of links compatible with the degree
sequence. If this is not the case, then the observed deviations from the model
indicate interesting structural patterns that cannot be traced back to the null
hypothesis (i.e. they are not explained by the degree sequence alone).
It is important to notice that, in the above abstract formulation, the CM
model can be regarded as a generalization of the ER model in the following
sense. The ensemble of networks generated by the ER model is completely
random except for the number of links, which is specified by fixing the connection
probability p. In a similar manner, the ensemble of networks generated by
the CM model is completely random except for the degree sequence, which is
specified from the beginning.
In other words, while in the ER model the only constraint (besides the number n of vertices) is given by the number of links Lu , in the CM model the
constraint is the entire degree sequence $\{k_i\}_{i=1}^n$. However, as we will show below, it makes a big difference whether the constraint is soft, i.e. enforced only
as an average over all realizations (as in the ER) or sharp, i.e. enforced on each
individual realization separately (as in the implementation of the CM model
discussed in Chapter 3). In the first case one speaks of canonical ensembles,
while in the second case one speaks of microcanonical ensembles. It turns out
that microcanonical ensembles are much more difficult to deal with mathematically and are prone to bias. By contrast, (the simplest) canonical ensembles are
analytically tractable and unbiased.
In what follows, we are going to describe various implementations of the CM
model. These variants will gradually lead us from microcanonical to canonical
implementations of the model.
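The sharp (microcanonical) enforcement of the degree sequence can be illustrated by a stub-matching sketch, in the spirit of the link stub reconnection idea (the degree sequence below is an arbitrary toy choice): every vertex contributes as many half-edges ('stubs') as its desired degree, and a uniformly random pairing of all stubs realizes the degree sequence exactly on every single sample, though possibly at the cost of self-loops and multiple links.

```python
import random

def stub_matching(degrees, rng):
    """Microcanonical CM via stub matching: vertex i contributes k_i stubs;
    a uniformly random pairing of the stubs realizes the degree sequence
    exactly, but may create self-loops and multiple links."""
    stubs = [i for i, k in enumerate(degrees) for _ in range(k)]
    assert len(stubs) % 2 == 0, "the degree sequence must have an even sum"
    rng.shuffle(stubs)
    return list(zip(stubs[::2], stubs[1::2]))

rng = random.Random(0)
degrees = [3, 3, 2, 2, 1, 1]
edges = stub_matching(degrees, rng)
realized = [0] * len(degrees)
for u, v in edges:
    realized[u] += 1
    realized[v] += 1
# realized == degrees on every single sample: the constraint is sharp
```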
5.2.1 The link stub reconnection method
Figure 5.1: Elementary step of the local rewiring algorithm for a) undirected and
b) directed networks. Two edges, here (A, B) and (D, C), are randomly chosen from
graph G1 and the vertices at their ends are exchanged to obtain the edges (A, C) and
(D, B) in graph G2 . Note that the degree of each vertex is unchanged (in the directed
case, the in- and out-degrees are separately conserved).
5.2.2 The local rewiring algorithm
method. However, the two ensembles are different, since double links and self-loops are absent here, while they might be present in the link stub reconnection method. In particular, since two vertices cannot be connected more than once, in the local rewiring algorithm the presence of links between high-degree vertices is suppressed, determining a certain degree of spurious disassortativity which is not due to a genuine anticorrelation between vertex degrees. This important point, highlighted by Maslov, Sneppen and Zaliznyak, led them to show
that much of the disassortativity observed in the Internet (see Section 4.2.2)
can be accounted for in this way, while other patterns such as the clustering
properties are instead genuine [15]. Considerations of this type, which require
the comparison between a real-world network and some implementation of the
CM, will be made extensively in Chapter 10.
Recently, it has been proved mathematically that the local rewiring algorithm is biased, i.e. it does not explore the space of graphs compatible with the degree constraints uniformly [3, 4]. Roughly speaking, the root of the problem is the fact that the algorithm explores with higher probability the graph configurations that are closer to the original network. In order to overcome
this problem, one should introduce a suitable acceptance probability for each
attempted configuration. However, the calculation of this probability is computationally demanding because it depends on the current configuration, and
should therefore be repeated at each step of the algorithm.
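The elementary step of Fig. 5.1a can be sketched in a few lines of pure Python (edges stored as frozensets; the 6-cycle is a toy example). The point being illustrated: degrees are conserved at every step, and proposed moves are rejected exactly when they would create self-loops or multiple links.

```python
import random

def rewire_step(edges, rng):
    """One attempted step of the local rewiring algorithm on an undirected
    simple graph: two edges (A,B) and (D,C) are chosen at random and replaced
    by (A,C) and (D,B); moves creating self-loops or multiple links are
    rejected, so the graph stays simple and all degrees are preserved."""
    e1, e2 = rng.sample(list(edges), 2)
    a, b = sorted(e1)
    d, c = sorted(e2)
    new1, new2 = frozenset((a, c)), frozenset((d, b))
    if len(new1) < 2 or len(new2) < 2:        # would create a self-loop
        return
    if new1 in edges or new2 in edges:        # would create a multiple link
        return
    edges -= {e1, e2}
    edges |= {new1, new2}

rng = random.Random(7)
edges = {frozenset((i, (i + 1) % 6)) for i in range(6)}   # a 6-cycle
deg_before = {i: sum(i in e for e in edges) for i in range(6)}
for _ in range(1000):
    rewire_step(edges, rng)
deg_after = {i: sum(i in e for e in edges) for i in range(6)}
# deg_after == deg_before: every vertex keeps its degree 2
```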
5.2.3 The Chung-Lu model

A canonical implementation of the CM, proposed by Chung and Lu, connects each pair of distinct vertices $i$ and $j$ with probability
$$p_{ij} = \frac{k_i k_j}{2 L_u}, \qquad (5.5)$$
where $L_u$ is the observed number of links and $k_i$, $k_j$ are the observed degrees of vertices $i$, $j$. This choice is (at least apparently) reasonable, because the ensemble averages of the degrees converge to their observed values:
$$E(k_i) = \sum_j p_{ij} = k_i\, \frac{\sum_j k_j}{2 L_u} = k_i. \qquad (5.6)$$
Homework 5.4 Equation (5.6) is not entirely correct. Find out why and
write the corresponding correct expression. Discuss in which limit the correct
expression reduces to Eq. (5.6).
Homework 5.5 Using the correct expression, write the relationship between
the observed number of links Lu and the expected number of links hLu i generated
by the Chung-Lu model. Rearrange this expression to write the difference Lu
hLu i as a function of only the first and second moments of the degree distribution
P (k) of the real-world network. Discuss the effects of the heterogeneity (i.e. the
breadth) of the degree distribution. Discuss these effects for a scale-free degree distribution of the form $P(k) \propto k^{-\gamma}$ with $2 < \gamma < 3$.
Note that the factorized form of $p_{ij}$ in Eq. (5.5) also implies that no degree correlations are introduced: the expected average nearest-neighbour degree (see Chapter 4) reads
$$E(k_i^{nn}) = \frac{\sum_j p_{ij} k_j}{E(k_i)} = \frac{\sum_j k_j^2}{\sum_j k_j}, \qquad (5.7)$$
which is independent of $i$ and has the expected form $\langle k^2 \rangle / \langle k \rangle$ valid for uncorrelated
networks (see Section 4.2.2).
Exercise 5.1 Equation (5.7) is not entirely correct. Find out why and try
to write a more refined expression. Try to write a similar expression for the
expected value E(Ci ) of the clustering coefficient of vertex i (see definition in
Chapter 4, Section 4.2.3).
Note that the model can be formulated also for directed graphs, by establishing a directed link from vertex $i$ to vertex $j$ with probability
$$p_{ij} = \frac{k_i^{out} k_j^{in}}{L}, \qquad (5.8)$$
where $L = \sum_i k_i^{in} = \sum_i k_i^{out}$ is the observed number of directed links, and $k_i^{out}$, $k_j^{in}$
are the observed out-degree of vertex i and the observed in-degree of vertex j,
respectively. This choice ensures that $\langle k_i^{in} \rangle = k_i^{in}$ and $\langle k_i^{out} \rangle = k_i^{out}$ for all vertices, generalizing Eq. (5.6). Note, however, that this is not entirely correct and
subject to the same limitation discussed in Homework 5.4.
Homework 5.6 Use Eq. (5.8) to write the expected number $E(L^\leftrightarrow) \equiv \langle L^\leftrightarrow \rangle$ of reciprocated links (see Chapter 4), and use the result to approximate the expected reciprocity $\langle r \rangle$ in the Chung-Lu model. Compare this value with the reciprocity
of a random graph with the same number of vertices and directed links.
The Chung-Lu model avoids by construction the occurrence of multiple links
and self-loops, since each pair of (distinct) vertices is considered only once.
However, to ensure (as we should) that $0 \le p_{ij} \le 1$ for all $i$, $j$ in Eqs. (5.5) and (5.8), we are forced to consider only those degree sequences satisfying the constraint
$$\max_i \{k_i\} \le \sqrt{2 L_u} = \sqrt{\sum_{j=1}^{n} k_j}, \qquad (5.9)$$
and similarly $\max_i \{k_i^{in}\} \le \sqrt{L}$ and $\max_i \{k_i^{out}\} \le \sqrt{L}$ for directed graphs. A
connection probability pij > 1 can be regarded as the establishment of multiple
links between i and j, and this possibility is avoided only by imposing the above
constraint. Therefore, in the Chung-Lu model the problem of the occurrence
of multiple links is circumvented by restricting the possible degree sequences to
those satisfying Eq. (5.9).
Unfortunately, the constraint expressed by Eq. (5.9) is very strong and is
violated by most empirical degree distributions where a few hubs with very large
degree are present. This limitation prevents us from using the Chung-Lu model
for most empirical degree sequences. Since, as we mentioned, the violation of
Eq. (5.9) can be thought of as leading to multiple links, the problem of the
Chung-Lu method is in some sense the canonical counterpart of the problem
encountered in the link stub reconnection method.
Exercise 5.2 Consider a marginal degree distribution where $\max_i \{k_i\} = \sqrt{2 L_u}$. Discuss whether this is enough to ensure that Eq. (5.6) is a good approximation to the correct expression you found in Homework 5.4.
Homework 5.7 Consider a regular network with $n$ vertices where $k_i = z$ for all $i$, and check whether the condition (5.9) holds. Discuss what you obtain if you use Eq. (5.5) to generate the graph ensemble in this case. Use your result in Homework 5.5 to write the expression for the difference $L_u - \langle L_u \rangle$ in this case.
Homework 5.8 Now consider a star graph with n vertices, where a central
vertex is connected to all the other vertices (and these vertices are not directly
connected to each other), and check whether the condition (5.9) holds. Discuss
what you obtain if you use Eq. (5.5) to generate the graph ensemble in this case.
5.2.4 The Park-Newman model
The limitation of the Chung-Lu model led Park and Newman [68] to modify
the canonical approach in such a way that no restriction on the desired degree
sequence is imposed, and at the same time no multiple links are generated.
Park and Newman started from the general problem of finding the form of the
connection probability pij that generates a canonical ensemble of graphs with
no multiple links and such that two graphs with the same degree sequence are
equiprobable, in the general spirit of the Configuration Model.
As for the Chung-Lu model, we want the connection probability to be a
function $p_{ij} = p(x_i, x_j)$ of some quantities $x_i$, $x_j$ controlling the expected degrees of vertices $i$ and $j$. The quantities $\{x_i\}_{i=1}^n$ play a role similar to that of the desired degrees $\{k_i\}_{i=1}^n$ in the Chung-Lu model, even if they turn out to be in general very different from the expected degrees $\{\langle k_i \rangle\}_{i=1}^n$ and are therefore denoted by a different symbol.
The starting point is to write the probability $P(G)$ of occurrence of a given graph $G$ (with adjacency matrix entries $\{g_{ij}\}$) in the ensemble as a product, over all pairs of vertices, of either $p_{ij}$ (if the link is realized, i.e. $g_{ij} = 1$) or $1 - p_{ij}$ (if it is not, i.e. $g_{ij} = 0$):
$$P(G) = \prod_{i<j:\, g_{ij}=1} p_{ij} \prod_{i<j:\, g_{ij}=0} (1 - p_{ij}) = P_0 \prod_{i<j:\, g_{ij}=1} \frac{p_{ij}}{1 - p_{ij}}, \qquad (5.10)$$
where $P_0 \equiv \prod_{i<j} (1 - p_{ij})$ is a product over all vertex pairs and is therefore independent of the particular graph $G$.
The above expression can be used to find the form of $p_{ij}$ guaranteeing that two graphs $G_1$ and $G_2$ with the same degree sequence are equiprobable. Looking again at Fig. 5.1, the requirement that the graphs $G_1$ and $G_2$ occur with the same probability $P(G_1) = P(G_2)$ translates into the requirement
$$\frac{p_{AB}}{1 - p_{AB}} \frac{p_{DC}}{1 - p_{DC}} = \frac{p_{AC}}{1 - p_{AC}} \frac{p_{DB}}{1 - p_{DB}}, \qquad (5.11)$$
since the two graphs are identical except for the subgraphs defined by the four vertices $A$, $B$, $C$, $D$. For the above expression to hold for all quadruples $A$, $B$, $C$, $D$, the form of $p_{ij}$ must be such that $p_{ij}/(1 - p_{ij}) = f_i f_j$, where $f_i$ is a quantity depending on $i$ alone. Recalling that $p_{ij} = p(x_i, x_j)$, we see that $f_i = f(x_i)$. Rearranging for $p_{ij}$, we have
$$p_{ij} = p(x_i, x_j) = \frac{f(x_i) f(x_j)}{1 + f(x_i) f(x_j)}. \qquad (5.12)$$
Any form of $f(x)$ is compatible with the requirement in Eq. (5.11). Since different choices can be mapped to each other via a redefinition of $x$, we can choose the simplest nontrivial function $f(x) = x$ for later convenience. This yields
$$p(x_i, x_j) = \frac{x_i x_j}{1 + x_i x_j}. \qquad (5.13)$$
Figure 5.2: Average degree $\langle k(x) \rangle$ of vertices versus $x$, corresponding to the choice $\rho(x) \propto x^{-\tau}$ for three values of $\tau$. The trend is initially linear and then saturates to the asymptotic value $\langle k \rangle \to n - 1$ (after ref. [68]).
The expected degree of vertex $i$ is then
$$E(k_i) = \sum_j p_{ij} = \sum_j \frac{x_i x_j}{1 + x_i x_j}, \qquad (5.14)$$
and the above expression can be used to either generate an ensemble of networks
with degree distribution determined by the (free) parameters {xi }, or to find the
particular values of {xi } that produce an expected degree sequence equal to the
observed one. In the latter case, the above expression is intended as a system
of n nonlinear coupled equations where {hki i} = {ki } are known quantities and
{xi } are the unknowns (we will consider this case explicitly in Chapter 10).
Equation (5.14) shows that the expected degree of a vertex with a given value of $x$ can be written as a function of $x$, after integrating out the variables for the other vertices. This can be best appreciated by considering a continuous approximation where the distribution of $x$ over all vertices is assumed to be the continuous density $\rho(x)$:
$$E_x(k) \equiv \langle k(x) \rangle = (n-1) \int_0^{\infty} \frac{x y}{1 + x y}\, \rho(y)\, dy. \qquad (5.15)$$
The behaviour of $\langle k(x) \rangle$ is proportional to $x$ for small values of $x$ and then saturates to the maximum value $n - 1$ for large $x$, consistently with the requirement of no multiple links or self-loops.
Park and Newman [68] studied the model assuming a power-law distribution $\rho(x) \propto x^{-\tau}$ with various values of the exponent $\tau$ (see Fig. 5.2). They found that this assumption has two important consequences on the topology: firstly, the degree distribution $P(k)$ behaves as a power law with the same exponent as $\rho(x)$ for small values of $x$, but then displays a cut-off ensuring $k \le n - 1$ (see Fig. 5.3a). Secondly, the average nearest-neighbour degree turns out to be a decreasing function of the degree (see Fig. 5.3b). As expected, the absence
Figure 5.3: a) Cumulative degree distribution $P_>(k)$ corresponding to the choice $\rho(x) \propto x^{-\tau}$ for three values of $\tau$. b) Average nearest-neighbour degree $k^{nn}(k)$ for the same three choices of the exponent $\tau$. Here isolated symbols correspond to numerical simulations, while solid lines are the analytical predictions (after ref. [68]).
of multiple links is at the origin of both effects. These effects disappear in the sparse regime where $x_i x_j \ll 1$ for all pairs of vertices, in which Eq. (5.13) reduces to
$$p_{ij} \approx x_i x_j. \qquad (5.16)$$
This implies that the model defined by Eq. (5.16) is equivalent to the Chung-Lu model defined by Eq. (5.5). As already discussed in Section 5.2.3, in this limit the expected degrees converge to the specified values of $x$, and no degree correlations are introduced. Therefore the spurious disassortativity disappears in this limit. It is also easy to show that the degree
distribution $P(k)$ becomes a rescaled form of $\rho(x)$. Curiously, in the above quantum analogy this regime corresponds to the classical limit, where the discreteness of the quantum world can be neglected and the Fermi distribution can be replaced by the Boltzmann distribution, which (in a suitable representation) has the expression (5.16).
We finally briefly describe the directed case. Now the probability that a directed link from $i$ to $j$ exists is a function $p_{ij} = p(x_i, y_j)$ of two quantities $x_i$ and $y_j$, playing a role analogous to that of the desired out- and in-degrees $k_i^{out}$, $k_j^{in}$ in the directed version of the Chung-Lu model defined in Eq. (5.8). By looking at Fig. 5.1b and requiring that graphs with the same in- and out-degree sequences are equiprobable, we are led to a condition analogous to Eq. (5.11). This implies that in this case $p_{ij}/(1 - p_{ij}) = f_i g_j$, where $f_i = f(x_i)$ and $g_j = g(y_j)$ are functions of $x_i$ and $y_j$ alone, respectively. Again, all nontrivial choices can be mapped onto the linear case through a suitable redefinition of $x$ and $y$. Therefore we have
$$p_{ij} = p(x_i, y_j) = \frac{x_i y_j}{1 + x_i y_j}. \qquad (5.17)$$
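A sampling sketch of Eq. (5.13) for the undirected case (the $x$ values are toy choices): even a hub value far too large for the Chung-Lu bound yields valid probabilities, with the hub's expected degree saturating below $n - 1$ instead of diverging.

```python
import random

def park_newman(x, rng):
    """Sample a simple graph with the 'saturating' connection probability
    p_ij = x_i x_j / (1 + x_i x_j) of Eq. (5.13): a valid probability for
    any non-negative x_i, with no restriction analogous to Eq. (5.9)."""
    n = len(x)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < x[i] * x[j] / (1 + x[i] * x[j]):
                edges.append((i, j))
    return edges

rng = random.Random(5)
x = [50.0] + [0.5] * 20        # one 'hub' with a very large x value
runs = 500
mean_hub = sum(sum(1 for u, v in park_newman(x, rng) if 0 in (u, v))
               for _ in range(runs)) / runs
# p_0j = 25/26 for each of the 20 other vertices, so the hub's degree
# saturates near (but below) n - 1 = 20
```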
5.3
Maximum-entropy ensembles
All the examples in the last section show that even a conceptually simple idea
(generating a random ensemble of networks with a specified degree sequence)
can encounter big difficulties when naively implemented. The Park-Newman
approach solves the practical and conceptual problems of other implementations
of the CM by ensuring that the ensemble is properly sampled, so that any
inference about the higher-order properties (e.g. the assortativity) is unbiased,
just like for the ER model.
At this point, we might ask a natural question: if we consider a different
constraint (other than the number of links or the degree sequence), how can
we be sure that we end up with an appropriate method to generate the ensemble? Is there some constructive method to generate graph ensembles with given
constraints?
In this section, we show that such a method exists and is based on the
Maximum Entropy Principle.
5.3.1 Shannon entropy
In information theory, an important measure of the uncertainty, or unpredictability, of a random process is provided by Shannon's entropy. If $P(G)$ is the probability of the outcome $G$, Shannon's entropy is defined (up to a proportionality constant which is irrelevant for our later purposes) as
$$S \equiv -\sum_G P(G) \ln P(G), \qquad (5.19)$$
where the sum runs over all the $M$ possible outcomes of the process.
A deterministic (certain) process, i.e. one for which one outcome has probability one while all other outcomes have probability zero, gives $S = 0$, which is the minimum possible entropy. By contrast, a completely unpredictable (uniform) process, i.e. one where all the outcomes have exactly the same probability $P(G) = M^{-1}$, gives the maximum value $S = \ln M$.
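These two extreme cases can be verified directly; a minimal sketch:

```python
import math

def shannon_entropy(probs):
    """S = -sum_G P(G) ln P(G), with the convention 0 * ln 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)

M = 8
deterministic = [1.0] + [0.0] * (M - 1)   # one certain outcome
uniform = [1.0 / M] * M                   # all outcomes equiprobable
# shannon_entropy(deterministic) -> 0.0 (the minimum)
# shannon_entropy(uniform)       -> ln M (the maximum)
```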
Another important property of the entropy is additivity: if the event $G$ requires the simultaneous occurrence (intersection) of two events $G_1$ and $G_2$ (i.e. $G = G_1 \cap G_2$), and if these events are independent, i.e. the joint probability $P(G) = P(G_1 \cap G_2)$ can be factorized as $P(G) = P_1(G_1) P_2(G_2)$ where $P_1$ and $P_2$ are the marginal probabilities for the individual events $G_1$ and $G_2$, then $S = S_1 + S_2$, where $S_i = -\sum_{G_i} P_i(G_i) \ln P_i(G_i)$ denotes the entropy of the individual event $G_i$ ($i = 1, 2$) and the sum runs over the possible outcomes of that event.
Exercise 5.3 Prove the last statement. Use the result to calculate the entropy
of the ER model with n vertices as a function of the probability p.
If we measure Shannon's entropy on a graph ensemble, this will provide us with a measure of the degree of randomness that we are left with once we enforce the constraints that define the model itself. If the constraints represent structural properties taken from observations (like the number of links or the degree sequence in the ER model and the CM, respectively), Shannon's entropy will quantify the residual uncertainty that we are left with about the network after we measure those properties.
An important application of Shannon's entropy is the Maximum Entropy Principle, especially as developed by Jaynes [5]. According to this principle, whenever we have only partial information (summarized in the knowledge of a set of m observables {x_α}_{α=1}^m) about a system, our least biased inference or best guess about the (unknown) rest of the system should be obtained by finding the probability P(G) that maximizes S, subject to the known constraints. This reflects the fact that, except for the known constraints, we are maximally ignorant about the system. The constraints are expressed in the form
    E(x_α) ≡ Σ_G x_α(G) P(G) = x_α*,   (5.20)
where x_α* is the observed (known) value of the α-th property. Note that the constraints are enforced canonically, i.e. as ensemble averages. An additional constraint is given by the normalization of the probability, expressed by Eq. (5.1).
Introducing one Lagrange multiplier θ_α for each constraint x_α (plus an additional multiplier for the normalization constraint) and taking the functional
derivative of S with respect to P, one can show that the result of the constrained maximization of S is

    P(G) = e^{−H(G)} / Z,   (5.21)

where

    H(G) = Σ_{α=1}^m θ_α x_α(G)   (5.22)

is the graph Hamiltonian and

    Z = Σ_G e^{−H(G)}   (5.23)

is the so-called partition function, which enforces the normalization constraint for the probability.
The above expressions coincide with those of traditional statistical physics. Indeed, the beauty of the Maximum Entropy approach is that it shows that the entirety of statistical physics can be reformulated exactly as an inference problem from limited information: from the knowledge of only a few macroscopic quantities (like the total energy) of a system, statistical physics looks for a least biased estimate of the microscopic properties of the system, in terms of the probability P(G) of the microscopic configurations. This consideration establishes a fascinating connection between information theory and statistical physics.
In the context of networks, we are interested in performing the same operation. This establishes another beautiful connection, this time to graph theory. From the knowledge of a few aggregate properties {x_α}_{α=1}^m (such as the degree sequence), we want to construct otherwise completely random ensembles of graphs. Applying the Maximum Entropy Principle to graph ensembles leads to the so-called Exponential Random Graph (ERG) models, which were first introduced in social network analysis [22, 8] to generate ensembles of graphs matching a given set of observed topological properties. ERGs were then rediscovered within an explicit statistical-mechanics framework [19, 6, 7], where it was shown that traditional tools borrowed from statistical physics can be used to investigate and sometimes even solve them explicitly [19].
Homework 5.11 Prove the following relation (useful in the following):

    E(x_α) = − (1/Z) ∂Z/∂θ_α   (5.24)
           = − ∂ ln Z/∂θ_α.   (5.25)
In the following we consider specific examples of ERGs, and we show that the ER model and the Park–Newman implementation of the CM can be recovered as particular cases of maximum-entropy ensembles. Moreover, we will show that the general method allows one to extend the approach to different constraints.
5.3.2
Undirected graphs
We start by considering the rather simple, yet quite general, case in which the Hamiltonian can be expressed in the form

    H(G) = Σ_{i<j} θ_ij g_ij.   (5.26)

Note that in this model the total energy H(G) of the graph is the sum of the energies θ_ij corresponding to its individual links. Each energy θ_ij can be regarded as the cost of placing a link between i and j. With this choice the partition function reads

    Z = Σ_{{g_ij}} e^{−H(G)} = Σ_{{g_ij}} e^{−Σ_{i<j} θ_ij g_ij} = Σ_{{g_ij}} Π_{i<j} e^{−θ_ij g_ij}   (5.27)
      = Π_{i<j} Σ_{g_ij=0,1} e^{−θ_ij g_ij} = Π_{i<j} (1 + e^{−θ_ij}) = Π_{i<j} Z_ij,   (5.28)

so that

    ln Z = Σ_{i<j} ln Z_ij = Σ_{i<j} ψ_ij,   (5.29)

where

    ψ_ij = ln Z_ij.   (5.30)
Equations (5.26)–(5.30) completely define the model. From the free energy it is possible to compute all the relevant quantities. For instance, the expected occupation number of the pair of vertices i, j, representing the probability that such vertices are connected, is

    p_ij = E(g_ij) = − ∂ψ_ij/∂θ_ij = 1/(1 + e^{θ_ij}),   (5.31)

and the expected number of links is

    E(L_u) = Σ_{i<j} E(g_ij) = Σ_{i<j} p_ij.   (5.32)
ER Random Graph
We now show that the ER random graph model can be recovered as a particular case of the exponential model defined by the Hamiltonian (5.26). This is obtained when all energies are equal:

    θ_ij = θ.   (5.33)

With such a choice, the Hamiltonian reads

    H(G) = θ Σ_{i<j} g_ij = θ L_u(G).   (5.34)

Equation (5.31) then becomes

    p_ij = p = 1/(1 + e^θ)   (5.35)

and, as expected, we recover the constant form for the connection probability characterizing the ER model.
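Since the links are independent in this ensemble, sampling from it amounts to drawing each link with probability p = 1/(1 + e^θ). The following Python sketch (all parameter values are illustrative) verifies that the empirical number of links matches the ensemble average:

```python
import math, random

def sample_graph(n, theta, rng):
    """Draw one graph from P(G) ∝ e^{-theta * L_u(G)}: each of the
    n(n-1)/2 possible links is present independently with p = 1/(1+e^theta)."""
    p = 1.0 / (1.0 + math.exp(theta))
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p]

rng = random.Random(42)
n, theta, runs = 50, 1.0, 200
p = 1.0 / (1.0 + math.exp(theta))
pairs = n * (n - 1) // 2
mean_links = sum(len(sample_graph(n, theta, rng)) for _ in range(runs)) / runs
print(round(p * pairs, 1), round(mean_links, 1))   # ensemble vs empirical E(L_u)
```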
Configuration Model
We now consider the additive case

    θ_ij = θ_i + θ_j,   (5.36)

which results in

    H(G) = Σ_{i<j} (θ_i + θ_j) g_ij = Σ_i θ_i Σ_{j≠i} g_ij = Σ_i θ_i k_i(G).   (5.37)

Note that in this case we can set the expected value of each degree ⟨k_i⟩ to any desired value k_i* by tuning the corresponding parameter θ_i. In other words, we are fixing the desired degree sequence, and we expect this case to be equivalent to the version of the configuration model described in Section 5.2.4. Indeed, Eq. (5.31) now reads

    p_ij = 1/(1 + e^{θ_i + θ_j}).   (5.38)
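In practice the multipliers θ_i must be tuned numerically so that the expected degrees match the desired values k_i*. Below is a minimal Python sketch of such a fit in terms of the variables x_i = e^{−θ_i} (the target sequence and the damped fixed-point scheme are illustrative choices; convergence is not guaranteed for every degree sequence):

```python
import math

def fit_multipliers(k_target, n_iter=5000, damp=0.5):
    """Tune x_i = e^{-theta_i} so that the expected degree
    E(k_i) = sum_{j != i} x_i x_j / (1 + x_i x_j) equals k_target[i],
    using a damped fixed-point iteration."""
    n = len(k_target)
    x = [math.sqrt(k / n) for k in k_target]          # rough starting point
    for _ in range(n_iter):
        update = [k_target[i] / sum(x[j] / (1.0 + x[i] * x[j])
                                    for j in range(n) if j != i)
                  for i in range(n)]
        x = [damp * a + (1.0 - damp) * b for a, b in zip(x, update)]
    return x

k_target = [2, 2, 3, 3, 4, 4]                         # illustrative target degrees
x = fit_multipliers(k_target)
expected = [sum(x[i] * x[j] / (1.0 + x[i] * x[j])
                for j in range(len(x)) if j != i) for i in range(len(x))]
print([round(k, 2) for k in expected])                # ≈ the target degrees
```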
5.3.3
Directed graphs
We now briefly consider the directed case. The Hamiltonian (5.26) becomes

    H(G) = Σ_{i≠j} θ_ij g_ij   (5.39)

and calculations analogous to those presented above allow us to write the partition function as

    Z = Π_{i≠j} Z_ij = Π_{i≠j} (1 + e^{−θ_ij}),   (5.40)

so that

    ln Z = Σ_{i≠j} ln Z_ij = Σ_{i≠j} ψ_ij.   (5.41)

The connection probability becomes

    p_ij = E(g_ij) = − ∂ψ_ij/∂θ_ij = 1/(1 + e^{θ_ij}).   (5.42)

The directed version of the ER model is recovered by setting θ_ij = θ, so that

    H(G) = θ Σ_{i≠j} g_ij = θ L(G)   (5.43)

and

    p_ij = p = 1/(1 + e^θ).   (5.44)

The directed version of the CM is instead recovered by the additive choice θ_ij = α_i + β_j, which gives

    H(G) = Σ_{i≠j} (α_i + β_j) g_ij = Σ_i [α_i k_i^out(G) + β_i k_i^in(G)]

and

    p_ij = 1/(1 + e^{α_i + β_j}).   (5.45)

This choice is equivalent to the directed version of the configuration model defined in Eq. (5.17), where x_i ≡ e^{−α_i} and y_j ≡ e^{−β_j}.
5.3.4
Weighted graphs
Using the same general recipe, one might also consider other ensembles of graphs. As a last example, we consider an exercise involving an ensemble of weighted graphs. A weighted graph G is still described by the entries {g_ij} of an adjacency matrix. However, these entries are now assumed to be non-negative integers.

Exercise 5.4 Consider the Hamiltonian

    H(G) = Σ_{i<j} θ_ij g_ij,   (5.46)

where G is a generic weighted graph. Following the derivation leading to Eq. (5.31), write the expected weight E(g_ij) of the connection between vertices i and j. Assume that the possible values of g_ij are non-negative integers, ranging from 0 to +∞.
Bibliography
[1] M.E.J. Newman, S.H. Strogatz and D.J. Watts, Phys. Rev. E 64, 026118
(2001).
[2] M. Molloy and B. Reed, Random Structures and Algorithms 6, 161 (1995).
[3] A.C.C. Coolen, A. De Martino and A. Annibale, J. Stat. Phys. 136, 1035–1067 (2009).
[4] E.S. Roberts and A.C.C. Coolen, Phys. Rev. E 85, 046103 (2012).
[5] E.T. Jaynes, Phys. Rev. 106, 620 (1957).
[6] J. Berg and M. Lässig, Phys. Rev. Lett. 89, 228701 (2002).
[7] Z. Burda, J. Jurkiewicz and A. Krzywicki, Phys. Rev. E 69, 026106 (2004).
Chapter 6
Random Graph
Implementation
In the previous chapters we have discussed at some length examples of graphs
and of their properties. The main idea is to come closer to an understanding
of the nature of real world graphs. In the history of our understanding of these
graphs computer models and computer analysis tools have played an important
role. In the computer science chapters we will delve a bit deeper into these
tools.
These analyses can help us to:
visualize static graphs to gain a visual impression of the shape of the graph
compute static network properties of graphs, such as the number of edges,
the shortest path between two vertices, the degree of a vertex, or the
clustering coefficient of a graph
understand dynamic properties, such as how graphs grow or shrink in time
In this chapter we will look deeper at how we can implement graphs in the
computer, so that we can perform computer analyses of the graphs. Given the
limited time available in part of a one-semester course, we will only cover
some basic notions. This will be, however, quite useful, as the basics are enough
to give an idea of how more elaborate systems work.
In the remainder of these notes we will use three computer tools. We will
use Gephi for graph visualization, we use a programming language such as C,
C++, Java, or Python to compute network properties, and we will introduce
Netlogo to study dynamic behavior of graphs.
Gephi and Netlogo are high level tools. C, C++, Java, and Python are lower level tools: the level of abstraction that you work at is more detailed, and it takes more effort to achieve things. The high level tools are built using the low level languages. Therefore, to understand the limitations of the high level tools, it pays to learn a bit about the low level languages, and the homework exercises will concentrate on implementing graphs using a low level language.
In all programming homework exercises of this course, hand in the source
code, the input to the program (if any), and the output (if any). Your program
should contain documentation in the source code (comments). Provide in your
report a short description of what you did, and how essential aspects of your
approach work (except for very small exercises).
6.1
Random graph
We will start with implementing a small regular network so that we can compute a few important properties of the network. In later chapters we will see how to work with larger graphs, with the Configuration Model, and with real world graphs. We will see how the graphs can be constructed, and how certain network properties can be checked to see if the graph exhibits regular or real world properties.
6.1.1
Adjacency Matrix
The programs in this chapter are used to study network properties such as
counting edges or computing shortest paths. Our first program will generate a
graph. The program will store it in memory, and will compute network properties. The aim of these exercises is to gain proficiency in manipulating random
graphs, and, later on, with real world graphs. By following the exercises, you
will create a functioning program that will be able to do increasingly useful
things.
As has been noted before, graphs can be represented in a computer program by an adjacency matrix. The adjacency matrix of a graph of N vertices can be represented in a programming language as a two-dimensional array of boolean values. The boolean values represent the presence of an edge: TRUE means an edge is present, FALSE means there is no edge. An edge between vertex i and vertex j can be represented as m[i][j] = TRUE in a C-like language, where m is the variable holding the matrix. A logical choice for the element type of m is a BOOLEAN or an int.
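As a minimal Python sketch (names are illustrative), the adjacency matrix and an edge-insertion helper might look as follows:

```python
# A minimal sketch of an adjacency-matrix representation of a graph.
N = 6
m = [[False] * N for _ in range(N)]     # N x N matrix, no edges yet

def add_edge(i, j):
    """Insert the undirected edge {i, j}: keep the matrix symmetric."""
    m[i][j] = True
    m[j][i] = True

add_edge(0, 1)
add_edge(1, 2)
print(m[0][1], m[2][1], m[0][2])        # True True False
```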
Exercise 6.1 Question: How much memory does a boolean data type take in
your favorite language? How much does an integer take? How much memory
does it take to store a graph of 1024 vertices? How can this be reduced? At what
cost?
Now that we have discussed the data structure, it is time to start writing
the program.
First, arrange to get access to a machine on which you can program. You
can use your own computer, or one in the computer labs.1 We will only be using
simple text-based command-line programs in this course. Your basic skills from
first year programming classes should be all you need.
Next, write a program in C, C++, Java, or Python using an adjacency
matrix.
1 Just as with your introduction to programming class, the machine should have a text editor and a compiler, or a development environment such as Eclipse, Visual Studio, or Xcode. (Word processors such as Word and Pages save their text in ways that a compiler cannot read. Wordpad, Emacs, VI, and TextEdit are text editors that save their output as plain text that can be read by a compiler.)
6.1.2
Random Graph
6.1.3
Mean and Variance of the Number of Edges
Next, we will compute a few properties of the network. We will work towards
properties that allow us to discern a difference between regular networks and
complex networks. We will start with computing the mean and variance of the
edge distribution of the networks that we have.
Exercise 6.2 What do you expect is the mean of the edge distribution of the
regular graphs for the values of p that you have used in the previous Homework
Exercise? What is the expected variance?
Homework 6.3 Write two routines to compute (1) the mean and (2) the variance of the number of edges in a series of graphs. (Definitions and algorithms for mean and variance can be found in the usual places, such as your statistics books, stackoverflow.com, or Wikipedia). Generate 10 random graphs, and compute the mean and variance of the number of edges. Do not print the edges, only the mean and variance of the number of edges for the graphs. Then generate 100 random graphs, and compute the mean and variance of the number of edges. Is the mean for 100 graphs different from that for 10 graphs? What about the variance? Why?
Exercise 6.3 Is your answer correct? Why do you believe that your answer is
correct? Name two strategies to use so you can have more faith in the correctness
of your program.
6.1.4
Degree Distribution
Next we will consider computing the distribution of the degree of the vertices
of a graph.
Homework 6.4 Write a program to compute the degree distribution of a random graph with n = 10 vertices and p = 0, 0.05, 0.5, 1.0 edge probabilities. The
program should compute and print the following numbers: the min, max, mean,
median, and variance of the degree distribution of each graph. Repeat this exercise with n = 100. Do the values of min, max, mean, median, and variance
each increase or decrease compared to n = 10? Why do you think that is the case (provide an argument for each case)?
Time and Space Complexity
For an algorithm to be useful it has to be correct (it computes what it is supposed to compute) and it has to be efficient (it should compute its result in as
little time and space as possible). This notion of efficiency is formally called
computational complexity, and often, when the meaning is clear, just complexity.2 For most algorithms the computational complexity depends on the
size of the input problem, and complexity is expressed in big O notation,
where the O is followed by some function, usually a linear, exponential, or logarithmic function. O(n) should be read as: the complexity of the algorithm is of the order of n, which means that there is a constant C such that the complexity is bounded by C·n.
The time complexity of an algorithm indicates how the running time scales
with the size n of the input. A linear time complexity means that when the size
of the matrix increases by a factor of 10, then the run time of the algorithm also
increases by that factor. An algorithm with a quadratic time complexity takes
a 100 times longer to run. A single for loop, running from 1 to n has a run time
that is linear in n (assuming the operations inside the for loop take constant
time). A doubly nested for loop from 1 to n has quadratic time complexity
(since the operations inside the outer-most loop now also take linear time, not
constant).
2 Note that the word complex is used in many contexts. To put it differently, we are here
dealing with the computational complexity of complex network computations.
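To make the discussion concrete, here is a small Python sketch: counting the edges of a graph stored as an adjacency matrix scans the upper triangle of the matrix with a doubly nested loop, and hence takes O(n^2) time.

```python
def count_edges(m):
    """Count the edges of an undirected graph by scanning the upper triangle
    of its adjacency matrix: a doubly nested loop, hence O(n^2) time."""
    n = len(m)
    return sum(1 for i in range(n) for j in range(i + 1, n) if m[i][j])

# Example: a triangle on vertices 0, 1, 2 plus an isolated vertex 3.
m = [[False] * 4 for _ in range(4)]
for i, j in [(0, 1), (1, 2), (0, 2)]:
    m[i][j] = m[j][i] = True
print(count_edges(m))  # 3
```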
The space complexity of an algorithm indicates how much storage an algorithm needs. An adjacency matrix is a two dimensional array, and thus has
quadratic space complexity in n, the number of vertices.
Some network properties can be computed quickly, such as counting the
number of vertices or edges. We say that the run-time of the algorithm is linearly
(or polynomially) related to the size of the network. However, some network
properties are harder to compute exactly, and require a run-time that has a
quadratic or cubic relation to the size of the network, such as computing the
shortest path between all pairs of vertices. For these properties, approximation algorithms may have been devised that run in linear time, or even faster, in sub-linear time (see, for example, [Russel and Norvig(1995)]).
6.2
Visualization
6.2.1
Downloading Gephi
In order to use Gephi, you must download it from the usual download site.
Google for Gephi, or go to http://gephi.github.io and download the package
that is suited for your operating system (Linux, Mac OS X, Windows). Everything should work right away under Linux and Mac OS X, for Windows you may
need to install the latest Java version (available at https://java.com/en/download/).
6.2.2
Running Gephi
Start up the program, and launch a sample network, Les Misérables. Your screen should look like the screenshot in Figure 6.1. Your sample file is imported; just click OK on the import report. You should now see a nice graph of the relations between the characters in this famous novel.
In order to reduce the homework workload, there are no compulsory exercises for Gephi. Gephi is a nice tool to play around with, and it can be quite useful for getting a feel for the graphs that you are working with.
Exercise 6.4 Click on Data Laboratory, go back to the Overview, right click on the large red circle, switch to Data Laboratory, and you will see that the best connected character in Les Misérables is Valjean.
To the right of the pane are buttons to compute network properties such as
Average Degree, Network Diameter, etc. Click on the Run buttons, and see what
happens.
Bibliography
[Russel and Norvig(1995)] Russell, S.J. and P. Norvig (1995): Artificial Intelligence: A Modern Approach. Series in Artificial Intelligence, Prentice Hall.
Chapter 7
Configuration Model
Implementation
In the previous chapter we have discussed how to write a program to generate and manipulate Erdős–Rényi networks using an adjacency matrix. We will now write some more code to compute network properties for the random networks.
These will hopefully show us the obvious, that the network is regular, and does
not display real-world properties.
In this chapter we will also take further steps on the road towards real-world networks by writing a program to generate networks using the Configuration Model.
7.1
7.1.1
In the previous chapter the edges were assigned randomly to the pair i and j.
As you will recall, the configuration model assigns the edges differently, using
a pre-specified sequence that defines the degree of each vertex. The edges are
then assigned randomly, provided that the degree sequence is satisfied. The
next exercise is to modify our program to generate edges randomly according
to the pre-specified degree sequence.
Exercise 7.1 Degree Sequence Histogram
Print a histogram of the degree of vertices in the graph. In a histogram there
should be a bar for each value of the degrees. You can print a histogram by
declaring an array to store the counts of the vertex degrees. The array is indexed
by the degree k of each vertex. Loop through all vertices, and increment the array
element at the right index if there is a vertex with the right degree value. Then
print the array value as a number, or print it as a series of asterisks.
Now that we have worked with a degree sequence, we are ready to implement
the configuration model.
Homework 7.1 (1) On a piece of paper, draw a graph of 10 vertices. You
can choose how you draw the edges. (2) Write down the degree sequence of this
graph. (3) In your program, declare an array of integers (of size 10) for the
degree sequence. (4) Initialize the array to the degree sequence.
The program will now, for each vertex, attempt to create the correct number
of edges, connecting to a randomly chosen destination vertex, if that destination
vertex still has room (i.e., the prespecified degree has not yet been reached).
(5) So: For each (source) vertex,
randomly choose a destination vertex whose degree is too low according to
its pre-specified degree, and create an edge between the two vertices
(if there is no destination vertex left whose degree still allows incoming
edges, stop)
repeat this process until the actual degree of the source vertex matches that
of the pre-specified degree
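As a sketch, the recipe above can be written in Python as follows (names are illustrative; the routine deliberately allows multi-edges and self-loops, which are the subject of Section 7.2):

```python
import random

def naive_configuration_model(degrees, rng):
    """Recipe from the text: for each source vertex, keep connecting to a
    randomly chosen vertex that still has unused degree. Multi-edges and
    self-loops are possible; they are dealt with later."""
    n = len(degrees)
    remaining = list(degrees)           # unused degree per vertex
    edges = []
    for src in range(n):
        while remaining[src] > 0:
            candidates = [v for v in range(n) if v != src and remaining[v] > 0]
            if remaining[src] >= 2:
                candidates.append(src)  # a self-loop uses two of src's stubs
            if not candidates:
                break                   # no destination left: give up
            dst = rng.choice(candidates)
            edges.append((src, dst))
            remaining[src] -= 1
            remaining[dst] -= 1
    return edges, remaining

rng = random.Random(1)
edges, remaining = naive_configuration_model([2, 2, 2, 2], rng)
print(edges, remaining)                 # all remaining degrees end up 0 here
```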
7.1.2
Visualization
How can you know that your program is working correctly? One way to see if your program behaves as expected is to have your program output the edges, and import them into Gephi to have it visualize the graph.
Exercise 7.2 To do so, you can write a small routine that prints (source, destination) pairs of the vertex ids. Write the output to a file, or print it to standard output and redirect the output to a file, which you then import into Gephi: program > file.
There are other ways to see if your program behaves correctly, such as by
writing a routine to check the consistency of the output. We will do so shortly.
7.2
7.2.1
Due to the way that the edge destinations were chosen, multi-edges and self-loops may have occurred. We can write a checking routine to find mismatches between the actual and the pre-specified degree sequence. Multi-edges and self-loops are undesirable; however, removing them would subtly change the randomness of the graph. We need another solution, as Chapter 5 discussed in depth.
One way to address this issue is the Repeated Configuration Model, which we will implement in this chapter. We will start with a small warmup to check the degrees and to count the number of self-loops and multi-edges.
7.2.2
Check Routines
Does the output of your program match the pre-specified degree sequence for
the graphs that you tried?
Homework 7.2 Check routine
Write a routine that checks the degrees of the vertices against the degree sequence. Have the routine report the number of vertices with mismatching degrees, and the number of missing edges. Run the routine a few times on the
graph. Does the graph have the required degree sequence? If so, why? If not,
why not?
7.2.3
You have most likely found that the resulting degree sequences were not always
realized by the program. We will now implement the repeated configuration
model to remedy this problem.
Homework 7.4 Repeated Configuration Model
Create a new copy of your program. Change your program so that when a graph with self-loops or multi-edges is discovered, it is discarded, and the program starts over with a new graph. Run the repeated configuration model for the two graphs (10 vertices, 20 vertices). How often does your program have to restart (for each graph)? Is termination guaranteed?
At this point, you may want to output the graph to visualize it with Gephi,
and to compare it with the un-repeated configuration model.
The repeated configuration model addresses a shortcoming of the original configuration model by calling the model repeatedly. Thus, the time complexity of the repeated configuration model is larger than that of the original configuration model.
The time complexity can be assessed in at least two ways: by inspection,
and by experiment.
Exercise 7.5 (a) Assess the time complexity of the repeated configuration
model by code inspection/logical reasoning. Does the time complexity depend
on n, the number of vertices of the graph? Does it depend on other factors?
(b) Assess the time complexity of the repeated configuration model experimentally. Run the repeated configuration model, and count how many times it restarts, i.e., how often the original configuration model is called. Do this experiment 20 times, and average the result.
Section 5.2 contains a more elaborate description of alternative implementations. It also contains a remark about the computational complexity of the
repeated configuration model (Edge stub reconnection).
Part II
Applications of Networks
Chapter 8
Percolation
This chapter is devoted to percolation theory, which is the study of connectivity
in large networks. In Section 8.1 we look at ordinary percolation on an infinite
lattice, which is a model for connectedness of a large regular network in which
edges are randomly removed. In Section 8.2 we look at invasion percolation
on an infinite lattice, which is a model for the spread of a virus through a
large regular network in which edges have random transmission capacities. In
Section 8.3 we look at the configuration model and investigate how vulnerable
its connectedness is when vertices are being removed, either deterministically or
randomly. This is a model for a malicious attack on a network by a hacker.
A standard reference for percolation theory is Grimmett [2].
8.1
Ordinary percolation
    x ↔_p y   (8.2)

if and only if there is a path π connecting x and y such that w(e) ≤ p for all e ∈ π. (A path is a collection of neighbouring vertices connected by edges.) Let C_p(0) denote the p-cluster containing the origin, and define

    θ(p) = P(|C_p(0)| = ∞),   (8.3)

with P denoting the law of w, i.e., θ(p) is the probability that the origin is connected to infinity via edges with weight ≤ p. This is called the percolation function.
We have

    C_0(0) = {0},   C_1(0) = Z^d,   p ↦ C_p(0) is non-decreasing,   (8.4)

    θ(0) = 0,   θ(1) = 1,   p ↦ θ(p) is non-decreasing.   (8.5)

By monotonicity there is a critical threshold p_c such that θ(p) = 0 for p < p_c, while

    θ(p) > 0,   p > p_c.   (8.7)

It is known that in the supercritical phase there is a unique infinite cluster with P-probability 1.
8.2
Invasion percolation
Again consider Z^d with its nearest-neighbour edges and the random field of weights w. Grow a cluster from 0 as follows (‖·‖ denotes the Euclidean distance on Z^d):
1. Invade the origin: I(0) = {0}.
2. Look at all the edges touching I(0), choose the edge with the smallest weight, and invade both that edge and the vertex at the other end: I(1) = {0, x}, with x = argmin_{y ∈ Z^d: ‖y‖=1} w({0, y}).
3. Repeat 2 with I(1) replacing I(0), etc. (see Fig. 8.4).
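The three steps above can be sketched in Python on Z^2, sampling the edge weights lazily as i.i.d. uniform random numbers (a toy implementation for a small number of steps):

```python
import random

def invasion_percolation(n_steps, rng):
    """Invade Z^2 from the origin: at each step add the boundary edge with
    the smallest weight; weights are i.i.d. uniform, sampled lazily."""
    def neighbours(v):
        (a, b) = v
        return [(a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)]

    weights = {}
    def w(edge):
        key = tuple(sorted(edge))
        if key not in weights:
            weights[key] = rng.random()
        return weights[key]

    invaded = {(0, 0)}
    invaded_edges = set()
    for _ in range(n_steps):
        boundary = [(x, y) for x in invaded for y in neighbours(x)
                    if tuple(sorted((x, y))) not in invaded_edges]
        e = min(boundary, key=w)        # edge with the smallest weight
        invaded_edges.add(tuple(sorted(e)))
        invaded.add(e[1])               # the other end (may be invaded already)
    return invaded, invaded_edges

rng = random.Random(0)
I, E = invasion_percolation(50, rng)
print(len(I), len(E))                   # at most n+1 vertices, exactly n edges
```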
In this way we obtain a sequence of growing sets I = (I(n))_{n∈N_0}, with I(n) ⊆ Z^d the set of invaded vertices at time n and |I(n)| ≤ n + 1. (The reason for the inequality is that the vertex at the other end may have been invaded before. The set of invaded edges at time n has cardinality n. An invaded edge no longer counts for the probing of the weights, because it cannot be invaded a second time.) The invasion percolation cluster is defined as

    C_IPC = lim_{n→∞} I(n).   (8.8)
It is known that the invasion percolation cluster has zero density:

    lim_{N→∞} (1/|B_N|) |B_N ∩ C_IPC| = 0  a.s.,  with B_N = [−N, N]^d ∩ Z^d.   (8.10)

Moreover, writing W_n for the weight of the edge invaded at time n,

    lim_{n→∞} W_n = p_c  a.s.   (8.11)
Figure 8.4: The first three steps of the invasion: I(n) for n = 0, 1, 2.
Figure 8.5: Picture of invasion percolation becoming trapped inside C_p (shaded region). All edge weights in C_p are ≤ p, all edge weights incident to the boundary of C_p are > p. The lower black circle is the origin (the starting location of the invasion). The upper black circle is the vertex where the invasion enters C_p, which occurs at time τ_p.
Homework 8.1 Work out the details of the above argument. (Note that we
tacitly use that the critical percolation threshold of the full space is the same as
that of the half-space, which is not obvious but is true.)
Now, the edge invaded at time τ_p, being incident to C_p, has weight > p. Since the invasion took place along this edge, all edges incident to I(τ_p − 1) (which includes this edge) have weight > p too. Thus, all edges incident to I(τ_p) that are not edges of C_p have weight > p. However, all edges connecting the vertices of C_p have weight ≤ p, and so after time τ_p the invasion will be stuck inside C_p forever. Not only does this show that C_IPC ⊆ I(τ_p) ∪ C_p ⊊ Z^d, it also shows that W_n ≤ p for all n large enough a.s. Since p > p_c is arbitrary, it follows that

    lim sup_{n→∞} W_n ≤ p_c  a.s.   (8.13)
Next we argue why the reverse inequality holds as well. Indeed, suppose that W_n ≤ p for all n large enough, for some p ∈ (0, p_c). Then

    C_IPC ⊆ I(σ_p) ∪ C_p(I(σ_p))   (8.14)

with

    σ_p = inf{m ∈ N_0 : W_n ≤ p  ∀ n ≥ m},   (8.15)
i.e., σ_p is the first time from which onwards all invaded edges have weight at most p. Since p < p_c, all p-clusters are finite a.s., and this contradicts the fact that the invasion never stops. Note that

    lim_{N→∞} (1/|B_N|) |B_N ∩ C_IPC| ≤ θ(p)  a.s.,  p > p_c.   (8.16)–(8.17)
8.3
Attacks on the configuration model
Let us leave the world of infinite lattices and return to the world of finite random graphs. In Chapters 2 and 3 we saw that percolation may occur in the Erdős–Rényi model and in the configuration model. We found that the critical percolation thresholds in these models are λ = 1 and ν = 1, respectively. Namely, for λ > 1 and ν > 1, respectively, the largest cluster has size Θ(n) as n → ∞, with n the number of vertices, while for λ < 1 and ν < 1, respectively, it has size Θ(log n).
We focus on the configuration model with n vertices and with vertex degrees D_1, ..., D_n that are i.i.d. random variables drawn from a prescribed probability distribution f. We will be particularly interested in choices of f having a polynomial tail f(k) ≈ C k^{−τ}, k → ∞, with exponent τ ∈ (3, ∞).
The key quantity determining the occurrence of percolation is (recall (3.12))

    ν = Σ_{k∈N} k(k−1) f(k) / Σ_{k∈N} k f(k) ∈ (0, ∞).   (8.18)
Suppose that ν > 1. We ask ourselves the following question:

Given is a sequence π = (π(k))_{k∈N_0} of probabilities, each taking values in [0, 1]. Suppose that a hacker attacks the network by randomly removing vertices: vertex i is retained with probability π(D_i) and is removed with probability 1 − π(D_i) (together with all its incident edges), independently for different vertices. After the attack, does the largest cluster still have size Θ(n) or not?

This question was answered by Janson [3]: the answer is yes if and only if

    ν_π = Σ_{k∈N_0} k(k−1) π(k) f(k) / Σ_{l∈N_0} l f(l) > 1.   (8.19)
Thus, if ν_π ≤ 1, then the answer is no and we say that the attack was successful.

Homework 8.2 Give a heuristic explanation why the denominator in (8.19) is not Σ_{k∈N_0} k π(k) f(k). Hint: Recall the argument in Section 3.1.4.
Two choices are of interest:

(1) π(k) = γ ∈ (0, 1) for all k ∈ N. This corresponds to a random attack, where the terrorist removes a fraction 1 − γ of the vertices without looking at the degrees.

(2) π(k) = 0 for k > k* and π(k) = 1 for k ≤ k*, with k* ∈ N some threshold value. This corresponds to a deterministic attack, where the terrorist removes all vertices with degree larger than k*.

In case (1) we have ν_π = γν, and so the attack is successful if and only if γ ≤ 1/ν, i.e., a fraction ≥ 1 − (1/ν) of the vertices is removed. In case (2) we have

    ν_π = Σ_{1≤k≤k*} k(k−1) f(k) / Σ_{k∈N} k f(k),   (8.20)

and so the attack is successful if and only if

    Σ_{1≤k≤k*} k(k−1) f(k) / Σ_{k∈N} k(k−1) f(k) ≤ 1/ν.   (8.21)
For the choice f(k) = k^{−τ}/ζ(τ), with ζ the Riemann zeta function, we have, as k* → ∞,

    Σ_{k>k*} k(k−1) f(k) ≈ k*^{−(τ−3)} / ((τ−3) ζ(τ)).   (8.23)

This gives us the approximate criterion that the attack is successful if and only if k* ⪅ k*(τ) with

    k*(τ) = [(τ−3) (ζ(τ−2) − ζ(τ−1)) (1 − 1/ν)]^{−1/(τ−3)}.   (8.24)

Note that

    lim_{τ→∞} k*(τ) = 2.   (8.25)
Thus, for τ ↓ 3 the network becomes extremely vulnerable, because only a few vertices with a high degree need to be removed in order to take down the network, while for τ → ∞ the network becomes extremely robust, because all vertices with degree > 2 need to be removed in order to take down the network.
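The threshold in (8.24) is easy to evaluate numerically. The Python sketch below assumes the pure power-law choice f(k) = k^{−τ}/ζ(τ) used above and an illustrative value ν = 2 (in general ν is determined by f); the zeta function is computed by direct summation:

```python
def zeta(s, terms=100000):
    """Riemann zeta function by direct summation (adequate for s >= 2)."""
    return sum(k ** (-s) for k in range(1, terms + 1))

def k_star(tau, nu):
    """Threshold k*(tau) of (8.24) for the attack on a power-law network."""
    c = (tau - 3.0) * (zeta(tau - 2.0) - zeta(tau - 1.0)) * (1.0 - 1.0 / nu)
    return c ** (-1.0 / (tau - 3.0))

for tau in (4.0, 6.0, 10.0, 20.0):
    print(tau, round(k_star(tau, nu=2.0), 3))
# k*(tau) is large for tau close to 3 and approaches 2 as tau grows
```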
Figure 8.7: Plot of τ ↦ k*(τ).
Exercise 8.3 Prove (8.25). Hint: Show that lim_{u↓1} (u−1) ζ(u) = 1 and lim_{u→∞} 2^u [ζ(u−1) − ζ(u)] = 1.
Bibliography
[1] O. Angel, J. Goodman, F. den Hollander and G. Slade, Invasion percolation on regular trees, Ann. Probab. 36 (2008) 420–466.
[2] G.R. Grimmett, Percolation, Springer, Berlin, 1989.
[3] S. Janson, On percolation in random graphs with given vertex degrees, Electronic J. Probab. 14 (2009) 86–118.
Chapter 9
Epidemiology
In this chapter we look at a model for the spread of an infection over a network,
called the contact process. This process is an example of a larger class of random
processes referred to as interacting particle systems. In Section 9.1 we look at
infinite lattices, in Section 9.2 at finite lattices, in Section 9.3 at finite random
graphs. In Section 9.4 we investigate a closely related problem, namely, how a
rumour spreads through a network.
A standard reference for interacting particle systems is Liggett [8].
[begin intermezzo]
The Poisson process with rate λ ∈ (0, ∞) is defined as the increasing sequence of random times (T_i)_{i∈N_0} such that T_0 = 0 and T_i − T_{i−1}, i ∈ N, are i.i.d. random variables with common distribution EXP(λ), i.e.,

    P(T_1 > t) = e^{−λt},  t ≥ 0.   (9.1)
We may think of Ti as the time at which a random clock rings for the i-th
time. We imagine that the clock rings at rate , i.e., the probability that a
ring occurs in an infinitesimally small interval dt is dt. Indeed, by dividing up
the time interval [0, t] into pieces of length t each and letting t 0, we see
that
P(T1 > t) = P no ring occurs in [0, t]
(9.2)
= lim (1 t)t/t = et ,
t 0,
t0
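A sample path of the Poisson process can be generated directly from this definition, by summing i.i.d. EXP($\lambda$) inter-arrival times. The following is a minimal numerical sketch (not part of the original text; the function names are ours):

```python
import random

def poisson_process(rate, horizon, rng):
    """Ring times T_1 < T_2 < ... <= horizon of a Poisson process with the
    given rate, built from i.i.d. EXP(rate) inter-arrival times as in (9.1)."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)  # T_i - T_{i-1} ~ EXP(rate)
        if t > horizon:
            return times
        times.append(t)

rng = random.Random(0)
# The number of rings in [0, t] is Poisson(rate * t), so the empirical number
# of rings per unit time should be close to the rate.
n_rings = [len(poisson_process(2.0, 100.0, rng)) for _ in range(200)]
print(sum(n_rings) / len(n_rings) / 100.0)  # close to 2.0
```

As a sanity check, the empirical rate above reproduces the parameter $\lambda = 2$ up to statistical fluctuations.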
9.1 Infinite lattices
9.1.1 Definition
The contact process is a Markov process $(\eta_t)_{t\ge 0}$ with state space $\Omega = \{0,1\}^{\mathbb{Z}^d}$, $d \ge 1$, where
$$\eta_t = \{\eta_t(x)\colon x \in \mathbb{Z}^d\} \qquad (9.3)$$
denotes the configuration at time $t$, with $\eta_t(x) = 1$ meaning that vertex $x$ is infected and $\eta_t(x) = 0$ that it is healthy. The dynamics is defined via flip rates
$$c(x,\eta), \qquad x \in \mathbb{Z}^d,\ \eta \in \Omega, \qquad (9.4)$$
playing the role of the rate at which the state of vertex $x$ flips in the configuration $\eta$, i.e.,
$$\eta \to \eta^x \ \text{ at rate } c(x,\eta), \qquad (9.5)$$
with $\eta^x$ the configuration obtained from $\eta$ by changing the state at vertex $x$ (either $0 \to 1$ or $1 \to 0$).
In the contact process these rates are chosen as
$$c(x,\eta) = \begin{cases} \lambda \sum_{y\sim x} \eta(y), & \text{if } \eta(x) = 0,\\ 1, & \text{if } \eta(x) = 1, \end{cases} \qquad \lambda \in (0,\infty), \qquad (9.6)$$
i.e., infected vertices become healthy at rate 1 and healthy vertices become infected at rate $\lambda$ times the number of infected neighbours. The parameter $\lambda$ measures how contagious the infection is. (Because $\mathbb{Z}^d$ is infinite, a little work is needed to show that the contact process is well-defined. Details can be found in Liggett [8].)
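On a finite graph the flip rates (9.6) can be simulated directly with a standard Gillespie scheme: wait an exponential time with the current total rate, then flip a vertex chosen proportionally to its rate. A minimal sketch on a finite ring (this is really the finite-lattice setting of Section 9.2; all names are ours):

```python
import random

def contact_process_ring(n, lam, t_max, rng):
    """Gillespie simulation of the contact process with the rates (9.6) on the
    ring Z/nZ, started from all vertices infected. Returns the extinction
    time, or None if the infection is still alive at time t_max."""
    state = [1] * n  # 1 = infected, 0 = healthy
    t = 0.0
    while True:
        # Rate of each vertex: 1 if infected, lam * (#infected neighbours) if healthy.
        rates = [1.0 if state[x] == 1
                 else lam * (state[(x - 1) % n] + state[(x + 1) % n])
                 for x in range(n)]
        total = sum(rates)
        t += rng.expovariate(total)  # waiting time until the next flip
        if t > t_max:
            return None
        u, acc = rng.random() * total, 0.0
        for x in range(n):           # pick the flipping vertex prop. to its rate
            acc += rates[x]
            if u <= acc:
                state[x] = 1 - state[x]
                break
        if sum(state) == 0:
            return t

t_ext = contact_process_ring(10, 0.2, 1000.0, random.Random(1))
print(t_ext)  # small, since lambda = 0.2 lies far below the critical value
```

For $\lambda$ well below the critical value the infection dies out quickly, while for large $\lambda$ one observes survival up to the time horizon, in line with the phase transition discussed below.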
The configuration space $\Omega$ comes with a natural partial order $\preceq$: we say that $\eta$ is everywhere smaller than $\eta'$, written $\eta \preceq \eta'$, when
$$\eta(x) \le \eta'(x) \qquad \forall\, x \in \mathbb{Z}^d. \qquad (9.7)$$
This order is called partial because some pairs of configurations are ordered while others are not.
9.1.2 Shift-invariance and attractivity
Note that
$$c(x,\eta) = c(x+y, \theta_y\eta) \qquad \forall\, y \in \mathbb{Z}^d, \qquad (9.8)$$
with $\theta_y$ the shift of space over $y$, i.e., $\theta_y\eta$ is the configuration $\eta$ viewed relative to vertex $y$:
$$(\theta_y\eta)(x) = \eta(x-y), \qquad x \in \mathbb{Z}^d. \qquad (9.9)$$
Property (9.8) says that the flip rate at x only depends on the configuration
as seen relative to x, which is natural when the interaction between individuals
is shift-invariant. Also note that
$$\eta \preceq \eta' \;\Longrightarrow\; \begin{cases} c(x,\eta) \le c(x,\eta'), & \text{if } \eta(x) = \eta'(x) = 0,\\ c(x,\eta) \ge c(x,\eta'), & \text{if } \eta(x) = \eta'(x) = 1, \end{cases} \qquad (9.10)$$
a property referred to as attractivity.
CHAPTER 9. EPIDEMIOLOGY
Homework 9.1 Show that the contact process preserves the partial order $\preceq$, i.e., two realisations of the contact process starting from $\eta, \eta'$ with $\eta \preceq \eta'$ can be coupled in such a way that an infinitesimally small time later the two configurations are still everywhere ordered (with probability 1). Hint: Recall the intermezzo about coupling in Chapter 2.
In what follows we will see that properties (9.8) and (9.10) allow for a number
of interesting conclusions about the equilibrium behaviour of the contact process,
as well as its convergence to equilibrium.
9.1.3 Convergence to equilibrium
Write $P_t^{[0]}$ and $P_t^{[1]}$ for the law of $\eta_t$ when $\eta_0 = [0]$, respectively, $\eta_0 = [1]$, where $[0]$ and $[1]$ denote the configurations in which all vertices are healthy, respectively, infected. It can be shown that
$$t \mapsto P_t^{[0]} \ \text{ is stochastically non-decreasing,} \qquad (9.12)$$
$$t \mapsto P_t^{[1]} \ \text{ is stochastically non-increasing.} \qquad (9.13)$$
Consequently, the limits
$$\nu_- = \lim_{t\to\infty} P_t^{[0]}, \qquad \nu_+ = \lim_{t\to\infty} P_t^{[1]}, \qquad (9.14)$$
exist, which are referred to as the lower equilibrium, respectively, the upper equilibrium.
9.1.4 Phase transition
Note that $[0]$ is a trap for the dynamics (if all sites are healthy, then no infection will ever occur), and so we have
$$\nu_- = \delta_{[0]}. \qquad (9.15)$$
It turns out that there is a critical value $\lambda_d \in (0,\infty)$ such that
$$\begin{aligned} \lambda \le \lambda_d:&\quad \nu_+ = \delta_{[0]} \quad \text{(extinction of infection)},\\ \lambda > \lambda_d:&\quad \nu_+ \ne \delta_{[0]} \quad \text{(survival of infection)}. \end{aligned} \qquad (9.16)$$
Thus, for large $\lambda$ there is an epidemic while for small $\lambda$ there is not.
Let $\rho(\lambda)$ denote the density of the infections in $\nu_+$. The critical infection threshold
$$\lambda_d = \inf\{\lambda \in (0,\infty)\colon \rho(\lambda) > 0\} = \sup\{\lambda \in (0,\infty)\colon \rho(\lambda) = 0\} \qquad (9.17)$$
separates the phase of extinction of the infection from the phase of survival of the infection. The function $\lambda \mapsto \rho(\lambda)$ is non-decreasing and continuous (see Fig. 9.1). The continuity at $\lambda = \lambda_d$ is hard to prove.
It is known that
$$2d\,\lambda_d \ge 1, \qquad \lambda_1 < \infty. \qquad (9.18)$$
Since $\lambda_d \le \lambda_1$ (the contact process on $\mathbb{Z}^d$ dominates the one on a copy of $\mathbb{Z}$ embedded in it), these inequalities combine to yield that $\lambda_d \in (0,\infty)$ for all $d \ge 1$, so that the phase transition occurs at a non-trivial value of the infection rate parameter.
Sharp estimates are available for $\lambda_1$, but these require heavy machinery. For instance, it can be shown that the one-dimensional contact process survives when
$$\left(\frac{\lambda}{\lambda+1}\right)^{2}\left(\frac{1}{\lambda+1}\right)^{2/\lambda} > \frac{80}{81}. \qquad (9.19)$$
This yields the bound $\lambda_1 \le 1318$ (see Durrett [4] for details). The true value is $\lambda_1 \approx 1.6494$, which can be shown with the help of simulations and with the help of approximation techniques.
9.2 Finite lattices
Suppose that we consider the contact process on a large finite lattice, say
$$\Lambda_N = [0,N)^d \cap \mathbb{Z}^d, \qquad N \in \mathbb{N}. \qquad (9.20)$$
Since $\Lambda_N$ is finite, the infection almost surely becomes extinct. Let $\tau_{[0]_N}$ denote the extinction time, i.e., the first time at which all vertices in $\Lambda_N$ are healthy, and consider the mean extinction time starting from the fully infected configuration $[1]_N$:
$$E_{[1]_N}\big(\tau_{[0]_N}\big). \qquad (9.21)$$
We expect this time to grow slowly with $N$ when $\lambda < \lambda_d$ and rapidly with $N$ when $\lambda > \lambda_d$, where $\lambda_d$ is the critical infection threshold for $\mathbb{Z}^d$. The following results are shown in Durrett and Liu [5], respectively, Durrett and Schonmann [6]: There exist $C_-(\lambda), C_+(\lambda) \in (0,\infty)$ such that
$$\begin{aligned} \lambda < \lambda_d:&\quad \lim_{N\to\infty} \frac{E_{[1]_N}(\tau_{[0]_N})}{\log|\Lambda_N|} = C_-(\lambda),\\ \lambda > \lambda_d:&\quad \lim_{N\to\infty} \frac{\log E_{[1]_N}(\tau_{[0]_N})}{|\Lambda_N|} = C_+(\lambda). \end{aligned} \qquad (9.22)$$
Thus, in the subcritical phase the time to extinction is logarithmic in the volume
of the lattice (i.e., very slowly increasing with the volume), in the supercritical
phase it is exponential (i.e., very rapidly increasing with the volume). This is a
rather dramatic difference.
Homework 9.3 Give a heuristic explanation for the scaling of the average
extinction time in the two phases.
It can be shown that in the supercritical phase
$$\lim_{N\to\infty} P_{[1]_N}\left(\frac{\tau_{[0]_N}}{E_{[1]_N}(\tau_{[0]_N})} > t\right) = e^{-t}, \qquad t > 0, \qquad (9.23)$$
i.e., the extinction time divided by its mean is asymptotically exponentially distributed with mean 1.
9.3 Finite random graphs
Chatterjee and Durrett [3], Mountford, Mourrat, Valesin and Yao [9] look at the contact process on the configuration model and show that, for every $\lambda \in (0,\infty)$ and every $f$ with $\tau \in (2,\infty)$, the average time to extinction grows exponentially fast with $n$ (the number of vertices). This says that the contact process on the configuration model with a power-law degree distribution is always supercritical: regardless of the value of $\lambda$ the average extinction time grows very rapidly with the size. Apparently, the presence of vertices with large degrees makes it easy for the infection to survive: hubs easily transmit the infection. In Can and Schapira [7] it is shown that the same behaviour occurs for $\tau \in (1,2]$.
Similar results have been obtained for a selected class of other random graphs, such as regular trees and the supercritical Erdős–Rényi random graph.
However, it turns out to be hard to obtain sharp estimates. It would be interesting to understand what happens for the preferential attachment model.
Partial results have been obtained by Berger, Borgs, Chayes and Saberi [1].
9.4 Spread of a rumour
Consider the configuration model with degree distribution $f$. Attach to each edge $e$ an independent random crossing time $X_e$ with distribution EXP(1), draw two vertices $V_1, V_2$ uniformly at random, and let
$$T_n = \inf_{\pi\colon V_1 \to V_2} \sum_{e\in\pi} X_e, \qquad (9.24)$$
where the infimum runs over all paths $\pi$ from $V_1$ to $V_2$. It is shown in Bhamidi,
van der Hofstad and Hooghiemstra [2] that in the supercritical regime $\nu > 1$, for every $\tau \in (3,\infty)$,
$$\lim_{n\to\infty}\, [T_n - \alpha \log n] = Z \quad \text{in distribution,} \qquad (9.25)$$
where $\alpha = 1/(\nu-1)$ and $Z$ is a non-degenerate $\mathbb{R}$-valued random variable. Thus, the rumour needs a time of order $\log n$ to spread through the network, which is plausible because of the small-world property of the configuration model.
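The smallest-weight time between two vertices can be computed with Dijkstra's algorithm, since the rumour follows exactly the path of smallest total crossing time. A toy sketch with EXP(1) edge weights on a small graph (not from the original text; names are ours):

```python
import heapq
import random

def smallest_weight_time(adj, weights, source):
    """Dijkstra: smallest total crossing time from `source` to every vertex.
    adj[v] lists the neighbours of v; weights[(u, v)] (u < v) is the crossing
    time of the edge between u and v."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale heap entry
        for w in adj[v]:
            nd = d + weights[min(v, w), max(v, w)]
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist

# Toy example: a cycle on 4 vertices with i.i.d. EXP(1) crossing times.
rng = random.Random(2)
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
weights = {e: rng.expovariate(1.0) for e in edges}
T = smallest_weight_time(adj, weights, 0)
print(T[2] == min(weights[0, 1] + weights[1, 2], weights[0, 3] + weights[2, 3]))  # True
```

On the 4-cycle the smallest-weight time to the opposite vertex is the minimum over the two available paths, which the algorithm reproduces exactly.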
It is further shown in [2] that in the supercritical regime $\nu > 1$, for every $\tau \in (2,3)$,
$$\lim_{n\to\infty} T_n = Z_1 + Z_2 \quad \text{in distribution,} \qquad (9.26)$$
while the hopcount $H_n$, the number of edges in the smallest-weight path, grows like $\beta \log n$ with
$$\beta = \begin{cases} \dfrac{\nu}{\nu-1}, & \tau \in (3,\infty),\\[2mm] \dfrac{2(\tau-2)}{\tau-1}, & \tau \in (2,3). \end{cases} \qquad (9.27)$$
Since $\beta > 1/\log\nu$ when $\nu > 1$ and $\tau \in (3,\infty)$, we see that the rumour has a tendency to spread along edges with an atypically small crossing time.
Interestingly, both $\alpha$ and $\beta$ decrease with $\nu$ and increase with $\tau$. Indeed, as $\tau$ decreases, the tail of the degree distribution gets thicker and thicker and the network acquires more and more hubs. Consequently, both the typical distance and the typical travel time decrease. For the special case where $f(k) = k^{-\tau}/\zeta(\tau)$, $k \in \mathbb{N}$, we have
$$\nu = \nu(\tau) = \frac{\zeta(\tau-2)}{\zeta(\tau-1)} - 1. \qquad (9.28)$$
Bibliography
[1] N. Berger, C. Borgs, J.T. Chayes and A. Saberi, Asymptotic behavior and distributional limits of preferential attachment graphs, Ann. Probab. 42 (2014) 1–40.
[2] S. Bhamidi, R. van der Hofstad and G. Hooghiemstra, First passage percolation on random graphs with finite mean degrees, Ann. Appl. Probab. 20 (2010) 1907–1965.
[3] S. Chatterjee and R. Durrett, Contact processes on random graphs with power law degree distributions have critical value zero, Ann. Probab. 37 (2009) 2332–2356.
[4] R. Durrett, Lecture Notes on Particle Systems and Percolation, Wadsworth Pub. Co., Belmont CA, USA, 1988.
[5] R. Durrett and X. Liu, The contact process on a finite set, Ann. Probab. 16 (1988) 1158–1173.
[6] R. Durrett and R. Schonmann, The contact process on a finite set II, Ann. Probab. 16 (1988) 1570–1583.
[7] V.H. Can and B. Schapira, Metastability for the contact process on the configuration model with infinite mean degree.
[8] T.M. Liggett, Interacting Particle Systems, Grundlehren der mathematischen Wissenschaften 276, Springer, New York, 1985.
[9] T. Mountford, J.-C. Mourrat, D. Valesin and Q. Yao, Exponential extinction time of the contact process on finite graphs, to appear.
Chapter 10
Pattern detection in
networks
In Chapter 5 we introduced various network ensembles built according to the
Maximum Entropy principle. In this chapter, we are going to use those ensembles as null models that allow us to detect empirical patterns in real-world networks. Such patterns are defined as statistically significant deviations from the
prediction of maximum-entropy ensembles, and reveal the presence of higher-order mechanisms that cannot be explained by the null models themselves.
Since this procedure requires maximum-entropy models to be fitted to empirical data, we will first introduce an important and powerful statistical criterion,
namely the Maximum Likelihood principle, and apply it to network models. We
will then describe a pattern detection method based on this principle.
10.1 The Maximum Likelihood principle
As we have already discussed a number of times, one of the main goals in the
study of complex networks is that of reproducing the empirical topological properties of real-world networks by means of relatively simple theoretical models.
In general, given a real-world network and a mathematical model of a graph,
we need to tune the free parameters of the model to those values that optimally
reproduce the empirical properties of the network. Usually, this is done by selecting one or more target topological properties and looking for the parameter
values that make the expected value of these properties match the corresponding
observed value. But since we can target virtually as many topological properties as we want, and surely many more than the number of model parameters,
it is important to understand whether this choice is really arbitrary, or whether a
statistically correct criterion exists which selects a unique parameter value.
In this section we show that the Maximum Likelihood (ML) method, which
has a rigorous statistical basis, allows one to address this problem successfully.
We show that the ML criterion also yields an unbiased way to correctly randomize a network, overcoming the structural bias introduced by other methods.
10.1.1 Motivation
In general, any network model depends on a set of parameters that we collectively denote by the vector $\vec\theta$. Let $P(G|\vec\theta)$ be the conditional probability of occurrence of a graph with adjacency matrix $G$, in the set of graphs spanned by the model, once the parameters are set to the value $\vec\theta$. For a given target topological property $\pi(G)$ displayed by a graph $G$ (in general a function of the matrix $G$), or a set of target properties $\{\pi_i(G)\}_i$, network models provide us with the expected values $\langle\pi_i\rangle_{\vec\theta}$ obtained as ensemble averages:
$$E_{\vec\theta}(\pi_i) \equiv \langle\pi_i\rangle_{\vec\theta} = \sum_G \pi_i(G)\,P(G|\vec\theta). \qquad (10.1)$$
When comparing the model with a particular real-world network $G^*$, one might in principle derive (analytically or via numerical simulations) the dependence of $E_{\vec\theta}(\pi_i)$ on $\vec\theta$ and then look for the matching value $\vec\theta_M$ of the parameters $\vec\theta$ that realizes the equality
$$E_{\vec\theta_M}(\pi_i) = \pi_i(G^*) \qquad \forall i. \qquad (10.2)$$
In general, the above system of equations might not admit a (unique) solution. And even if it does, is the criterion leading to Eq. (10.2) statistically correct? Finally, which target properties have to be chosen anyway?
To concretely illustrate some of the above limitations, we can use again the simple example we considered in Sec. 5.1. We assume that a real network $G^*$ with $n$ vertices and $L_u^* \equiv L_u(G^*)$ undirected links (see Eq. (4.12)) is compared with an Erdős–Rényi random graph model where the only (unknown) parameter is the uniform connection probability $\theta = p$. In the literature, a common choice for the matching value $p_M$ of the parameter $p$ is the one ensuring that the expected number of links $\langle L_u\rangle_p = n(n-1)p/2$ equals the empirical value $L_u^*$, which yields
$$p_M = \frac{2L_u^*}{n(n-1)} \qquad (10.3)$$
as in Eq. (5.1). Clearly, choosing the average empirical degree $\bar k^* = 2L_u^*/n$ (see Eq. (4.11)) or the link density $c_u^* = 2L_u^*/n(n-1)$ (see Eq. (4.17)) as target properties yields exactly the same value for $p_M$. However, different choices of target properties would in general result in a different value for $p_M$. For instance, if the target property was taken to be the average clustering coefficient $\bar C^*$ defined in Eq. (4.32), then one would get
$$p_M = \bar C^*, \qquad (10.4)$$
since the expected value of the clustering coefficient in the Erdős–Rényi model coincides with the connection probability $p$.¹
¹ We recall from Chapter 4 that the clustering coefficient $C_i$ defined in Eqs. (4.29) and (4.30) can be viewed as the probability that two randomly chosen neighbours of vertex $i$ are connected to each other. Thus the average clustering coefficient $\bar C$ defined in Eq. (4.32) can be viewed as the probability that any two vertices sharing a common neighbour are mutually connected.
10.1.2 Generalities
In general, consider a (discrete for simplicity) random variable $V$ whose probability distribution $f(v|\theta)$ (defined as the probability that $V = v$) depends on a parameter $\theta$. For a physically realized outcome $V = v^*$, $f(v^*|\theta)$ represents the likelihood that such outcome is generated by the model with parameter choice $\theta$. Therefore, for fixed $v^*$, the optimal choice for $\theta$ is the value $\theta^*$ maximizing $f(v^*|\theta)$. It is often simpler to define the log-likelihood function $\lambda(\theta) \equiv \log f(v^*|\theta)$ and maximize it, which gives the same value $\theta^*$ for the maximum.
The ML approach reverses the role of data and parameters, and makes the
latter subject to the former, thus achieving optimal inference from the empirical knowledge available. This method avoids the drawbacks of other fitting
methods, such as the subjective choice of fitting curves and of the region where
the fit is performed. This is particularly important in the case of networks
and other systems exhibiting broad empirical distributions which may look like
power laws with a certain exponent (which is also subject to statistical error)
in some region, but which may be more closely reproduced by a different value
of the exponent or even by different curves as the fitting region is changed.
By contrast, the ML approach always yields a unique and statistically rigorous
parameter choice.
In the context of network modelling, where we typically have a model generating a graph $G$ with probability $P(G|\vec\theta)$, the log-likelihood that a real network $G^*$ is generated is
$$\lambda(\vec\theta) \equiv \log P(G^*|\vec\theta). \qquad (10.5)$$
The optimal parameter value $\vec\theta^*$ is found by solving
$$\left.\frac{\partial\lambda(\vec\theta)}{\partial\vec\theta}\right|_{\vec\theta = \vec\theta^*} = \vec 0 \qquad (10.6)$$
and checking the second derivatives to be negative in order to ensure that this indeed corresponds to a maximum. Among all the possible matching values $\{\vec\theta_M\}$ for the parameters $\vec\theta$, the one preferred by the ML principle is $\vec\theta^*$.
Throughout the rest of this chapter, the empirical value of a network property $X(G)$ measured on a real network $G^*$ is denoted with an asterisk, i.e. $X^* \equiv X(G^*)$, and the value of the parameter that maximizes the likelihood, given the data, is also denoted with an asterisk, i.e. $\vec\theta^*$. This reminds us that the parameters are fixed by the data, and simplifies the full notation $\vec\theta^* = \arg\max_{\vec\theta} \ln P(G^*|\vec\theta)$, which illustrates that $\vec\theta^*$ is ultimately a function of $G^*$.
10.1.3 Erdős–Rényi random graph
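For the Erdős–Rényi model the ML recipe can be checked numerically: the Bernoulli log-likelihood $\lambda(p) = L_u^* \log p + \big[\binom{n}{2} - L_u^*\big]\log(1-p)$ is maximized exactly at $p^* = 2L_u^*/n(n-1)$, i.e. the ML value coincides with the matching value (10.3). A minimal sketch (the example matrix is made up):

```python
import math

def er_loglikelihood(adj, p):
    """Bernoulli log-likelihood of an observed graph under the Erdos-Renyi
    model, in which every pair is linked independently with probability p."""
    n = len(adj)
    ll = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            ll += adj[i][j] * math.log(p) + (1 - adj[i][j]) * math.log(1 - p)
    return ll

# A small example graph: 4 vertices, 3 links.
adj = [[0, 1, 1, 0],
       [1, 0, 1, 0],
       [1, 1, 0, 0],
       [0, 0, 0, 0]]
n, L = 4, 3
p_star = 2 * L / (n * (n - 1))  # analytic ML value, cf. Eq. (10.3)
grid = [i / 100 for i in range(1, 100)]
best = max(grid, key=lambda p: er_loglikelihood(adj, p))
print(p_star, best)  # 0.5 0.5
```

The grid scan recovers the analytic maximizer, illustrating that for this model the ML criterion and the link-matching criterion select the same parameter value.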
10.1.4 Models with independent links
In the rest of the chapter, we will often consider a more general class of models, obtained when the links between all pairs of vertices $i,j$ are drawn with different and independent probabilities $p_{ij}(\vec\theta)$, where $0 < p_{ij}(\vec\theta) < 1$. Note that this class includes some examples discussed in Chapter 5. In this case
$$P(G^*|\vec\theta) = \prod_{i<j} p_{ij}(\vec\theta)^{a^*_{ij}}\,[1 - p_{ij}(\vec\theta)]^{1-a^*_{ij}}, \qquad (10.8)$$
where the $a^*_{ij}$'s are the entries of the adjacency matrix of the graph $G^*$, and Eq. (10.5) becomes
$$\lambda(\vec\theta) = \sum_{i<j} a^*_{ij} \log\frac{p_{ij}(\vec\theta)}{1 - p_{ij}(\vec\theta)} + \sum_{i<j} \log[1 - p_{ij}(\vec\theta)]. \qquad (10.9)$$
A biased example
For instance, let us consider a modified version of the Chung–Lu model we introduced in Eq. (5.5), where now $(2L_u^*)^{-1}$ is replaced by a free parameter $z$:
$$p_{ij}(z) = z\,k^*_i k^*_j. \qquad (10.10)$$
In principle, we would like to find that, among the possible values for the parameter $z$, the optimal one is precisely $z^* = (2L_u^*)^{-1}$, because we already know from Sec. 5.2.3 that this choice would ensure that the expected number of links $\langle L_u\rangle$ coincides with the observed value $L_u^*$, just like in the previous example for the Erdős–Rényi model, and also that the expected degree sequence will coincide with the observed one. Let us see whether this can actually be achieved.
Homework 10.2 For the model defined by Eq. (10.10), write the log-likelihood $\lambda(z) \equiv \log P(G^*|z)$ as in Eq. (10.9) and show that the ML criterion leads to the parameter value $z^*$ defined by the equation
$$L_u^* = \sum_{i<j} (1 - g^*_{ij})\,\frac{z^* k^*_i k^*_j}{1 - z^* k^*_i k^*_j}, \qquad (10.11)$$
which is in general different from the value of $z$ for which the expected number of links matches the observed one, i.e.,
$$\langle L_u\rangle = \sum_{i<j} p_{ij}(z^*) = \sum_{i<j} z^* k^*_i k^*_j. \qquad (10.12)$$
The model defined by Eq. (10.10) is therefore biased. Consider instead the model defined by
$$p_{ij}(z) = \frac{z\,x_i x_j}{1 + z\,x_i x_j}, \qquad z > 0, \quad x_i > 0, \qquad (10.13)$$
where the positive values {xi } are assumed to be fixed for the moment, while z
is a free parameter.
Homework 10.3 For the model defined by Eq. (10.13), show that the ML criterion leads to the parameter choice $z^*$ defined by the equation
$$L_u^* = \sum_{i<j} \frac{z^* x_i x_j}{1 + z^* x_i x_j}. \qquad (10.14)$$
The above exercise shows that the model defined by Eq. (10.13) is unbiased: the ML condition (10.6) and the requirement $\langle L_u\rangle = L_u^*$ are now equivalent, just like in the Erdős–Rényi model.
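Eq. (10.14) is a single equation in a single unknown, and its right-hand side is increasing in $z$, so $z^*$ can be found by bisection. A minimal sketch (the fitness values and the target number of links are made up for illustration):

```python
def expected_links(z, x):
    """<L_u> = sum_{i<j} z x_i x_j / (1 + z x_i x_j), cf. Eq. (10.14)."""
    n = len(x)
    return sum(z * x[i] * x[j] / (1 + z * x[i] * x[j])
               for i in range(n) for j in range(i + 1, n))

def solve_z(x, L_target):
    """Bisection on z: the right-hand side of Eq. (10.14) is increasing in z."""
    lo, hi = 0.0, 1.0
    while expected_links(hi, x) < L_target:
        hi *= 2.0  # grow the bracket until the target is enclosed
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if expected_links(mid, x) < L_target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

x = [3.0, 2.0, 2.0, 1.0, 1.0]  # hypothetical fitness values x_i
L_target = 4.0                 # hypothetical observed number of links L_u
z_star = solve_z(x, L_target)
print(abs(expected_links(z_star, x) - L_target) < 1e-6)  # True
```

Because the model is unbiased, the $z^*$ found this way is at the same time the ML estimate and the value matching the expected number of links.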
10.2 Pattern detection
Detecting patterns in real-world networks means identifying nontrivial structural properties, i.e. properties that cannot be explained by simple random
graph models and therefore indicate the presence of nontrivial mechanisms
of network formation. One way of detecting such patterns is by isolating
the higher-order empirical topological properties that are not simply explained
by lower-order ones (for a discussion of the notion of first-order properties,
second-order properties, and so on, please refer to chapter 4). In principle,
there is no a-priori preferred level of organization separating higher-order from
lower-order properties. However, an accepted criterion is that of considering
local (i.e. first-order) properties as the fundamental building blocks of the network organization, because these properties (such as node degree, see chapter
4) are likely to be directly affected by basic properties of nodes, including non-topological properties such as (depending on the nature of the network) size,
wealth, importance, popularity, etc. Since such properties are usually very heterogeneously distributed over nodes, it is important to control for their effects
by comparing the real network with a maximum-entropy ensemble having the
same local properties.
For this reason, the maximum-entropy ensembles of graphs introduced in
Chapter 5 play an important role in network analysis, and are systematically
used as a benchmark, or null model, for real-world networks. However, this
requires that we make an important step. While a purely mathematical approach to the models introduced in Chapter 5 allows us to easily generate and
characterize such graph ensembles analytically (for instance by assuming some
probability distribution for the model parameters and studying the resulting
network properties), such an approach does not allow us to study ensembles of
graphs whose realized properties coincide precisely with the observed properties
of a real-world network. In this section we are therefore forced to reverse the
perspective and take a statistical approach, where we use the ML principle to
make rigorous inference starting from the empirical knowledge available.
This will be achieved by first fitting the model ensemble to the data using
the ML principle, and then using the fitted ensemble to provide the expectation
values and standard deviations of various higher-order properties. We will then
compare these model predictions with the empirical data. Empirical properties that are significantly (i.e. by many standard deviations) different from the
expectations are interpreted as nontrivial patterns, not explained by the local
properties defining the maximum-entropy null model. An important feature of
this method is that, although one can in principle measure the averages and
standard deviations of higher-order properties by sampling many graphs from
the maximum-entropy ensemble using the graph probability $P(G|\vec\theta^*)$, the averages and standard deviations are instead calculated analytically (with no need to
sample the ensemble), resulting in a method that is both unbiased and fast.
10.2.1 Undirected graphs
The simplest ensemble where the local properties of all nodes are controlled for is the configuration model (CM), which we introduced in Chapters 3 and 5. As we discussed in detail in Chapter 5, it is not easy to ensure that, starting from an empirical degree sequence, the CM is implemented in a way that does
As we discussed in detail in Chapter 5, it is not easy to ensure that, starting
from an empirical degree sequence, the CM is implemented in a way that does
not lead to a biased construction. In the rest of this chapter, we show how the
results discussed in Chapter 5, together with the maximum likelihood principle
discussed in Sec. 10.1, allow us to define a method to compare the observed properties of a real-world network with those of a CM that is fitted precisely
on the empirical node degrees.
To this end, we consider two exercises that help us appreciate once more the
subtleties of the implementations of the CM.
Homework 10.4 Consider the model defined by
$$p_{ij}(\vec x) = x_i x_j, \qquad x_i > 0 \quad \forall i, \qquad (10.15)$$
and show that the parameter values selected by the ML criterion do not, in general, make the expected degree sequence coincide with the empirical one.
Homework 10.5 Consider instead the model defined by
$$p_{ij}(\vec x) = \frac{x_i x_j}{1 + x_i x_j}, \qquad x_i > 0 \quad \forall i. \qquad (10.16)$$
Show that the ML principle leads to the parameter value $\vec x^*$ defined as the solution of the following set of $n$ nonlinear coupled equations:
$$k^*_i = \sum_{j\ne i} \frac{x^*_i x^*_j}{1 + x^*_i x^*_j} \qquad \forall i, \qquad (10.17)$$
thus proving that in this model the expected degree sequence $\langle\vec k\rangle$ coincides precisely with the empirical one $\vec k^*$.
Equations (10.15) and (10.16) in the above two exercises are in some sense a generalization, to the case of $n$ free parameters, of Eqs. (10.10) and (10.13) respectively. The presence of $n$ parameters allows us (at least in principle) to tune the expected value of each degree of the network to the observed value. However, the exercises above instruct us that only the model defined by Eq. (10.16) can do so in a manner that is compatible with the ML principle. Importantly, Eq. (10.17) is formally identical to Eq. (5.14), but (as already noted when discussing the latter) there can be opposite interpretations of such formulae. In Chapter 5 we mainly used Eq. (5.14) as a way to infer the expected degrees, given some theoretical probability distribution of the values of $\vec x$. Note that this operation is very simple, as it only requires summing or integrating the values of the connection probability (over all nodes except $i$), see Sec. 5.2. By contrast, in Eq. (10.17) the degrees are fixed by observation, and the parameters $\vec x$ have to be found accordingly. This complicates the problem significantly, because now $\vec x^*$ is the solution of $n$ nonlinear coupled equations.
² Note that the model in Eq. (10.15) coincides with the Chung–Lu implementation of the CM, as already noted in our discussion of Eq. (5.16).
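In practice the $n$ coupled equations (10.17) are usually solved numerically, e.g. by the fixed-point iteration $x_i \leftarrow k^*_i / \sum_{j\ne i} x_j/(1 + x_i x_j)$, which follows from rearranging Eq. (10.17). A sketch with a hypothetical degree sequence (a damping step is added for stability; no convergence guarantees are claimed):

```python
def fit_configuration_model(degrees, n_iter=2000):
    """Damped fixed-point iteration for Eq. (10.17):
    x_i <- k_i / sum_{j != i} x_j / (1 + x_i x_j)."""
    n = len(degrees)
    x = [float(k) for k in degrees]  # rough starting point
    for _ in range(n_iter):
        new = [degrees[i] / sum(x[j] / (1 + x[i] * x[j])
                                for j in range(n) if j != i)
               for i in range(n)]
        x = [(a + b) / 2 for a, b in zip(x, new)]  # damping for stability
    return x

def expected_degrees(x):
    n = len(x)
    return [sum(x[i] * x[j] / (1 + x[i] * x[j])
                for j in range(n) if j != i) for i in range(n)]

degrees = [3, 2, 2, 2, 1]  # hypothetical degree sequence
x_star = fit_configuration_model(degrees)
print([round(v, 4) for v in expected_degrees(x_star)])  # ~ [3.0, 2.0, 2.0, 2.0, 1.0]
```

At the fixed point the expected degrees reproduce the empirical ones, as Eq. (10.17) requires.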
Since the right-hand side of Eq. (10.17) depends on $i$ only through the empirical degree $k^*_i$, vertices with the same degree get the same parameter value, and we may write $x_k$ for the common value of the vertices with degree $k$. Eq. (10.17) then reduces to
$$k = n \sum_{k'} P(k')\,\frac{x_k x_{k'}}{1 + x_k x_{k'}} \;-\; \frac{(x_k)^2}{1 + (x_k)^2}, \qquad (10.18)$$
where $P(k)$ is the empirical degree distribution, so that $nP(k)$ is the number of vertices with degree $k$, and $k, k'$ take only the empirical values of the degrees. The last term in the above equation removes the self-contribution of a vertex to its own degree.
Calculating averages and standard deviations
Once the parameters $\vec x^*$ are found, we can put them back into $p_{ij}(\vec x)$ and use the resulting $p^*_{ij} \equiv p_{ij}(\vec x^*)$ to calculate the expected value $\langle X\rangle^*$ of any higher-order
property X(G) of interest. In general, this can be done analytically only if X(G)
is a linear function of the entries {gij } of the adjacency matrix of G. However, if
X(G) has some nonlinear dependence on the constraints (i.e. the degrees {ki (G)}
in the case of the CM), then we can approximate the value of such constraints as
constant and equal to the empirical value (i.e. $k_i(G) \approx k^*_i$). This is because the maximum-entropy ensemble has been constructed precisely in order to keep the value
of those constraints as close as possible to the empirical value. While this is
in general rigorously true only in microcanonical ensembles, one expects that
the fluctuations of the constraints in canonical ensembles, although nonzero, are
still much smaller than the fluctuations of any other (unconstrained) quantity.
Therefore, to calculate the expectation values of the higher-order properties
of interest, we will treat the value of the degrees as fixed and equal to the
empirical value. In particular, the expectation value of the ANND defined in
Eq. (4.21) will be approximated as
$$E^*(k_i^{nn}) \equiv \langle k_i^{nn}\rangle^* \approx \frac{\sum_{j\ne i} \langle g_{ij}\rangle^* \sum_{k\ne j} \langle g_{jk}\rangle^*}{\sum_{j\ne i} \langle g_{ij}\rangle^*}, \qquad (10.19)$$
where $\langle g_{ij}\rangle^* = p^*_{ij}$. Similarly, the expectation value of the clustering coefficient defined in Eq. (4.30) is
$$E^*(C_i) \equiv \langle C_i\rangle^* \approx \frac{\sum_{j\ne i} \sum_{k\ne i,j} \langle g_{ij}\rangle^* \langle g_{jk}\rangle^* \langle g_{ki}\rangle^*}{\sum_{j\ne i} \sum_{k\ne i,j} \langle g_{ij}\rangle^* \langle g_{ki}\rangle^*}. \qquad (10.20)$$
Using the same sort of approximation, it is also possible to estimate the standard deviation
$$\sigma^*[X] \equiv \sqrt{\langle X^2\rangle^* - (\langle X\rangle^*)^2}$$
of the properties of interest. This provides us with an error bar with which we can distinguish between properties that are statistically consistent with the expectations of the null model and properties which are not consistent, thus representing higher-order patterns.
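For a property $X$ that is a sum of link indicators, both $\langle X\rangle^*$ and $\sigma^*[X]$ follow from the $p^*_{ij}$ alone (independent Bernoulli variables), and the comparison with the empirical value is summarised by the z-score $(X^* - \langle X\rangle^*)/\sigma^*[X]$. A sketch for the simplest such property, the number of links between two groups of vertices (the groups and matrices are made up):

```python
import math

def z_score(adj, p, A, B):
    """z = (X - <X>)/sigma[X] for X = number of links between groups A and B,
    with links treated as independent Bernoulli(p_ij) variables."""
    x_obs = sum(adj[i][j] for i in A for j in B)
    mean = sum(p[i][j] for i in A for j in B)
    var = sum(p[i][j] * (1 - p[i][j]) for i in A for j in B)
    return (x_obs - mean) / math.sqrt(var)

# Hypothetical 4-vertex example with uniform p_ij = 0.5.
adj = [[0, 1, 1, 1],
       [1, 0, 1, 1],
       [1, 1, 0, 0],
       [1, 1, 0, 0]]
p = [[0.5] * 4 for _ in range(4)]
print(z_score(adj, p, A=[0, 1], B=[2, 3]))  # 2.0: two st.dev. above expectation
```

A large |z| flags a pattern that the null model with the given $p^*_{ij}$ cannot account for.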
132
a
1.0
120
100
80
60
40
20
0
<c>, c, c
0.8
0.6
0.4
0.2
0.0
0
d
0.4
<c>, c, c
30
25
20
15
10
5
0
0.3
0.2
0.1
0.0
10
20
30
40
50
10
20
40
50
e
25
0.14
0.12
0.10
0.08
0.06
0.04
0.02
0.00
20
<c>, c, c
30
k
15
10
5
0
0
10
20
30
k
40
50
10
20
30
k
40
50
h
1.0
200
<k nn>, k nn, k nn
0.8
<c>, c, c
150
100
50
0.6
0.4
0.2
0.0
0
0
50
100
k
150
200
50
150
200
i
600
500
400
300
200
100
0
100
k
<c>, c
<k nn>, k nn
0.15
0.10
0.05
0.00
0
500
50
133
Empirical results
In Fig. 10.1 we show an application of the ML method to the analysis of various
networks, namely the network of the 500 largest US airports [33], a synaptic
network [34], two protein interaction networks [35], an interbank network [36]
and the Internet at the Autonomous Systems level [37]. These are among the
most studied networks of this type.
We compare the correlation structure of the original networks, as ordinarily measured by the dependence of $k_i^{nn}(G^*)$ and $C_i(G^*)$ on $k^*_i$ (see Chapter 4), with the expected values $\langle k_i^{nn}\rangle^*$ and $\langle C_i\rangle^*$ obtained analytically using the ML method. We also highlight the region within one standard deviation around the average by plotting the curves $\langle k_i^{nn}\rangle^* \pm \sigma^*[k_i^{nn}]$ and $\langle C_i\rangle^* \pm \sigma^*[C_i]$.
For the sake of comparison, we also report the average values obtained by sampling the microcanonical ensemble with the local rewiring algorithm [11, 12] (see Chapter 5), and the expected values over the ensemble of random graphs with the same number of links (corresponding to the Erdős–Rényi random graph model). It should be noted that the microcanonical method requires the generation of many randomized networks (each obtained after many rewiring steps), the measurement of $k_i^{nn}$ and $C_i$ on each network separately, plus a final averaging. By contrast, the ML method only requires the preliminary estimation of the $\{x_i\}$. Then the calculation of $\langle k_i^{nn}\rangle^*$ and $\langle C_i\rangle^*$ is analytical and takes
exactly the same time as that of the empirical values. As can be seen, the two
approaches yield very similar results most of the time. When they differ, the
deviations are presumably due to the fact that, as we mentioned in Chapter 5,
the microcanonical implementation of the CM is biased. For the two largest
networks (the protein interactions in S. cerevisiae and the Internet), we only
report the expectations obtained using the ML method, as the microcanonical
approach would require too much computing time.
The results shown in Fig. 10.1 allow us to interpret the effect of the degree sequence on higher-order properties. Firstly, the trends displayed by the CM are not flat like those expected in the Erdős–Rényi case. This confirms that
residual structural correlations, simply due to the enforced constraint, are still
present after the rewiring has taken place. The presence of these correlations
does not require any additional explanation besides the existence of the constraints themselves. This is very different from the picture one would get by
using the (wrong) expectation of Eq. (10.15) which would yield flat trends as
well, naively suggesting that correlations can never be traced back to the degree
sequence alone.
Secondly, while the trends observed in all the networks considered are always decreasing, they unveil different correlation patterns when compared to
the randomized trends. The real interbank data are almost indistinguishable
from the randomized curves, meaning that structural constraints can fully explain the observed behaviour of higher-order network properties. Instead, in the
airport network the randomized curves lie below the real data (except for an
opposite trend of $\langle k_i^{nn}\rangle^*$ for low degrees). This means that the real network is
more correlated than the baseline randomized expectation, and indicates that
additional mechanisms producing positive correlations must be present on top
of structural effects. By contrast, in the H. pylori protein network the expected curves lie above the real data, suggesting the presence of mechanisms
producing negative correlations. The same is true for the correlation structure
10.2.2 Directed graphs
Homework 10.6 Consider the directed counterpart of the model in Eq. (10.16), in which every vertex $i$ carries two parameters $x_i, y_i$ and every directed link from $i$ to $j$ is drawn independently with probability
$$p_{ij}(\vec x,\vec y) = \frac{x_i y_j}{1 + x_i y_j}, \qquad x_i > 0,\ y_i > 0 \quad \forall i. \qquad (10.21)$$
Write the log-likelihood of the model (taking care of the fact that the network is directed) and show that the ML principle requires that these parameters are set to the particular values $\vec x^*, \vec y^*$ that solve the following set of $2n$ coupled nonlinear equations:
$$\sum_{j\ne i} \frac{x^*_i y^*_j}{1 + x^*_i y^*_j} = k_i^{out}(G^*) \qquad \forall i, \qquad (10.22)$$
$$\sum_{j\ne i} \frac{x^*_j y^*_i}{1 + x^*_j y^*_i} = k_i^{in}(G^*) \qquad \forall i, \qquad (10.23)$$
thus proving that the expected in- and out-degree sequences coincide precisely with the empirical ones.
As in the undirected case, the quantities $\vec x^*, \vec y^*$ allow us to obtain $\langle X\rangle^*$ and $\sigma^*[X]$ analytically and quickly, without sampling the ensemble explicitly.
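Eqs. (10.22)–(10.23) can be solved by a damped fixed-point iteration on the $2n$ parameters, alternating $x_i \leftarrow k_i^{out}/\sum_{j\ne i} y_j/(1 + x_i y_j)$ and $y_i \leftarrow k_i^{in}/\sum_{j\ne i} x_j/(1 + x_j y_i)$. A sketch with hypothetical in- and out-degree sequences (no convergence guarantees are claimed):

```python
def fit_directed(k_out, k_in, n_iter=3000):
    """Damped fixed-point iteration for Eqs. (10.22)-(10.23)."""
    n = len(k_out)
    x = [float(k) for k in k_out]
    y = [float(k) for k in k_in]
    for _ in range(n_iter):
        nx = [k_out[i] / sum(y[j] / (1 + x[i] * y[j]) for j in range(n) if j != i)
              for i in range(n)]
        ny = [k_in[i] / sum(x[j] / (1 + x[j] * y[i]) for j in range(n) if j != i)
              for i in range(n)]
        x = [(a + b) / 2 for a, b in zip(x, nx)]  # damping for stability
        y = [(a + b) / 2 for a, b in zip(y, ny)]
    return x, y

k_out = [2, 1, 1, 1]  # hypothetical out-degrees
k_in = [1, 1, 1, 2]   # hypothetical in-degrees (same total)
x, y = fit_directed(k_out, k_in)
exp_out = [sum(x[i] * y[j] / (1 + x[i] * y[j]) for j in range(4) if j != i)
           for i in range(4)]
print([round(v, 4) for v in exp_out])  # ~ [2.0, 1.0, 1.0, 1.0]
```

Note that the model has a scale invariance ($x_i \to c\,x_i$, $y_i \to y_i/c$ leaves all $p_{ij}$ unchanged), so only the expected degrees, not the individual parameters, are uniquely determined.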
Empirical results
We can now apply the method to various directed networks, by studying the second-order topological properties measured by the outward ANND and the inward ANND defined in Eq. (4.24), whose expected values are approximated, in analogy with Eq. (10.19), as
$$E^*(k_i^{nn,out}) \equiv \langle k_i^{nn,out}\rangle^* \approx \frac{\sum_{j\ne i} \langle a_{ij}\rangle^* \sum_{k\ne j} \langle a_{jk}\rangle^*}{\sum_{j\ne i} \langle a_{ij}\rangle^*}, \qquad (10.24)$$
$$E^*(k_i^{nn,in}) \equiv \langle k_i^{nn,in}\rangle^* \approx \frac{\sum_{j\ne i} \langle a_{ji}\rangle^* \sum_{k\ne j} \langle a_{kj}\rangle^*}{\sum_{j\ne i} \langle a_{ji}\rangle^*}, \qquad (10.25)$$
where $\langle a_{ij}\rangle^* = p_{ij}(\vec x^*, \vec y^*)$.
[Figure 10.2: Panels (a)–(e): outward and inward average nearest-neighbour degrees $k^{nn}_{out}$, $k^{nn}_{in}$ as functions of $k_{in}$ and $k_{out}$, comparing the empirical values with the expectations and their one-standard-deviation bands.]
In Fig. 10.2 we plot the observed values $k_i^{nn,in}(G^*)$ versus $k_i^{in}(G^*)$ and $k_i^{nn,out}(G^*)$ versus $k_i^{out}(G^*)$, as well as the expectations $\langle k_i^{nn,in}\rangle^* \pm \sigma^*[k_i^{nn,in}]$ and $\langle k_i^{nn,out}\rangle^* \pm \sigma^*[k_i^{nn,out}]$ obtained using the ML method, for three real directed networks: the neural network of C. elegans [34] (now in its directed version), the metabolic network of E. coli [38], and the Little Rock Lake food web [39]. As before, we also show the microcanonical average obtained using the LRA and the expectation under the directed Erdős–Rényi random graph model (DRG) with the same number of links. Again, we find a very good agreement between the two approaches, but the ML method yields the correct prediction in a much shorter time. We also confirm that while some networks (C. elegans and E. coli) are almost consistent with the null model, others (Little Rock) deviate significantly.
However, the most interesting point for the present analysis is that, while for the undirected networks considered above all randomized trends were decreasing, in this case we find that the three randomized trends behave in totally different ways. In the neural network, both $\langle k_i^{nn,in}\rangle^*$ and $\langle k_i^{nn,out}\rangle^*$ are approximately constant. This means that the baseline behaviour for both quantities is flat and uncorrelated (as in the directed random graph, but at a different level). By contrast, in the metabolic network the expected curves are decreasing, and thus the ensemble of randomized networks is disassortative, as for the undirected graphs considered above. Finally, in the food web the constraints enforce unusual positive correlations, and the randomized ensemble is even assortative. Interestingly, while random networks with specified degrees are expected to display a disassortative behaviour [12, 15], the assortative trend found here is totally surprising. This is because the ML method extracts the hidden variables directly from the specific real-world network, rather than drawing them from ad hoc distributions. The resulting values can be distributed in a very complicated fashion, invalidating the results obtained under other hypotheses.
To further highlight this important point, we can select three more food webs characterized by a particularly small size (see fig.10.3). Small networks cannot be described by approximating the probability mass function of their topological properties (such as the degree) with a continuous probability density, as in the Park-Newman approach described in Chapter 5. Therefore in this case the difference between the expectations obtained by drawing the $\vec{x}$ and $\vec{y}$ values from analytically tractable continuous distributions and those obtained by solving eqs.(10.23) using the empirical degrees is particularly evident. As we show in fig.10.3 (where for simplicity we omit the comparison with the LRA), we confirm that the (directed) CM can display not only flat or decreasing trends, but also increasing ones. Importantly, in this case none of the three webs deviates dramatically from the null model. This means that while one would be tempted to interpret the three observed trends as signatures of different patterns (zero, negative and positive correlation), in all three cases the observed behavior can actually be roughly replicated by the same mechanism and almost entirely traced back to the degree sequence alone. This unexpected result highlights once again that the measured values of any topological property are per se uninformative, and can only be interpreted in relation to a null model.
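As a concrete companion to the quantities plotted in fig.10.2, the sketch below (ours, on an invented toy edge list, not from the text) computes in- and out-degrees and the directed average nearest-neighbour degrees. The pairing of directions follows one common convention, which may differ from the authors' exact definition.

```python
# Minimal sketch: directed degrees and directed ANND from an edge list.
from collections import defaultdict

def directed_annd(edges):
    """Return {i: (k_in, k_out, k_nn_in, k_nn_out)}, where k_nn_in is the
    average in-degree of i's successors and k_nn_out the average out-degree
    of i's predecessors (one common convention among several)."""
    k_in, k_out = defaultdict(int), defaultdict(int)
    succ, pred = defaultdict(list), defaultdict(list)
    nodes = set()
    for i, j in edges:
        k_out[i] += 1; k_in[j] += 1
        succ[i].append(j); pred[j].append(i)
        nodes |= {i, j}
    out = {}
    for i in nodes:
        knn_in = sum(k_in[j] for j in succ[i]) / len(succ[i]) if succ[i] else 0.0
        knn_out = sum(k_out[j] for j in pred[i]) / len(pred[i]) if pred[i] else 0.0
        out[i] = (k_in[i], k_out[i], knn_in, knn_out)
    return out

edges = [(0, 1), (1, 0), (0, 2), (2, 3), (3, 0), (1, 3)]
annd = directed_annd(edges)
```

Plotting k_nn_in against k_in for all vertices reproduces the kind of scatter shown in the figures.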
Reciprocity and motifs
So far, in our analysis of directed networks we have considered second-order
topological properties. In principle, third-order properties can be studied by introducing a null model that constrains not only the incoming and outgoing links of each vertex, but also the number $k_i^\leftrightarrow \equiv \sum_{j \neq i} a_{ij} a_{ji}$ of reciprocated links (pairs of links pointing in both directions) [43, 44]. This specification is equivalent to enforcing, for each vertex $i$, the three quantities [43, 9]

$$k_i^\rightarrow \equiv \sum_{j \neq i} a_{ij}(1 - a_{ji}) \quad \text{(number of non-reciprocated outgoing links)},$$

$$k_i^\leftarrow \equiv \sum_{j \neq i} a_{ji}(1 - a_{ij}) \quad \text{(number of non-reciprocated incoming links)},$$

$$k_i^\leftrightarrow \equiv \sum_{j \neq i} a_{ij} a_{ji} \quad \text{(number of reciprocated links)}.$$

[Figure 10.3: observed values of the directed ANND and their expectations for three small food webs; only the axis labels (k_in, k_out, ⟨k^{nn}_in⟩, ⟨k^{nn}_out⟩) survive from the original panels.]

Figure 10.4: The 13 triadic motifs, defined as the possible non-isomorphic connected subgraphs of 3 vertices in a directed graph.
The ML conditions for the resulting null model, the reciprocal configuration model (RCM), determine the hidden variables $\vec{x}$, $\vec{y}$, $\vec{z}$ as the solution of the following coupled equations:

$$k_i^\rightarrow(G^*) = \sum_{j \neq i} \frac{x_i y_j}{1 + x_i y_j + x_j y_i + z_i z_j} \qquad (10.26)$$

$$k_i^\leftarrow(G^*) = \sum_{j \neq i} \frac{x_j y_i}{1 + x_i y_j + x_j y_i + z_i z_j} \qquad (10.27)$$

$$k_i^\leftrightarrow(G^*) = \sum_{j \neq i} \frac{z_i z_j}{1 + x_i y_j + x_j y_i + z_i z_j} \qquad (10.28)$$
The expectation value of any topological property, as well as its standard deviation, can now be calculated analytically in terms of the three $n$-dimensional vectors $\vec{x}$, $\vec{y}$, $\vec{z}$. For instance, in fig.10.3g-h we repeat the analysis of the directed ANND of the St. Marks River food web, now comparing the observed trend against the RCM. In this case, we find no significant difference with respect to the DCM considered above (fig.10.3e-f). However, as we now show, the analysis of motifs reveals a dramatic difference between the predictions of the two null models.
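The coupled equations above can be solved numerically. The sketch below is our own illustration, not the authors' code: it fits the RCM hidden variables of a small invented toy digraph by a damped fixed-point iteration, one common choice of scheme for this kind of system.

```python
# Sketch: fit RCM hidden variables x, y, z to the reciprocated and
# non-reciprocated degree sequences of a toy digraph (eqs. 10.26-10.28),
# via damped fixed-point iteration. Toy example only.
def rcm_fit(edges, n, steps=20000, damp=0.5):
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = 1
    k_out = [sum(a[i][j] * (1 - a[j][i]) for j in range(n) if j != i) for i in range(n)]
    k_in = [sum(a[j][i] * (1 - a[i][j]) for j in range(n) if j != i) for i in range(n)]
    k_rec = [sum(a[i][j] * a[j][i] for j in range(n) if j != i) for i in range(n)]
    x = [1.0 if k else 0.0 for k in k_out]
    y = [1.0 if k else 0.0 for k in k_in]
    z = [1.0 if k else 0.0 for k in k_rec]
    for _ in range(steps):
        d = [[1 + x[i] * y[j] + x[j] * y[i] + z[i] * z[j] for j in range(n)]
             for i in range(n)]
        for i in range(n):
            s_out = sum(y[j] / d[i][j] for j in range(n) if j != i)
            s_in = sum(x[j] / d[i][j] for j in range(n) if j != i)
            s_rec = sum(z[j] / d[i][j] for j in range(n) if j != i)
            if s_out > 0:
                x[i] = damp * x[i] + (1 - damp) * k_out[i] / s_out
            if s_in > 0:
                y[i] = damp * y[i] + (1 - damp) * k_in[i] / s_in
            if s_rec > 0:
                z[i] = damp * z[i] + (1 - damp) * k_rec[i] / s_rec
    return x, y, z, (k_out, k_in, k_rec)

# Toy digraph: two reciprocated pairs (0<->1, 2<->3) plus single links.
edges = [(0, 1), (1, 0), (2, 3), (3, 2), (0, 2), (2, 4), (4, 0), (1, 3)]
x, y, z, (k_out, k_in, k_rec) = rcm_fit(edges, 5)
```

After convergence the right-hand sides of eqs. (10.26)-(10.28) evaluated at $(x, y, z)$ reproduce the observed degree sequences.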
If $N_m$ denotes the number of occurrences of a particular motif $m$, the ML method allows us to calculate the expected number $\langle N_m \rangle$ and the standard deviation $\sigma[N_m]$ exactly, and thus to obtain the z-score

$$z[N_m] \equiv \frac{N_m(G^*) - \langle N_m \rangle}{\sigma[N_m]} \qquad (10.29)$$
analytically. This can be done for both the DCM and the RCM. The value of
z[Nm ] indicates by how many standard deviations the observed and expected
numbers of occurrences of motif m differ. Large values of z[Nm ] indicate motifs that are either over- or under-represented under the particular null model
considered, and that are therefore not explained by the lower-order constraints
enforced.
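To make eq. (10.29) concrete, here is a small illustrative sketch (ours, not from the text): a brute-force count of one triadic motif, the 3-loop, and the corresponding z-score. The null-model values for $\langle N_m \rangle$ and $\sigma[N_m]$ below are placeholder numbers; in the text they are computed analytically from the DCM or RCM.

```python
# Sketch: count directed 3-loops (i->j->k->i) and form the z-score (10.29).
from itertools import combinations

def count_3loops(adj):
    """adj: dict node -> set of successors. Each directed 3-cycle counted once."""
    n = 0
    nodes = list(adj)
    for i, j, k in combinations(nodes, 3):
        for a, b, c in ((i, j, k), (i, k, j)):   # the two cyclic orientations
            if b in adj[a] and c in adj[b] and a in adj[c]:
                n += 1
    return n

def z_score(observed, expected, sigma):
    return (observed - expected) / sigma

adj = {0: {1}, 1: {2}, 2: {0, 3}, 3: {1}}        # toy digraph
N_m = count_3loops(adj)                          # 3-loops: (0,1,2) and (1,2,3)
z = z_score(N_m, expected=0.4, sigma=0.5)        # placeholder null-model values
```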
In fig.10.5 we show the z-scores for all 13 possible non-isomorphic connected motifs with three vertices in 8 real food webs, under both null models. We also show the two lines z = ±2 to highlight the region within two standard deviations of the models' expectations. The food webs considered here are from different ecosystems (lagoons, marshes, lakes, bays, estuaries, grasses), with a prevalence of aquatic habitats. The presence of (intrinsically directed) predator-prey relationships implies that reciprocity is a very important quantity in food webs [17]. Thus the RCM should fluctuate less than the DCM. Indeed, this is confirmed by our analysis. The z-scores for the motifs m = 2, 3, 13 are significantly reduced from the DCM to the RCM. Also, while the motifs m = 1, 6, 10, 11 display large values of z with opposite signs across different webs under the DCM, the signs of all statistically surprising motifs (i.e. those with |z| ≳ 2) become consistent with each other under the RCM (except for m = 13). As a consequence, under the RCM all networks display a very similar pattern, and the most striking features of real webs become the over-representation of motifs m = 2, 10 (plus m = 6, 11, 13 for the Little Rock Lake web) and the under-representation of motifs m = 5, 9, 13 (plus m = 3, 7, 8 for Little Rock Lake). In particular, the under-representation of motif m = 9 (the 3-loop) is the most common pattern across all webs, and becomes stronger as the reciprocity of the web increases. Also note that in a network with no reciprocated
[Figure 10.5: z-scores z[N_m] of the 13 triadic motifs in the eight food webs (Chesapeake Bay, Maspalomas Lagoon, Florida Bay, St Marks Seagrass, Everglades Marshes, Grassland, Ythan Estuary and Little Rock Lake) under the two null models; only the web names and axis ticks survive from the original panels.]
links, the number of motifs with at least one pair of reciprocated links is zero. Under the RCM, the expected number of such motifs remains zero; by contrast, their expected number under the DCM is always positive. Thus we confirm that the upgrade to the RCM is necessary, as its stricter constraints make it possible to analyze three-vertex motifs once two-vertex motifs (i.e. all possible dyadic patterns) are correctly accounted for. The possibility of treating the RCM analytically using the ML method is an important ingredient of this analysis.
10.2.3 General case
Consider the general class of models defined by

$$P(G|\vec{\theta}) = \frac{e^{-H(G|\vec{\theta})}}{Z(\vec{\theta})} \qquad (10.30)$$

where $Z(\vec{\theta}) \equiv \sum_G e^{-H(G|\vec{\theta})}$ is the partition function [7]. The log-likelihood function is $\lambda(\vec{\theta}) \equiv \log P(G^*|\vec{\theta}) = -H(G^*|\vec{\theta}) - \log Z(\vec{\theta})$.

Homework 10.6 Show that the ML principle implies that the optimal choice for the parameter vector $\vec{\theta}$ in the model defined by Eq. (10.30) is given by the solution $\vec{\theta}^*$ to the following set of coupled equations:

$$\pi_\alpha(G^*) = \sum_G \pi_\alpha(G)\, e^{-H(G|\vec{\theta}^*)} / Z(\vec{\theta}^*) = \langle \pi_\alpha \rangle_{\vec{\theta}^*} \qquad (10.31)$$
The above exercise shows that, in this class of models, the ML condition is equivalent to Eq. (10.2), i.e. $\vec{\theta}^* = \vec{\theta}_M$. This means that the whole class of maximum-entropy ensembles is unbiased. This gives us the following recipe: if we wish to define a model whose predictions will then be matched to a set of properties $\{\pi_\alpha(G^*)\}$ observed in a real-world network $G^*$, we should decide from the beginning what these reference properties are, include them in $H(G|\vec{\theta})$ and define $P(G|\vec{\theta})$ as in Eq. (10.30). In this way we are sure to obtain an unbiased model. The random graph is a trivial special case where $\pi(G) = L_u(G)$ and $H(G|\theta) = \theta L_u(G)$ [7], and this is the reason why it is unbiased, if $L_u$ is chosen as the reference property. Similarly, the hidden-variable model defined by Eq. (10.13) is another special case, where $\vec{\pi}(G) = \vec{k}(G)$ and $H(G|\vec{\theta}) = \sum_i \theta_i k_i(G)$ with $x_i \equiv e^{-\theta_i}$ [7], and so it is unbiased too. By contrast, Eq. (5.5) cannot be traced back to Eq. (10.30), and the model it defines is biased.
Bibliography

[1] G. Caldarelli, A. Capocci, P. De Los Rios and M.A. Muñoz, Phys. Rev. Lett. 89, 258702 (2002).
[2] B. Söderberg, Phys. Rev. E 66, 066121 (2002).
[3] M. Boguñá and R. Pastor-Satorras, Phys. Rev. E 68, 036112 (2003).
[4] F. Chung and L. Lu, Ann. of Combin. 6, 125 (2002).
[5] D. Garlaschelli and M.I. Loffredo, Phys. Rev. Lett. 93, 188701 (2004).
[6] D. Garlaschelli, S. Battiston, M. Castri, V.D.P. Servedio and G. Caldarelli, Physica A 350, 491 (2005).
[7] J. Park and M.E.J. Newman, Phys. Rev. E 70, 066117 (2004), and references therein.
[8] P.W. Holland and S. Leinhardt, J. Amer. Stat. Assoc. 76, 33 (1981).
[9] D. Garlaschelli and M.I. Loffredo, Phys. Rev. E 73, 015101(R) (2006).
[10] R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii and U. Alon, Science 298, 824-827 (2002).
[11] S. Maslov and K. Sneppen, Science 296, 910 (2002).
[12] S. Maslov, K. Sneppen and A. Zaliznyak, Physica A 333, 529-540 (2004).
[13] M.E.J. Newman, S.H. Strogatz and D.J. Watts, Phys. Rev. E 64, 026118 (2001).
[14] F. Chung and L. Lu, Ann. of Combin. 6, 125 (2002).
[15] J. Park and M.E.J. Newman, Phys. Rev. E 68, 026112 (2003).
[16] M. Catanzaro, M. Boguñá and R. Pastor-Satorras, Phys. Rev. E 71, 027103 (2005).
[17] D.B. Stouffer, J. Camacho, W. Jiang and L.A.N. Amaral, Proc. R. Soc. B 274, 1931-1940 (2007).
[18] R. Guimerà, M. Sales-Pardo and L.A.N. Amaral, Nat. Phys. 3, 63 (2007).
[19] J. Park and M.E.J. Newman, Phys. Rev. E 70, 066117 (2004).
[20] M.A. Serrano and M. Boguñá, AIP Conf. Proc. 776, 101 (2005).
[21] M.A. Serrano, M. Boguñá and R. Pastor-Satorras, Phys. Rev. E 74, 055101(R) (2006).
[22] M.-A. Serrano, Phys. Rev. E 78, 026101 (2008).
[23] A. Barrat, M. Barthelemy, R. Pastor-Satorras and A. Vespignani, PNAS 101, 3747-3752 (2004).
[24] T. Opsahl, V. Colizza, P. Panzarasa and J.J. Ramasco, Phys. Rev. Lett. 101, 168702 (2008).
[25] K. Bhattacharya, G. Mukherjee, J. Saramaki, K. Kaski and S.S. Manna, J. Stat. Mech. P02002 (2008).
[26] G. Bianconi, Phys. Rev. E 79, 036114 (2009).
[27] D. Garlaschelli and M.I. Loffredo, Phys. Rev. Lett. 102, 038701 (2009).
[28] D. Garlaschelli, New J. of Phys. 11, 073005 (2009).
[29] S. Melnik, A. Hackett, M.A. Porter, P.J. Mucha and J.P. Gleeson, http://arxiv.org/abs/1001.1439.
[30] M.E.J. Newman, Phys. Rev. Lett. 103, 058701 (2009).
[31] M. Boguñá, R. Pastor-Satorras and A. Vespignani, Eur. Phys. J. B 38, 205-209 (2004).
[32] D. Garlaschelli and M.I. Loffredo, Phys. Rev. E 78, 015101(R) (2008).
[33] V. Colizza, R. Pastor-Satorras and A. Vespignani, Nat. Phys. 3, 276-282 (2007).
[34] K. Oshio, Y. Iwasaki, S. Morita, Y. Osana, S. Gomi, E. Akiyama, K. Omata, K. Oka and K. Kawamura, Tech. Rep. of CCeP, Keio Future 3 (Keio University, 2003).
[35] http://dip.doe-mbi.ucla.edu/dip/Main.cgi
[36] G. De Masi, G. Iori and G. Caldarelli, Phys. Rev. E 74, 066112 (2006).
[37] V. Colizza, A. Flammini, M.A. Serrano and A. Vespignani, Nat. Phys. 2, 110-115 (2006).
[38] H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai and A.-L. Barabási, Nature 407, 651 (2000).
[39] N.D. Martinez, Ecological Monographs 61, 367-392 (1991).
[40] http://vlado.fmf.uni-lj.si/pub/networks/data/bio/foodweb/foodweb.htm
[41] G. Fagiolo, Phys. Rev. E 76, 026107 (2007).
[42] S.E. Ahnert and T.M.A. Fink, Phys. Rev. E 78, 036112 (2008).
[43] D. Garlaschelli and M.I. Loffredo, Phys. Rev. Lett. 93, 268701 (2004).
[44] V. Zlatic and H. Stefancic, Phys. Rev. E 80, 016117 (2009).
[45] M.E.J. Newman, Phys. Rev. E 70, 056131 (2004).
[46] S.E. Ahnert, D. Garlaschelli, T.M. Fink and G. Caldarelli, Phys. Rev. E 73, 015101(R) (2006).
[47] J. Saramaki, M. Kivela, J.-P. Onnela, K. Kaski and J. Kertesz, Phys. Rev. E 75, 027105 (2007).
[48] S. Fortunato, Physics Reports 486(3-5), 75-174 (2010).
[49] P. Holland and S. Leinhardt, in Sociological Methodology, D. Heise, Ed. (Jossey-Bass, San Francisco, 1975), pp. 1-45.
Chapter 11
Self-Organized Networks

11.1 Introduction
So far, we have approached networks from different points of view: i) the definition and computation of the static topological properties of real-world networks; ii) the mathematical modelling of (either static or growing) network formation; iii) the study of the effects that the topology has on various dynamical processes (such as the spread of epidemics) taking place on static networks; iv) networks as computing entities. These points of view reflect the main approaches in the literature; refs. [1, 2, 3, 4, 5] present reviews of these results. More recently, a few attempts to provide a unified approach to networks and their dynamics have been proposed, exploiting the idea that all these aspects should in the end be related to each other. In particular, it has been argued that the complexity of real-world networks is in the most general case the result of the interplay between topology and dynamics, because networks can process information about their environment and respond to it adaptively. So, while most studies have focused either on the effects that topological properties have on dynamical processes, or on the reverse effects that vertex-specific dynamical variables have on network structure, it has been suggested that one should consider the mutual influence that these processes have on each other. This amounts to relaxing the (often implicit) hypothesis that dynamical processes and network growth take place on well-separated timescales, so that one is allowed to consider the evolution of the fast variables while the slower ones are quenched. Remarkably, one finds that the feedback between topology and dynamics can drive the system to a steady state that differs from the one obtained when the two processes are considered separately [6]. These results imply that adaptive networks generated by this interplay may represent an entirely novel class of self-organized complex systems, whose properties cannot be straightforwardly understood in terms of what we have learnt so far.
In this chapter we shall present a self-organized model [6] where an otherwise static model of network formation, driven by a so-called vertex fitness [7], is explicitly coupled to a so-called extremal dynamics process [8] providing a dynamical rule for the evolution of the fitness itself. In order to highlight the novel phenomena that originate from the interplay between the two mechanisms, we first review the main properties of such mechanisms when they are considered separately. In section 11.2 we recall some aspects of scale invariance and Self-Organized Criticality (SOC), and in particular the biologically-inspired Bak-Sneppen model [8], where the extremal dynamics for the fitness was originally defined on static graphs. In section 11.3 we briefly review the so-called fitness model of network formation [7], where the idea that network properties may depend on some fitness parameter associated with each vertex was proposed. Finally, in section 11.4 we present the self-organized model obtained by coupling these mechanisms.

Figure 11.1: First steps in the iteration procedure defining the Sierpiński triangle.
11.2
11.2.1 Geometric fractals
Due to this ubiquity, scientists have tried to understand the possible origins of fractal behaviour. The first preliminary studies focused on mathematical objects built by recursion (Koch's snowflake, the Sierpiński triangle and carpet, etc.). Based on these examples, where self-similar geometric objects are constructed iteratively, mathematicians introduced quantities that distinguish rigorously between fractals and ordinary compact objects.
For instance, one of the simplest fractals defined by recursion is the Sierpiński triangle, named after the Polish mathematician Wacław Sierpiński, who introduced it in 1915 [17]. When the procedure shown in Fig.11.1 is iterated an infinite number of times, one obtains an object whose empty regions extend at every scale (up to the maximum area delimited by the whole triangle). It is therefore difficult to measure its area in the usual way, i.e. by comparison with another area chosen as the unit of measure. A way to solve this problem is to
consider a limit process not only for the generation of the fractal, but also for
the measurement of its area. Note that at the first iteration we only need three
triangles of side length 1/2 to cover the object (while for the whole triangle
we would need four of them). At the second iteration we need nine covering
triangles of side 1/4 (while for the whole triangle we would need sixteen of
them).
The (scale-dependent) number of objects required to cover a fractal is at the basis of the definition of the fractal dimension D. Formally, if $N(\epsilon)$ is the number of $D_E$-dimensional volumes of linear size $\epsilon$ required to cover an object embedded in a metric space of Euclidean dimension $D_E$, then the fractal dimension is defined as

$$D \equiv \lim_{\epsilon \to 0} \frac{\ln N(\epsilon)}{\ln(1/\epsilon)}, \qquad (11.1)$$

which approaches an asymptotic value giving a measure of the region occupied by the fractal. For a compact (non-fractal) object, the fractal dimension D has the same value as the Euclidean dimension $D_E$.
Homework 11.1 Prove that, for a compact 2-dimensional triangle, $D = D_E = 2$.

Homework 11.2 Prove that, for the Sierpiński triangle, $D = \ln 3/\ln 2 \simeq 1.58496\ldots$. Note that now $D < D_E = 2$.
Therefore the fractal dimension measures the difference between the compactness of a fractal and that of a regular object embedded in a space of equal dimensionality. In the present example, D is lower than 2 because the Sierpiński triangle is less dense than a compact two-dimensional triangle. D is also larger than 1 because the triangle is denser than a one-dimensional object (a line). Note that the above formula can be rewritten in the familiar form of a power law by writing, for small $\epsilon$,

$$N(\epsilon) \sim \epsilon^{-D} \qquad (11.2)$$

This highlights the correspondence between the geometry of a fractal and scale-invariant laws.
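A quick numerical illustration of definition (11.1) (our sketch, not part of the text): generate points of the Sierpiński triangle with the "chaos game" and estimate D by box counting over a few scales.

```python
# Sketch: box-counting estimate of the fractal dimension of the Sierpinski
# triangle. Points are generated by repeatedly jumping halfway towards a
# randomly chosen vertex (the "chaos game").
import math
import random

def sierpinski_points(n, seed=0):
    rng = random.Random(seed)
    verts = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
    x, y = 0.25, 0.25
    pts = []
    for _ in range(n):
        vx, vy = rng.choice(verts)
        x, y = (x + vx) / 2, (y + vy) / 2
        pts.append((x, y))
    return pts[100:]                     # discard the initial transient

def box_count(pts, eps):
    # number of eps x eps grid boxes containing at least one point
    return len({(int(px / eps), int(py / eps)) for px, py in pts})

pts = sierpinski_points(200000)
scales = [2.0 ** -k for k in range(2, 7)]
xs = [math.log(1 / e) for e in scales]
ys = [math.log(box_count(pts, e)) for e in scales]
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
# least-squares slope of ln N(eps) versus ln(1/eps), i.e. the estimate of D
D = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)
# D comes out close to ln 3 / ln 2, while a filled triangle would give about 2
```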
11.2.2 Self-Organized Criticality
Despite their importance in characterizing the geometry of fractals, purely iterative models are not helpful for understanding whether a few common mechanisms might be responsible for the fractal behaviour observed in so many different, and seemingly unrelated, real-world situations. This has shifted the interest towards dynamical models. Indeed, open dissipative systems are in many cases associated with fractals for more than one reason. Firstly, attractors in the phase space of a nonlinear dynamical system can have a fractal geometry; secondly, their evolution can proceed by means of scale-invariant bursts of intermittent activity [18] extending over both time and space. In general, these features are obtained when a driving parameter of the nonlinear dynamical system is set to a crossover value at which chaotic behaviour sets in. When this occurs, the nonlinear system is said to be at the edge of chaos. Another situation where self-similarity is observed is at the critical point of phase transitions. For instance, magnetic systems display a sharp transition from a
(11.3)
where

$$\Delta_{ik} = \begin{cases} 2d & \text{if } k = i \\ -1 & \text{if } k \text{ is a nearest neighbour of } i \\ 0 & \text{otherwise.} \end{cases} \qquad (11.4)$$
This process is called toppling. As the neighbouring sites acquire new grains, they may topple in their turn, and this effect can propagate throughout the system until no updated site is active, in which case the procedure starts again with the addition of a new grain. While the amount of sand remains constant when toppling occurs in the bulk, for topplings at boundary sites some amount of sand falls outside and disappears from the system. In the steady state of the process, this loss balances the continuous random addition of sand.
All the toppling events occurring between two consecutive sand additions
are said to form an avalanche. One can define both a size and a characteristic
time for an avalanche. The size of an avalanche can be defined, for instance, as
the total number of toppling sites (one site can topple more than once) or the
total number of topplings (it is clear that these two definitions give more and
more similar results as the space dimension increases). In order to define the
lifetime of an avalanche, one must first define the unit timestep. The latter is
the duration of the fundamental event defined by these two processes:
- a set of sites becomes critical due to the previous toppling event;
- all such critical sites undergo a toppling process, and the heights of their neighbours are updated.
Then the lifetime of an avalanche can be defined as the number of unit timesteps
between two sand additions.
The fundamental result of the sandpile model is that, at the steady state, both the size s and the lifetime t of avalanches are characterized by power-law distributions, $P(s) \sim s^{-\tau}$ and $Q(t) \sim t^{-\tau_t}$ [26]. Therefore the model succeeds in reproducing the critical behaviour often associated with phase transitions, but through a self-organized mechanism requiring no external fine-tuning of a control parameter. Note that the grain addition can be viewed as the action of an external field on the system. Similarly, the avalanche processes can be viewed as the response (relaxation) of the system to this field. The spatial correlations that develop spontaneously at all scales indicate that the system reacts macroscopically even to a microscopic external perturbation, a behaviour reminiscent of the diverging susceptibility characterizing critical phenomena.
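The toppling dynamics described above can be sketched in a few lines. The code below is an illustrative assumption of ours, not the authors' implementation: it simulates the sandpile on an L x L lattice (d = 2, critical height 2d = 4) and records avalanche sizes as numbers of topplings per added grain.

```python
# Sketch: 2d sandpile. Grains are added at random sites; sites with height
# >= 4 topple, sending one grain to each neighbour; grains toppled over the
# boundary leave the system (dissipation at the edges).
import random

def sandpile(L=20, grains=5000, seed=1):
    rng = random.Random(seed)
    h = [[0] * L for _ in range(L)]
    sizes = []                               # avalanche sizes (topplings)
    for _ in range(grains):
        h[rng.randrange(L)][rng.randrange(L)] += 1
        topplings = 0
        active = True
        while active:                        # relax until no site is critical
            active = False
            for i in range(L):
                for j in range(L):
                    if h[i][j] >= 4:
                        h[i][j] -= 4
                        topplings += 1
                        active = True
                        for ni, nj in ((i-1, j), (i+1, j), (i, j-1), (i, j+1)):
                            if 0 <= ni < L and 0 <= nj < L:
                                h[ni][nj] += 1   # boundary grains are lost
        sizes.append(topplings)
    return h, sizes

h, sizes = sandpile()
```

A histogram of `sizes` at the steady state displays the broad, power-law-like avalanche statistics discussed in the text.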
The Bak-Sneppen model
A model that attempts to explain some key properties of biological evolution, albeit with strong simplifications, is the Bak-Sneppen (BS) model [8, 28]. In its simplest formulation, the model is defined by the following steps:
- N species are arranged on the sites of a 1-dimensional lattice (a chain, or a ring if periodic boundary conditions are enforced);
Figure 11.2: Left: plot of the probability distribution of fitness values at the steady state in the Bak-Sneppen model with 500 species. Right: the probability distribution P(s) for the size of a critical avalanche.
feedback mechanism between fitness dynamics and topological restructuring.
For this reason, it is at the basis of the adaptive model [6] that we shall present
in detail in section 11.4.
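Since part of the model definition above did not survive extraction, the sketch below states the standard Bak-Sneppen update explicitly (our formulation of the usual rule, not necessarily the exact variant used in the text): at each step the species with minimal fitness and its two lattice neighbours receive new uniform random fitnesses, and a self-organized threshold (about 0.667 in one dimension) emerges.

```python
# Sketch: Bak-Sneppen extremal dynamics on a ring of n species.
import random

def bak_sneppen(n=100, steps=50000, seed=7):
    rng = random.Random(seed)
    f = [rng.random() for _ in range(n)]
    for _ in range(steps):
        i = min(range(n), key=f.__getitem__)    # species with minimal fitness
        for j in (i - 1, i, (i + 1) % n):       # negative index wraps the ring
            f[j] = rng.random()                 # redraw fitnesses uniformly
    return f

f = bak_sneppen()
mean_f = sum(f) / len(f)   # well above 1/2: fitnesses pile up above a threshold
```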
11.3
$$k_i = \sum_{j \neq i} p_{ij} = \sum_{j \neq i} f(x_i, x_j) \qquad (11.5)$$
For N large, the discrete sum can be approximated by an integral. Thus the expected degree of a vertex with fitness x is

$$k(x) = N \int f(x, y)\, \rho(y)\, dy \qquad (11.6)$$
where the integration extends over the support of $\rho(x)$. If one considers the cumulative fitness distribution and the cumulative degree distribution, defined as

$$\rho_>(x) \equiv \int_x^{+\infty} \rho(x')\, dx' \qquad (11.7)$$

$$P_>(k) \equiv \int_k^{+\infty} P(k')\, dk' \qquad (11.8)$$
11.3.1 Particular cases
The constant choice f(x, y) = p is the trivial case corresponding to an Erdős-Rényi random graph (see Chapter 2), irrespective of the form of $\rho(x)$.
The simplest nontrivial choice coincides with the configuration model (see Chapter 5), which is obtained by requiring that all the realizations of the fitness-dependent network having the same degree sequence occur with the same probability. This leads to [68, 69]

$$f(x, y) = \frac{z x y}{1 + z x y} \qquad (11.11)$$
into account the requirement k N for dense networks. Equation (11.11) also
generates disassortativity and hierarchically distributed clustering, both arising
as structural correlations imposed by the local constraints. For sparse networks,
corresponding to eq.(11.12), these correlations disappear.
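The particular case (11.11) is easy to simulate. The sketch below is ours, with an invented parameter choice: it draws heavy-tailed fitness values, generates one network realization, and lets the realized degrees be compared with the expectations $\sum_j f(x_i, x_j)$.

```python
# Sketch: one realization of the fitness model with f(x,y) = zxy/(1+zxy).
import random

def expected_degrees(x, z):
    n = len(x)
    return [sum(z * x[i] * x[j] / (1 + z * x[i] * x[j])
                for j in range(n) if j != i) for i in range(n)]

def sample_degrees(x, z, rng):
    n = len(x)
    deg = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            # each pair is linked independently with probability f(x_i, x_j)
            if rng.random() < z * x[i] * x[j] / (1 + z * x[i] * x[j]):
                deg[i] += 1
                deg[j] += 1
    return deg

rng = random.Random(42)
x = [rng.paretovariate(1.5) for _ in range(400)]   # heavy-tailed fitnesses
z = 0.01                                           # controls the link density
k_exp = expected_degrees(x, z)
k_obs = sample_degrees(x, z, rng)
```

Scatter-plotting `k_obs` against `k_exp` shows the realized degrees tracking the fitness-determined expectations.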
Another interesting choice is given by

$$f(x, y) = \theta(x + y - z), \qquad \rho(x) = e^{-x} \qquad (11.13)$$

where z, which again controls the number of links, now plays the role of a positive threshold. This choice again yields a power-law degree distribution $P(k) \sim k^{-\gamma}$ (where now $\gamma = 2$), anticorrelated degrees with $k^{nn}(k) \sim k^{-1}$, and hierarchically distributed clustering $c(k) \sim k^{-2}$ (times logarithmic corrections) [7, 66, 67].
Remarkably, it has been shown that both eq.(11.11) and eq.(11.13) are particular cases of a more general expression obtained by introducing a temperature-like parameter [71]. Equation (11.11), with $\rho(x)$ a power law, corresponds to the finite-temperature regime, where the temperature can be reabsorbed in a redefinition of x and z. By contrast, eq.(11.13) corresponds to the zero-temperature regime, where the structural correlations disappear and the graph reaches a sort of optimized topology [71]. In all these cases, the average distance is small.
In summary, for a series of reasonable choices the networks generated by the fitness model display
- a scale-invariant degree distribution;
- correlations between neighbouring degrees;
- hierarchically organized clustering coefficients;
- a small-world effect.
11.4
As we have anticipated, recent approaches to the modelling of complex networks have considered the idea that the topology evolves under a feedback with some dynamical process taking place on the network itself (see for instance refs. [6, 47, 72, 73, 74, 75, 76, 77]). Among the various contributions, three groups have considered a possible connection with Self-Organized Criticality [6, 73, 74]. Bianconi and Marsili [73] have defined a model where slow network growth, defined as the gradual addition of links between randomly chosen vertices, is combined with fast relaxation, defined as the random rewiring of links connected to congested (toppling) vertices. To avoid the collapse to a complete graph, dissipation is also introduced, allowing toppling nodes to lose all their links at a given rate. The outcomes of the model depend on the dissipation rate and on the probability density function for the toppling probabilities assigned to each vertex. A particular choice of these quantities drives the system to a stationary state characterized by a scale-free topology and a power-law distribution of toppling avalanches.
Fronczak, Fronczak and Holyst [74] have proposed a model where no parameter choice is required in order to drive the system to the critical region. They considered the sandpile dynamics defined in section 11.2.2, but with each vertex having a different critical height, equal to its degree, as in other previous studies [61]. In addition, they assumed that after an avalanche of size A, the A link ends in the network that have not been rewired for the longest time are rewired to the initiator of the avalanche. In this way, the avalanche area distribution and the degree distribution evolve in time, and at the stationary state they become very similar and scale-free.
Garlaschelli, Capocci and Caldarelli [6] have introduced another fully self-organized model, where the Bak-Sneppen dynamics defined in section 11.2.2 takes place on a network whose topology is in turn continuously shaped by the fitness model presented in section 11.3. They find that the mutual interplay between topology and dynamics drives the system to a state characterized by scale-free distributions for both the degrees and the fitness values. These unexpected properties differ from what is obtained when the two models are considered separately. The rest of the chapter is devoted to a detailed description of this model.
11.4.1 Motivation

11.4.2 Definition
In order to define an improved, evolving model, one can assume that the Bak-Sneppen dynamics is combined with a fitness-driven link updating. In the initial state the network is generated as in the fitness model: between each pair of vertices i and j a link is drawn with probability $f(x_i, x_j)$ (where the $x_i$'s are the initial fitness values). Then, whenever a species i is assigned a new fitness $x_i'$, the whole set of connections between i and the other vertices $j \neq i$ is drawn anew with updated probability $f(x_i', x_j)$. This automatically implies that major mutations (a large change in $x_i$) are associated with very different connection probabilities, while small changes lead to almost equiprobable interactions. An example of this evolution rule is depicted in figure 11.3.
If at time t the vertex i has the minimum fitness, at time t + 1 the fitness of i, as well as that of its neighbours, is updated, i.e. drawn anew from the uniform distribution on the unit interval. This means

$$x_j(t + 1) = \eta_j \quad \text{if } j = i \text{ or } a_{ij}(t) = 1, \qquad (11.14)$$

where $\eta_j$ denotes a new random number uniformly drawn from the unit interval.
11.4.3 Analytical solution
Despite its complexity, the model is exactly solvable for any choice of the connection probability f(x, y) [6]. Indeed, one can write down a so-called master equation for the fitness distribution:

$$\frac{\partial \rho(x, t)}{\partial t} = r_{in}(x, t) - r_{out}(x, t) \qquad (11.15)$$
where $r_{in}(x, t)$ and $r_{out}(x, t)$ are the fractions (strictly speaking, the probability densities) of vertices with fitness x entering and exiting the system at time t, respectively. If a stationary (time-independent) distribution $\rho(x)$ exists, it is found by requiring

$$\frac{\partial \rho(x, t)}{\partial t} = 0 \qquad (11.16)$$
where at the stationary state the quantities no longer depend on time. If one manages to write down $r_{in}(x)$ and $r_{out}(x)$ in terms of f(x, y) and $\rho(x)$, then the above condition will give the stationary form of $\rho(x)$ for any choice of f(x, y).
To this end, it is useful to introduce the probability density q(m) that the minimum fitness takes the value $x_{min} = m$. For x small enough, $\rho(x)$ must be very close to q(x)/N (the distribution of all fitness values must be approximated by the correctly renormalized distribution of the minimum). The range where $\rho(x) \approx q(x)/N$ holds can be defined more formally by introducing the fitness value $\tau$ such that

$$\lim_{N \to \infty} \frac{N \rho(x)}{q(x)} = \begin{cases} 1 & x \leq \tau \\ > 1 & x > \tau \end{cases} \qquad (11.17)$$

This means that in the large-size limit the fitness distribution for $x < \tau$ is determined by the distribution of the minimum. After an expression for $\rho(x)$ is derived, the value of $\tau$ can be determined by the normalization condition

$$\int_0^1 \rho(x)\, dx = 1 \qquad (11.18)$$
as we show below. Note that we are not assuming from the beginning that $\tau > 0$, as is observed for the Bak-Sneppen model on other networks (with finite second moment of the degree distribution). It may well be that for a particular choice of f(x, y) eq.(11.18) yields $\tau = 0$, signalling the absence of a nonzero threshold. Also, note that

$$\lim_{N \to \infty} q(x) = 0 \quad \text{for } x > \tau,$$

since eq.(11.17) implies that the minimum is surely below $\tau$. Thus the normalization condition for q(x) reads

$$\int_0^\tau q(x)\, dx = 1 \quad \text{as } N \to \infty.$$
The knowledge of q(m) allows one to rewrite $r_{in}(x)$ and $r_{out}(x)$ as

$$r_{in}(x) = \int_0^\tau q(m)\, r_{in}(x|m)\, dm$$

and

$$r_{out}(x) = \int_0^\tau q(m)\, r_{out}(x|m)\, dm$$

respectively, where $r_{in}(x|m)$ and $r_{out}(x|m)$ are conditional probabilities, corresponding to the densities of vertices with fitness x that are added and removed when the value of the minimum fitness is m.
Let us consider $r_{in}(x)$ first. If the minimum fitness is m, then on average 1 + k(m) new fitness values are updated, where k(m) is the expected degree of the minimum-fitness vertex, which can be calculated in a way similar to eq.(11.6). Since each of these 1 + k(m) values is uniformly drawn between 0 and 1, one has

$$r_{in}(x|m) = \frac{1 + k(m)}{N} \qquad (11.19)$$

independently of x. This directly implies

$$r_{in}(x) = \int_0^\tau q(m)\, r_{in}(x|m)\, dm = \frac{1 + \langle k_{min} \rangle}{N} \qquad (11.20)$$

where $\langle k_{min} \rangle \equiv \int_0^\tau q(m)\, k(m)\, dm$ is the expected degree of the vertex with minimum fitness (irrespective of the value of the minimum fitness itself), a quantity that can be derived independently of k(m), as we show below.
Now consider $r_{out}(x)$, for which the independence of x does not hold. For $x \leq \tau$, $r_{out}(x|m) = \delta(x - m)/N$ (where $\delta(x)$ is the Dirac delta function), since the minimum is surely replaced and the probability of having other vertices with fitness in this range is zero. For $x > \tau$, the fraction of vertices with fitness x that are removed equals $\rho(x)$ times the probability f(x, m) that a vertex with fitness x is connected to the vertex with minimum fitness m [6]. This means

$$r_{out}(x|m) = \theta(\tau - x)\, \frac{\delta(x - m)}{N} + \theta(x - \tau)\, \rho(x)\, f(x, m) \qquad (11.21)$$
Inserting this expression into the definition of $r_{out}(x)$ yields

$$r_{out}(x) = \int_0^\tau q(m)\, r_{out}(x|m)\, dm = \begin{cases} q(x)/N & x \leq \tau \\ \rho(x) \int_0^\tau q(m)\, f(x, m)\, dm & x > \tau \end{cases} \qquad (11.22)$$
Finally, one can impose eq.(11.16) at the stationary state. For $x \leq \tau$, this yields $q(x) = 1 + \langle k_{min} \rangle$ independently of x. Combining this result with $q(x) = 0$ for $x > \tau$ as $N \to \infty$, one finds that the distribution of the minimum fitness m is uniform between 0 and $\tau$:

$$q(m) = (1 + \langle k_{min} \rangle)\, \theta(\tau - m) \qquad (11.23)$$

and the normalization of q(m) fixes $1 + \langle k_{min} \rangle = 1/\tau$, so that

$$q(m) = \frac{\theta(\tau - m)}{\tau}, \qquad (11.24)$$

which implies

$$\rho(x) = \frac{q(x)}{N} = \frac{1}{\tau N} \quad \text{for } x \leq \tau. \qquad (11.25)$$
For $x > \tau$, the stationarity condition $r_{out}(x) = r_{in}(x)$ gives

$$\rho(x) = \frac{r_{in}(x)}{\int_0^\tau q(m)\, f(x, m)\, dm} = \frac{(\tau N)^{-1}}{\tau^{-1} \int_0^\tau f(x, m)\, dm} = \frac{1}{N \int_0^\tau f(x, m)\, dm}. \qquad (11.26)$$
Therefore the exact solution for ρ(x) at the stationary state is found [6]:
\[
\rho(x) = \begin{cases}
(\tau N)^{-1} & x \le \tau \\
\left[ N \int_0^\tau f(x,m)\, dm \right]^{-1} & x > \tau ,
\end{cases} \tag{11.27}
\]
where τ is determined using eq.(11.18), which here reads
\[
\int_\tau^1 \frac{dx}{\int_0^\tau f(x,m)\, dm} = N - 1. \tag{11.28}
\]
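Assuming that eq.(11.18) (which lies outside this excerpt) is the normalization condition for ρ(x), the origin of eq.(11.28) can be spelled out explicitly:

```latex
1 = \int_0^1 \rho(x)\,dx
  = \int_0^\tau \frac{dx}{\tau N}
  + \frac{1}{N}\int_\tau^1 \frac{dx}{\int_0^\tau f(x,m)\,dm}
  = \frac{1}{N}
  + \frac{1}{N}\int_\tau^1 \frac{dx}{\int_0^\tau f(x,m)\,dm}
\;\Longrightarrow\;
\int_\tau^1 \frac{dx}{\int_0^\tau f(x,m)\,dm} = N-1 .
```

The first integral contributes exactly 1/N because ρ(x) = (τN)⁻¹ on [0, τ].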
The above analytical solution holds for any form of f(x, y). As a novel result, one finds that ρ(x) is in general no longer uniform for x > τ. This unexpected result, which contrasts with the outcome of the Bak-Sneppen model on any static network, is due solely to the feedback between topology and dynamics. At the stationary state the fitness values and the network topology continue to evolve, but knowledge of ρ(x) allows one to compute the expected topological properties, as shown in section 11.3 for the static fitness model.
11.4.4 Particular cases
In what follows we consider specific choices of the connection probability f (x, y).
In particular, we consider two forms already presented in section 11.3. Once
a choice for f (x, y) is made, one can also confirm the theoretical results with
numerical simulations. As we show below, the agreement is excellent.
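To make the setup concrete before turning to those simulations, here is a minimal Python sketch of the dynamics (our own illustrative code, not the implementation used in ref. [6]): at every step the least-fit vertex and its neighbours receive fresh uniform fitness values, and every link involving an updated vertex is drawn anew with probability f(x, y). The specific form of f and the parameter values (z, n, number of steps) are assumptions made only for this example.

```python
import random

def f(x, y, z=10.0):
    # Assumed connection probability f(x, y) = z*x*y / (1 + z*x*y),
    # the form used below for the self-organized configuration model.
    return z * x * y / (1.0 + z * x * y)

def step(fitness, adj):
    """One extremal-dynamics step: the least-fit vertex and its
    neighbours get new uniform fitness values, after which every link
    involving an updated vertex is redrawn with probability f(x, y)."""
    n = len(fitness)
    i_min = min(range(n), key=fitness.__getitem__)
    updated = {i_min} | adj[i_min]
    for i in updated:
        fitness[i] = random.random()
    # Delete the old links of every updated vertex.
    for i in updated:
        for j in adj[i]:
            adj[j].discard(i)
        adj[i].clear()
    # Redraw each pair involving an updated vertex exactly once.
    pairs = {(min(i, j), max(i, j)) for i in updated
             for j in range(n) if j != i}
    for i, j in pairs:
        if random.random() < f(fitness[i], fitness[j]):
            adj[i].add(j)
            adj[j].add(i)

def simulate(n=100, steps=1000, seed=0):
    """Run the coupled fitness/topology dynamics from a uniform start."""
    random.seed(seed)
    fitness = [random.random() for _ in range(n)]
    adj = [set() for _ in range(n)]
    for _ in range(steps):
        step(fitness, adj)
    return fitness, adj
```

After a long run, a histogram of the returned fitness values can be compared against the stationary ρ(x) derived above.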
The random neighbour model
As we have noted, the trivial choice for the fitness model is f(x, y) = p, which is equivalent to the Erdős-Rényi model. When the Bak-Sneppen dynamics takes place on the network, this choice removes the feedback with the topology, since the evolution of the fitness does not influence the connection probability. Indeed, this choice is asymptotically equivalent to the so-called random neighbour variant [28] of the Bak-Sneppen model. In that variant each vertex has exactly d neighbours, which are drawn uniformly anew at each time step. Here, we know that for an Erdős-Rényi graph the degree is peaked about the average value p(N − 1), so we expect to recover the same results found for d = p(N − 1) in the random neighbour model.
Homework 11.3 Show that, when f(x, y) = p,
\[
\rho(x) = \begin{cases}
(\tau N)^{-1} & x \le \tau \\
(p\tau N)^{-1} & x > \tau ,
\end{cases} \tag{11.29}
\]
where, for N → ∞,
\[
\tau \to \begin{cases}
1 & \text{if } pN \to 0 \\
(1+d)^{-1} & \text{if } pN \to d > 0 \\
0 & \text{if } pN \to \infty .
\end{cases} \tag{11.30}
\]
The reason for the onset of these three dynamical regimes is to be found in the topological phases of the underlying network. For p large, there is one large connected component that spans almost all vertices. As p decreases, this giant cluster becomes smaller, and several separate clusters form. Below the critical percolation threshold pc ≈ 1/N [4, 5], the graph is split into many small clusters. Exactly at the percolation threshold pc, the cluster sizes are power-law distributed according to P(s) ∼ s^{−α} with α = 2.5 [4]. Here we find that the dense regime pN → ∞ is qualitatively similar to a complete graph.
The above exercise shows that, while in the initial state (t = 0) the average fitness is ⟨x(0)⟩ = 1/2 (since the distribution is uniform on the unit interval), at the stationary state (t → ∞) the average fitness increases to the asymptotic value ⟨x(∞)⟩ = (1 + τ)/2 ≥ 1/2.
The self-organized configuration model
Following the considerations in section 11.3, the simplest non-trivial choice for f(x, y) is given by eq.(11.11). For a fixed ρ(x), this choice generates a fitness-dependent version of the configuration model [4, 70], where all graphs with the same degree sequence are equiprobable. All higher-order properties, besides the structural correlations induced by the degree sequence, are completely random [68, 69]. In this self-organized case, the degree sequence is not specified a priori and is determined by the fitness distribution at the stationary state. Inserting eq.(11.11) into eq.(11.27), one finds a solution that for N → ∞ is equivalent to [6]
\[
\rho(x) = \begin{cases}
(\tau N)^{-1} & x \le \tau \\
(\tau N)^{-1} + \dfrac{2}{z\tau^2 N x} & x > \tau ,
\end{cases} \tag{11.31}
\]
where τ, again obtained using eq.(11.28), behaves in the limiting regimes as
\[
\tau \to \begin{cases}
1 & \text{if } zN \to 0 \\
\tau(d) & \text{if } zN \to d > 0 \\
0 & \text{if } zN \to \infty ,
\end{cases} \tag{11.32}
\]
where τ(d) is a finite, d-dependent value; the full dependence of τ on zN is shown in the inset of fig. 11.6.
Figure 11.4: Cluster size distribution P(s). Far from the critical threshold (d = 0.1 and d = 4), P(s) is well peaked. At dc = 1.32, P(s) ∼ s^{−α} with α = 2.45 ± 0.05. Here N = 3200. (After ref. [6]).
The cluster size distribution becomes a power law at the critical value dc = 1.32 ± 0.05 [6] (see fig. 11.4), which therefore represents the percolation threshold. This behaviour can also be explored by measuring the fraction of vertices spanned by the giant cluster as a function of d (see fig. 11.5). This quantity is negligible for d < dc, while for d > dc it takes increasingly large finite values. One can also plot the average size of the non-giant components. As shown in the inset of fig. 11.5, this quantity diverges at the critical point, where P(s) is a power law.
The analytical result in eq.(11.31) means that ρ(x) is the superposition of a uniform distribution and a power law with exponent −1. The decay of ρ(x) for x > τ is entirely due to the coupling between extremal dynamics and topological restructuring. It originates from the fact that at any time the fittest species is also the most likely to be selected for mutation, since it has the largest probability of being connected to the least fit species. This is opposite to what happens on fixed networks. The theoretical predictions in eqs.(11.31) and (11.32) can be confirmed by large numerical simulations. This is shown in fig. 11.6, where the cumulative fitness distribution ρ_>(x) defined in eq.(11.7) and the behaviour of τ(zN) are plotted. Indeed, the simulations are in very good agreement with the analytical solution. Note that, as we have discussed in section 11.3, in the sparse regime z ≪ 1 one has f(x, y) ≈ zxy. Here, this implies a purely power-law behaviour ρ(x) ∝ x^{−1} for x > τ. Therefore ρ_>(x) is a logarithmic curve that looks like a straight line in log-linear axes. In the dense regime obtained for large z, the uniform part instead gives a significant deviation from the power-law trend. This shows one effect of structural correlations.
Figure 11.5: Main panel: the fraction of nodes in the giant component as a function of d = Nz, for network sizes from N = 100 to N = 6400. Inset: the average size of the non-giant components as a function of d for N = 6400. (After ref. [6]).
Other effects are evident when considering the degree distribution P(k). Using eq.(11.6) one can obtain the analytic expression for the expected degree k(x) of a vertex with fitness x:
\[
k(x) = \frac{2}{z\tau^2}\,\ln\frac{1+zx}{1+z\tau x} + \frac{zx - \ln(1+zx)}{z\tau x}. \tag{11.33}
\]
Computing the inverse function x(k) and plugging it into eq.(11.8) allows one to obtain the cumulative degree distribution P_>(k). Both quantities are shown in fig. 11.7, and again the agreement between theory and simulations is excellent. For small z, k(x) is linear, while for large z a saturation to the maximum value kmax = k(1) takes place. As discussed in section 11.3, this implies that in the sparse regime P(k) has the same shape as ρ(x). Another difference from static networks is that here τ remains finite even if P(k) ∼ k^{−γ} with γ < 3 [41, 42, 43]. For large z the presence of structural correlations introduces a sharp cutoff in P(k).
Homework 11.5 Show that, in the stationary state of this specification of the model, the average fitness over all nodes equals
\[
\langle x \rangle = \frac{1}{2\tau N} + \frac{2(1-\tau)}{z\tau^2 N}.
\]
Figure 11.6: Main panel: cumulative distribution function ρ_>(x) in log-linear axes. From right to left, z = 0.01, z = 0.1, z = 1, z = 10, z = 100, z = 1000 (N = 5000). Inset: log-log plot of τ(zN). Solid lines: theoretical curves; points: simulation results. (After ref. [6]).
The above exercise shows that at the stationary state the average fitness can now be smaller than the initial value 1/2. This effect is due to the non-uniform character of ρ(x) at the stationary state, with a higher density of values just above the threshold τ.
11.5 Conclusions
Figure 11.7: Left: k(x) (N = 5000; from right to left, z = 0.01, z = 0.1, z = 1, z = 10, z = 100, z = 1000). Right: P_>(k) (same parameter values, inverse order from left to right). Solid lines: theoretical curves; points: simulation results. (After ref. [6]).
Self-organized models of this kind, in which topology and dynamics evolve together, may provide a more complete explanation for the spontaneous emergence of complex topological properties in real networks.
Bibliography
[1] Caldarelli G. Scale-Free Networks Oxford University Press, Oxford (2007).
[2] Caldarelli G., Vespignani A. (eds), Large Scale Structure and Dynamics of
Complex Networks (World Scientific Press, Singapore 2007).
[3] Dorogovtsev S.N. Mendes J.F.F. Evolution of Networks: From Biological
Nets to the Internet and WWW, Oxford University Press, Oxford (2003).
[4] M.E.J. Newman, SIAM Rev. 45, (2003) 167.
[5] Albert R., Barabási A.-L., Rev. Mod. Phys. 74, 47-97 (2002).
[6] Garlaschelli D., Capocci A., Caldarelli G., Nature Physics, 3 813-817
(2007).
[7] Caldarelli G., Capocci A., De Los Rios P., Muñoz M. A., Phys. Rev. Lett.,
89, (2002) 258702.
[8] Bak P., Sneppen K., Phys. Rev. Lett., 71, 4083-4086 (1993).
[9] Mandelbrot B.B. The variation of certain speculative prices. J. Business
36 394-419, (1963).
[10] Mandelbrot B.B., How Long Is the Coast of Britain? Statistical Self-Similarity and Fractional Dimension. Science 156, 636-638 (1967).
[11] Niemeyer L., Pietronero L., and Wiesmann H.J., Fractal Dimension of
Dielectric Breakdown, Phys. Rev. Lett. 52, 1033 (1984)
[12] Rodriguez-Iturbe, I., Rinaldo A., Fractal River Networks: Chance and
Self-Organization, Cambridge University Press, New York, (1997).
[13] Brady R.M., Ball, R.C. Fractal growth of Copper electrodeposits Nature
309, 225 (1984).
[14] Batty M., Longley P.A. Fractal Cities: a Geometry of Form and Functions
Academic Press, San Diego (1994)
[15] Mandelbrot B.B., Passoja D.E., Paullay A.J. Fractal character of fracture
surface in metals, Nature 308 721 (1984).
[16] Brown J.H., West G.B. (eds.), Scaling in biology (Oxford University Press,
2000).
[17] Sierpiński W., Sur une courbe dont tout point est un point de ramification,
C. R. Acad. Sci. Paris 160 302-305 (1915).
[18] Eldredge N., Gould S.J., Punctuated equilibria: an alternative to phyletic
gradualism, In T.J.M. Schopf, ed., Models in Paleobiology. San Francisco:
Freeman Cooper. pp. 82-115 (1972). Reprinted in N. Eldredge Time frames.
Princeton: Princeton Univ. Press. 1985
[19] Jensen H. J., Self-Organized Criticality Cambridge University Press, Cambridge, (1998).
[20] Rigon R., Rodríguez-Iturbe I., Rinaldo A., Feasible optimality implies Hack's law, Water Res. Res., 34, 3181-3190 (1998).
[21] Marani M., Maritan A., Caldarelli G., Banavar J.A., Rinaldo A., Stationary self-organized fractal structures in potential force fields, J. Phys. A 31,
337-343, (1998).
[22] Caylor K.K., Scanlon T.M., Rodríguez-Iturbe I., Feasible optimality of vegetation patterns in river basins, Geoph. Res. Lett., 31, L13502 (2004).
[23] Ferrer i Cancho R. and Solé R.V., Optimisation in Complex Networks,
Lect. Not. in Phys., 625, 114-126, (2003)
[24] Caldarelli G., Maritan A., Vendruscolo M., Hot sandpiles, Europhys. Lett.
35 481-486 (1996).
[25] Caldarelli G., Mean Field Theory for Ordinary and Hot sandpiles, Physica
A, 252, 295-307 (1998).
[26] Bak P., Tang C. Weisenfeld K., Phys. Rev. Lett. 59, 381 (1987).
[27] Wilkinson D. Willemsen J.F., Invasion Percolation: a new form of Percolation Theory, J. Phys. A 16, 3365-3376 (1983).
[28] Flyvbjerg H., Sneppen K., Bak P., Phys. Rev. Lett. 71, 4087 (1993).
[29] Grassberger P., Phys. Lett. A 200 277 (1995).
[30] Dickman R., Muñoz M.A., Vespignani A., Zapperi S., Braz. J. Phys. 30, 27 (2000).
[31] Benton M.J., The fossil record 2, Chapman and Hall, London. (1993).
[32] De Los Rios P., Marsili M., Vendruscolo M., Phys. Rev. Lett. 80 5746
(1998).
[33] Dorogovtsev S.N., Mendes J.F.F., Pogorelov Y.G., Phys. Rev. E 62 295
(2000).
[34] Marsili M., Europhys. Lett. 28, 385 (1994).
[35] Mikeska B., Phys. Rev. E 55 3708 (1997).
[36] Paczuski M., Maslov S., Bak P., Europhys. Lett. 27 97 (1994).
[37] Caldarelli G., Felici M., Gabrielli A., Pietronero L., Phys. Rev. E 65 (2002)
046101.
[38] M. Felici, G. Caldarelli, A. Gabrielli, L. Pietronero, Phys. Rev. Lett., 86,
(2001) 1896-1899.
[39] P. De Los Rios, M. Marsili and M. Vendruscolo, Phys. Rev. Lett., 80,
(1998) 5746-5749.
[40] Kulkarni, R. V., Almaas, E. & Stroud, D. Evolutionary dynamics in the Bak-Sneppen model on small-world networks. arXiv:cond-mat/9905066.
[41] Moreno, Y. & Vázquez, A. The Bak-Sneppen model on scale-free networks. Europhys. Lett. 57(5), 765-771 (2002).
[42] Lee, S. & Kim, Y. Coevolutionary dynamics on scale-free networks. Phys.
Rev. E 71, 057102 (2005).
[43] Masuda, N., Goh, K.-I. & Kahng, B. Extremal dynamics on complex networks: Analytic solutions. Phys. Rev. E 72, 066106 (2005).
[44] Garcia G.J.M., Dickman R. Asymmetric dynamics and critical behavior
in the Bak-Sneppen model, Physica A 342, 516-528 (2004).
[45] Middendorf M., Ziv E., Wiggins C.H. Inferring network mechanisms: The
Drosophila melanogaster protein interaction network, Proc. Nat. Acad.
Sci. 102, 3192-3197 (2005).
[46] Giot L et al, A protein interaction map of Drosophila melanogaster, Science
302 1727-36 (2003).
[47] G. Caldarelli, P.G. Higgs and A.J. McKane, Journ. Theor. Biol. 193,
(1998) 345.
[48] Garlaschelli D., Caldarelli G. Pietronero L. Universal scaling relations in
food webs, Nature 423, 165-168 (2003).
[49] Burlando B, Journal Theoretical Biology 146 99-114 (1990).
[50] Burlando B, Journal Theoretical Biology 163 161-172 (1993).
[51] Caretta Cartozo C., Garlaschelli D., Ricotta C., Barthelemy M., Caldarelli
G. J. Phys. A: Math. Theor. 41, 224012 (2008).
[52] D. Garlaschelli, S. Battiston, M. Castri, V.D.P. Servedio and G. Caldarelli,
Phys. A 350, (2005) 491-499.
[53] D. Garlaschelli and M.I. Loffredo, Phys. Rev. Lett. 93, (2004) 188701.
[54] Faloutsos M., Faloutsos P., Faloutsos C., On Power-law Relationships of the Internet Topology, Proc. ACM SIGCOMM, Comp. Comm. Rev., 29, 251-262 (1999).
[55] Adamic L.A., Huberman B.A, Power-Law Distribution of the World Wide
Web, Science 287, 2115 (2000).
[56] Caldarelli G., Marchetti R., and Pietronero L., Europhys. Lett. 52, 386
(2000).
[57] R. Pastor-Satorras, A. Vespignani, Phys. Rev. Lett. 86, 3200 (2001).
[58] S. N. Dorogovtsev, A. V. Goltsev, J. F. F. Mendes, Critical phenomena in
complex networks, arXiv:0705.0010v6.
[59] D. Garlaschelli and M.I. Loffredo, Physica A 338(1-2), 113-118 (2004).
[60] D. Garlaschelli and M.I. Loffredo, J. Phys. A: Math. Theor. 41, 224018
(2008).
[61] K.-I. Goh, D.-S. Lee, B. Kahng, D. Kim, Phys. Rev. Lett. 91, 148701
(2003).
[62] Barabási A.-L., Albert R., Emergence of scaling in random networks, Science 286, 509-512 (1999).
[63] Fronczak A., Fronczak P., Holyst J.A., Mean-field theory for clustering coefficient in Barabási-Albert networks, Phys. Rev. E, 68, 046126 (2003).
[64] Barrat A., Pastor-Satorras R., Rate equation approach for correlations in
growing network models, Phys. Rev. E, 71, 036127 (2005).
[65] Bollobás B., Riordan O., The diameter of a scale-free random graph, Combinatorica, 24, 5-34 (2004).
[66] M. Boguñá and R. Pastor-Satorras, Phys. Rev. E 68, (2003) 036112.
[67] V.D.P. Servedio, G. Caldarelli and P. Buttà, Phys. Rev. E 70 (2004) 056126.
[68] J. Park and M.E.J. Newman, Phys. Rev. E 68, (2003) 026112.
[69] D. Garlaschelli and M.I. Loffredo, ArXiv:cond-mat/0609015.
[70] S. Maslov, K. Sneppen, and A. Zaliznyak, Physica A 333, (2004) 529.
[71] D. Garlaschelli, S. E. Ahnert, T. M. A. Fink, G. Caldarelli, arXiv:cond-mat/0606805v1.
[72] Jain, S. & Krishna, S. Autocatalytic Sets and the Growth of Complexity in an Evolutionary Model. Phys. Rev. Lett. 81, 5684-5687 (1998).
[73] Bianconi, G. & Marsili, M. Clogging and self-organized criticality in complex networks. Phys. Rev. E 70, 035105(R) (2004).
[74] Fronczak, P., Fronczak, A. & Holyst, J. A. Self-organized criticality and coevolution of network structure and dynamics. Phys. Rev. E 73, 046117 (2006).
[75] Zanette, D. H. & Gil, S. Opinion spreading and agent segregation on evolving networks. Physica D 224(1-2), 156-165 (2006).
Chapter 12

12.1 Network Dynamics
We will start with network dynamics. Recall from the beginning of this course
the random graph process. In the next exercise, you will implement the random
graph process.
Exercise 12.1 Random graph process. Generate G, a sequence of random graphs, for n = 0, . . . , 100 vertices, using the four edge probabilities p = 0.0, 0.1, 0.5, 1.0. Do not print the graphs; only print the mean and variance of the number of edges.
12.1.1 NetLogo

NetLogo is a programmable, agent-based modeling language and integrated modeling environment. The NetLogo environment enables exploration of emergent phenomena. It comes with an extensive models library, including models in a variety of domains such as economics, biology, physics, chemistry, psychology, and system dynamics. NetLogo was created by Uri Wilensky of Northwestern University, Evanston, Illinois, USA [1]. It is a wonderful system for exploring and building an understanding of the dynamics of complex systems, which we gratefully acknowledge.
Installing NetLogo
You should now download and install NetLogo onto your computer. The homepage for NetLogo is http://ccl.northwestern.edu/netlogo/. NetLogo is freely downloadable, and comes with an extensive library of example models. Now go to the webpage (or search for NetLogo, which should get you to the homepage). When downloading, choose version 5.0.5, fill in your name if you wish, and choose your operating system: Linux, Mac OS X, or Windows. If you run into problems, read the manual, look at the tutorial, or ask the course assistants for help. You really want to have NetLogo running on your machine, and to have played a bit with it, before the lecture starts.
To quote the documentation:
NetLogo is a programmable modeling environment for simulating
natural and social phenomena. It was authored by Uri Wilensky
in 1999 and has been in continuous development ever since at the
Center for Connected Learning and Computer-Based Modeling.
NetLogo is particularly well suited for modeling complex systems
developing over time. Modelers can give instructions to hundreds or
thousands of agents all operating independently. This makes it
possible to explore the connection between the micro-level behavior
of individuals and the macro-level patterns that emerge from their
interaction.
NetLogo lets students open simulations and play with them,
exploring their behavior under various conditions. It is also an
authoring environment which enables students, teachers and curriculum developers to create their own models. NetLogo is simple
enough for students and teachers, yet advanced enough to serve as
a powerful tool for researchers in many fields.
NetLogo has extensive documentation and tutorials. It also comes
with the Models Library, a large collection of pre-written simulations that can be used and modified. These simulations address
content areas in the natural and social sciences including biology
and medicine, physics and chemistry, mathematics and computer
science, and economics and social psychology.
Now please start up NetLogo (the normal version; we will not use 3D), and go to the Models Library (Command-M, Ctrl-M, Alt-M, or File/Models) to browse through some example models. (Please refer to figure 12.1 for a related screenshot.)
12.1.2 Preferential Attachment

From the Models Library, load the Preferential Attachment model (under Networks), which grows a network by preferential attachment.
Pressing the GO ONCE button adds one new node. To continuously add nodes, press GO. The LAYOUT? switch controls whether or not the layout procedure is run. This procedure attempts to move the nodes around to make the structure of the network easier to see. The PLOT? switch turns off the plots, which speeds up the model. The RESIZE-NODES button will make all of the nodes take on a size representative of their degree. If you press it again, the nodes will return to equal size. If you want the model to run faster, you can turn off the LAYOUT? and PLOT? switches and/or freeze the view (using the on/off button in the control strip over the view). The LAYOUT? switch has the greatest effect on the speed of the model. If you have LAYOUT? switched off, and then want the network to have a more appealing layout, press the REDO-LAYOUT button, which will run the layout-step procedure until you press the button again. You can press REDO-LAYOUT at any time, even if you had LAYOUT? switched on, and it will try to make the network easier to see.
Things to Notice
The networks that result from running this model are often called scale-free or power-law networks. These are networks in which the distribution of the number of connections of each node is not a normal distribution; instead it follows what is called a power-law distribution. Power-law distributions differ from normal distributions in that they do not have a peak at the average, and they are more likely to contain extreme values (see Albert & Barabási 2002 for a further description of the frequency and significance of scale-free networks). Barabási and Albert originally described this mechanism for creating networks, but there are other mechanisms for creating scale-free networks, and so the networks created by the mechanism implemented in this model are referred to as Barabási scale-free networks.
You can see the degree distribution of the network in this model by looking at the plots. The top plot is a histogram of the degree of each node. The bottom plot shows the same data, but with both axes on a logarithmic scale. When the degree distribution follows a power law, it appears as a straight line on the log-log plot. One simple way to think about power laws is that if there is one node with a degree of 1000, then there will be ten nodes with a degree of 100, and 100 nodes with a degree of 10.
Exercise 12.2 Let the model run a little while. How many nodes are hubs, that is, have many connections? How many have only a few? Does some low-degree node ever become a hub? How often? Turn off the LAYOUT? switch and freeze the view to speed up the model, then allow a large network to form. What is the shape of the histogram in the top plot? What do you see in the log-log plot? Notice that the log-log plot is only a straight line for a limited range of values. Why is this? Does the degree to which the log-log plot resembles a straight line grow as you add more nodes to the network?
12.1.3 Percolation Transition

We can use our graph program to study the percolation transition in the clustering of the network that was discussed in chapter 8.
Exercise 12.3 For n = 100, choose three values for p: pick p < 1/n, p = 1/n,
and p > 1/n. Compute the clustering coefficient for all three values of p. Explain
what you see using the concept of percolation transition (refer to the chapter on
random graphs).
Now we turn to NetLogo.
Exercise 12.4 From the Model Library, load Earth Science/Percolation. Run
the Simulation. Read the Info tab.
Try different settings for the porosity. What do you notice about the pattern
of affected soil? Can you find a setting where the oil just keeps sinking, and a
setting where it just stops?
If percolation stops at a certain porosity, it's still possible that it would percolate further at that porosity given a wider view. Note the plot of the size of the leading edge of oil. Does the value settle down roughly to a constant? How does this value depend on the porosity?
Give the soil different porosity at different depths. How does it affect the flow? In a real situation, if you took soil samples, could you reliably predict how deep an oil spill would go, or be likely to go?
12.2 Network Properties
Now that we have found a way to create a network that conforms to a pre-specified degree sequence, it is time to study some network properties. We do
so in order to see if the network does indeed have the properties of real world
networks.
The system that you use to create the plots can be either a generic spreadsheet, such as Microsoft Excel, Apple Numbers, Google Sheets, or OpenOffice Calc, or a dedicated plotting tool such as Gnuplot.
12.2.1 Gnuplot
Gnuplot is a tool that makes statistical plots easy to produce. This subsection contains a small Gnuplot example to get you going. Refer to your first-year introduction to programming course for more (or use Excel, OpenOffice, etc.). The plots that we will make are quite simple: a sequence of pairs of numbers (x, y) is plotted on a 2-dimensional plane. Gnuplot expects an input file with some commands. As an example, let us create a command file example.gpl that looks like the following:
set term postscript enhanced color 20
set output "example.eps"
set xlabel "degree"
set ylabel "frequency"
set xrange [0 : 10]
set yrange [0 : 20]
plot "example.input" using 1:2 title "degree distribution" with lines axes x1y1 ls 1
Figure 12.3: the plot produced by example.gpl: the "degree distribution" curve, with degree (0 to 10) on the x-axis and frequency (0 to 20) on the y-axis.
When you run the command $ gnuplot example.gpl, you should get an output file named example.eps that looks like figure 12.3.
Homework 12.1 In the chapter on random graphs the clustering coefficient of a graph was defined as the ratio of triangles to wedges in the graph (C_G = Δ_G / W_G). As formulas for the numbers of triangles and wedges we recall
\[
\Delta_G = \sum_{i_1,i_2,i_3 \in V} 1_{\{i_1 i_2,\, i_2 i_3,\, i_3 i_1 \text{ are present}\}},
\qquad
W_G = \sum_{i_1,i_2,i_3 \in V} 1_{\{i_1 i_2,\, i_2 i_3 \text{ are present}\}}. \tag{12.1}
\]
Now add to your program the computation of this wedge-triangle clustering coefficient.
Hint: the definitions of Δ_G and W_G suggest the use of nested for-loops for a straightforward algorithm.
12.2.2 Plotting Network Properties

We can now start preparing graphs of network properties. In the following exercise you will create plots for three graphs: the 10-vertex Erdős-Rényi graph, and the 10-vertex and 20-vertex Repeated Configuration Model graphs. Create plots for each of the following network properties: empirical degree distribution, empirical average nearest-neighbour degree, empirical local clustering coefficient, and clustering coefficient versus degree. See section 4.2.2.
1. Empirical Degree Distribution
The degree distribution is a first-order topological property. Plot on the x-axis the value k of the degree, from 0 to the highest occurring degree value, and on the y-axis the count of the number of nodes that have that degree.
2. Empirical Average Nearest-Neighbour Degree
A second-order topological property is the average nearest-neighbour degree of a vertex i, or k_i^nn. It can be computed by averaging k_i^nn over all vertices with the same degree k; plot this value on the y-axis against the value of k on the x-axis. Are degrees positively correlated (are high-degree vertices on average linked to high-degree vertices; does the network display assortative mixing)?
3. Empirical Local Clustering Coefficient
The most studied third-order property of a vertex is the local clustering coefficient C_i. It is defined as the number of links connecting the neighbours of node i to each other, divided by the total number of pairs of neighbours of node i.
4. Empirical Clustering Coefficient versus Degree
As chapter 4 noted, a statistical way to consider the clustering properties of real networks is to compute the average value of C_i over all vertices with a given degree k and plot this value (on the y-axis) against k (on the x-axis). Note that in many cases the average clustering decreases as k increases. What do your three networks do?
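The four quantities above can be computed as in the following hedged Python sketch (graph stored as a list of neighbour sets; all function names are our own):

```python
from collections import defaultdict

def degree(adj, i):
    return len(adj[i])

def degree_distribution(adj):
    """Property 1: count of nodes per degree value k."""
    counts = defaultdict(int)
    for i in range(len(adj)):
        counts[degree(adj, i)] += 1
    return dict(counts)

def knn(adj, i):
    """Average nearest-neighbour degree of vertex i (property 2)."""
    if not adj[i]:
        return 0.0
    return sum(degree(adj, j) for j in adj[i]) / len(adj[i])

def local_clustering(adj, i):
    """Property 3: links among the neighbours of i, divided by the
    number of pairs of neighbours of i."""
    nbrs = list(adj[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a in range(k) for b in range(a + 1, k)
                if nbrs[b] in adj[nbrs[a]])
    return links / (k * (k - 1) / 2)

def average_by_degree(adj, value):
    """Average value(i) over all vertices with the same degree k,
    returned as {k: average}; used for properties 2 and 4."""
    sums, nums = defaultdict(float), defaultdict(int)
    for i in range(len(adj)):
        k = degree(adj, i)
        sums[k] += value(i)
        nums[k] += 1
    return {k: sums[k] / nums[k] for k in sums}
```

Property 2 is then average_by_degree(adj, lambda i: knn(adj, i)), and property 4 is average_by_degree(adj, lambda i: local_clustering(adj, i)).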
Homework 12.2 Plotting Network Properties
From the above list, create the program to produce the four plots of the network properties for each of the three networks: 12 plots in total. Answer the questions asked at plots 2 and 4.
Note that chapter 4 contains more, higher-order, properties of networks. Some are easy to implement; some take a bit more work.
Exercise 12.5 It is instructive to see the behavior of these other properties on
your networks (and you can of course define your own networks by defining new
degree-sequences).
Bibliography
[1] Wilensky, U. (1999). NetLogo. http://ccl.northwestern.edu/netlogo/.
Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL.
Chapter 13

13.1 Adjacency Lists
To do so, our programs must be able to handle real networks. The networks that we generated so far consist of modest numbers of vertices and edges that fitted easily in the memory of our computer. Real-world networks consist of much larger numbers of vertices and edges. Real-world networks are also invariably sparse.
Storing a large sparse network in an adjacency matrix would mean allocating storage for V² entries that would mostly be zero. A more memory-efficient method is to store only the edges that are actually present. For sparse networks with E = O(V) edges, this saves O(V²) space for large V.
Exercise 13.1 Why?
We will now change the primary data structure of the programs that we have used so far from an adjacency matrix to an adjacency list. If you have applied the principles of modularity and abstraction in the data structure implementation, then it should now be easy to replace the implementation by another one.
An adjacency list is a linked list, which you may recall from your introductory programming classes.¹ In C/C++ you may implement a linked list from scratch, or you may use a library implementation such as the Boost library.² Python and Java provide similar solutions. The web abounds with suitable solutions. (In fact, there are so many solutions out there that some people may find implementing their own linked list quicker.)
¹ You may want to read up on adjacency lists and linked-list data structures on Wikipedia or on programming websites such as Stack Overflow.
² www.boost.org
Exercise 13.2 Implement an adjacency list data structure for your program.
Check that it works correctly with the graphs that the program generates.
Homework 13.1 Recompile and test your Erdős-Rényi program using adjacency lists. Use as a test case one hundred thousand vertices and a connection probability of p = 0.001. Print the number of edges. Does your program work? Use progressively larger values of p. For which p value does the program stop working? For how many edges does it still work?
Exercise 13.3 Mean and variance of edges
Incorporate the adjacency list implementation in your Erdős-Rényi program from Homework Exercise 6.2, where you computed the mean and variance of 10 random graphs. Choose a large value V = 1000000 and a small value p = 0.001. Compute and print the mean number of edges and the variance.
13.2 Real-World Network Repositories
There are many repositories of real-world networks. One such repository is the Stanford Network Analysis Project, or SNAP, at http://snap.stanford.edu. Go to SNAP, and choose the Stanford Large Network Dataset Collection (the third big blue entry).
Download cit-HepPh and roadNet-CA. The .gz file type is gzip; you can unpack the text files with gunzip.
The structure of the files is simple. Use a viewer or an editor (such as WordPad, TextEdit, vi, or Emacs) to view the files. Note that the files contain a few lines of commentary (preceded by a hash mark #) and then lines in which the edges are specified as vertex-vertex pairs.
We will now add an input routine to the program, which reads the graph
from an input file.
Homework 13.2 Write a short input routine for reading in the graph from the input file. You can choose your own solution: recognize comments and skip those lines, edit the files and manually remove the comments before feeding them to your program, or another solution that works. Create a short 5-line example graph to test your input routine. Test the program.
Go back to your programming course notes for tips on how to implement file
management and reading and parsing input.
Homework 13.3 Properties of the citation network
Incorporate the input routine in your adjacency list program. Read in real-world networks from the Stanford Large Network Dataset Collection, specifically the citation network cit-HepPh. Plot the same four network properties as for Homework Exercise 12.2 for the Repeated Configuration Model: degree distribution, average nearest-neighbour degree, local clustering coefficient, and clustering coefficient versus degree.
Homework 13.4 Are the plots different? How are they different? Is the Repeated Configuration Model a good approximation of a real-world network?
Homework 13.5 Create the same plots for the roadNet-CA network. How is the road network different from the citation network?
Homework 13.6 Now that you have seen the properties of real-world networks, can you create a pre-specified degree sequence for the repeated configuration model that comes closer to a real-world network than the previous network from exercise 7.3? Draw up this sequence, and provide the four plots.