attention has recently been devoted to developing algorithms able to consider the direction and the
weight of the links, which require suitable benchmark graphs for testing. In this paper we extend the
basic ideas behind our previous benchmark to generate directed and weighted networks with built-in
community structure. We also consider the possibility that nodes belong to more than one community, a
feature occurring in real systems such as social networks. As a practical application, we show
how modularity optimization performs on our new benchmark.
any type of benchmark, we will include the possibility to have overlapping communities. Sawardecker et al. have recently proposed a different benchmark with overlapping communities, where the probability that two nodes are linked grows with the number of communities both nodes belong to [22].

Our algorithms to create the benchmark graphs have a computational complexity that grows linearly with the number of links, and they considerably reduce the fluctuations of specific realizations of the graphs, so that they come as close as possible to the type of structure described by the input parameters. We use our benchmark to test modularity optimization [23], which is well defined in the case of directed and weighted networks [24].

In Section II we describe the algorithms to create the new benchmarks. Tests are presented in Section III. Conclusions are summarized in Section IV.

$s_{max} = \max\{s_\xi\} \le N$ and $\nu_{max} = \max\{\nu_i\} \le n_c$, where $N$ is the number of nodes and $n_c$ the number of communities. At this point, we have to decide which communities each node should be included in. This is equivalent to generating a bipartite network where the two classes are the $n_c$ communities and the $N$ nodes; each community $\xi$ has $s_\xi$ links, whereas each node has as many links as its memberships $\nu_i$ (Fig. 1). The network can be eas-
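The bipartite assignment of nodes to communities described above can be illustrated with a minimal sketch. It assumes plain random stub matching with retries whenever a node lands twice in the same community; the function name `assign_memberships` and the parameters `seed` and `max_tries` are hypothetical, and the procedure in the released packages may differ.

```python
import random


def assign_memberships(sizes, memberships, seed=0, max_tries=1000):
    """Randomly pair community stubs with node stubs (bipartite stub matching).

    sizes[c]       -- target number of nodes in community c (s_xi)
    memberships[i] -- target number of communities of node i (nu_i)
    Returns a list of sets: communities[c] is the set of node ids in c.
    """
    if sum(sizes) != sum(memberships):
        raise ValueError("sum of sizes must equal sum of memberships")
    if max(sizes) > len(memberships) or max(memberships) > len(sizes):
        raise ValueError("need s_max <= N and nu_max <= n_c")
    rng = random.Random(seed)
    # One stub per link of the bipartite network, on each side.
    comm_stubs = [c for c, s in enumerate(sizes) for _ in range(s)]
    node_stubs = [i for i, nu in enumerate(memberships) for _ in range(nu)]
    for _ in range(max_tries):
        rng.shuffle(node_stubs)
        pairs = set(zip(comm_stubs, node_stubs))
        # Accept only if no node was placed twice in the same community.
        if len(pairs) == len(comm_stubs):
            communities = [set() for _ in sizes]
            for c, i in pairs:
                communities[c].add(i)
            return communities
    raise RuntimeError("no valid assignment found within max_tries")
```

For example, `assign_memberships([3, 3, 2], [2, 2, 1, 1, 1, 1])` returns three node sets of sizes 3, 3 and 2 in which nodes 0 and 1 each appear twice. Retrying is the simplest way to avoid repeated pairs; it can become slow when the constraints are tight.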
of rounding integer numbers.

[Figure residue: axis label "Internal degree"; legend N = 1000, <k> = 50, µt = 0.8.]

Other benchmarks, like that by Girvan and Newman, are based on a similar definition of communities, expressed in terms of different probabilities for internal and external links. One may wonder what the connection is between our benchmark and the others. It is not difficult to compute an approximation of how the probability of having a link between two nodes in the same community depends on the mixing parameter $\mu_t$.

In the configuration model, the probability to have a connection between nodes $i$ and $j$, with $k_i$ and $k_j$ links respectively, is approximately $p_{ij} = k_i k_j / 2m$, provided that $k_i \ll 2m$ and $k_j \ll 2m$. If the approximation holds, our prescription to assign $k_i(\xi)$ allows us to compute the probability that $i$ and $j$ get a link in the community $\xi$:

$$p_{ij}(\xi) \simeq \frac{k_i(\xi)\, k_j(\xi)}{2m_\xi} = (1-\mu_t)^2\, \frac{1}{\nu_i \nu_j}\, \frac{k_i k_j}{2m_\xi}, \qquad (3)$$

where $2m_\xi = \sum_i k_i(\xi)$ is the number of internal links in the community (we recall that $\nu_i$ is the number of memberships of node $i$). If $i$ and $j$ share a number $\nu_{ij}$ of memberships and all the respective $p_{ij}(\xi)$ are small, the probability that they get a link somewhere can be approximated with the sum over all the common communities. The final result is

$$p_{ij} \simeq (1-\mu_t)^2\, \frac{\nu_{ij}}{\nu_i \nu_j}\, k_i k_j \left\langle \frac{1}{2m_\xi} \right\rangle_\xi, \qquad (4)$$

where $\left\langle \frac{1}{2m_\xi} \right\rangle_\xi = \frac{1}{\nu_{ij}} \sum_\xi \frac{1}{2m_\xi}$, and $\xi$ runs only over the common memberships of the nodes.

On the other hand, if $i$ and $j$ do not share any membership, the probability to have a link between them is:

$$p_{ij} \simeq \frac{k_i^{(ext)}\, k_j^{(ext)}}{2m^{(ext)}} = \mu_t\, \frac{k_i k_j}{2m}, \qquad (5)$$

where $2m^{(ext)} = \sum_i k_i^{(ext)} = \mu_t \sum_i k_i$ is the number of

FIG. 4: Computational time to build the unweighted benchmark as a function of the average degree <k>. We show the results for networks of 1000 and 5000 nodes. $\mu_t$ was set equal to 0.1 and 0.5 (the latter requires more time for the rewiring process). Note that between the two upper lines and the lower ones there is a factor of about 5, as one would expect if complexity is linear in the number of links $m$.

B. Weighted networks

In order to build a weighted network, we first generate an unweighted network with a given topological mixing parameter $\mu_t$ and then we assign a positive real number to each link.
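As a numerical illustration of Eqs. (3)-(5) for the unweighted construction, the following minimal sketch evaluates the approximate link probability between two nodes. The function name `link_probability` and its argument names are hypothetical, and the formulas are only valid under the stated assumption that all degrees are much smaller than $2m$.

```python
def link_probability(ki, kj, mu_t, two_m, nu_i=1, nu_j=1, shared_two_m=()):
    """Approximate probability of a link between nodes i and j.

    ki, kj       -- total degrees of the two nodes
    mu_t         -- topological mixing parameter
    two_m        -- sum of all degrees in the network (2m)
    nu_i, nu_j   -- number of memberships of i and j
    shared_two_m -- values of 2*m_xi for the communities shared by i and j
    """
    if not shared_two_m:
        # Eq. (5): no common membership, only external links can join i and j.
        return mu_t * ki * kj / two_m
    # Eq. (4): Eq. (3) summed over the nu_ij shared communities.
    nu_ij = len(shared_two_m)
    mean_inv = sum(1.0 / x for x in shared_two_m) / nu_ij
    return (1 - mu_t) ** 2 * nu_ij / (nu_i * nu_j) * ki * kj * mean_inv


# Hypothetical setting: 1000 nodes of degree 20, mu_t = 0.2, and
# non-overlapping communities of 50 nodes, so 2m = 20000 and 2m_xi = 50 * 16.
p_in = link_probability(20, 20, 0.2, 20000, shared_two_m=[800])
p_out = link_probability(20, 20, 0.2, 20000)
```

In this setting `p_in` is about 0.32 and `p_out` about 0.004, close to the exact densities 16/49 and 4/950 that follow from each node having 16 internal and 4 external links.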
To do this we need to specify two other parameters, $\beta$ and $\mu_w$. The parameter $\beta$ is used to assign a strength $s_i$ to each node, $s_i = k_i^\beta$; such a power-law relation between the strength and the degree of a node is frequently observed in real weighted networks [18]. The parameter $\mu_w$ is used to assign the internal strength $s_i^{(in)} = (1-\mu_w)\, s_i$, which is defined as the sum of the weights of the links between node $i$ and all its neighbors having at least one membership in common with $i$. The problem is equivalent to finding an assignment of $m$ positive numbers $\{w_{ij}\}$ that minimizes the following function:

$$Var(\{w_{ij}\}) = \sum_i \left[ (s_i - \rho_i)^2 + (s_i^{(in)} - \rho_i^{(in)})^2 + (s_i^{(ext)} - \rho_i^{(ext)})^2 \right]. \qquad (6)$$

Here $s_i$ and $s_i^{(in,ext)}$ indicate the strengths which we would like to assign, i.e. $s_i = k_i^\beta$, $s_i^{(in)} = (1-\mu_w)\, s_i$, $s_i^{(ext)} = \mu_w\, s_i$; $\{\rho_i^{*}\}$ are the total, internal and external strengths of node $i$ defined through its link weights, i.e. $\rho_i = \sum_j w_{ij}$, $\rho_i^{(in)} = \sum_j w_{ij}\, \kappa(i,j)$, $\rho_i^{(ext)} = \sum_j w_{ij}\, (1 - \kappa(i,j))$, where the function $\kappa(i,j) = 1$ if nodes $i$ and $j$ share at least one membership, and $\kappa(i,j) = 0$ otherwise.

We have to arrange things so that $s_i$ and $s_i^{(in,ext)}$ are consistent with the $\{\rho_i^{*}\}$. For that we need a fast algorithm to minimize $Var(\{w_{ij}\})$. We found that the greedy algorithm described below can do this job well enough for the cases of our interest.

value. With our procedure the value of $Var(\{w_{ij}\})$ decreases at least exponentially with the number of iterations, consisting in sweeps over all network links (Fig. 5).

For the distribution of the weights $w_{ij}$, we expect the averages $\langle w_i^{(int)} \rangle = \frac{1}{k_i^{(in)}} \sum_j w_{ij}\, \kappa(i,j) = s_i^{(in)} / k_i^{(in)}$ and $\langle w_i^{(ext)} \rangle = s_i^{(ext)} / k_i^{(ext)}$. Note that these expressions can be related to the mixing parameters in a simple way (Fig. 6):

$$\langle w_i^{(int)} \rangle = \frac{1-\mu_w}{1-\mu_t}\, k_i^{\beta-1} \quad \text{and} \quad \langle w_i^{(ext)} \rangle = \frac{\mu_w}{\mu_t}\, k_i^{\beta-1}. \qquad (7)$$

[Figure residue: vertical axis $Var\{w_{ij}\}$; N = 5000, <k> = 20, µt = 0.4, µw = 0.2, β = 1.5.]
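The target function of Eq. (6) and the expected averages of Eq. (7) can be checked on a toy graph with a minimal sketch. The function names `strength_variance` and `expected_mean_weights`, the dictionary-based graph representation and the four-node example are all assumptions made for illustration; this is not the greedy minimization itself, only the quantity it minimizes.

```python
from collections import defaultdict


def strength_variance(weights, degree, same_comm, beta, mu_w):
    """Eq. (6): squared mismatch between target and realized strengths.

    weights   -- dict {(i, j): w_ij}, one entry per undirected link
    degree    -- dict {i: k_i}
    same_comm -- kappa(i, j): True if i and j share at least one membership
    """
    rho = defaultdict(float)     # realized total strength of each node
    rho_in = defaultdict(float)  # realized internal strength of each node
    for (i, j), w in weights.items():
        internal = same_comm(i, j)
        for node in (i, j):
            rho[node] += w
            if internal:
                rho_in[node] += w
    var = 0.0
    for i, k in degree.items():
        s = k ** beta            # target total strength
        s_in = (1 - mu_w) * s    # target internal strength
        s_ext = mu_w * s         # target external strength
        rho_ext = rho[i] - rho_in[i]
        var += (s - rho[i]) ** 2 + (s_in - rho_in[i]) ** 2 + (s_ext - rho_ext) ** 2
    return var


def expected_mean_weights(k, beta, mu_t, mu_w):
    """Eq. (7): expected mean internal and external link weight for degree k."""
    return ((1 - mu_w) / (1 - mu_t) * k ** (beta - 1),
            mu_w / mu_t * k ** (beta - 1))


# Toy graph: communities {0, 1} and {2, 3}; every node has one internal and
# one external link, so mu_t = 0.5.
community = {0: 0, 1: 0, 2: 1, 3: 1}
kappa = lambda i, j: community[i] == community[j]
degree = {0: 2, 1: 2, 2: 2, 3: 2}
w_in, w_ext = expected_mean_weights(2, beta=1.5, mu_t=0.5, mu_w=0.2)
weights = {(0, 1): w_in, (2, 3): w_in, (0, 2): w_ext, (1, 3): w_ext}
var_target = strength_variance(weights, degree, kappa, beta=1.5, mu_w=0.2)
var_uniform = strength_variance(dict.fromkeys(weights, 1.0), degree, kappa,
                                beta=1.5, mu_w=0.2)
```

On this regular toy graph the weights given by Eq. (7) make the mismatch vanish up to rounding, whereas uniform unit weights leave a clearly positive value.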
FIG. 11: Test of modularity optimization on our benchmark for weighted undirected networks. The topological mixing parameter µt = 0.5. All other parameters are the same as in Fig. 10. Each point corresponds to an average over 100 graph realizations.

Acknowledgments

We thank F. Radicchi and J. J. Ramasco for useful suggestions.
[1] H. A. Simon, Proc. Am. Phil. Soc. 6, 467 (1962).
[2] M. E. J. Newman, SIAM Review 45, 167 (2003).
[3] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez and D.-U. Hwang, Phys. Rep. 424, 175 (2006).
[4] M. Girvan and M. E. J. Newman, Proc. Natl. Acad. Sci. 99, 7821 (2002).
[5] S. Fortunato and C. Castellano, in Encyclopedia of Complexity and System Science, ed. B. Meyers (Springer, Heidelberg, 2009), arXiv:0712.2716 at www.arXiv.org.
[6] D. Lusseau and M. E. J. Newman, Proc. R. Soc. London B 271, S477 (2004).
[7] G. W. Flake, S. Lawrence, C. Lee Giles and F. M. Coetzee, IEEE Computer 35(3), 66 (2002).
[8] R. Guimerà and L. A. N. Amaral, Nature 433, 895 (2005).
[9] G. Palla, I. Derényi, I. Farkas and T. Vicsek, Nature 435, 814 (2005).
[10] A. Condon and R. M. Karp, Random Structures & Algorithms 18, 116 (2001).
[11] R. Albert, H. Jeong and A.-L. Barabási, Nature 401, 130 (1999).
[12] R. Guimerà, L. Danon, A. Díaz-Guilera, F. Giralt and A. Arenas, Phys. Rev. E 68, 065103(R) (2003).
[13] L. Danon, J. Duch, A. Arenas and A. Díaz-Guilera, in Large Scale Structure and Dynamics of Complex Networks: From Information Technology to Finance and Natural Science, eds. G. Caldarelli and A. Vespignani (World Scientific, Singapore, 2007), pp. 93–114.
[14] A. Clauset, M. E. J. Newman and C. Moore, Phys. Rev. E 70, 066111 (2004).
[15] A. Lancichinetti, S. Fortunato and J. Kertész, New J. Phys. 11, 033015 (2009).
[16] A. Lancichinetti, S. Fortunato and F. Radicchi, Phys. Rev. E 78, 046110 (2008).
[17] E. A. Leicht and M. E. J. Newman, Phys. Rev. Lett. 100, 118703 (2008).
[18] A. Barrat, M. Barthélemy, R. Pastor-Satorras and A. Vespignani, Proc. Natl. Acad. Sci. USA 101, 3747 (2004).
[19] J. Baumes, M. K. Goldberg, M. S. Krishnamoorthy, M. M. Ismail and N. Preston, in IADIS AC, eds. N. Guimaraes and P. T. Isaias, p. 97 (2005).
[20] S. Zhang, R.-S. Wang and X.-S. Zhang, Physica A 374, 483 (2007).
[21] T. Nepusz, A. Petróczi, L. Négyessy and F. Bazsó, Phys. Rev. E 77, 016107 (2008).
[22] E. N. Sawardecker, M. Sales-Pardo and L. A. N. Amaral, Eur. Phys. J. B 67, 277 (2009).
[23] M. E. J. Newman, Phys. Rev. E 69, 066133 (2004).
[24] A. Arenas, J. Duch, A. Fernández and S. Gómez, New J. Phys. 9, 176 (2007).
[25] M. Molloy and B. Reed, Random Structures & Algorithms 6, 161 (1995).
[26] M. E. J. Newman and E. A. Leicht, Proc. Natl. Acad. Sci. USA 104, 9564 (2007).
[27] J. J. Ramasco and M. Mungan, Phys. Rev. E 77, 036122 (2008).
[28] L. Danon, A. Díaz-Guilera, J. Duch and A. Arenas, J. Stat. Mech., P09008 (2005).
[29] S. Fortunato and M. Barthélemy, Proc. Natl. Acad. Sci. USA 104, 36 (2007).
[30] The packages can be found at http://santo.fortunato.googlepages.com/inthepress2. Each package contains instruction files that explain how to operate the software.