ID $p_i$). Path length $L_k^s$ is defined as the number of peers traversed at stage $k$ to locate the requested resource. Dynamic validity threshold $v_k^s$ denotes the average number of peers traversed during the $s-1$ earlier stages to find the resource location. Validity threshold $v_k^s$ is initially (i.e., for the first stage) set to the number of peers. In the RSQ message that is received at source $p_d$, stack $G_k$ is empty, receiver ID $\mathcal{R}_{id}$ is the ID of $p_d$, and path length $L_k^s$ is set to zero.
Upon receiving an RSQ message at each stage $s$, each peer $p_i$ calls procedure RSQ($p_i$, $s$, $k$) shown in Fig. 4. Procedure RSQ($p_i$, $s$, $k$) is run locally at each peer $p_i$ to locate resource $r_k$ at stage $s$. In this procedure, source peer $p_d$ first checks its resource table, stored as the action-sets, to see if resource $r_k$ is provided by the P2P grid system. If that is not the case (i.e., if $a_{ik} \notin a_i$), peer $p_d$ returns an error message stating that the requested resource is not available and terminates the procedure. Otherwise, every peer $p_i$ increments path length $L_k^s$ by one and checks its available resources to see if it is able to provide the requested resource itself. If so, peer $p_i$ (which is hereafter called the resource providing peer and denoted as $p_{d'}$) returns its location by an RLC (Resource LoCation) message to the source peer to which the user submitted the resource request. The RLC message is composed of five parts: resource providing peer $p_{d'}$, receiver ID $\mathcal{R}_{id}$, path $G_k^s$ including the travelled peers from the source peer to the resource providing peer in stack order, updated validity threshold $v_{k+1}^s$, and reinforcement signal $b_k^s$.
Fig. 3. Pseudo code of procedure DRQ (departure request).
Fig. 4. Pseudo code of procedure RSQ (resource query).
J. Akbari Torkestani / Journal of Network and Computer Applications 35 (2012) 2028–2036
Reinforcement signal $b_k^s$ is used to update the internal state of the activated learning automata based on the optimality of the travelled path to locate the resource. To set the reinforcement signal $b_k^s$, resource providing peer $p_{d'}$ compares path length $L_k^s$ with validity threshold $v_k^s$. If $L_k^s \le v_k^s$, then $b_k^s$ is set to zero and all learning automata corresponding to the peers included in $G_k^s$ are rewarded. Otherwise, it is set to one and all learning automata are penalized. At each stage $k$, validity threshold $v_{k+1}^s$ is updated as

$$v_{k+1}^s = \frac{(k-1)\,v_k^s + L_k^s}{k} \qquad (4)$$
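The threshold test and the update in Eq. (4) can be illustrated with a short Python sketch. The function names and the sample path lengths are hypothetical, and Eq. (4) is read here as a running average over stages, which is an assumption about the printed formula rather than a statement of the author's implementation.

```python
def reinforcement_signal(path_length: int, threshold: float) -> int:
    """Return 0 (reward) if the path is no longer than the threshold, else 1 (penalty)."""
    return 0 if path_length <= threshold else 1


def update_threshold(threshold: float, path_length: int, k: int) -> float:
    """Running-average reading of Eq. (4); k is the current stage, k >= 1."""
    return ((k - 1) * threshold + path_length) / k


# Hypothetical run on a grid of 8 peers, so the first-stage threshold is 8.
v = 8.0
signals = []
for k, length in enumerate([6, 4, 7], start=1):
    signals.append(reinforcement_signal(length, v))  # compare before updating
    v = update_threshold(v, length, k)
```

Under this reading, the initial threshold is multiplied by zero at the first stage, so it only affects the first reward decision and not the later averages.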
Otherwise (if peer $p_i$ cannot provide the requested resource), peer $p_i$ activates its corresponding automaton $A_i$. Learning automaton $A_i$ updates action-set $a_{ik}$ and action probability vector $p_{ik}$ by temporarily disabling the actions corresponding to the peers selected so far (included in $G_k^s$), as described earlier in Section 2 and procedure DRQ. This avoids loop formation and repeated peers in $G_k^s$. Then, learning automaton $A_i$ randomly chooses one of its possible actions from $a_{ik}$ based on $p_{ik}$, if any. If there are no more actions in action-set $a_{ik}$, travelled path $G_k^s$ is traced back to find a peer with a non-empty action-set. This is done by sending a TRB (TRacing Back) message to the peer appended to stack $G_k^s$ before the current peer. This peer resumes the resource discovery process and chooses one of its possible actions from its non-empty action-set $a_{ik}$. Let us assume that automaton $A_i$ chooses action $a_{ik}^j$. This implies that peer $p_j$ is the next peer to which the task of resource location is entrusted. The selected action is temporarily removed from the action-set $a_{ik}$. Peer $p_i$ sends an RSQ message to peer $p_j$ through communication link $(p_i, p_j)$. This process continues until the resource providing peer $p_{d'}$ is found.
As mentioned earlier, resource providing peer $p_{d'}$ sends the location of the requested resource to the user along traversed path $G_k^s$ by an RLC message. To do so, the resource providing peer $p_{d'}$ extracts the peer appended to stack $G_k^s$ before itself (e.g., peer $p_i$) and sends an RLC message to it. Upon receiving an RLC message, each peer $p_i$ calls procedure RLC shown in Fig. 5. In this procedure, the reinforcement signal $b_k^s$ is first checked and the internal state of automaton $A_i$ is updated by applying Eq. (1) to $p_{ik}$ if $b_k^s$ is zero and Eq. (2) otherwise. After updating the action probability vector, the action-set must be restored by re-enabling the disabled actions. Then, peer $p_i$ extracts the peer appended to stack $G_k^s$ before itself (e.g., $p_j$) and sends an RLC message to it (see Lines 07–09 of Fig. 5). This procedure repeats until the RLC message is received at source peer $p_d$. When the RLC message is received at source peer $p_d$, the resource discovery process is over and source peer $p_d$ can be connected to resource providing peer $p_{d'}$ through $G_k^s$.
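The probability update applied while the RLC message travels back can be sketched in Python. Eqs. (1) and (2) are defined earlier in the paper and are not reproduced in this section, so the sketch assumes they are the standard linear reward and penalty rules of learning automata theory, with a single learning rate for both. The function name is hypothetical.

```python
from typing import List


def update_probabilities(p: List[float], chosen: int, signal: int, rate: float) -> List[float]:
    """Return the updated action probability vector.

    signal == 0 rewards the chosen action, signal == 1 penalizes it.
    Assumes the standard linear reward/penalty rules with equal rates.
    """
    r = len(p)
    if r < 2:
        return list(p)  # a single action always keeps probability one
    if signal == 0:
        # Reward: move probability mass toward the chosen action.
        return [
            pj + rate * (1.0 - pj) if j == chosen else (1.0 - rate) * pj
            for j, pj in enumerate(p)
        ]
    # Penalty: shrink the chosen action and spread the mass over the others.
    return [
        (1.0 - rate) * pj if j == chosen else rate / (r - 1) + (1.0 - rate) * pj
        for j, pj in enumerate(p)
    ]


# Hypothetical automaton with four neighbors and a learning rate of 0.1.
uniform = [0.25, 0.25, 0.25, 0.25]
rewarded = update_probabilities(uniform, chosen=0, signal=0, rate=0.1)
penalized = update_probabilities(uniform, chosen=0, signal=1, rate=0.1)
```

Both branches keep the vector summing to one, so no separate normalization step is needed after the update.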
Upon receiving a TRB message, peer $p_i$ calls procedure TRB($p_i$, $s$, $k$). In this procedure, learning automaton $A_i$ checks its action-set to see if it is empty. If so, peer $p_i$ decrements the path length $L_k^s$ by one and sends a TRB message to the peer $p_j$ that was added to $G_k^s$ before peer $p_i$. This process is repeated until a peer with a non-empty action-set is found. In this case, the learning automaton corresponding to the found peer selects one of its actions at random according to $p_{ik}$, and resumes the resource discovery process by sending an RSQ message to the selected peer (see Lines 07–09 of Fig. 6).
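The interplay of forwarding and tracing back can be sketched as a walk over a stack of travelled peers. This is a simplified, centralized illustration, not the distributed procedures of Figs. 4 and 6: neighbors are chosen uniformly instead of by learned probability vectors, and a peer that was traced back from is never revisited. All names are hypothetical.

```python
import random
from typing import Dict, Hashable, List, Optional, Set


def locate(
    graph: Dict[Hashable, List[Hashable]],
    providers: Set[Hashable],
    source: Hashable,
    rng: random.Random,
) -> Optional[List[Hashable]]:
    """Walk from source until a resource providing peer is reached.

    Returns the travelled path (the stack of peers) or None if no provider
    is reachable. The path length of the text is len(path).
    """
    path = [source]        # stack of travelled peers
    visited = {source}     # peers whose actions are disabled (simplification)
    while path:
        peer = path[-1]
        if peer in providers:
            return path
        options = [q for q in graph.get(peer, []) if q not in visited]
        if not options:
            path.pop()     # tracing back: path length drops by one
            continue
        nxt = rng.choice(options)
        visited.add(nxt)
        path.append(nxt)   # forwarding the query: path length grows by one
    return None


# Hypothetical 4-peer topology in which peer 1 is a dead end.
topology = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
found_path = locate(topology, providers={3}, source=0, rng=random.Random(7))
```

Whatever the random choices are, a detour through the dead-end peer is popped off the stack, so the returned path contains no repeated peers.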
4. Experimental results
In this section, several simulation experiments are performed to investigate the efficiency of the proposed resource discovery algorithm, called LARD (short for Learning Automata-based Resource Discovery algorithm), under three different grid sizes: small, medium, and large scale P2P grids. The small scale P2P grid system is composed of 256 peers and 1024 resources of 4 different resource types (each resource type having four classes). The medium scale P2P grid system is composed of 2048 peers and 8192 resources. The large scale P2P grid system is composed of 16,384 peers and 65,536 resources. In real scenarios, large scale P2P grids may include tens of
Fig. 5. Pseudo code of procedure RLC (resource location).
Fig. 6. The pseudo code of procedure TRB (tracing back).
thousands of peers or even more. However, in this paper, large scale grid systems are composed of 16,384 peers. Resources are generally of 4 different types: CPU, memory, disk, and operating system. CPU, memory, and disk can each be of four different capacities: low, moderate, high, and very high. The operating system can also be of four different types on different machines. Therefore, grid resources are generally of 4 different types and 16 classes.
Resources are evenly and randomly distributed among the peers. 1024, 8192, and 65,536 resource queries are submitted to randomly chosen peers of the small, medium, and large scale systems, respectively. Queries are for different resource types selected at random. P2P network topologies are generated as follows. For small, medium, and large scale systems, peers are randomly and evenly distributed within a square simulation area of size 250×250, 1000×1000, and 4000×4000 units, respectively. Neighboring peers are connected if the Euclidean distance between them is less than or equal to 20, 40, and 80 units for small, medium, and large scale P2P grid systems, respectively. The nominal bandwidth of the network connecting every two peers is assumed to be 10 Mbps. To improve the precision of the reported results, each experiment is independently repeated 50 times and the obtained results are averaged over these runs. The performance of the proposed resource discovery algorithm is compared with that of KL (a resource discovery method proposed by Kocak and Lacks (2012) in which the network routers are responsible for locating the grid resources) and DWC (an ant colony-based resource discovery algorithm proposed by Deng et al. (2009)) in terms of the following metrics of interest.
Hop count: This metric is defined as the average number of peers that are traversed to locate the requested resource. Hop count is affected by the network routing mechanism, resource distribution, and prior knowledge of the resource location.
Hit ratio: This is defined as the percentage of successful resource discoveries. A resource discovery is successful if at least one resource providing peer can be found for the requested resource before the TTL expires.
Control message overhead: This metric is defined as the number of (extra) control messages required for the resource discovery process. It is measured as the number of control messages that must be sent per second.
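The topology generation and two of the metrics above can be sketched in Python. The function names are hypothetical, the pairwise distance check is a simple quadratic loop chosen for clarity, and averaging the hop count over successful queries only is an assumption, since the text does not say how failed queries are counted.

```python
import math
import random
from typing import Dict, List, Optional, Tuple


def build_topology(
    num_peers: int, side: float, radius: float, rng: random.Random
) -> Tuple[List[Tuple[float, float]], Dict[int, List[int]]]:
    """Place peers uniformly in a side x side square and link peers within radius."""
    positions = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(num_peers)]
    neighbors: Dict[int, List[int]] = {i: [] for i in range(num_peers)}
    for i in range(num_peers):
        for j in range(i + 1, num_peers):
            if math.dist(positions[i], positions[j]) <= radius:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return positions, neighbors


def summarize(hops: List[Optional[int]]) -> Tuple[float, float]:
    """Return (average hop count, hit ratio); None marks a failed discovery."""
    found = [h for h in hops if h is not None]
    hit_ratio = len(found) / len(hops) if hops else 0.0
    average = sum(found) / len(found) if found else float("nan")
    return average, hit_ratio


# Small scale setting from the text: 256 peers, 250 x 250 area, link range 20.
positions, neighbors = build_topology(256, 250.0, 20.0, random.Random(1))
average_hops, hit_ratio = summarize([3, None, 5, 4])
```

The hit ratio is returned as a fraction between 0 and 1 rather than a percentage.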
In our experiments, the learning algorithm is $L_{R-P}$ with the same reward and penalty parameters (learning rate). The effectiveness of the proposed algorithm depends directly on the choice of a proper learning rate. By choosing the learning rate properly, a trade-off can be made between the cost of the algorithm (control message overhead) and the solution optimality (hit ratio and hop count). Depending on the nature of the application, different learning rates can be chosen. If an application sacrifices cost in favor of solution optimality, a small learning rate is preferred; otherwise, a larger one can be chosen. Several experiments were initially conducted to determine the best value of the learning rate. To find such a value, different learning rates ranging from 0.05 to 0.5 were tested on small, medium, and large scale P2P grids. The obtained results showed that the best results are achieved when the learning rate is set to 0.075, 0.080, and 0.090 for small, medium, and large scale systems, respectively. Therefore, the learning rate is set to the above-mentioned values for the corresponding P2P grid scales in further experiments.
4.1. Hop count
The aim of this experiment is to show the ability of different resource discovery algorithms to locate the nearest peer providing the requested resource. Fig. 7 compares the average hop count of the proposed resource discovery algorithm with that of KL (Kocak and Lacks, 2012) and DWC (Deng et al., 2009) for different grid scale scenarios. From the results shown in this figure, it can be seen that the average hop count increases as the system scale (network size) increases. One possible reason might be that the resources are distributed over a wider area and so the distance (number of hops) between the user and the resource increases. The results shown in Fig. 7 are averaged over the number of resource queries submitted to the system. Each experiment is repeated 50 times and the results are also averaged over the number of runs. Comparing the results given in Fig. 7, it is clear that the proposed algorithm significantly outperforms the other algorithms in terms of the number of hops: KL (Kocak and Lacks, 2012) lags far behind LARD, and DWC (Deng et al., 2009) has the largest hop count. One reason is that the proposed algorithm avoids cycles and redundant peers in the constructed path. The results also show that the gap between the proposed algorithm and the other methods becomes more significant as the system scale increases. Contrary to KL (Kocak and Lacks, 2012) and DWC (Deng et al., 2009), no significant growth can be seen in the number of hops of the proposed algorithm as the network size increases. This is because the proposed algorithm is fully distributed and independent of the network size.
4.2. Hit ratio
Hit ratio, which represents the rate of successful discoveries, is a very important measure of the effectiveness of a resource discovery algorithm. This set of experiments is performed to investigate the hit ratio of different algorithms under different grid scales. The obtained results are shown in Fig. 8. From the results shown in this figure, it is obvious that the hit ratio of the proposed resource discovery algorithm is higher than that of the other approaches. Comparing the results shown in Fig. 8, we find that the gap between the proposed algorithm and the other methods becomes larger as the network size grows. This shows the higher scalability of LARD. The proposed method, taking advantage of learning automata, is able to memorize the shortest path toward the resource. This path is stored as the probability vectors in the learning automata. When a path is constructed to connect a requesting peer to a resource provider, it can also be used by the intermediate peers for queries for the same resource. Among the different possible paths toward the same resource, LARD converges to the shortest path. That is why LARD selects the more probable paths and has a higher hit ratio.
[Bar chart: hop count (axis from 0 to 20) versus grid size (Small Scale, Medium Scale, Large Scale) for KL, DWC, and LARD]
Fig. 7. Average hop count under different grid scale.
4.3. Control message overhead
These experiments are conducted to measure and compare the control message overhead of different resource discovery mechanisms. The experimental results are depicted in Fig. 9. Comparing the results shown in this figure, it can be seen that the proposed algorithm LARD has the lowest rate of control message overhead and DWC (Deng et al., 2009) has the highest. KL (Kocak and Lacks, 2012) encapsulates the resource discovery packets within TCP/IP packets and so it adds considerably fewer extra control packets to the system than DWC (Deng et al., 2009). As mentioned earlier, the main objective of the proposed algorithm is to alleviate the impact of the network-wide broadcast storm problem (to reduce the number of broadcasts). The proposed algorithm sends the resource query messages only to the peers that have the requested resources with a much higher probability. As the proposed algorithm proceeds, the resource queries are forwarded along the shortest paths connecting the peers with a probability as close to one as possible. This meaningfully decreases the rate of extra message overhead required for resource discoveries. As shown in Fig. 9, the rate of control message overhead increases as the grid becomes larger. This is because the hop count, and so the number of rebroadcasts, increases as the P2P network size increases.
5. Conclusion
In this paper, a decentralized learning automata-based resource discovery algorithm was proposed for large-scale unstructured P2P grids. This algorithm was designed to relieve the negative impacts of the global flooding problem on the network performance and to support multi-attribute range queries. In this method, the resource queries are forwarded through the shortest paths ending at the grid peers more likely to have the requested resources. In the proposed algorithm, each peer is equipped with a learning automaton, and the network of learning automata is responsible for routing the query toward the resource provider through the shortest path. The proposed algorithm supports the high dynamicity of scalable P2P grids. Several simulation experiments were conducted on small, medium, and large scale P2P grid environments to show the performance of the proposed resource discovery algorithm. The obtained results were compared with those of KL (Kocak and Lacks, 2012) and DWC (Deng et al., 2009) in terms of average hop count, average hit ratio, and control message overhead. Numerical results showed that the proposed algorithm outperformed KL (Kocak and Lacks, 2012) and DWC (Deng et al., 2009) in small, medium, and large scale grids alike. The more significant gap between the hop count, hit ratio, and message overhead of the proposed algorithm and those of the others in large scale grid environments shows the higher scalability of the proposed algorithm as compared to KL (Kocak and Lacks, 2012) and DWC (Deng et al., 2009).
References
Akbari Torkestani J. A new approach to the job scheduling problem in computational grids. Cluster Computing, in press, 2012a.
Akbari Torkestani J. LAAP: a learning automata-based adaptive polling scheme for clustered wireless Ad-Hoc networks. Wireless Personal Communication, in press, 2012b.
Akbari Torkestani J. An adaptive learning automata-based ranking function discovery algorithm. Journal of Intelligent Information Systems, in press, 2012c.
Akbari Torkestani J. An adaptive focused web crawling algorithm based on learning automata. Applied Intelligence, in press, 2012d.
Akbari Torkestani J. Backbone formation in wireless sensor networks. Sensors and Actuators A: Physical, in press, 2012e.
Akbari Torkestani J. Mobility prediction in mobile wireless networks. Journal of Network and Computer Applications 2012f;35:1633–45.
Akbari Torkestani J. A stable virtual backbone for wireless MANETS. Telecommunication Systems Journal, in press, 2012g.
Akbari Torkestani J. An adaptive backbone formation algorithm for wireless sensor networks. Computer Communications 2012h;35:1333–44.
Akbari Torkestani J. Degree constrained minimum spanning tree problem in stochastic graph. Journal of Cybernetics and Systems 2012i;43(1):1–21.
Akbari Torkestani J. An adaptive heuristic to the bounded-diameter minimum spanning tree problem. Soft Computing, in press, 2012j.
Akbari Torkestani J. An adaptive learning to rank algorithm: learning automata approach. Decision Support Systems, in press, 2012k.
Akbari Torkestani J, Meybodi MR. LLACA: an adaptive localized clustering algorithm for wireless Ad hoc networks based on learning automata. Journal of Computers & Electrical Engineering 2011a;37:461–74.
Akbari Torkestani J, Meybodi MR. A link stability-based multicast routing protocol for wireless mobile Ad hoc networks. Journal of Network and Computer Applications 2011b;34(4):1429–40.
Akbari Torkestani J, Meybodi MR. Finding minimum weight connected dominating set in stochastic graph based on learning automata. Information Sciences 2012;200:57–77.
Andrzejak A, Xu Z. Scalable, efficient range queries for grid information services. In: Proceedings of 2nd international conference on P2P computing, pp. 33–40, 2002.
Cai M, Frank M, Chen J, Szekely P. MAAN: a multi-attribute addressable network for grid information services. In: Proceedings of 4th international workshop on grid computing, pp. 184–191, 2003.
Deng Y, Wang F, Ciura A. Ant colony optimization inspired resource discovery in P2P grid systems. Journal of Supercomputing 2009;49:4–21.
Eugster PT, Guerraoui R, Kermarrec AM, Massoulie L. From epidemics to distributed computing. IEEE Computer 2004;37(5):60–7.
Iamnitchi A, Foster IT. A P2P approach to resource location in grid environments. In: Weglarz J, Nabrzyski J, Schopf J, Stroinski M, editors. Grid resource management. Kluwer; 2003.
Kocak T, Lacks D. Design and analysis of a distributed grid resource discovery protocol. Cluster Computing 2012;15(1):37–52.
Marzolla M, Mordacchini M, Orlando S. Resource discovery in a dynamic grid environment. In: Proceedings of DEXA workshop, pp. 356–360, 2005.
[Bar chart: control message overhead (axis from 0 to 0.07) versus grid size (Small Scale, Medium Scale, Large Scale) for KL, DWC, and LARD]
Fig. 9. Control message overhead under different grid scale.
[Bar chart: hit ratio (axis from 0.8 to 1) versus grid size (Small Scale, Medium Scale, Large Scale) for KL, DWC, and LARD]
Fig. 8. Average hit ratio under different grid scale.
Mastroianni C, Talia D, Verta O. A super-peer model for building resource discovery services in grids: design and simulation analysis. In: Proceedings of European grid conference, LNCS, vol. 3470, pp. 132–143, 2005a.
Mastroianni C, Talia D, Verta O. A super-peer model for resource discovery services in large-scale grids. Future Generation Computer Systems 2005b;21:1235–48.
Merz P, Gorunova K. Fault-tolerant resource discovery in P2P grids. Journal of Grid Computing 2007;5:319–35.
Narendra KS, Thathachar MAL. Learning automata: an introduction. New York: Prentice-Hall; 1989.
Puppin D, Moncelli S, Baraglia R, Tonelotto N, Silvestri F. A grid information service based on P2P. In: Proceedings of 11th Euro-Par conference, LNCS, vol. 3648, pp. 454–464, 2005.
Ratnasamy S, Hellerstein JM, Shenker S. Range queries over DHTs. IRB-TR-03-009, Intel Corporation, 2003.
Ratnasamy S, Francis P, Handley M, Karp RM, Shenker S. A scalable content-addressable network. In: Proceedings of ACM SIGCOMM 2001 conference on applications, technologies, architectures, and protocols for computer communication, pp. 161–172, 2001.
Rowstron A, Druschel P. Pastry: scalable, decentralized object location and routing for large scale P2P systems. In: Proceedings of IFIP/ACM international conference on distributed systems platforms, middleware, LNCS, vol. 2218, pp. 329–350, 2001.
Schmidt C, Parashar M. Flexible information discovery in decentralized distributed systems. In: Proceedings of 12th international symposium on high-performance distributed computing, pp. 226–235, 2003.
Spence D, Harris T. XenoSearch: distributed resource discovery in the XenoServer open platform. In: Proceedings of the 12th IEEE international symposium on high performance distributed computing, pp. 216–225, 2003.
Stoica I, Morris R, Karger DR, Frans Kaashoek M, Balakrishnan H. Chord: a scalable P2P lookup service for internet applications. In: Proceedings of ACM SIGCOMM 2001 conference on applications, technologies, architectures, and protocols for computer communication, pp. 149–160, 2001.
Talia D, Trunfio P. P2P protocols and grid services for resource discovery on grids. In: Grandinetti L, editor. Grid computing: the new frontier of high performance computing, advances in parallel computing, Vol. 14. Elsevier Science; 2005. p. 83–105.
Thathachar MAL, Harita BR. Learning automata with changing number of actions. IEEE Transactions on Systems, Man, and Cybernetics 1987;SMC-17:1095–100.
Trunfio P, Talia D, Papadakis H, Fragopoulou P, Mordacchini M, Pennanen M, Popov K, Vlassov V, Haridi S. P2P resource discovery in grids: models and systems. Future Generation Computer Systems 2007;23:864–78.
Yu H, Bai X, Marinescu DC. Workflow management and resource discovery for an intelligent grid. Parallel Computing 2005;31:797–811.