
MIT International Journal of Computer Science & Information Technology, Vol. 1, No. 1, Jan. 2011, pp. 8-14
ISSN 2230-7621 (Print Version), 2230-763X (Online Version), MIT Publications

A Comparative Study of Soft Computing Approaches for Mapping Tasks to Grid Heterogeneous System

Smitha Jha
Assistant Professor, BIT, Noida, India

Abstract- Scheduling tasks to a Grid heterogeneous system is an NP-complete problem, so various heuristic methods that give approximately optimal solutions are applied to it. Common heuristics include Min-min, Max-min, OLB (Opportunistic Load Balancing), UDA (User Directed Assignment), and Fast Greedy. Heuristic methods based on soft computing approaches such as Genetic algorithms and Neural Networks are used for Grid scheduling problems with large solution spaces. The Neural Network approach is useful when communication delay matters in determining the completion time of a task on a particular machine. The problem can be modelled as a single- or multi-layer neural network in which tasks are the neurons at the input layer, machines are the neurons at the output layer, and the weights on the edges are the sums of machine availability time, execution time, and communication delay. In this paper two Neural Network models are proposed for the processing phase, i.e. how tasks can be scheduled to machines so that they complete with minimum makespan. A comparative study of the proposed Neural Network models and a Genetic algorithm [6] for scheduling tasks on Grid heterogeneous machines is carried out in terms of time complexity. In all cases the number of machines (resources) is kept equal to the number of tasks.

1. INTRODUCTION
Grid computing is the combination of computer resources from multiple administrative domains applied to a common task, usually a scientific, technical, or business problem that requires a great number of computer processing cycles or the processing of large amounts of data. One of the main strategies of grid computing is to use software to divide and apportion pieces of a program among several computers, sometimes up to many thousands. A Grid is a form of distributed computing in which a super, virtual computer is composed of a cluster of networked, loosely coupled computers acting in concert to perform very large tasks. What distinguishes grid computing from conventional cluster computing is that grids tend to be more loosely coupled, heterogeneous, and geographically dispersed. Also, while a computing grid may be dedicated to a specialized application, it is often constructed with the aid of general-purpose grid software libraries and middleware [4].
Task scheduling is an important concept in parallel and distributed system performance. Several algorithms have been developed for studying the performance of scheduling on Grid heterogeneous systems, and some of them give good performance in terms of time complexity.
In a Grid environment, the assignment of tasks to machines in a heterogeneous system is categorized in two ways:
Independent task scheduling - a set of independent tasks is assigned to machines according to the load of the resources, in order to achieve high system throughput.
Dependent task scheduling - the tasks composing a job have precedence orders, i.e. if a job consists of three tasks T1, T2, T3 with order T1 -> T2 -> T3, then T1 must be completed first, then T2, and only after T1 and T2 should T3 be completed. In independent task scheduling the task executions are independent, while in dependent task scheduling the tasks are interdependent. To model dependent task scheduling we therefore need to put extra constraints or conditions on the model of independent task scheduling; for example, a directed acyclic graph is created for a job with dependent tasks (a sketch is given below) [3].
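As a small illustration (not part of the original paper, and using an invented adjacency-list encoding), the dependent job T1 -> T2 -> T3 can be represented as a directed acyclic graph whose topological order gives a valid execution sequence:

from collections import deque

def topological_order(tasks, edges):
    # succ[u] lists the tasks that must wait for u; indeg[v] counts v's unfinished prerequisites
    succ = {t: [] for t in tasks}
    indeg = {t: 0 for t in tasks}
    for u, v in edges:                      # edge (u, v): u must finish before v starts
        succ[u].append(v)
        indeg[v] += 1
    ready = deque(t for t in tasks if indeg[t] == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    if len(order) != len(tasks):
        raise ValueError("dependency graph contains a cycle")
    return order

print(topological_order(["T1", "T2", "T3"], [("T1", "T2"), ("T2", "T3")]))   # -> ['T1', 'T2', 'T3']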
Several techniques [1] have been developed in this area, such as OLB (Opportunistic Load Balancing), which assigns each task, in arbitrary order, to the next available machine, regardless of the task's expected execution time on that machine; UDA (User Directed Assignment), which assigns each task, in arbitrary order, to the machine with the best execution time for that task, regardless of that machine's availability; and Fast Greedy, which assigns each task, in arbitrary order, to the machine with the minimum completion time for that task. There also exist heuristic methods using soft computing approaches. In this paper two models are proposed for scheduling using the neural network approach and compared with scheduling through the GA approach [6], for independent task scheduling.
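To make the heuristics above concrete, the following sketch (illustrative only; the ETC matrix and availability values are invented) implements the Fast Greedy rule, mapping each task in arrival order to the machine with the minimum completion time:

def fast_greedy(etc, avail):
    # etc[i][j]: execution time of task i on machine j; avail[j]: machine availability time
    avail = list(avail)
    mapping = []
    for row in etc:
        completion = [avail[j] + row[j] for j in range(len(avail))]
        best = completion.index(min(completion))   # machine with minimum completion time
        avail[best] = completion[best]             # that machine is busy until the task ends
        mapping.append(best)
    return mapping, max(avail)                     # assignment and resulting makespan

etc = [[4, 6, 9], [7, 3, 5], [8, 8, 2], [5, 4, 6]]  # 4 tasks x 3 machines (made-up values)
print(fast_greedy(etc, [0, 0, 0]))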
In the first proposed neural network model (NeuralNetworkModel_1) the tasks are taken as the inputs to the neurons at the input layer, and the neurons at the output layer represent machines. The model is initialized by assigning every task to all the machines, with weights equal to the completion time of the task on the particular machine. The model is trained for a set of independent tasks, i.e. until the outputs at all neurons are nearly equal to the target value. In the trained system the task with minimum weight is assigned to the corresponding machine; this task is removed from the list of unmapped tasks (U) and the machine is removed from the list of free machines. The procedure is repeated till all tasks in U are mapped to machines.
In the second proposed neural network model (NeuralNetworkModel_2) the features of tasks are taken as the training set to train the neural network. In the trained system, a task's feature values are applied at the inputs, and depending on the output value the task is assigned to a machine.
Section 2 discusses the general working of independent scheduling through the GA approach and its time complexity. Section 3 discusses the proposed NeuralNetworkModel_1 and the time complexity of the corresponding pseudocode. Section 4 discusses NeuralNetworkModel_2 and the time complexity of the corresponding pseudocode. Time complexity is expressed in asymptotic (Big-O) notation. Section 5 compares the time complexity of the GA approach, NeuralNetworkModel_1, and NeuralNetworkModel_2. Section 6 gives the conclusion and Section 7 gives the references.

2. GENETIC ALGORITHM
The steps in the genetic algorithm are as follows.
Step 1-Initial population generation (Initial population of
chromosomes is generated using uniform distribution,
where each chromosome represents a particular sequence of
assigning tasks to machines.)
Step 2-Evaluation (Calculating the total execution time for
each sequence of assignment).
Step 3-While (stopping criteria not met)
{
selection( )
crossover( )
mutation( )
evaluation( )
}
The detailed descriptions of the above methods are as
follows.
Step 1. Population creation() [A set of 200 chromosomes is generated from a uniform distribution for a given metatask. Each chromosome is a t x 1 vector, where position i (0 < i < t) represents task i and the entry at position i is the machine to which the task has been assigned.]

Fig. 1: An example chromosome, a one-dimensional array of length 5 in which index i represents a task and the value stored at index i is the machine assigned to it.

Step 2. Evaluation (population, all the chromosomes) [The population is evaluated (ranked) based on fitness value (makespan), a smaller fitness value indicating a better mapping.]
Step 3.
Do
{
Step 3.1 Selection() [This narrows down the solution space to chromosomes with better fitness values.]
Step 3.2 Cross-over() [Selects a pair of chromosomes, finds a random point in the first chromosome, and exchanges the assignments of machines to tasks beyond that point.]
Step 3.3 Mutation() [After crossover, a task is randomly selected and reassigned to a randomly chosen machine.]
Step 3.4 Evaluation() [Finally the chromosomes resulting from these operations are evaluated again.]
} while (No. of total iterations < 100)
If no stopping criterion is met, a new population is chosen again and the process repeats.

2.1 GENETIC PSEUDOCODE


The pseudocode of the Genetic algorithm is as follows.
Population generation() [A set of chromosomes is generated representing different arrangements of assignments of tasks to machines.] To implement this, each chromosome is represented by a single-dimensional array, for example the array c[5] shown in Fig. 1. Here the indices represent tasks and the values stored at these indices represent machines. So a structure is defined as:
struct chromosome
{
    int c[10];   /* c[i] = machine assigned to task i, for a maximum of 10 tasks */
    int ct;      /* completion time (fitness value) */
} chromosome[200];
Here c[10] represents a particular chromosome with a maximum of 10 tasks.
So, to create 200 chromosomes, the code is:
for i = 1 to 200 do
{
    for j = 1 to n (= 10) do
    {
        chromosome[i].c[j] = read the value from the user;
    }
}
Evaluation(): Consider a chromosome represented by Fig. 1. The fitness value of this chromosome is the completion time of all tasks under that assignment to machines:
for i = 1 to 200
{
    for j = 1 to n   /* n is the number of tasks, 5 here */
    {
        chromosome[i].ct += j * chromosome[i].c[j];
    }
}
Selection():
The fitness value of the ith chromosome is chromosome[i].ct.
for i = 1 to 200
{
    totalfitnessvalue += chromosome[i].ct;
}
for i = 1 to 200
{
    p[i] = chromosome[i].ct / totalfitnessvalue;
    if (Random() < p[i])
    {
        read no-of-copy from the user;
        for j = 1 to no-of-copy
        {
            for k = 1 to n
            {
                chromosome[j].c[k] = chromosome[i].c[k];
            }
        }
    }
}
Cross-Over():
Let us consider a single-point crossover with the crossover point at index 3, i.e. the assignments from position 3 onwards are exchanged:
for (i = 1; i < 200; i += 2)
{
    for j = 3 to n
    {
        store-value = chromosome[i].c[j];
        chromosome[i].c[j] = chromosome[i+1].c[j];
        chromosome[i+1].c[j] = store-value;
    }
}
Mutation(): Let us consider the chromosome shown in Fig. 1. To get a better completion time for the system, the 1st and 3rd positions are mutated, i.e. c[1] is replaced by 3 and c[3] is replaced by 2.
for (i = 1; i < 200; i++)
{
    for (j = 1; j < n; j++)
    {
        if (random() < p(chromosome[i].c[j]))   /* p is the probability of mutation associated with each position of the array */
        {
            chromosome[i].c[j] = any arbitrary value from the machines;
        }
    }
}
Time complexity of GA = k(T(Population creation) + T(Evaluation) + T(Selection) + T(Cross-over) + T(Mutation)), where k is the number of iterations.
T(Population creation) = 200*n = O(n)
T(Evaluation) = 200*n = O(n)
T(Selection) = 200 + 200*no-of-copy*n = O(n)
T(Cross-over) = 100*n = O(n)
T(Mutation) = O(n)
So T(GA) = k*O(n) = O(kn).
When k is a constant, T(GA) = O(n).
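The pseudocode fragments above can be tied together in a compact, runnable sketch of the same GA loop. This is only an illustration of the steps described in Section 2: the ETC matrix, mutation probability, and truncation-style selection used here are assumptions, while the population size (200) and iteration count (100) follow the text.

import random

def makespan(chrom, etc):
    # fitness: finishing time of the most loaded machine under this task-to-machine assignment
    load = [0.0] * len(etc[0])
    for task, machine in enumerate(chrom):
        load[machine] += etc[task][machine]
    return max(load)

def ga_schedule(etc, pop_size=200, iterations=100, p_mut=0.05):
    n_tasks, n_machines = len(etc), len(etc[0])
    # Step 1: initial population drawn from a uniform distribution
    pop = [[random.randrange(n_machines) for _ in range(n_tasks)] for _ in range(pop_size)]
    for _ in range(iterations):
        # Step 2 / 3.4: evaluate and rank (smaller makespan = better mapping)
        pop.sort(key=lambda c: makespan(c, etc))
        # Step 3.1: selection - keep the better half and duplicate it (truncation selection)
        pop = pop[:pop_size // 2] + [c[:] for c in pop[:pop_size // 2]]
        # Step 3.2: single-point crossover on consecutive pairs
        for i in range(0, pop_size - 1, 2):
            point = random.randrange(1, n_tasks)
            pop[i][point:], pop[i + 1][point:] = pop[i + 1][point:], pop[i][point:]
        # Step 3.3: mutation - reassign a randomly chosen task to a random machine
        for chrom in pop:
            for j in range(n_tasks):
                if random.random() < p_mut:
                    chrom[j] = random.randrange(n_machines)
    best = min(pop, key=lambda c: makespan(c, etc))
    return best, makespan(best, etc)

etc = [[4, 6, 9], [7, 3, 5], [8, 8, 2], [5, 4, 6], [6, 2, 7]]  # made-up 5-task x 3-machine ETC matrix
print(ga_schedule(etc))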

3. NEURALNETWORKMODEL_1
The proposed model can be depicted by the block diagram shown in Fig. 2.
In the proposed neural network model of Grid scheduling, the output at any particular neuron of the output layer is given by equations (1) and (2):

O_j = f(net_j)                          (1)
net_j = Σ_{i=1}^{10} I_i · w_{i,j}      (2)

where O_j is the output at the jth neuron of the output layer and net_j is the sum of the products of the inputs I_i and the weights w_{i,j}.
In the proposed model the tasks are taken as the inputs at the input layer, so the input vector is assigned the task values:

I = T                                   (3)

e.g. I[1] = 1, I[2] = 2, ..., I[n] = n, where the task values 1, 2, ..., n may be assigned according to the hardness of the task or the resource platform needed.

Fig. 2: Block diagram of NeuralNetworkModel_1 - a backpropagation network with the input layer (a set of tasks) and the output layer (a set of machines).

The weights are given by equation (4):

w_{i,j} = ct_{i,j} = cd_{i,j} + mat_j + etc_{i,j}      (4)

where
ct_{i,j} is the completion time of the ith task on the jth machine,
cd_{i,j} is the communication delay for the ith task to reach the jth machine,
mat_j is the machine availability time of the jth machine, i.e. the time to execute the previously allocated tasks,
etc_{i,j} is the execution time to compute the ith task on the jth machine, and
f is an output function [5].
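As an illustration of equations (1)-(4) (not from the paper: the output function f is taken here to be a sigmoid and the numerical values are invented), the edge weights and the output at a machine neuron can be computed as follows:

import math

def f(x):
    return 1.0 / (1.0 + math.exp(-x))       # assumed output function; the paper leaves f generic

def edge_weights(cd, mat, etc):
    # equation (4): w[i][j] = ct[i][j] = cd[i][j] + mat[j] + etc[i][j]
    n, m = len(etc), len(etc[0])
    return [[cd[i][j] + mat[j] + etc[i][j] for j in range(m)] for i in range(n)]

def output_at(j, I, w):
    # equations (1)-(2): net_j = sum_i I[i] * w[i][j], O_j = f(net_j)
    net_j = sum(I[i] * w[i][j] for i in range(len(I)))
    return f(net_j)

cd  = [[1, 2], [2, 1], [3, 1]]               # communication delays (3 tasks x 2 machines, invented)
mat = [0, 4]                                 # machine availability times
etc = [[5, 7], [6, 3], [4, 8]]               # execution times
w = edge_weights(cd, mat, etc)
I = [1, 2, 3]                                # equation (3): I = T, task values 1..n
print([output_at(j, I, w) for j in range(2)])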

3.1 NEURALNETWORKMODEL_1 ALGORITHM


Here we present the neural network algorithm for independent task scheduling. The input layer consists of tasks and the output layer consists of machines; the edge from task i to machine j carries the weight w_{i,j}. The purpose is to assign tasks to appropriate machines with minimum completion time, i.e. for minimum makespan. Ta is the target (optimal) value taken for training.

Step 1: All tasks in the input layer are assigned to all machines in the output layer with weights w_{i,j}.
Step 2: At every jth machine calculate net_j = Σ_i T_i · w_{i,j}, where all T_i = i.
Step 3: If f(net_j) ≠ Ta, then the weight change ∂E/∂w is calculated, where E = ½(Ta - O)² and O = f(net_j).
Step 4: The tasks with changed weights are assigned again to all machines.
Step 5: Steps 2-4 are repeated till the model is trained, i.e. at every output neuron the output is nearly equal to the target value.
Step 6: After training, the task with minimum weight is scheduled to the corresponding machine, i.e. the ith task is scheduled to the jth machine if w_{i,j} is minimum over all j.

3.2 NEURALNETWORKMODEL_1 PSEUDOCODE


Here we present the pseudocode of the proposed NeuralNetworkModel_1 for scheduling.
NeuralNetwork_approach_scheduling(T[i], M[j], w[i][j])
[T[] is an array of tasks, M[] is an array of machines, w[][] is a matrix representing weights]
{
    Training(T[i], w[i][j]);
    Assignment(T[i], M[j]);
}
Procedure Training(T[i], w[i][j])
[Trains the system till O_j is nearly equal to Ta for every j; a and b are the task and machine indices]
{
    do
    {
        b = 0;
        while (b++ < j)
        {
            for (a = 1; a <= i; a++)
                net[b] += T[a] * w[a][b];
        }
        for (b = 0; b < j; b++)
        {
            if (f(net[b]) ≠ Ta)
            {
                for (a = 0; a < i; a++)
                    w[a][b] += Δw;   [Δw is computed from ∂E/∂w as in Step 3]
            }
        }
    } while (the output at every neuron is not yet nearly equal to Ta);
}
Procedure Assignment(T[i], M[j])
[After the network is trained with the given set of tasks and weights, the task with minimum weight is assigned to the corresponding machine]
{
    a = 0;
    while (a++ <= i)
    {
        min = w[a][1]; s = 1;
        for (k = 2; k <= j; k++)
        {
            if (min > w[a][k])
            {
                min = w[a][k]; s = k;
            }
        }
        T[a] => M[s];   [Task a is assigned to machine s]
    }
}
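A minimal runnable sketch of the training and assignment procedures above is given below. It assumes a sigmoid output function, a gradient step derived from E = ½(Ta - O)², and equal numbers of tasks and machines, as the paper does; the learning rate, tolerance, iteration cap, and initial weights are invented.

import math

def train_and_assign(T, w, Ta=0.5, eta=0.05, tol=1e-3, max_iter=1000):
    f = lambda x: 1.0 / (1.0 + math.exp(-x))             # assumed output function
    n, m = len(w), len(w[0])                             # n tasks, m machines (n = m here)
    for _ in range(max_iter):
        net = [sum(T[a] * w[a][b] for a in range(n)) for b in range(m)]
        out = [f(x) for x in net]
        if all(abs(o - Ta) < tol for o in out):          # trained: every output near the target
            break
        for b in range(m):
            # E = 0.5*(Ta - O_b)^2 ; gradient step uses (Ta - O_b)*f'(net_b)*T[a]
            delta = (Ta - out[b]) * out[b] * (1.0 - out[b])
            for a in range(n):
                w[a][b] += eta * delta * T[a]
    # assignment: each task goes to its minimum-weight machine, which is then no longer free
    mapping, free = {}, set(range(m))
    for a in range(n):
        b = min(free, key=lambda j: w[a][j])
        mapping[a] = b
        free.discard(b)
    return mapping

w = [[0.7, 0.9, 0.6],
     [0.5, 0.8, 0.9],
     [0.6, 0.4, 0.7]]                                    # invented initial weights ct[i][j]
print(train_and_assign([1, 2, 3], w))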

The time complexity of the pseudocode of NeuralNetworkModel_1 is
T(NeuralNetworkModel_1) = T(training) + T(assignment).
T(training) = O(k·m·n·m) = O(n·m²) = O(n³), when k is a constant value and n = m, where n is the number of tasks, m is the number of machines, and k is the number of training iterations.
T(assignment) = O(n²)
T(proposed model_1) = T(training) + T(assignment) = O(n³) + O(n²) = O(n³).
When k is not a constant, T(proposed model_1) = O(kn³).

4. NEURALNETWORKMODEL_2
Here the features of a task are taken at the input layer; the number of neurons at the input layer equals the number of features of a task. The output layer contains a single neuron, and different machines are assigned particular output values. The hidden layer may have any number of neurons. The system is trained with various sample tasks. The test task is then applied to the trained system, and the output value tells which machine to assign it to.
This model is depicted by the block diagram in Fig. 3.
In this proposed Model 2, the features of the task to be scheduled are taken at the input layer. The features can be categorized as follows (feature values are kept between 0 and 1); a small encoding example is given after the list.


1. Hardness - 0 means minimum hardness, 1 means maximum hardness.
2. Required platform (machine heterogeneity) - different values for different platforms required, e.g. 0.1 for a Linux OS, 0.2 for Windows, 0.3 for Windows with some specific RAM size, and so on.
3. Dependency - independent: 0.0; less dependent: 0.0-0.4; fully dependent: 1.0.
4. Size of the task - small size (containing less data): 0-0.5; large size (containing more data): 0.6-1.0.
5. Application type - computation-centric: 0-0.5; data-centric: 0.6-1.0.
Machines are scheduled according to the output at the output layer.
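As a small illustration of the encoding above (the numbers chosen here are the test-task values used later in this section), one task's features can be coded into a 0-1 input vector:

# Feature vector for one task, each value scaled into [0, 1] per the categories above
task_features = {
    "F1_hardness":   0.2,   # 0 = minimum hardness, 1 = maximum hardness
    "F2_platform":   0.5,   # e.g. 0.1 Linux, 0.2 Windows, 0.3 Windows with specific RAM, ...
    "F3_dependency": 0.4,   # 0.0 independent ... 1.0 fully dependent
    "F4_size":       0.1,   # 0-0.5 small task, 0.6-1.0 large task
    "F5_app_type":   0.6,   # 0-0.5 computation-centric, 0.6-1.0 data-centric
}
input_vector = list(task_features.values())   # fed to the input layer of the network
print(input_vector)                           # [0.2, 0.5, 0.4, 0.1, 0.6]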

Fig. 3: Block diagram of NeuralNetworkModel_2 - a backpropagation network with an input layer (a set of task features) and an output layer (one output machine value).
Table 1

Output at the output layer   Machine scheduled
0.1                          M1
0.2                          M2
0.3                          M4
and so on

From the table it is clear that if, on applying the features of a particular task, the output comes to 0.1, then this task is scheduled to machine M1, and so on.
Now we define the dataset as follows. A set of training data is taken, shown in Table 2 on the last page, where the column headings are defined as follows:
F1 - hardness
F2 - platform dependency
F3 - task dependency
F4 - size of task
F5 - type of application
With these data a two-layer backpropagation network is trained, with an initial weight matrix w[] for the input layer and a weight matrix v[] for the hidden layer. At the input layer the coded values of the features Fi are applied; the processing of the inputs through the network is given in the pseudocode of Section 4.1.
While training, the change in weights is proportional to (∂E/∂w)·Fi.
Let this trained system be named NeuroGridSchedular.

Using this scheduler, suppose a test task with the feature values
F1 = 0.2, F2 = 0.5, F3 = 0.4, F4 = 0.1, F5 = 0.6
is to be scheduled. These feature values are applied at the inputs, and the output is calculated using the backpropagation network.
In this backpropagation network the initial values of the weights may be taken as arbitrary small values, i.e. an input-layer weight matrix W[ ] and a hidden-layer weight matrix V[ ] with entries between 0.0 and 0.9.
Suppose that for the given feature values the output is 0.24. This is very near to 0.2, so the task is scheduled to machine M2 according to Table 1.
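A minimal sketch of this decision step is given below (illustrative only: the sigmoid activation, the weight values, and the nearest-value lookup into Table 1 are assumptions; the feature values are the test task's):

import math

def forward(features, W, V):
    # one forward pass: features -> hidden layer (weights W) -> single output neuron (weights V)
    f = lambda x: 1.0 / (1.0 + math.exp(-x))                        # assumed activation
    hidden = [f(sum(x * W[i][h] for i, x in enumerate(features)))   # OPH[h] = f(IPH[h])
              for h in range(len(V))]
    return f(sum(h * V[j] for j, h in enumerate(hidden)))           # OPO = f(IPO)

def nearest_machine(output, table):
    # pick the machine whose Table 1 value is closest to the network output
    value = min(table, key=lambda v: abs(v - output))
    return table[value]

features = [0.2, 0.5, 0.4, 0.1, 0.6]               # F1..F5 of the test task
W = [[0.0, 0.1, 0.4],                              # invented 5x3 input-to-hidden weights
     [0.2, 0.4, 0.3],
     [0.5, 0.3, 0.9],
     [0.1, 0.5, 0.7],
     [0.2, 0.6, 0.8]]
V = [0.1, 0.5, 0.7]                                # invented hidden-to-output weights
table = {0.1: "M1", 0.2: "M2", 0.3: "M4"}          # the Table 1 mapping
out = forward(features, W, V)
print(round(out, 2), "->", nearest_machine(out, table))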

4.1 PSEUDOCODE OF NEURALNETWORKMODEL_2

NeuralNetwork_approach_scheduling(T[i], M[j], w[i][j], v[i][j])
[T[] is an array of tasks, M[] is an array of machines, w[][] and v[][] are matrices representing weights, s is the number of training data]
{
    no_of_trainingdata = 1;
    do
    {
        Training(T[i], w[i][j], v[i][j]);
    } while ((no_of_trainingdata++) < s);
    Assignment(T[i], M[j]);
}

Procedure Training(T[i], w[i][j], v[i][j])
[Trains the system till O_j is nearly equal to Ta for every j; a and b are the feature and machine indices, f is the number of features, m is the number of machines]
{
    do
    {
        b = 1;
        while (b++ < m)
        {
            for (a = 1; a <= f; a++)
                net[b] += T[a] * w[a][b];
        }
        for (b = 0; b < m; b++)
        {
            if (f(net[b]) ≠ Ta)
            {
                for (a = 0; a < f; a++)
                    w[a][b] += Δw;   [Δw is computed from ∂E/∂w]
            }
        }
    } while (the output at every neuron is not yet nearly equal to Ta);
}

Procedure Assignment(T[i], M[j])
[After the network is trained with the given set of tasks and weights, the features of the test task are applied to the trained system. NoI - number of input neurons, NOH - number of hidden neurons, F[i] - feature values, IPI/OPI - inputs/outputs at the input layer, IPH/OPH - inputs/outputs at the hidden layer, IPO/OPO - input/output at the output neuron]
{
    for i = 1 to f
    {
        IPI[i] = F[i];
        OPI[i] = IPI[i];
    }
    for i = 1 to NOH
    {
        for j = 1 to NoI
            IPH[i] += OPI[j] * w[j][i];
        OPH[i] = f(IPH[i]);
    }
    for j = 1 to NOH
        IPO += OPH[j] * v[j];
    OPO = f(IPO);
    Test Task => m[normalised value(OPO)];   [m is a matrix representing Table 1]
}

The time complexity of the pseudocode of NeuralNetworkModel_2 is calculated as follows.
T(assignment) = O(NoI * NOH) = O(1), since NoI and NOH have constant values.
T(training) = s*k*O(m*f*m) = O(kn²), where s and f are constant values and n = m.
T(NeuralNetworkModel_2) = T(training) + T(assignment) = O(kn²) + O(1) = O(kn²)
= O(n²) when k is constant.

5. COMPARISON OF GA, NEURAL NETWORK MODEL_1 AND NEURAL NETWORK MODEL_2 BASED ON TIME COMPLEXITY

The time complexities of the algorithms of the above models are:
GA   - O(Kn)
NN_1 - O(Kn³)
NN_2 - O(Kn²)
A comparison of the above models is depicted in Graph 1 (number of tasks constant, n = 5) and Graph 2 (K constant, maximum number of iterations kmax = 100), shown as Fig. 4 and Fig. 5 at the end of the paper.

6. CONCLUSION
The simulation study in this work was done by writing pseudocode for the Genetic algorithm approach, NeuralNetworkModel_1, and NeuralNetworkModel_2, and finding their time complexity. This time complexity gives the average completion time in terms of the maximum number of iterations, the number of tasks, and the number of machines. In these models we need to know the Grid environment information, i.e. machine heterogeneity, task heterogeneity, size of tasks, and so on. From the comparative study it has been found that the GA gives better performance than the proposed Neural Network models for independent task scheduling.

ACKNOWLEDGEMENT
I would like to acknowledge and extend my heartfelt
gratitude to Dr. P.C. Saxena, Rtd. Professor, J.N.U. New
Delhi for his valuable guidance and support.

REFERENCES

[1] R. Armstrong, D. Hensgen, and T. Kidd, "The relative performance of various mapping algorithms is independent of sizable variance in run-time predictions," 7th IEEE Heterogeneous Computing Workshop (HCW '98), Mar. 1998, pp. 79-87.
[2] R.F. Freund and H.J. Siegel, "Heterogeneous Processing," IEEE Computer, Vol. 26, No. 6, June 1993, pp. 78-86.
[3] Fangpeng Dong and Selim G. Akl, "Scheduling Algorithms for Grid Computing: State of the Art and Open Problems," Technical Report No. 2006-504.
[4] Bart Jacob, Michael Brown, Kentaro Fukui, and Nihar Trivedi, Introduction to Grid Computing.
[5] S. Rajasekaran and G.A. Vijayalakshmi Pai, Neural Networks, Fuzzy Logic, and Genetic Algorithms: Synthesis and Applications.
[6] Tracy D. Braun, Howard Jay Siegel, Noah Beck, Ladislau L. Boloni, Muthucumaru Maheswaran, Albert I. Reuther, James P. Robertson, Mitchell D. Theys, Bin Yao, Debra Hensgen, and Richard F. Freund, "A Comparison Study of Static Mapping Heuristics for a Class of Meta-tasks on Heterogeneous Computing Systems."

Table 2

Training Data no.   F1    F2    F3    F4    F5    Machines
1                   0.2   0.5   0.1   0.6   0.7   0.1 (M1)
2                   0.5   0.6   0.1   0.9   0.6   0.2 (M2)
3                   0.5   0.8   0.2   0.3   0.4   0.3 (M3)
...                 ...   ...   ...   ...   ...   ...

Fig. 4 (Graph 1): Time complexity of GA, NeuralNetworkModel_1, and NeuralNetworkModel_2 with respect to the number of iterations, with the number of tasks fixed at 5. GA gives the best performance for all values of iterations; for smaller numbers of iterations GA and NeuralNetworkModel_2 have nearly the same time complexity, but as the number of iterations increases GA gives better time complexity than NeuralNetworkModel_1.

Fig. 5 (Graph 2): Time complexity of GA, NeuralNetworkModel_1, and NeuralNetworkModel_2 with respect to the number of tasks, with the number of iterations fixed at 100. NeuralNetworkModel_2 and GA have lower time complexity than NeuralNetworkModel_1 for a fixed number of iterations; for smaller numbers of tasks all three have nearly the same order of time complexity.
