A NEW EQUATION FOR THE LOAD BALANCE SCHEDULING BASED ON THE SMARANDACHE F-INFERIOR PART FUNCTION
Tatiana Tabirca* Sabin Tabirca**
*University of Manchester, Department of Computer Science
**University College Cork, Department of Computer Science

Abstract. This article represents an extension of [Tabirca, 2000a]. A new equation for upper bounds is obtained based on the Smarandache f-inferior part function. An example involving upper diagonal matrices is given in order to illustrate that the new equation provides a better computation.

1. INTRODUCTION

Loop imbalance is the most important overhead in many parallel applications. Because loop structures represent the main source of parallelism, the scheduling of parallel loop iterations to processors can reduce this overhead. Among the many methods for loop scheduling, load balance scheduling is a recent one; it was proposed by Bull [1998] and developed by Freeman et al. [1999, 2000]. Tabirca [2000] studied this method and proposed an equation for the case when the work is distributed to all the processors.

Consider that there are $p$ processors, denoted in the following by $P_1, P_2, \ldots, P_p$, and a single parallel loop (see Figure 1).

    do parallel i = 1, n
        call loop_body(i);
    end do

Figure 1. Single Parallel Loop

We also assume that the work of the routine loop_body(i) can be evaluated and is given by the function $w: N \to R$, where $w(i) = w_i$ represents the number of the routine's operations or its running time (assume that $w(0) = 0$). The total amount of work for the parallel loop is $\sum_{i=1}^{n} w(i)$. An efficient loop-scheduling algorithm distributes this total amount of work equally among the processors, so that each processor receives a quantity of work equal to $\frac{1}{p}\sum_{i=1}^{n} w(i)$.

Let $l_j$ and $h_j$ be the lower and upper bounds, $j = 1, 2, \ldots, p$, such that processor $j$ executes all the iterations between $l_j$ and $h_j$. These bounds are found by distributing the work equally among the processors:

$$\sum_{i=l_j}^{h_j} w(i) = \frac{1}{p}\sum_{i=1}^{n} w(i) \quad (\forall j = 1, 2, \ldots, p). \qquad (1)$$

Moreover, they satisfy the following equations: $l_1 = 1$.
(2.a)

If we know $l_j$, then $h_j$ is given by

$$\sum_{i=l_j}^{h_j} w(i) = \frac{1}{p}\sum_{i=1}^{n} w(i) = W, \qquad (2.b)$$

$$l_{j+1} = h_j + 1. \qquad (2.c)$$

Suppose that Equation (2.b) is computed by lower approximation. This means that if we have the value $l_j$, then we find $h_j$ as follows:

$$h_j = h \Leftrightarrow \sum_{i=l_j}^{h} w(i) \le W < \sum_{i=l_j}^{h+1} w(i). \qquad (3)$$

The Smarandache f-inferior part function represents a generalisation of the inferior part function $[\,\cdot\,]: R \to Z$, $[x] = k \Leftrightarrow k \le x < k + 1$. If $f: Z \to R$ is a strictly increasing function that satisfies $\lim_{n \to -\infty} f(n) = -\infty$ and $\lim_{n \to \infty} f(n) = \infty$, then the Smarandache f-inferior part function, denoted by $f_{[\,]}: R \to Z$, is defined by [see www.gallup.unm.edu/~smarandache]

$$f_{[\,]}(x) = k \Leftrightarrow f(k) \le x < f(k+1). \qquad (4)$$

Tabirca [2000a] presented some Smarandache f-inferior part functions for which $f(k) = \sum_{i=1}^{k} i^a$. They are presented in the following:

$$f(k) = \sum_{i=1}^{k} i \;\Rightarrow\; f_{[\,]}(x) = \left[\frac{-1 + \sqrt{1 + 8x}}{2}\right], \qquad (5)$$

$$f(k) = \sum_{i=1}^{k} i^2 \;\Rightarrow\; f_{[\,]}(x) = [r(x)] \quad \forall x, \qquad (6)$$

where

$$r(x) = -\frac{1}{2} + \sqrt[3]{\frac{3x}{2} + \sqrt{\left(\frac{3x}{2}\right)^2 - \frac{1}{1728}}} + \sqrt[3]{\frac{3x}{2} - \sqrt{\left(\frac{3x}{2}\right)^2 - \frac{1}{1728}}}.$$

Tabirca [2000] also proposed an equation for the upper bounds of the load balance scheduling method based on the Smarandache f-inferior part function. If the work $w$ satisfies certain conditions [Tabirca, 2000], then the upper bounds are given by

$$h_j^{(1)} = f_{[\,]}(j \cdot W), \quad j = 1, 2, \ldots, p. \qquad (7)$$

Moreover, Tabirca [2000a] applied this method to the product between an upper diagonal matrix and a vector. It was proved that the load balance scheduling method offers the lowest running time in comparison with other static scheduling methods [Tabirca, 2000b].

2. A NEW EQUATION FOR THE UPPER BOUNDS

In this section, a new equation for the upper bounds is introduced. Some theoretical considerations about the new equation and Equation (7) are also made.

Consider that $f: N \to R$ is defined by $f(k) = \sum_{i=1}^{k} w_i$, $f(0) = 0$. For the work $w$, we assume the following [Tabirca, 2000]:

A1: $w_j \le \frac{1}{p}\sum_{i=1}^{n} w_i$, $j = 1, 2, \ldots, n$.
A2: There are equations for the functions $f$ and $f_{[\,]}$.

Theorem 1. The upper bounds of the load balance scheduling method are given by

$$h_j^{(2)} = f_{[\,]}\left(f(h_{j-1}^{(2)}) + W\right), \quad j = 1, 2, \ldots, p, \qquad (8)$$

with $h_0^{(2)} = 0$, which is consistent with $l_1 = 1$.

Proof.
For simplicity, we denote $h_j = h_j^{(2)}$ in the following. Equation (3) gives the upper bounds of the load balance scheduling method. We start from the equation

$$\sum_{i=h_{j-1}+1}^{h_j} w(i) \le W < \sum_{i=h_{j-1}+1}^{h_j+1} w(i)$$

and add $f(h_{j-1}) = \sum_{i=1}^{h_{j-1}} w_i$ to all the sides:

$$\sum_{i=1}^{h_j} w(i) \le f(h_{j-1}) + W < \sum_{i=1}^{h_j+1} w(i).$$

Based on the definition of $f_{[\,]}$, we find that $h_j = f_{[\,]}(f(h_{j-1}) + W)$.

The following theorem shows how these bounds are related.

Theorem 2. $h_j^{(2)} \le h_j^{(1)}$, $j = 1, 2, \ldots, p$.

Proof. Recall that these two upper bounds satisfy
$$\sum_{i=1}^{h_j^{(1)}} w_i \le j \cdot W < \sum_{i=1}^{h_j^{(1)}+1} w_i, \qquad (9.a)$$

$$\sum_{i=h_{k-1}^{(2)}+1}^{h_k^{(2)}} w_i \le W, \quad k = 1, 2, \ldots, j. \qquad (9.b)$$

All the sums from Equation (9.b) are added, giving

$$\sum_{i=1}^{h_j^{(2)}} w_i = \sum_{k=1}^{j} \sum_{i=h_{k-1}^{(2)}+1}^{h_k^{(2)}} w_i \le j \cdot W.$$

Because $h_j^{(1)}$ is the last index satisfying Equation (9.a), we find that $h_j^{(2)} \le h_j^{(1)}$ holds.

Consequence. $f(h_j^{(2)}) \le f(h_j^{(1)}) \le j \cdot W$, $j = 1, 2, \ldots, p$.

This consequence follows from the monotonicity of $f$ and the definition of the bounds.

Now, we have two equations for the upper bounds of the load balance scheduling method. Equation (8) was obtained naturally by starting from the definition of the load balance. It reflects the case when several load balances are performed consecutively. Equation (7) was found by considering the last partial sum that is under $j \cdot W$. This option does not consider any load balance, so we expect it to be not quite as efficient. Moreover, it is difficult to predict which equation is better for a given computation. The best practical advice is to apply both of them and to choose the one that gives the lowest times.

3. COMPUTATIONAL RESULTS

In this section we present an example for the load balance scheduling method. This example deals with the product between an upper diagonal matrix and a vector [Jaja, 1992]. All the computations have been performed on an SGI Power Challenge 2000 parallel machine with 16 processors. The dimension of the matrix was $n = 300$.

    DO PARALLEL i = 1, n
        y_i = a_{i,1} * x_1
        DO j = 2, i
            y_i = y_i + a_{i,j} * x_j
        END DO
    END DO

Figure 2. Parallel Computation for the Upper Matrix - Vector Product.

Recall that $a = (a_{i,j})_{i,j=1,\ldots,n} \in M_n(R)$ is upper diagonal if $a_{i,j} = 0$ for $i < j$. The product $y = a \cdot x$ between an upper diagonal matrix $a = (a_{i,j})_{i,j=1,\ldots,n} \in M_n(R)$ and a vector $x \in R^n$ is given by

$$y_i = \sum_{j=1}^{i} a_{i,j} \cdot x_j, \quad \forall i = 1, 2, \ldots, n. \qquad (10)$$

The parallel computation of Equation (10) is shown in Figure 2. The work of iteration $i$ is given by $w(i) = i$, $i = 1, 2, \ldots, n$. The total work is $f(n) = \sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ and $W = \frac{n(n+1)}{2p}$.
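As a rough illustration of how Equations (7) and (8) behave for this workload, the following Python sketch computes both families of upper bounds directly from the partial sums $f(k)$. It is not the authors' implementation: the names `prefix_work`, `f_inferior`, `bounds_eq7` and `bounds_eq8` are hypothetical, the f-inferior part is restricted to the indices $0, \ldots, n$, and exact fractions are used for $W$ only to avoid rounding at the boundaries.

```python
from bisect import bisect_right
from fractions import Fraction
from itertools import accumulate


def prefix_work(w, n):
    """Return [f(0), f(1), ..., f(n)] with f(k) = w(1) + ... + w(k)."""
    return [0] + list(accumulate(w(i) for i in range(1, n + 1)))


def f_inferior(f, x):
    """Largest k in 0..n with f[k] <= x (assumes positive work, so f increases)."""
    return bisect_right(f, x) - 1


def bounds_eq7(f, p):
    """Equation (7): h_j = f_[](j * W) for j = 1..p."""
    W = Fraction(f[-1]) / p  # per-processor share of the total work
    return [f_inferior(f, j * W) for j in range(1, p + 1)]


def bounds_eq8(f, p):
    """Equation (8): h_j = f_[](f(h_{j-1}) + W), starting from h_0 = 0."""
    W = Fraction(f[-1]) / p
    h, bounds = 0, []
    for _ in range(p):
        h = f_inferior(f, f[h] + W)
        bounds.append(h)
    return bounds


# Workload of Section 3: w(i) = i with n = 300; p = 8 is one of the tested sizes.
n, p = 300, 8
f = prefix_work(lambda i: i, n)
h1 = bounds_eq7(f, p)
h2 = bounds_eq8(f, p)
print("Equation (7) bounds:", h1)
print("Equation (8) bounds:", h2)
# Theorem 2 predicts h2[j] <= h1[j] for every j.
print("Theorem 2 holds:", all(a <= b for a, b in zip(h2, h1)))
```

Because each step of Equation (8) rounds down, the last bound it produces in this sketch can fall below $n$; the text does not say how any remaining iterations are assigned.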
The Smarandache f-inferior part function is $f_{[\,]}(x) = \left[\frac{-1 + \sqrt{1 + 8x}}{2}\right]$. Therefore, the upper bounds of the load balance scheduling method are given by

$$h_j^{(1)} = \left[\frac{-1 + \sqrt{1 + 4 \cdot j \cdot \frac{n(n+1)}{p}}}{2}\right], \quad j = 1, 2, \ldots, p \qquad (11)$$

or

$$h_j^{(2)} = \left[\frac{-1 + \sqrt{1 + 4 \cdot h_{j-1}^{(2)} \cdot (h_{j-1}^{(2)} + 1) + 4 \cdot \frac{n(n+1)}{p}}}{2}\right], \quad j = 1, 2, \ldots, p. \qquad (12)$$

The running times for these two types of upper bounds are presented in Table 1. Figure 3 shows that the two types of bounds for the load balance scheduling give comparable running times.

    Bounds      P=1     P=2     P=3     P=6     P=8
    h_j^(1)     1.847   1.347   0.987   0.750   0.482
    h_j^(2)     1.842   1.258   0.832   0.639   0.412

Table 1. Times of the computation.

4. FINAL CONCLUSION

An important remark is that the Smarandache f-inferior part function was applied successfully to solve an important scheduling problem. Based on it, two equations for the upper bounds of the load balance scheduling method have been found. These equations have been used to compute the product between an upper diagonal matrix and a vector, and the computational times were quite similar. The upper bounds given by the new equation have provided a better computation for this problem.

Figure 3. Graphics of the Running Times.

REFERENCES

Bull, J.M. (1998) Feedback Guided Loop Scheduling: Algorithm and Experiments, Proceedings of Euro-Par'98, Lecture Notes in Computer Science, Springer Verlag.

Bull, J.M., R.W. Ford, T.L. Freeman and D.J. Hancock (1999) A Theoretical Investigation of Feedback Guided Loop Scheduling, Proceedings of Ninth SIAM Conference on Parallel Processing for Scientific Computing, SIAM Press.

Bull, J.M. and T.L. Freeman (2000) Convergence of Feedback Guided Loop Scheduling, Technical Report, Department of Computer Science, University of Manchester.

Jaja, J. (1992) Introduction to Parallel Algorithms, Addison-Wesley.

Tabirca, T. and S. Tabirca (2000) Balanced Work Loop-Scheduling, Technical Report, Department of Computer Science, Manchester University. [In Preparation]

Tabirca, T. and S. Tabirca (2000a) A Parallel Loop Scheduling Algorithm Based on The Smarandache f-Inferior Part Function, Smarandache Notions Journal [to appear].