
CHAPTER 1

INTRODUCTION
1.1 Overview

1.1.1 Distributed System

The computing power of any distributed system can be realized by allowing its constituent computational elements (CEs), or nodes, to work cooperatively so that large loads are allocated among them in a fair and effective manner. Any strategy for load distribution among CEs is called load balancing (LB). An effective LB policy ensures optimal use of the distributed resources, whereby no CE remains idle while any other CE is being utilized. In many of today's distributed-computing environments, the CEs are linked by a delay-limited and bandwidth-limited communication medium that inherently inflicts tangible delays on internode communications and load exchange. Examples include distributed systems over wireless local-area networks (WLANs) as well as clusters of geographically distant CEs connected over the Internet, such as PlanetLab. Although the majority of LB policies developed heretofore take account of such time delays, they are predicated on the assumption that the delays are deterministic.

In actuality, delays are random in such communication media, especially in the case of WLANs. This is attributable to uncertainties associated with the amount of traffic, congestion, and other unpredictable factors within the network. Furthermore, unknown characteristics (e.g., type of application and load size) of the incoming loads cause the CEs to exhibit fluctuations in runtime processing speeds. Earlier work by our group has shown that LB policies that do not account for this delay randomness may perform poorly in practical distributed-computing settings where random delays are present. For example, if nodes have dated, inaccurate information about the state of other nodes, due to random communication delays between nodes, this could result in unnecessary periodic exchange of loads among them. Consequently, certain nodes may become idle while loads are in transit, a condition that would prolong the total completion time of a load. Generally, the performance of LB in delay-prone environments depends upon the selection of balancing instants as well as the level of load exchange allowed between nodes. For example, if the network delay is negligible within the context of a certain application, the best performance is achieved by allowing every node to send its entire excess load (e.g., relative to the average load per node in the system) to less-occupied nodes. On the other hand, in the extreme case for which the network delays are excessively large, it would be more prudent to reduce the amount of load exchange so as to avoid time wasted while loads are in transit. Clearly, in a practical delay-limited distributed-computing setting, the amount of load to be exchanged lies between these two extremes, and the amount of load transfer has to be carefully chosen. A commonly used parameter that serves to control the intensity of load balancing is the LB gain.

1.2 Objective

The novelty of this implementation lies in its dynamic nature: it makes the system able to balance the load not only in terms of active mobile nodes or the number of serving contexts, but also based on the actual traffic characteristics of active mobile terminals.

1.3 Scope of Study

1.3.1 Load Balancing

Load balancing is defined as the allocation of the work of a single application to processors at run-time so that the execution time of the application is minimized. Since the speed at which a NOW-based parallel application can be completed depends on the computation time of the slowest workstation, efficient load balancing can clearly provide major performance benefits. The two major categories of load-balancing algorithms are static and dynamic.

1.3.2 Static Load Balancing

Static load balancing algorithms allocate the tasks of a parallel program to workstations based on either the load at the time nodes are allocated to some task, or on an average load of the workstation cluster. The advantage of this sort of algorithm is its simplicity in terms of both implementation and overhead, since there is no need to constantly monitor the workstations for performance statistics. However, static algorithms only work well when there is not much variation in the load on the workstations. Clearly, static load balancing algorithms aren't well suited to a NOW environment, where loads may vary significantly at various times in the day, based on the issues discussed earlier.
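To make the distinction concrete, the following is a minimal sketch, not taken from any particular system, of a static allocator that splits N tasks across workstations in proportion to a fixed speed rating measured once before the run; the speed values and function name are illustrative assumptions.

def static_allocate(num_tasks, speeds):
    """Split num_tasks across workstations in proportion to a
    fixed speed rating measured once, before the run starts."""
    total = sum(speeds)
    shares = [int(num_tasks * s / total) for s in speeds]
    # Hand out any remainder left by integer truncation.
    for i in range(num_tasks - sum(shares)):
        shares[i % len(shares)] += 1
    return shares

# Example: 100 tasks over three workstations rated 1.0, 2.0, 1.0.
print(static_allocate(100, [1.0, 2.0, 1.0]))  # -> [25, 50, 25]

Note that the allocation never changes afterwards, which is exactly why it degrades when workstation loads drift during the run.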
1.3.3 Dynamic Load Balancing

Dynamic load balancing algorithms make changes to the distribution of work among workstations at run-time; they use current or recent load information when making distribution decisions. As a result, dynamic load balancing algorithms can provide a significant improvement in performance over static algorithms. However, this comes at the additional cost of collecting and maintaining load information, so it is important to keep these overheads within reasonable limits. The remainder of this report will focus on such dynamic load balancing algorithms.
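As an illustration of the difference, here is a minimal, hypothetical sketch of a dynamic dispatcher that routes each incoming task to whichever workstation currently reports the least load; the load-query mechanism is an assumption, stubbed out as a dictionary of recently sampled loads.

import heapq

def dynamic_dispatch(tasks, current_loads):
    """Assign each task to the workstation with the smallest
    current load, updating the load estimate as we go."""
    # Heap of (load, workstation id), built from live measurements.
    heap = [(load, ws) for ws, load in current_loads.items()]
    heapq.heapify(heap)
    assignment = {}
    for task, cost in tasks:
        load, ws = heapq.heappop(heap)   # least-loaded workstation right now
        assignment[task] = ws
        heapq.heappush(heap, (load + cost, ws))
    return assignment

# Example: loads sampled just before dispatch (the "recent information").
print(dynamic_dispatch([("t1", 4), ("t2", 2), ("t3", 2)],
                       {"ws1": 5, "ws2": 1, "ws3": 3}))

The cost of this flexibility is the monitoring traffic needed to keep current_loads fresh, which is the overhead the text warns about.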

CHAPTER 2

LITERATURE SURVEY

2.1 Existing System

Many efforts have already been made to provide more efficient GGSN selection and anchor relocation in 3G/UMTS architectures. In this section we give a short overview of the different approaches, in particular the ones which try to present a practically applicable dynamic solution for GGSN load balancing and load sharing.

The method called GPRS-Subscriber selection of multiple Internet service providers was developed by Ericsson in order to allow mobile users to connect to multiple different Packet Data Networks (PDNs). The selection of the PDN is based on the transmission of a specific Network Indication Parameter (NIP). This parameter is sent to the Serving GPRS Support Node (SGSN) as a parameter in the PDP context activation procedure. The PDP type parameter (a two-byte field) is used to set up the connection to the chosen PDN. Based on this method, mobile operators can distribute traffic load towards gateways of different networks, and can also assign the GGSN with the lowest load to newly entering MNs, or use other selection policies.

Distributed IP-pool in GPRS and Dynamically distributed IP-pool in GPRS are further developments made by Ericsson. The aim of these patents is to distribute IP addresses to the users efficiently in a multi-GGSN system when the users get their address from the GPRS/UMTS network. Each GGSN has its own variable-size pool from which it can assign addresses, and a central pool is given, which holds available addresses. When the pool of a GGSN is nearly empty, it can get a group of addresses from the central pool. On the other hand, when a GGSN has many addresses, these can be sent back to the central pool to be used elsewhere. This scheme can help the 3G/UMTS network balance the load of GGSNs in terms of the number of MNs.

Cisco's GGSN GTP Load Balancing is based on the Cisco Server Load Balancing (SLB) feature designed to provide IP server load balancing between Cisco devices. GGSN GTP Load Balancing is a method aiming to provide increased GGSN reliability and availability when applying multiple Cisco or non-Cisco GGSNs in a GPRS/UMTS network. Using this feature the operator can define a virtual server that represents a group or cluster of real server implementations (i.e., a server farm). In such an environment potential clients connect to the IP address of the virtual server. When a client starts a connection to the virtual server, the Cisco SLB feature will choose a real server for the connection from the server farm. The server selection is based on a configured load-balancing algorithm implemented in the virtual server.

The methods introduced above enable operators to distribute load to different GGSN entities and also to assign the GGSN currently suffering the lowest load to newly arriving MNs, or to use any other GGSN selection policy. However, the selected GGSN remains permanent during the whole period between the attachment and detachment of a particular MN. Therefore, in such a system there is no way to perform dynamic load balancing between GGSN entities while MNs are in service (i.e., have active PDP contexts). Moreover, in bursty packet data communication it is very likely that a load balancer assigns the same load (e.g., the same number of MNs) to the GGSNs in a GGSN server farm or pool, yet the GGSNs still suffer from unequally balanced load. In current 3GPP standards it is impossible to provide features like dynamic load balancing or load adjustment between GGSN nodes.
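As a rough illustration of the virtual-server idea (a sketch only, not Cisco's actual SLB implementation), the following shows one plausible selection policy, weighted least-connections, that a virtual server could apply when picking a real server from the farm; the addresses and weights are invented for the example.

def pick_real_server(farm):
    """farm: list of dicts with 'addr', 'weight', 'active_conns'.
    Pick the server with the lowest connections-per-weight ratio,
    i.e. a weighted least-connections policy (one of several
    algorithms such a balancer might be configured with)."""
    return min(farm, key=lambda s: s["active_conns"] / s["weight"])

farm = [
    {"addr": "10.0.0.1", "weight": 2.0, "active_conns": 120},
    {"addr": "10.0.0.2", "weight": 1.0, "active_conns": 50},
    {"addr": "10.0.0.3", "weight": 1.0, "active_conns": 70},
]
# Clients connect to the virtual IP; the balancer forwards them here:
print(pick_real_server(farm)["addr"])  # -> 10.0.0.2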

2.2 Proposed System

This was the motivation of Shiao-Li Tsao, who presented the basics of a dynamic GGSN load balancing and load sharing scheme. The author introduces a new device called the GGSN Controller, which is responsible for monitoring the load of the GGSNs and adjusting the GGSN load dynamically by transferring PDP contexts between GGSNs. The initiation and the process of the PDP transfers are monitored and controlled by the centralized GGSN Controller. The GGSNs send their load information periodically to the GGSN Controller, and the Controller decides whether a PDP context transfer is necessary. Shiao-Li Tsao also defines novel protocol messages that make the PDP context transfer possible, and gives a detailed description of various cases that can occur during the context transfers, but unfortunately no performance evaluation or any kind of analysis was provided in this work. To the best of our knowledge, the only evaluation ever conducted in the area of dynamic anchor load balancing is presented by Cheng Xue et al., who introduced and simulated the inter-GW load balancing approach of LTE/EPC networks. Based on various simulation tests, they showed that with load balancing the performance of the system can be enhanced dramatically.

As introduced above, several schemes have already been proposed aiming to provide intelligent GGSN selection and anchor load balancing in 3G/UMTS and beyond architectures. However, only very few of them were evaluated, and even the available evaluation results are based only on simulations. In our work we rely on the insights of these efforts as we refined the basic idea in order to implement and integrate the solution in a 3G/UMTS testbed for comprehensive analysis and evaluation of the scheme with extensive measurements. According to our surveys, this is the first freely available work analyzing dynamic GGSN load balancing and load sharing in a real-life wireless environment.

2.3 Comparison

In order to evaluate our implementation and to analyze the performance characteristics of the concept of dynamic GGSN load balancing in our testbed, we decided to execute measurements with varying numbers of active PDP contexts. We examined the average latency of the packets which passed through the GGSNs in order to benchmark the throughput of the anchoring gateway node(s) in the 3G/UMTS network. The tests we performed were based on artificial UDP streams synthesized per PDP context by our GTP traffic generator, applying a packet size of 125 bytes for each stream and using packet sending frequencies between 2400 and 4000 packets/s. The maximum number of transferred contexts was limited to 10. The results are shown in the following graphs. As can be seen, the average packet latencies were significantly lower when dynamic PDP context transfer was enabled. Our results prove that transferring PDP contexts, thereby achieving dynamic load balancing between GGSNs, can be implemented in real-life environments with substantial gains. With dynamic load balancing, lower packet latencies at gateway anchor nodes are achievable; thus the overall throughput of the network can be increased significantly.
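For readers who want to reproduce a comparable load pattern, here is a minimal, hypothetical UDP stream generator matching the parameters quoted above (125-byte payloads at a configurable packet rate); it is a stand-in sketch, not the GTP traffic generator actually used in the testbed, and the destination address is a placeholder.

import socket
import time

def send_udp_stream(dest, rate_pps, duration_s, payload_size=125):
    """Send fixed-size UDP datagrams at roughly rate_pps packets/s."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    payload = b"\x00" * payload_size
    interval = 1.0 / rate_pps
    deadline = time.monotonic() + duration_s
    next_send = time.monotonic()
    while time.monotonic() < deadline:
        sock.sendto(payload, dest)
        next_send += interval
        sleep = next_send - time.monotonic()  # keep the average rate steady
        if sleep > 0:
            time.sleep(sleep)

# Example: one stream at 2400 packets/s for 10 seconds.
send_udp_stream(("192.0.2.1", 5000), rate_pps=2400, duration_s=10)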

Fig. 1. Comparison results based on average packet latency measurements

CHAPTER 3

ARCHITECTURE AND IMPLEMENTATION

3.1 Test Bed Evaluation of Dynamic GGSN Load Balancing for High Bitrate 3G/UMTS Networks

Thanks to the appearance of novel, highly capable smartphones, portable computers with easy-to-use 3G USB modems, and attractive business models, mobile Internet has started to become a reality. Network operators offer flat-rate mobile data connection packages, and although these packages usually have a traffic limit, such limits are expected to disappear soon. Mobile users tend to replicate the usage patterns of wired broadband Internet subscriptions, with long connection times and massive multimedia data transmission. The growing traffic load of single users and the growing number of mobile subscribers will result in an immense traffic explosion in the packet-switched domain of mobile networks up to the year 2020. The increase in mobile Internet traffic will be higher than that of fixed Internet traffic in the forthcoming years, most dramatically due to new, data-hungry mobile entertainment services like mobile TV, video and music, and new application types such as M2M (machine-to-machine) communications, including e-health services, vehicle communications, and remote control and monitoring services. In order to handle the anticipated traffic demands and maintain profitability at the same time, it is important to eliminate bottlenecks from the network. One of these bottlenecks in today's 3G/UMTS architectures is the Gateway GPRS Support Node (GGSN).

In UMTS networks, the GGSN is the gateway towards the Internet. According to the current 3GPP specifications, a Mobile Node (MN) requesting packet-switched (PS) services must first attach to the network. When an attached MN wants to access packet-switched services, it needs to activate a PDP (Packet Data Protocol) context that enables the MN to access the service based on the information stored in the Home Location Register (HLR). Once the MN has successfully activated the PDP context, it can start the packet delivery/reception procedures. During this activation procedure a suitable GGSN is selected in the 3G/UMTS network in order to serve the MN. The selected GGSN will be responsible for forwarding every user data packet towards the outside network domains and for routing packets back to the MN based on the GTP tunnels linked to the MN's PDP contexts. In this way the GGSN becomes a user-plane anchor for its MNs. Considering the rapidly growing traffic demand for packet services in 3G/UMTS and beyond (e.g., LTE/EPC) systems, it is obvious that serious scalability issues of such anchor nodes will need to be handled very soon. In order to cope with these issues, operators tend to deploy a set of GGSNs. Unfortunately, a GGSN anchor is permanent during the whole period between a PDP context's activation and deactivation, meaning that the selected GGSN node cannot be changed until the MN deactivates the session of a particular PDP context. Put the other way around, a standard 3GPP 3G/UMTS architecture cannot perform on-the-fly or dynamic load balancing of GGSNs for MNs which own currently activated contexts. However, packet data services often generate traffic which is quite bursty in nature, thus creating the need for a more scalable solution, such as dynamic load balancing between anchor points without breaking ongoing MN connections. The concept of our protocol is based on Shiao-Li Tsao's work. However, we discard the GGSN Controller unit from the scheme, and our method focuses on implementation questions and on the way transferred contexts are deleted.

Fig. 2. The basic scheme of the implemented load balancing protocol

3.2 Dynamic GGSN Load Balancing Scheme

An important starting assumption during our implementation efforts was that the GGSNs are connected via a high-bandwidth and reliable internal network, and know each other's IP addresses. We also assume that the majority of the traffic handled by the GGSNs is best-effort in nature and therefore does not require special QoS provisioning services and mechanisms. This best-effort traffic should tolerate the extra delay, and possible packet loss, that can occur during a PDP transfer. Our protocol was designed not to change the size and contents of the IP address pools maintained in the individual GGSNs after transferring and deleting PDP contexts. With every PDP context transfer, the IP address of the context moves to another GGSN only temporarily: right after the deletion of a formerly transferred PDP context, its IP address is placed back into the pool of the originating GGSN. With this method the GGSNs can be prevented from running out of dynamic IP addresses.

Considering two GGSNs in the GGSN cluster, the operation of our implemented dynamic GGSN load balancing scheme is the following:

1) GGSN-1 continuously monitors and records the bandwidth usage of each PDP context, and marks the contexts with the highest load. GGSN-1 also measures the latency of each passing packet. When this latency exceeds a certain level, GGSN-1 sends a Transfer PDP Context Request message to GGSN-2. The request contains one of the marked transferable PDP contexts.

2) GGSN-2 then calculates whether it has enough resources to serve the context. If the answer is yes, it creates the context, saves the IP address of the context to its local address pool, and sends a routing update message to the inbound routers in order to indicate that the packets belonging to that context should be forwarded to GGSN-2. The transferred context then gets a new TEID (Tunnel Endpoint IDentifier) value from GGSN-2.

3) GGSN-2 sends a Transfer PDP Context Response message to GGSN-1. The message contains the new and the old TEID of the transferred PDP context. GGSN-1 releases the resources occupied by the transferred context, except the IP address belonging to the PDP context under transfer: this address must not be reallocated during the whole life of the context.

4) As a next step, GGSN-1 sends a Change GGSN Request message to the SGSN. This message contains the identifiers (i.e., the old and new TEID of the transferred context) received from GGSN-2.

5) The SGSN responds with a Change GGSN Response message in order to confirm the request, and updates its copy of the transferred PDP context. From now on, the packets which belong to the transferred PDP context will be forwarded to GGSN-2, meaning that the context transfer is completed.

6) In case of a UE-initiated removal of the previously transferred context, the SGSN receives a Delete PDP Context Request message. The SGSN has no information that the PDP context to be deleted has been transferred before, so it sends a Delete PDP Context Request message directly to GGSN-2.

7) GGSN-2 stores information about whether a context is its own or has been transferred under its authority. So when the deletion request arrives, GGSN-2 knows that a transferred context is to be deleted. In that case GGSN-2 sends a Delete Parent PDP Context message to GGSN-1, from where the context was originally transferred.

8) GGSN-2 sends a routing update message to the inbound routers, indicating that the routing entry for the address is deleted and that packets for that address are to be directed towards GGSN-1 in the future.

9) and 10) GGSN-1 releases the address of the PDP context, and then sends a Delete Parent PDP Context Response message to GGSN-2 indicating the successful release of the address. GGSN-2 removes the PDP context, and then sends a Delete PDP Context Response message as an acknowledgment to the SGSN.

The implementation of the above protocol is based on the OpenGGSN open source software package. This package has three components: 1) a fully functional GGSN module; 2) an SGSN emulator named SGSNemu, which can be used to test the GGSN (it includes only basic functionality for SGSN-GGSN communication); and 3) GTPLib, which implements the GTPv0 and GTPv1 protocols. The main part of our implementation work was extending OpenGGSN's GTPLib to enable management of the newly defined GTP messages related to PDP context transfer (e.g., Transfer PDP Context Request/Reply). The GGSN component of the software package was also modified in order to implement the mechanisms for PDP context transfer and packet delay measurements. We extended the PDP context structure too: the PDP contexts are now able to store the address of the GGSN where they were originally created. A special logic for continuous load analysis of active PDP contexts is implemented with the addition of two variables to the context structure: a timer and a counter. Both the timer and the counter are reset periodically in order to reflect the actual usage, thus providing capabilities for dynamic decisions. The counter can be based on different load metrics; our implementation uses current bandwidth information to differentiate between the load characteristics of PDP context traffic.

Although the idea of this protocol is based on Shiao-Li Tsao's method, the concept of the GGSN Controller, which collects load information from GGSNs and centrally controls the PDP context transfers, was set aside: in our implemented scheme the control of dynamic PDP context transfers is managed by the decision logics of individual GGSNs in a distributed way. Compared to Shiao-Li Tsao's protocol, we focused more on the issues of what happens after a transferred PDP context is deleted. We designed our protocol such that PDP context transfers do not change the size of the address pools of the GGSNs. After deleting a transferred PDP context, its IP address returns to the originating GGSN. A significant improvement over Shiao-Li Tsao's work is that our protocol was implemented, tested and evaluated in a real-life 3G/UMTS testing environment, to be introduced in the next section.
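To summarize steps 1) to 5), here is a compact, hypothetical sketch of the transfer-triggering logic and the per-context bookkeeping a GGSN might keep. The message names (Transfer PDP Context Request, Change GGSN Request) come from the protocol above, but all class names, fields, and thresholds are invented for illustration and do not mirror the actual OpenGGSN code.

from dataclasses import dataclass, field

@dataclass
class PdpContext:
    ip: str
    teid: int
    home_ggsn: str            # GGSN where the context was created
    bandwidth_bps: float = 0  # counter-based load metric, reset periodically

@dataclass
class Ggsn:
    name: str
    latency_limit_ms: float
    contexts: dict = field(default_factory=dict)  # teid -> PdpContext

    def most_loaded_context(self):
        return max(self.contexts.values(), key=lambda c: c.bandwidth_bps)

    def maybe_transfer(self, avg_latency_ms, peer):
        """Step 1: if measured latency exceeds the limit, offer the
        heaviest marked context to the peer GGSN."""
        if avg_latency_ms <= self.latency_limit_ms or not self.contexts:
            return None
        ctx = self.most_loaded_context()
        new_teid = peer.accept_transfer(ctx)       # steps 2-3
        if new_teid is not None:
            del self.contexts[ctx.teid]            # release all but the IP
            return ("ChangeGGSNRequest", ctx.teid, new_teid)  # step 4
        return None

    def accept_transfer(self, ctx, capacity=100):
        """Step 2: accept only if there is room; assign a new TEID."""
        if len(self.contexts) >= capacity:
            return None
        new_teid = max(self.contexts, default=0) + 1
        self.contexts[new_teid] = PdpContext(ctx.ip, new_teid,
                                             ctx.home_ggsn, ctx.bandwidth_bps)
        return new_teid

g1 = Ggsn("GGSN-1", latency_limit_ms=20.0,
          contexts={7: PdpContext("10.1.0.7", 7, "GGSN-1", 3e6)})
g2 = Ggsn("GGSN-2", latency_limit_ms=20.0)
print(g1.maybe_transfer(avg_latency_ms=35.0, peer=g2))  # triggers a transfer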

3.3 Test bed Architecture

In order to provide a test bed for gateway scalability research and for analyzing dynamic GGSN load balancing in next generation multimedia-centric and high bitrate communication systems, we designed and implemented a UMTS/IMS architecture based on the existing hardware elements of the Mobile Innovation Centre (MIK) located in Budapest, Hungary. In order to make the system able to use synthetic traffic of a variable number of users, we applied a software GGSN implementation called OpenGGSN as the basis of our work. Our GPL-licensed and publicly available OpenGGSN modification uses the same architecture as the original version 0.84, but extends the GTP library with routines and components for dynamically setting up, maintaining and tearing down PDP contexts inside a GGSN pool for efficient load balancing. Besides packets originating from and heading towards real 3G User Equipments, synthetic IP traffic of virtual users can also be used for evaluation purposes. This is achieved by extended SGSNemu instances (note that the original SGSNemu is also part of the OpenGGSN 0.84 pack). As shown in Fig. 5, our testbed has a special GGSN pool consisting of multiple GGSNs, and several SGSNs can be connected to that pool. To test the scalability of the system, several IP traffic generators were applied. These generators are connected to the SGSNs and emulate the Gb interface, thus creating an effective and flexible GTP traffic generator. The receiver side of the traffic generator is connected to the Gi interface of the GGSN pool. SGSNemu instances map the incoming traffic to different PDP contexts, and then forward the data in the direction of the chosen GGSN. With the development of this extended SGSNemu version, our testbed became able to create and delete contexts, and to generate traffic to the created contexts dynamically. Using this complex and flexible testbed architecture we evaluated the throughput of the GGSN with and without dynamic load balancing capabilities, measured the real-life effects of the dynamic load balancing scheme, and showed how this approach can improve the performance of the overall 3G/UMTS system.

Fig. 5. 3G/UMTS testbed architecture used for evaluation


CHAPTER 4

METHODOLOGY

4.1 Load Balancing Algorithm

Most load balancing algorithms are designed based on the performance requirements of some specific application domain. For example, applications that exhibit lengthy parallel jobs usually benefit in the presence of a job migration system. However, applications with shorter tasks usually don't warrant the expense of job migration and are thus better handled with clever loop scheduling algorithms where the task granularity changes dynamically. As a result, only one algorithm will be described here in order to provide an overview of the various issues that a typical algorithm must take into consideration. However, all algorithms closely follow the four basic load balancing steps outlined earlier.

4.1.1 Single Program Multiple Data Computation Model

The Single Program Multiple Data (SPMD) paradigm implies that all the workstations run the same code, but operate on different sets of data. The motivation for using SPMD programs is that they can be designed and implemented easily, and they can be applied to a wide range of applications such as numerical optimization problems and solving coupled partial differential equations. The SPMD computation model is depicted in Figure 3. Each task is divided into operations or iterations. Workstations execute the same operation asynchronously, using data available in the workstation's own local memory. This is followed by a data exchange phase where information can be exchanged between workstations (if required), after which all workstations wait for synchronization. Thus each lock-step of an SPMD program contains three phases:

1. Calculation Phase: each task will do the required computation. There is no communication between workstations at this point.

2. Data Distribution Phase: each task will distribute the relevant data to other tasks that need it for the next lock-step.

3. Synchronization Phase: this phase ensures that all tasks have completed the same lock-step. Otherwise there will be problems with tasks using the wrong data.

Fig. 4. SPMD Computation Model [LEE95]
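The lock-step structure maps naturally onto a barrier primitive. Below is a minimal sketch under the assumption of thread-based workers sharing memory (a real NOW implementation would use message passing between workstations instead); the per-phase work is a placeholder.

import threading

P = 4                           # number of workers (stand-ins for workstations)
barrier = threading.Barrier(P)
data = [[i] * 8 for i in range(P)]

def worker(i, lock_steps=3):
    for _ in range(lock_steps):
        # 1. Calculation phase: compute on local data only.
        data[i] = [x + 1 for x in data[i]]
        barrier.wait()  # everyone must finish computing before exchange
        # 2. Data distribution phase: pass an edge value to the neighbour.
        data[(i + 1) % P][0] = data[i][-1]
        # 3. Synchronization phase: no one starts the next lock-step early.
        barrier.wait()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(P)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(data)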

If an SPMD program were to be executed on a homogeneous multiprocessor system, the workload would be balanced for the entire computation (assuming that all tasks were initially evenly distributed). However, in a NOW, there are various other factors that can affect the load and thus contribute to load imbalances.


Thus, within the SPMD paradigm on a VPM, we would like to reduce the execution time of the program by dynamically shifting/migrating tasks from one workstation to another at the end of each lock-step, if required. There are two things that need to be considered:

1. Determine if there is a need to rebalance the load.
2. Find the best distribution of tasks.

SPMD Load Balancer

[LEE95] developed a global, centralized, sender-initiated load balancing algorithm for large, computation-heavy SPMD programs, using the following parameters:

Tcompute_i: the interval between the time at which the first task on workstation i starts execution and the time at which the last task on the same workstation completes its computation and waits for the synchronization. This value is thus the dynamic workload index for the algorithm, since there is a direct relation between Tcompute_i and a workstation's load. Assuming a program can be decomposed into N tasks and there are P workstations, we have N = n_1 + n_2 + ... + n_P, where n_i is the number of tasks on workstation i.

Ttask_i: the average computation time for each task on workstation i, defined as Ttask_i = Tcompute_i / n_i (equation 1).

Thigh: the maximum of Tcompute over all workstations, defined as Thigh = max { Tcompute_i } (1 <= i <= P).

Tlow: the minimum of Tcompute over all workstations, defined as Tlow = min { Tcompute_i } (1 <= i <= P).

A common approach taken for dynamic load balancing on a NOW is to predict future performance based on past information [ZAKI96]. In the SPMD algorithm, Ttask can be used to update Tcompute. Thus if m tasks are moved to workstation i, we can use equation 1 to estimate the new Tcompute_i as Ttask_i x (n_i + m). The estimation is based on the current workload of the workstation, and it is valid because all tasks in an SPMD program execute the same code. Tcompute will be recalculated after each task reallocation, with Thigh and Tlow updated accordingly.
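A few lines of code make these quantities concrete; this is an illustrative sketch of the bookkeeping only, with made-up timing values.

# Per-workstation measurements for one lock-step (made-up numbers).
t_compute = [12.0, 8.0, 20.0, 10.0]       # Tcompute_i, seconds
n = [6, 4, 10, 5]                          # n_i, tasks per workstation

t_task = [tc / ni for tc, ni in zip(t_compute, n)]   # equation 1
t_high = max(t_compute)                               # Thigh
t_low = min(t_compute)                                # Tlow

def estimate_tcompute(i, m):
    """Predicted Tcompute_i after moving m tasks onto workstation i."""
    return t_task[i] * (n[i] + m)

print(t_task, t_high, t_low, estimate_tcompute(1, 2))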

4.2 Meeting the Rebalancing Criteria

In order to balance the load, tasks from workstations that have a longer Tcompute will be moved to the workstations with a shorter Tcompute. However, the algorithm must also take into account the rebalancing criteria discussed earlier. For the first rebalancing criterion, assume that we have a workstation k that has the current highest Tcompute. Further assume that L represents the number of lock-steps remaining, and m_i represents the number of tasks workstation i transmits or receives. Then, in order to guarantee that moving a task from workstation k to a new workstation doesn't cause the new workstation's Tcompute to be greater than k's, we check the following condition:

Ttask_k x n_k > min { Ttask_i x (n_i + 1) } (1 <= i <= P, i != k)

If this is true, then moving one task from workstation k to another workstation will not cause the oscillating effect that was mentioned earlier. For the second rebalancing criterion (where we need to verify that attempting to balance the load provides some performance gain), we can compute the following:

OldThigh: the previous Thigh value before the current load balancing decision.
NewThigh: the new Thigh value that is computed after meeting criterion 1 (which is guaranteed to be lower than OldThigh).

The gain associated with performing load balancing is therefore:

Gain = (OldThigh - NewThigh) x L

Assuming that our load balancer knows Toverhead, the cost of performing job migration of the SPMD tasks, we can now check criterion 2 using the following:

Gain >= Toverhead x max { m_i } (1 <= i <= P)

If this is true, then we can conclude that it is worthwhile to perform the load balancing and job migration.
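Putting the two criteria into code, a sketch under the same assumptions as the earlier snippet (the values for L and Toverhead are hypothetical):

def criterion1(k, t_task, n):
    """True if the most loaded workstation k can shed one task without
    making some other workstation the new bottleneck at a higher level."""
    best_target = min(t_task[i] * (n[i] + 1)
                      for i in range(len(n)) if i != k)
    return t_task[k] * n[k] > best_target

def criterion2(old_thigh, new_thigh, lock_steps_left, t_overhead, moved):
    """True if the predicted saving over the remaining lock-steps
    outweighs the migration overhead."""
    gain = (old_thigh - new_thigh) * lock_steps_left
    return gain >= t_overhead * max(moved)

# Example with the numbers from the previous sketch:
print(criterion1(2, [2.0, 2.0, 2.0, 2.0], [6, 4, 10, 5]))  # True: 20 > 10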

4.3 Iterative Algorithm


The final iterative algorithm, as outlined by [LEE95], can be summarized as follows:

OldThigh = max { Tcompute_i } (1 <= i <= P)
WHILE criteria 1 and 2 are TRUE DO
    FOR i = 1 TO P
        Tcompute_i = (n_i + 1) x Ttask_i
    ENDFOR
    // move a task from the workstation with Thigh to the one with the smallest Tcompute
    // update Tcompute of the workstations involved in the task migration
    NewThigh = max { Tcompute_i } (1 <= i <= P)
ENDWHILE

Each iteration of the while loop attempts to move one task from the most heavily loaded workstation to the most lightly loaded node, as long as the rebalancing criteria are being met. After some movement, the load monitoring variables are recomputed and the while loop repeats. This continues until the algorithm detects that there are no more tasks that can be redistributed without degrading performance. In other words, the while loop iterates until the system is as balanced as possible given the current timing information.
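A runnable rendering of this loop is given below. It follows the outline of [LEE95] but is our own illustrative sketch, not the original code; the lock-step count and Toverhead value are invented for the demo.

def rebalance(n, t_task, lock_steps_left, t_overhead):
    """Iteratively move one task at a time from the most loaded
    workstation to the least loaded one while both criteria hold."""
    moved = [0] * len(n)
    while True:
        t_compute = [t_task[i] * n[i] for i in range(len(n))]
        k = max(range(len(n)), key=lambda i: t_compute[i])   # holds Thigh
        old_thigh = t_compute[k]
        # Candidate Tcompute for every other workstation if it took one task.
        candidates = {i: t_task[i] * (n[i] + 1)
                      for i in range(len(n)) if i != k}
        j = min(candidates, key=candidates.get)
        # Criterion 1: the move must not create a worse bottleneck.
        if old_thigh <= candidates[j]:
            break
        new_thigh = max(t_task[k] * (n[k] - 1), candidates[j],
                        *(t_compute[i] for i in range(len(n))
                          if i not in (j, k)))
        trial_moved = moved[:]
        trial_moved[k] += 1
        trial_moved[j] += 1
        # Criterion 2: predicted gain must cover the migration overhead.
        gain = (old_thigh - new_thigh) * lock_steps_left
        if gain < t_overhead * max(trial_moved):
            break
        n[k] -= 1
        n[j] += 1
        moved = trial_moved
    return n

print(rebalance(n=[6, 4, 10, 5], t_task=[2.0, 2.0, 2.0, 2.0],
                lock_steps_left=5, t_overhead=1.0))  # -> [6, 6, 7, 6]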

CHAPTER 5

ADVANTAGES AND DISADVANTAGES

5.1 Global vs. Local Strategies

Global or local policies answer the question of what information will be used to make a load balancing decision. In global policies, the load balancer uses the performance profiles of all available workstations. In local policies, workstations are partitioned into different groups. In a heterogeneous NOW, the partitioning is usually done such that each group has nearly equal aggregate computational power. The benefit of a local scheme is that performance profile information is only exchanged within the group. The choice of a global or local policy depends on the behavior an application will exhibit. For global schemes, balanced load convergence is faster compared to a local scheme, since all workstations are considered at the same time. However, this requires additional communication and synchronization between the various workstations; local schemes minimize this extra overhead. But the reduced synchronization between workstations is also a downfall of the local schemes if the various groups exhibit major differences in performance. It has been noted that if one group has processors with poor performance (high load), and another group has very fast processors (little or no load), the latter will finish quite early while the former group remains overloaded.


5.2 Centralized vs. Distributed Strategies

A load balancer is categorized as either centralized or distributed, both of which define where load balancing decisions are made. In a centralized scheme, the load balancer is located on one master workstation node and all decisions are made there. In a distributed scheme, the load balancer is replicated on all workstations. Once again, there are tradeoffs associated with choosing one location scheme over the other. For centralized schemes, the reliance on one central point of balancing control could limit future scalability. Additionally, the central scheme also requires an all-to-one exchange of profile information from the workstations to the balancer, as well as a one-to-all exchange of distribution instructions from the balancer to the workstations. The distributed scheme helps solve the scalability problems, but at the expense of an all-to-all broadcast of profile information between workstations. However, the distributed scheme avoids the one-to-all distribution exchange, since the distribution decisions are made on each workstation.


CONCLUSION

Load balancing is an important issue in a virtual parallel machine built using a low-cost network of workstations. The most difficult aspect of load balancing in a network of workstations is deciding which algorithm to use. Hundreds of algorithms have been proposed, each with its own specific motivations and design decisions that result in trade-offs that aren't always suited to every imaginable task. This report described many of the design issues that are commonly considered when deciding on a load balancing algorithm (such as global/local, centralized/distributed, etc.), as well as the tradeoffs associated with the various parameters and strategies. Additionally, this report outlined the details of an algorithm targeted towards SPMD-style programs in order to present a concrete example of the details associated with an actual load balancing implementation. Finally, the load balancing features of two cluster management software packages (LoadLeveler and Condor) were described briefly.


FUTURE SCOPE
While great progress has been made in dynamic load balancing for parallel, unstructured and/or adaptive applications, research continues to address issues arising from application and architecture requirements. Existing algorithms, such as the geometric algorithms RCB and HSFC, are being augmented to support the special needs of complex applications. New models using hypergraphs are being developed to more accurately represent highly connected, non-symmetric, and/or rectangular systems arising in density functional theory, circuit simulations, and integer programming. On heterogeneous computer architectures, software such as DRUM dynamically detects the available computing, memory and network resources, and provides the resource information to both existing partitioning algorithms and new hierarchical partitioning strategies. Software toolkits such as Zoltan deliver these capabilities to applications, enable comparisons of methods within applications, and serve as test-beds for further research and development. While we present some solutions to these issues, our work represents only a small sample of continuing research into load balancing. For adaptive finite element methods, data movement from an old decomposition to a new one can consume orders of magnitude more time than the actual computation of the new decomposition; highly incremental partitioning strategies that minimize data movement are important for high performance of adaptive simulations. In overlapping Schwarz preconditioning, the work to be balanced depends on data in both the processor's subdomain and the overlap region, while the size of the overlap region depends on the subdomain generated by the partitioner. In such cases, standard partitioning models that assume the work per processor is the total weight of objects assigned to the processor are insufficient; strategies that treat workloads as a function of the subdomain are needed. Very large scale semantic networks place additional demands on partitioners, due to both their high connectivity and irregular structure; partitioning techniques for these networks are in their infancy. These examples of research in partitioning, while still not exhaustive, demonstrate that, indeed, the load-balancing problem is not yet solved.


REFERENCES

[1]. Szabolcs Kustos, László Bokor, Gábor Jeney, "Testbed Evaluation of Dynamic GGSN Load Balancing for High Bitrate 3G/UMTS Networks," Budapest University of Technology and Economics (BME), Department of Telecommunications (HT), Mobile Communication and Computing Laboratory (MC2L), Mobile Innovation Centre (MIK), Magyar Tudósok krt. 2, H-1117, Budapest, Hungary, 2011.

[2]. Van Albada, G.D., Clinckemaillie, J., "Dynamite: Blasting Obstacles to Parallel Cluster Computing," Technical Report, Department of Computer Science, University of Amsterdam, The Netherlands, 1996.

[3]. Baker, M., Fox, G., Yau, H., "Review of Cluster Management Software," NHSE Review, 1996 Volume, First Issue, July 1996.

[4]. Dandamudi, S., Piotrowski, A., "A Comparative Study of Load Sharing on Networks of Workstations," Proceedings of the International Conference on Parallel and Distributed Computing Systems, New Orleans, October 1997.

[5]. Dandamudi, S., "Sensitivity Evaluation of Dynamic Load Sharing in Distributed Systems," Technical Report TR 97-12, Carleton University, Ottawa, Canada.
