1. Clustering Algorithms for Scale-free Networks and Applications to Cloud Resource
Management
Author: Ashkan Paya and Dan C.

In this paper we introduce algorithms for the construction of scale-free networks and for clustering around the nerve centers, nodes with high connectivity in a scale-free network. We argue that such overlay networks could support self-organization in a complex system like a cloud computing infrastructure and allow the implementation of optimal resource management policies.

System scalability should be an ab-initio concern in the design of any large-scale system, and in particular of a cloud computing infrastructure. A very large number of servers have to work in concert, and they have to communicate effectively. The topology of the interconnection network that allows the servers to communicate with one another is critical for ensuring system scalability and for creating the conditions for the implementation of optimal resource management policies. We argue that the constraints of the physical interconnection topology of a system can be overcome by designing an overlay network that enables scalability and allows the system to perform its functions in an optimal way. Scale-free networks enjoy a set of desirable properties: they are non-homogeneous, resilient to congestion, robust against random failures, and have a small diameter and a short average path length, as discussed in Section 2. The analysis and the results presented in Sections 3 and 4 show that efficient algorithms to construct such networks, and to assemble clusters of servers around the core nodes of the scale-free network, can be implemented with relative ease. Centralized algorithms can be used when the number of system components is relatively small, in the range of 10^3, while distributed algorithms are useful when the number of components is several orders of magnitude larger, e.g., 10^6 or more.
Distributed algorithms based on biased random walks support self-organization in a large-scale system; they allow us not only to construct scale-free networks, but also to select components with a set of desirable properties. The algorithms for the construction of scale-free networks and for clustering discussed in this paper are particularly useful for self-management in a computing cloud, where individual core nodes of the global scale-free network could serve as cloud access points. Once such clusters are formed, one can optimally implement the five classes of resource management policies; for example, each core node could request the creation of level-2 clusters in response to different requirements imposed by Service Level Agreements. Level-2 clusters can be assembled through a random walk over the servers in a level-1 cluster that satisfy security, location, QoS, and other types of constraints.

The biased random walk process discussed in Section 4.1 could be used to assemble hybrid clouds, with some servers in the private cloud of an organization while others are in public clouds. For example, a smart power grid application would require multiple electric utility companies to use a public cloud to trade and transfer energy from one to another, while maintaining confidential information on servers securely located in their own private clouds. Similar configurations are likely for a unified health care system or any other application involving multiple organizations required to cooperate with one another under strict privacy constraints.

The solutions we propose represent a major departure from the organization of existing computing clouds. The significant advantages of self-organization and self-management, in particular the ability to implement effective admission control and QoS policies reflecting stricter SLA requirements, seem important enough to justify a paradigm shift in cloud computing organization and management.
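The Section 4.1 procedure itself is not given here; the sketch below only illustrates the general idea of a degree-biased walk that collects nodes satisfying a constraint predicate. The `attrs` dictionary, the `satisfies` predicate, and the toy "secure" attribute are hypothetical stand-ins for the paper's security/location/QoS constraints:

```python
import random

def biased_walk_cluster(adj, attrs, start, satisfies, size, bias=1.0, seed=7):
    """Walk the overlay, stepping to a neighbor with probability
    proportional to degree**bias (bias > 0 favors hubs); collect up to
    `size` visited nodes whose attributes pass the predicate."""
    rng = random.Random(seed)
    current, cluster, steps = start, set(), 0
    while len(cluster) < size and steps < 100_000:
        if satisfies(attrs[current]):
            cluster.add(current)
        neighbors = list(adj[current])
        weights = [len(adj[n]) ** bias for n in neighbors]
        current = rng.choices(neighbors, weights=weights)[0]
        steps += 1
    return cluster

# toy overlay: a hub (node 0) plus a ring of 9 rim nodes
adj = {i: set() for i in range(10)}
for i in range(1, 10):
    adj[0].add(i); adj[i].add(0)       # star edges to the hub
    j = i % 9 + 1
    adj[i].add(j); adj[j].add(i)       # ring edges around the rim
attrs = {i: {"secure": i % 2 == 0} for i in adj}
cluster = biased_walk_cluster(adj, attrs, start=0,
                              satisfies=lambda a: a["secure"], size=4)
print(sorted(cluster))
```

Setting `bias` negative would instead steer the walk away from hubs, which is one way a level-2 cluster could avoid overloading the core nodes.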
Indeed, the over-provisioning used by existing clouds is not a sustainable strategy to guarantee QoS. It seems reasonable to expect that in the future the providers of the Infrastructure as a Service (IaaS) cloud delivery model will support applications with strict security and response time constraints, while lowering the cost of services through better resource utilization; we believe that this can only be done in a self-organizing system, and a scale-free virtual interconnection network seems ideal for such a system.

2. A Client-Aware Dispatching Algorithm for Web Clusters Providing Multiple Services
Author: Casalicchio, Colajanni

The typical Web cluster architecture consists of replicated back-end and Web servers, and a network Web switch that routes client requests among the nodes. In this paper, we propose a new scheduling policy, namely the client-aware policy (CAP), for Web switches operating at layer 7 of the OSI protocol stack. Its goal is to improve load sharing in Web clusters that provide multiple services such as static, dynamic, and secure information. CAP classifies client requests on the basis of their expected impact on the main server resources, that is, network interface, CPU, and disk. At run time, CAP schedules the client requests reaching the Web cluster with the goal of sharing all classes of services among the server nodes. We demonstrate through a large set of simulations and some prototype experiments that dispatching policies aiming to improve locality in server caches give the best results for Web publishing sites providing static information and some simple database searches. When we consider Web sites that also provide dynamic and secure services, CAP is more effective than state-of-the-art layer-7 Web switch policies.
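The classify-then-spread idea behind CAP can be sketched as follows. This is only an illustration under my own assumptions: the classification rules (file extension, TLS flag) and server names are hypothetical, not the paper's actual classifier, and a real layer-7 switch would inspect the full request:

```python
class CapSwitch:
    """Sketch of a client-aware policy: classify each request by the
    server resource it mainly stresses, then round-robin *within each
    class* so every server receives a mix of light and heavy requests
    and no single resource tends to be overloaded."""

    def __init__(self, servers):
        self.servers = servers
        self.next_in_class = {}          # per-class round-robin counter

    def classify(self, request):
        # illustrative rules only (my assumption, not CAP's actual rules)
        if request.get("tls"):
            return "secure"              # CPU-bound (crypto)
        if request["path"].endswith((".cgi", ".php", ".asp")):
            return "dynamic"             # CPU/disk-bound
        return "static"                  # network/disk-bound

    def dispatch(self, request):
        cls = self.classify(request)
        i = self.next_in_class.get(cls, 0)
        self.next_in_class[cls] = i + 1
        return self.servers[i % len(self.servers)]

switch = CapSwitch(["ws1", "ws2", "ws3"])
reqs = [{"path": "/index.html"}, {"path": "/search.cgi"},
        {"path": "/cart", "tls": True}, {"path": "/logo.png"}]
print([switch.dispatch(r) for r in reqs])  # ['ws1', 'ws1', 'ws1', 'ws2']
```

Note that the per-class counters need no feedback from the servers, which is the source of the robustness claim below: there are no load thresholds or weights to tune.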
The proposed client-aware algorithm is also more robust than server-aware policies, whose performance depends on an optimal tuning of system parameters that is very hard to achieve in a highly dynamic system such as a Web site. Web cluster architectures are becoming very popular for supporting Web sites with large numbers of accesses and/or heterogeneous services. CAP classifies client requests on the basis of their expected impact on the main server components. At run time, CAP schedules the client requests reaching the Web cluster with the goal of sharing all classes of services among the servers, so that no system resource tends to be overloaded. We demonstrate through simulation and experiments on a real prototype that dispatching policies that improve Web cache hit rates give the best results for traditional Web publishing sites providing mostly static information and some light dynamic requests. On the other hand, CAP provides scalable performance even for modern Web clusters providing static, dynamic, and secure services. Moreover, pure CAP has the additional benefit of guaranteeing robust results for very different classes of Web services because it does not require the hard tuning of system parameters that many other server-aware dispatching policies do.

3. Scalable Web Server Clustering Technologies
Author: T. Schroeder, S. Goddard, and B. Ramamurthy

The exponential growth of the Internet, coupled with the increasing popularity of dynamically generated content on the World Wide Web, has created the need for more and faster Web servers capable of serving the over 100 million Internet users. Server clustering has emerged as a promising technique to build scalable Web servers.
In this article we examine the seminal work, early products, and a sample of contemporary commercial offerings in the field of transparent Web server clustering. We broadly classify transparent server clustering into three categories.

Web server clustering has received much attention in recent years from both industry and academia. In addition to traditional custom-built solutions to clustering, transparent server clustering technologies have emerged that allow the use of commodity systems in server roles. We broadly classified transparent server clustering into three categories: L4/2, L4/3, and L7. Table 2 summarizes these technologies as well as their advantages and disadvantages. Each approach discussed has bottlenecks that limit scalability. For L4/2 dispatchers, system performance is constrained by the ability of the dispatcher to set up, look up, and tear down entries. Thus, the most telling performance metric is the sustainable request rate. L4/3 dispatchers are more immediately limited by their ability to rewrite and recalculate the checksums for the massive numbers of packets they must process. Thus, in the absence of dedicated checksumming hardware, the most telling performance metric is the throughput of the dispatcher. Finally, L7 solutions are limited by the complexity of their content-based routing algorithm and the size of their cache (for those that support caching). However, by localizing the request space each server must service and caching the results, L7 dispatching should provide higher performance for a given number of back-end servers than L4/2 or L4/3 dispatching alone. It seems clear that in the future, L7 hardware solutions such as the ArrowPoint switches will continue to dominate software products in terms of performance. The question one must ask is: how much performance is needed from the Web server for a given application and network configuration?
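The locality argument for L7 dispatching can be made concrete with a tiny sketch. Hash-partitioning the URL space is only one possible content-based routing rule (my assumption here, not necessarily what any of the surveyed products do), but it shows why each back-end's cache stays hot: a given document always lands on the same server.

```python
import hashlib

def l7_route(path, servers):
    """Content-based (L7) routing sketch: hash-partition the URL space
    so each back-end repeatedly serves the same subset of documents,
    keeping its cache hot."""
    digest = hashlib.sha1(path.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

servers = ["ws1", "ws2", "ws3"]
# the same URL always lands on the same back-end
print(l7_route("/docs/guide.html", servers) ==
      l7_route("/docs/guide.html", servers))  # True
```

The cost, as noted above, is that the dispatcher must parse the request up to the URL before it can route, which is exactly the complexity that limits L7 scalability.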
As we have seen, even the L4/2 switch LSMAC, a software application running in user space on COTS hardware and software, is capable of saturating an OC-3 (155 Mb/s) link. Apart from the Internet backbone itself, few sites have wide-area connectivity at or above this level. Once server performance is boosted to the levels supported by L7 hardware solutions (e.g., ArrowPoint switches), the bottleneck is no longer the ability of the server to generate data, but rather the ability of the network to get that data from the server to the client. New research on scalable Web servers must take into account wide-area network bandwidth as well as server performance. Industry and academic researchers have just begun to examine this problem. Cisco's DistributedDirector is an early example of a product that exploits the geographic distribution of servers to achieve high aggregate bandwidth with low latency over the wide area, in addition to a greater degree of fault tolerance.
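A back-of-envelope check puts the OC-3 figure in perspective; the 10 KB mean response size below is my own assumption, not a number from the article:

```python
# How many HTTP responses per second does it take to fill a 155 Mb/s link?
link_bits_per_s = 155e6        # OC-3 line rate, as quoted above
mean_response_bytes = 10_000   # assumed mean response size (not from the article)
responses_per_s = link_bits_per_s / (mean_response_bytes * 8)
print(round(responses_per_s))  # ~1938 responses/s saturate the link
```

At that rate the wide-area link, not the cluster, is the limiting factor, which is the article's closing point.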