Autonomic Computing-Self Optimization Systems

ABSTRACT
The proliferation of Internet technologies, services, and devices has left
current networked system designs and management tools incapable of delivering
reliable, secure networked systems and services. In fact, we have reached a
level of complexity, heterogeneity, and rate of change at which our information
infrastructure is becoming unmanageable and insecure. This has led researchers
to consider alternative designs and management techniques based on the
strategies that biological systems use to deal with complexity, heterogeneity,
and uncertainty. The approach is referred to as autonomic computing. An
autonomic computing system is one that is capable of being self-defining,
self-healing, self-configuring, self-optimizing, and so on. Autonomic computing
provides dynamically programmable control and management services to support
the development and deployment of smart (intelligent) applications. Such an
environment provides application developers with the tools required to specify
appropriate control and management schemes for maintaining any quality of
service (QoS) requirement or application attribute (e.g., performance, fault
tolerance, security), together with the core autonomic middleware services
needed to maintain the autonomic requirements of a wide range of network
applications and services. This work concentrates on autonomic computing with
self-optimizing systems.

1. INTRODUCTION
Autonomic computing has become a significant research area since IBM launched
its initiative in 2001 [1]. Computer systems are becoming so complex to manage
that the complexity is throttling their development. It seems clear that the
only way to overcome this complexity is to make the systems manage themselves.
Highly skilled human administrators usually cost far more than the systems
themselves, and people with the right kind of expertise are often hard to find.
Human error is the largest single source of failure in computer systems,
accounting for more than 40% of failures according to some sources [2]. This
may suggest that human administrators are often not as competent as they should
be, and even those who are will make errors. Errors can be very difficult for
humans to locate and fix, and the process can be very time-consuming.
The way forward is to make computers do all the low-level management and to
have humans specify only the high-level policies that represent the business
requirements of the enterprise. When only the autonomic system itself has to
know the fine details of its configuration, it is easier for human operators to
learn how to administer it, and configuration errors are less likely. Even in
the case of an error, an autonomic system can heal itself. The four major
capabilities that an autonomic computing system must possess are
self-configuration, self-healing, self-optimization, and self-protection. There
are other self-* capabilities as well, but some of them can be thought of as
sub-capabilities of the four listed here. For example, a self-optimizing system
has to be, in some ways, self-tuning and self-adaptive [1].
What is autonomic computing? It is the ability of systems to be more
self-managing. The term autonomic comes from the autonomic nervous system,
which controls many organs and muscles in the human body. Usually, we are
unaware of its workings because it functions in an involuntary, reflexive
manner. For example, we don't notice when our heart beats faster or our blood
vessels change size in response to temperature, posture, food intake, stressful
experiences, and other changes to which we are exposed. Our autonomic nervous
system is always working and, most significantly, it does all this without any
conscious recognition or effort on our part. This allows us to think about what
we want to do and not how we will do it: we can make a mad dash for the train
without having to calculate how much faster to breathe and pump our heart, or
whether we will need that little dose of adrenaline to make it through the
doors before they close. It is as if the autonomic nervous system says, "Don't
think about it, no need to. I've got it all covered."
Designing truly autonomic computing systems is a huge challenge to the research
community, and it will require expertise from many fields. For example, by
studying existing research in biology and economics we can find ideas that can
be used in the development of autonomic systems. The real world, with its
complex social systems, can serve as an example of a huge autonomic system
comprising a myriad of autonomic elements. Advances in autonomic computing
research might in turn contribute something valuable to other fields, so
everyone may benefit in the end. Just as artificial intelligence uses
human-like intelligence as a goal, we can use the whole human body as a
near-perfect example of an autonomic system. Simple examples of the self-*
properties can be found in software that is already in wide use. For example,
modern operating systems such as Ubuntu Linux [3] and Mac OS X [4] are able to
configure themselves with almost no input from the user at installation time,
which is a major improvement over the operating systems of the early 1990s. It
used to be the case that the administrator had to give the operating system's
installer detailed information about the computer, such as the geometry of the
hard disk and the IRQs used by the attached devices. Likewise, when new devices
are attached to a system that is already installed, the operating system
usually detects and configures them autonomically. Operating systems are also
self-protective in some ways. For example, the OS kernel, with the assistance
of the CPU, makes sure that low-level hardware access is protected from
user-level processes and that the processes are protected from each other.
Internet routing is a good example of an existing self-optimizing system. IP
packets usually find their way to the right destination via the optimal (or
near-optimal) path. However, true self-optimization is mostly found only inside
smaller network segments that use the Routing Information Protocol (RIP) [5] or
the Open Shortest Path First protocol (OSPF) [6] for internal routing, since
the Border Gateway Protocol (BGP) [7], which is used for routing across larger
network segments, requires a lot of configuration from human administrators.
Routers using RIP and OSPF, on the other hand, can usually build routing tables
entirely by themselves. In case of network outages, routers can relatively
quickly update their routing tables so that packets can go via a different
route. This means that IP networks are also self-healing, which is no
coincidence. Packet-switching networks were largely motivated by the need for a
communications system that could not be brought down by a nuclear war [8],
which was a serious concern in the 1960s, as the relationship between the
United States and the Soviet Union was very tense at the time. This meant that
the network had to be designed with a capability for self-healing. Although
there are examples of systems that implement different kinds of self-*
properties, the fully autonomic systems that the research aims for are still a
thing of the future.
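The self-healing behaviour of link-state routing described above can be illustrated with a small sketch. It uses Dijkstra's shortest-path algorithm, which is the kind of computation OSPF routers perform on their view of the network; the four-node topology, the link costs, and the helper names `shortest_path` and `without_link` are assumptions made purely for illustration.

```python
import heapq


def shortest_path(graph, src, dst):
    """Dijkstra's algorithm; graph maps node -> {neighbor: link cost}."""
    dist = {src: 0}
    prev = {}
    queue = [(0, src)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(queue, (nd, nbr))
    if dst not in dist:
        return None  # destination unreachable
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def without_link(graph, a, b):
    """Copy of the graph with the link a-b removed in both directions."""
    return {n: {m: c for m, c in nbrs.items() if {n, m} != {a, b}}
            for n, nbrs in graph.items()}


# Hypothetical topology: two routes from A to D, the one via B being cheaper.
network = {
    "A": {"B": 1, "C": 2},
    "B": {"A": 1, "D": 1},
    "C": {"A": 2, "D": 2},
    "D": {"B": 1, "C": 2},
}
```

Calling `shortest_path(network, "A", "D")` selects the route through B; after `without_link(network, "B", "D")` simulates an outage, the same call on the reduced graph falls back to the route through C without any manual reconfiguration.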
2. Classifications
Autonomic computing systems can be classified on the basis of the eight-element
classification given by IBM and the four basic components of self-management.
2.1 The 8 Elements
While the definition of autonomic computing will likely transform as
contributing technologies mature, the following list suggests eight defining
characteristics [16] of an autonomic computing system.
1. Self-defining: An autonomic computing system needs to "know itself", and its
components must also possess a system identity. Since a "system" can exist at
many levels, an autonomic system will need detailed knowledge of its
components, current status, ultimate capacity, and all connections to other
systems in order to govern itself. It will need to know the extent of its
"owned" resources, those it can borrow or lend, and those that can be shared or
should be isolated.
2. Self-configuring: An autonomic computing system must configure and
reconfigure itself under varying (and, in the future, even unpredictable)
conditions. System configuration or "setup" must occur automatically, as must
dynamic adjustments to that configuration to best handle changing environments.
3. Self-optimizing: An autonomic computing system never settles for the status
quo; it always looks for ways to optimize its workings. It will monitor its
constituent parts and fine-tune workflow to achieve predetermined system goals.
4. Self-healing: An autonomic computing system must perform something akin to
healing: it must be able to recover from routine and extraordinary events that
might cause some of its parts to malfunction. It must be able to discover
problems or potential problems, then find an alternate way of using resources
or reconfiguring the system to keep functioning smoothly.
5. Self-protecting: A virtual world is no less dangerous than the physical one,
so an autonomic computing system must be an expert in self-protection. It must
detect, identify, and protect itself against various types of attacks to
maintain overall system security and integrity.
6. Self-adaptive: An autonomic computing system must know its environment and
the context surrounding its activity, and act accordingly. It will find and
generate rules for how best to interact with neighboring systems. It will tap
available resources, and even negotiate the use by other systems of its
underutilized elements, changing both itself and its environment in the
process. In a word, it adapts.
7. Open: An autonomic computing system cannot exist in a hermetic environment.
While independent in its ability to manage itself, it must function in a
heterogeneous world and implement open standards. In other words, an autonomic
computing system cannot, by definition, be a proprietary solution.
8. Self-hidden: An autonomic computing system will anticipate the optimized
resources needed while keeping its complexity hidden. It must marshal IT
resources to shrink the gap between the business or personal goals of the user
and the IT implementation necessary to achieve those goals, without involving
the user in that implementation.
2.2 Self-Management
The essence of autonomic computing systems is self-management [17], the intent
of which is to free system administrators from the details of system operation
and maintenance and to provide users with a machine that runs at peak
performance 24/7. Like their biological namesakes, autonomic systems will
maintain and adjust their operation in the face of changing components,
workloads, demands, and external conditions, and in the face of hardware or
software failures, both innocent and malicious. The autonomic system might, for
example, continually monitor its own use and check for component upgrades. If
it deems the advertised features of the upgrades worthwhile, the system will
install them, reconfigure itself as necessary, and run a regression test to
make sure all is well. When it detects errors, the system will revert to the
older version while its automatic problem-determination algorithms try to
isolate the source of the error. Figure 1 illustrates how this process might
work for an autonomic accounting system upgrade.
IBM frequently cites four aspects of self-management, which Table 1 summarizes.
Early autonomic systems may treat these aspects as distinct, with different
product teams creating solutions that address each one separately. Ultimately,
these aspects will be emergent properties of a general architecture, and
distinctions will blur into a more general notion of self-maintenance.
Ultimately, system administrators and end users will take the benefits of
autonomic computing for granted. Self-managing systems and devices will seem
completely natural and unremarkable, as will automated software and middleware
upgrades. The detailed migration patterns of applications or data will be as
uninteresting to us as the details of routing a phone call through the
telephone network.

2.2.1. Self-configuration
Installing, configuring, and integrating large, complex systems is challenging,
time-consuming, and error-prone even for experts. Most large Web sites and
corporate data centers are haphazard accretions of servers, routers, databases,
and other technologies on different platforms from different vendors. It can
take teams of expert programmers months to merge two systems or to install a
major e-commerce application such as SAP.
Autonomic systems will configure themselves automatically in accordance with
high-level policies (representing business-level objectives, for example) that
specify what is desired, not how it is to be accomplished. When a component is
introduced, it will incorporate itself seamlessly, and the rest of the system
will adapt to its presence, much like a new cell in the body or a new person in
a population. For example, when a new component is introduced into an autonomic
accounting system, as in Figure 1, it will automatically learn about and take
into account the composition and configuration of the system. It will register
itself and its capabilities so that other components can either use it or
modify their own behavior appropriately.
Self-configuration
  Current computing: Corporate data centers are multi-vendor and
    multi-platform. Installing, configuring, and integrating systems is
    time-consuming and error-prone.
  Autonomic computing: Automated configuration of components and systems
    according to high-level policies; the rest of the system adjusts
    automatically. Seamless, like adding a new cell to the body or a new
    individual to a population.
Self-optimization
  Current computing: Systems like WebSphere and DB2 have hundreds of nonlinear
    tuning parameters, with many new ones in each release.
  Autonomic computing: Components and systems will continually seek
    opportunities to improve their own performance and efficiency.
Self-healing
  Current computing: Problem determination in large, complex systems can take
    a team of programmers weeks.
  Autonomic computing: Automated detection, diagnosis, and repair of localized
    software and hardware problems.
Self-protection
  Current computing: Manual detection of and recovery from attacks and
    cascading failures.
  Autonomic computing: Automated defense against malicious attacks or
    cascading failures; early warning is used to anticipate and prevent
    system-wide failures.
Table 1. Four aspects of self-management as they are now and would be with
autonomic computing.
2.2.2. Self-optimization
Complex middleware, such as WebSphere, or database systems, such as Oracle or
DB2, may have hundreds of tunable parameters that must be set correctly for the
system to perform optimally, yet few people know how to tune them. Such systems
are often integrated with other, equally complex systems. Consequently,
performance-tuning one large subsystem can have unanticipated effects on the
entire system.
Autonomic systems will continually seek ways to improve their operation,
identifying and seizing opportunities to make themselves more efficient in
performance or cost. Just as muscles become stronger through exercise, and the
brain modifies its circuitry during learning, autonomic systems will monitor,
experiment with, and tune their own parameters and will learn to make
appropriate choices about keeping functions or outsourcing them. They will
proactively seek to upgrade their function by finding, verifying, and applying
the latest updates.
2.2.3. Self-healing
IBM and other IT vendors have large departments devoted to identifying,
tracing, and determining the root cause of failures in complex computing
systems. Serious customer problems can take teams of programmers several weeks
to diagnose and fix, and sometimes the problem disappears mysteriously without
any satisfactory diagnosis.
Autonomic computing systems will detect, diagnose, and repair localized
problems resulting from bugs or failures in software and hardware, perhaps
through a regression tester, as in Figure 1. Using knowledge about the system
configuration, a problem-diagnosis component (based on a Bayesian network, for
example) would analyze information from log files, possibly supplemented with
data from additional monitors that it has requested. The system would then
match the diagnosis against known software patches (or alert a human programmer
if there are none), install the appropriate patch, and retest.
2.2.4. Self-protection
Despite the existence of firewalls and intrusion detection tools, humans must
at present decide how to protect systems from malicious attacks and inadvertent
cascading failures. Autonomic systems will be self-protecting in two senses.
They will defend the system as a whole against large-scale, correlated problems
arising from malicious attacks or cascading failures that remain uncorrected by
self-healing measures. They will also anticipate problems based on early
reports from sensors and take steps to avoid or mitigate them.
3. WORKING PRINCIPLE OF SELF-OPTIMIZATION SYSTEMS
Optimization means "an act, process, or methodology of making something
(as a design, system, or decision) as fully perfect, functional, or
effective as possible". In other words, optimization means finding
the most effective ways to do things.
A self-optimizing system in autonomic computing will monitor its constituent
parts and fine-tune workflow to achieve predetermined system goals, much as a
conductor listens to an orchestra and adjusts its dynamic and expressive
characteristics to achieve a particular musical interpretation.
This consistent effort to optimize itself is the only way a computing system
will be able to meet the complex and often conflicting IT demands of a
business, its customers, suppliers, and employees. And since the priorities
that drive those demands change constantly, only constant self-optimization
will satisfy them.
Self-optimization will also be a key to enabling the ubiquitous availability of
e-sourcing, that is, the delivery of computing services in a utility-like
manner. E-sourcing promises predictable costs and simplified access to
computing for IT customers. For providers of those computing services, though,
delivering the promised quality of service (QoS) to their customers will
require not only prioritizing work and system resources, but also considering
supplemental external resources (such as subcontracted storage or extra
processing cycles), similar to the way power utilities buy and sell excess
power in today's electricity markets.
But to be able to optimize itself, a system will need advanced feedback control
mechanisms to monitor its metrics and take appropriate action. Although
feedback control is an old technique, we will need new approaches to apply it
to computing. We will need to answer questions such as:
• How often should a system take control actions?
• How much delay can it accept between an action and its effect?
• How does all this affect overall system stability?
Innovations in applying control theory to computing must occur in tandem with
new approaches to overall systems architecture, yielding systems designed with
control objectives in mind. Algorithms seeking to make control decisions must
have access to internal metrics. And, like the tuning knobs on a radio, control
points must affect the source of those internal metrics. Most important, all
the components of an autonomic system, no matter how diverse, must be
controllable in a unified manner.
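A feedback loop of the kind discussed above can be sketched in a few lines. The sketch assumes a hypothetical system with one internal metric (response time) and one control point (a number of workers); the proportional control rule, the gain values, and the toy plant model "response time = 800 / workers" are illustrative assumptions rather than anything prescribed by the source.

```python
def control_step(measured, target, knob, gain=0.5, knob_min=1, knob_max=64):
    """One proportional control action on a single control point.

    The knob is moved in proportion to the error between the measured
    metric and its target, then clamped to its allowed range.
    """
    error = measured - target  # positive: the metric is above its target
    new_knob = knob + gain * error
    return max(knob_min, min(knob_max, new_knob))


def simulate(gain=0.02, steps=20, target=100.0):
    """Run the loop against a toy plant: response_ms = 800 / workers."""
    workers = 2.0
    history = []
    for _ in range(steps):
        response_ms = 800.0 / workers      # read the internal metric
        history.append(response_ms)
        workers = control_step(response_ms, target, workers, gain=gain)
    return history
```

In this toy setting `simulate(gain=0.02)` settles at the 100 ms target, whereas `simulate(gain=0.5)` overshoots and keeps oscillating between the knob limits, which is one small instance of the stability question raised in the list above.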

3.1 A General View


Usually when we talk about optimizing computer programs, we mean optimization
at the source code level as done by the programmer. This is a very typical part
of writing software, although it has become a bit less of an issue as computers
have become faster. Optimizing programs on a small scale, for example by
writing certain operations in assembly language, is rarely necessary these
days, but optimization on a large scale is just as important as ever. This
means that we still have to find and implement effective algorithms. Effective
algorithms are important in stand-alone programs, but they are even more
important when we talk about the algorithms that components in distributed
systems use to interact with each other. Ineffective algorithms in distributed
systems may cause network congestion and disturb other users of the network,
which can be a far more serious concern than ineffective operation of the
system itself.
A self-optimizing system is one that dynamically optimizes the operation of its
own components while it is running. The optimizing component can be an agent
that is separate from the component being optimized: the optimizer simply
adjusts, continuously, the control parameters that it passes to other
components. Typically the optimizer has to have intimate knowledge of the
components being optimized, but optimization can be done at different levels,
as demonstrated in the following sections. The higher-level optimizer does not
need to know the details of the components at the lowest level of the autonomic
system.
3.2 Utility Functions
If we want computer systems to improve their performance by self-optimization,
the systems need to have some kind of rules that they can follow. A naive
solution would be to specify a set of situation-action rules for all situations
that can occur. This would be a very bad strategy in the case of autonomic
systems, as it would require the human administrator to go into very low-level
details, and, in many cases, the list of rules would probably never be quite
complete. This kind of need for low-level configuration does not meet the
objectives set for autonomic computing systems, since one of the main
requirements is that humans should only need to specify the appropriate
high-level business policies that the autonomic systems follow [1].
Another approach would be to use goal policies that divide the states of the
system into desirable and undesirable ones. That would definitely be a better
approach than the one described above, but it would not be very good for
optimization because, once the system has reached a desirable state, it would
not try to improve its performance any further. What we need is a method that
continuously aims for better performance.
The third approach is the use of utility functions. With utility functions we
can calculate the utility (i.e., business value) of even completely different
kinds of systems in a common currency. In modern society all goods and services
are assigned some monetary value that can be used to compare them with each
other in terms of how valuable they are to their owners. This is exactly what
utility functions are used for in autonomic computing. We can specify
high-level business rules that a utility function uses to evaluate the business
value of a given state. When we have calculated the utility of the different
Autonomic Environments in a single autonomic system, we can calculate their
sum, which is the utility of the entire autonomic system. The goal of
self-optimization is to maximize the utility of the entire system at all times.
More recently, utility functions have been chosen as the approach for
self-optimization in many cases [11] [12] [13]. Traditionally, utility
functions have been used in the fields of microeconomics and artificial
intelligence. In microeconomics, utility functions are used, e.g., to model the
way a single consumer tries to gain happiness by purchasing goods and services.
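The difference between a goal policy and a utility function can be made concrete with a small sketch. The 200 ms limit, the linear shape of the utility curve, and the value of 100 currency units are assumptions chosen only to illustrate the idea; the source does not prescribe any particular utility function.

```python
def goal_policy(response_ms, limit_ms=200.0):
    """Goal policy: only classifies a state as desirable or undesirable."""
    return response_ms <= limit_ms


def utility(response_ms, max_value=100.0, limit_ms=200.0):
    """Utility function (illustrative shape): business value falls linearly
    from max_value at 0 ms to zero at limit_ms and beyond."""
    return max_value * max(0.0, 1.0 - response_ms / limit_ms)


def system_utility(response_times_ms):
    """Utility of the whole system: the sum over all of its environments."""
    return sum(utility(r) for r in response_times_ms)
```

Both 50 ms and 150 ms satisfy the goal policy, so a goal-driven system has no reason to prefer one over the other, while the utility function still rewards the faster state (75 versus 25 units here) and therefore keeps pushing for improvement.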
3.3 Two-Level System of Independent Autonomic Elements
3.3.1 Overall Architecture
Walsh et al. [15] show how utility functions can be used effectively in
autonomic systems by means of a data center scenario. They present a two-level
system of independent, interacting agents, which are generally called autonomic
elements. The architecture consists of a single Resource Arbiter and a number
of Autonomic Environments (also referred to below as Application Environments).
Each Autonomic Environment consists of an Application Manager, a router, and
servers, as depicted in Figure 1. The Resource Arbiter and the Application
Managers both do resource allocation at their own respective levels. The
Resource Arbiter is not concerned with the internal workings of the Application
Environments; it only allocates resources according to the data it receives
from the Application Managers.
The Resource Arbiter is quite a general piece of software in this architecture.
Even if we introduce entirely new kinds of Application Environments to the data
center, we will not have to alter the Resource Arbiter component in any way,
because it only expects to receive resource-level utility functions from the
Application Managers. The resource-level utility function Ûi(Ri), which is sent
to the Resource Arbiter, is basically a table that maps the possible resource
allocations to their utilities. Ri is a vector that specifies the resources
allocated to environment i. Its components specify each individual resource,
such as the number of servers allocated. As an illustrative example, Table I
presents a possible resource-level utility function for a case in which CPUs
are allocated in whole units and memory can only be allocated in pieces of
1 GB. In this example, the resource vector Ri is of the form (N,M), where N is
the number of CPUs and M is the amount of memory allocated. By sending the
resource-level utility function, the Application Manager basically tells the
Resource Arbiter how important it is for it to gain resources. As time passes,
the importance may change and, as a result, the resource allocations change
accordingly. Traditionally, many computer systems have been built with a lot of
excess capacity just to enable them to survive peaks in demand. Dynamic
resource allocation, however, gives our system great flexibility and helps us
avoid wasting computing power.
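A resource-level utility function of the tabular kind described above might be represented as follows. The utility values in the table are invented for illustration (they are not the values of Table I), and the helper names `utility_of` and `marginal_gain` are assumptions.

```python
# Hypothetical resource-level utility table for one environment.
# Keys are resource vectors (N CPUs, M GB of memory); values are utilities.
resource_utility = {
    (1, 1): 10.0, (1, 2): 14.0,
    (2, 1): 18.0, (2, 2): 30.0,
    (3, 1): 20.0, (3, 2): 36.0,
}


def utility_of(allocation, table):
    """Look up the utility of an allocation; unlisted allocations count as 0."""
    return table.get(allocation, 0.0)


def marginal_gain(table, allocation, extra_cpus=1):
    """Utility gained by adding CPUs to an allocation, memory unchanged."""
    n, m = allocation
    return utility_of((n + extra_cpus, m), table) - utility_of(allocation, table)
```

In this made-up table the second CPU is worth far more than the third (`marginal_gain(resource_utility, (1, 2))` against `marginal_gain(resource_utility, (2, 2))`), which is exactly the kind of information the Resource Arbiter needs in order to decide where an extra resource is best spent.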
For example, a company can use a single cluster of computers to run both a
website for its customers and a database for its business administration unit,
with the customers given the higher priority. Self-optimization guarantees that
heavy database transactions made by the company's employees will not slow down
the website while customers are using it. Research shows that the typical user
will tolerate at most 2–10 seconds of delay when loading a website. After
becoming frustrated with the delay, the user may well move to a competitor's
site. Clearly it is very beneficial to have a flexible optimization
architecture that both prevents customers from having to suffer long delays and
is cost-effective, so that the excess computing power is used for something
useful even when it is not needed for serving customers.


Fig. 1. Data Center Architecture [15, Figure 1].


It is the job of the Resource Arbiter to maximize the global utility by
distributing the system's resources in a way that produces the optimal result.
In a system of n Autonomic Environments, the resource allocation sought is the
set of vectors R1, ..., Rn that maximizes the sum Û1(R1) + ... + Ûn(Rn),
subject to R1 + ... + Rn not exceeding R, the vector of total quantities of
resources available. The Resource Arbiter periodically recomputes this
allocation to keep the global utility as high as possible. There is a variety
of standard optimization algorithms that can be used to solve this problem.
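The Resource Arbiter's task can be sketched for the simplest case of a single resource type. The sketch below uses exhaustive search, which is only practical for tiny examples and stands in for the standard optimization algorithms mentioned above; the function name `arbitrate`, the list-based utility tables, and all utility values are assumptions for illustration.

```python
from itertools import product


def arbitrate(utility_tables, total_servers):
    """Return (allocation, utility) maximizing the summed utilities.

    utility_tables[i][k] is the utility that environment i reports for
    being given k servers. Every split that uses at most total_servers
    is tried (exhaustive search).
    """
    best_alloc, best_value = None, float("-inf")
    candidates = product(range(total_servers + 1), repeat=len(utility_tables))
    for alloc in candidates:
        if sum(alloc) > total_servers:
            continue  # would exceed the resources available
        value = sum(table[k] for table, k in zip(utility_tables, alloc))
        if value > best_value:
            best_alloc, best_value = alloc, value
    return best_alloc, best_value


# Hypothetical utilities for 0..4 servers in two environments.
web_quiet = [0, 50, 80, 90, 95]
web_busy = [0, 70, 130, 180, 220]
database = [0, 20, 35, 45, 50]
```

With four servers, `arbitrate([web_quiet, database], 4)` splits them evenly, while `arbitrate([web_busy, database], 4)` hands all four to the website; re-running the computation whenever new utility tables arrive is what makes the allocation dynamic.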
In order for the Resource Arbiter to allocate resources dynamically, the
Application Managers send in new resource-level utility functions when they
decide that there is enough reason to do so, i.e., when the utility function
has changed considerably. The Resource Arbiter may also request that the
Application Managers send new data.
3.3.2 Application Manager
The Application Managers are at the lower level of the architecture. They have
to be well aware of the details of their own Application Environments, since
they must be able to do low-level tuning of control parameters. The Application
Environments in a single autonomic system can be very different from each other
with respect to the kind of applications they are running. Their respective
utility functions are used to map each Application Environment's state into a
common currency. The state of a system can be described as a vector of
attributes, such as the number of CPUs, the amount of memory, and the amount of
network bandwidth the system has been allocated.
The utility function for environment i is of the form Ui(Si,Di), where Si is
the service level vector in i and Di is the demand vector in i. The components
of these vectors can be, e.g., the average response time and the average
throughput for multiple different user classes. The goal of the entire
autonomic system is to continually optimize the sum of the utility functions of
all Autonomic Environments, so that most resources are given to those
environments that need them the most. The goal of a single Application Manager
i, on the other hand, is to optimize Ui(Si,Di), i.e., the utility function of
its own Autonomic Environment, while it is given a fixed amount of resources.
The Application Manager can only use the resources that the Resource Arbiter
has allocated to it, so it has to decide how to use those resources as
effectively as possible. This is done by adjusting the control parameters of
the application in question. In fact, the Application Manager can do its own
small-scale resource allocation for different transaction classes at this
level. For example, some users can be given priority in certain operations.
This is exactly why it is beneficial to have a two-level architecture instead
of a centralized one in which the Resource Arbiter would do all the work. In a
centralized design, the Resource Arbiter would have to be updated every time a
new Application Environment is introduced to the system, and it would have to
be aware of all the details. The resulting software would be quite bloated
compared to the one needed in the two-level architecture.
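The Application Manager's local optimization under a fixed resource grant can be sketched as follows. Everything specific here is an assumption made for illustration: the single control parameter (the share of capacity reserved for a "gold" transaction class), the toy service model, the weights in the utility function, and the grid search used to pick the parameter.

```python
def service_level(share_gold, servers, demand_gold, demand_silver):
    """Toy service model: response time per class grows with that class's
    demand and shrinks with the capacity assigned to it."""
    cap_gold = servers * share_gold
    cap_silver = servers * (1.0 - share_gold)
    return demand_gold / cap_gold, demand_silver / cap_silver


def utility(resp_gold, resp_silver, weight_gold=3.0, weight_silver=1.0):
    """Illustrative utility: weighted response-time penalty (higher is better)."""
    return -(weight_gold * resp_gold + weight_silver * resp_silver)


def tune(servers, demand_gold, demand_silver, steps=99):
    """Grid-search the control parameter for the fixed number of servers."""
    candidates = [k / (steps + 1) for k in range(1, steps + 1)]  # 0.01..0.99
    return max(
        candidates,
        key=lambda c: utility(*service_level(c, servers,
                                             demand_gold, demand_silver)),
    )
```

For equal demand in both classes, `tune(10, 100, 100)` reserves roughly 63% of the capacity for the gold class, reflecting its higher weight. The Resource Arbiter never sees this parameter; it only sees the utility the environment can achieve with the servers it was given.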
Different types of optimization require different time scales, which is also
neatly handled by the two-level architecture. The Application Managers can do
optimization on a time scale of seconds, or whatever is suitable for them. The
Resource Arbiter typically works on a time scale of minutes. Figure 2 shows the
inner composition of an Application Manager and some surrounding components.
The figure illustrates how information flows inside the Application Manager and
how it flows between the Application Manager and the external components. The
area with a darker background represents the Application Manager. The
rectangular objects inside it are its modules, and the cylindrical objects
represent the knowledge that the Manager maintains. The area inside the rounded
rectangle represents the Application Environment, which contains the
Application Manager, servers, and a router. The Resource Arbiter is the only
component outside the Application Environment. Of course there are other
Application Environments as well, but we do not have to pay attention to them
here, since they only interact with the Resource Arbiter and not directly with
each other.
As described earlier, the Application Manager sends the resource-level utility
function to the Resource Arbiter. Next we describe how the Application Manager
is able to construct the function. The Data Aggregator module receives a
continual flow of raw measurement data from the servers and the router. This
includes the service data S and the demand data D. The Data Aggregator uses
some method to aggregate the data into a more suitable form. It can, e.g.,
calculate their average values over a suitable time window.
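One simple way to aggregate raw measurements, as suggested above, is a moving average over the most recent samples. The class name `WindowAverage` and the choice of a fixed-size sample window are assumptions; the source only says that averaging over a suitable time window is one option.

```python
from collections import deque


class WindowAverage:
    """Average of the most recent `size` raw measurements."""

    def __init__(self, size):
        # A bounded deque silently drops the oldest sample when full.
        self.samples = deque(maxlen=size)

    def add(self, value):
        """Record a new measurement and return the current window average."""
        self.samples.append(value)
        return sum(self.samples) / len(self.samples)
```

Feeding it the response times 10, 20, 30 and then 100 with a window of three yields 10, 15, 20 and then 50: the oldest sample has dropped out of the window by the time the spike arrives.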

Fig. 2. The modules and data flow in an Application Manager. Symbols: S =
service level/service model, D = demand, D′ = predicted demand, C = control
parameters, Rt = current resource level, U = utility function [15].
The Data Aggregator sends the refined demand D to the Demand Forecaster, whose
job is to provide an estimate of the average demand in the future. The Demand
Forecaster, in turn, passes the predicted demand D′ on to the Utility
Calculator. The forecasting is necessary because the demand can suddenly peak
for a relatively short period of time. It is hardly a good idea to reallocate
resources among different Application Environments just because the demand rises
(or falls, for that matter) considerably for a brief moment. Shifting resources
can be a time-consuming operation, and doing it unnecessarily may only lower the
system’s performance. The Demand Forecaster therefore has to take the
historically observed demand D into account when it decides on the estimated
future demand D′.
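The text does not say which forecasting method the Demand Forecaster uses. One possible choice, shown here purely as an assumption, is exponential smoothing, which weights the observed history so that a brief peak moves the estimate only part of the way. The function name `forecast_demand` and the smoothing factor `alpha` are hypothetical.

```python
def forecast_demand(history, alpha=0.3):
    """Exponentially smoothed estimate of the future average demand D'.

    `history` is the sequence of refined demand values D, oldest first.
    `alpha` (assumed, not from the source) controls how fast old
    observations are forgotten: 1.0 means "use only the latest value".
    """
    if not history:
        raise ValueError("history must not be empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    estimate = history[0]
    for observed in history[1:]:
        estimate = alpha * observed + (1.0 - alpha) * estimate
    return estimate


# Hypothetical demand: steady at 100, then one short spike to 400.
steady = [100.0] * 10
spiky = steady + [400.0]
predicted_steady = forecast_demand(steady)
predicted_spiky = forecast_demand(spiky)
```

For the spiky history the prediction rises above the steady level but stays well below the spike, which is the damping behavior the paragraph above asks for.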
In addition to sending the demand D to the Demand Forecaster, the Data
Aggregator also sends the refined demand D and the refined service level S to
the Modeler module. The Modeler’s task is to build a model S(C,R,D), which is
basically a function that maps a set of control parameters C, a resource level R
and a demand level D to the service level that will be achieved under those
conditions.
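How the Modeler builds S(C,R,D) is not specified in the text. As one simple possibility, the sketch below averages the observed service levels per combination of control parameters, resource level and demand bucket. The class `TableModel`, the bucket width and the sample observations are all assumptions made for illustration.

```python
from collections import defaultdict


class TableModel:
    """A lookup-table stand-in for the service model S(C, R, D)."""

    def __init__(self, demand_bucket: float = 10.0) -> None:
        self._bucket = demand_bucket
        self._sums = defaultdict(float)
        self._counts = defaultdict(int)

    def _key(self, c, r, d):
        # Nearby demand values share one bucket so observations can pool.
        return (c, r, int(d // self._bucket))

    def observe(self, c, r, d, s) -> None:
        """Record that service level `s` was measured under (c, r, d)."""
        key = self._key(c, r, d)
        self._sums[key] += s
        self._counts[key] += 1

    def predict(self, c, r, d):
        """Return the mean observed service level, or None if unseen."""
        key = self._key(c, r, d)
        count = self._counts.get(key, 0)
        if count == 0:
            return None
        return self._sums[key] / count


# Hypothetical observations: C = 4, R = 2 servers, demand near 105.
model = TableModel()
model.observe(4, 2, 103.0, 0.8)
model.observe(4, 2, 108.0, 1.0)
estimate = model.predict(4, 2, 105.0)
unseen = model.predict(4, 2, 250.0)
```

A real Modeler would need to generalize to conditions it has not observed, which a plain table cannot do; the sketch only shows the shape of the mapping.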
The Modeler gets the control parameters C from the Controller module. The
Controller continually adjusts the control parameters, which it also sends to
the router and the servers. As can be seen from Figure 2, the Controller
receives input from many directions. It receives the demand D, the service model
S, the current resource level Rt and the service-level utility function U(S,D)
as input and uses them to determine suitable control parameters C for the
current situation. More formally, the Controller aims to find the control
parameters C for which U(S(C,Rt,D),D), where Rt is the current resource level,
attains its maximum value. The control parameters may include configuration
options for the application the servers are running or various settings for the
servers’ operating systems. The control parameters can contain basically
anything because they are application-specific and not restricted by the
architecture.
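The Controller's objective, maximizing U(S(C,Rt,D),D) over C, can be sketched as an exhaustive search over a finite set of candidate settings. The search strategy, the function names and the toy service model and utility below are assumptions for illustration; the architecture itself does not prescribe how the maximum is found.

```python
def best_control(candidates, service_model, utility, resources, demand):
    """Return the C in `candidates` that maximizes U(S(C, Rt, D), D)."""
    return max(
        candidates,
        key=lambda c: utility(service_model(c, resources, demand), demand),
    )


def toy_service_model(c, r, d):
    """Hypothetical S(C, R, D): mean response time in seconds.

    More worker threads `c` and more servers `r` reduce queueing, but
    each extra thread adds a fixed overhead.
    """
    return d / (r * c * 10.0) + 0.05 * c


def toy_utility(s, d):
    """Hypothetical U(S, D): shorter response times are worth more."""
    return -s


# Current resource level Rt = 2 servers, refined demand D = 100 requests/s.
best = best_control(range(1, 21), toy_service_model, toy_utility,
                    resources=2, demand=100.0)
```

Under these toy definitions the search settles on 10 threads, the point where the queueing gain and the per-thread overhead balance.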
The Utility Calculator is the module that actually generates the resource-level
utility function Û(R) that is sent to the Resource Arbiter. It receives the
predicted demand D′, the service-level utility function U(S,D) and the service
model S(C,R,D) as input. The resource-level utility function is calculated with
the formula
$$\hat{U}(R) = U(S(C^{*}, R, D'), D') \tag{1}$$
for all possible resource levels R, where C* is the optimal set of control
parameters for the resource level R. Note that the optimal control parameters
may be different for different resource levels, which means that they must be
recomputed for every possible resource level. Also, we must use the predicted
demand D′, which was received from the Demand Forecaster, instead of the current
demand D.
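Equation (1) can be sketched as a table built by re-optimizing the control parameters for each resource level. The function name, the candidate sets and the toy service model and utility are assumptions chosen so that the optimal control setting visibly changes with R.

```python
def resource_level_utility(resource_levels, candidates, service_model,
                           utility, predicted_demand):
    """Tabulate the resource-level utility for every resource level R.

    Returns {R: (utility value, optimal control parameter for that R)}.
    The control parameter is re-optimized for each R, and the predicted
    demand D' is used instead of the current demand D.
    """
    table = {}
    for r in resource_levels:
        def score(c, r=r):
            s = service_model(c, r, predicted_demand)
            return utility(s, predicted_demand)

        best_c = max(candidates, key=score)
        table[r] = (score(best_c), best_c)
    return table


def toy_service_model(c, r, d):
    """Hypothetical S(C, R, D): mean response time in seconds."""
    return d / (r * c * 10.0) + 0.05 * c


def toy_utility(s, d):
    """Hypothetical U(S, D): shorter response times are worth more."""
    return -s


table = resource_level_utility(
    resource_levels=[1, 2, 4, 8],
    candidates=list(range(1, 21)),
    service_model=toy_service_model,
    utility=toy_utility,
    predicted_demand=100.0,
)
```

In this toy setting the best thread count drops as servers are added, so reusing one fixed control setting for every R would understate the utility of some resource levels.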

Fig:3 Control loop

The behavior of an Application Manager fits well into a general concept of
autonomic computing called the control loop [1], which is illustrated in
Figure 3. Comparing it with Figure 2 shows the similarities. The measuring phase
is represented by the router and the servers sending measured service and demand
data to the Data Aggregator, and the Aggregator passing the refined measurement
data on to the other modules. The Controller and the Utility Calculator make
decisions based on the measured data by computing the appropriate control
parameters and the resource-level utility function. Resource allocation is done
partly by the Controller, as it sends the control parameters to the router and
the servers, and partly by the Resource Arbiter, when it makes decisions based
on the resource-level utility functions it receives from the Application
Managers.
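The control loop can be reduced to a cycle of three steps: measure, decide, and act (the last step corresponds to resource allocation above). The sketch below is a generic illustration with a hypothetical worker pool; the thresholds, the per-worker capacity and all names are assumptions and are not taken from [1] or [15].

```python
def control_loop(measure, decide, act, iterations):
    """Run the measure -> decide -> act cycle a fixed number of times."""
    for _ in range(iterations):
        observation = measure()
        decision = decide(observation)
        act(decision)


# Hypothetical managed system: a worker pool serving a fixed demand.
state = {"workers": 1, "demand": 35.0}
PER_WORKER_CAPACITY = 10.0  # requests/s one worker can serve (assumed)


def measure():
    """Measure: current utilization of the pool."""
    return state["demand"] / (state["workers"] * PER_WORKER_CAPACITY)


def decide(utilization):
    """Decide: grow when overloaded, shrink when mostly idle."""
    if utilization > 0.8:
        return 1
    if utilization < 0.5 and state["workers"] > 1:
        return -1
    return 0


def act(delta):
    """Act: apply the decision to the managed system."""
    state["workers"] += delta


control_loop(measure, decide, act, iterations=10)
```

Starting from one worker, the loop adds workers until utilization falls to 0.7 and then leaves the pool unchanged for the remaining iterations.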
4. Autonomic Computing Challenges
Realizing autonomic computing presents fundamental and significant research
challenges that span all levels, from the conceptual level to architecture,
middleware, and applications [19]. Key research issues and challenges are
presented below.
4.1 Conceptual Challenges
Conceptual research issues and challenges include:
• Defining appropriate abstractions and models for specifying, understanding,
controlling, and implementing autonomic behaviors;
• Adapting classical models and theories for machine learning, optimization and
control to dynamic and multi-agent systems;
• Providing effective models for negotiation that autonomic elements can use to
establish multilateral relationships among themselves; and
• Designing statistical models of large networked systems that will let
autonomic elements or systems detect or predict overall problems from a stream
of sensor data from individual devices.
4.2 Architecture Challenges
Autonomic applications and systems will be constructed from autonomic elements
that manage their internal behavior and their relationships with other autonomic
elements in accordance with policies that humans or other elements have
established. As a result, system/application level self-managing behaviors will
arise from the self-managing behaviors of constituent autonomic elements and
their interactions. Developing system and software architectures in which local
as well as global autonomic behaviors can be specified, implemented and
controlled in a robust and predictable manner remains a key research challenge.
4.3 Middleware Challenges
The primary middleware-level research challenge is providing the core services
required to realize autonomic behaviors in a robust, reliable and scalable
manner, in spite of the dynamism and uncertainty of the system and the
application. These include discovery, messaging, security, privacy, trust, etc.
Autonomic systems/applications will require autonomic elements to identify
themselves, discover and verify the identities of other entities of interest,
dynamically establish relationships with these entities, and interact in a
secure manner. Further, the middleware itself should be secure, reliable and
robust against new and insidious forms of attack that use self-management based
on high-level policies to their own advantage.
4.4 Application Challenges
The key challenge at the application level is the formulation and development of
systems and applications that are capable of managing (i.e., configuring,
adapting, optimizing, protecting, healing) themselves. This includes programming
models, frameworks and middleware services that support the definition of
autonomic elements, the development of autonomic applications as the dynamic and
opportunistic composition of these autonomic elements, and the policy-, content-
and context-driven definition, execution and management of these applications.
5. Applications
In the case of an autonomic computing system, we can think of survivability as
the system’s ability to protect itself, recover from faults, reconfigure as
required by changes in the environment, and always maintain its operations at
near-optimal performance. Its equilibrium is impacted by both the internal
environment (e.g., excessive memory/CPU utilization) and the external
environment (e.g., protection from an external attack).
Sample self-managing system/application behaviors include installing software
when it is detected that the software is missing (self-configuration),
restarting a failed element (self-healing), adjusting the current workload when
an increase in capacity is observed (self-optimization) and taking resources
offline if an intrusion attempt is detected (self-protection). Each of the
characteristics listed above represents an active research area. Generally,
self-management is addressed in four primary system/application aspects, i.e.,
configuration, optimization, protection, and healing. Further, self-management
solutions typically consist of the following steps [16][17][19]:
• The application and underlying information infrastructure provide information
to enable context and self-awareness.
• System/application events trigger analysis, deduction and planning using
system knowledge.
• Plans are executed using the adaptive capabilities of the application/system.
An autonomic application or system implements self-managing attributes using the
control loops described above to collect information, make decisions, and adapt,
as necessary.
The expected benefits include:
• Reduction in human intervention.
• Reduction in complexity and efficient utilization of resources in networks and
web applications.
• Establishment of dynamic behavior and self-monitoring systems in every aspect.
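The sample behaviors listed earlier in this section (an event is detected, then a matching self-managing action is planned and executed) can be sketched as a simple dispatch table. The event names, handler functions and returned strings are hypothetical placeholders, not part of any autonomic computing framework.

```python
def install_software(target):
    return f"install missing software on {target}"


def restart_element(target):
    return f"restart {target}"


def rebalance_workload(target):
    return f"shift more workload to {target}"


def take_offline(target):
    return f"take {target} offline"


# Hypothetical event names mapped to (self-management aspect, action).
HANDLERS = {
    "software_missing": ("self-configuration", install_software),
    "element_failed": ("self-healing", restart_element),
    "capacity_increased": ("self-optimization", rebalance_workload),
    "intrusion_detected": ("self-protection", take_offline),
}


def handle(event, target):
    """Plan and 'execute' the action for one system/application event."""
    if event not in HANDLERS:
        raise KeyError(f"no self-managing behavior defined for {event!r}")
    aspect, action = HANDLERS[event]
    return aspect, action(target)


result = handle("element_failed", "db-1")
```

A real system would replace the fixed table with analysis and planning over system knowledge; the sketch only shows how events, aspects and actions relate.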

Conclusion
Realistically, such systems will be very difficult to build and will require
significant exploration of new technologies and innovations. That is why we view
this as a Grand Challenge for the entire I/T industry. We will need to make
progress along two tracks: making individual system components autonomic, and
achieving autonomic behavior at the level of global enterprise I/T systems. The
second track may prove to be extremely challenging. Unless each component in a
system can share information with every other part and contribute to some
overall system awareness and regulation, the goal of autonomic computing will
not really be reached. So one huge technical challenge entails figuring out how
to create this “global” system awareness and management. Or, to put it another
way, how do we optimize the entire stack of computing layers as a whole? It is
not something we currently know how to do.
We know there are also many interim challenges, such as how to create the proper
“adaptive algorithms”, sets of rules that can take previous system experience
and use that information to improve the rules, or how to balance what these
algorithms “remember” with what they ignore. We humans tend to be very good at
the latter (we call it “forgetting”), and at times it can be a good thing: we
can retain only significant information and not be distracted by extraneous
data. Still another problem to solve is how to design an architecture for
autonomic systems that provides consistent interfaces and points of control
while allowing for a heterogeneous environment. We could go on, as the list of
problems is actually quite long, but it is not so daunting as to render
autonomic computing another dream of science fiction.
Also, some aspects of autonomic computing are not entirely new to the I/T
industry. For instance, the protocols and standards used for forwarding packets
of information across the Internet (the most well-known being TCP/IP) allow some
relatively simple functions, such as routing, to occur with little human
direction. And since mainframe computers have historically been entrusted with
important, “mission critical” work for businesses and governments, they have had
increasing levels of self-regulation and self-diagnosis built in. Such machines
now boast “availability rates” (the percentage of time they are functioning
properly) in the 99.999 percent range. But this innovation needs to be taken to
an entirely new level.

Bibliography
[1] A. Ganek and T. Corbi, “The dawning of the autonomic computing era,” IBM
Systems Journal, vol. 42, no. 1, pp. 5–18, 2003.
[2] D. Patterson, “A new focus for a new century: availability and
maintainability performance,” Keynote speech at USENIX FAST, January 2002.
[3] Ubuntu Linux. Canonical Ltd. [Online]. Available: http://www.ubuntu.com/
[4] Mac OS X. Apple Inc. [Online]. Available: http://www.apple.com/macosx/
[5] G. Malkin, “Routing Information Protocol (RIP) version 2,” Internet
Engineering Task Force, RFC 2453, November 1998.
[6] J. Moy, “OSPF Version 2,” Internet Engineering Task Force, RFC 2328, 1998.
[7] Y. Rekhter, T. Li, and S. Hares, “A Border Gateway Protocol 4 (BGP-4),”
Internet Engineering Task Force, RFC 4271, 2006.
[8] J. Abbate, Inventing the Internet. MIT Press, 1999.
[9] Optimization. Merriam-Webster’s Online Dictionary. [Online]. Available:
http://www.m-w.com/dictionary/optimization
[10] I. Sutherland, “A futures market in computer time,” Communications of the
ACM, vol. 11, no. 6, pp. 449–451, 1968.
[11] W. Wang and B. Li, “Market-based self-optimization for autonomic service
overlay networks,” IEEE Journal on Selected Areas in Communications, vol. 23,
no. 12, pp. 2320–2332, 2005.
[12] T. Kelly, “Utility-directed allocation,” First Workshop on Algorithms and
Architectures for Self-Managing Systems, pp. 2003–115, 2003.
[13] R. Das, I. Whalley, and J. Kephart, “Utility-based collaboration among
autonomous agents for resource allocation in data centers,” Proceedings of the
Fifth International Joint Conference on Autonomous Agents and Multiagent
Systems, pp. 1572–1579, 2006.
[14] J. Kephart and W. Walsh, “An artificial intelligence perspective on
autonomic computing policies,” Proceedings of the Fifth IEEE International
Workshop on Policies for Distributed Systems and Networks (POLICY 2004),
pp. 3–12, 2004.
[15] W. Walsh, G. Tesauro, J. Kephart, and R. Das, “Utility functions in
autonomic systems,” Proceedings of the International Conference on Autonomic
Computing, pp. 70–77, 2004.
[16] M. Kankaanniemi, “Self-optimization in autonomic systems,” Department of
Computer Science, University of Helsinki. Email:
marko.kankaanniemi@cs.helsinki.fi
[17] www.research.ibm.com/autonomic/
[18] http://www.ibm.com/research/autonomic
[19] M. Parashar and S. Hariri, “Autonomic computing: An overview,” The Applied
Software Systems Laboratory, Rutgers University, Piscataway NJ, USA, and High
Performance Distributed Computing Laboratory, University of Arizona, Tucson, AZ,
USA. Email: parashar@caip.rutgers.edu, hariri@ece.arizona.edu