
2011 Seventh International Conference on Computational Intelligence and Security

A Markov Chain-based Availability Model of Virtual Cluster Nodes

Jianhua Che

Information & Network Security Key Lab. of State Grid State Grid Electric Power Research Institute Nanjing, China chejianhua@zju.edu.cn

Weimin Lin

Information & Network Security Key Lab. of State Grid State Grid Electric Power Research Institute Nanjing, China linweimin@sgepri.sgcc.com.cn

Tao Zhang

Information & Network Security Key Lab. of State Grid State Grid Electric Power Research Institute Nanjing, China zhangtao@sgepri.sgcc.com.cn

Houwei Xi

Information & Network Security Key Lab. of State Grid State Grid Electric Power Research Institute Nanjing, China xihouwei@sgepri.sgcc.com.cn

Abstract: Thanks to virtualization technology, a virtual cluster system offers many advantages over a traditional cluster system. However, efficient methods for analyzing the availability of virtual cluster systems are still lacking, and the availability analysis of a virtual cluster node is the basis for analyzing the availability of the whole system. In this paper, we summarize a typical architecture paradigm of virtual cluster nodes by studying the overall architecture of virtual cluster systems and the deployment style of virtual cluster nodes: two active virtual cluster nodes are built on one physical machine and their standby virtual cluster nodes are built on another physical machine. We give its state transition diagram by analyzing the complete lifecycle of a virtual cluster node and the transition conditions between node states, and design a Markov chain-based availability model for this paradigm. The model characterizes the lifecycle states and state transitions of a virtual cluster node and provides an efficient way to assess the availability level of each virtual cluster node in a complicated virtual cluster system or cloud data center. Finally, numerical simulation experiments demonstrate the practicability of the proposed model.

Keywords-virtual cluster node; availability modeling; Markov Chain; virtualization

I. INTRODUCTION

The resurgence of the virtual machine (VM) [13] provides a highly efficient solution for many IT service demands and important business applications, e.g. cloud computing [10] and the Internet Data Center (IDC) [14]. As virtual machines became widespread, they were introduced into traditional cluster systems, which gave rise to the virtual cluster system. A Virtual Cluster System (VCS) is a cluster system that installs cluster nodes in virtual machines and manages them with virtualization technology [6]. Compared with a traditional cluster system, a virtual cluster system offers higher resource utilization, lower standby cost, simpler management and higher availability [11]. However, it also has several defects: the virtual machine monitor (VMM) [9] introduces some instability, and a virtual cluster node can be a Single Point of Failure (SPOF) [2]. How to evaluate the availability of a virtual cluster system is therefore attracting the attention of many researchers. Furthermore, the availability of a virtual cluster node is the basis for evaluating the availability of the whole system. Because a virtual machine has many features that a physical machine lacks, the availability models of traditional cluster nodes are not suited to the availability analysis of virtual cluster nodes. It is therefore necessary to model the availability of virtual cluster nodes. In this paper, we propose a Markov chain-based model for analyzing the availability level of virtual cluster nodes and validate its practicability with numerical simulation experiments. Specifically, the contributions of this paper are as follows:

First, we summarize a typical architecture paradigm of virtual cluster nodes by studying the overall architecture of virtual cluster systems and the deployment style of virtual cluster nodes. Second, we describe the state transition diagram of this typical kind of virtual cluster node by analyzing its complete lifecycle states and the transition conditions between them. Third, we design an availability model based on Markov chain theory to analyze the availability level of virtual cluster nodes in a virtual cluster system or cloud data center.

The rest of this paper is organized as follows. Section II reviews related work. Section III introduces the proposed Markov chain-based availability model for one typical paradigm of virtual cluster nodes. Section IV validates the practicability of the proposed model with several numerical simulation experiments. Section V concludes with a discussion.

II. RELATED WORK

Although the availability evaluation of traditional computer systems has been studied extensively, efficient methods for evaluating the availability of virtual cluster systems are still lacking. Johnson and Malek [7] gave an early survey of the availability analysis models and evaluation tools of traditional cluster systems, and introduced the elements of an availability model (fault rate, recovery time and service cost) and the analysis models (fault tree, reliability diagram, Markov chain and stochastic Petri net). After introducing some basic concepts of availability, Wood [1] explored the use of Markov models in availability analysis. Regarding the lifecycle states and availability level of a virtual machine, Farr et al. [4] extended the virtual machine state types of the DMTF specification with two kinds of states, Active and Inactive. The Active state includes Operational and Gemini, and the Inactive state includes Planned and Unplanned. Hence there are six kinds of virtual machine states: Latent, Defined, Active, Inactive, Paused and Suspend. Le et al. [8] studied fault injection in virtual machine systems and the use of virtual machines for fault injection in traditional computer systems. Qin and Xie [12] analyzed the mutual impact between the scheduling of multiple applications and availability in a heterogeneous system, and modeled all nodes of a heterogeneous system according to the computing power and availability data of every node. Cully et al. [3] presented a way of building a general high-availability service framework with virtual machines and developed a prototype system, Remus. Fischer and Mitasch [5] summarized the availability problems of a virtual machine system and gave a node architecture scheme that can increase the availability of a virtual cluster system. Thein et al. [14] optimized the rejuvenation process and improved the fault tolerance of computer systems with virtualization and software rejuvenation, designed a framework that can increase the survivability of a distributed system, and clarified the relation between the availability of a virtual machine system and the number of backup virtual machines. Building on that work, Thein et al. evaluated the availability of a virtual cluster system using software rejuvenation [17], gave a formal description of a multiple-virtual-machine system with a state transition diagram, and verified their work with numerical simulation experiments [16]. The work of this paper is based on Thein's work, but its model semantics and parameter definitions differ from theirs.

978-0-7695-4584-4/11 $26.00 © 2011 IEEE DOI 10.1109/CIS.2011.118

III. MARKOV CHAIN-BASED AVAILABILITY MODEL OF VIRTUAL CLUSTER NODES

Availability can be analyzed with many models, for example fault trees, reliability block diagrams, Markov chains and stochastic Petri nets. The Markov chain was first proposed by the Russian mathematician Andrey Markov in 1907, and is often used to model the availability of fault-tolerant computer systems, dynamically redundant computer systems, and computer systems with sequence-dependent faults and recovery. In the Markov chain model, the stochastic process of a running virtual cluster node is represented by a series of state transitions. The states of a virtual cluster node are the vertices of a state diagram, the transitions between states are its edges, and the conditional probability of each transition is the weight of the corresponding edge. The conditional probability of a virtual cluster node moving from one state to the next is determined only by the current state and is independent of earlier states. Common ways of solving a Markov chain model include the steady-state solution, the transient solution and the decomposition method.
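To make the steady-state solution concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a continuous-time chain described by transition rates and solves the balance equations together with the normalisation condition. The function name `steady_state` and the two-state "up"/"down" example with its rates are hypothetical and chosen only for illustration.

```python
def steady_state(states, rates):
    """Return the steady-state probabilities of a continuous-time Markov chain.

    states: list of state names.
    rates:  dict mapping (source, target) to a transition rate.
    """
    n = len(states)
    index = {name: i for i, name in enumerate(states)}

    # Row j holds the balance equation of state j: inflow minus outflow = 0.
    a = [[0.0] * n for _ in range(n)]
    for (src, dst), rate in rates.items():
        a[index[dst]][index[src]] += rate   # flow entering dst from src
        a[index[src]][index[src]] -= rate   # flow leaving src
    # The balance equations are linearly dependent, so the last one is
    # replaced by the normalisation condition sum(pi) = 1.
    a[n - 1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-15:
            raise ValueError("chain has no unique steady state")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]

    # Back substitution.
    pi = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(a[row][k] * pi[k] for k in range(row + 1, n))
        pi[row] = (b[row] - known) / a[row][row]
    return dict(zip(states, pi))


# Hypothetical two-state example: fails at rate 0.01, is repaired at rate 1.0.
pi = steady_state(["up", "down"], {("up", "down"): 0.01, ("down", "up"): 1.0})
availability = pi["up"]   # equals 1.0 / (0.01 + 1.0) up to rounding
```

The same routine could in principle be applied to a larger state diagram by listing its states and rates, provided every state can reach every other state.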

A. One Typical Paradigm of Virtual Cluster Nodes

In a traditional cluster system, all nodes are built on physical machines, and almost every active node has a corresponding standby node to improve its availability. In a virtual cluster system, however, many or even all nodes, both active and standby, may be built in virtual machines. One typical paradigm (we call it 2VNs/2PMs) is that two active nodes are built in two virtual machines hosted on one physical machine, and their standby nodes are built in two other virtual machines hosted on another physical machine, as shown in Figure 1.

Figure 1. One typical fundamental paradigm of virtual cluster nodes

This paradigm not only increases the resource utilization of virtual cluster nodes but also reduces their standby cost. It also relieves the Single Point of Failure (SPOF) problem to a certain extent. Hence, this paradigm is widely used.

B. Basic Concept and Prerequisite Condition

During its complete lifecycle, a virtual cluster node usually has five main states: Normal, Unsteady, Rejuvenation, Switchover and Failure. Normal means that the node is working normally. Unsteady means that the node is in an abnormal, unstable stage in which its service is still available but with decreased performance. Rejuvenation means that the node is returning from the Unsteady state to the Normal state. Switchover means that a node in the Unsteady state is switching to its standby node because of an unrecoverable fault. Failure means that the node has stopped working because of a failure.

A node in the Normal state may enter the Unsteady state after running for some time. A node in the Unsteady state has three possible next states: Rejuvenation, Switchover and Failure. A recoverable node enters the Rejuvenation state and returns to the Normal state through software rejuvenation. An unrecoverable node enters the Switchover state, live-migrates its run-time context and application workload to the standby virtual cluster node, and then enters the Failure state. A node that has no time to migrate its run-time context and application workload, because of a sudden unexpected fault, goes directly into the Failure state. A node in the Switchover state eventually fails, and its service is then provided by its standby virtual cluster node, which has the same states and state transitions. The state transition diagram of virtual cluster nodes is shown in Figure 2.

Figure 2. The state transition of virtual cluster nodes
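The lifecycle described in this subsection can be written down as a small transition table. The sketch below is illustrative only: the names `State`, `TRANSITIONS` and `is_valid_lifecycle` are hypothetical, and the Failure to Normal edge is taken from the parameter definitions in Table I rather than from the prose above.

```python
from enum import Enum


class State(Enum):
    NORMAL = "Normal"
    UNSTEADY = "Unsteady"
    REJUVENATION = "Rejuvenation"
    SWITCHOVER = "Switchover"
    FAILURE = "Failure"


# Allowed next states for each lifecycle state of a virtual cluster node.
TRANSITIONS = {
    State.NORMAL: {State.UNSTEADY},
    State.UNSTEADY: {State.REJUVENATION, State.SWITCHOVER, State.FAILURE},
    State.REJUVENATION: {State.NORMAL},
    State.SWITCHOVER: {State.FAILURE},
    State.FAILURE: {State.NORMAL},  # repair, as listed among the model parameters
}


def is_valid_lifecycle(path):
    """Check that every consecutive pair of states in path is an allowed transition."""
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(path, path[1:]))


recoverable = [State.NORMAL, State.UNSTEADY, State.REJUVENATION, State.NORMAL]
skips_unsteady = [State.NORMAL, State.FAILURE]
```

Here `is_valid_lifecycle(recoverable)` holds, while `is_valid_lifecycle(skips_unsteady)` does not, which mirrors the hypothesis that a direct Normal to Failure transition is neglected.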

Based on the above analysis, we define the model parameters as shown in Table I. From these definitions, the rejuvenation time of a virtual cluster node in the Rejuvenation state is 1/γ, and the switchover time needed by a node in the Switchover state to migrate to its standby node is 1/δ. The model parameters are subject to the following prerequisite conditions and hypotheses:

TABLE I. THE DEFINITION OF MARKOV CHAIN-BASED AVAILABILITY MODEL PARAMETERS

| Parameter | The definition of model parameters |
|---|---|
| N | The time ratio of a VCN staying in Normal state |
| U | The time ratio of a VCN staying in Unsteady state |
| R | The time ratio of a VCN staying in Rejuvenation state |
| S | The time ratio of a VCN staying in Switchover state |
| F | The time ratio of a VCN staying in Failure state |
| η | The frequency of a VCN changing from Normal to Unsteady state |
| ε | The probability of a VCN changing from Unsteady to Normal state |
| γ | The probability of a VCN changing from Rejuvenation to Normal state |
| τ | The probability of a VCN migrating from active to standby node |
| 1/δ | The frequency of a VCN migrating from Switchover state to standby node |
| λ | The probability of a VCN changing from Unsteady to Failure state |
| μ | The frequency of a VCN changing from Failure to Normal state |

VCN: virtual cluster node.

(1) The parameters η, ε, γ, τ, δ, λ and μ are the same for all virtual cluster nodes and do not change over time. (2) Compared with the other probabilities, the probability of a virtual cluster node moving directly from the Normal state to the Failure state is negligible. (3) A virtual cluster node can still provide continuous service during the rejuvenation process.

C. State Transition Diagram and Analysis Model

Of the two physical machines, each of which hosts two virtual cluster nodes, one is the standby machine of the other. When an active virtual cluster node hosted by the active physical machine fails, its run-time context and application workload are migrated to the standby virtual cluster node hosted by the standby physical machine. When all virtual cluster nodes hosted by a physical machine fail, the physical machine fails.

The state transition diagram of the 2VNs/2PMs paradigm is shown in Figure 3. Here N denotes the Normal state, U the Unsteady state, R the Rejuvenation state, S the Switchover state and F the Failure state. The suffixes 1 and 2 give the number of virtual cluster nodes hosted by the same physical machine, the suffix A marks an active virtual cluster node, and the suffix S marks a standby one.

Figure 3. The state transition diagram of the 2VNs/2PMs paradigm

According to the hypotheses in the previous section, the balance equations of the state transition diagram are as follows:

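\begin{align}
\eta\,\pi^{N}_{1A} &= (\lambda+\varepsilon+\tau)\,\pi^{U}_{1A} \tag{1}\\
\varepsilon\,\pi^{U}_{1A} &= \gamma\,\pi^{R}_{1A} \tag{2}\\
\eta\,\pi^{N}_{1S} &= \gamma\,\pi^{R}_{1S} + \delta\,\pi^{S}_{1A} + \mu\,\pi^{F}_{S} \tag{3}\\
\eta\,\pi^{N}_{1S} &= (\lambda+\varepsilon+\tau)\,\pi^{U}_{1S} \tag{4}\\
\varepsilon\,\pi^{U}_{1S} &= \gamma\,\pi^{R}_{1S} \tag{5}\\
\tau\,\pi^{U}_{1S} &= \delta\,\pi^{S}_{1S} \tag{6}\\
\eta\,\pi^{N}_{2A} &= \lambda\,\pi^{U}_{1A} + \gamma\,\pi^{R}_{2A} + \delta\,\pi^{S}_{2S} \tag{7}\\
\eta\,\pi^{N}_{2A} &= (\lambda+\varepsilon+\tau)\,\pi^{U}_{2A} \tag{8}\\
\varepsilon\,\pi^{U}_{2A} &= \gamma\,\pi^{R}_{2A} \tag{9}\\
\tau\,\pi^{U}_{2A} &= \delta\,\pi^{S}_{2A} \tag{10}\\
\eta\,\pi^{N}_{2S} &= \lambda\,\pi^{U}_{1S} + \gamma\,\pi^{R}_{2S} + \delta\,\pi^{S}_{2A} \tag{11}\\
\eta\,\pi^{N}_{2S} &= (\lambda+\varepsilon+\tau)\,\pi^{U}_{2S} \tag{12}\\
\varepsilon\,\pi^{U}_{2S} &= \gamma\,\pi^{R}_{2S} \tag{13}\\
\tau\,\pi^{U}_{2S} &= \delta\,\pi^{S}_{2S} \tag{14}\\
\lambda\,\pi^{U}_{2A} &= \mu\,\pi^{F}_{A} \tag{15}\\
\lambda\,\pi^{U}_{2S} &= \mu\,\pi^{F}_{S} \tag{16}
\end{align}

By summing all state probabilities, the conservation equation of the state transition diagram is as follows:

\begin{equation}
\sum_{i=1}^{2}\pi^{N}_{iA} + \sum_{i=1}^{2}\pi^{U}_{iA} + \sum_{i=1}^{2}\pi^{R}_{iA} + \sum_{i=1}^{2}\pi^{S}_{iA} + \sum_{i=1}^{2}\pi^{N}_{iS} + \sum_{i=1}^{2}\pi^{U}_{iS} + \sum_{i=1}^{2}\pi^{R}_{iS} + \sum_{i=1}^{2}\pi^{S}_{iS} + \pi^{F}_{A} + \pi^{F}_{S} = 1 \tag{17}
\end{equation}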
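As an intermediate step, which is our reading of balance equations (1), (2), (10) and (15) rather than a derivation given by the authors, each Normal, Rejuvenation, Switchover and Failure probability can be written as a multiple of an Unsteady-state probability. This appears to be why the expressions that follow are all stated relative to an Unsteady-state probability. The four relations below are representative; the remaining ones follow the same pattern from the corresponding balance equations.

```latex
\begin{align*}
\pi^{N}_{1A} &= \frac{\lambda+\varepsilon+\tau}{\eta}\,\pi^{U}_{1A}
  && \text{from (1)}\\
\pi^{R}_{1A} &= \frac{\varepsilon}{\gamma}\,\pi^{U}_{1A}
  && \text{from (2)}\\
\pi^{S}_{2A} &= \frac{\tau}{\delta}\,\pi^{U}_{2A}
  && \text{from (10)}\\
\pi^{F}_{A} &= \frac{\lambda}{\mu}\,\pi^{U}_{2A}
  && \text{from (15)}
\end{align*}
```

Substituting relations of this form into the conservation equation leaves the Unsteady-state probabilities as the only unknowns.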

Combining the balance equations (1)-(16) with the conservation equation (17), we obtain the following expressions for the state probabilities:

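\begin{align}
\pi^{U}_{1A} &= \frac{\lambda+2\tau+2}{2(\lambda+\tau)}\,\pi^{U}_{1S} \tag{18}\\
\pi^{U}_{2A} &= \frac{\lambda+4\tau+2}{2(\lambda+2\tau)}\,\pi^{U}_{1S} \tag{19}\\
\pi^{U}_{2S} &= \frac{2\lambda(\lambda+\tau)+\tau(\lambda+4\tau+2)}{2(\lambda+\tau)(\lambda+2\tau)}\,\pi^{U}_{1S} \tag{20}
\end{align}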

Furthermore, we can obtain the closed-form expression of the virtual cluster node availability model for the 2VNs/2PMs paradigm as follows:

[Equation (21): closed-form expression for \pi^{U}_{1S} in terms of \lambda, \tau, \varepsilon, \gamma, \mu and \eta; the formula is not recoverable from the extracted text.]

Because a virtual cluster node in the 2VNs/2PMs paradigm is unavailable in the Switchover and Failure states, its steady-state availability is:

\begin{equation*}
A = \lim_{t\to\infty} A(t) = 1 - \left(\pi^{S}_{1A} + \pi^{S}_{1S} + \pi^{S}_{2A} + \pi^{S}_{2S} + \pi^{F}_{A} + \pi^{F}_{S}\right)
\end{equation*}

So the downtime in a given time interval L is:

\begin{equation*}
DT(L) = \left(\pi^{S}_{1A} + \pi^{S}_{1S} + \pi^{S}_{2A} + \pi^{S}_{2S} + \pi^{F}_{A} + \pi^{F}_{S}\right) \times L
\end{equation*}

And the cost of downtime is:

\begin{equation*}
C(L) = \left(\pi^{S}_{1A} C^{S}_{1A} + \pi^{S}_{1S} C^{S}_{1S} + \pi^{S}_{2A} C^{S}_{2A} + \pi^{S}_{2S} C^{S}_{2S} + \pi^{F}_{A} C^{F}_{A} + \pi^{F}_{S} C^{F}_{S}\right) \times L
\end{equation*}
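The three quantities above (steady-state availability, downtime over an interval L, and downtime cost) reduce to simple sums once the Switchover and Failure probabilities are known. The sketch below shows that arithmetic. The state keys, the function names, and every number in the example are hypothetical placeholders, not results from the paper.

```python
# States in which a node of the 2VNs/2PMs paradigm is unavailable:
# the four Switchover states and the two Failure states.
UNAVAILABLE = ("S1A", "S1S", "S2A", "S2S", "FA", "FS")


def unavailability(pi):
    """Sum of the steady-state probabilities of the unavailable states."""
    return sum(pi[state] for state in UNAVAILABLE)


def availability(pi):
    """Steady-state availability A = 1 - unavailability."""
    return 1.0 - unavailability(pi)


def downtime(pi, interval):
    """Expected downtime DT(L) over an interval of length `interval`."""
    return unavailability(pi) * interval


def downtime_cost(pi, cost_per_unit_time, interval):
    """Expected downtime cost C(L), with one unit cost per unavailable state."""
    weighted = sum(pi[state] * cost_per_unit_time[state] for state in UNAVAILABLE)
    return weighted * interval


# Illustrative probabilities and unit costs only (not taken from the paper).
example_pi = {"S1A": 1e-5, "S1S": 1e-5, "S2A": 1e-5, "S2S": 1e-5,
              "FA": 3e-5, "FS": 3e-5}
example_cost = {"S1A": 100.0, "S1S": 100.0, "S2A": 100.0, "S2S": 100.0,
                "FA": 500.0, "FS": 500.0}
hours_per_year = 8760.0

a = availability(example_pi)
dt = downtime(example_pi, hours_per_year)
cost = downtime_cost(example_pi, example_cost, hours_per_year)
```

With these placeholder inputs the unavailable probabilities sum to 1e-4, so the sketch gives an availability of about 99.99% and a little under one hour of downtime per year.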

IV. ANALYSIS AND VALIDATION OF THE PROPOSED MODEL

Because precise values of the seven model parameters are hard to measure, we choose six tuples of average parameter values out of 10,00 real data tuples, as shown in Table II, to represent six different availability levels, and analyze the availability of the 2VNs/2PMs paradigm using the proposed availability model. All real data comes from one of our web servers. In the experiments, every model parameter has a default value. In the row order of Table II, the defaults are 2 times/month, 75%, 6 times/hour, 24%, 6 seconds, 2 times/year and 2 times/month. When one model parameter varies over its six values in Table II, the other model parameters keep their default values.

TABLE II. SIX TUPLES OF AVERAGE MODEL PARAMETER VALUES IN THE EXPERIMENTS

| Model parameter | The 1st tuple | The 2nd tuple | The 3rd tuple | The 4th tuple | The 5th tuple | The 6th tuple |
|---|---|---|---|---|---|---|
| (times/month) | 1 | 2 | 3 | 4 | 5 | 6 |
| | 60% | 65% | 70% | 75% | 80% | 85% |
| (times/hour) | 6 | 10 | 12 | 15 | 20 | 30 |
| | 30% | 25% | 20% | 15% | 10% | 5% |
| 1/ (seconds) | 4 | 6 | 10 | 30 | 60 | 300 |
| (times/year) | 1 | 2 | 3 | 4 | 5 | 6 |
| (times/month) | 1 | 2 | 4 | 8 | 30 | 60 |
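The experiment design described above, where one parameter is varied while the others stay at their defaults, can be sketched as a one-at-a-time sweep. The values below are the ones printed in Table II and in the list of defaults. The row labels (`row1_times_per_month` and so on) are hypothetical placeholders, and pairing each default with a row by position is an assumption based on the matching units.

```python
# (label, default value, six values from Table II). Labels are placeholders;
# pairing defaults with rows by position is an assumption.
TABLE_II = [
    ("row1_times_per_month", 2, [1, 2, 3, 4, 5, 6]),
    ("row2_probability", 0.75, [0.60, 0.65, 0.70, 0.75, 0.80, 0.85]),
    ("row3_times_per_hour", 6, [6, 10, 12, 15, 20, 30]),
    ("row4_probability", 0.24, [0.30, 0.25, 0.20, 0.15, 0.10, 0.05]),
    ("row5_seconds", 6, [4, 6, 10, 30, 60, 300]),
    ("row6_times_per_year", 2, [1, 2, 3, 4, 5, 6]),
    ("row7_times_per_month", 2, [1, 2, 4, 8, 30, 60]),
]


def one_at_a_time(table):
    """Yield (varied_name, value, scenario) with all other parameters at default."""
    defaults = {name: default for name, default, _ in table}
    for name, _, values in table:
        for value in values:
            scenario = dict(defaults)   # copy so each scenario is independent
            scenario[name] = value
            yield name, value, scenario


scenarios = list(one_at_a_time(TABLE_II))
```

Each of the seven parameters contributes six scenarios, so an availability model would be evaluated 42 times in total, once per scenario.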

The relation between the seven model parameters and the availability of the 2VNs/2PMs paradigm is illustrated in Figures 4 and 5. Figure 4(a) shows that the availability of the 2VNs/2PMs paradigm increases as ε increases, and that the availability of the 1VN/1PM paradigm (used as a benchmark) increases more, because a larger ε means that, in the same interval, a virtual cluster node is more likely to enter the Rejuvenation state and less likely to enter the Failure state. Because the 1VN/1PM paradigm has no standby virtual machine to switch to when it fails, its availability is strongly influenced by the failure rate (F) but not by the model parameters τ and 1/δ.

Figure 4. The relation between the model parameters and the availability of virtual cluster node. Panels (a) to (d) plot availability (%) for the 1VN/1PM and 2VNs/2PMs paradigms.

According to Figure 4(b), the availability of the 2VNs/2PMs paradigm degrades as the value of the plotted parameter increases, but to a lesser extent than that of the 1VN/1PM paradigm because of its higher rejuvenation rate. As Figure 4(c) shows, the availability of the 2VNs/2PMs paradigm degrades as the probability of virtual cluster nodes entering the Switchover state increases. This is mainly because a switchover between an active virtual cluster node and its standby virtual cluster node makes the node unavailable for a while and so lowers its availability. Figure 4(d) shows that the availability of virtual cluster nodes degrades as the switchover time (i.e. the downtime during live migration) increases.

Figure 5. The relation between the model parameters and the availability of virtual cluster node. Panels (e) to (g) plot availability (%) for the 1VN/1PM and 2VNs/2PMs paradigms; the horizontal axis is in times/year for (e) and in times/month for (f) and (g).

According to Figure 5(e), the availability of the 2VNs/2PMs paradigm degrades as the plotted parameter (in times/year) increases, and the availability of the 1VN/1PM paradigm degrades more. As Figure 5(f) shows, the availability of the 2VNs/2PMs paradigm also degrades as the plotted parameter (in times/month) increases, but it is influenced to a lesser extent. Figure 5(g) shows that the availability of the 2VNs/2PMs paradigm increases as the plotted parameter (in times/month) increases, and that the availability of the 1VN/1PM paradigm increases more.

V. CONCLUSION AND FUTURE WORK

The availability evaluation of a virtual cluster system is an important issue for its promotion and application, and the availability evaluation of a virtual cluster node is the basis of the availability evaluation of the whole system. This paper summarized one typical architecture paradigm of virtual cluster nodes by analyzing the overall architecture of virtual cluster systems and the deployment style of virtual cluster nodes, gave the state transition diagram of this paradigm by studying the complete lifecycle of virtual cluster nodes and the transition conditions between node states, and proposed an availability model based on Markov chain theory to support the availability analysis of virtual cluster nodes and virtual cluster systems. Finally, the results of numerical simulation experiments demonstrated the practicability of the proposed availability model. In the future, we will study the relation between the availability of a virtual cluster node and that of a virtual cluster system.

ACKNOWLEDGMENT

This work is supported by the State Key Development Program for Basic Research of China ("973 project", No.2007CB310900) and the 2010 Annual Funding Project of Baoding Association of society and Science (No.20100309). The authors want to thank Prof. Qinming He of Zhejiang University for his helpful advice.

REFERENCES

[1] A. Wood. Availability modeling: understanding Markov models to calculate system reliability. Circuits & Devices, pp. 22-27, 1994.

[2] M. Aldinucci, M. Danelutto, M. Torquati, F. Polzella, G. Spinatelli, M. Vanneschi, A. Gervaso, M. Cacitti, and P. Zuccato. VirtuaLinux: virtualized high-density clusters with no single point of failure. Proc. of the Int. Conference ParCo2007, Vol. 38, pp. 355-362, 2007.

[3] B. Cully, G. Lefebvre, D. Meyer, M. Feeley, N. Hutchinson, and A. Warfield. Remus: High Availability via Asynchronous Virtual Machine Replication. Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation, San Francisco, California, pp. 161-174, 2008.

[4] E. Farr, R. Harper, L. Spainhower, and J. Xenidis. A Case for High Availability in a Virtualized Environment (HAVEN). Proceedings of the 2008 Third International Conference on Availability, Reliability and Security, pp. 675-682, 2008.

[5] W. Fischer and C. Mitasch. High availability clustering of virtual machines-possibilities and pitfalls. Paper for the talk at the 12th Linuxtag, Wiesbaden/Germany, May 3rd-6th, 2006.

[6] I. Foster, T. Freeman, K. Keahey, D. Scheftner, B. Sotomayor, and X. Zhang. Virtual clusters for grid communities. Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid, pp. 513-520, 2006.

[7] A.M. Johnson Jr. and M. Malek. Survey of software tools for evaluating reliability, availability, and serviceability. ACM Computing Surveys (CSUR), 20(4): 227-269, 1988.

[8] M. Le, A. Gallagher, and Y. Tamir. Challenges and Opportunities with Fault Injection in Virtualized Systems. First International Workshop on Virtualization Performance: Analysis, Characterization, and Tools, Austin, Texas, April 2008.

[9] M. Rosenblum and T. Garfinkel. Virtual Machine Monitors: Current Technology and Future Trends. IEEE Computer, 38(5): 39-47, 2005.

[10] M.A. Vouk. Cloud computing-Issues, research and implementations. Journal of Computing and Information Technology, 16(4): 235-246, 2008.

[11] H. Nishimura, N. Maruyama, and S. Matsuoka. Virtual clusters on the fly-fast, scalable, and flexible installation. Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid, Rio de Janeiro, Brazil, pp. 549-556, 2007.

[12] X. Qin and T. Xie. An availability-aware task scheduling strategy for heterogeneous systems. IEEE Transactions on Computers, 57(2): 188-199, 2008.

[13] M. Steinder, I. Whalley, and D. Chess. Server virtualization in autonomic management of heterogeneous workloads. ACM SIGOPS Operating Systems Review, 42(1): 94-95, 2008.

[14] T. Thein, M. Pokharel, S.D. Chi, and J.S. Park. A Recovery Model for Survivable Distributed Systems through the Use of Virtualization. The Fourth International Conference on Networked Computing and Advanced Information Management (NCM’08), Gyeongju, Korea, September 2-4, 2008.

[15] T. Thein and J.S. Park. Availability analysis of application servers using software rejuvenation and virtualization. Journal of Computer Science and Technology, 24(2): 339-346, Mar. 2009.

[16] T. Thein, J.S. Park, and S.D. Chi. Availability Modeling and Analysis on Virtualized Clustering with Rejuvenation. International Journal of Computer Science and Network Security, Vol. 8, No. 9, September 2008.

[17] T. Thein, S.D. Chi, and J.S. Park. Improving Fault Tolerance by Virtualization and Software Rejuvenation. Proceedings of the 2008 Second Asia International Conference on Modeling & Simulation (AMS), pp. 855-860, 2008.