
JOURNAL OF COMPUTER SCIENCE AND ENGINEERING, VOLUME 16, ISSUE 2, DECEMBER 2012 1

Future Request Predicting Disk Scheduler for Virtualization


Ashwini Meshram and Urmila Shrawankar
Abstract— Virtualization technology is growing rapidly and is widely used in both academia and industry. It allows multiple operating systems to run on the same physical setup: instead of rebooting, the user can switch between operating systems virtually. There is therefore considerable focus on optimizing virtual machine performance. In a virtualized environment, each guest operating system is given a virtual disk carved out of the underlying physical disk. Because the layers of such a system are poorly coordinated, different disk scheduling algorithms may conflict with each other. This paper proposes an approach to improving disk scheduling performance in a virtualized environment through future request arrival prediction, and provides a comparative study of several disk scheduling performance improvement algorithms.

Index Terms— Disk scheduler, fairness, rotational latency, throughput, virtualization.

1 INTRODUCTION

A virtual machine monitor (VMM) allows multiple virtual platforms to share the same physical machine safely and fairly. The VMM provides separation among the virtual machines and manages their access to hardware resources [1]. The disk scheduler within the VMM plays a very important role in determining the overall fairness and performance characteristics of the virtualized system. Because the disk is a resource shared by different applications, some scheduling criterion is needed to use it properly. Disk scheduling can thus be defined as the method of selecting threads or processes from the request queue and scheduling the ready ones. Disk technology continues to advance, and virtualization has brought several new challenges to disk scheduling. Disk schedulers designed for a conventional operating system are based on the latency characteristics of the underlying disk technology. In a virtualized system, however, the disk latency characteristics differ from these traditionally assumed characteristics: the latency seen by a guest operating system depends on the underlying disk as well as on the additional queuing and processing that happen in the virtualization layer. Furthermore, providing each virtual machine with a virtual disk exposes the limitations of existing disk schedulers [2]. There is therefore a need to re-examine the disk scheduler for the virtual environment.

To date, comparatively little work has addressed disk scheduling in virtualized environments. This paper studies whether traditional disk scheduling algorithms still provide performance benefits in a layered system consisting of a virtualized operating system and an underlying virtual machine monitor. In addition, the performance delivered by a given scheduling algorithm depends on the workload characteristics. A system runs many applications, and a single application may issue multiple requests; the system takes a request from an application, sends it to the disk, and executes requests sequentially to produce responses, optimizing the performance of the various requests. The operating system can improve overall system performance by keeping the disk as busy as possible, and the scheduler in the virtualized environment plays a significant role in improving the overall fairness and performance characteristics of the system. There is therefore a need to implement recently developed disk scheduling algorithms in a virtualized system and to determine which outperforms the existing ones. In any operating system, the aim of disk scheduling is to increase throughput, to decrease seek time and rotational latency, and to keep the disk as busy as possible; in a virtualized environment disk scheduling is performed for the same reasons. The most important issue in disk scheduling is to balance two parameters: throughput and fairness. Throughput refers to the number of requests completed in some period. A totally fair

Ashwini Meshram is an M.Tech (CSE) project student at G. H. Raisoni College of Engineering, Nagpur, India. Urmila Shrawankar is with G. H. Raisoni College of Engineering, Nagpur, India.

2012 JCSE www.Journalcse.co.uk

system is one that ensures the mean response time of disk requests is the same for all processes. The best way to improve system performance is to modify the existing disk scheduler [3]. Providing system performance guarantees leads to performance isolation, meaning that the performance experienced by an application should not suffer due to variations in the workload from other applications [4]. Throughput and rotational latency are the key parameters of a disk scheduling technique. Traditional disk scheduling algorithms designed for an OS aim to improve the disk throughput and performance of the system. In this paper, existing disk scheduling algorithms designed for a traditional OS, such as the High Throughput Token Bucket Scheduler, are used inside the virtual operating system. These algorithms are non-work-conserving, i.e., they predict future request arrivals. This paper studies whether the concept of predicting future requests still improves the throughput and performance of the virtualized system.
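The two parameters discussed above can be made concrete with a small sketch (illustrative Python, not part of the original system): throughput counts completed requests per unit time, while fairness compares the mean response time each process experiences. The request-log format and function names here are assumptions for illustration only.

```python
# Illustrative metrics for the two scheduling parameters discussed above.
# The (process_id, arrival_time, completion_time) log format is assumed.

def throughput(completions, period):
    """Requests completed per second over `period` seconds."""
    return len(completions) / period

def mean_response_times(requests):
    """Per-process mean response time; a fair scheduler keeps these equal."""
    per_process = {}
    for pid, arrive, complete in requests:
        per_process.setdefault(pid, []).append(complete - arrive)
    return {pid: sum(ts) / len(ts) for pid, ts in per_process.items()}

log = [("A", 0.0, 2.0), ("A", 1.0, 4.0), ("B", 0.0, 3.0), ("B", 2.0, 5.0)]
print(throughput(log, period=5.0))   # 0.8 requests/second
print(mean_response_times(log))      # {'A': 2.5, 'B': 3.0}
```

A perfectly fair schedule would drive the per-process means toward a common value; a throughput-oriented schedule may let them diverge.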

2 DISK STRUCTURE AND ORGANIZATION

Data on a disk is addressed by cylinder, surface, and sector [5]. Sector 0 is the first sector of the first track on the outermost cylinder. The mapping proceeds in order through that track, then through the rest of the tracks in the cylinder, and then through the remaining cylinders from outermost to innermost.

2.1 Classification of Disks
Disks are divided into three types:
1) Fixed head disk, 2) Movable head disk, and 3) Optical disk.
Fixed head disks are expensive and offer less storage space than movable head disks, but they are faster. A movable head disk takes the form of a disk pack: a stack of platters mounted on a central spindle, with a slight space between them so that the read/write heads can move between the pairs of disks. Each platter has two recording surfaces, and each surface is formatted with a specific number of tracks on which data is recorded. The optical disk offers high-density storage and durability, which is why it has become a replacement for the magnetic disk. An optical disk operates like a magnetic disk: the head moves forward and backward, track to track.

2.2 Disk Read/Write Operation
To perform a read/write operation, the disk must first move the disk arm to the target track; the time this takes is called the seek time. The disk then waits for the desired sector to rotate under the head; this time is called the rotational latency. The sector is the smallest amount of data that can be read or written. The total disk access time is therefore the sum of the seek time and the rotational latency.

3 DISK SCHEDULING ALGORITHMS

The environment provided by the Linux operating system is well suited to comparing disk scheduling algorithms [6]. Disk scheduling algorithms are basically classified into two types: high-throughput schedulers and performance-aware schedulers. The goal of scheduling is to increase throughput and to improve system performance in terms of bursts, fairness, and rotational latency. Throughput is the number of disk requests completed in some period of time. Fairness means the mean response time of requests is the same for all processes. Rotational latency is the time the disk must wait for the desired data to rotate into position under the head. During a burst period the disk remains idle. High-throughput schedulers are also called non-work-conserving schedulers because they predict future request arrivals [7]. Performance-aware schedulers are work-conserving: they choose one of the pending requests from the queue to dispatch even if that request is far from the current disk head position. They do not predict future request arrivals, which leads to poor disk throughput. Work-conserving schedulers include YFQ (Yet Another Fair Queuing) [8, 9] and pClock [10, 11, 12]. For a non-work-conserving scheduler to benefit, the request that is soon to arrive must be closer to the disk head than the pending requests. Non-work-conserving schedulers include CFQ (Completely Fair Queuing) [13], BFQ (Budget Fair Queuing) [14], and HTBS (High Throughput Token Bucket Scheduler) [15, 16].

3.1 YFQ
YFQ is a fair queuing algorithm [8] based on a tag assignment policy: each request is assigned a start tag and a finish tag, and the scheduler serves one request at a time. It provides a good fairness guarantee and low delay, but its excessive rotational latency and seek overhead lead to poor disk throughput. The algorithm does not predict future request arrivals, which causes deceptive idleness: the scheduler assumes that the process under service has no more requests to issue, and is therefore forced to switch to a request from another process. To overcome deceptive idleness, an anticipatory scheduling framework has been applied to this algorithm [9]: the scheduler waits for additional requests to come from the same process that issued the last serviced request.

3.2 pClock
pClock is based on a tag assignment policy [10] and on fair queuing [11, 12]. Each request is assigned a start tag and an end tag, and an Arrival Upper Bound (AUB) function controls the tag assignment. The algorithm has little burst time, i.e., it keeps the disk as busy as possible, but because it is work-conserving it yields poor disk throughput.

3.3 CFQ
CFQ (Completely Fair Queuing) is a non-work-conserving algorithm, i.e., it predicts future request arrivals [7, 13]. It distributes disk time equally among the processes, provides a fairness guarantee, and achieves good aggregate throughput, but it causes high rotational latency.

3.4 BFQ
BFQ (Budget Fair Queuing) is also a fair queuing algorithm. Each application is assigned a budget expressed as a number of sectors to traverse. Disk access is granted to one application at a time, called the active application, and there is a request queue for each application. Each request of the active application is dispatched in turn, and the budget counter is decreased by the size of the request [14]. If the queue becomes empty, a timer is set and the disk waits for the possible arrival of the next request; this may leave the disk idle for a while but prevents the algorithm from switching to a new application. If the active application issues no further request before the timer expires, it is considered idle and the next budget is calculated; the next active application selected is one that fits in the calculated budget. The algorithm achieves high throughput but lacks per-application burst and delay configuration. Following is a stepwise description of the BFQ algorithm:
Step 1: Assign each application a budget in the form of sectors to traverse.
Step 2: Select the active application.
Step 3: Add requests to the per-application request queue.
Step 4: If the active application is waiting for the arrival of the next request,
Step 5: unset the timer.
Step 6: Dispatch the request to be served.
Step 7: Decrement the budget counter by the size of the request.
Step 8: De-queue the request.
Step 9: If the queue is empty,
Step 10: set the timer to the current time + Twait.
Step 11: If no requests are issued by the application before the timer expires,
Step 12: the application is declared idle and a new budget is assigned to it.

3.5 HTBS
HTBS is based on fair queuing [10, 14, 15]. There is a request queue per application, and the algorithm uses a tag assignment policy: each request is assigned a start tag and an end tag, and each application carries a tag representing its request with the biggest start tag. The algorithm continuously searches for the future request with the minimum finish tag [16]. Requests are dispatched sequentially from the active application, which is the one holding the request with the minimum finish tag. The algorithm has two main features: 1) it limits the maximum number of consecutive requests the same application can issue, which avoids starving requests from other applications; and 2) it limits the time the disk can be kept idle while waiting for a future request. Following is a stepwise description of the HTBS algorithm:
Step 1: Select the active application.
Step 2: Add requests to the per-application request queue.
Step 3: Dispatch the request from the active application.
Step 4: If the active application is waiting for future requests,
Step 5: prevent the timer from expiring and dispatch the request.
Step 6: Update the number of tokens of the given application.
Step 7: Compute the tags for the requests.
Step 8: If the active application is not waiting for a future request,
Step 9: queue up the request.
Step 10: Update the number of tokens of the given application.
Step 11: Compute the tags for the requests.
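The tag-based dispatch shared by YFQ, BFQ, and HTBS can be sketched as follows. This is an illustrative Python simulation under assumed names (Scheduler, submit, dispatch), not the authors' implementation; it shows only the fair-queuing core described above: the start tag is the maximum of the current virtual time and the finish tag of the application's previous request, the finish tag adds the request size scaled by the application's weight, and service always goes to the pending request with the minimum finish tag.

```python
import heapq
import itertools

_counter = itertools.count()  # tie-breaker so equal finish tags pop FIFO

class Scheduler:
    """Sketch of tag-based (fair-queuing) dispatch, not a full HTBS."""

    def __init__(self):
        self.virtual_time = 0.0
        self.pending = []       # min-heap ordered by finish tag
        self.last_finish = {}   # app_id -> finish tag of its previous request

    def submit(self, app_id, size, weight=1.0):
        # Start tag: max(virtual time, finish tag of the application's
        # previous request) -- the usual fair-queuing tag rule.
        start = max(self.virtual_time, self.last_finish.get(app_id, 0.0))
        finish = start + size / weight
        self.last_finish[app_id] = finish
        heapq.heappush(self.pending, (finish, next(_counter), app_id, size))

    def dispatch(self):
        # Serve the request with the minimum finish tag; its owner is
        # the "active application" in the terminology above.
        finish, _, app_id, size = heapq.heappop(self.pending)
        self.virtual_time = max(self.virtual_time, finish)
        return app_id, size

s = Scheduler()
s.submit("copy_image", 8)
s.submit("write_notepad", 2)
s.submit("read_doc", 4)
print([s.dispatch()[0] for _ in range(3)])
# served in finish-tag order: write_notepad, read_doc, copy_image
```

The token-bucket and idle-timer features of HTBS (Bmax, Twait) would sit on top of this core loop, limiting consecutive requests per application and bounding the idle wait.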

4 COMPARISON OF EXISTING DISK SCHEDULING ALGORITHMS

TABLE 1
COMPARISON OF EXISTING DISK SCHEDULING ALGORITHMS

Algorithm           | Prediction of future request | Tag assignment policy | Throughput | Rotational latency
YFQ [8, 9]          | No                           | Per request           | Low        | High
pClock [10, 11, 12] | No                           | Per request           | Poor       | Less
CFQ [7, 13]         | Yes                          | Per application       | Good       | High
BFQ [14]            | Yes                          | Per application       | High       | Less
HTBS [16]           | Yes                          | Per request           | High       | Less

6 PROPOSED METHODOLOGY OF DISK SCHEDULING IN VIRTUALIZED ENVIRONMENT

The guest operating system is provided with a virtual disk, created by selecting a portion of the original disk for the guest. Additional queuing and processing happen inside each virtual machine, and the disk service latency characteristics are found to differ significantly from the traditionally assumed characteristics. These two factors, the additional processing and the changed latency characteristics, are responsible for low system performance and reduced throughput. The way to obtain good system performance and high throughput in spite of these limitations is to use the best available scheduling algorithm. The traditional HTBS algorithm is therefore proposed for virtualized disk scheduling; its main feature is that it predicts future request arrival by continuously searching for the request with the smallest finish tag. The layered architecture of the virtual environment is shown in Fig. 1. The hypervisor, also known as the virtual machine manager, serves as an abstraction between the underlying hardware and the guest operating system. User space is the memory area in which all user-mode applications run; the guest operating system runs as one application of the host operating system in user space.

[Fig. 1 Virtualization environment: user space (guest OS) on top of the hypervisor on top of the hardware]
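The prediction idea can be illustrated with a small decision sketch: after serving a request, a non-work-conserving scheduler may keep the disk idle for a bounded time rather than seek to a distant pending request, if a nearby request from the just-served process is expected. The cost model below (a linear seek model with an assumed seek factor of 0.3, matching the seek formula used later in this paper) and all names are illustrative assumptions, not part of the proposed system.

```python
# Sketch of the non-work-conserving wait-vs-seek decision.
# Costs are in milliseconds under an assumed linear seek model.

def should_wait(head_pos, pending_blocks, expected_block,
                t_wait_ms, seek_factor=0.3):
    """True if waiting up to t_wait_ms for the predicted request is
    estimated to be cheaper than seeking to the nearest pending one."""
    if not pending_blocks:
        return True  # nothing else to serve, so waiting costs nothing extra
    nearest = min(pending_blocks, key=lambda b: abs(b - head_pos))
    cost_seek = seek_factor * abs(nearest - head_pos)
    cost_wait = t_wait_ms + seek_factor * abs(expected_block - head_pos)
    return cost_wait < cost_seek

# Head at block 100; the only pending request is far away at block 5000,
# but the active process is expected to issue a request near block 120.
print(should_wait(100, [5000], expected_block=120, t_wait_ms=10))  # True
```

When the expected request is no closer than the pending ones, the function returns False and the scheduler behaves like a work-conserving one, so the bounded wait never starves distant requests indefinitely.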

5 EXISTING DISK SCHEDULING ALGORITHMS FOR VIRTUALIZED ENVIRONMENT

The algorithms that have been used for scheduling in virtualized environments are the Noop scheduler, the Completely Fair Queuing scheduler, the Anticipatory scheduler [4, 17], and the Deadline scheduler. They have been tested against I/O workloads. The basic operations these disk schedulers perform are merging and sorting. Merging finds adjacent requests and combines them, thereby reducing the number of requests. Sorting arranges the pending requests in block order so as to minimize the distance the disk heads must traverse on the physical disk. In a virtualized environment there are many transitions between the guest operating system and the VMM, and these are frequently the source of much of the overhead [3]. Performed together, these two operations reduce the number of transitions between the guest and the VMM.

TABLE 2
COMPARISON OF SCHEDULING ALGORITHMS IN VIRTUALIZED ENVIRONMENT

Parameter          | Noop [6, 17] | CFQ [6, 17] | Anticipatory [6] | Deadline [6]
Throughput         | High         | Low         | Low              | High
Fairness           | Good         | High        | Low              | Good
Rotational latency | Less         | Less        | Less             | Less

Following are the steps to create the virtual environment.
Step 1 Host OS: The host OS is the original OS installed on the computer. In this project, Red Hat Linux is used as the host operating system.
Step 2 Hypervisor: A hypervisor is used to set up a new guest OS. Hypervisors for Linux include KVM (Kernel-based Virtual Machine); the hypervisor used in this project is Qemu-KVM 0.14.1.
Step 3 Guest OS: The hypervisor serves as an abstraction between the platform hardware and the guest operating systems. A wide variety of guest operating systems work with KVM; the guest OS used in this project is Ubuntu 12.10.
Step 4 Environment specifications: Each guest operating system is viewed as a single process of the host operating system running in user space. A guest OS requires a minimum of 6 GB of free disk space in addition to the space required by the guest itself, which depends on its image format. The space required for the guest is at least the sum of the space required by the guest's raw image files, the space required by the host operating system, and the swap space the guest OS will require, and can be expressed as
Total required space for guest = Images + Host + Swap.
Using swap space can provide additional memory beyond the available physical memory.
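The sizing rule above can be written as a small helper (illustrative Python; the figures in the example are hypothetical, while the 6 GB minimum free space is the requirement stated above).

```python
# Guest disk sizing per the rule: Total = Images + Host + Swap.

MIN_FREE_GB = 6  # minimum free space required in addition to the guest image

def required_space_gb(image_gb, host_gb, swap_gb):
    """Total disk space a guest needs: raw image files + host OS + swap."""
    return image_gb + host_gb + swap_gb

total = required_space_gb(image_gb=10, host_gb=12, swap_gb=2)
print(total)                # 24
print(total + MIN_FREE_GB)  # 30 GB once the mandated free space is added
```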

6.1 APPLYING THE ALGORITHM IN THE VIRTUAL ENVIRONMENT

The stepwise description of the HTBS algorithm as it will be used in the virtual environment is as follows.
Step 1: Take three applications as input: a) write in a notepad, b) copy an image of size 1 MB, and c) read a Word document of size 50 KB. Set the maximum time the disk should wait for the next request, Twait = 10 milliseconds, and the maximum number of requests the queue can hold, Bmax = 5.
Step 2: There is a request queue per application; add each request to its application's queue.
Step 3: Assign tags to the requests, i.e., a start tag and an end tag.
Step 4: Of the three applications listed above, select as active the application holding the request with the minimum finish tag.
Step 5: Dispatch the request with the minimum finish tag from the active application.
Step 6: If the disk is waiting for a future request and the request belongs to the active application,
Step 7: dispatch the request and prevent the timer from expiring.
Step 8: If the disk is not waiting for a future request,
Step 9: queue up the request.
Step 10: Update the token counts and compute tags for the new request.

The disk must wait a specific time period for the next request to come from the active application; this period is known as Twait, and it limits the time the disk can be kept idle. The active application can issue only a limited number of consecutive requests, controlled by the factor Bmax. The greater the number of requests with locality, the greater the performance of the system. The seek time, the time required to move the disk head from its current position to the target track, is given by
Seek_time = seek_factor * abs(Block_access - current_head_position) [18]
where Block_access is the block location on the disk. The seek factor is assumed to be 0.3.

7 CONCLUSION
1. Among the scheduling algorithms studied above, the HTBS algorithm is non-work-conserving, as it predicts future request arrivals.
2. It has been implemented for traditional disk scheduling and observed to work well: it increases the overall throughput of the system while providing quality of service in terms of rotational latency and burst time.
3. The basic idea is therefore to check whether this concept of future request arrival prediction can still increase throughput and system performance in the virtual environment.
4. This work will check the feasibility of existing disk scheduling algorithms in a virtualized environment and evaluate the throughput and performance improvement achievable in the underlying virtual environment through future request arrival prediction.

REFERENCES
[1] D. Ongaro, A. L. Cox, and S. Rixner, "Scheduling I/O in Virtual Machine Monitors," in Proceedings of the Fourth ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, Seattle, Washington, USA, 2008, pp. 1-10.
[2] Y. Zhang and B. Bhargava, "Self-Learning Disk Scheduling," IEEE Transactions on Knowledge and Data Engineering, vol. 21, IEEE Computer Society, Jan. 2009.
[3] J. Ke, X. Zhu, W. Na, and L. Xu, "AVSS: An Adaptable Virtual Storage System," in IEEE/ACM International Symposium on Cluster Computing and the Grid, IEEE Computer Society, Washington, DC, USA, 2009, pp. 292-299.
[4] S. Seelam and P. Teller, "Fairness and Performance Isolation: An Analysis of Disk Scheduling Algorithms," in IEEE International Conference on Cluster Computing, IEEE, Barcelona, Spain, 2006, pp. 1-10.
[5] R. Schlesinger and J. Garrido, Principles of Modern Operating Systems, 1st ed., Jones and Bartlett Publishers, Inc., USA, 2007.
[6] D. Boutcher and A. Chandra, "Does Virtualization Make Disk Scheduling Passé?" ACM SIGOPS Operating Systems Review, vol. 44, New York, USA, Jan. 2010, pp. 20-24.
[7] Y. Xu and S. Jiang, "A Scheduling Framework That Makes Any Disk Schedulers Non-Work-Conserving Solely Based on Request Characteristics," in Proceedings of the 9th USENIX Conference on File and Storage Technologies, USENIX Association, Berkeley, CA, USA, 2011, pp. 119-132.
[8] J. Bruno, J. Brustoloni, E. Gabber, B. Ozden, and A. Silberschatz, "Disk Scheduling with Quality of Service Guarantees," in Proceedings of the IEEE International Conference on Multimedia Computing and Systems, IEEE Computer Society, Florence, Italy, 1999, pp. 400-405.
[9] S. Iyer and P. Druschel, "Anticipatory Scheduling: A Disk Scheduling Framework to Overcome Deceptive Idleness in Synchronous I/O," in 18th ACM Symposium on Operating Systems Principles, New York, USA, 2001, pp. 117-130.
[10] A. Gulati, A. Merchant, and P. Varman, "pClock: An Arrival Curve Based Approach for QoS in Shared Storage Systems," in Proceedings of ACM SIGMETRICS, ACM, New York, USA, 2007, pp. 13-24.
[11] J. C. R. Bennett and H. Zhang, "Hierarchical Packet Fair Queueing Algorithms," IEEE/ACM Transactions on Networking, vol. 5, IEEE Press, New York, USA, 1997, pp. 675-689.
[12] P. Goyal, H. Vin, and H. Cheng, "Start-Time Fair Queuing: A Scheduling Algorithm for Integrated Services Packet Switching Networks," IEEE/ACM Transactions on Networking, New York, USA, 1997, pp. 690-704.
[13] J. Axboe, "Linux Block I/O - Present and Future," in Proceedings of the Ottawa Linux Symposium, 2004, pp. 51-61.
[14] P. Valente and F. Checconi, "High Throughput Disk Scheduling with Fair Bandwidth Distribution," IEEE Transactions on Computers, vol. 59, IEEE Computer Society, 2010, pp. 1172-1186.
[15] J. Brustoloni, E. Gabber, B. Ozden, and A. Silberschatz, "Disk Scheduling with Quality of Service Guarantees," in Proceedings of the IEEE International Conference on Multimedia Computing and Systems, IEEE Computer Society, Florence, Italy, 1999, pp. 400-405.
[16] P. E. Rocha and L. C. E. Bona, "A QoS Aware Non-Work-Conserving Disk Scheduler," in 28th Symposium on Mass Storage Systems and Technologies (MSST), IEEE, San Diego, CA, 2012, pp. 1-5.
[17] M. Kesavan, A. Gavrilovska, and K. Schwan, "On Disk I/O Scheduling in Virtual Machines," in Proceedings of the Second Workshop on I/O Virtualization (WIOV '10), Pittsburgh, PA, USA, March 13, 2010.
[18] S. Y. Mamdani, M. S. Ali, and S. M. Mundada, "Mathematical Model for Real Time Disk Scheduling Problem," International Journal of Computer Applications (IJCA), USA, 2012, pp. 21-24.
