
The Design and Implementation of a Log-Structured File System

M. Rosenblum and J. Ousterhout

Introduction
- CPU speed has increased dramatically, and memory sizes have grown with it
- Most reads now hit in the cache
- Disks have improved mainly in capacity; access is still very slow due to seek and rotational latency
- Writes must eventually go to disk

As a result:
- Writes dominate disk traffic
- Applications become disk-bound

Overview of LFS
Unix FFS:
- Random writes
- Restoring consistency after a crash requires scanning the entire disk, which is very slow

LFS:
- Writes new data to disk sequentially, eliminating most seeks
- Faster crash recovery: the most recent log entries are always at the end of the log

Traditional Unix FFS


- Spreads information around the disk
- Lays out each file sequentially, but physically separates different files; inodes are stored apart from file contents
- Creating a new file takes at least 5 I/Os, each preceded by a seek
- Causes too many small accesses; only about 5% of the disk bandwidth is used
- Most of the time is spent seeking

Sprite LFS
- Inodes are not at fixed positions; they are written to the log
- An inode map maintains the current location of each inode
- The inode map is divided into blocks that are stored in the log
- The map is usually cached in memory, so lookups rarely need disk access
- A fixed checkpoint region on each disk stores the locations of all inode map blocks
- Creating a file requires only a single write of all its information to disk, plus an inode map update
- All of that information goes into a single contiguous segment (see the sketch below)
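A minimal sketch of the inode-map idea, using a toy in-memory structure (the names `InodeMap`, `location`, and `version` are illustrative assumptions, not Sprite's code):

```python
# Toy LFS-style inode map (illustrative; not the Sprite implementation).
# Inodes live in the log; the map records each inode's current log address.

class InodeMap:
    def __init__(self):
        self.location = {}   # inode number -> current address in the log
        self.version = {}    # inode number -> version, bumped when the file is deleted

    def update(self, inum, log_addr):
        # Called whenever an inode is rewritten at the end of the log.
        self.location[inum] = log_addr

    def lookup(self, inum):
        # Normally served from the in-memory cache; disk access is rare.
        return self.location[inum]
```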

Compare FFS/LFS
Task                                     FFS               LFS
Allocate disk address                    Block creation    Segment write
Allocate i-node                          Fixed location    Appended to log
Map i-node numbers into disk addresses   Static address    Lookup in i-node map
Maintain free space                      Bitmap            Cleaner + segment usage table

Space Management
- Goal: keep large free extents for writing new data
- The disk is divided into segments (512 kB or 1 MB)
- The Sprite segment cleaner combines two approaches:
  - Threading between segments
  - Copying within segments

Threading
- Leaves live data in place and threads the log through the free extents

Cons:
- Free space becomes heavily fragmented
- Large contiguous writes are no longer possible
- LFS would be no faster than a traditional file system

Copying and Compacting

- Copies live data out of the log
- Compacts the data when it is written back
- Con: copying is costly

Segment Cleaning Mechanism


- Read a number of segments into memory
- Check which data in them is still live
- Write the live data back to a smaller number of clean segments
- Mark the segments that were read as clean (see the sketch below)
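A schematic version of this loop, with toy structures (`Segment`, the `is_live` callback) that are assumptions for illustration, not the Sprite implementation:

```python
# Schematic segment-cleaning loop (a sketch with toy structures).

SEGMENT_CAPACITY = 8  # blocks per segment; tiny value, just for illustration

class Segment:
    def __init__(self):
        self.blocks = []

    def is_full(self):
        return len(self.blocks) >= SEGMENT_CAPACITY

def clean(dirty_segments, is_live):
    """Read segments, keep the live blocks, compact them into fewer segments."""
    # 1. Read the segments and collect their live data.
    live = [b for seg in dirty_segments for b in seg.blocks if is_live(b)]

    # 2. Write the live data back, compacted into new clean segments.
    out, written = Segment(), []
    for block in live:
        if out.is_full():
            written.append(out)
            out = Segment()
        out.blocks.append(block)
    if out.blocks:
        written.append(out)

    # 3. The segments that were read can now be marked clean and reused.
    return written
```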

Segment summary block


- Identifies each piece of information in the segment
- Version number + inode number = UID
- The version number in the inode map is incremented when a file is deleted
- If a block's UID does not match the UID in the inode map when scanned, the block is discarded (see the sketch below)
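A sketch of the liveness check described above, tying into the toy `InodeMap` from earlier (the `inum` and `version` block attributes are assumed):

```python
# Liveness check via UIDs (illustrative). A block's UID is the pair
# (inode number, version recorded when the block was written).

def is_live(block, inode_map):
    # When a file is deleted, its version in the inode map is incremented,
    # so every block written under the old version fails this comparison.
    # Bind inode_map (e.g. with functools.partial) to use this as the
    # one-argument callback in the cleaning loop above.
    return inode_map.version.get(block.inum) == block.version
```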

Cleaning Policies
- Sprite starts cleaning segments when the number of clean segments drops below a threshold
- It uses the "write cost" to compare cleaning policies
- The write cost is the average amount of time the disk is busy per byte of new data written:

\[
\text{write cost} = \frac{\text{total bytes read and written}}{\text{new data written}}
                  = \frac{N + N u + N(1-u)}{N(1-u)} = \frac{2}{1-u}
\]

  (N segments are read at utilization u; the N·u bytes of live data are rewritten, and N·(1−u) bytes of new data fill the space freed)

Disk space utilization versus performance

- u < 0.8 gives better performance than the current Unix FFS
- u < 0.5 gives better performance than an improved Unix FFS (see the worked check below)
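A small worked check of the formula in plain arithmetic; the special case u = 0 follows the paper's observation that an empty segment need not be read at all:

```python
# Worked check of write cost = 2 / (1 - u).

def write_cost(u):
    """Average disk time per byte of new data at segment utilization u."""
    if u == 0:
        return 1.0          # an empty segment need not be read or copied
    return 2.0 / (1.0 - u)

for u in (0.0, 0.5, 0.8, 0.9):
    print(f"u = {u:.1f} -> write cost = {write_cost(u):.1f}")
# u = 0.5 -> 4.0 and u = 0.8 -> 10.0 bracket the FFS comparisons above
```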

Simulate more real situation


- Files are accessed with a random access pattern:
  - Uniform: all files are equally likely to be chosen
  - Hot-and-cold: 10% of the files are hot and are selected 90% of the time; the other 90% are cold and are selected 10% of the time (see the sketch below)
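A sketch of the hot-and-cold file chooser; the function name and parameters are illustrative assumptions matching the percentages above:

```python
# Hot-and-cold access pattern: 10% of files receive 90% of the accesses.
import random

def choose_file(num_files, hot_fraction=0.10, hot_access_prob=0.90):
    hot_count = max(1, int(num_files * hot_fraction))
    if random.random() < hot_access_prob:
        return random.randrange(hot_count)                       # hot file
    return hot_count + random.randrange(num_files - hot_count)   # cold file
```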

Cleaner use Greedy Policy


- Chooses the least-utilized segment to clean (see the sketch below)

Conclusion: hot and cold data should be treated differently
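The greedy choice in one line (a sketch; `utilization` is an assumed per-segment attribute):

```python
# Greedy policy: always clean the segment with the least live data.

def greedy_choice(segments):
    return min(segments, key=lambda seg: seg.utilization)
```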

Cost Benefit Policy


- Cold data is more stable, so the free space it yields is likely to last longer
- Assume cold data = older data (age)
- Clean the segment with the highest benefit-to-cost ratio:

\[
\frac{\text{benefit}}{\text{cost}} = \frac{\text{free space generated} \times \text{age of data}}{\text{cost}} = \frac{(1-u) \cdot \text{age}}{1+u}
\]

- Group live data by age before rewriting it (see the sketch below)
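The selection rule as a sketch (the `utilization` and `age` segment attributes are assumptions for illustration):

```python
# Cost-benefit policy: clean the segment maximizing (1 - u) * age / (1 + u).

def cost_benefit_choice(segments):
    def ratio(seg):
        u = seg.utilization
        return (1.0 - u) * seg.age / (1.0 + u)
    return max(segments, key=ratio)
```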

Cost Benefit Result

- Left: a bimodal distribution is achieved; cold segments are cleaned at u = 75%, hot segments at u = 15%
- Right: cost-benefit performs better than greedy, especially at utilizations above 60%

Crash Recovery
Traditional Unix FFS:
- Must scan all metadata
- Very costly, especially for large storage

Sprite LFS
- The last operations are located at the end of the log
- Fast access makes recovery quicker
- Uses checkpoints and roll-forward
- Roll-forward had not been integrated into Sprite when the paper was written, so it is not the focus here

Micro-benchmarks (small files)

Fig (a)

Fig (b)

- Shows the performance of creating, reading, and deleting a large number of small files
- LFS is 10 times faster than SunOS for create and delete
- LFS kept the disk only 17% busy, while SunOS kept it 85% busy
- Predicts LFS will improve by another factor of 4-6 as CPUs get faster; no such improvement can be expected in SunOS

Micro-benchmarks (large files)


- A 100 Mbyte file is written (sequentially, then randomly), then read back sequentially
- LFS achieves higher write bandwidth
- Read bandwidth is the same in both file systems
- For reads that require seeks (rereading a randomly written file), LFS performance is lower than SunOS
  - SunOS pays an additional cost at write time to organize the disk layout
  - LFS groups information created at the same time, which is not optimal for reading a randomly written file

Real Usage Statistics

- The previous results do not include cleaning overhead; this table gives a better prediction
- These statistics cover 4 months of real usage, including cleaning overhead
- The write cost ranged from 1.2 to 1.6
- More than half of the cleaned segments were empty
- Cleaning overhead limits write performance to about 70% of the bandwidth available for sequential writing
- In practice, it is possible to perform cleaning at night or during idle periods

Thank You =)
~The end~
