Relocation
The OS sits highest in memory, so a process starts at address zero with a maximum address of
(memory size - OS size).
Base address (the first, aka smallest, physical address), aka the relocation address
Limit address (the largest physical address the process can access), aka the bound
When a process is loaded it is placed in a segment of contiguous memory; if it doesn't fit, the OS
waits for a process to terminate.
Dynamic Relocation: OS can move processes while they run
Static Relocation: OS adjusts the addresses in a process when it is loaded
Privileged bound and base registers are added to the hardware (context
switches must save the bound and base registers in the PCB and restore
them to the CPU). The hardware checks the address against the bound and
adds the base in parallel.
Uniprogramming:
Advantages
Disadvantages
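The base-and-bound translation described above can be sketched in a few lines (the register values here are made-up examples; real hardware does the bound check and the base add in parallel):

```python
# Sketch of dynamic relocation with base and bound registers.
BASE = 0x4000   # relocation (base) register: first physical address of the process
BOUND = 0x1000  # bound (limit) register: size of the process's segment

def translate(virtual_addr):
    """Translate a virtual address to a physical one, or fault on violation."""
    if virtual_addr >= BOUND:      # check against the bound register
        raise MemoryError("protection fault: address out of bounds")
    return BASE + virtual_addr     # add the base register

print(hex(translate(0x0FFF)))  # highest legal address -> 0x4fff
```

Because every address is checked and offset by privileged registers, moving the process only requires updating BASE, which is what makes relocation dynamic.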
MEMORY MANAGEMENT
The OS tracks which memory is available and which is in use.
Given a memory request from a starting process, the OS must
choose which gap to use for the process.
External Fragmentation: unused memory between units of allocation, gaps between processes
Internal Fragmentation: unused memory within a unit of allocation
Simple to implement
Slow allocation
External fragmentation
Relatively simple
External fragmentation
Requirements:
Allocation is fast
External fragmentation
Overlays
Virtual Memory
The process' view of memory. It can be much larger than physical memory; the process is
no longer limited by the size of the machine. Only portions of the virtual address space
are in physical memory.
Paging
Divide a process' virtual address space into fixed-size pages. The OS stores a copy of that
address space on disk.
A shared page may exist in different parts of the virtual address space of each
process, but the virtual addresses map to the same physical address.
Frames
Pages
Page Faults
Virtual address: n bits = page number (n - a bits) + offset (a bits)
Page size: 2^a
Number of Pages: 2^(n-a)
Physical address: m bits = frame number (m - a bits) + offset (a bits)
Frame size: 2^a (same as the page size)
Number of Frames: 2^(m-a)
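The address arithmetic above can be checked concretely (the bit widths here are example assumptions: a 32-bit virtual address, a 30-bit physical address, and a 12-bit offset):

```python
# Paging arithmetic for an n-bit virtual address and m-bit physical address
# with a low-order offset of a bits.
n, m, a = 32, 30, 12

page_size  = 2 ** a        # bytes per page (= frame size)
num_pages  = 2 ** (n - a)  # pages in the virtual address space
num_frames = 2 ** (m - a)  # frames in physical memory
print(page_size, num_pages, num_frames)  # 4096 1048576 262144

# Splitting a virtual address into (page number, offset):
va = 0x12345678
page_number = va >> a
offset      = va & (page_size - 1)
print(hex(page_number), hex(offset))  # 0x12345 0x678
```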
Caches
SRAM: L1, L2, and L3 caches between the
CPU and main memory. Cache
misses are served from DRAM
DRAM: the VM system's cache, caches virtual
pages in main memory. Slower than
SRAM, but faster than disk. Expensive
misses because misses are served
from disk.
Fully Associative Cache: each cache access
examines all blocks of cache to see if
any of them contains the location being
accessed. Not feasible for realistically
large caches
Set Associative Cache: Restrict any given
memory location to only a small set of
positions within the cache
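A minimal sketch of the set-associative restriction: each block address maps to exactly one set, and only that set's few ways are searched (the sizes and FIFO eviction here are simplifying assumptions, not a model of any particular CPU):

```python
# Minimal set-associative lookup: a block is restricted to one set,
# and only that set's ways are examined.
NUM_SETS = 4
WAYS = 2  # 2-way set associative

cache = [[] for _ in range(NUM_SETS)]  # cache[set_index] holds that set's tags

def access(block_addr):
    """Return True on hit; on miss, insert (evicting the oldest way in the set)."""
    set_index = block_addr % NUM_SETS
    tag = block_addr // NUM_SETS
    ways = cache[set_index]
    if tag in ways:            # examine only this set's ways, not the whole cache
        return True
    if len(ways) == WAYS:      # set full: evict (FIFO here for simplicity)
        ways.pop(0)
    ways.append(tag)
    return False

print(access(8), access(8))  # False True  (miss, then hit in set 0)
```

A fully associative cache would search every set; restricting the search to one small set is what makes large caches feasible.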
Segmentation Fault
Mapping invalid because the address is unallocated to the process.
Temporal Locality
If a process accesses an item in memory, it will tend to reference the same item again soon.
Spatial Locality
If a process accesses an item in memory, it will tend to reference an adjacent item soon.
Multilevel Paging
Add additional levels of indirection to the page table by subdividing the page number into
k parts, creating a k-depth tree of page tables. Each entry in the first-level page table
points to a second-level page table, each entry in the second-level page table points to a
third-level page table, and so on. The kth-level page table entries actually hold the
bit/frame number information that is usually in a page table.
Saves space because you only allocate the page-table levels that you need; no allocation
is necessary for untouched regions. There will always only ever be one first-level page table.
Inverted Page Table
Make page tables proportional to the size of the physical address space: one entry
for each physical page. Has less overhead because no per-process data
structure is required. Works well with larger virtual address spaces.
Each entry contains
Residence bit
Protection bits
A hash function uses the PID and the page number. The result of the hash function
is added to the PTBR to get the address of the right page table entry.
At that address the PID and the page number are checked
against the ones stored in the page table.
If they match, the right frame has been found and the frame number (the
index into the table) is sent along.
If they don't match, the frame isn't holding the information for that
page and it needs to be swapped in.
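The multilevel scheme above can be sketched for k = 2 (the 10/10/12 bit split is an example assumption; second-level tables are left as None until a page in their region is mapped, which is where the space saving comes from):

```python
# Two-level page table walk sketch.
OFFSET_BITS = 12
LEVEL_BITS = 10  # 10 + 10 + 12 = 32-bit virtual address

first_level = [None] * (2 ** LEVEL_BITS)  # the single first-level table

def map_page(vpn, frame):
    i, j = vpn >> LEVEL_BITS, vpn & (2 ** LEVEL_BITS - 1)
    if first_level[i] is None:                  # allocate a second level on demand
        first_level[i] = [None] * (2 ** LEVEL_BITS)
    first_level[i][j] = frame

def translate(va):
    vpn, offset = va >> OFFSET_BITS, va & (2 ** OFFSET_BITS - 1)
    i, j = vpn >> LEVEL_BITS, vpn & (2 ** LEVEL_BITS - 1)
    if first_level[i] is None or first_level[i][j] is None:
        raise KeyError("page fault")
    return (first_level[i][j] << OFFSET_BITS) | offset

map_page(0x12345, 7)
print(hex(translate(0x12345ABC)))  # 0x7abc
```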
Memory Mapping
Area: contiguous chunk of allocated virtual memory whose pages are related in some
way (examples: heap, user stack, data segment, code segment), allows the virtual
address space to have gaps, AKA segments
Memory mapping: associating the contents of a virtual memory area with an object on
disk. There are two types of object:
Regular file: a section of an ordinary file on disk backs the area's pages
Anonymous file: contains all zeros, created by the kernel, no data transfer between
disk and memory. AKA demand-zero pages
Fetch Policy: decides when a page is assigned a frame.
When a process starts: OS could immediately assign
frames and create mappings for all the pages and
load all of the pages.
Demand Paging: OS creates the mapping for each
page in response to a page fault when accessing that
page. Never wastes time creating a page mapping
that goes unused, but it has to pay the full cost of a
page fault for each page. Linux uses demand paging for
zero pages because of their low page fault cost.
GOOD WHEN: the process has limited spatial
locality and when page fault cost is low.
1. A process arrives
2. The OS stores the process' virtual address
space on disk and does not put any of it into
memory.
3. As the process executes, pages are faulted in.
Prepaging: OS anticipates future use of pages and
pre-loads them
1. A process needing k pages arrives
2. If k frames are free, the OS allocates all k pages
to the free frames. If there aren't enough free
frames, the OS must replace some frames.
3. The OS puts the first page in a frame and
updates the page table (put the frame number
in the first entry), and so on.
4. The OS marks all TLB entries as invalid
(flushes the TLB).
5. OS starts a process.
6. As process executes, OS loads TLB entries as
each page is accessed, replacing an existing
entry if the TLB is full.
Clustered Paging: each page fault causes a
cluster of neighboring pages to be fetched,
including the one causing the fault. In Linux, the
cluster of pages is put into the page cache in
RAM so that subsequent page faults can quickly
find pages there. Changes major page faults
(reading disk) to minor page faults (just updating
the page table). AKA readaround
Overlays: programmer indicates when to load and
remove pages from frames.
Freeing pages when you need them:
When a page is referenced that is not in memory and memory is full:
1. The OS selects a page to replace (using a page replacement algorithm)
2. It invalidates the old page in the page table
3. It starts loading the new page into physical memory from disk
4. Context switch to another process while I/O is being done
5. Get interrupt when page is finished being loaded into memory
6. Updates the page table entry
7. Continues the faulting process
Low Water Mark: a minimum number of free frames; if the number of free frames
drops below the low-water mark, the replacement policy starts freeing
pages until the high-water mark is passed.
Advantages: last-minute freeing in response to a page fault will delay the
faulting process; evicting dirty pages (pages that have been written to) requires
writing out to disk first, and evicting early allows us to write back several pages in a
single operation (maximizing efficiency).
Thrashing
Memory is overcommitted and pages are tossed out
while they are still in use. Pages are swapped in and
out continuously. Can happen when the working set
size exceeds the size of physical memory (either the
sum of all working sets or a single process'
working set), or when T is too small.
Load Control
The number of processes that can reside in memory at
one time. Reduces thrashing. Adds a new stage to the
process life cycle: suspended. When the total number
of pages needed is greater than the number of frames
available, processes are swapped out to disk. A process
in any state can be suspended, and a suspended
process can only go to ready. Modern OSes don't use a
load control system; they use best-effort systems.
Paging Advantages
No external fragmentation; easy to share pages amongst processes
Paging Disadvantages
Requires hardware support (TLB) to be fast
Belady's Anomaly
For some replacement algorithms (e.g., FIFO), adding more frames can
increase the number of page faults rather than
decreasing them as one would expect. Optimal and LRU are
not subject to it.
FIFO
Throw out the oldest page. Easy to implement, but the OS can throw out frequently
accessed pages.
Optimal
Look into the future and throw out the page that will be accessed farthest in the future.
Impossible in real life.
LRU
Approximation of optimal: uses the past to predict the future, throwing out the page that
has not been used in the longest time.
Option One: keep a time stamp for each
page representing the last access. The OS must
record a time stamp for each memory access
and search all pages to find one to toss.
Slow! Lots of things to update often.
Clock
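A small FIFO simulation makes Belady's anomaly concrete: with the classic reference string below, going from 3 to 4 frames increases the fault count.

```python
# FIFO replacement, used to demonstrate Belady's anomaly.
def fifo_faults(refs, num_frames):
    frames, faults = [], 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == num_frames:
                frames.pop(0)      # evict the oldest page
            frames.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))  # 9
print(fifo_faults(refs, 4))  # 10 (more frames, more faults)
```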
User Role
Returns pointer
If you free an object too late, you waste space and slow the system
If you never free: best case you waste space, worst case you run out of
memory
Organizing Free Lists
LIFO order + first fit: the allocator inspects the most recently used blocks
first; freeing a block is constant time
Address order + first fit: the address of each block in the list is less than
the address of its successor. Better memory utilization than LIFO first
fit, but has a linear-time search to locate the predecessor of the freed block
Buddy System: a segregated fits free list where the bins are powers of
2, requested block sizes are rounded up to the nearest power of 2. Has
fast searching and coalescing, can cause significant internal
fragmentation.
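The buddy system's power-of-two rounding, and the internal fragmentation it causes, can be sketched as:

```python
# Buddy-system size classes: round each request up to the nearest power
# of two; the gap between request and class is internal fragmentation.
def buddy_class(size):
    """Smallest power of two >= size (size >= 1)."""
    block = 1
    while block < size:
        block <<= 1
    return block

for req in (30, 64, 65):
    blk = buddy_class(req)
    print(req, blk, blk - req)  # request, block size, internal fragmentation
```

A 65-byte request lands in the 128-byte class, wasting 63 bytes, which is why the internal fragmentation can be significant.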
Clock Second Chance
It's cheaper to replace a page that hasn't been written, since it doesn't need
to be written back to disk. Modify the clock algorithm to allow dirty pages
to always survive one sweep of the clock hand, avoiding extra expensive disk writes.
Check both the reference bit and the modify bit to determine which page to
replace, as the pair (reference bit, modify bit).
On a page fault, the OS searches for a page in the lowest non-empty class:
(0,1) not recently used, but modified: the OS needs to write it out, but it may
not be needed anymore
(1,0) recently used, not modified: may be needed again soon, but
doesn't need to be written
For pages with the reference bit set, the reference bit is cleared
On the second pass (no (0,0) page found on first pass), pages that were
(0,1) or (1,0) may have changed.
Implementation Two: OS goes around at most three times searching for the
(0,0) class.
If the OS finds (0,1) clear the dirty bit and move on. Remember (possibly in
a list of pages) that the page is dirty, write only if the page is evicted.
For pages with the reference bit set, the reference bit is cleared
On the second pass (no (0,0) page found on first pass), pages that were
(0,1) or (1,0) may have changed.
Algorithm
When a page is referenced its reference bit is set, if the page was written
to (modified) the dirty bit is set.
During second sweep the dirty bit is cleared (OS keeps track of what
pages are really dirty)
Only replace pages that are (0,0), so if a page is dirty it has to wait two full
sweeps to be replaced.
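The basic clock sweep over reference bits can be sketched as follows (this shows only the reference bit; the second-chance variant above adds the modify bit to the same loop, and a real implementation would install the new page in the victim frame):

```python
# Clock sweep: advance the hand, clearing reference bits, until a page
# with a clear bit is found.
frames = [["A", 1], ["B", 0], ["C", 1]]  # [page, reference bit]
hand = 0

def evict():
    """Return the next victim page chosen by the clock hand."""
    global hand
    while True:
        page, ref = frames[hand]
        if ref:                            # recently used: give a second chance
            frames[hand][1] = 0
            hand = (hand + 1) % len(frames)
        else:                              # ref bit clear: this is the victim
            hand = (hand + 1) % len(frames)
            return page

print(evict())  # B (A's bit is cleared and skipped; B's bit was already 0)
```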
Bump Pointer
Free List
Divides memory into blocks, and maintains a list of free blocks (stored on
the heap). To allocate memory, find a block in the free list using (first fit, best
fit, worst fit, etc). To deallocate memory put memory back on the free list.
Free Block
Binning: Exact Fit Have bin for each block size, except for last bin
which holds all larger free blocks (which can be broken up later!). Faster
allocation, but takes up more space. AKA simple segregated storage
Binning: Range Have a bin for a range of block sizes (requires fewer
bins, but you have to search for a good-sized block within the bin), except
for the final bin, which holds larger free blocks. AKA segregated fits.
Non-copying Reclamation
Uses free-list allocation and reclamation, only way for explicit memory
management
Mark Sweep: free list + trace + sweep-to-free
1. Get a pointer to free space from the free list (that uses binning)
2. If there is no memory of the right size then a collection is triggered
3. Mark Phase: Transitive closure marking all the reachable objects
4. Sweep Phase: sweep the memory for unreachable objects
populating the free list (put unreachable objects back in the free
list). Can be made incremental by organizing heap in blocks and
sweeping one block at a time on demand.
Copying Reclamation
(Generally) uses bump pointer allocation
En masse reclamation
Mark-Compact: bump allocation + trace + compact
1. Use the bump pointer to allocate
2. Mark Phase: Transitive closure marking all the reachable objects
3. Compact Phase: copy all the remaining (reachable) objects to one
end of the heap; this is why we can use the bump pointer
Evaluating schemes: space efficiency, efficiency of the allocator (time to
allocate), and whether collection can run incrementally.
Identifying Garbage
Reference counting:
Count the number of references to each object, if the reference
number is 0 the object is garbage
Doesn't work for circularly linked structures with no external
pointers (cycles)
Tracing:
Trace reachability from program roots (registers, stack, stack
variables) and mark reachable objects
Objects not traced are unreachable.
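Tracing can be sketched with the heap as a simple adjacency dict (the objects and pointers here are made up); note that it collects the cycle that reference counting would miss:

```python
# Tracing: mark everything transitively reachable from the roots;
# unmarked objects are garbage.
heap = {"a": ["b"], "b": ["c"], "c": [], "d": ["e"], "e": ["d"]}  # d<->e is a cycle
roots = ["a"]

def trace(roots):
    marked = set()
    work = list(roots)              # registers/stack form the root set
    while work:
        obj = work.pop()
        if obj not in marked:
            marked.add(obj)
            work.extend(heap[obj])  # follow outgoing pointers
    return marked

reachable = trace(roots)
garbage = set(heap) - reachable
print(sorted(garbage))  # ['d', 'e']: the unreachable cycle is collected
```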
Heap Organization & Incremental Collection
It takes too long to trace the whole heap at once (long pause
times), and why collect long-lived objects repeatedly? Incremental
Collection: divide the heap into increments and collect one at a time.
Generational Hypothesis: young objects die more quickly than
older ones, older objects are more likely to survive garbage
collection.
Generational Heap Organization: Divide the heap into a young
space and an old space.
Allocate into the young space. When young space fills up,
garbage collect it and copy into the old space (emptying the
young space). Allocate into young space again.
When the old space fills up, collect both the young and old spaces
and move still-live objects to the new space.
Device Hardware
The hardware associated with an I/O device consists of four
pieces:
1. Bus: allows device to communicate with the CPU, typically
shared by multiple devices
2. Device Port: typically consists of four registers
1. Status Register: holds whether the device is busy, data is ready,
or an error occurred.
2. Control Register: the command to perform, what we want the device
to do
3. Data-in Register: data being sent from the device to the
CPU
4. Data-out Register: data being sent from the CPU to the
device
3. Controller: receives commands from the system bus, translates
commands into device actions, and reads data from and writes
data to the system bus
4. Device: the device itself
I/O scheduling
Interrupts
Rather than have CPU continually check the device, the device can
interrupt the CPU when it completes an I/O operation. On an I/O
interrupt:
1. Determine which device caused the interrupt
2. If the last command was an input operation, retrieve the data from
the device register
3. Start the next operation for that device
Disks
Disks vs. Memory
Disks: last forever(ish), cheap
Memory: volatile
Seek Time
Time to position head over track/cylinder, depends on how fast the
hardware can move the arm.
To get the fastest disk response time we have to minimize seek time and
rotational latency
Place commonly used files on the
FCFS/FIFO
SCAN/elevator/LOOK
Flash Storage
Operations
Erase block
To improve durability
Management of defective pages/erase blocks: firmware
marks them as bad.
Wear leveling: spreads updates across the device so it
wears out in a uniform way; remapping helps with this.
Spares: used for both wear leveling and mapping out bad
pages and blocks.
Block vs Sector
OS may choose to use a larger
block size than the sector size, if
we only have large files we want
large blocks.
Most systems allow transferring of
many sectors between interrupts.
File Operations
Create()
Link()
Unlink()
Open()
Close()
Write()
Seek()
Fsync()
Advantages
Simple
Disadvantages
External fragmentation
Linked Allocation
File stored as a linked list of blocks. The file header has pointers to
the first and last sector/block allocated to that file (the pointer to
the last block makes it easier to grow the file). Each sector has
a pointer to the next sector.
Advantages
No external fragmentation
(minimum allocation unit is
a block, so no wasted
space)
Disadvantages
Direct Allocation
File header points to each data block.
Advantages
Little fragmentation
Disadvantages
Indexed Allocation
Create a non-data index block for each file; it contains a list of
pointers to file blocks (the number of pointers depends on the size
of a pointer and the size of a block).
The file header has a pointer to the index block (file header has
no direct knowledge of where file info is stored on the disk, no
longer points to data blocks).
OS allocates an array to hold the pointers to all the blocks
when it creates the file, but allocates the blocks only on
demand.
Advantages
Simple to implement because of fixed structure
Allows file growth/appends
Easy to access small files
Efficient in sequential reads
Disadvantages
Volume and file size are limited (no more than 2^28 blocks,
files no bigger than 4GB)
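The capacity of a single index block follows directly from the block and pointer sizes (the 4 KB block and 4-byte pointer below are example assumptions, not the scheme behind the 2^28-block limit above):

```python
# Index-block capacity math: how many block pointers fit in one index
# block, and the maximum file size a single index block allows.
block_size = 4096   # bytes per block
ptr_size = 4        # bytes per block pointer

pointers_per_index = block_size // ptr_size      # 1024 pointers
max_file_size = pointers_per_index * block_size  # 4 MiB with one index block

print(pointers_per_index, max_file_size)  # 1024 4194304
```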
Directories
A file that contains a collection of mappings
from file name to file number (inumber).
Those mappings are directory entries. Only
the OS can modify directories (this ensures the
integrity of the mapping); application programs
can read directories. Directories create a
name space for the files (the same files or the
same names can be used in different
directories).
File System Layout on Disk
MBR: master boot record
Partition Table: contains the addresses of the first and last blocks of each partition
Super Block: contains metadata of the file system
Free Space Management: free space info
Root Directory: holds root directory info
Simple and Stupid Directories: One name
space for the entire disk.
Use a special area of the disk to hold the
directory
Directory contains <name, index> pairs
If one user uses a name, no one else can
Simple User Based Directories: Each user
has a separate directory, but all of each
user's files must still have unique names,
names can be reused by different users.
Multi-Level Directories: tree-structured
hierarchical name space, what modern OS
use.
Store directories on disk, just like files,
except the file header for directories has a
special flag bit
User programs read directories just like
any other file, but only special system
calls can write directories
Each directory contains <name, inumber>
pairs in no particular order. The inumber
is the index into the array of inodes on
disk.
There is one special root directory (stored
in a special location on disk)
How do you find the blocks of a file?
1. Get the inumber from the directory that
contains the file
2. With the inumber find the inode (the inumber
is the index into the inode array)
3. The inode points to the file's blocks. Find
the correct block you are looking for.
Fast File System
Smart index structure: a multilevel index allows
locating all the blocks of a file (efficient for
both large and small files)
Smart locality heuristics: block group placement
Optimizing Directories
Free space represented as a bitmap
Block group placement reduces seeks when reading metadata
Small file where data is resident in the record (no extents needed, a small optimization)
If a file's metadata takes up too much room in the record, an attribute list holds pointers
to the other records holding the rest of the data
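The directory-to-inode-to-block lookup steps can be sketched end to end (file names, inumbers, and block contents here are made up):

```python
# Lookup sketch: directory entry -> inumber -> inode -> data blocks.
inodes = [None, {"blocks": [40, 41]}, {"blocks": [99]}]  # inode array on "disk"
root_dir = {"notes.txt": 1, "todo.txt": 2}               # <name, inumber> pairs
disk_blocks = {40: b"hello ", 41: b"world", 99: b"milk"}

def read_file(name):
    inumber = root_dir[name]   # 1. get the inumber from the directory
    inode = inodes[inumber]    # 2. the inumber indexes the inode array
    # 3. follow the inode's block pointers to the data
    return b"".join(disk_blocks[b] for b in inode["blocks"])

print(read_file("notes.txt"))  # b'hello world'
```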
Hardware provides:
OS provides:
Redundancy allows
recovery from some
additional failures
(persistence)
RAID-0
Disk striping: disk blocks are broken down and stored on
different disks so they can be accessed concurrently.
Gives us higher disk bandwidth but poor reliability
(failure of a single disk would cause data loss).
RAID-1
Mirrored disk, write same thing to both disks. On
failure use surviving disk. Expensive (must write each
change twice)
RAID-3
Byte striped with parity (bytes written to same spot on
each disk), parity allows us to detect and correct
errors in one of the disks, writing the parity disk every
time a change is made.
RAID-4
Block striped with parity (blocks written to same spot
on each disk), writing the parity disk every time a
change is made
RAID-5
Block-interleaved distributed parity: there is no single
parity disk; parity and data are distributed across all
disks
RAID-10
Stripes using RAID-0(disk striping) across reliable
logical disks (reliable because of RAID-1, mirrored
disks)
RAID-50
Stripes using RAID-0(disk striping) across groups of
disks with block interleaved distributed parity (RAID5)
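The parity used by RAID-3/4/5 is plain XOR, which is why any single lost strip can be rebuilt from the survivors (the bit patterns below are arbitrary examples):

```python
# XOR parity: the parity strip is the XOR of the data strips.
d0, d1, d2 = 0b1010, 0b0110, 0b1100
parity = d0 ^ d1 ^ d2

# Disk 1 fails: reconstruct its strip from the survivors plus parity.
recovered_d1 = d0 ^ d2 ^ parity
print(recovered_d1 == d1)  # True
```

This also shows the write cost noted above: every data change requires updating the parity strip as well.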
File System Goals
High Performance
Named Data: support files and directories with meaningful names
Large capacity
Survives crashes
Shared across programs
Controlled sharing
Include metadata
Reliability
Physical Characteristics -> Design Implication
Large cost to initiate I/O -> Organize storage to access data in large sequential
units; use caching
Storage devices can fail -> Use redundancy to correct failures
Crash can occur during an update -> Use transactions
Flash memory wears out -> Migrate data to even the wear
Write Through: write changes immediately back to disk. Consistent, but slow (we
have to wait for the write to hit the disk and generate an interrupt).
Write Back: delay writing modified data (for example, delay until the
page is replaced in memory). Better performance, but can cause
inconsistencies (data can be lost in a crash).
File System Inconsistency: metadata structures don't match (for
example bitmaps and inode structures); we don't care about user
data!
Many file system operations update multiple data structures, for example
moving a file between directories.
Example: Appending a data block to a file
Add the new data block to the data block struct
Update the inode struct
A crash in between can leave the structures inconsistent, e.g. the data
block bitmap and data block struct updates succeed but the rest do not.