
Buftree

The buftree (buffer tree) is a fundamental WAFL concept that needs proper understanding. Without it, a big
portion of WAFL will not be easy to follow. There are several questions to answer along the way:
- How is a buftree initialized?
- How does it grow?
- ...
- ...
Let's go through this journey and try to understand it.

Contents
1 Root buffer of an In-core Inode (wip)
2 Mysterious portions of struct wafl_Inode{} i.e. wi_u
3 Usage of wi_u
3.1 Inode with 0 bytes
3.2 Inode size (non-zero) < 64 bytes
3.3 Inode size = 65 bytes
3.4 Inode size > 4Kb
3.5 What happens when all the block pointers in the inode point to L0 blocks and we still need more blocks to hold data?
4 Disk Inode to In-core Inode
4.1 Copy from Disk Inode to In-core Inode
4.2 Initialize the In-core Inode's root buffer

Root buffer of an In-core Inode (wip)


An inode has an embedded wafl_Buf in the wip structure. The structure of this buffer is important.
struct wafl_Inode {
    ....
    struct wafl_Buf wi_buf;    /* Top of file's buffer tree. */
    ....
};

wi_buf is an embedded buffer and is thus called the root buffer of the buftree, or the root buffer of the inode.
But wafl_Buf is only a buffer header. Where is the actual data kept, i.e. where is wafl_Bdata? The answer lies in
another part of the wip, the wi_u union, described in the following sections.

Mysterious portions of struct wafl_Inode{} i.e. wi_u


struct wafl_Inode {
    ....
    union {
        int4  wi_vol_blockno[MAX_INODE_INDIRECT_COUNT];
        char  wi_data[INODE_DATA_SIZE];
        uint4 wi_rdev;
    } wi_u;
    ....
};
#define INODE_DATA_SIZE 64    /* Bytes in an inode. */

Usage of wi_u
The above union is where we keep different things in the wip, depending on the inode size. Let's look at the use cases one by one.

Inode with 0 bytes


When a disk inode is created on the filesystem, we can have a file with zero bytes of data. An inode with zero
bytes of data need not have any buffers associated with it to store the data. So, we do not use even a single byte of
wi_u.

Inode size (non-zero) < 64 bytes


If the inode size is < INODE_DATA_SIZE (64 bytes), then the file data is stored in the wip itself, i.e. in
wip->wi_u.wi_data.
What are the implications of this?
- wip->wi_blk_cnt is zero, as we have not allocated any buffer to the inode yet.
- wip->wi_buf.data.data is set up to point to wip->wi_u.wi_data. How? We will find out later.
- Buftree level is ZERO, as the root buffer points to the data directly, i.e. without using an indirect pointer.
- Inode level is also ZERO, as the inode level is nothing but the level of its buftree.
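
To make the level-0 case concrete, here is a minimal sketch of a small file living inside the inode. The struct and function names (sketch_buf, sketch_inode, sketch_write_inline) are hypothetical, trimmed-down stand-ins and not the real WAFL definitions; 8-byte block pointers are assumed. The sketch also assumes that exactly 64 bytes still fit inline, which the text above leaves open.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INODE_DATA_SIZE 64          /* size of wi_u, as in the source */

/* Hypothetical stand-in for the wafl_Buf header. */
struct sketch_buf {
    int   level;                    /* 0: data points at file bytes */
    char *data;                     /* stands in for wi_buf.data.data */
};

/* Hypothetical stand-in for wafl_Inode. */
struct sketch_inode {
    uint64_t          size;         /* file size in bytes */
    uint32_t          blk_cnt;      /* stands in for wi_blk_cnt */
    struct sketch_buf buf;          /* embedded root buffer (wi_buf) */
    union {
        uint64_t vol_blockno[INODE_DATA_SIZE / sizeof(uint64_t)];
        char     data[INODE_DATA_SIZE];
    } u;                            /* stands in for wi_u */
};

/* Store a small file inline. Assumption: len <= 64 bytes fits inline. */
void sketch_write_inline(struct sketch_inode *ip, const char *src, size_t len)
{
    assert(len <= INODE_DATA_SIZE);
    memset(ip->u.data, 0, INODE_DATA_SIZE);
    memcpy(ip->u.data, src, len);
    ip->size      = len;
    ip->blk_cnt   = 0;              /* no block allocated */
    ip->buf.level = 0;              /* root buffer points at data directly */
    ip->buf.data  = ip->u.data;     /* root buffer points into the inode */
}
```

After a call such as sketch_write_inline(&ino, "hello", 5), the root buffer's data pointer and the union inside the inode are the same address, which is the whole trick at level 0.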

Inode size = 65 bytes

Once the inode data crosses 64 bytes, the inode can no longer store the data within itself. So, it gets a brand new
4Kb block to store the data. We call this an L0 block. Note: the diagram shows 8 block pointer placeholders,
keeping in mind the new 64-bit WAFL filesystem: 64 / 8 = 8, where 8 bytes is sizeof(vbn). An inode on a 32-bit
filesystem will have 64 / 4 = 16 block ptrs.
If the inode size, currently < 64 bytes, is extended to 65 bytes, then we need to do the following:
- Allocate a block to the inode, i.e. wip->wi_blk_cnt moves from 0 to 1.
- Copy the original data (up to 64 bytes) from wip->wi_buf.data.data, which currently points to
  wip->wi_u.wi_data, into the new block, and then add the remaining bytes up to the 65th.
- Switch from using wip->wi_u.wi_data[] to wip->wi_u.wi_vol_blockno[], i.e. the inode will now store
  block pointers to the actual blocks that contain the data. NOTE that the 64 bytes have to be zeroed in
  this transition.
- Store the block number of the block we just allocated and copied the data to in the
  wi_vol_blockno[] array. NOTE that the root buf (wi_buf) continues to point to the same address
  inside the wip, but the contents are now block numbers.
What are the implications of this?
- wip->wi_blk_cnt is 1.
- wip->wi_buf.data.data continues to point to wip->wi_u.wi_data.
- Buftree level is ONE, as the root buffer no longer points to data directly. The root buf now reaches the
  data through a block pointer, i.e. it points to an indirect block, or rather to a pseudo indirect block
  (wi_u.wi_data).
- Inode level is also ONE, as the inode level is nothing but the level of its buftree.
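
The steps above can be sketched as follows. This is only an illustration under stated assumptions: the "volume" is a small in-memory array, sketch_alloc_block() is a made-up allocator, block number 0 is taken to mean "no block", and the struct is a hypothetical stand-in for wafl_Inode with 8-byte block pointers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INODE_DATA_SIZE 64
#define BLOCK_SIZE      4096
#define SKETCH_NBLOCKS  16          /* hypothetical tiny in-memory volume */

/* Hypothetical volume; block number 0 is reserved as "no block". */
static char     sketch_volume[SKETCH_NBLOCKS][BLOCK_SIZE];
static uint64_t sketch_next_free = 1;

/* Hypothetical stand-in for wafl_Inode. */
struct sketch_inode {
    uint64_t size;
    uint32_t blk_cnt;               /* stands in for wi_blk_cnt */
    int      level;                 /* stands in for wi_level */
    char    *root_data;             /* stands in for wi_buf.data.data */
    union {
        uint64_t vol_blockno[INODE_DATA_SIZE / sizeof(uint64_t)];
        char     data[INODE_DATA_SIZE];
    } u;                            /* stands in for wi_u */
};

/* Made-up allocator: hand out zeroed blocks in increasing order. */
uint64_t sketch_alloc_block(void)
{
    assert(sketch_next_free < SKETCH_NBLOCKS);
    memset(sketch_volume[sketch_next_free], 0, BLOCK_SIZE);
    return sketch_next_free++;
}

/* Append bytes to an inline (level 0) inode so it no longer fits inline. */
void sketch_grow_to_level1(struct sketch_inode *ip, const char *extra,
                           size_t extra_len)
{
    assert(ip->level == 0 && ip->size <= INODE_DATA_SIZE);
    assert(ip->size + extra_len > INODE_DATA_SIZE);
    assert(ip->size + extra_len <= BLOCK_SIZE);

    /* Step 1: allocate an L0 block; the block count goes from 0 to 1. */
    uint64_t bno = sketch_alloc_block();
    ip->blk_cnt = 1;

    /* Step 2: copy the inline data into the block, then the new bytes. */
    memcpy(sketch_volume[bno], ip->root_data, ip->size);
    memcpy(sketch_volume[bno] + ip->size, extra, extra_len);

    /* Step 3: zero the 64 bytes; they now hold block numbers, not data. */
    memset(ip->u.data, 0, INODE_DATA_SIZE);

    /* Step 4: record the new block in the first block pointer. */
    ip->u.vol_blockno[0] = bno;

    ip->size += extra_len;
    ip->level = 1;
    /* root_data is deliberately left unchanged: same address, new meaning. */
}
```

Note that nothing in the function touches root_data. The root buffer keeps pointing at the same 64 bytes inside the inode; only the interpretation of those bytes changes.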

Inode size > 4Kb


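When the file grows just past 4Kb, one L0 block is no longer enough and a second one is allocated:
- wip->wi_blk_cnt is 2.
- wip->wi_buf.data.data continues to point to wip->wi_u.wi_data.
- Buftree level is still ONE, so the inode level is still ONE.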
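
The block count at level 1 is plain arithmetic, sketched below. Both helpers are hypothetical and not WAFL functions. They assume 4096-byte blocks, 8 block pointers in the inode (the 64-bit case), and that L0 blocks are filled in offset order, which the source does not state explicitly.

```c
#include <assert.h>
#include <stdint.h>

#define INODE_DATA_SIZE 64
#define BLOCK_SIZE      4096
#define INODE_NPTRS     (INODE_DATA_SIZE / sizeof(uint64_t))   /* 8 */

/* Number of L0 blocks a level-1 file of the given size needs. */
uint32_t sketch_l0_blocks_needed(uint64_t size)
{
    assert(size > INODE_DATA_SIZE);             /* otherwise data is inline */
    assert(size <= INODE_NPTRS * BLOCK_SIZE);   /* otherwise level 2 */
    return (uint32_t)((size + BLOCK_SIZE - 1) / BLOCK_SIZE);   /* round up */
}

/* Assumed mapping: which block pointer slot covers a given byte offset. */
uint32_t sketch_slot_for_offset(uint64_t offset)
{
    assert(offset < INODE_NPTRS * BLOCK_SIZE);
    return (uint32_t)(offset / BLOCK_SIZE);
}
```

With these assumptions, a 65-byte file needs one block, a 4097-byte file needs two, and a file of exactly 8 * 4Kb uses all 8 block pointers.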

What happens when all the block pointers in the inode point to L0 blocks and we still need more blocks to hold data?

An indirect block gets allocated to point to the existing 8 blocks. In addition, we also need to allocate another L0
block to store the data beyond 32Kb = 8 * 4Kb.

- Since all the block ptrs are full, WAFL now allocates a new 4Kb indirect block (L1) to point to all the L0
  blocks.
- After allocating the L1 indirect block and setting it to point to the L0s, it clears the remaining block ptrs
  within the inode, except the first inode block ptr.
- The first inode block ptr will now point to the L1.
- Until now, we have just done the groundwork to be able to point to more data. We also need to allocate
  another L0 block to store the new data.
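
A rough sketch of this promotion is shown below. All names are hypothetical stand-ins, and the volume is a small in-memory array. Two details are assumptions that the text above does not spell out: that the new L0 block goes into the next free slot of the L1 block, and that the inode level becomes 2 once its first pointer refers to an L1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INODE_DATA_SIZE 64
#define BLOCK_SIZE      4096
#define INODE_NPTRS     (INODE_DATA_SIZE / sizeof(uint64_t))   /* 8 */
#define BLOCK_NPTRS     (BLOCK_SIZE / sizeof(uint64_t))        /* 512 */
#define SKETCH_NBLOCKS  32          /* hypothetical tiny in-memory volume */

/* A block holds either file bytes (L0) or block pointers (L1). */
static union {
    char     bytes[BLOCK_SIZE];
    uint64_t ptrs[BLOCK_NPTRS];
} sketch_volume[SKETCH_NBLOCKS];
static uint64_t sketch_next_free = 1;       /* 0 means "no block" */

/* Hypothetical stand-in for the relevant parts of wafl_Inode. */
struct sketch_inode {
    int      level;
    uint64_t vol_blockno[INODE_NPTRS];      /* stands in for wi_vol_blockno[] */
};

/* Made-up allocator: hand out zeroed blocks in increasing order. */
uint64_t sketch_alloc_block(void)
{
    assert(sketch_next_free < SKETCH_NBLOCKS);
    memset(&sketch_volume[sketch_next_free], 0, BLOCK_SIZE);
    return sketch_next_free++;
}

/* Promote a full level-1 inode to level 2 and add one more L0 block. */
void sketch_grow_to_level2(struct sketch_inode *ip)
{
    assert(ip->level == 1);
    for (size_t i = 0; i < INODE_NPTRS; i++)
        assert(ip->vol_blockno[i] != 0);    /* every block ptr is in use */

    /* Allocate the L1 and make it point to all the existing L0s. */
    uint64_t l1 = sketch_alloc_block();
    memcpy(sketch_volume[l1].ptrs, ip->vol_blockno, sizeof ip->vol_blockno);

    /* Clear the inode's block ptrs, then point the first one at the L1. */
    memset(ip->vol_blockno, 0, sizeof ip->vol_blockno);
    ip->vol_blockno[0] = l1;
    ip->level = 2;                          /* assumption, see lead-in */

    /* Groundwork done; allocate the L0 that holds the new data. */
    uint64_t l0 = sketch_alloc_block();
    sketch_volume[l1].ptrs[INODE_NPTRS] = l0;   /* assumed: next free slot */
}
```

In this sketch the L1 block has room for 512 pointers, so after the promotion the file can keep growing for a long while before another indirect block is needed.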

Disk Inode to In-core Inode


How does the transition from disk inode to in-core inode happen?

Copy from Disk Inode to In-core Inode


We need to copy the 64 bytes stored in the disk inode's wdi_u union. If the inode size is < 64 bytes, we copy file
data; otherwise we copy block pointers. The path taken is: wafl_read_inode() -> wafl_copy_disk_to_incore_inode().
void
wafl_copy_disk_to_incore_inode(wafl_Disk_inode *src_dip, wafl_Inode *dst_wip)
{
    ....
    dst_wip->wi_level = src_dip->wdi_level;
    bcopy(src_dip->wdi_u.wdi_data, dst_wip->wi_u.wi_data, INODE_DATA_SIZE);
    ....
}

Initialize the In-core Inode's root buffer


We need to point wip->wi_buf.data.data to the location of either the data (level 0) or the block pointers (level 1
and above). In both cases that location is wip->wi_u.wi_data. We do this in wafl_read_inode() ->
wafl_init_wip_root_buf().
void
wafl_init_wip_root_buf(wafl_Inode *wip, uint4 panic)
{
    wip->wi_buf.inodep = wip;
    wip->wi_buf.cp_state_flags[0] = 0;
    wip->wi_buf.cp_state_flags[1] = 0;
    WAFL_BUF_B_SET(&(wip->wi_buf), BUF_B_VALID);
    wip->wi_buf.level = wafl_size_to_level_wip(wip, WAFL_SIZE(wip), panic);
    wip->wi_buf.data.data = wip->wi_u.wi_data;
    wip->wi_buf.lru_msec = sk_msecs4;
    if (panic)
        VERIFY(wip->wi_buf.level == wip->wi_level);
}
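
The body of wafl_size_to_level_wip() is not shown here, so the following is only a guess at what a size-to-level mapping looks like given the rules described earlier. The function sketch_size_to_level() is hypothetical. It assumes 8 block pointers in the inode, 512 pointers per 4Kb indirect block, and that up to 64 bytes are stored inline.

```c
#include <assert.h>
#include <stdint.h>

#define INODE_DATA_SIZE 64
#define BLOCK_SIZE      4096
#define INODE_NPTRS     (INODE_DATA_SIZE / sizeof(uint64_t))   /* 8 */
#define BLOCK_NPTRS     (BLOCK_SIZE / sizeof(uint64_t))        /* 512 */

/* Hypothetical: smallest buftree level able to hold `size` bytes. */
int sketch_size_to_level(uint64_t size)
{
    if (size <= INODE_DATA_SIZE)
        return 0;                   /* data lives inside the inode */

    /* Level 1: the inode's block ptrs point straight at L0 blocks. */
    uint64_t capacity = (uint64_t)INODE_NPTRS * BLOCK_SIZE;    /* 32Kb */
    int level = 1;

    /* Each extra level multiplies the reachable size by 512. */
    while (size > capacity) {
        assert(capacity <= UINT64_MAX / BLOCK_NPTRS);   /* no overflow */
        capacity *= BLOCK_NPTRS;
        level++;
    }
    return level;
}
```

Under these assumptions, sizes up to 64 bytes map to level 0, sizes up to 32Kb map to level 1, and sizes up to 8 * 512 * 4Kb map to level 2. The VERIFY in wafl_init_wip_root_buf() then checks that the level computed from the size agrees with the level copied from the disk inode.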
