
Getting Started With ZFS "Zettabyte File System" in Solaris.

ZFS stands for Zettabyte File System. ZFS is a 128-bit FS that was first
introduced in June 2006 with the Solaris 10 6/06 release. ZFS allows 256
quadrillion zettabytes of storage, which means there is effectively no limit on
the number of filesystems or on the number of files/directories that can exist
on ZFS. ZFS does not replace any traditional filesystem (UFS in Solaris) and it
does not improve any existing UFS technology. Instead, it is a new approach to
managing data in Solaris. ZFS is more robust, scalable & easier to administer
than a traditional FS, but it will take some time to capture the market and to
replace UFS, the most stable FS for Solaris till date.

ZFS Pools

ZFS pools are storage pools used to manage physical disks/storage. With the
traditional UFS filesystem we would partition the disk and then create
filesystems on the slices. In ZFS the approach is completely different: we make
a pool of block devices (disks) and the filesystems are created from the pool.
This means whatever disk space is free in the pool can be used to create
filesystems as per requirement. You can think of pools as the diskgroups used
in VxVM.

ZFS requirements

1.) ZFS is fully supported on SPARC and Intel (x86) Solaris boxes.
2.) ZFS is supported in Solaris 10; the minimum release level is Solaris 10
6/06.
3.) Recommended memory is 1GB or higher.
4.) The minimum disk size for ZFS is 128MB as per the documentation, and the
minimum disk space for a storage pool is approx. 64MB.
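
To sanity-check these prerequisites on a given box, commands along the following
lines can be used (a quick sketch; the exact output will differ per system):

# cat /etc/release          (should show Solaris 10 6/06 or a later release)
# prtconf | grep -i memory  (should report 1GB of memory or more)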

ZFS Terminology

TERM / DEFINITION

Checksum
A 256-bit hash of the data in a FS block.

Clone
A FS whose contents are identical to the contents of a ZFS snapshot.

Dataset
A generic name for the following ZFS entities: clones, filesystems, snapshots
& volumes. Each dataset is identified by a unique name in the ZFS namespace.

Default FS
A file system that is created by default when using Solaris Live Upgrade to
migrate from UFS to a ZFS root FS. The current set of default file systems is
/, /usr, /opt & /var.

ZFS FS
A ZFS dataset that is mounted within the standard system namespace and behaves
like any other traditional FS.

Mirror
A virtual device, also called a RAID-1 device, that stores identical copies of
data on two or more disks.

Pool
A logical group of block devices describing the layout & physical
characteristics of the available storage. Space for datasets is allocated from
a pool. Also called a storage pool or simply a pool.

RAID-Z
A virtual device that stores data and parity on multiple disks, similar to
RAID-5.

Resilvering
The process of transferring data from one device to another device. For
example, when a mirror component is taken offline and then later put back
online, the data from the up-to-date mirror component is copied to the newly
restored mirror component. This process is called mirror resynchronization in
traditional volume management products.

Shared FS
The set of file systems that are shared between the alternate boot environment
and the primary boot environment. This set includes file systems such as
/export & the area reserved for swap. Shared FS might also contain zone roots.

Snapshot
A read-only image of a FS or volume at a given point in time.

Virtual Device
A logical device in a pool, which can be a physical device, a file, or a
collection of devices.

Volume
A dataset used to emulate a physical device. For example, you can create a ZFS
volume to use as a swap device.
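
For example, a ZFS volume can be created and added as a swap device roughly as
follows (a minimal sketch, assuming a pool named mypool already exists; the
volume name swapvol is just an example):

# zfs create -V 2g mypool/swapvol
# swap -a /dev/zvol/dsk/mypool/swapvol
# swap -l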

ZFS RAID Configurations:
========================

ZFS supports 3 RAID configurations, as given below (a quick creation sketch for
each layout follows the list):

1.) RAID-0 : Data is distributed across one or more disks. There is no
redundancy in RAID-0; if any disk fails, all data is lost. That is the reason
RAID-0 is the least preferred.

2.) RAID-1 : Two exact copies of the data are retained on the server. There is
no data loss as long as at least one side of the mirror survives. This is the
most commonly used RAID in any volume manager.

3.) RAID-Z : RAID-Z is similar to RAID-5.
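
As a quick sketch, the three layouts are created as shown below (the disk names
here are only placeholders, substitute your own):

# zpool create stripepool c1t1d0 c1t2d0            (RAID-0, no redundancy)
# zpool create mirrorpool mirror c1t1d0 c1t2d0     (RAID-1)
# zpool create rzpool raidz c1t1d0 c1t2d0 c1t3d0   (RAID-Z)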

Creation Of Basic ZFS:
======================

I am creating a pool named mypool consisting of one disk, c1t0d0. It is a
RAID-0 (non-redundant) pool.

Note: You can use the -f option in case zpool create reports any errors.

# zpool create mypool c1t0d0

# df -h mypool
Filesystem size used avail capacity Mounted on
mypool 67G 21K 67G 1% /mypool

This output shows a pool named mypool and the ZFS filesystem /mypool, which can
be used to store data. ZFS creates this mountpoint directory itself.

We can check the space availability by using the zpool list command as shown below.

# zpool list
NAME SIZE ALLOC FREE CAP HEALTH ALTROOT
mypool 68G 124K 68.0G 0% ONLINE

Any errors in the zpool can be checked with the zpool status command, as shown below.

# zpool status
pool: mypool
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
mypool ONLINE 0 0 0
c1t0d0 ONLINE 0 0 0
errors: No known data errors

Now I will create a new ZFS filesystem using the same zpool. As zpool list
shows, the zpool has 68GB of space, so we can use the free space to create a
new FS.

# zfs create mypool/myfs

# df -h /mypool/myfs
Filesystem size used avail capacity Mounted on
mypool/myfs 67G 21K 67G 1% /mypool/myfs

This is how a simple ZFS FS is created. Next I will show how to change some of
the FS properties that are most commonly used in ZFS.

Next I will create a new ZFS FS named mypool/myquotefs and set its quota to
20GB. The quota prevents this FS from consuming all the space in the pool, so
the other filesystems in the pool are protected.

# zfs create mypool/myquotefs

# zfs set quota=20g mypool/myquotefs

# df -h /mypool/myquotefs
Filesystem size used avail capacity Mounted on
mypool/myquotefs 20G 21K 20G 1% /mypool/myquotefs

Here I will change the FS mountpoint to a desired path by changing the ZFS
mountpoint property.

# zfs set mountpoint=/test mypool/myfs

# df -h /test
Filesystem size used avail capacity Mounted on
mypool/myfs 67G 21K 67G 1% /test
#

zfs list lists all the active ZFS filesystems and volumes on the server:

# zfs list
NAME USED AVAIL REFER MOUNTPOINT
mypool 140K 66.9G 23K /mypool
mypool/myfs 21K 66.9G 21K /test
mypool/myquotefs 21K 20.0G 21K /mypool/myquotefs

The -r option recursively lists the datasets under the given pool or dataset
name, as shown below:

# zfs list -r mypool/myfs


NAME USED AVAIL REFER MOUNTPOINT
mypool/myfs 21K 66.9G 21K /test
#

# zpool status
pool: mypool
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
mypool ONLINE 0 0 0
c1t0d0 ONLINE 0 0 0
errors: No known data errors

Note: To check all the properties you can use zfs get all; I will show its
output later in the post.

Renaming a ZFS FS:

We can rename a ZFS FS using the zfs rename command. Below is the syntax for
the same:

# zfs rename <dataset> <new-dataset>

Eg: zfs rename mypool/myquotefs mypool/new-myquotefs

Mirroring in ZFS:
=================

In this section I will demonstrate mirroring of a ZFS pool and operations such
as taking a disk out of the pool and inserting a disk into the pool. You will
clearly see how the status changes while a disk is inserted or taken out. You
will also see how to check ZFS parameters and how to change them using the zfs
set command, and I will also show how to destroy a ZFS FS and a ZFS pool. This
should give you a basic platform to get up to speed with ZFS, which is going to
be the primary FS in Solaris 11.

Note: You will also find the procedure to replace a disk under ZFS in the steps
below. You need to take the disk offline from the respective pool, replace it
with a new one, bring the replacement disk online again for that pool, and
monitor until it is completely resilvered.
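
A rough sketch of that replacement sequence is given below (the disk names
c1t2d0 and c1t3d0 are only placeholders for the failed and the new disk):

# zpool offline mypool c1t2d0
(physically swap in the new disk, assumed here to appear as c1t3d0)
# zpool replace mypool c1t2d0 c1t3d0
# zpool status mypool          (repeat until the resilver completes)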

# zpool attach -f mypool c1t0d0 c1t2d0

Note: Both disks should be the same size, or the new disk being attached as a
mirror must be larger than the existing one.

# zpool status
pool: mypool
state: ONLINE
scrub: resilver completed after 0h0m with 0 errors on Sun Oct 9 22:56:23 2011
config:
NAME STATE READ WRITE CKSUM
mypool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c1t0d0 ONLINE 0 0 0
c1t2d0 ONLINE 0 0 0 143K resilvered
errors: No known data errors

# zpool list
NAME SIZE ALLOC FREE CAP HEALTH ALTROOT
mypool 68G 156K 68.0G 0% ONLINE

# zpool iostat

               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
mypool       156K  68.0G      0      0     85    489

# zpool status -x
all pools are healthy

# zpool status
pool: mypool
state: ONLINE
scrub: resilver completed after 0h0m with 0 errors on Sun Oct 9 22:56:23 2011
config:
NAME STATE READ WRITE CKSUM
mypool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c1t0d0 ONLINE 0 0 0
c1t2d0 ONLINE 0 0 0 143K resilvered
errors: No known data errors

# zpool detach mypool c1t0d0

# zpool status
pool: mypool
state: ONLINE
scrub: resilver completed after 0h0m with 0 errors on Sun Oct 9 22:56:23 2011
config:
NAME STATE READ WRITE CKSUM
mypool ONLINE 0 0 0
c1t2d0 ONLINE 0 0 0 143K resilvered
errors: No known data errors
#

# zpool attach -f mypool c1t2d0 c1t0d0

# zpool status
pool: mypool
state: ONLINE
scrub: resilver completed after 0h0m with 0 errors on Sun Oct 9 23:02:54 2011
config:
NAME STATE READ WRITE CKSUM
mypool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c1t2d0 ONLINE 0 0 0
c1t0d0 ONLINE 0 0 0 154K resilvered
errors: No known data errors
#

# zpool offline mypool c1t2d0

# zpool status
pool: mypool
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using zpool online or replace the device with
zpool replace.
scrub: resilver completed after 0h0m with 0 errors on Sun Oct 9 23:02:54 2011
config:
NAME STATE READ WRITE CKSUM
mypool DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
c1t2d0 OFFLINE 0 0 0

c1t0d0 ONLINE 0 0 0 154K resilvered


errors: No known data errors

# zpool online mypool c1t2d0

# zpool status
pool: mypool
state: ONLINE
scrub: resilver completed after 0h0m with 0 errors on Sun Oct 9 23:03:57 2011
config:
NAME STATE READ WRITE CKSUM
mypool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c1t2d0 ONLINE 0 0 0 24K resilvered
c1t0d0 ONLINE 0 0 0
errors: No known data errors
#

# zpool history mypool


History for mypool:
2011-10-09.22:47:43 zpool create mypool c1t0d0
2011-10-09.22:49:15 zfs create mypool/myfs
2011-10-09.22:51:49 zfs create mypool/myquotefs
2011-10-09.22:52:01 zfs set quota=20g mypool/myquotefs
2011-10-09.22:53:30 zfs set mountpoint=/test mypool/myfs
2011-10-09.22:56:24 zpool attach -f mypool c1t0d0 c1t2d0
2011-10-09.23:01:49 zpool detach mypool c1t0d0
2011-10-09.23:02:54 zpool attach -f mypool c1t2d0 c1t0d0
2011-10-09.23:03:25 zpool offline mypool c1t2d0
2011-10-09.23:03:57 zpool online mypool c1t2d0
# zpool history -l mypool
History for mypool:

2011-10-09.22:47:43 zpool create mypool c1t0d0 [user root on yogesh-test:global]
2011-10-09.22:49:15 zfs create mypool/myfs [user root on yogesh-test:global]
2011-10-09.22:51:49 zfs create mypool/myquotefs [user root on yogesh-test:global]
2011-10-09.22:52:01 zfs set quota=20g mypool/myquotefs [user root on yogesh-test:global]
2011-10-09.22:53:30 zfs set mountpoint=/test mypool/myfs [user root on yogesh-test:global]
2011-10-09.22:56:24 zpool attach -f mypool c1t0d0 c1t2d0 [user root on yogesh-test:global]
2011-10-09.23:01:49 zpool detach mypool c1t0d0 [user root on yogesh-test:global]
2011-10-09.23:02:54 zpool attach -f mypool c1t2d0 c1t0d0 [user root on yogesh-test:global]
2011-10-09.23:03:25 zpool offline mypool c1t2d0 [user root on yogesh-test:global]
2011-10-09.23:03:57 zpool online mypool c1t2d0 [user root on yogesh-test:global]
#

# zpool status
pool: mypool
state: ONLINE
scrub: scrub completed after 0h0m with 0 errors on Sun Oct 9 23:06:37 2011
config:
NAME STATE READ WRITE CKSUM
mypool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c1t2d0 ONLINE 0 0 0
c1t0d0 ONLINE 0 0 0
errors: No known data errors

# zfs list
NAME USED AVAIL REFER MOUNTPOINT
mypool 155K 66.9G 23K /mypool
mypool/myfs 21K 66.9G 21K /test
mypool/myquotefs 21K 20.0G 21K /mypool/myquotefs

# zfs get all mypool


NAME PROPERTY VALUE SOURCE
mypool type filesystem
mypool creation Sun Oct 9 22:47 2011
mypool used 155K
mypool available 66.9G
mypool referenced 23K
mypool compressratio 1.00x
mypool mounted yes
mypool quota none default
mypool reservation none default
mypool recordsize 128K default
mypool mountpoint /mypool default
mypool sharenfs off default
mypool checksum on default
mypool compression off default
mypool atime on default
mypool devices on default
mypool exec on default
mypool setuid on default
mypool readonly off default
mypool zoned off default
mypool snapdir hidden default
mypool aclmode groupmask default
mypool aclinherit restricted default
mypool canmount on default
mypool shareiscsi off default
mypool xattr on default

mypool copies 1 default
mypool version 4
mypool utf8only off
mypool normalization none
mypool casesensitivity sensitive
mypool vscan off default
mypool nbmand off default
mypool sharesmb off default
mypool refquota none default
mypool refreservation none default
mypool primarycache all default
mypool secondarycache all default
mypool usedbysnapshots 0
mypool usedbydataset 23K
mypool usedbychildren 132K
mypool usedbyrefreservation 0
mypool logbias latency default

Note: Next I will set the pool's mountpoint property to none. The filesystems
that inherit the mountpoint from the pool will then no longer be mounted
automatically by ZFS; afterwards I restore the default behaviour with zfs
inherit mountpoint. The example is given below.

Note: I suggest you go through the ZFS FS property index table, which you can
easily find via Google or on the Sun site. It will give you a better idea of
all the parameters that can be changed and the effect each of them has.

# zfs set mountpoint=none mypool

# zfs list
NAME USED AVAIL REFER MOUNTPOINT
mypool 164K 66.9G 23K none
mypool/myfs 21K 66.9G 21K /test
mypool/myquotefs 21K 20.0G 21K none

# df -k
Filesystem kbytes used avail capacity Mounted on
/dev/vx/dsk/bootdg/rootvol
8262869 5102426 3077815 63% /
/devices 0 0 0 0% /devices
ctfs 0 0 0 0% /system/contract
proc 0 0 0 0% /proc
mnttab 0 0 0 0% /etc/mnttab
swap 23798088 1624 23796464 1% /etc/svc/volatile
objfs 0 0 0 0% /system/object
sharefs 0 0 0 0% /etc/dfs/sharetab
/platform/sun4u-us3/lib/libc_psr/libc_psr_hwcap1.so.1
8262869 5102426 3077815 63% /platform/sun4u-us3/lib/libc_psr.so.1
/platform/sun4u-us3/lib/sparcv9/libc_psr/libc_psr_hwcap1.so.1
8262869 5102426 3077815 63% /platform/sun4u-us3/lib/sparcv9/libc_psr.so.1
fd 0 0 0 0% /dev/fd
swap 23796480 16 23796464 1% /tmp
swap 23796528 64 23796464 1% /var/run
swap 23796464 0 23796464 0% /dev/vx/dmp
swap 23796464 0 23796464 0% /dev/vx/rdmp
/dev/vx/dsk/bootdg/var_crash
20971520 71784 19593510 1% /var/crash
mypool/myfs 70189056 21 70188892 1% /test

# zfs inherit mountpoint mypool

# zfs list
NAME USED AVAIL REFER MOUNTPOINT
mypool 164K 66.9G 23K /mypool
mypool/myfs 21K 66.9G 21K /test
mypool/myquotefs 21K 20.0G 21K /mypool/myquotefs

# df -k
Filesystem kbytes used avail capacity Mounted on
/dev/vx/dsk/bootdg/rootvol
8262869 5102427 3077814 63% /
/devices 0 0 0 0% /devices
ctfs 0 0 0 0% /system/contract
proc 0 0 0 0% /proc
mnttab 0 0 0 0% /etc/mnttab
swap 23796840 1624 23795216 1% /etc/svc/volatile
objfs 0 0 0 0% /system/object
sharefs 0 0 0 0% /etc/dfs/sharetab
/platform/sun4u-us3/lib/libc_psr/libc_psr_hwcap1.so.1
8262869 5102427 3077814 63% /platform/sun4u-us3/lib/libc_psr.so.1
/platform/sun4u-us3/lib/sparcv9/libc_psr/libc_psr_hwcap1.so.1
8262869 5102427 3077814 63% /platform/sun4u-us3/lib/sparcv9/libc_psr.so.1
fd 0 0 0 0% /dev/fd
swap 23795232 16 23795216 1% /tmp
swap 23795280 64 23795216 1% /var/run
swap 23795216 0 23795216 0% /dev/vx/dmp
swap 23795216 0 23795216 0% /dev/vx/rdmp
/dev/vx/dsk/bootdg/var_crash
20971520 71784 19593510 1% /var/crash
mypool/myfs 70189056 21 70188892 1% /test
mypool 70189056 23 70188892 1% /mypool
mypool/myquotefs 20971520 21 20971499 1% /mypool/myquotefs
#

# zfs list
NAME USED AVAIL REFER MOUNTPOINT
mypool 164K 66.9G 23K /mypool
mypool/myfs 21K 66.9G 21K /test
mypool/myquotefs 21K 20.0G 21K /mypool/myquotefs

# zfs destroy mypool/myquotefs


# zfs destroy mypool/myfs

# zpool destroy mypool


#
#
# zfs list
no datasets available
# zpool list
no pools available
#

I hope this post will help beginners get up to speed with ZFS. I will try to
cover the more complex ZFS tasks in my coming posts. This is just an overview,
and I think it is useful for many Solaris administrators. :-)

In our earlier post (Getting Started with ZFS), Yogesh discussed various ZFS
pool and file system operations. In this post I will be demonstrating the
redundancy capability of the different ZFS pool types and also the recovery
procedure for disk failure scenarios. I performed this lab on Solaris 11,
though the instructions are the same for Solaris 10.

Quick Recap about ZFS Pools

1. Simple and Striped Pool (equivalent to RAID-0; data is non-redundant)
2. Mirrored Pool (equivalent to RAID-1)
3. Raidz pool (equivalent to single-parity RAID-5; can withstand up to one
disk failure)
4. Raidz-2 pool (equivalent to dual-parity RAID-5; can withstand up to two
disk failures)
5. Raidz-3 pool (equivalent to triple-parity RAID-5; can withstand up to three
disk failures)

RAIDZ Configuration Requirements and Recommendations

A RAIDZ configuration with N disks of size X with P parity disks can hold
approximately (N-P)*X bytes and can withstand P device(s) failing before data
integrity is compromised.

Start a single-parity RAIDZ (raidz) configuration at 3 disks (2+1).
Start a double-parity RAIDZ (raidz2) configuration at 6 disks (4+2).
Start a triple-parity RAIDZ (raidz3) configuration at 9 disks (6+3).
(N+P) with P = 1 (raidz), 2 (raidz2), or 3 (raidz3) and N equal to 2, 4, or 6.
The recommended number of disks per group is between 3 and 9. If you have more
disks, use multiple groups.

A general consideration is whether your goal is maximum disk space or maximum
performance.
A RAIDZ configuration maximizes disk space and generally performs well when
data is written and read in large chunks (128K or more).
A RAIDZ-2 configuration offers better data availability, and performs similarly
to RAIDZ. RAIDZ-2 has significantly better mean time to data loss (MTTDL) than
either RAIDZ or 2-way mirrors.
A RAIDZ-3 configuration maximizes disk space and offers excellent availability
because it can withstand 3 disk failures.
A mirrored configuration consumes more disk space but generally performs better
with small random reads.

Disk Failure Scenario for Simple/Striped ZFS Non Redundant Pool

Disk Configuration:
root@gurkulunix3:~# echo|format

Searching for disks...done

AVAILABLE DISK SELECTIONS:


0. c3t0d0 <SUN ZFS 7120 HARDDISK-1.0 cyl 8351 alt 2 hd 255 sec 63>
/pci@0,0/pci8086,2829@d/disk@0,0
1. c3t2d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@2,0
2. c3t4d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@4,0
3. c3t5d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@5,0
4. c3t6d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@6,0

Creating Simple ZFS Storage Pool

root@gurkulunix3:/dev/chassis# zpool create poolnr c3t2d0 c3t3d0


poolnr successfully created, but with no redundancy; failure of one
device will cause loss of the pool

root@gurkulunix3:/dev/chassis# zpool list
NAME      SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
poolnr   3.97G  92.5K  3.97G   0%  1.00x  ONLINE  -
rpool    63.5G  5.21G  58.3G   8%  1.00x  ONLINE  -

Creating Sample Filesystem for new pool

root@gurkulunix3:/dev/chassis# zfs create poolnr/testfs

root@gurkulunix3:/downloads# zpool status poolnr
  pool: poolnr
 state: ONLINE
  scan: none requested
config:

        NAME      STATE   READ WRITE CKSUM
        poolnr    ONLINE     0     0     0
          c3t2d0  ONLINE     0     0     0
          c3t3d0  ONLINE     0     0     0

errors: No known data errors

After Manual Simulation of the Disk ( c3t2d0) failure:

root@gurkulunix3:~# echo|format
Searching for disks...done

AVAILABLE DISK SELECTIONS:


0. c3t0d0 <SUN ZFS 7120 HARDDISK-1.0 cyl 8351 alt 2 hd 255 sec 63>
/pci@0,0/pci8086,2829@d/disk@0,0
1. c3t2d0 <drive type unknown>
/pci@0,0/pci8086,2829@d/disk@2,0
2. c3t4d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@4,0
3. c3t5d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@5,0
4. c3t6d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@6,0

root@gurkulunix3:~# zpool status poolnr
  pool: poolnr
 state: UNAVAIL
status: One or more devices are faulted in response to persistent errors. There
        are insufficient replicas for the pool to continue functioning.
action: Destroy and re-create the pool from a backup source. Manually marking
        the device repaired using zpool clear may allow some data to be
        recovered.
  scan: none requested
config:

        NAME      STATE     READ WRITE CKSUM
        poolnr    UNAVAIL      0     0     0  insufficient replicas
          c3t2d0  FAULTED      0     0     0  too many errors
          c3t6d0  ONLINE       0     0     0

As the above scenario shows, a simple (non-redundant) ZFS pool cannot withstand
any disk failure.

Disk Failure Scenario for Mirror Pool

Initial Disk Configuration

root@gurkulunix3:~# echo|format
Searching for disks...done

AVAILABLE DISK SELECTIONS:


0. c3t0d0 <SUN ZFS 7120 HARDDISK-1.0 cyl 8351 alt 2 hd 255 sec 63>
/pci@0,0/pci8086,2829@d/disk@0,0
1. c3t2d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@2,0

2. c3t3d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@3,0
3. c3t4d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@4,0
4. c3t7d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@7,0
Specify disk (enter its number): Specify disk (enter its number):

Create Mirror Pool

root@gurkulunix3:~# zpool create mpool mirror c3t4d0 c3t7d0


root@gurkulunix3:~# zfs create mpool/mtestfs

>>> Copy Some Sample data to new file system

root@gurkulunix3:~# df -h|grep /mpool
mpool            2.0G   31K   2.0G    1%  /mpool
mpool/mtestfs    2.0G   32K   2.0G    1%  /mpool/mtestfs

root@gurkulunix3:~# zpool status mpool
  pool: mpool
 state: ONLINE
  scan: none requested
config:

        NAME        STATE   READ WRITE CKSUM
        mpool       ONLINE     0     0     0
          mirror-0  ONLINE     0     0     0
            c3t4d0  ONLINE     0     0     0
            c3t7d0  ONLINE     0     0     0

errors: No known data errors

After Manually simulating the Disk Failure

root@gurkulunix3:~# echo|format
Searching for disks...done

AVAILABLE DISK SELECTIONS:


0. c3t0d0 <SUN ZFS 7120 HARDDISK-1.0 cyl 8351 alt 2 hd 255 sec 63>
/pci@0,0/pci8086,2829@d/disk@0,0
1. c3t2d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@2,0
2. c3t3d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@3,0
3. c3t4d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@4,0
Specify disk (enter its number): Specify disk (enter its number):
<== we lost the disk c3t7d0

Checking pool Status after Disk Failure


root@gurkulunix3:~# zpool status mpool
  pool: mpool
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using zpool online.
   see: http://www.sun.com/msg/ZFS-8000-2Q
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        mpool       DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            c3t4d0  ONLINE       0     0     0
            c3t7d0  UNAVAIL      0     0     0  cannot open

errors: No known data errors

After physically Replacing the Failed disk ( placing new disk in same location)

root@gurkulunix3:~# echo|format
Searching for disks...done

AVAILABLE DISK SELECTIONS:


0. c3t0d0 <SUN ZFS 7120 HARDDISK-1.0 cyl 8351 alt 2 hd 255 sec 63>
/pci@0,0/pci8086,2829@d/disk@0,0
1. c3t2d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@2,0
2. c3t3d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@3,0
3. c3t4d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@4,0
4. c3t7d0 <SUN ZFS 7120 HARDDISK-1.0 cyl 1022 alt 2 hd 128 sec 32>
/pci@0,0/pci8086,2829@d/disk@7,0 << New Disk

>>> Label new disk with SMI Label ( A requirement to attach to ZFS pool)
root@gurkulunix3:~# format -L vtoc -d c3t7d0
Searching for disks...done
selecting c3t7d0
[disk formatted]

c3t7d0 is labeled with VTOC successfully.

Replace the Failed Disk Component from the ZFS pool

root@gurkulunix3:~# zpool replace mpool c3t7d0


root@gurkulunix3:~# zpool status -x mpool
pool mpool is healthy
root@gurkulunix3:~# zpool status mpool
  pool: mpool
 state: ONLINE
  scan: resilvered 210M in 0h0m with 0 errors on Sun Sep 16 10:41:21 2012
config:

        NAME        STATE   READ WRITE CKSUM
        mpool       ONLINE     0     0     0
          mirror-0  ONLINE     0     0     0
            c3t4d0  ONLINE     0     0     0
            c3t7d0  ONLINE     0     0     0   <<< Disk Online

errors: No known data errors


root@gurkulunix3:~#

Single and Double Disk Failure Scenarios for ZFS Raid-Z Pool

Disk Configuration Available for new Raid-Z pool Creation

root@gurkulunix3:~# echo|format
Searching for disks...done

AVAILABLE DISK SELECTIONS:


0. c3t0d0 <SUN ZFS 7120 HARDDISK-1.0 cyl 8351 alt 2 hd 255 sec 63>
/pci@0,0/pci8086,2829@d/disk@0,0
1. c3t2d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@2,0
2. c3t3d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@3,0
3. c3t4d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@4,0
4. c3t7d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@7,0
Specify disk (enter its number): Specify disk (enter its number):

Creating New RaidZ Pool

root@gurkulunix3:~# zpool create rzpool raidz c3t2d0 c3t3d0 c3t4d0 c3t7d0


invalid vdev specification
use -f to override the following errors:
/dev/dsk/c3t2d0s0 is part of exported or potentially active ZFS pool poolnr.
Please see zpool(1M).
==> Here we had an issue with one of the disks we selected for the pool: the
disk was already used by another zpool earlier. That old zpool is no longer
available, and we want to reuse the disk for the new zpool.

==> We can solve the problem in two ways: 1. use the -f option to override the
configuration, or 2. reinitialize the partition table of the disk (Solaris x86
only).
==> In this example I have reinitialized the whole disk as a Solaris partition
with the below command:

root@gurkulunix3:~# fdisk -B /dev/rdsk/c3t3d0p0


root@gurkulunix3:~# zpool create rzpool raidz c3t2d0 c3t3d0 c3t4d0 c3t7d0

root@gurkulunix3:~# zpool status rzpool
  pool: rzpool
 state: ONLINE
  scan: none requested
config:

        NAME        STATE   READ WRITE CKSUM
        rzpool      ONLINE     0     0     0
          raidz1-0  ONLINE     0     0     0
            c3t2d0  ONLINE     0     0     0
            c3t3d0  ONLINE     0     0     0
            c3t4d0  ONLINE     0     0     0
            c3t7d0  ONLINE     0     0     0

errors: No known data errors


Create File system and Copy some test data to rzpool/r5testfs

root@gurkulunix3:~# zfs create rzpool/r5testfs


root@gurkulunix3:/downloads# df -h|grep test
rzpool/r5testfs    5.8G  575M   5.3G   10%  /rzpool/r5testfs

root@gurkulunix3:/downloads# cd /rzpool/r5testfs/
root@gurkulunix3:/rzpool/r5testfs# ls -l
total 1176598
-rw-r--r--   1 root   root   602057762 Sep 16 11:09 OLE6-U2-VM-Template.zip
root@gurkulunix3:/rzpool/r5testfs#

After Manual Simulation of the Disk failure ( i.e. c3t7d0)

root@gurkulunix3:~# echo|format
Searching for disks...done

AVAILABLE DISK SELECTIONS:


0. c3t0d0 <SUN ZFS 7120 HARDDISK-1.0 cyl 8351 alt 2 hd 255 sec 63>
/pci@0,0/pci8086,2829@d/disk@0,0
1. c3t2d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@2,0
2. c3t3d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@3,0
3. c3t4d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@4,0

<<== c3t7d0 missing

Specify disk (enter its number): Specify disk (enter its number):

Checking the zpool status - it is in a DEGRADED state


root@gurkulunix3:~# zpool status -x rzpool
  pool: rzpool
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using zpool online.
   see: http://www.sun.com/msg/ZFS-8000-2Q
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rzpool      DEGRADED     0     0     0
          raidz1-0  DEGRADED     0     0     0
            c3t2d0  ONLINE       0     0     0
            c3t3d0  ONLINE       0     0     0
            c3t4d0  ONLINE       0     0     0
            c3t7d0  UNAVAIL      0     0     0  cannot open

errors: No known data errors


Checking if the File system is Still Accessible

root@gurkulunix3:~# df -h |grep testfs
rzpool/r5testfs    5.8G  575M   5.3G   10%  /rzpool/r5testfs

root@gurkulunix3:~# cd /rzpool/r5testfs
root@gurkulunix3:/rzpool/r5testfs# ls -l
total 1176598
-rw-r--r--   1 root   root   602057762 Sep 16 11:09 OLE6-U2-VM-Template.zip
root@gurkulunix3:/rzpool/r5testfs#

After replacing the failed disk with a new disk in the same location

root@gurkulunix3:~# echo|format
Searching for disks...done

AVAILABLE DISK SELECTIONS:


0. c3t0d0 <SUN ZFS 7120 HARDDISK-1.0 cyl 8351 alt 2 hd 255 sec 63>
/pci@0,0/pci8086,2829@d/disk@0,0
1. c3t2d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@2,0
2. c3t3d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@3,0
3. c3t4d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@4,0
4. c3t7d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@7,0
Specify disk (enter its number): Specify disk (enter its number):

root@gurkulunix3:~# zpool status -x
  pool: rzpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using zpool replace.
   see: http://www.sun.com/msg/ZFS-8000-4J
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rzpool      DEGRADED     0     0     0
          raidz1-0  DEGRADED     0     0     0
            c3t2d0  ONLINE       0     0     0
            c3t3d0  ONLINE       0     0     0
            c3t4d0  ONLINE       0     0     0
            c3t7d0  FAULTED      0     0     0  corrupted data

<<== The state changed to FAULTED because the zpool can see the new disk, but
the disk carries no/corrupted data.

errors: No known data errors

Replacing the Failed Disk Component in the Zpool

root@gurkulunix3:~# zpool replace rzpool c3t7d0


invalid vdev specification
use -f to override the following errors:
/dev/dsk/c3t7d0s0 is part of exported or potentially active ZFS pool mpool.
Please see zpool(1M).
root@gurkulunix3:~# zpool replace -f rzpool c3t7d0 <<== using -f option to
override above message
root@gurkulunix3:~# zpool status -x
all pools are healthy
root@gurkulunix3:~# zpool status rzpool
  pool: rzpool
 state: ONLINE
  scan: resilvered 192M in 0h1m with 0 errors on Sun Sep 16 11:50:49 2012
config:

        NAME        STATE   READ WRITE CKSUM
        rzpool      ONLINE     0     0     0
          raidz1-0  ONLINE     0     0     0
            c3t2d0  ONLINE     0     0     0
            c3t3d0  ONLINE     0     0     0
            c3t4d0  ONLINE     0     0     0
            c3t7d0  ONLINE     0     0     0

errors: No known data errors

Two-Disk Failure Scenario for the RaidZ Pool (And It Fails)

Zpool Status Before Disk Failure

root@gurkulunix3:~# zpool status rzpool
  pool: rzpool
 state: ONLINE
  scan: resilvered 192M in 0h1m with 0 errors on Sun Sep 16 11:50:49 2012
config:

        NAME        STATE   READ WRITE CKSUM
        rzpool      ONLINE     0     0     0
          raidz1-0  ONLINE     0     0     0
            c3t2d0  ONLINE     0     0     0
            c3t3d0  ONLINE     0     0     0
            c3t4d0  ONLINE     0     0     0
            c3t7d0  ONLINE     0     0     0

Disk Configuration After Simulating double disk failure

root@gurkulunix3:~# echo|format
Searching for disks...done

AVAILABLE DISK SELECTIONS:


0. c3t0d0 <SUN ZFS 7120 HARDDISK-1.0 cyl 8351 alt 2 hd 255 sec 63>
/pci@0,0/pci8086,2829@d/disk@0,0
1. c3t2d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@2,0
2. c3t3d0 <SUN ZFS 7120 HARDDISK-1.0-2.00GB>
/pci@0,0/pci8086,2829@d/disk@3,0 <== C3t4d0 & c3t7d0 missing
Specify disk (enter its number): Specify disk (enter its number):

Zpool Status after the Double Disk Failure


root@gurkulunix3:~# zpool status -x
  pool: rzpool
 state: UNAVAIL
status: One or more devices could not be opened. There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using zpool online.
   see: http://www.sun.com/msg/ZFS-8000-3C
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rzpool      UNAVAIL      0     0     0  insufficient replicas
          raidz1-0  UNAVAIL      0     0     0  insufficient replicas
            c3t2d0  ONLINE       0     0     0
            c3t3d0  ONLINE       0     0     0
            c3t4d0  UNAVAIL      0     0     0  cannot open
            c3t7d0  UNAVAIL      0     0     0  cannot open

Conclusion: the /rzpool/r5testfs filesystem is no longer available for use, and
the zpool cannot be recovered from this state.

This post is already too long to include the RaidZ2 and RaidZ3 disk failure
scenarios, so I will cover them in a separate post.

The ZFS file system is a new kind of file system that fundamentally changes the
way file systems are administered, with the features described below:

ZFS Pooled Storage

ZFS uses the concept of storage pools to manage physical storage. Historically,
file systems were constructed on top of a single physical device. To address
multiple devices and provide for data redundancy, the concept of a volume
manager was introduced to provide a representation of a single device, so that
file systems would not need to be modified to take advantage of multiple
devices. This design added another layer of complexity and ultimately prevented
certain file system advances because the file system had no control over the
physical placement of data on the virtualized volumes.

ZFS eliminates volume management altogether. Instead of forcing you to create
virtualized volumes, ZFS aggregates devices into a storage pool.

Transactional Semantics

ZFS is a transactional file system, which means that the file system state is
always consistent on disk. In a transactional file system, data is managed
using copy-on-write semantics. Data is never overwritten, and any sequence of
operations is either entirely committed or entirely ignored. Thus, the file
system can never be corrupted through accidental loss of power or a system
crash. Although the most recently written pieces of data might be lost, the
file system itself will always be consistent. In addition, synchronous data
(written using the O_DSYNC flag) is always guaranteed to be written before
returning, so it is never lost.

Checksums and Self-Healing Data

With ZFS, all data and metadata is verified using a user-selectable checksum
algorithm. In addition, ZFS provides for self-healing data. ZFS supports storage
pools with varying levels of data redundancy. When a bad data block is detected,
ZFS fetches the correct data from another redundant copy and repairs the bad
data, replacing it with the correct data.
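
For instance, the checksum algorithm can be chosen per dataset and the whole
pool can be verified on demand with a scrub (a small sketch, reusing the
mypool/myfs dataset from earlier in this post):

# zfs set checksum=sha256 mypool/myfs
# zpool scrub mypool
# zpool status -v mypool       (shows scrub progress and any repaired blocks)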

Unparalleled Scalability

ZFS is a 128-bit filesystem, which allows 256 quadrillion zettabytes of
storage. All metadata is allocated dynamically, so there is no need to
preallocate inodes or otherwise limit the scalability of the file system when
it is first created. All the algorithms have been written with scalability in
mind. Directories can have up to 2^48 (256 trillion) entries, and no limit
exists on the number of file systems or the number of files that can be
contained within a file system.

ZFS Snapshots

A snapshot is a read-only copy of a file system or volume. Snapshots can be
created quickly and easily. Initially, snapshots consume no additional disk
space within the pool.

As data within the active dataset changes, the snapshot consumes disk space by
continuing to reference the old data. As a result, the snapshot prevents the
data from being freed back to the pool.
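
A rough illustration of that behaviour, again assuming the mypool/myfs dataset
from earlier:

# zfs snapshot mypool/myfs@before
# zfs list -t snapshot         (USED is close to zero right after creation)
(overwrite or delete some files in the filesystem)
# zfs list -t snapshot         (USED grows, since the snapshot still references the old blocks)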

Below is the Quick Reference for ZFS command line Operations

CREATE / DESTROY POOL

Remove a disk from a pool

#zpool detach prod c0t0d0

Delete a pool and all associated filesystems

#zpool destroy prod

Create a pool named prod

#zpool create prod c0t0d0

Create a pool with a different default mount point

#zpool create -m /app/db prod c0t0d0

CREATE RAID-Z / MIRROR

Create RAID-Z vdev / pool

#zpool create raid-pool-1 raidz c3t0d0 c3t1d0 c3t2d0

Add RAID-Z vdev to pool raid-pool-1

#zpool add raid-pool-1 raidz c4t0d0 c4t1d0 c4t2d0

create a RAID-Z1 Storage Pool

#zpool create raid-pool-1 raidz1 c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0

create a RAID-Z2 Storage Pool

#zpool create raid-pool-1 raidz2 c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0

Add a new mirrored vdev to a pool

#zpool add prod mirror c3t0d0 c3t1d0

Force the creation of a mirror and concat

#zpool create -f prod c3t0d0 mirror c4t1d0 c5t2d0

Force the creation of a mirror between two different sized disks

#zpool create -f mypool mirror c2t0d0 c4t0d0

diska is mirrored to diskb

#zpool create mypool mirror diska diskb

diska is mirrored to diskb AND diskc is mirrored to diskd

#zpool create mypool mirror diska diskb mirror diskc diskd

CREATE / DESTROY A FILESYSTEM AND/OR A BLOCK DEVICE

Create a filesystem named db in pool prod

#zfs create prod/db

Create a 5gb block device volume named db in pool prod

#zfs create -V 5gb prod/db

Destroy the filesystem or block device db and associated snapshot(s)

#zfs destroy -fr prod/db

Destroy all datasets in pool prod

#zfs destroy -r prod

MOUNT / UMOUNT zfs

Set the FS mount point to /app/db

#zfs set mountpoint=/app/db prod/db

Mount the zfs filesystem db in pool prod

#zfs mount prod/db

Mount all zfs filesystems

#zfs mount -a

Unmount all zfs filesystems

#zfs umount -a

Unmount the zfs filesystem prod/db

#zfs umount prod/db

LIST ZFS FILESYSTEM INFORMATION

List all zfs filesystems

#zfs list

Listing all properties and settings for a FS

#zfs list -o all


#zfs get all mypool

LIST ZFS POOL INFORMATION

List pool status

# zpool status -x

List individual pool status mypool in detail

# zpool status -v mypool

Listing storage pools brief

# zpool list

Listing name and size

# zpool list -o name,size

Listing without headers / columns

# zpool list -Ho name

SET ZFS FILESYSTEM PROPERTIES

Set a quota on the disk space available to user guest22

#zfs set quota=10G mypool/home/guest22

How to set aside a specific amount of space for a filesystem

#zfs set reservation=10G mypool/prod/test

Enable mounting of a filesystem only through /etc/vfstab

# zfs set mountpoint=legacy mypool/db


and then add the appropriate entry to /etc/vfstab
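
A matching /etc/vfstab entry would look roughly like the line below (the mount
point /db is only an example):

mypool/db  -  /db  zfs  -  yes  -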

NFS share /prod/export/share

# zfs set sharenfs=on prod/export/share

Disable execution of files on /prod/export

# zfs set exec=off prod/export

Set the recordsize to 8k

# zfs set recordsize=8k prod/db

Do not update the file access time record

#zfs set atime=off prod/db/datafiles

Enable data compression

#zfs set compression=on prod/db

Enable fletcher4 type checksum

# zfs set checksum=fletcher4 prod/data

Remove the .snapshot directory visibility from the filesystem

# zfs set snapdir=hidden prod/data

ANALYSE ZFS PERFORMANCE

Display zfs IO statistics every 2 seconds

#zpool iostat 2

Display zfs IO statistics in detail every 2 seconds

#zpool iostat -v 2

zfs FILESYSTEM MAINTENANCE

Scrub all filesystems in pool mypool

# zpool scrub mypool

Temporarily offline a disk (until the next reboot)

#zpool offline -t mypool c0t0d0

Clear error count by onlining a disk

#zpool online mypool c0t0d0

Clear error count (without the need to online a disk)

#zpool clear mypool

IMPORT / EXPORT POOLS AND FILESYSTEMS

List pools available for import

#zpool import

Imports all pools found in the search directories

#zpool import -a

To search for pools with block devices not located in /dev/dsk

#zpool import -d <search-directory>

Search for a pool with block devices created in /zfs

#zpool import -d /zfs prod/data

Import a pool originally named mypool under new name temp

#zpool import mypool temp

Import pool using pool ID

#zpool import 6789123456

Export a zfs pool named mypool

#zpool export mypool

Force the unmount and export of the zfs pool mypool

#zpool export -f mypool

CREATE / DESTROY SNAPSHOTS

Create a snapshot named test of the db filesystem

#zfs snapshot mypool/db@test

List snapshots

#zfs list -t snapshot

Roll back to Tues (recursively destroy intermediate snaps)

#zfs rollback -r prod/prod@tuesday

Roll back and force an unmount and remount if necessary

#zfs rollback -rf prod/prod@tuesday

Destroy snapshot created earlier

#zfs destroy mypool/db@test

CREATE / DESTROY CLONES

Create a snapshot and then clone that snap

#zfs snapshot prod/prod@12-11-06


#zfs clone prod/prod@12-11-06 prod/prod/clone

Destroy clone

#zfs destroy prod/prod/clone
