Solaris -- ZFS
ZFS is a transactional file system developed by Sun Microsystems that includes numerous extensions for use in server and data-center environments. Among these are the enormous maximum file system size, simple administration of even complex configurations, integrated RAID functionality, volume management, and checksum-based protection against data-transfer errors. The name ZFS originally stood for "Zettabyte File System", but it has since become a pseudo-acronym and the long form is no longer in common use (cf. "You say zeta, I say zetta").
Creating and manipulating zpools (zfs)
For pooling devices, a zpool can be built as one of the following (creation commands for each type are sketched after this list):
- a mirror
- a RAIDz with single or double parity
- a concatenated/striped storage
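As a quick orientation, here is what creating each type would look like; the pool name "tank" and the disk names are placeholders, so substitute your own devices:

vidar/# zpool create tank mirror c2t0d0 c2t1d0                  # two-way mirror
vidar/# zpool create tank raidz c2t0d0 c2t1d0 c2t2d0            # RAIDz, single parity
vidar/# zpool create tank raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0    # RAIDz, double parity
vidar/# zpool create tank c2t0d0 c2t1d0                         # simple stripe across both disks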
First, let's look up the disks accessible to our system:
vidar/# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <DEFAULT cyl 10440 alt 2 hd 255 sec 63>
          /pci@0,0/pci15ad,1976@10/sd@0,0
       1. c1t1d0 <DEFAULT cyl 10440 alt 2 hd 255 sec 63>
          /pci@0,0/pci15ad,1976@10/sd@1,0
       2. c2t0d0 <VMware,-VMware Virtual S-1.0-1.00GB>
          /pci@0,0/pci15ad,790@11/pci15ad,1976@2/sd@0,0
       3. c2t1d0 <VMware,-VMware Virtual S-1.0-1.00GB>
          /pci@0,0/pci15ad,790@11/pci15ad,1976@2/sd@1,0
       4. c2t2d0 <VMware,-VMware Virtual S-1.0-1.00GB>
          /pci@0,0/pci15ad,790@11/pci15ad,1976@2/sd@2,0
       5. c2t3d0 <VMware,-VMware Virtual S-1.0-1.00GB>
          /pci@0,0/pci15ad,790@11/pci15ad,1976@2/sd@3,0
       6. c2t4d0 <VMware,-VMware Virtual S-1.0-1.00GB>
          /pci@0,0/pci15ad,790@11/pci15ad,1976@2/sd@4,0
       7. c2t5d0 <VMware,-VMware Virtual S-1.0-1.00GB>
          /pci@0,0/pci15ad,790@11/pci15ad,1976@2/sd@5,0
Specify disk (enter its number): ^C
Type CTRL-C to quit "format".
If your disks do not show up, use 'devfsadm'.
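In case you have just attached new disks, a minimal sketch of rescanning and then re-listing them non-interactively:

vidar/# devfsadm            # (re)create the /dev entries for newly attached devices
vidar/# echo | format       # print the disk list and exit without selecting a disk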
Let's create our first pool by simply putting together all six data disks (c1t0d0 holds our root partition and c1t1d0 our '/var' file system, so those two are not usable for our example):
vidar/# zpool create iscsi1 raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0
That's it. You have just created a zpool named "iscsi1" containing all six disks. Note that with single-parity RAIDz, roughly one disk's worth of space goes to parity: "zpool list" reports the raw size of all six disks, while "zfs list" will later show the smaller usable size (about five disks' worth).
vidar/# zpool list
NAME     SIZE  ALLOC   FREE    CAP  HEALTH  ALTROOT
iscsi1  5.91G   167K  5.91G     0%  ONLINE  -
Use "zpool status" to get detailed status information of the components of your zpool.
vidar/# zpool status
  pool: iscsi1
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        iscsi1      ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0
            c2t2d0  ONLINE       0     0     0
            c2t3d0  ONLINE       0     0     0
            c2t4d0  ONLINE       0     0     0
            c2t5d0  ONLINE       0     0     0

errors: No known data errors
To destroy a pool, use "zpool destroy" (the example below destroys a pool named "zfstest"; don't do this to "iscsi1", we still need it):
vidar/# zpool destroy zfstest
Using zfs (basics)
This zpool "iscsi1" also has one incorporated zfs filesystem on it. To manipulate zfs there is the "zfs" command. So keep in mind: zpool manipulates pool storage, zfs manipulates zfs generation and options.
vidar/# zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
iscsi1  107K  4.83G  34.9K  /iscsi1
As you can see, the pool "iscsi1" also has a filesystem on it, mounted automatically at mountpoint /iscsi1.
You may create a new filesystem by using "zfs create".
vidar/# zfs create iscsi1/affe
vidar/# zfs list
NAME          USED  AVAIL  REFER  MOUNTPOINT
iscsi1        157K  4.83G  34.9K  /iscsi1
iscsi1/affe  34.9K  4.83G  34.9K  /iscsi1/affe
New filesystems within a pool are always named "poolname/filesystemname". Without any additional options, they will also be mounted automatically at "/poolname/filesystemname" (see the sketch below for overriding the mountpoint).
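If the default location does not suit you, the "mountpoint" property can be set at creation time or changed later; a minimal sketch, where the dataset name "iscsi1/export" is purely hypothetical:

vidar/# zfs create -o mountpoint=/export/data iscsi1/export     # mount somewhere else from the start
vidar/# zfs set mountpoint=/data iscsi1/export                  # or move it later; zfs remounts it for you

Back to our example, let's create a second filesystem: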
vidar/# zfs create iscsi1/elefant
vidar/# zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
iscsi1           201K  4.83G  36.5K  /iscsi1
iscsi1/affe     34.9K  4.83G  34.9K  /iscsi1/affe
iscsi1/elefant  34.9K  4.83G  34.9K  /iscsi1/elefant
We see some differences between old-fashioned filesystems and zfs: usable storage is shared among all filesystems in a pool. "iscsi1/affe" has 4.83G available, "iscsi1/elefant" also, as does the master pool filesystem "iscsi1". So why create filesystems at all? Couldn't we just use subdirectories in our master pool filesystem "iscsi1" (mounted on /iscsi1)? The "trick" about zfs filesystems is the possibility to assign options to them, so they can be treated differently. We will see that later. First, let's put some throwaway data on our newly created filesystem.
vidar/iscsi1/affe# mkfile 1g /iscsi1/affe/randomfile
This command creates a 1GB file "randomfile" in the directory /iscsi1/affe. That's big enough for our purpose. "zfs list" now reads:
vidar/# zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
iscsi1          1023M  3.83G  38.2K  /iscsi1
iscsi1/affe     1023M  3.83G  1023M  /iscsi1/affe
iscsi1/elefant  34.9K  3.83G  34.9K  /iscsi1/elefant
1023 megabytes are used by filesystem iscsi1/affe, as expected. Notice also that every other filesystem on that pool can now only allocate 3.83G, as 1023M are taken (compare with the 4.83G above, before we created that big file).
You CAN also look up free space in your zfs filesystems with "df -k", but I wouldn't recommend it: you won't see snapshots, and the numbers can get very big.
vidar/# df -h
Filesystem             size   used  avail capacity  Mounted on
/dev/dsk/c1t0d0s0       77G   6.6G    69G     9%    /
/devices                 0K     0K     0K     0%    /devices
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                   2.0G   980K   2.0G     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
sharefs                  0K     0K     0K     0%    /etc/dfs/sharetab
fd                       0K     0K     0K     0%    /dev/fd
/dev/dsk/c1t1d0s7       79G   3.7G    74G     5%    /var
swap                   2.0G     8K   2.0G     1%    /tmp
swap                   2.0G    32K   2.0G     1%    /var/run
/vol/dev/dsk/c0t0d0/sol_10_910_sparc
                       2.1G   2.1G     0K   100%    /cdrom/sol_10_910_sparc
/hgfs                   16G   4.0M    16G     1%    /hgfs
iscsi1                 4.8G    38K   3.8G     1%    /iscsi1
iscsi1/affe            4.8G  1023M   3.8G    21%    /iscsi1/affe
iscsi1/elefant         4.8G    35K   3.8G     1%    /iscsi1/elefant
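If you want a per-dataset breakdown, stick with "zfs list"; on ZFS versions that support it, the predefined "space" column set also itemizes snapshot usage. A minimal sketch:

vidar/# zfs list -o space                       # per-dataset space breakdown, incl. snapshots
vidar/# zfs get used,available iscsi1/affe      # or query single properties directly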
So let's try our first option: "quota". As you can imagine, "quota" limits storage. You know this, as nearly every mailbox provider imposes a quota on your storage, as do file-hosting providers. First: to set and get options, you use "zfs set" and "zfs get", respectively. So here we define a quota on 'iscsi1/elefant':
vidar/# zfs set quota=1G iscsi1/elefant
vidar/# zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
iscsi1          1023M  3.83G  38.2K  /iscsi1
iscsi1/affe     1023M  3.83G  1023M  /iscsi1/affe
iscsi1/elefant  34.9K  1024M  34.9K  /iscsi1/elefant
Only 1G is left to use at mountpoint /iscsi1/elefant. Note that you may still gobble up 3.83G in /iscsi1/affe, making it impossible to then put 1G into /iscsi1/elefant. So a quota does not guarantee any storage, it only limits it. To guarantee a certain amount of storage, use the option "reservation":
vidar/# zfs set reservation=1G iscsi1/elefant
vidar/# zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
iscsi1          2.00G  2.83G  38.2K  /iscsi1
iscsi1/affe     1023M  2.83G  1023M  /iscsi1/affe
iscsi1/elefant  34.9K  1024M  34.9K  /iscsi1/elefant
Now we have simulated a classical "partition": we reserved the same amount of storage as the quota implies, 1G. The other filesystems only have 2.83G left, as 1G is actually reserved for iscsi1/elefant.
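To inspect or undo these limits later, query the properties with "zfs get" and set them back to "none"; a minimal sketch:

vidar/# zfs get quota,reservation iscsi1/elefant    # show the current limits
vidar/# zfs set quota=none iscsi1/elefant           # lift the quota again
vidar/# zfs set reservation=none iscsi1/elefant     # release the reserved space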
Now, let's try another nice option: compression. Perhaps you are now thinking about compression nightmares on Windows systems, like DoubleSpace, Stacker and all those other parasitic programs which killed performance rather than saving storage. Forget them! zfs compression IS reliable and fast! With today's CPU power, compressing and decompressing objects won't significantly harm your overall performance; it can even boost performance, as compression means less I/O. As with many other zfs options, changing the compression setting only affects newly written files/blocks; previously written uncompressed blocks can still be read. It's transparent to the application: fseek() et al. do not even notice that files are compressed.
vidar/# zfs set compression=on iscsi1/elefant
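zfs also tracks how well compression performs on your data in the read-only "compressratio" property, so you can check whether it pays off:

vidar/# zfs get compression,compressratio iscsi1/elefant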
Logical Volumes
A logical volume is exported as a raw or block device. This type of dataset should only be used under special circumstances; file systems are typically used in most environments. The volume is exported as a block device in /dev/zvol/{dsk,rdsk}/path, where "path" is the name of the volume in the ZFS namespace. The size represents the logical size as exported by the device. By default, a reservation of equal size is created. The size is automatically rounded up to the nearest 128 KB to ensure that the volume has an integral number of blocks, regardless of blocksize.
vidar/dev# zfs create -V 4G iscsi1/volume1
vidar/dev# zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
iscsi1          4.13G   716M  34.9K  /iscsi1
iscsi1/volume1  4.13G  4.83G  26.6K  -
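The volume now behaves like any other block device. As one hedged example, you could put a UFS filesystem on it via the raw device path mentioned above (careful, newfs destroys any data on the volume):

vidar/# newfs /dev/zvol/rdsk/iscsi1/volume1
vidar/# mount /dev/zvol/dsk/iscsi1/volume1 /mnt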
Enabling iSCSI
Enabling iSCSI on a zfs volume is pretty easy.
vidar/# zfs set shareiscsi=on iscsi1/volume1
If you set 'shareiscsi=on' on 'iscsi1' itself, all volumes beneath it will be available as iSCSI targets, since the property is inherited.
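To verify that the target actually showed up, you can ask the Solaris iSCSI target daemon; a minimal sketch, assuming the iscsitgt service is installed:

vidar/# svcadm enable svc:/system/iscsitgt:default    # make sure the target service is running
vidar/# iscsitadm list target -v                      # list targets, incl. the backing store iscsi1/volume1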