The MacView

Virtual Instrumentation from a Mac perspective

Tuesday, June 26, 2007

What ZFS support means

There have been a lot of ramblings about ZFS support in Leopard. Many people have wondered what the big deal is, and how it would help the average Mac user. Personally, I was just waiting for Steve Jobs to say the words "boot ZFS" and my life would have been complete (well, not really, but it would have been pretty cool).

First, a little history (as I remember it, so it may not be completely accurate, but it's close enough). The first filesystem (the way files are stored on disk) on the Macintosh was MFS (Macintosh File System). MFS was very short-lived (I have never actually used it, just heard about it). It did not have a hierarchical structure (no real folders). HFS (Hierarchical File System) replaced MFS pretty quickly. HFS was written when most people still booted off of 3.5" floppy disks (1.44 MB max) and computer RAM was measured in kB, not GB. Then Apple upgraded HFS to HFSPlus. It handled much bigger drives, supported Unicode better, and was just overall a better, more modern filesystem. The last little tweak Apple did to HFSPlus was to add journaling support (so the filesystem handles unexpected power outages better).

Then Sun developed an incredible filesystem, ZFS (Zettabyte File System). Here is what it brings to the table:

  1. Adding more disk space is easy

    Right now, there are one or more logical disks for every physical disk. Most people are familiar with Macintosh HD, which is both a logical disk (Macintosh HD on the desktop) and a physical disk (say, a Western Digital inside their Mac). You can partition a physical disk into multiple logical disks. I have done this with a 250 GB external FireWire drive. I have a different Mac OS version on each of the (now 5) partitions, or logical disks.

    ZFS takes that trend in the reverse direction. You can have multiple physical disks "pooled" together into one logical disk. Imagine the following: you are running out of disk space, so you buy a new, much bigger hard drive. You install it and format it, and now you have two options: (1) migrate everything to the new drive, or (2) use some UNIX command-line tricks to cobble the new disk into the filesystem on the old disk (symlinks, moving the home directory, etc.). Neither of these is very clean, and both make the system feel fragile.

    If your main hard drive had been formatted ZFS, you could do the following instead: Tell the OS to add the new drive to the "pool" of drive space available. That's it. The logical disk, Macintosh HD on your desktop, would now have the full storage capacity of both physical disks in your system. You can add as many drives as you can connect to your machine.
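
    As a rough sketch of the pooling idea (a toy Python model, not how ZFS is implemented), the logical disk can be thought of as the sum of whatever physical disks have been added to it. The class names and the disk sizes below are made up for illustration, and the model ignores any filesystem overhead.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalDisk:
    name: str
    capacity_gb: int


@dataclass
class StoragePool:
    """Toy model of a pool: one logical disk backed by many physical disks."""
    name: str
    disks: list = field(default_factory=list)

    def add_disk(self, disk: PhysicalDisk) -> None:
        # Adding a disk only grows the pool; nothing is migrated or symlinked.
        self.disks.append(disk)

    @property
    def capacity_gb(self) -> int:
        # The logical disk is as large as all of its physical disks combined.
        return sum(d.capacity_gb for d in self.disks)


pool = StoragePool("Macintosh HD")
pool.add_disk(PhysicalDisk("internal drive", 250))   # hypothetical sizes
pool.add_disk(PhysicalDisk("new drive", 500))
print(pool.capacity_gb)  # 750
```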

  2. Speed benefits of RAID, but in a simpler package

    The idea of "pooling" physical disks together for the logical disk has another benefit that is similar to the speed concepts in RAID. When you write data, you split the data between multiple disks. Then when you read it, both disks work as fast as they can to get their piece of the data requested. This is similar in concept to multiple CPUs (cores) in Macs today. If you can have two or more things sharing the load, you can make it faster.
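
    To make the "split the data between multiple disks" idea concrete, here is a minimal sketch in Python. It deals fixed-size chunks out round-robin, which is the basic striping concept; the chunk size and function names are my own assumptions, and real ZFS decides where blocks go in a more sophisticated way.

```python
def stripe(data: bytes, n_disks: int, chunk: int = 4) -> list:
    """Deal fixed-size chunks of data round-robin across n_disks."""
    disks = [[] for _ in range(n_disks)]
    for i in range(0, len(data), chunk):
        disks[(i // chunk) % n_disks].append(data[i:i + chunk])
    return disks


def unstripe(disks: list) -> bytes:
    """Reassemble the original bytes by reading the disks round-robin."""
    out = []
    longest = max(len(d) for d in disks)
    for row in range(longest):
        for d in disks:
            if row < len(d):  # the last row may not reach every disk
                out.append(d[row])
    return b"".join(out)


data = b"The quick brown fox jumps over the lazy dog"
disks = stripe(data, n_disks=2)

# Each disk holds about half of the chunks, so each does about half the work.
print([len(d) for d in disks])
assert unstripe(disks) == data
```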

  3. Failing disks can be detected sooner

    The problem with using multiple disks is that your risk of disk failure goes up dramatically. Instead of a 5% chance that one disk will fail, you have roughly a 10% chance that at least one of the two disks will fail (OK, I don't remember my statistics, or at least didn't want to think too hard about it, but you get the idea). Some RAID schemes solve this problem by storing just enough redundant data on the disks that if one drive goes down, the array can continue to give you correct data (although more slowly) until you replace the drive.
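
    For the curious, the arithmetic is easy to check. This small Python sketch assumes the disks fail independently and uses the same illustrative 5% figure as above (it is not a measured failure rate).

```python
def chance_any_fails(p_single: float, n_disks: int) -> float:
    """Probability that at least one of n independent disks fails."""
    # All n disks survive with probability (1 - p)^n; anything else is a failure.
    return 1 - (1 - p_single) ** n_disks


for n in (1, 2, 4):
    print(n, round(chance_any_fails(0.05, n), 4))
# 1 0.05
# 2 0.0975
# 4 0.1855
```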

    ZFS has a RAID mode, but you have to have all the disks be the same size. Standard ZFS has a cool feature, though. Every block of data that is written has a checksum written with it. Every time a block is read, ZFS checks that the checksum is correct. As soon as it finds a bad block, you can tell it to remove the bad disk from the "pool" (which will copy the data onto the remaining disks) and then replace the bad disk. You get very early detection of a failing disk, and should lose less data.
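
    The checksum-on-every-read idea can be sketched in a few lines of Python. This is a toy block store of my own invention: SHA-256 is used as a stand-in checksum, and the layout (checksum kept right beside the data) is simplified and not meant to mirror ZFS's on-disk format.

```python
import hashlib


class ChecksummedStore:
    """Toy block store that keeps a checksum for every block it writes."""

    def __init__(self):
        self.blocks = {}  # block number -> (data, checksum)

    def write(self, number: int, data: bytes) -> None:
        self.blocks[number] = (data, hashlib.sha256(data).digest())

    def read(self, number: int) -> bytes:
        data, checksum = self.blocks[number]
        # Verify on every read, so corruption is noticed as soon as it is touched.
        if hashlib.sha256(data).digest() != checksum:
            raise IOError(f"block {number} failed its checksum")
        return data


store = ChecksummedStore()
store.write(0, b"important data")
assert store.read(0) == b"important data"

# Simulate a failing disk silently changing a byte of the stored data.
_, original_checksum = store.blocks[0]
store.blocks[0] = (b"importent data", original_checksum)
try:
    store.read(0)
except IOError as err:
    print(err)  # block 0 failed its checksum
```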

  4. Compression is built into the filesystem

    Back in the dark days when I used a PC and DOS, there was a cool program called Stacker (later Microsoft had a very similar feature). It allowed you to reformat a disk and use compression to get more disk space. ZFS brings this back.

    With ZFS, you can turn compression on or off at any moment. While it is on, any data written to disk will be compressed. When off, data is written in raw, uncompressed format. When reading, it will read whatever format was written, compressed or uncompressed.

    You may think that adding compression would slow down the filesystem, but it can actually speed it up. Processor and memory speeds have been growing at a much faster pace than disk speeds, so the little bit of time it takes to compress or uncompress the data is small compared to the time the disk saves by reading or writing fewer bytes.
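
    Here is a minimal Python sketch of the "turn it on or off at any moment" behavior. Each stored block remembers whether it was compressed when written, so reads work either way. The class is hypothetical, and zlib is just a convenient stand-in for whatever compression ZFS actually uses.

```python
import zlib


class CompressingStore:
    """Toy store where compression can be toggled between writes."""

    def __init__(self):
        self.compress = False
        self.blocks = []  # each entry: (is_compressed, payload)

    def write(self, data: bytes) -> int:
        # The current setting only affects blocks written from now on.
        if self.compress:
            self.blocks.append((True, zlib.compress(data)))
        else:
            self.blocks.append((False, data))
        return len(self.blocks) - 1

    def read(self, index: int) -> bytes:
        # Reads handle whichever format the block was written in.
        is_compressed, payload = self.blocks[index]
        return zlib.decompress(payload) if is_compressed else payload

    def stored_size(self, index: int) -> int:
        return len(self.blocks[index][1])


cstore = CompressingStore()
text = b"ZFS " * 1000            # 4000 bytes of very repetitive data
raw = cstore.write(text)         # written with compression off
cstore.compress = True
packed = cstore.write(text)      # written with compression on

assert cstore.read(raw) == cstore.read(packed) == text
print(cstore.stored_size(raw), cstore.stored_size(packed))
```

    The compressed copy takes far fewer bytes for data this repetitive, which means fewer bytes for the disk to move; how much you save on real files depends entirely on the data.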

  5. Entire "Disk" revision history available

    ZFS has a feature called snapshots. This allows you to create a special "directory" (really a file) that is a snapshot of the filesystem at that moment in time. Think of it as a live, whole system Time Machine, without the external disk.

    This snapshot "directory" takes next to no time to create (there's nothing to copy), and takes up very minimal space. Basically, whenever you modify a file, the old one is kept in the snapshot. Any files that have not been changed since the snapshot are shared between the two. It's kind of like a whole-filesystem diff.

    Imagine working on a project, and you get to a crossroads. A decision on direction is needed. You choose what you think is the best direction, but want to be able to get back to the point you're at in your project, just in case. The code may not be at a point where you can put it in source code control, so you right-click on the folder, select Create an Archive, and make a zip file as your "snapshot". If it is a large project, you go to lunch and come back just as it finishes archiving. You continue on and then realize that it is not the direction you really wanted to go. You unzip your snapshot zip file (again, going to lunch), and only then realize that you missed some files in another folder.

    With ZFS, just take regular snapshots of the entire filesystem. They are quick, small and capture the state of the entire filesystem.

    NOTE: Snapshots are not a replacement for backups. If your system gets fried, you lose your snapshots and your data. Backups, like with Time Machine, are extremely valuable and everyone should have a backup strategy.
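
    To show why snapshots are quick and small, here is a toy Python model of the sharing described above. It works at the level of whole files and copies a table of paths, which is a simplification of my own; ZFS does this at the block level and the names here are hypothetical.

```python
class SnapshotFS:
    """Toy filesystem whose snapshots share unchanged file contents."""

    def __init__(self):
        self.live = {}        # path -> contents
        self.snapshots = {}   # snapshot name -> {path -> contents}

    def write(self, path: str, contents: bytes) -> None:
        # Rebinding the path leaves any snapshot's reference to the old bytes intact.
        self.live[path] = contents

    def snapshot(self, name: str) -> None:
        # Copies only the table of paths, never the file contents themselves.
        self.snapshots[name] = dict(self.live)

    def rollback(self, name: str) -> None:
        self.live = dict(self.snapshots[name])


fs = SnapshotFS()
fs.write("main.c", b"version 1")
fs.write("README", b"docs")
fs.snapshot("before-experiment")
fs.write("main.c", b"version 2")   # the experiment

# The snapshot still sees the old file, and shares the untouched one.
assert fs.snapshots["before-experiment"]["main.c"] == b"version 1"
assert fs.snapshots["before-experiment"]["README"] is fs.live["README"]

fs.rollback("before-experiment")   # the experiment did not work out
assert fs.live["main.c"] == b"version 1"
```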

So what is Apple's plan with ZFS? Nobody but Apple (maybe only Steve Jobs) really knows, but here is what we do know. Leopard has had some limited support for ZFS. Apple has stated that Leopard has "read only" support for ZFS (at least the beta they just handed out at WWDC has read-only ZFS). MacRumors has posted that Apple is giving developers a beta of read-write ZFS as a separate download.

What I am looking forward to is when Apple replaces Journaled HFSPlus with ZFS as the default filesystem. That means they still need a non-beta read-write filesystem that you can boot Mac OS X off of (booting off ZFS is a fairly new feature on any OS).

So I hope the clamor for ZFS grows and Apple listens. I love HFSPlus, but I have a feeling I would love ZFS a whole lot more.

Anonymous said...

La gran ventaja de los sistemas raid, pasa por la redundancia manteniendo los tiempos de transferencia, por eso los niveles 0 de raid no son los mas efectivos. Pero ojo, que a veces los raid fallan y la recuperación de sus datos se puede convertir en una pesadilla. Si en un momento dado necesitais recuperar datos de varios discos duros en raid os recomiendo las siguiente web :

Sunday, July 29, 2007 11:23:00 AM

Marc said...

Rough translation of anne's comments: The great benefit of RAID systems is the redundancy and transfer throughput, so RAID level 0 is not the most effective. But watch out: sometimes RAID fails, and the recovery of data can become a nightmare. If at some point it becomes necessary to recover data from various hard disks in RAID, I recommend the following website:

This looks a bit like comment spam to me.

(I did the translation myself and did not use a service, so it may not be exactly correct).

Monday, July 30, 2007 7:12:00 AM

Marcos said...

I believe MFS stood for Macintosh File System. It could have folders, but only 1 level deep, and they only showed up in the Finder.

The big immediate advantage of going from HFS to HFS+ back in the Mac OS 8.1 days was not the long filenames, Unicode, etc. It was the fact that it allowed the drive to allocate more (thus smaller) blocks, so the minimum file size was way smaller. This was great for people with big hard drives, like the 240 MB (yes, megabytes) one I had.

Apple did one more tweak to HFS+ with Mac OS X 10.5 (after this post): It allows you to hard-link folders, and this is an important feature in order to implement Time Machine more efficiently.

Wednesday, November 28, 2007 12:37:00 PM  

The views expressed on this website/weblog are mine alone and do not necessarily reflect the views of my employer.