Having set up ZFS on my server's partitions now, I wanted to get an automated backup going. There are a few posts on automatic ZFS snapshots kicking around (see ZFS Automatic Snapshots in Nevada 100 and Jeff's snapshot script) but the former was heavily tied into Solaris' management system, whilst the latter just created thousands of snapshots nearly indistinguishable from one another.
So, without further fanfare, here is AlBlue's crontab-generated ZFS snapshots (with apologies in advance for Blogger's rendering):
#Pool is called 'Data', search and replace for yours ...
@reboot /usr/sbin/zpool scrub Data
@daily /usr/sbin/zpool scrub Data
@hourly /usr/sbin/zpool status Data | /usr/bin/egrep -q "scrub completed|none requested" && /usr/sbin/zfs snapshot -r Data@AutoH-`date +"\%FT\%H:\%M"`
@daily /usr/sbin/zfs snapshot -r Data@AutoD-`date +"\%F"`
@weekly /usr/sbin/zfs snapshot -r Data@AutoW-`date +"\%Y-\%U"`
@monthly /usr/sbin/zfs snapshot -r Data@AutoM-`date +"\%Y-\%m"`
@yearly /usr/sbin/zfs snapshot -r Data@AutoY-`date +"\%Y"`
# do a spot of housecleaning - somewhat assumes the daily ones have run ..
@hourly /usr/sbin/zpool status Data | /usr/bin/egrep -q "scrub completed|none requested" && /usr/sbin/zfs list -t snapshot -o name | /usr/bin/grep Data@AutoH- | /usr/bin/sort -r | /usr/bin/tail -n +26 | /usr/bin/xargs -n 1 /usr/sbin/zfs destroy -r
@daily /usr/sbin/zfs list -t snapshot -o name | /usr/bin/grep Data@AutoD- | /usr/bin/sort -r | /usr/bin/tail -n +9 | /usr/bin/xargs -n 1 /usr/sbin/zfs destroy -r
@weekly /usr/sbin/zfs list -t snapshot -o name | /usr/bin/grep Data@AutoW- | /usr/bin/sort -r | /usr/bin/tail -n +7 | /usr/bin/xargs -n 1 /usr/sbin/zfs destroy -r
@monthly /usr/sbin/zfs list -t snapshot -o name | /usr/bin/grep Data@AutoM- | /usr/bin/sort -r | /usr/bin/tail -n +14 | /usr/bin/xargs -n 1 /usr/sbin/zfs destroy -r
Caveats:
- Use at your own risk
- This works for a ZFS pool called 'Data', because that's what mine is called. You can use any name you want.
- It's a bit fast-and-ugly, but that was good for me at the time. You could pull it out into a script and parameterise the pool name(s) if you wanted.
- It would be much better to do this on individual file systems which have a particular property set to control which ones get handled; this does everything. Again, it met my needs, but might not meet yours. Tim's approach is to use custom attributes; you could do this to select hourly or other backups.
- This deletes snapshots starting with Auto[HDWMY]-, so don't call your own ones that unless you want to kiss them goodbye.
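A minimal sketch of the "pull it out into a script" idea from the caveats, assuming the same Auto[HDWMY]- naming as the crontab. The function names (`pool_idle`, `take_snapshot`, `prune_snapshots`) and the `ZFS`/`ZPOOL` override variables are hypothetical, chosen only for illustration; treat it as a starting point rather than a tested replacement.

```shell
#!/bin/sh
# Sketch only: a parameterised version of the crontab entries.
# ZFS and ZPOOL can be overridden (e.g. pointed at a stub for a dry run).
# The function names are made up; they are not part of any ZFS tooling.
ZFS=${ZFS:-/usr/sbin/zfs}
ZPOOL=${ZPOOL:-/usr/sbin/zpool}

# pool_idle POOL: succeeds only when no scrub is running on POOL.
pool_idle() {
    "$ZPOOL" status "$1" | grep -E -q "scrub completed|none requested"
}

# take_snapshot POOL TAG DATEFMT, e.g. take_snapshot Data AutoD %F
take_snapshot() {
    "$ZFS" snapshot -r "$1@$2-$(date +"$3")"
}

# prune_snapshots POOL TAG FIRST: destroy everything from the FIRST-th
# newest snapshot onwards, mirroring "sort -r | tail -n +FIRST".
# The grep is anchored so that only POOL's own top-level snapshots match;
# destroy -r then takes the children with them.
prune_snapshots() {
    "$ZFS" list -t snapshot -o name | grep "^$1@$2-" | sort -r |
        tail -n "+$3" | while read -r snap; do
            "$ZFS" destroy -r "$snap"
        done
}
```

An hourly run would then be something like `pool_idle Data && take_snapshot Data AutoH %FT%H:%M && prune_snapshots Data AutoH 26`. Inside a script the `%` signs no longer need the backslashes that cron requires.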
The cleanup routine helps here. It uses some script kung-fu, and relies on the fact that lexicographic sorting of the snapshot names (for each subset of [HDWMY]) is also the time ordering. The plan is to keep 24xH snapshots, then 7xD snapshots, then 5xW snapshots, and never delete yearlies at all. However, (a) there's an off-by-one in my kung-fu, and (b) you really want to let the weekly complete before deleting the last daily. So I added a couple in each case for a safety margin.
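To see what the `sort -r | tail -n +N` idiom actually selects, here is a small sketch that needs no ZFS at all. `doomed` is a hypothetical helper, and the snapshot names are made-up samples in the same format as the dailies. `tail -n +N` starts printing at line N, so the newest N-1 names survive.

```shell
#!/bin/sh
# doomed FIRST: read snapshot names on stdin and print the ones that
# would be destroyed (hypothetical helper; same pipeline as the crontab
# cleanup, minus zfs itself).
doomed() {
    sort -r | tail -n "+$1"
}

# Ten made-up daily snapshot names, 2008-11-13 through 2008-11-22.
names=$(for d in 13 14 15 16 17 18 19 20 21 22; do
    echo "Data@AutoD-2008-11-$d"
done)

# With +9, as in the daily cleanup, lines 9 and 10 of the newest-first
# list are printed, so the eight newest names are kept.
echo "$names" | doomed 9
```

This prints the 2008-11-14 and 2008-11-13 names, the two oldest. By the same arithmetic `+26` keeps 25 hourlies and `+7` keeps six weeklies. The ordering trick only holds because the date formats are zero-padded and put the most significant field first.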
Note that if you run this, the snapshots consume space ... be warned! Even if you delete a file, it could still be kicking around in your snapshot space somewhere (although ZFS will helpfully tell you which snapshots are causing the problems). Here's what my snapshots look like now:
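One way to see where the space has gone is to list snapshots sorted by their `used` property. A small sketch, assuming your `zfs list` supports `-s` for sorting; the `snapshot_usage` wrapper and the `ZFS` override variable are hypothetical conveniences, not part of ZFS.

```shell
#!/bin/sh
# ZFS can be overridden, e.g. pointed at a stub for testing.
ZFS=${ZFS:-/usr/sbin/zfs}

# snapshot_usage POOL: list POOL's snapshots (recursively, no header),
# smallest 'used' first, so the biggest consumers end up at the bottom.
snapshot_usage() {
    "$ZFS" list -H -r -t snapshot -o name,used -s used "$1"
}
```

Calling `snapshot_usage Data` would list every snapshot under the pool. Bear in mind that `used` counts only the space unique to each snapshot, so blocks shared between several snapshots are not attributed to any one of them and the figures will not add up to the total held by snapshots.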
Data/Users/alex@AutoD-2008-11-21
Data/Users/alex@AutoD-2008-11-22
Data/Users/alex@AutoH-2008-11-21T23:40
Data/Users/alex@AutoH-2008-11-22T00:21
Data/Users/alex@AutoH-2008-11-22T00:36
Data/Users/alex@AutoH-2008-11-22T00:41
Data/Users/alex@AutoH-2008-11-22T00:51
Data/Users/alex@AutoH-2008-11-22T01:00
Data/Users/alex@AutoM-2008-11
Data/Users/alex@AutoW-2008-46
Data/Users/alex@AutoY-2008
Please feel free to leave feedback, copy and improve on it. A link back here would also be appreciated; you can keep up to date with all the ZFS tagged posts. My thanks to Mike for pointing out an improvement to the hourly snapshot generation guard.

7 comments:
nice :)
NB I've updated it since the original post - needed to put a -r on the zfs destroy, since otherwise it left child fs hanging around ...
Note that doing a snapshot/destroy whilst a pool scrub is in progress results in the scrub restarting. In this case, if your scrub takes more than 1h to complete (mine takes 1h30, for a 90G load across 15 fs and on a FW400 dual-hard drive), then it goes into a state of permanent scrubbing.
I've updated the script so that it checks the status of the pool prior to executing the hourly snapshot/destroy, which should avoid conflicting with the nightly scrub.
Alex,
I found that the hourly snapshots weren't running on my MBP because there's another condition, "none requested", that's being missed by /usr/bin/grep -q "scrub completed". I changed it to: /usr/bin/egrep -q "scrub completed|none requested" and it now works.
regards,
Mike
Mike,
You're right, of course - a pool which hasn't undergone a scrub yet will say 'none requested'. So until you scrub it a first time, the hourly snapshot won't run unless you amend the guard in the way you've shown. I'll update my original post to include your code because it makes sense to cover all options.
Thanks a lot for the feedback,
Alex
Fantastic script! Is there any way to amend it to work with remote ZFS pools over SSH (say, for a MacBook on a network) or have the cron call fail gracefully if either the local disk or remote server is unavailable?