CouchDB on ZFS

CouchDB was made for next-generation filesystems such as ZFS and BTRFS. First off, unlike PostgreSQL or MySQL, CouchDB can be snapshotted while in production without any flushing or locking trickery, since it uses an append-only B-Tree storage approach. That alone makes it a compelling database choice on ZFS/BTRFS.

Second, CouchDB works hand in hand with ZFS’s block-level compression. ZFS can compress blocks of data as they are written out to disk, but it only does so for new blocks, never retroactively. Now the awesome part: on compaction, CouchDB writes out a brand-new database file, which picks up whatever compression setting is current on the dataset. This means you can try out different gzip compression levels just by compacting your CouchDB database.

Some tips on running CouchDB on ZFS:

1. Use automated snapshots to guard against $admin error. It is painless with ZFS, and CouchDB loves being snapshotted 😉

You can give my little Ruby script a try for daily snapshots; I use it on both Mac OS X and Solaris for automated ZFS snapshot goodness.

zfs snapshot rpool/couchdb@mysnapshot-tuesday
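A weekday-based rotation along these lines could be sketched in plain shell. This is not the Ruby script mentioned above, just an illustration: the `daily-` prefix, the `snapshot_name` and `rotate_snapshot` helpers, and the `DRY_RUN` switch are all made-up names, and it assumes the dataset is `rpool/couchdb`.

```shell
#!/bin/sh
# Sketch only: keeps one snapshot per weekday, so a week of history rolls over.
# DATASET and the "daily-" prefix are assumptions, not the author's naming.
DATASET="${DATASET:-rpool/couchdb}"

# Build a name such as rpool/couchdb@daily-tuesday.
snapshot_name() {
    # $1 = dataset, $2 = weekday (defaults to today, lowercased).
    # Note: date +%A follows the current locale.
    day="${2:-$(date +%A | tr '[:upper:]' '[:lower:]')}"
    echo "$1@daily-$day"
}

# Replace last week's snapshot of the same weekday with a fresh one.
rotate_snapshot() {
    snap="$(snapshot_name "$1" "$2")"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print what would be run.
        echo "zfs destroy $snap"
        echo "zfs snapshot $snap"
        return 0
    fi
    # Destroy the old snapshot only if it exists, then take the new one.
    if zfs list -t snapshot "$snap" >/dev/null 2>&1; then
        zfs destroy "$snap"
    fi
    zfs snapshot "$snap"
}
```

To use it, add a final line such as `rotate_snapshot "$DATASET"` and run the script once a day from cron. Running it with `DRY_RUN=1` first shows the commands without touching the pool.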

2. Try out various gzip compression levels on your CouchDB workload, then re-compact the database so it gets rewritten with the new setting. I personally use gzip-4 for our workload, which for us strikes the right balance between space and CPU utilization.

zfs set compression=gzip-4 rpool/couchdb
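Since the new setting only applies to blocks written afterwards, trying a level means setting the property, compacting, and then looking at the compression ratio. Here is a rough sketch, assuming CouchDB listens on `http://localhost:5984`, the database is called `mydb` (a placeholder), and no authentication is required. `compact_url` and `try_compression` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only: COUCH_URL, the database name and the helper names are assumptions.
COUCH_URL="${COUCH_URL:-http://localhost:5984}"

# URL of CouchDB's compaction endpoint for one database.
compact_url() {
    echo "$COUCH_URL/$1/_compact"
}

# $1 = dataset, $2 = database, $3 = compression value such as gzip-4
try_compression() {
    zfs set compression="$3" "$1" || return 1
    # Compaction rewrites the database file, so the new blocks use the new setting.
    curl -s -X POST -H "Content-Type: application/json" "$(compact_url "$2")" || return 1
    echo
    # Compaction runs in the background; check the ratio once it has finished.
    echo "when compaction is done, run: zfs get compressratio $1"
}
```

For example, `try_compression rpool/couchdb mydb gzip-6`. Keep in mind that existing snapshots still hold the old blocks, so the space they use does not shrink until those snapshots are destroyed.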

3. Set the ZFS dataset’s record size to 4k and turn off atime. Yes, the append-only B-Tree approach is elastic on writes, but with a small 4k record size you can have near-perfect tiny writes.

zfs set recordsize=4k rpool/couchdb
zfs set atime=off rpool/couchdb

7 comments so far

Ruben Fonseca says:

June 29, 2010 at 4:28 am

Hi! Thank you for your post. Sorry if this question is dumb, but how can I test ZFS on OS X? Can’t find any pointers. Thank you!

Till says:

June 29, 2010 at 7:58 am

Hey,

I think we talked in #couchdb the other day.

What I was wondering is, say I have a constant i/o on my CouchDB database. Do I still need to shut CouchDB down for a consistent snapshot? Or how does it work?

I know that e.g. when I snapshot EBS (on AWS/EC2), I need to turn off all activity on the volume for both consistency and semi decent performance for the snapshot process.

Curious, curious!

Thanks for blogging this!
Till

Jan says:

July 2, 2010 at 2:41 am

Hi Till,

you do not need to turn off writes to snapshot CouchDB, ever. For performance reasons perhaps, but never for consistency reasons.

CouchDB’s pure tail-append storage allows hot snapshotting at any time.

Cheers
Jan

J Chris A says:

July 13, 2010 at 5:54 pm

Thanks for this article. It definitely illustrates CouchDB’s unique simplicity.

Dustin Sallings says:

July 13, 2010 at 11:38 pm

Ruben Fonseca: ZFS on OS X: http://dustin.github.com/2009/10/23/mac-zfs.html

I use it for all my media.

victori says:

July 14, 2010 at 4:39 pm

I am using the 10a286 beta bits from the 10.6 pre-releases; http://victori.uploadbooth.com/osx86/zfs-10a286.tar.gz

It is much more functional than the earlier open source implementation that is currently out at this time. My two cents.

oh and I use it for *all* my media on a 3x500G raidz volume (1.3TB of space).

Job Sky says:

September 25, 2018 at 11:44 am

know that e.g. when I snapshot EBS (on AWS/EC2), I need to turn off all activity on the volume for both consistency and semi decent performance for the snapshot process

Letsgetdugg
