Browsing the 2010 June archive
CouchDB was made for next generation filesystems such as ZFS and BTRFS. First off, unlike PostgreSQL or MySQL, CouchDB can be snapshot while in production without any flushing or locking trickery since it uses an append only B-Tree storage approach. That alone makes it a compelling database choice on ZFS/BTRFS.
Second, CouchDB works hand-in-hand with ZFS’s block level compression. ZFS can compress blocks of data as they are being written out to the disk. However, it only does it for new blocks and not retroactively. Now, the awesome part, CouchDB on compaction writes out a brand new database file which can utilize the new gzip compression settings on ZFS. This means you can try out different gzip compression settings just by compacting your CouchDB.
Some tips on running CouchDB on ZFS:
1. Use automated snapshots to prevent $admin error, it is painless with ZFS and CouchDB loves being snapshot
You can give my little ruby script a try for daily snapshots; I use it both on Mac OSX and Solaris for automated ZFS snapshot goodness.
2. Try out various gzip compression schemes on your CouchDB workload, re-compact the database to use the new gzip compression settings. I personally use the gzip-4 compression for our workload which strikes the perfect balance between space and cpu utilization.
3. Set the ZFS dataset to 4k block record size and turn off atime. Yes the B-Tree append only approach is elastic on writes but you can have near perfect tiny writes with a small 4k block record size.
zfs set atime=off rpool/couchdb
I recently had a WD Raptor drive die in a server that hosted our PostgreSQL database. I had a ZFS snapshot strategy setup that sent over ZFS snapshots of the live database to a ZFS mirror for backup purposes. Looked good in theory right? Except, I forgot to do one critical thing, test my backups. Long story short, I had a bunch of snapshots that were useless. Luckily I had offsite nightly PostgreSQL dumps that I did test which were used to seed my development database. So in the end I avoided catastrophic data failure.
With that lesson in mind, I reconfigured our backup system to do it correctly after re-reading the PostgreSQL documentation.
Prerequisite: You must have WAL archiving on and have the archive directory under your database directory. For example if your database is under /rpool/pgdata/db1 configure your archive directory under /rpool/pgdata/db1/archives
Completely optional but I highly suggest you automate your backups; My zbackup ruby script is pretty simple to setup.
This is how my /rpool/pgdata/db1 Looks like:
victori@opensolaris:/# ls /rpool/data/db1 archives pg_clog pg_multixact pg_twophase postmaster.log backup_label pg_hba.conf pg_stat_tmp PG_VERSION postmaster.opts base pg_ident.conf pg_subtrans pg_xlog postmaster.pid global pg_log pg_tblspc postgresql.conf
Source for my pgsnap.sh script.
PGPASSWORD=”mypass” psql -h fab2 postgres -U myuser -c “select pg_start_backup(‘nightly’,true);”
/usr/bin/ruby /opt/zbackup.rb rpool/pgdata 7
PGPASSWORD=”mypass” psql -h fab2 postgres -U myuser -c “select pg_stop_backup();”
rm /rpool/pgdata/db1/archives/*
The process is quite simple. You issue a command to initiate the backup process so PostgreSQL goes into “backup mode.” Second, you do the ZFS snapshot, in this case I am using my zbackup ruby script. Third, you issue another SQL command to PostgrSQL to get out of backup mode. Lastly, since you have the database snapshot you can safely delete your previous WAL archives.
Now, this is all nice and dandy but you should *TEST* your backups, before assuming your backup strategy actually worked.
postgres –singleuser mydb -D /rpool/pgtest/db1
Basically you clone the snapshot and test it by running it under PostgreSQL in single user mode. Once in singleuser mode, test out your backup to make sure it is readable, you can issue a SQL queries to confirm that all is fine with the backup.
ZFS you rock my world
SYEnc is a Scala decoder for the yEnc format that is based on Alex Russ’s Java yEnc Decoder. SYEnc was designed to be used as a library by applications needing to use yEnc to decode data. It should be thread-safe, so don’t worry about using it in a threaded context.
Uploaded to github: http://github.com/victori/syenc
Example:

(4 votes, average: 4.75 out of 5)