Speedy PostgreSQL Parallel Compression Dumps
I used to back up our database using the following statement:
pg_dump -h fab2 -Fc -Z9 somedb > somedb.db
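That -Fc flag is pg_dump's custom format, so a dump taken that way gets restored with pg_restore rather than psql, something along the lines of:
# pg_restore -d somedb somedb.db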
Once our dataset grew into the gigabytes, database dumps started taking a very long time. Today, I stumbled upon yet another awesome blog post by Ted Dzibua mentioning two useful parallel compression utilities. So why not try parallel compression with PostgreSQL dumps?
pbzip2 – Parallel BZIP2: a parallel implementation of BZIP2. BZIP2 is well known for being balls slow, so it gets a big boost from multiple CPUs.
pigz – Parallel GZIP: a parallel implementation of GZIP, written by Mark Adler.
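Both of these will happily grab every core on the box by default. If you want to leave some CPU free for other work, each takes a flag to cap the number of threads; the counts and the somedb.sql filename here are just examples:
# pigz -p 2 somedb.sql
# pbzip2 -p2 somedb.sql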
Time to try this out with our PostgreSQL dump. Here are the resulting times.
• This was done on a quad-core Xeon 2.66GHz machine.
# time pg_dump -U secret -h fab2 somedb | pigz -c > somedb.gz
real 2m7.332s
user 1m16.414s
sys 0m8.233s
# time pg_dump -U secret -h fab2 somedb | pbzip2 -c > somedb.bz2
real 4m14.253s
user 10m35.879s
sys 0m10.904s
The original database was 1.6 gigs. The compressed files came out to…
# du -hs somedb.*
147M somedb.bz2
194M somedb.gz
And just to make this post complete, here's how to pipe the SQL dump back into PostgreSQL:
# dropdb somedb
# createdb somedb
# gzip -d -c somedb.gz | psql somedb
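The bzip2 dump restores the same way, and pbzip2 can decompress its own output in parallel, so something like this should do it:
# pbzip2 -d -c somedb.bz2 | psql somedb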
Since you are using OpenSolaris, why bother with SQL dumps? Leveraging ZFS send/recv with incremental snapshots is imo a much better option. We back up our psql data every minute that way and it works beautifully. And if you really need a SQL dump, you can at any given time spin up a dummy psql postmaster on another server that runs against a snapshot of your backup and dumps your data as SQL somewhere.
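For anyone curious, the incremental flow being described looks roughly like this; the tank/pgdata dataset, the snapshot names, and backuphost are made up for illustration:
# zfs snapshot tank/pgdata@today
# zfs send -i tank/pgdata@yesterday tank/pgdata@today | ssh backuphost zfs recv backup/pgdata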
I need to do SQL dumps for my workstation, which runs OS X. I can't do work against the live database; people actively use the service throughout the day. Even if I could work against the live database, it would be a horrible idea and a disaster just waiting to happen.