Speedy PostgreSQL Parallel Compression Dumps
I used to back up our database with a plain, single-threaded pg_dump statement.
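The original statement isn't preserved here, but a typical single-threaded dump piped through gzip probably looked something like this (the user, host, and database names are borrowed from the timing example later in the post):

```shell
# Plain pg_dump piped into single-threaded gzip -- the compression
# step runs on one core and becomes the bottleneck on big databases.
pg_dump -U secret -h fab2 somedb | gzip -c > somedb.gz
```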
Once our dataset grew into the gigabytes, database dumps started taking a very long time. Today I stumbled upon yet another awesome blog post by Ted Dzibua mentioning two useful parallel compression utilities. So why not try parallel compression with PostgreSQL dumps?
pbzip2 – Parallel BZIP2: a parallel implementation of BZIP2. BZIP2 is well known for being balls slow, so speed it up by throwing multiple CPUs at it.
pigz – Parallel GZIP: a parallel implementation of GZIP, written by Mark Adler.
Time to try this out with our PostgreSQL dump; here are the resulting times.
• This was done on a quad-core Xeon 2.66 GHz machine.
# time pg_dump -U secret -h fab2 somedb | pbzip2 -c > somedb.bz2
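The pigz run is the same pipeline with the compressor swapped out; the exact invocation used for the original timing isn't shown, but with standard pigz options it would be:

```shell
# Same dump, compressed with pigz instead, which spreads gzip
# compression across all available cores.
time pg_dump -U secret -h fab2 somedb | pigz -c > somedb.gz
```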
The original database was 1.6 GB. The compressed files came out to…
And just to make this post complete, here's how to pipe the SQL dump back into PostgreSQL:
# createdb somedb
# gzip -d -c somedb.gz | psql somedb
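Since the pbzip2 run above produced somedb.bz2 rather than a .gz file, the bzip2-flavored restore is analogous. pbzip2 writes ordinary bzip2-compatible output, so either tool can decompress it:

```shell
# Recreate the database, then stream the decompressed dump into psql.
createdb somedb
pbzip2 -d -c somedb.bz2 | psql somedb
# Plain bzip2 works on the same file: bzip2 -d -c somedb.bz2 | psql somedb
```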