Browsing the tag postgresql
Speedy PostgreSQL Parallel Compression Dumps
2 Comments | Filed under administration main open sourceI used to backup our database using the following statement;
Once our dataset grew into the gigabytes, it took a very long time to do database dumps. Today, I stumbled upon yet another awesome blog post done by Ted Dzibua mentioning two useful parallel compression utilities. So why not try parallel compression with PostgreSQL dumps?
pbzip2 – Parallel BZIP2: Parallel implementation of BZIP2. BZIP2 is well known for being balls slow, so speed it up using multiple CPUs.
pigz – Parallel GZIP: Parallel implementation of GZIP written by Mark Adler.
Time to try this out with our PostgreSQL dump, here are the result times.
• This was done on a quad core xeon 2.66ghz machine.
real 2m7.332s
user 1m16.414s
sys 0m8.233s
# time pg_dump -U secret -h fab2 somedb | pbzip2 -c > somedb.bz2
real 4m14.253s
user 10m35.879s
sys 0m10.904s
The original database was 1.6gigs. The compressed files came out to….
147M somedb.bz2
194M somedb.gz
And just to make this post complete, to pipe the SQL dump back into PostgreSQL
# createdb somedb
# gzip -d -c somedb.gz | psql somedb

