Browsing the tag postgresql
I recently learned a hard lesson with PostgreSQL’s long lived connections; persistent connections expand in memory size without giving any back to the OS. I saw our PostgreSQL database go from 1.2gigs (shared_buffers set to 1024m) to 3.8gigs in memory size. Almost 2.6gigs of memory wasted on persistent connections held on by our connection pool. Moral of the story? Configure your connection pool to recycle Postgresql connections.
I recently had a WD Raptor drive die in a server that hosted our PostgreSQL database. I had a ZFS snapshot strategy setup that sent over ZFS snapshots of the live database to a ZFS mirror for backup purposes. Looked good in theory right? Except, I forgot to do one critical thing, test my backups. Long story short, I had a bunch of snapshots that were useless. Luckily I had offsite nightly PostgreSQL dumps that I did test which were used to seed my development database. So in the end I avoided catastrophic data failure.
With that lesson in mind, I reconfigured our backup system to do it correctly after re-reading the PostgreSQL documentation.
Prerequisite: You must have WAL archiving on and have the archive directory under your database directory. For example if your database is under /rpool/pgdata/db1 configure your archive directory under /rpool/pgdata/db1/archives
Completely optional but I highly suggest you automate your backups; My zbackup ruby script is pretty simple to setup.
This is how my /rpool/pgdata/db1 Looks like:
victori@opensolaris:/# ls /rpool/data/db1 archives pg_clog pg_multixact pg_twophase postmaster.log backup_label pg_hba.conf pg_stat_tmp PG_VERSION postmaster.opts base pg_ident.conf pg_subtrans pg_xlog postmaster.pid global pg_log pg_tblspc postgresql.conf
Source for my pgsnap.sh script.
PGPASSWORD=”mypass” psql -h fab2 postgres -U myuser -c “select pg_start_backup(‘nightly’,true);”
/usr/bin/ruby /opt/zbackup.rb rpool/pgdata 7
PGPASSWORD=”mypass” psql -h fab2 postgres -U myuser -c “select pg_stop_backup();”
rm /rpool/pgdata/db1/archives/*
The process is quite simple. You issue a command to initiate the backup process so PostgreSQL goes into “backup mode.” Second, you do the ZFS snapshot, in this case I am using my zbackup ruby script. Third, you issue another SQL command to PostgrSQL to get out of backup mode. Lastly, since you have the database snapshot you can safely delete your previous WAL archives.
Now, this is all nice and dandy but you should *TEST* your backups, before assuming your backup strategy actually worked.
postgres –singleuser mydb -D /rpool/pgtest/db1
Basically you clone the snapshot and test it by running it under PostgreSQL in single user mode. Once in singleuser mode, test out your backup to make sure it is readable, you can issue a SQL queries to confirm that all is fine with the backup.
ZFS you rock my world
Speedy PostgreSQL Parallel Compression Dumps
2 Comments | Filed under administration main open sourceI used to backup our database using the following statement;
Once our dataset grew into the gigabytes, it took a very long time to do database dumps. Today, I stumbled upon yet another awesome blog post done by Ted Dzibua mentioning two useful parallel compression utilities. So why not try parallel compression with PostgreSQL dumps?
pbzip2 – Parallel BZIP2: Parallel implementation of BZIP2. BZIP2 is well known for being balls slow, so speed it up using multiple CPUs.
pigz – Parallel GZIP: Parallel implementation of GZIP written by Mark Adler.
Time to try this out with our PostgreSQL dump, here are the result times.
• This was done on a quad core xeon 2.66ghz machine.
real 2m7.332s
user 1m16.414s
sys 0m8.233s
# time pg_dump -U secret -h fab2 somedb | pbzip2 -c > somedb.bz2
real 4m14.253s
user 10m35.879s
sys 0m10.904s
The original database was 1.6gigs. The compressed files came out to….
147M somedb.bz2
194M somedb.gz
And just to make this post complete, to pipe the SQL dump back into PostgreSQL
# createdb somedb
# gzip -d -c somedb.gz | psql somedb

