I have finally ironed out all of our issues surrounding Varnish on Solaris, thanks to help from sky in #varnish. Apparently Varnish uses a wrapper around connect() that drops stale back-end connections to avoid thread pile-ups if the back end ever dies. Setting connect_timeout to 0 forces Varnish to call connect() directly. This should eliminate all of the back-end 503 errors under Solaris that I mentioned in an earlier blog post.
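
The parameter can also be flipped on a running instance through the management CLI; a quick sketch, assuming the -T 0.0.0.0:8086 admin address from the startup script below:

# set connect_timeout at runtime via the management interface
varnishadm -T 127.0.0.1:8086 param.set connect_timeout 0s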

Here is the varnishd startup script that works for our needs. Our Varnish is a 64-bit binary, hence the “-m64” passed in cc_command so the compiled VCL is built 64-bit as well.


#!/bin/sh

# remove the stale cache file left over from the previous run, if any
rm -f /sessions/varnish_cache.bin

# run varnishd inside the "highfile" project for a larger file descriptor limit
newtask -p highfile /opt/extra/sbin/varnishd \
    -f /opt/extra/etc/varnish/default.vcl -a 72.11.142.91:80 \
    -p listen_depth=8192 \
    -p thread_pool_max=2000 -p thread_pool_min=12 -p thread_pools=4 \
    -p cc_command='cc -Kpic -G -m64 -o %o %s' \
    -s file,/sessions/varnish_cache.bin,4G \
    -p sess_timeout=10s -p max_restarts=12 -p session_linger=50s \
    -p connect_timeout=0s -p obj_workspace=16384 -p sess_workspace=32768 \
    -T 0.0.0.0:8086 -u webservd -F
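
The highfile project passed to newtask is simply a Solaris project that raises the file descriptor limit for varnishd. A minimal sketch of creating such a project; the 32768 cap is an example value, not necessarily what we run:

# hypothetical "highfile" project with a raised file descriptor limit
projadd -c "high fd limit for varnishd" \
    -K "process.max-file-descriptor=(basic,32768,deny)" highfile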

I noticed Varnish had a particular problem with keeping connections around in the CLOSE_WAIT state for a long time, long enough to cause issues. I did some tuning on Solaris's TCP stack so it is more aggressive about closing sockets once the work is done.
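
To check whether you are seeing the same thing, counting the sockets stuck in CLOSE_WAIT is a quick test (Solaris netstat syntax):

# count TCP sockets currently sitting in CLOSE_WAIT
netstat -an -P tcp | grep CLOSE_WAIT | wc -l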

Here are my aggressive TCP settings that force Solaris to close connections within a short window of time, to avoid file descriptor leaks. You can merge the following TCP tweaks with the settings I posted earlier for handling more clients.


# 67 seconds, default 675 seconds
/usr/sbin/ndd -set /dev/tcp tcp_fin_wait_2_flush_interval 67500

# 30 seconds, aggressively close connections - default 4 minutes on Solaris < 8
/usr/sbin/ndd -set /dev/tcp tcp_time_wait_interval 30000

# 1 minute, poll for dead connection - default 2 hours
/usr/sbin/ndd -set /dev/tcp tcp_keepalive_interval 60000
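
Keep in mind that ndd settings do not survive a reboot, so they need to be reapplied at boot. A sketch of an rc script doing that; the script name and rc link are arbitrary choices:

#!/sbin/sh
# example /etc/init.d/nddconfig, linked as /etc/rc2.d/S70nddconfig,
# to reapply the TCP tuning after a reboot
/usr/sbin/ndd -set /dev/tcp tcp_fin_wait_2_flush_interval 67500
/usr/sbin/ndd -set /dev/tcp tcp_time_wait_interval 30000
/usr/sbin/ndd -set /dev/tcp tcp_keepalive_interval 60000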

Last but not least, I have finally swapped out ActiveMQ for the FUSE Message Broker, an "enterprise" distribution of ActiveMQ. Hopefully it won't crash once a week like stock ActiveMQ does for us. The FUSE Message Broker is based on the ActiveMQ 5.3 sources, which fix various memory leaks present in ActiveMQ 5.2, the current stable release as of this writing.

If the FUSE Message Broker does not work out, I might have to give Kestrel a try. Hey, if it works for Twitter, it should work for us... right?