Browsing the topic open source
Clustering Wicket for fun and profit!
Leave a comment | Filed under administration main open source programmingI hate expired sessions, death to all expired sessions. Traditionally a Java servlet container has a fixed session time, a flood of traffic can potentially cause JVM OOM errors if the session time is set too high. I wanted a smart session container that can hold onto sessions for as long as possible and expire sessions only when it is absolutely necessary; A Memcached store would be perfect for this.
There for I recently open sourced the jetty-session-store to solve this problem. With the jetty-session-store you can save your session state to Ehcache, Memcached or the database. State should not be bound to a single JVM, Viva Shared Session Stores!
So now that jetty-session-store is out in the wild you can technically cluster Wicket using just the HttpSessionStore. However, it isn’t very efficient with the way Memcached allocates data in fixed sized cache buckets.
1. Wicket sessions under the HttpSessionStore can get quite large, well over 1Mb in size. A Wicket session not only stores the session state but also the previous serialized pages the user has visited.
2. Serializing and de-serializing a large data structure can get expensive. The HttpSessionStore retains an AccessStackPageMap, which is a list data structure consisting of multiple page map revisions.
So instead of saving one large AccessStackPageMap, I wrote a SecondLevelCacheSessionStore that saves a page map revision per cache entry. This leads to much better cache utilization and a whole lot less serialization on the wire. Not to mention this avoids the whole 1Mb Memcached size limit.
Before you go willy nilly with clustering, read the Wicket render strategies page. Wicket requires session affinity for buffered responses with the default rendering strategy.
Clustering Wicket has never been easier.
Here is an example on how to offload page maps to a hybrid EhCache/Memcached cache. Memcached for long term shared storage while EhCache for short-lived fast cache look ups.
@Override
protected ISessionStore newSessionStore() {
// localhost:11211 — memcached server
// "fabpagestore" — unique appender to avoid key clashes.
// 300 — 5 minute TTL for local ehcache.
return new SecondLevelCacheSessionStore(this,
new CachePageStore(Arrays.asList("localhost:11211"),"fabpagestore",300));
}
}
Here is an example on how to offload page maps to the database.
@Override
protected ISessionStore newSessionStore() {
// "fabpagestore" — unique appender to avoid key clashes.
return new SecondLevelCacheSessionStore(this,new CachePageStore(
new DBCache("jdbc:mysql://foo/mydb", "myname", "mypass", "com.driver.Name", "fabpagestore")));
}
}
Here is my CachePageStore;
import java.util.List;
import org.apache.wicket.Page;
import org.apache.wicket.protocol.http.SecondLevelCacheSessionStore.IClusteredPageStore;
import org.apache.wicket.protocol.http.pagestore.AbstractPageStore;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.base.cache.AsyncMemcache;
import com.base.cache.ICache;
public class CachePageStore extends AbstractPageStore implements IClusteredPageStore {
private ICache cache;
private Logger logger = LoggerFactory.getLogger(CachePageStore.class);
public CachePageStore(final List<String> servers,final String poolName,final int ttl) {
this(new AsyncMemcache(servers, poolName, ttl));
}
public CachePageStore(final ICache cache) {
this.cache = cache;
}
public String getKey(final String sessId,final String pageMapName,final int pageId,final int pageVersion) {
return getKey(sessId,pageMapName,pageId,pageVersion,-1);
}
public String getKey(final String sessId,final String pageMapName,final int pageId,final int pageVersion,final int ajaxVersion) {
String key = sessId+":"+pageMapName+":"+pageId+":"+pageVersion+":"+ajaxVersion;
if(logger.isDebugEnabled()) {
logger.debug("GetKey: "+key);
}
return key;
}
public String storeKey(final String sessionId,final Page page) {
String key = sessionId+":"+page.getPageMapName()+":"+page.getId()+":"+page.getCurrentVersionNumber()+":"+(page.getAjaxVersionNumber()-1);
if(logger.isDebugEnabled()) {
logger.debug("StoreKey: "+key);
}
return key;
}
public boolean containsPage(final String sessionId, final String pageMapName, final int pageId, final int pageVersion) {
return cache.keyExists(getKey(sessionId,pageMapName,pageId,pageVersion));
}
public void destroy() {
}
public <T> Page getPage(final String sessionId, final String pagemap, final int id, final int versionNumber, final int ajaxVersionNumber) {
return (Page) cache.get(getKey(sessionId,pagemap,id,versionNumber,ajaxVersionNumber));
}
public void pageAccessed(final String sessionId, final Page page) {
}
public void removePage(final String sessionId, final String pagemap, final int id) {
String key = getKey(sessionId,pagemap,id,0);
key = key.substring(0,key.lastIndexOf(":"));
for(String k : cache.getKeys()) {
if (k.startsWith(key)) {
cache.remove(k);
}
}
}
public void storePage(final String sessionId, final Page page) {
cache.put(storeKey(sessionId,page), page);
}
public void unbind(final String sessionId) {
for(String key : cache.getKeys()) {
if (key.startsWith(sessionId)) {
cache.remove(key);
}
}
}
}
Update: I feel like a jackass now, I thought I was running this against the stable haproxy build, but in reality this was against haproxy-1.4dev6. DOH! Well on the bright-side, I am helping out the author fix a potentially critical bug. Here is the truss and tcp dump if anyone cares.
Well yet another Solaris specific bug/issue to report. HAProxy resets long running connections. Meaning users on slow bandwidth connections are affected by this. I have sent tcpdumps and logs to the author of HAProxy, hopefully this bug/issue would be resolved. I am writing this as a precautionary warning to other Solaris admins out there.
Here the way to trigger this, see if your service is affected by this.
Result:
–2010-01-20 11:19:29– http://somesite.com/onebigfile.txt
Resolving somesite.com (somesite.com)… 72.11.142.91
Connecting to somesite.com (somesite.com)|72.11.142.91|:84… connected.
HTTP request sent, awaiting response… 200 OK
Length: 3806025 (3.6M)
Saving to: “onebigfile.txt”
7% [====> ] 269,008 20.1K/s in 13s
2010-01-20 11:19:42 (20.1 KB/s) – Read error at byte 269008/3806025 (Connection reset by peer). Retrying.
–2010-01-20 11:19:43– (try: 2) http://somesite.com/onebigfile.txt
Connecting to somesite.com (somesite.com)|72.11.142.91|:84… connected.
HTTP request sent, awaiting response… 200 OK
Length: 3806025 (3.6M)
Saving to: “onebigfile.txt”
4% [==> ] 186,016 20.0K/s eta
/Raging, why are there so many Solaris TCP issues? First Varnish? now HAProxy? ARGHHHHH!@#!@
Speedy PostgreSQL Parallel Compression Dumps
2 Comments | Filed under administration main open sourceI used to backup our database using the following statement;
Once our dataset grew into the gigabytes, it took a very long time to do database dumps. Today, I stumbled upon yet another awesome blog post done by Ted Dzibua mentioning two useful parallel compression utilities. So why not try parallel compression with PostgreSQL dumps?
pbzip2 – Parallel BZIP2: Parallel implementation of BZIP2. BZIP2 is well known for being balls slow, so speed it up using multiple CPUs.
pigz – Parallel GZIP: Parallel implementation of GZIP written by Mark Adler.
Time to try this out with our PostgreSQL dump, here are the result times.
• This was done on a quad core xeon 2.66ghz machine.
real 2m7.332s
user 1m16.414s
sys 0m8.233s
# time pg_dump -U secret -h fab2 somedb | pbzip2 -c > somedb.bz2
real 4m14.253s
user 10m35.879s
sys 0m10.904s
The original database was 1.6gigs. The compressed files came out to….
147M somedb.bz2
194M somedb.gz
And just to make this post complete, to pipe the SQL dump back into PostgreSQL
# createdb somedb
# gzip -d -c somedb.gz | psql somedb
I just pushed up a new version of Satan to GitHub. For the uniformed uninformed Satan is my process reaper for run away unix processes. Satan was designed to work with Solaris’ SMF self-healing properties. Basically, Satan kills while SMF revives. The new version that was pushed up contains HTTP health checks, so Satan now has the ability to kill processes that are not responding back with a HTTP/200 response code.
The motivation behind HTTP health checks was because once a month or so at Fabulously40 our ActiveMQ would break down while still accepting connections, the only way to figure out if it was zombified was to check the HTTP administrator interface. If the ActiveMQ instance was actually knelled over, the administrator interface would come back with a HTTP/500 response code, hence the birth of HTTP health checks.
Here is our Satan configuration file that we use at Fabulously40.
The “args” property might be a bit confusing, it is a snippet of text that Satan looks for in the arguments passed to your application to identify the running process. So for example, if you start your ActiveMQ instance with the following arguments; “java -jar activemq.jar -Dactivemq=8161 -XXXXX” Placing “8161″ in args property would be a good unique identifier for Satan to pick up on.
Satan.watch do |s| s.name = "jvm instances" # name of job s.user = "webservd" # under what user s.group = "webservd" # under what group s.deamon = "java" # deamon binary name to grep for s.args = nil # globally look for specific arguments, optional s.debug = true # if to write out debug information s.safe_mode = false # If in safe mode, satan will not kill ;-( s.interval = 10.seconds # interval to run at to collect statistics s.sleep_after_kill = 1.minute # sleep after killing, satan is tired! s.contact = "victori@fabulously40.com" # admin contact, optional if you want email alerts s.kill_if do |process| process.condition(:cpu) do |cpu| # on cpu condition cpu.name = "50% CPU limit" # name for job cpu.args = "jetty" # make sure this is a jetty process, optional cpu.above = 48.percent # if above certain percentage cpu.times = 5 # how many times we can hit this condition before killing end process.condition(:memory) do |memory| # on memory condition memory.name = "850MB limit" # name for job memory.args = "jetty" # make sure this is a jetty process, optional memory.above = 850.megabytes # limit for memory use memory.times = 5 # how many times we can hit this condition before killing end # ActiveMQ tends to die on us under heavy load so we need the power of satan! process.condition(:http) do |http| # on http condition http.name = "HTTP ActiveMQ Check" # name for job http.args = "8161" # look for specific app arguments # to associate app to URI http.uri = "http://localhost:8161/admin/queues.jsp" # the URI http.times = 5 # how many times before kill end end end
I have finally nailed out all our issues surrounding Varnish on Solaris, thanks to the help of sky from #varnish. Apparently Varnish uses a wrapper around connect() to drop stale connections to avoid thread pileups if the back-end ever dies. Setting connect_timeout to 0 will force Varnish to use connect() directly. This should eliminate all 503 back-end issues under Solaris that I have mentioned in an earlier blog post.
Here is our startup script for varnish that works for our needs. Varnish is a 64-bit binary hence the “-m64″ cc_command passed.
rm /sessions/varnish_cache.bin
newtask -p highfile /opt/extra/sbin/varnishd -f /opt/extra/etc/varnish/default.vcl -a 72.11.142.91:80 -p listen_depth=8192 -p thread_pool_max=2000 -p thread_pool_min=12 -p thread_pools=4 -p cc_command=’cc -Kpic -G -m64 -o %o %s’ -s file,/sessions/varnish_cache.bin,4G -p sess_timeout=10s -p max_restarts=12 -p session_linger=50s -p connect_timeout=0s -p obj_workspace=16384 -p sess_workspace=32768 -T 0.0.0.0:8086 -u webservd -F
I noticed varnish had particular problem of keeping connections around in CLOSE_WAIT state for a long time, enough to cause issues. I did some tuning on Solaris’s TCP stack so it is more aggressive in closing sockets after the work has been done.
Here are my aggressive TCP settings to force Solaris to close off connections in a short duration of time, to avoid file descriptor leaks. You can merge the following TCP tweaks with the settings I have posted earlier to handle more clients.
/usr/sbin/ndd -set /dev/tcp tcp_fin_wait_2_flush_interval 67500
# 30 seconds, aggressively close connections – default 4 minutes on solaris < 8
/usr/sbin/ndd -set /dev/tcp tcp_time_wait_interval 30000
# 1 minute, poll for dead connection - default 2 hours
/usr/sbin/ndd -set /dev/tcp tcp_keepalive_interval 60000
Last but not least, I have finally swapped out ActiveMQ for the FUSE message broker, an “enterprise” ActiveMQ distribution. Hopefully it won’t crash once a week like ActiveMQ does for us. The FUSE message broker is based off of ActiveMQ 5.3 sources that fix various memory leaks found in the current stable release of ActiveMQ 5.2 as of this writing.
If the FUSE message broker does not work out, I might have to give Kestrel a try. Hey, if it worked for twitter, it should work for us…right?
I finally got around to open sourcing our scala memcached implementation that we use at fabulously40 for session storage.
Since wicket sessions can vary greatly in size, using the standard memcached server implementation became impractical due to the slab allocator.
The current code on github lacks the ehcache store and an Actor IoHandler adapter. The internal SMemcached application at fabulously40 uses a private caching API so we can hook up various caching backend storage implementations such as mysql, postgresql, ehcache or even another memcached server. You can grab the TCache project on github that SMemcached uses to unify caching under a single API. This gives SMemcached a lot of flexibility when it comes to caching your data.
fyi. TCache stands for “Tanek” Cache, Tanek means cache in russian.
The project works quite well, but don’t use it in production just yet since there is no data expiration for cached data in the HashMap storage implementation. This is just a technical preview. Do use it in production, this is what we use at Fabulously40

(5 votes, average: 4.80 out of 5)
(2 votes, average: 4.00 out of 5)