Clustering Wicket for fun and profit!
I hate expired sessions; death to all expired sessions. Traditionally, a Java servlet container has a fixed session timeout, and if that timeout is set too high, a flood of traffic can cause JVM OOM errors. I wanted a smart session container that holds onto sessions for as long as possible and expires them only when absolutely necessary. A Memcached store would be perfect for this.
Therefore, I recently open sourced the jetty-session-store to solve this problem. With the jetty-session-store you can save your session state to Ehcache, Memcached or the database. State should not be bound to a single JVM. Viva shared session stores!
So now that jetty-session-store is out in the wild, you can technically cluster Wicket using just the HttpSessionStore. However, that isn't very efficient, given the way Memcached allocates data in fixed-size cache buckets:
1. Wicket sessions under the HttpSessionStore can get quite large, well over 1 MB in size. A Wicket session stores not only the session state but also the serialized pages the user has previously visited.
2. Serializing and de-serializing a large data structure can get expensive. The HttpSessionStore retains an AccessStackPageMap, which is a list data structure consisting of multiple page map revisions.
So instead of saving one large AccessStackPageMap, I wrote a page store for the SecondLevelCacheSessionStore that saves one page map revision per cache entry. This leads to much better cache utilization and a whole lot less serialization on the wire. Not to mention that it avoids Memcached's 1 MB item size limit altogether.
Before you go willy-nilly with clustering, read the Wicket render strategies page. Wicket requires session affinity for buffered responses with the default rendering strategy.
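To make the "one revision per cache entry" idea concrete, here is a minimal sketch. It is not Wicket's code: the `PerRevisionStore` class, its `HashMap`-backed cache and the `MAX_ITEM_BYTES` constant are all made up for illustration, and the key layout is only loosely modelled on the one used further down.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: one cache entry per page revision instead of one big session blob. */
final class PerRevisionStore {
    /** Assumed per-item limit, mirroring the 1 MB Memcached limit mentioned above. */
    static final int MAX_ITEM_BYTES = 1024 * 1024;

    /** Stand-in for a real cache client; a plain map keeps the sketch self-contained. */
    private final Map<String, byte[]> cache = new HashMap<>();

    /** Builds a key from session id, page map name, page id and page version. */
    static String key(String sessionId, String pageMap, int pageId, int version) {
        return sessionId + ":" + pageMap + ":" + pageId + ":" + version;
    }

    /** Stores a single serialized revision; returns false if it would exceed the item limit. */
    boolean store(String sessionId, String pageMap, int pageId, int version, byte[] serializedPage) {
        if (serializedPage.length > MAX_ITEM_BYTES) {
            return false;
        }
        cache.put(key(sessionId, pageMap, pageId, version), serializedPage);
        return true;
    }

    /** Loads one revision, or null if it is not cached. */
    byte[] load(String sessionId, String pageMap, int pageId, int version) {
        return cache.get(key(sessionId, pageMap, pageId, version));
    }

    /** Number of cache entries currently held. */
    int entries() {
        return cache.size();
    }
}
```

Three 400 KB revisions add up to more than 1 MB as a single blob, but each one fits comfortably as its own entry, and a request only has to deserialize the revision it actually needs.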
Clustering Wicket has never been easier.
Here is an example of how to offload page maps to a hybrid EhCache/Memcached cache: Memcached for long-term shared storage, EhCache for short-lived, fast cache lookups.
    @Override
    protected ISessionStore newSessionStore() {
        // localhost:11211: memcached server
        // "fabpagestore": unique appender to avoid key clashes.
        // 300: 5 minute TTL for local ehcache.
        return new SecondLevelCacheSessionStore(this,
            new CachePageStore(Arrays.asList("localhost:11211"), "fabpagestore", 300));
    }
}
Here is an example of how to offload page maps to the database.
    @Override
    protected ISessionStore newSessionStore() {
        // "fabpagestore": unique appender to avoid key clashes.
        return new SecondLevelCacheSessionStore(this, new CachePageStore(
            new DBCache("jdbc:mysql://foo/mydb", "myname", "mypass", "com.driver.Name", "fabpagestore")));
    }
}
Here is my CachePageStore:
import java.util.List;

import org.apache.wicket.Page;
import org.apache.wicket.protocol.http.SecondLevelCacheSessionStore.IClusteredPageStore;
import org.apache.wicket.protocol.http.pagestore.AbstractPageStore;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.base.cache.AsyncMemcache;
import com.base.cache.ICache;

public class CachePageStore extends AbstractPageStore implements IClusteredPageStore {
    private ICache cache;
    private Logger logger = LoggerFactory.getLogger(CachePageStore.class);

    public CachePageStore(final List<String> servers, final String poolName, final int ttl) {
        this(new AsyncMemcache(servers, poolName, ttl));
    }

    public CachePageStore(final ICache cache) {
        this.cache = cache;
    }

    public String getKey(final String sessId, final String pageMapName, final int pageId, final int pageVersion) {
        return getKey(sessId, pageMapName, pageId, pageVersion, -1);
    }

    public String getKey(final String sessId, final String pageMapName, final int pageId, final int pageVersion, final int ajaxVersion) {
        String key = sessId + ":" + pageMapName + ":" + pageId + ":" + pageVersion + ":" + ajaxVersion;
        if (logger.isDebugEnabled()) {
            logger.debug("GetKey: " + key);
        }
        return key;
    }

    public String storeKey(final String sessionId, final Page page) {
        String key = sessionId + ":" + page.getPageMapName() + ":" + page.getId() + ":"
            + page.getCurrentVersionNumber() + ":" + (page.getAjaxVersionNumber() - 1);
        if (logger.isDebugEnabled()) {
            logger.debug("StoreKey: " + key);
        }
        return key;
    }

    public boolean containsPage(final String sessionId, final String pageMapName, final int pageId, final int pageVersion) {
        return cache.keyExists(getKey(sessionId, pageMapName, pageId, pageVersion));
    }

    public void destroy() {
    }

    public <T> Page getPage(final String sessionId, final String pagemap, final int id, final int versionNumber, final int ajaxVersionNumber) {
        return (Page) cache.get(getKey(sessionId, pagemap, id, versionNumber, ajaxVersionNumber));
    }

    public void pageAccessed(final String sessionId, final Page page) {
    }

    public void removePage(final String sessionId, final String pagemap, final int id) {
        String key = getKey(sessionId, pagemap, id, 0);
        key = key.substring(0, key.lastIndexOf(":"));
        for (String k : cache.getKeys()) {
            if (k.startsWith(key)) {
                cache.remove(k);
            }
        }
    }

    public void storePage(final String sessionId, final Page page) {
        cache.put(storeKey(sessionId, page), page);
    }

    public void unbind(final String sessionId) {
        for (String key : cache.getKeys()) {
            if (key.startsWith(sessionId)) {
                cache.remove(key);
            }
        }
    }
}
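The `ICache` interface itself is not shown in this post. As a rough guide, here is a hypothetical stand-in inferred purely from the calls CachePageStore makes (`keyExists`, `get`, `put`, `remove`, `getKeys`), plus an in-memory implementation that is handy for unit tests. The names `SimpleCache`, `InMemoryCache` and `removeByPrefix` are made up, and the real `com.base.cache.ICache` may well look different.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for ICache, inferred from the calls CachePageStore makes. */
interface SimpleCache {
    boolean keyExists(String key);
    Object get(String key);
    void put(String key, Object value);
    void remove(String key);
    Set<String> getKeys();
}

/** In-memory implementation for tests; it has no TTL and no eviction. */
final class InMemoryCache implements SimpleCache {
    private final ConcurrentMap<String, Object> entries = new ConcurrentHashMap<>();

    @Override
    public boolean keyExists(String key) {
        return entries.containsKey(key);
    }

    @Override
    public Object get(String key) {
        return entries.get(key);
    }

    @Override
    public void put(String key, Object value) {
        entries.put(key, value);
    }

    @Override
    public void remove(String key) {
        entries.remove(key);
    }

    /** Returns a snapshot copy, so callers can remove entries while iterating. */
    @Override
    public Set<String> getKeys() {
        return new HashSet<>(entries.keySet());
    }

    /** Same idea as unbind(): drop every key that starts with the given prefix. */
    int removeByPrefix(String prefix) {
        int removed = 0;
        for (String key : getKeys()) {
            if (key.startsWith(prefix)) {
                remove(key);
                removed++;
            }
        }
        return removed;
    }
}
```

One thing the sketch makes easy to see: matching on the bare session id would also match any other session whose id happens to start with the same characters, so passing the prefix with the trailing ":" (for example `"s1:"`) is the stricter choice.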
Update: I feel like a jackass now. I thought I was running this against the stable HAProxy build, but in reality it was haproxy-1.4dev6. DOH! Well, on the bright side, I am helping the author fix a potentially critical bug. Here is the truss and tcpdump if anyone cares.
Well, yet another Solaris-specific bug to report: HAProxy resets long-running connections, which means users on slow connections are affected. I have sent tcpdumps and logs to the author of HAProxy; hopefully this will be resolved. I am writing this as a precautionary warning to other Solaris admins out there.
Here is the way to trigger it, so you can see whether your service is affected.
Result:
--2010-01-20 11:19:29--  http://somesite.com/onebigfile.txt
Resolving somesite.com (somesite.com)... 72.11.142.91
Connecting to somesite.com (somesite.com)|72.11.142.91|:84... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3806025 (3.6M)
Saving to: “onebigfile.txt”
 7% [====>                    ] 269,008     20.1K/s   in 13s
2010-01-20 11:19:42 (20.1 KB/s) - Read error at byte 269008/3806025 (Connection reset by peer). Retrying.
--2010-01-20 11:19:43--  (try: 2)  http://somesite.com/onebigfile.txt
Connecting to somesite.com (somesite.com)|72.11.142.91|:84... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3806025 (3.6M)
Saving to: “onebigfile.txt”
 4% [==>                      ] 186,016     20.0K/s  eta
/Raging, why are there so many Solaris TCP issues? First Varnish? now HAProxy? ARGHHHHH!@#!@
I just pushed a new version of Satan to GitHub. For the uninformed, Satan is my process reaper for runaway Unix processes. Satan was designed to work with Solaris' SMF self-healing properties: basically, Satan kills while SMF revives. The new version adds HTTP health checks, so Satan can now kill processes that do not respond with an HTTP 200 response code.
The motivation for HTTP health checks: once a month or so at Fabulously40, our ActiveMQ would break down while still accepting connections, and the only way to tell whether it was zombified was to check the HTTP administrator interface. If the ActiveMQ instance had actually keeled over, the administrator interface would come back with an HTTP 500 response code. Hence the birth of HTTP health checks.
Here is our Satan configuration file that we use at Fabulously40.
The “args” property might be a bit confusing. It is a snippet of text that Satan looks for in the arguments passed to your application, to identify the running process. For example, if you start your ActiveMQ instance with the arguments “java -jar activemq.jar -Dactivemq=8161 -XXXXX”, then placing “8161” in the args property gives Satan a good unique identifier to pick up on.
Satan.watch do |s|
  s.name             = "jvm instances"            # name of job
  s.user             = "webservd"                 # under what user
  s.group            = "webservd"                 # under what group
  s.deamon           = "java"                     # deamon binary name to grep for
  s.args             = nil                        # globally look for specific arguments, optional
  s.debug            = true                       # if to write out debug information
  s.safe_mode        = false                      # If in safe mode, satan will not kill ;-(
  s.interval         = 10.seconds                 # interval to run at to collect statistics
  s.sleep_after_kill = 1.minute                   # sleep after killing, satan is tired!
  s.contact          = "victori@fabulously40.com" # admin contact, optional if you want email alerts

  s.kill_if do |process|
    process.condition(:cpu) do |cpu|              # on cpu condition
      cpu.name  = "50% CPU limit"                 # name for job
      cpu.args  = "jetty"                         # make sure this is a jetty process, optional
      cpu.above = 48.percent                      # if above certain percentage
      cpu.times = 5                               # how many times we can hit this condition before killing
    end

    process.condition(:memory) do |memory|        # on memory condition
      memory.name  = "850MB limit"                # name for job
      memory.args  = "jetty"                      # make sure this is a jetty process, optional
      memory.above = 850.megabytes                # limit for memory use
      memory.times = 5                            # how many times we can hit this condition before killing
    end

    # ActiveMQ tends to die on us under heavy load so we need the power of satan!
    process.condition(:http) do |http|            # on http condition
      http.name  = "HTTP ActiveMQ Check"          # name for job
      http.args  = "8161"                         # look for specific app arguments
                                                  # to associate app to URI
      http.uri   = "http://localhost:8161/admin/queues.jsp" # the URI
      http.times = 5                              # how many times before kill
    end
  end
end
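For readers who want the gist of the HTTP condition without reading Satan's Ruby source, here is a rough Java sketch of the idea: probe a URI, treat anything other than a 200 as a failure, and signal a kill after a number of failed checks. This is not Satan's actual code. The `HttpHealthCheck` class, its 2 second timeouts, and the choice to reset the counter after a healthy response are all assumptions.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Hypothetical sketch of a "kill after N failed HTTP checks" rule; not Satan's real logic. */
final class HttpHealthCheck {
    private final int maxFailures; // plays the role of http.times in the config above
    private int consecutiveFailures;

    HttpHealthCheck(int maxFailures) {
        if (maxFailures < 1) {
            throw new IllegalArgumentException("maxFailures must be >= 1");
        }
        this.maxFailures = maxFailures;
    }

    /** Returns the HTTP status code, or -1 if the request could not be completed. */
    static int probe(String uri) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(uri).toURL().openConnection();
            conn.setConnectTimeout(2000); // assumed timeout
            conn.setReadTimeout(2000);    // assumed timeout
            return conn.getResponseCode();
        } catch (IOException e) {
            return -1;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    /** Only a 200 counts as healthy. */
    static boolean isHealthy(int statusCode) {
        return statusCode == 200;
    }

    /** Records one probe result; returns true once the process should be killed. */
    boolean record(int statusCode) {
        if (isHealthy(statusCode)) {
            consecutiveFailures = 0; // assumption: a healthy response resets the count
            return false;
        }
        consecutiveFailures++;
        return consecutiveFailures >= maxFailures;
    }
}
```

A watcher loop would call `record(HttpHealthCheck.probe(uri))` once per interval and kill the process when it returns true.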
Ted Dziuba beautifully articulated why deadlines go to crap and seemingly straightforward tasks go out the window. You, sir, have done a public service for us all. Thank you.
What I hate is fording endless rivers of horseshit that are in the way of seemingly simple tasks. And I hate it even more when I have to explain to a non-programmer what I am doing, "building LXML against a different version of libiconv because I think it might be the source of a crash". "But all I asked you to do was parse some documents." Good times.
Since Varnish did not work out on Solaris yet again, I have decided to bite the bullet and write a header normalization patch for Squid 2.7. This patch should produce much better cache hit rates with Squid. Efficiency++
What the patch does:
1. Removes Cache-Control request headers, so clients cannot bypass the cache if it is primed.
2. Normalizes Accept-Encoding headers for a higher cache hit rate.
3. Clears Accept-Encoding headers for content that should not be compressed.
If you have issues patching, here is the patched file. Just replace the default one with it.
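The Accept-Encoding rules (points 2 and 3) boil down to a small pure function. Here is a sketch of that logic in Java, purely to illustrate the decision table; the real patch is C inside Squid. The `AcceptEncodingNormalizer` name is made up, and unlike the patch's `strstr` calls this version lowercases its input first.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical sketch of the header normalization rules described above. */
final class AcceptEncodingNormalizer {

    /**
     * Collapses the many client variants of Accept-Encoding into "gzip", "deflate",
     * or no header at all (an empty Optional), so that the cache only ever sees
     * three variants per URL.
     */
    static Optional<String> normalize(String acceptEncoding, String mimeType) {
        if (acceptEncoding == null) {
            return Optional.empty();
        }
        // Images and video are already compressed, so never ask for an encoded variant.
        if (mimeType != null) {
            String type = mimeType.toLowerCase(Locale.ROOT);
            if (type.contains("image") || type.contains("video")) {
                return Optional.empty();
            }
        }
        String value = acceptEncoding.toLowerCase(Locale.ROOT);
        if (value.contains("gzip")) {
            return Optional.of("gzip");
        }
        if (value.contains("deflate")) {
            return Optional.of("deflate");
        }
        return Optional.empty();
    }
}
```

Like the patch, this is a plain substring match, so a header such as `gzip;q=0` would still be normalized to `gzip`. That trade-off is fine for a cache that cares mostly about hit rate, but it is worth knowing about.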
and the patch: squid-headers-normalization.patch
Update: Fixed a minor memory leak, all good now.
--- src/client_side.c.og 2010-01-20 12:00:56.000000000 -0800
+++ src/client_side.c 2010-01-19 20:35:31.000000000 -0800
@@ -3983,6 +3983,7 @@
         errorAppendEntry(http->entry, err);
         return -1;
     }
+
     /* compile headers */
     /* we should skip request line! */
     if ((http->http_ver.major >= 1) && !httpMsgParseRequestHeader(request, &msg)) {
@@ -3992,10 +3993,59 @@
         err->url = xstrdup(http->uri);
         http->al.http.code = err->http_status;
         http->log_type = LOG_TCP_DENIED;
+
         http->entry = clientCreateStoreEntry(http, method, null_request_flags);
         errorAppendEntry(http->entry, err);
         return -1;
     }
+
+    /*
+     * Normalize Request Cache-Control / If-Modified-Since Headers
+     * Don't let client by-pass the cache if there is cached content.
+     */
+    if(httpHeaderHas(&request->header,HDR_CACHE_CONTROL)) {
+        httpHeaderDelByName(&request->header,"cache-control");
+    }
+
+    /*
+     * Un-comment this if you want Squid to always respond with the request
+     * instead of returning back with a 304 if the cache has not changed.
+     */
+    /*
+    if(httpHeaderHas(&request->header,HDR_IF_MODIFIED_SINCE)) {
+        httpHeaderDelByName(&request->header,"if-modified-since");
+    }*/
+
+    /*
+     * Normalize Accept-Encoding Headers sent from client
+     */
+    if(httpHeaderHas(&request->header,HDR_ACCEPT_ENCODING)) {
+        String val = httpHeaderGetByName(&request->header,"accept-encoding");
+        if(val.buf) {
+            if(strstr(val.buf,"gzip") != NULL) {
+                httpHeaderDelByName(&request->header,"accept-encoding");
+                httpHeaderPutStr(&request->header,HDR_ACCEPT_ENCODING,"gzip");
+            } else if(strstr(val.buf,"deflate") != NULL) {
+                httpHeaderDelByName(&request->header,"accept-encoding");
+                httpHeaderPutStr(&request->header,HDR_ACCEPT_ENCODING,"deflate");
+            } else {
+                httpHeaderDelByName(&request->header,"accept-encoding");
+            }
+        }
+        stringClean(&val);
+    }
+
+    /*
+     * Normalize Accept-Encoding Headers for video/image content
+     */
+    char *mime_type = mimeGetContentType(http->uri);
+    if(mime_type) {
+        if(strstr(mime_type,"image") != NULL || strstr(mime_type,"video") != NULL) {
+            httpHeaderDelByName(&request->header,"accept-encoding");
+        }
+    }
+
+
     /*
      * If we read past the end of this request, move the remaining
      * data to the beginning
Are you running JRuby in production? Do you want distributed file storage for your “enterprise” application? Look no further, MogileFS is here.
MogileFS-Client has compatibility issues with JRuby due to its use of the low-level Socket class. JRuby 1.5-dev does not yet support all the Socket methods, so here is a monkey patch to get the Ruby MogileFS client working on JRuby. Yes, it blocks, but who cares? JRuby has native threads.
This is exactly why I love Ruby; monkey patching.
  def self.mogilefs_new(host,port,timeout=5.0)
    TCPSocket.open(host,port,timeout)
  end
end
class TCPSocket
  attr_accessor :mogilefs_addr, :mogilefs_connected, :mogilefs_size, :mogilefs_tcp_cork
  def self.open(host,port,timeout = 5.0)
    super(host,port.to_i)
  end
  def readable?
    true
  end
  def write_nonblock(data)
    write(data)
  end
  def recv_nonblock(size,arg)
    recv(size,arg)
  end
  def mogilefs_init(host = nil, port = nil)
    true
  end
end
Here is an example test case showing how to get it all to work.
require 'mogilefs'
# jmogilefs.rb is the monkey patch above
# load it after loading mogilefs client.
require 'jmogilefs.rb'
mg = MogileFS::MogileFS.new(:domain=>'testserv',:hosts=>['xxx.xxx.xxx.xxx:6001'])
p mg.get_file_data 'video:100:default.jpg'
p mg.get_paths 'video:100:default.jpg',true
mg.list_keys('video:100')[0].each do |f|
  p f
end
*Update* Patches got accepted into MogileFS Trunk
Just go check out trunk, it has all my patches already included.
http://code.sixapart.com/svn/mogilefs/trunk/
The only thing you still need is my mogstored disk patch, which is still pending. All the fixes for the PostgreSQL and Solaris issues have already been included in trunk.
I fixed a few issues with MogileFS on Solaris. MogileFS should run wonderfully on Solaris with my patches applied.
Directory for all my patches: http://victori.uploadbooth.com/patches
http://victori.uploadbooth.com/patches/solaris-disk-du.patch
This patch fixes mogstored to work with Solaris's df utility.
http://victori.uploadbooth.com/patches/store-max-requests.patch
This patch adds a new feature to the MogileFS Tracker – max_requests.
The default is 0, but it is suggested that you set max_requests to 1000 to avoid memory leaks.
The tracker will hand out the same database handle up to the max_requests limit before expiring the connection and opening a new one. This avoids memory leaks with long-running persistent connections. PostgreSQL has issues with long-lived persistent connections: a connection accumulates a lot of RAM and does not let go until the process/connection is killed off. This patch makes sure that the connection is expired after that many dbh handle requests.
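The max_requests idea is not specific to Perl or MogileFS. Here is a small Java sketch of the same pattern: hand out one shared handle, and replace it after it has been requested a fixed number of times. The `RecyclingHandle` class and its opener/closer callbacks are made up for illustration and are not taken from the patch.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical sketch of the max_requests pattern: recycle a handle after N requests. */
final class RecyclingHandle<T> {
    private final Supplier<T> opener;  // opens a fresh handle, e.g. a DB connection
    private final Consumer<T> closer;  // disposes of an expired handle
    private final int maxRequests;     // 0 disables recycling, like the patch default
    private T current;
    private int served;

    RecyclingHandle(Supplier<T> opener, Consumer<T> closer, int maxRequests) {
        if (maxRequests < 0) {
            throw new IllegalArgumentException("maxRequests must be >= 0");
        }
        this.opener = opener;
        this.closer = closer;
        this.maxRequests = maxRequests;
    }

    /** Returns the current handle, replacing it first if it has served its quota. */
    synchronized T get() {
        boolean expired = maxRequests > 0 && served >= maxRequests;
        if (current == null || expired) {
            if (current != null) {
                closer.accept(current);
            }
            current = opener.get();
            served = 0;
        }
        served++;
        return current;
    }
}
```

With a limit of 3, seven calls to `get()` open three handles and close two; with a limit of 0, a single handle is reused forever, which is the situation the patch is meant to avoid.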
http://victori.uploadbooth.com/patches/mogilefs-sunos-pg.patch
This patch applies the InactiveDestroy argument to avoid the MogileFS Tracker locking up with the PostgreSQL store on Solaris.
http://victori.uploadbooth.com/patches/solaris-mogilefs-full.patch
This is the full patch for all my fixes.
I am slowly migrating our fab40 static asset data to MogileFS. I have imported >300,000 images, no issues with my patches so far.
/ PLUG go make an account on uploadbooth!
Enjoy
I am working on a little Twitter project that uses twitter4r as the client API. Twitter recently changed their API and broke compatibility.
/opt/local/lib/ruby/gems/1.8/gems/mbbx6spp-twitter4r-0.4.0/lib/twitter/client/base.rb:43:in `raise_rest_error': Not Found (Twitter::RESTError)
    from /opt/local/lib/ruby/gems/1.8/gems/mbbx6spp-twitter4r-0.4.0/lib/twitter/client/base.rb:48:in `handle_rest_response'
    from /opt/local/lib/ruby/gems/1.8/gems/mbbx6spp-twitter4r-0.4.0/lib/twitter/client/base.rb:20:in `http_connect'
    from /opt/local/lib/ruby/1.8/net/http.rb:543:in `start'
    from /opt/local/lib/ruby/gems/1.8/gems/mbbx6spp-twitter4r-0.4.0/lib/twitter/client/base.rb:16:in `http_connect'
    from /opt/local/lib/ruby/gems/1.8/gems/mbbx6spp-twitter4r-0.4.0/lib/twitter/client/user.rb:37:in `user'
    from somebot.rb:5
Curse you twitter!
Luckily, Ruby has the concept of monkey patching. Here is the fix to get it all working correctly.
  @@USER_URIS = {
    :info => '/users/show.json',
    :friends => '/statuses/friends.json',
    :followers => '/statuses/followers.json',
  }
end
Shazzam… it works…
I finally got around to open sourcing the Scala Memcached implementation that we use at Fabulously40 for session storage.
Since Wicket sessions can vary greatly in size, using the standard memcached server implementation became impractical because of its slab allocator.
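To see why a slab allocator and wildly varying item sizes do not mix well, here is a rough Java sketch of how slab classes work: chunk sizes grow geometrically, and every item lands in the smallest chunk that fits, so the remainder of that chunk is wasted. The `SlabClasses` name is made up, and the numbers (96 byte base chunk, 1.25 growth factor, 8 byte alignment, 1 MB maximum) are assumptions that only approximate typical memcached defaults.

```java
/** Hypothetical sketch of slab-class sizing; the parameters are illustrative assumptions. */
final class SlabClasses {
    static final long BASE_CHUNK = 96;        // assumed smallest chunk size in bytes
    static final double GROWTH_FACTOR = 1.25; // assumed growth between slab classes
    static final long MAX_CHUNK = 1024 * 1024; // assumed largest chunk (1 MB item limit)

    /** Returns the chunk size an item of the given size lands in, or -1 if it cannot be stored. */
    static long chunkSizeFor(long itemBytes) {
        long size = BASE_CHUNK;
        while (size < itemBytes) {
            if (size >= MAX_CHUNK) {
                return -1; // larger than the biggest slab class
            }
            long next = (long) Math.ceil(size * GROWTH_FACTOR);
            next = (next + 7) / 8 * 8; // round up to a multiple of 8 bytes
            size = Math.min(next, MAX_CHUNK);
        }
        return size;
    }

    /** Bytes lost to internal fragmentation for one item, or -1 if it cannot be stored. */
    static long wastedBytes(long itemBytes) {
        long chunk = chunkSizeFor(itemBytes);
        return chunk < 0 ? -1 : chunk - itemBytes;
    }
}
```

Under these assumptions a 97 byte item occupies a 120 byte chunk, and the absolute waste per item grows with the chunk size, which hurts when sessions range from a few kilobytes to nearly a megabyte.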
The current code on github lacks the ehcache store and an Actor IoHandler adapter. The internal SMemcached application at fabulously40 uses a private caching API so we can hook up various caching backend storage implementations such as mysql, postgresql, ehcache or even another memcached server. You can grab the TCache project on github that SMemcached uses to unify caching under a single API. This gives SMemcached a lot of flexibility when it comes to caching your data.
FYI: TCache stands for “Tanek” Cache; Tanek means cache in Russian.
The project works quite well, but don’t use it in production just yet since there is no data expiration for cached data in the HashMap storage implementation. This is just a technical preview. Do use it in production, this is what we use at Fabulously40
I would be sitting on a gold mine.
SLOC    Directory             SLOC-by-Language (Sorted)
37890   src                   java=37890
5026    contrib-utilities     java=5026
4457    contrib-crud          java=4457
2259    contrib-mootools      java=2259
1235    contrib-generaldao    java=1235
1185    contrib-emailmanager  java=1185
986     jetty-memcache        java=986
961     contrib-cache         java=961
787     jmemcached            java=786,sh=1
640     contrib-thumbnail     java=640
503     contrib-blueprint     java=503
181     contrib-snapshot      java=181
32      contrib-nicedit       java=32

Totals grouped by language (dominant language first):
java:         56141 (100.00%)
sh:               1 (0.00%)

Total Physical Source Lines of Code (SLOC)                = 56,142
Development Effort Estimate, Person-Years (Person-Months) = 13.73 (164.80)
 (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
Schedule Estimate, Years (Months)                         = 1.45 (17.39)
 (Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Estimated Average Number of Developers (Effort/Schedule)  = 9.47
Total Estimated Cost to Develop                           = $ 1,855,214
 (average salary = $56,286/year, overhead = 2.40).
SLOCCount, Copyright (C) 2001-2004 David A. Wheeler
SLOCCount is Open Source Software/Free Software, licensed under the GNU GPL.
SLOCCount comes with ABSOLUTELY NO WARRANTY, and you are welcome to
redistribute it under certain conditions as specified by the GNU GPL license;
see the documentation for details.
Please credit this data as "generated using David A. Wheeler's 'SLOCCount'."

