Letsgetdugg

Random tech jargon


I hate expired sessions. Death to all expired sessions! Traditionally, a Java servlet container has a fixed session timeout, and a flood of traffic can cause JVM out-of-memory errors if that timeout is set too high. I wanted a smart session container that holds onto sessions for as long as possible and expires them only when absolutely necessary. A Memcached store would be perfect for this.

Therefore, I recently open sourced the jetty-session-store to solve this problem. With the jetty-session-store you can save your session state to Ehcache, Memcached, or a database. State should not be bound to a single JVM. Viva shared session stores!

So now that jetty-session-store is out in the wild, you can technically cluster Wicket using just the HttpSessionStore. However, that isn’t very efficient, given that Memcached allocates data in fixed-size cache buckets:

1. Wicket sessions under the HttpSessionStore can get quite large, well over 1 MB in size. A Wicket session stores not only the session state but also the serialized pages the user has previously visited.

2. Serializing and de-serializing a large data structure can get expensive. The HttpSessionStore retains an AccessStackPageMap, which is a list data structure consisting of multiple page map revisions.

So instead of saving one large AccessStackPageMap, I wrote a page store for Wicket’s SecondLevelCacheSessionStore that saves one page map revision per cache entry. This leads to much better cache utilization and a whole lot less serialization on the wire. It also sidesteps Memcached’s default 1 MB item size limit.
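To make the size argument concrete, here is a minimal sketch, not the actual store. BoundedCache is a hypothetical in-memory stand-in for a cache with a 1 MB per-entry limit, and the 300 KB revision size and the "default" page map name are made-up numbers and names for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a cache with a per-entry size limit. */
class BoundedCache {
    private final int maxEntryBytes;
    private final Map<String, byte[]> entries = new HashMap<>();

    BoundedCache(int maxEntryBytes) {
        this.maxEntryBytes = maxEntryBytes;
    }

    /** Stores the value and returns true, or returns false if it exceeds the per-entry limit. */
    boolean put(String key, byte[] value) {
        if (value.length > maxEntryBytes) {
            return false;
        }
        entries.put(key, value);
        return true;
    }

    int size() {
        return entries.size();
    }
}

class PerRevisionDemo {
    /** Key layout following the sessionId:pageMap:pageId:version:ajaxVersion scheme. */
    static String key(String sessionId, String pageMap, int pageId, int version, int ajaxVersion) {
        return sessionId + ":" + pageMap + ":" + pageId + ":" + version + ":" + ajaxVersion;
    }

    public static void main(String[] args) {
        int limit = 1024 * 1024;         // 1 MB per-entry limit
        int revisionBytes = 300 * 1024;  // assumed size of one serialized page revision
        int revisions = 5;

        BoundedCache cache = new BoundedCache(limit);

        // One blob holding every revision: 5 * 300 KB is over the limit, so it is rejected.
        boolean blobStored = cache.put("sess1:blob", new byte[revisionBytes * revisions]);

        // One entry per revision: each 300 KB entry fits on its own.
        int stored = 0;
        for (int v = 0; v < revisions; v++) {
            if (cache.put(key("sess1", "default", 7, v, -1), new byte[revisionBytes])) {
                stored++;
            }
        }
        System.out.println("blob stored: " + blobStored + ", revisions stored: " + stored);
    }
}
```

The single blob is rejected while all five per-revision entries are accepted, which is the whole point of splitting the page map up.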

Before you go willy nilly with clustering, read the Wicket render strategies page. Wicket requires session affinity for buffered responses with the default rendering strategy.

Clustering Wicket has never been easier.

Here is an example of how to offload page maps to a hybrid Ehcache/Memcached cache: Memcached for long-term shared storage, Ehcache for short-lived, fast cache lookups.

public class WebApp extends WebApplication {
        @Override
        protected ISessionStore newSessionStore() {
                // localhost:11211 — memcached server
                // "fabpagestore" — unique appender to avoid key clashes.
                // 300 — 5 minute TTL for local ehcache.
                return new SecondLevelCacheSessionStore(this,
                                new CachePageStore(Arrays.asList("localhost:11211"),"fabpagestore",300));
        }
}

Here is an example of how to offload page maps to the database.

public class WebApp extends WebApplication {
        @Override
        protected ISessionStore newSessionStore() {
                // "fabpagestore" — unique appender to avoid key clashes.
                return new SecondLevelCacheSessionStore(this,new CachePageStore(
                                new DBCache("jdbc:mysql://foo/mydb", "myname", "mypass", "com.driver.Name", "fabpagestore")));
        }
}

Here is my CachePageStore:

package com.base.pagestore;

import java.util.List;

import org.apache.wicket.Page;
import org.apache.wicket.protocol.http.SecondLevelCacheSessionStore.IClusteredPageStore;
import org.apache.wicket.protocol.http.pagestore.AbstractPageStore;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.base.cache.AsyncMemcache;
import com.base.cache.ICache;

public class CachePageStore extends AbstractPageStore implements IClusteredPageStore {
        private ICache cache;
        private Logger logger = LoggerFactory.getLogger(CachePageStore.class);

        public CachePageStore(final List<String> servers,final String poolName,final int ttl) {
                this(new AsyncMemcache(servers, poolName, ttl));
        }

        public CachePageStore(final ICache cache) {
                this.cache = cache;
        }

        public String getKey(final String sessId,final String pageMapName,final int pageId,final int pageVersion) {
                return getKey(sessId,pageMapName,pageId,pageVersion,-1);
        }

        public String getKey(final String sessId,final String pageMapName,final int pageId,final int pageVersion,final int ajaxVersion) {
                String key = sessId+":"+pageMapName+":"+pageId+":"+pageVersion+":"+ajaxVersion;
                if(logger.isDebugEnabled()) {
                        logger.debug("GetKey: "+key);
                }
                return key;
        }

        public String storeKey(final String sessionId,final Page page) {
                String key = sessionId+":"+page.getPageMapName()+":"+page.getId()+":"+page.getCurrentVersionNumber()+":"+(page.getAjaxVersionNumber()-1);
                if(logger.isDebugEnabled()) {
                        logger.debug("StoreKey: "+key);
                }
                return key;
        }

        public boolean containsPage(final String sessionId, final String pageMapName, final int pageId, final int pageVersion) {
                return cache.keyExists(getKey(sessionId,pageMapName,pageId,pageVersion));
        }

        public void destroy() {
        }

        public <T> Page getPage(final String sessionId, final String pagemap, final int id, final int versionNumber, final int ajaxVersionNumber) {
                return (Page) cache.get(getKey(sessionId,pagemap,id,versionNumber,ajaxVersionNumber));
        }

        public void pageAccessed(final String sessionId, final Page page) {
        }

        public void removePage(final String sessionId, final String pagemap, final int id) {
                // Match every stored version of this page, not just version 0.
                // The trailing ":" keeps page id 1 from also matching ids such as 10.
                final String prefix = sessionId+":"+pagemap+":"+id+":";
                for(String k : cache.getKeys()) {
                        if (k.startsWith(prefix)) {
                                cache.remove(k);
                        }
                }
        }

        public void storePage(final String sessionId, final Page page) {
                cache.put(storeKey(sessionId,page), page);
        }

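        public void unbind(final String sessionId) {
                // Include the ":" separator so one session id cannot match
                // another session id that merely starts with the same characters.
                final String prefix = sessionId+":";
                for(String key : cache.getKeys()) {
                        if (key.startsWith(prefix)) {
                                cache.remove(key);
                        }
                }
        }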

}

I’ll start this post off with a quote from IRC:

ivaynberg: you cant build good looking sites with wicket
victori: lies
ivaynberg: or public-facing sites

I have to admit that Wicket appeals more to the “backend” programmer than to the design-conscious front-end developer. For every good-looking Wicket site out there, there are ten abysmal-looking ones. Just look at the Wicket Wiki: it is littered with some dreadfully designed sites (sorry guys, this isn’t personal). You can tell right off the bat that the developers behind these sites care more about OO and clean code than about clean design. To be frank, I don’t even know whether the code behind the listed sites is elegant. However, the fact that the sites are written in Wicket tells me that the developers care about things such as separation of concerns and object-oriented programming.

So, to combat the whole mentality that Wicket can’t scale and that any site done in Wicket must look atrocious, I have decided to compile a list of some awesomely kick-ass public-facing, good-looking Wicket sites.

If you don’t see your site and you feel that it should have made the list, feel free to leave a comment with your site’s URL.

High Traffic Wicket Sites

adscale.de

This site has an Alexa traffic rank of 1,700 and runs on a single Tomcat servlet container. No proxy caches, no fancy clustering, just Tomcat.



vegas.com

Next time someone states that no public-facing sites are ever written in Wicket, point them to vegas.com.





Clean Wicket Sites

kontain.com

The design behind this site is quite good and sets the design bar in my book.



meetmoi.com

Ah, I remember when the developer behind meetmoi dropped by #wicket and stated that he is officially working on it full time with a million dollars in venture capital seed money.



songtexte.com

I don’t know much about this site, aside from the fact that it looks clean and that the author did the original b-side Wicket site that got replaced with WordPress.



memolio.com



fabulously40.com

Disclaimer: this is the site I developed and I think it looks good ;-)



winerevolution.com



islamicdesignhouse.com




I used to back up our database using the following command:

pg_dump -h fab2 -Fc -Z9 somedb > somedb.db

Once our dataset grew into the gigabytes, database dumps took a very long time. Today I stumbled upon yet another awesome blog post by Ted Dziuba mentioning two useful parallel compression utilities. So why not try parallel compression with PostgreSQL dumps?

pbzip2 – Parallel BZIP2: Parallel implementation of BZIP2. BZIP2 is well known for being balls slow, so speed it up using multiple CPUs.

pigz – Parallel GZIP: Parallel implementation of GZIP written by Mark Adler.

Time to try this out with our PostgreSQL dump. Here are the resulting times.

• This was done on a quad-core 2.66 GHz Xeon machine.

# time pg_dump -U secret -h fab2 somedb | pigz -c > somedb.gz

real 2m7.332s
user 1m16.414s
sys 0m8.233s

# time pg_dump -U secret -h fab2 somedb | pbzip2 -c > somedb.bz2

real 4m14.253s
user 10m35.879s
sys 0m10.904s

The original database was 1.6 GB. The compressed files came out to:

# du -hs somedb.*
147M somedb.bz2
194M somedb.gz

And just to make this post complete, here is how to pipe the SQL dump back into PostgreSQL:

# dropdb somedb
# createdb somedb
# gzip -d -c somedb.gz | psql somedb

I just pushed a new version of Satan to GitHub. For the uninformed, Satan is my process reaper for runaway Unix processes. Satan was designed to work with Solaris’ SMF self-healing properties. Basically, Satan kills while SMF revives. The new version adds HTTP health checks, so Satan can now kill processes that do not respond with an HTTP 200 response code.

The motivation behind HTTP health checks: once a month or so at Fabulously40, our ActiveMQ would break down while still accepting connections, and the only way to figure out whether it was zombified was to check the HTTP administrator interface. If the ActiveMQ instance had actually keeled over, the administrator interface would come back with an HTTP 500 response code. Hence the birth of HTTP health checks.

Here is the Satan configuration file that we use at Fabulously40.

The “args” property might be a bit confusing. It is a snippet of text that Satan looks for in the arguments passed to your application to identify the running process. For example, if you start your ActiveMQ instance with the arguments “java -jar activemq.jar -Dactivemq=8161 -XXXXX”, placing “8161” in the args property would give Satan a good unique identifier to pick up on.

Satan.watch do |s|
  s.name = "jvm instances"                # name of job
  s.user = "webservd"                     # under what user
  s.group = "webservd"                    # under what group
  s.deamon = "java"                       # deamon binary name to grep for
  s.args = nil                            # globally look for specific arguments, optional
  s.debug = true                          # if to write out debug information
  s.safe_mode = false                     # If in safe mode, satan will not kill ;-(
  s.interval = 10.seconds                 # interval to run at to collect statistics
  s.sleep_after_kill = 1.minute           # sleep after killing, satan is tired!
  s.contact = "victori@fabulously40.com"  # admin contact, optional if you want email alerts

  s.kill_if do |process|
    process.condition(:cpu) do |cpu|      # on cpu condition
      cpu.name = "50% CPU limit"          # name for job
      cpu.args = "jetty"                  # make sure this is a jetty process, optional
      cpu.above = 48.percent              # if above certain percentage
      cpu.times = 5                       # how many times we can hit this condition before killing
    end

    process.condition(:memory) do |memory|  # on memory condition
      memory.name = "850MB limit"           # name for job
      memory.args = "jetty"                 # make sure this is a jetty process, optional
      memory.above = 850.megabytes          # limit for memory use
      memory.times = 5                      # how many times we can hit this condition before killing
    end

    # ActiveMQ tends to die on us under heavy load so we need the power of satan!
    process.condition(:http) do |http|    # on http condition
      http.name = "HTTP ActiveMQ Check"   # name for job
      http.args = "8161"                  # look for specific app arguments
                                          # to associate app to URI
      http.uri = "http://localhost:8161/admin/queues.jsp" # the URI
      http.times = 5                      # how many times before kill
    end
  end
end
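
The "times" setting on the HTTP condition boils down to counting failed checks before pulling the trigger. The following is a rough sketch of that idea in Java, not Satan's actual code (Satan is written in Ruby). HealthCheckTracker is a hypothetical name, and it assumes that only an HTTP 200 counts as healthy and that a healthy check resets the count; Satan itself may count differently.

```java
/** Hypothetical tracker: decides when a process should be killed after repeated failed checks. */
class HealthCheckTracker {
    private final int maxFailures;
    private int failures = 0;

    HealthCheckTracker(int maxFailures) {
        this.maxFailures = maxFailures;
    }

    /**
     * Records one HTTP check result and returns true once the kill threshold is reached.
     * Assumption: only a 200 status is healthy, and a healthy check resets the count.
     */
    boolean record(int statusCode) {
        if (statusCode == 200) {
            failures = 0;
            return false;
        }
        failures++;
        return failures >= maxFailures;
    }

    public static void main(String[] args) {
        HealthCheckTracker tracker = new HealthCheckTracker(5);
        int[] statuses = {200, 500, 500, 500, 500, 500};
        for (int status : statuses) {
            System.out.println(status + " -> kill? " + tracker.record(status));
        }
    }
}
```

With a threshold of 5, the first four HTTP 500 responses are tolerated and the fifth in a row reports that it is time to kill.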

Ted Dziuba beautifully articulated why deadlines go to crap and seemingly straightforward tasks go out the window. You, sir, have done a public service for us all. Thank you.

What I hate is fording endless rivers of horseshit that are in the way of seemingly simple tasks. And I hate it even more when I have to explain to a non-programmer what I am doing, "building LXML against a different version of libiconv because I think it might be the source of a crash". "But all I asked you to do was parse some documents." Good times.

I needed a thread-safe JSMin library for compressing JavaScript on the fly on UploadBooth, so I took an existing Ruby implementation and made it thread safe. I don’t think there was a license defined when I got it, so I am re-releasing it as-is.

require 'monitor'

class JSMin
  EOF = -1
  include MonitorMixin

  # jsmin — Copy the input to the output, deleting the characters which are
  # insignificant to JavaScript. Comments will be removed. Tabs will be
  # replaced with spaces. Carriage returns will be replaced with linefeeds.
  # Most spaces and linefeeds will be removed.
  # thread safe
  def minimize(jstext)
    synchronize do
      @theA = ""
      @theB = ""
      @current = 0
      @output = ""

      @text = jstext
      @theA = "\n"
      action(3)
      while (@theA != JSMin::EOF)
          case @theA
          when " "
              if (isAlphanum(@theB))
                  action(1)
              else
                  action(2)
              end
          when "\n"
              case (@theB)
              when "{","[","(","+","-"
                  action(1)
              when " "
                  action(3)
              else
                  if (isAlphanum(@theB))
                      action(1)
                  else
                      action(2)
                  end
              end
          else
              case (@theB)
              when " "
                  if (isAlphanum(@theA))
                      action(1)
                  else
                      action(3)
                  end
              when "\n"
                  case (@theA)
                  when "}","]",")","+","-","\"","\\", "'", '"'
                      action(1)
                  else
                      if (isAlphanum(@theA))
                          action(1)
                      else
                          action(3)
                      end
                  end
              else
                  action(1)
              end
          end
      end
      @output
    end
  end
 
  private
  # isAlphanum — return true if the character is a letter, digit, underscore,
  # dollar sign, or non-ASCII character
  def isAlphanum(c)
     return false if !c || c == JSMin::EOF
     return ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') ||
             (c >= 'A' && c <= 'Z') || c == '_' || c == '$' ||
             c == '\\' || c.ord > 126)
  end

  # get — return the next character from stdin. Watch out for lookahead. If
  # the character is a control character, translate it to a space or linefeed.
  # thread safe
  def get
    return JSMin::EOF if @current>(@text.length-1)
    c = @text[@current]
    @current += 1
    c = c.chr
    return c if (c >= " " || c == "\n" || c.unpack("c") == JSMin::EOF)
    return "\n" if (c == "\r")
    return " "
  end

  # Get the next character without getting it.
  def peek
      lookaheadChar = @text[@current]
      return lookaheadChar.chr
  end

  # mynext — get the next character, excluding comments.
  # peek() is used to see if a '/' is followed by a '/' or '*'.
  def mynext
      c = get
      if (c == "/")
          if(peek == "/")
              while(true)
                  c = get
                  if (c <= "\n")
                  return c
                  end
              end
          end
          if(peek == "*")
              get
              while(true)
                  case get
                  when "*"
                     if (peek == "/")
                          get
                          return " "
                      end
                  when JSMin::EOF
                      raise "Unterminated comment"
                  end
              end
          end
      end
      return c
  end

  # action — do something! What you do is determined by the argument: 1
  # Output A. Copy B to A. Get the next B. 2 Copy B to A. Get the next B.
  # (Delete A). 3 Get the next B. (Delete B). action treats a string as a
  # single character. Wow! action recognizes a regular expression if it is
  # preceded by ( or , or =.
  def action(a)
      if(a==1)
          @output << @theA
      end
      if(a==1 || a==2)
          @theA = @theB
          if (@theA == "'" || @theA == "\"")
              while (true)
                  @output << @theA
                  @theA = get
                  break if (@theA == @theB)
                  raise "Unterminated string literal" if (@theA <= "\n")
                  if (@theA == "\\")
                      @output << @theA
                      @theA = get
                  end
              end
          end
      end
      if(a==1 || a==2 || a==3)
          @theB = mynext
          if (@theB == "/" && (@theA == "(" || @theA == "," || @theA == "=" ||
                               @theA == ":" || @theA == "[" || @theA == "!" ||
                               @theA == "&" || @theA == "|" || @theA == "?" ||
                               @theA == "{" || @theA == "}" || @theA == ";" ||
                               @theA == "\n"))
              @output << @theA
              @output << @theB
              while (true)
                  @theA = get
                  if (@theA == "/")
                      break
                  elsif (@theA == "\\")
                      @output << @theA
                      @theA = get
                  elsif (@theA <= "\n")
                      raise "Unterminated RegExp Literal" + @output
                  end
                  @output << @theA
              end
              @theB = mynext
          end
      end
  end
end


http://github.com/victori/perlbal-plugin-mogilefs

Key features

- Asynchronous, does not stall the Perlbal event loop.
- Converts URL paths to MogileFS fetch keys.
- Failover to filesystem if key fetch failed.
- Pretty statistics in Perlbal’s Management console.

It’s freaking awesome ;-)

On a side note, I have also updated my other two Perlbal plugins.

http://github.com/victori/perlbal-plugin-stickysessions

- Session affinity via Cookie.

http://github.com/victori/perlbal-plugin-backendheaders

- Appending Backend information on the served response.


*Update* Patches got accepted into MogileFS Trunk ;-)

Just go check out trunk, it has all my patches already included.

http://code.sixapart.com/svn/mogilefs/trunk/

The only thing you still need is my mogstored disk patch, which is still pending. All the fixes for the PostgreSQL and Solaris issues have already been included in trunk.


I fixed a few issues with MogileFS and Solaris. MogileFS should run wonderfully on Solaris with my patches applied.

Directory for all my patches: http://victori.uploadbooth.com/patches

http://victori.uploadbooth.com/patches/solaris-disk-du.patch

This patch fixes mogstored to work with Solaris’s df utility.

http://victori.uploadbooth.com/patches/store-max-requests.patch

This patch adds a new feature to the MogileFS Tracker – max_requests.

The default is 0, but I suggest setting max_requests to 1000 to avoid memory leaks.

The tracker will hand out the same database handle up to the max_requests limit before expiring the connection and opening a new one. This avoids memory leaks with long-running persistent connections. PostgreSQL has issues with long-lived persistent connections: the backend accumulates a lot of RAM and does not let go until the process/connection is killed off. This patch makes sure that the connection is expired after that many dbh handle requests.
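
The recycling idea can be sketched in a few lines. This is a Java illustration of the concept only, not the patch itself (the tracker is Perl). RecyclingHandle is a hypothetical name, and treating a limit of 0 as "never recycle" is my assumption about what the default means.

```java
import java.util.function.Supplier;

/** Hypothetical sketch: hand out a shared handle and replace it after maxRequests uses. */
class RecyclingHandle<T> {
    private final Supplier<T> factory;
    private final int maxRequests;  // assumption: 0 means never recycle
    private T handle;
    private int requests = 0;

    RecyclingHandle(Supplier<T> factory, int maxRequests) {
        this.factory = factory;
        this.maxRequests = maxRequests;
    }

    /** Returns the current handle, creating a fresh one once the limit has been reached. */
    synchronized T get() {
        if (handle == null || (maxRequests > 0 && requests >= maxRequests)) {
            handle = factory.get();  // a real implementation would also close the old handle
            requests = 0;
        }
        requests++;
        return handle;
    }

    public static void main(String[] args) {
        int[] created = {0};
        RecyclingHandle<String> db = new RecyclingHandle<>(() -> "conn-" + (++created[0]), 3);
        for (int i = 0; i < 7; i++) {
            System.out.println(db.get());
        }
    }
}
```

With a limit of 3, the seven requests are served by conn-1 three times, conn-2 three times, and then conn-3, so no single connection lives long enough to bloat.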

http://victori.uploadbooth.com/patches/mogilefs-sunos-pg.patch

This patch applies the InactiveDestroy argument to avoid the MogileFS Tracker locking up with the PostgreSQL store on Solaris.

http://victori.uploadbooth.com/patches/solaris-mogilefs-full.patch

This is the full patch for all my fixes.

I am slowly migrating our fab40 static asset data to MogileFS. I have imported >300,000 images, no issues with my patches so far.

PLUG: go make an account on UploadBooth!

Enjoy ;-)

I just received my “Guide to Open-Source Operating Systems” comparing Solaris with Linux from Sun’s marketing department. Here are some of the facts that made me cringe due to blatant lying and half truths. Hey Sun, don’t let the facts get in your way.

Believe it or not but this is actually verbatim from the guide.

• Solaris runs on more hardware platforms.

• Solaris is supported by more applications.

• Solaris holds performance and price/performance world records that demonstrate its speed and scalability on a variety of systems.

• Solaris is supported by Sun, the company dedicated to UNIX for more than two decades.

1. Let’s see, the first fact is just blatant lying. Last I checked, Linux supported IA-32, MIPS, x86-64, SPARC, DEC Alpha, Itanium, PowerPC, ARM, m68k, PA-RISC, s390, SuperH, M32R, and many more platforms, while Solaris only supports SPARC, IA-32, and x86-64. Does anyone at Sun’s marketing department care to fact-check?

2. Depends on your definition of “supported.” Marketing is most likely referring to commercial support. I don’t have the facts to back this up, but I doubt this holds true against Linux in 2009; maybe they had a case back in 1999. The majority of open source applications are developed against Linux, and Solaris compatibility is just an afterthought.

3. You win http://www.tpc.org/tpcc/results/tpcc_perf_results.asp

Sun develops some of the best hardware and software on the market, but their marketing department is a disaster. There can only be one Steve Jobs and his reality distortion field.

Once again I have been blindsided by yet another conservative out-of-the-box setting. IPFilter is tuned way too conservatively when it comes to its state table size.

Here is how you can tell if you’re hitting any issues: run ipfstat and check for lost packets.

victori@opensolaris:~# ipfstat | grep lost
fragment state(in): kept 0 lost 0 not fragmented 0
fragment state(out): kept 0 lost 0 not fragmented 0
packet state(in): kept 798 lost 100
packet state(out): kept 612 lost 234

Notice that the packet state(in) and packet state(out) lines have non-zero lost counts. This means IPFilter has been dropping client connections. Bummer.
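
If you would rather have a monitoring job watch for this than eyeball the output, the check is easy to automate. Below is a small sketch that assumes the output format shown above. IpfLostCheck is a hypothetical helper, and the sample text is hard-coded here rather than read from a live ipfstat run.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that sums the "lost" counters from ipfstat output. */
class IpfLostCheck {
    // Matches lines such as "packet state(in): kept 798 lost 100".
    private static final Pattern LOST =
            Pattern.compile("packet state\\((in|out)\\):.*?lost (\\d+)");

    /** Returns the total "lost" count across the packet state lines. */
    static long totalLost(String ipfstatOutput) {
        long total = 0;
        Matcher m = LOST.matcher(ipfstatOutput);
        while (m.find()) {
            total += Long.parseLong(m.group(2));
        }
        return total;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "fragment state(in): kept 0 lost 0 not fragmented 0",
                "fragment state(out): kept 0 lost 0 not fragmented 0",
                "packet state(in): kept 798 lost 100",
                "packet state(out): kept 612 lost 234");
        System.out.println("lost packet states: " + totalLost(sample));
    }
}
```

Anything above zero means the state table is overflowing and it is time to raise the limits.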

The default settings are quite conservative.

victori@opensolaris:~# ipf -T list | grep fr_state
fr_statemax min 0x1 max 0x7fffffff current 4096
fr_statesize min 0x1 max 0x7fffffff current 5002

You need to shut down IPFilter and apply larger table size limits.

victori@opensolaris:~# svcadm disable ipfilter
victori@opensolaris:~# /usr/sbin/ipf -T fr_statemax=18963,fr_statesize=27091

Let’s confirm that it works.

victori@opensolaris:~# ipf -T list | grep fr_state
fr_statemax min 0x1 max 0x7fffffff current 18963
fr_statesize min 0x1 max 0x7fffffff current 27091

Awesome, now all we need to do is enable IPFilter, and no more lost packets.

victori@opensolaris:~# svcadm enable ipfilter

To make this persistent across reboots, edit ipf.conf:

victori@opensolaris:~# vi /usr/kernel/drv/ipf.conf
name="ipf" parent="pseudo" instance=0 fr_statemax=18963 fr_statesize=27091;

Then update the contents

victori@opensolaris:~# devfsadm -i ipf

This can be applied to any OS that uses IPFilter.