Letsgetdugg

Random tech jargon

Browsing the 2009 December archive

Ted Dziuba beautifully articulated why deadlines go to crap and seemingly straightforward tasks go out the window. You, sir, have done a public service for us all. Thank you.

What I hate is fording endless rivers of horseshit that are in the way of seemingly simple tasks. And I hate it even more when I have to explain to a non-programmer what I am doing, "building LXML against a different version of libiconv because I think it might be the source of a crash". "But all I asked you to do was parse some documents." Good times.

I needed a thread-safe JSMin library for compressing JavaScript on the fly on UploadBooth, so I took an existing Ruby implementation and made it thread-safe. I don’t think there was a license defined when I got it, so I am re-releasing it as-is.

require 'monitor'

class JSMin
  EOF = -1
  include MonitorMixin

  # jsmin — Copy the input to the output, deleting the characters which are
  # insignificant to JavaScript. Comments will be removed. Tabs will be
  # replaced with spaces. Carriage returns will be replaced with linefeeds.
  # Most spaces and linefeeds will be removed.
  # thread safe
  def minimize(jstext)
    synchronize do
      @theA = ""
      @theB = ""
      @current = 0
      @output = ""

      @text = jstext
      @theA = "\n"
      action(3)
      while (@theA != JSMin::EOF)
          case @theA
          when " "
              if (isAlphanum(@theB))
                  action(1)
              else
                  action(2)
              end
          when "\n"
              case (@theB)
              when "{","[","(","+","-"
                  action(1)
              when " "
                  action(3)
              else
                  if (isAlphanum(@theB))
                      action(1)
                  else
                      action(2)
                  end
              end
          else
              case (@theB)
              when " "
                  if (isAlphanum(@theA))
                      action(1)
                  else
                      action(3)
                  end
              when "\n"
                  case (@theA)
                  when "}","]",")","+","-","\"","\\", "'", '"'
                      action(1)
                  else
                      if (isAlphanum(@theA))
                          action(1)
                      else
                          action(3)
                      end
                  end
              else
                  action(1)
              end
          end
      end
      @output
    end
  end
 
  private
  # isAlphanum — return true if the character is a letter, digit, underscore,
  # dollar sign, or non-ASCII character
  def isAlphanum(c)
     return false if !c || c == JSMin::EOF
     return ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') ||
             (c >= 'A' && c <= 'Z') || c == '_' || c == '$' ||
             c == '\\' || c.ord > 126)
  end

  # get — return the next character from stdin. Watch out for lookahead. If
  # the character is a control character, translate it to a space or linefeed.
  # thread safe
  def get
    return JSMin::EOF if @current>(@text.length-1)
    c = @text[@current]
    @current += 1
    c = c.chr
    return c if (c >= " " || c == "\n" || c.unpack("c") == JSMin::EOF)
    return "\n" if (c == "\r")
    return " "
  end

  # Get the next character without getting it.
  def peek
      lookaheadChar = @text[@current]
      return lookaheadChar.chr
  end

  # mynext — get the next character, excluding comments.
  # peek() is used to see if a '/' is followed by a '/' or '*'.
  def mynext
      c = get
      if (c == "/")
          if(peek == "/")
              while(true)
                  c = get
                  if (c <= "\n")
                      return c
                  end
              end
          end
          if(peek == "*")
              get
              while(true)
                  case get
                  when "*"
                     if (peek == "/")
                          get
                          return " "
                      end
                  when JSMin::EOF
                      raise "Unterminated comment"
                  end
              end
          end
      end
      return c
  end

  # action — do something! What you do is determined by the argument: 1
  # Output A. Copy B to A. Get the next B. 2 Copy B to A. Get the next B.
  # (Delete A). 3 Get the next B. (Delete B). action treats a string as a
  # single character. Wow! action recognizes a regular expression if it is
  # preceded by ( or , or =.
  def action(a)
      if(a==1)
          @output << @theA
      end
      if(a==1 || a==2)
          @theA = @theB
          if (@theA == "'" || @theA == "\"")
              while (true)
                  @output << @theA
                  @theA = get
                  break if (@theA == @theB)
                  raise "Unterminated string literal" if (@theA <= "\n")
                  if (@theA == "\\")
                      @output << @theA
                      @theA = get
                  end
              end
          end
      end
      if(a==1 || a==2 || a==3)
          @theB = mynext
          if (@theB == "/" && (@theA == "(" || @theA == "," || @theA == "=" ||
                               @theA == ":" || @theA == "[" || @theA == "!" ||
                               @theA == "&" || @theA == "|" || @theA == "?" ||
                               @theA == "{" || @theA == "}" || @theA == ";" ||
                               @theA == "\n"))
              @output << @theA
              @output << @theB
              while (true)
                  @theA = get
                  if (@theA == "/")
                      break
                  elsif (@theA == "\\")
                      @output << @theA
                      @theA = get
                  elsif (@theA <= "\n")
                      raise "Unterminated RegExp Literal" + @output
                  end
                  @output << @theA
              end
              @theB = mynext
          end
      end
  end
end
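
The thread safety above comes entirely from MonitorMixin: minimize keeps its scratch state in instance variables, so calls on a shared instance have to run one at a time. Here is a minimal sketch of the same pattern in isolation. The Reverser class is a hypothetical stand-in, not part of JSMin.

```ruby
require 'monitor'

# Hypothetical example class (not part of JSMin). Like JSMin, it keeps
# scratch state in an instance variable, so concurrent calls on one shared
# instance must be serialized.
class Reverser
  include MonitorMixin

  def reverse(text)
    synchronize do
      @output = ""
      text.each_char do |ch|
        @output = ch + @output
        Thread.pass # invite a context switch; harmless inside the monitor
      end
      @output
    end
  end
end

shared  = Reverser.new
inputs  = %w[alpha bravo charlie delta]
threads = inputs.map { |word| Thread.new { shared.reverse(word) } }
results = threads.map(&:value) # value joins the thread and returns its result
puts results.join(" ")
```

Without the synchronize block, the threads would interleave their writes to @output and the results could come back scrambled.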


Since Varnish did not work out on Solaris yet again, I have decided to bite the bullet and write a header normalization patch for Squid 2.7. This patch should produce much better cache hit rates with Squid. Efficiency++

What the patch does:

1. Removes Cache-Control request headers, so clients cannot bypass the cache when it is primed.
2. Normalizes Accept-Encoding headers for a higher cache hit rate.
3. Clears Accept-Encoding headers for content that should not be compressed, such as image, video, and audio.

and the patch: squid-headers-normalization.patch

Update: Fixed a minor memory leak, all good now.
Update 2: Added audio exception to strip accept-encoding.

--- src/client_side.c.og 2010-01-20 12:00:56.000000000 -0800
+++ src/client_side.c 2010-01-19 20:35:31.000000000 -0800
@@ -3983,6 +3983,7 @@
         errorAppendEntry(http->entry, err);
         return -1;
     }
+
     /* compile headers */
     /* we should skip request line! */
     if ((http->http_ver.major >= 1) && !httpMsgParseRequestHeader(request, &msg)) {
@@ -3992,10 +3993,59 @@
         err->url = xstrdup(http->uri);
         http->al.http.code = err->http_status;
         http->log_type = LOG_TCP_DENIED;
+
         http->entry = clientCreateStoreEntry(http, method, null_request_flags);
         errorAppendEntry(http->entry, err);
         return -1;
     }
+
+    /*
+     * Normalize Request Cache-Control / If-Modified-Since Headers
+     * Don't let client by-pass the cache if there is cached content.
+     */
+    if(httpHeaderHas(&request->header,HDR_CACHE_CONTROL)) {
+        httpHeaderDelByName(&request->header,"cache-control");
+    }
+
+    /*
+     * Un-comment this if you want Squid to always respond with the request
+     * instead of returning back with a 304 if the cache has not changed.
+     */
+    /*
+    if(httpHeaderHas(&request->header,HDR_IF_MODIFIED_SINCE)) {
+        httpHeaderDelByName(&request->header,"if-modified-since");
+    }*/
+
+    /*
+     * Normalize Accept-Encoding Headers sent from client
+     */
+    if(httpHeaderHas(&request->header,HDR_ACCEPT_ENCODING)) {
+        String val = httpHeaderGetByName(&request->header,"accept-encoding");
+        if(val.buf) {
+            if(strstr(val.buf,"gzip") != NULL) {
+                httpHeaderDelByName(&request->header,"accept-encoding");
+                httpHeaderPutStr(&request->header,HDR_ACCEPT_ENCODING,"gzip");
+            } else if(strstr(val.buf,"deflate") != NULL) {
+                httpHeaderDelByName(&request->header,"accept-encoding");
+                httpHeaderPutStr(&request->header,HDR_ACCEPT_ENCODING,"deflate");
+            } else {
+                httpHeaderDelByName(&request->header,"accept-encoding");
+            }
+        }
+        stringClean(&val);
+    }
+
+    /*
+     * Normalize Accept-Encoding Headers for video/image content
+     */
+    char *mime_type = mimeGetContentType(http->uri);
+    if(mime_type) {
+        if(strstr(mime_type,"image") != NULL || strstr(mime_type,"video") != NULL || strstr(mime_type,"audio") != NULL) {
+            httpHeaderDelByName(&request->header,"accept-encoding");
+        }
+    }
+
+
     /*
      * If we read past the end of this request, move the remaining
      * data to the beginning
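
To make the rules easier to follow, here is a rough restatement of the patch's decision logic in Ruby. It is an illustration only, not Squid code. The function name normalize_request_headers is made up, headers are assumed to be a plain Hash with lowercase names, and mime_type stands in for whatever Squid's mimeGetContentType lookup returns for the URL.

```ruby
# Illustrative restatement of the patch's header rules (not Squid code).
# Assumptions: `headers` is a Hash keyed by lowercase header names, and
# `mime_type` is a String such as "image/png", or nil when unknown.
def normalize_request_headers(headers, mime_type = nil)
  normalized = headers.dup

  # Rule 1: drop Cache-Control so clients cannot bypass a primed cache.
  normalized.delete("cache-control")

  # Rule 2: collapse Accept-Encoding to "gzip", "deflate", or nothing,
  # preferring gzip when both are offered (substring match, like strstr).
  encoding = normalized.delete("accept-encoding")
  if encoding
    if encoding.include?("gzip")
      normalized["accept-encoding"] = "gzip"
    elsif encoding.include?("deflate")
      normalized["accept-encoding"] = "deflate"
    end
  end

  # Rule 3: never ask for compressed image, video, or audio content.
  if mime_type && %w[image video audio].any? { |kind| mime_type.include?(kind) }
    normalized.delete("accept-encoding")
  end

  normalized
end

request = { "host" => "example.com",
            "cache-control" => "no-cache",
            "accept-encoding" => "gzip, deflate" }
puts normalize_request_headers(request).keys.sort.join(", ")
```

Since every client now asks for one of at most three Accept-Encoding variants, the cache stores far fewer copies of each object, which is where the better hit rate comes from.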

Clearing stale cache by domain

You can clear a site’s cache by domain, which is really nifty if you have Varnish in front of multiple sites. Log into Varnish’s administration console via telnet and execute the following purge command to wipe out the undesired cache.

purge req.http.host ~ letsgetdugg.com

Monitor Response codes

Worried that some of your clients might be receiving 503 Varnish response pages? Find out with varnishtop.

varnishtop -i TxStatus

Here is what the output looks like.

list length 7                                   web

 4018.65 TxStatus 200
  132.35 TxStatus 304
   44.17 TxStatus 404
   34.63 TxStatus 302
   30.87 TxStatus 301
    9.36 TxStatus 403
    1.39 TxStatus 503
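
If you would rather see a single number, here is a small sketch that pulls the entries out of a saved copy of that output and works out how much of the listed traffic is 503s. It assumes every entry has the shape "rate TxStatus code", as in the sample above. Nothing here talks to Varnish itself.

```ruby
# Sample output copied from the post, one entry per line.
sample = <<~OUTPUT
  list length 7                                   web

   4018.65 TxStatus 200
    132.35 TxStatus 304
     44.17 TxStatus 404
     34.63 TxStatus 302
     30.87 TxStatus 301
      9.36 TxStatus 403
      1.39 TxStatus 503
OUTPUT

# Each entry is "<rate> TxStatus <code>"; build a code => rate lookup.
rates = sample.scan(/([\d.]+)\s+TxStatus\s+(\d{3})/)
              .map { |rate, code| [code, rate.to_f] }
              .to_h

total       = rates.values.sum
error_share = 100.0 * rates.fetch("503", 0.0) / total

puts format("503 share of listed responses: %.2f%%", error_share)
```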

Update 2010-02-19: Seems other people are also affected by the Varnish LINGER crash on OpenSolaris. This does not address the core problem but removes the “fail fast” behavior with no negative side effects.

r4576 has been running reliably with the fix below.

In varnishd/bin/cache_acceptor.c

if (need_linger)
                setsockopt(sp->fd, SOL_SOCKET, SO_LINGER,
                    &linger, sizeof linger);

Remove the TCP_assert line wrapping the setsockopt() call, so that only the bare call shown above remains.

Update 2010-02-17: This might be a random fluke, but Varnish has connection issues when compiled under SunCC, so stick to GCC. I have compiled Varnish with GCC 4.3.2 and the build seems to work well. Give r4572 a try; phk committed some Solaris-aware errno code.

Update 2010-02-16: r4567 seems stable. Unlike on other platforms, errno isn’t thread-safe by default on Solaris. You need to pass -pthreads to GCC and -mt to SunCC in both the compile and link flags.

GCC example:

VCC_CC="cc -Kpic -G -m64 -o %o %s" CC=/opt/sfw/bin/gcc CFLAGS="-O3 -L/opt/extra/lib/amd64 -pthreads -m64 -fomit-frame-pointer" LDFLAGS="-lumem -pthreads" ./configure --prefix=/opt/extra

SunCC Example:

VCC_CC="cc -Kpic -G -m64 -o %o %s" CC=/opt/SSX0903/bin/cc CFLAGS="-xO3 -fast -xipo -L/opt/extra/lib/amd64 -mt -m64" LDFLAGS="-lumem -mt" ./configure --prefix=/opt/extra

Here are the sources on how I pieced it all together: sun docs, stack overflow answer

See what -pthreads defines on GCC:

# gcc -x c /dev/null -E -dM -pthreads | grep REENT
#define _REENTRANT 1

Here is a snippet from Solaris’s /usr/include/errno.h confirming that errno isn’t thread-safe by default.

#if defined(_REENTRANT) || defined(_TS_ERRNO) || _POSIX_C_SOURCE - 0 >= 199506L
extern int *___errno();
#define errno (*(___errno()))
#else
extern int errno;
/* ANSI C++ requires that errno be a macro */
#if __cplusplus >= 199711L
#define errno errno
#endif
#endif /* defined(_REENTRANT) || defined(_TS_ERRNO) */

Update 2010-01-28: r4508 seems stable. No patches needed aside from removing an assert(AZ) in cache_acceptor.c on line 163.

Update 2010-01-21: If you’re using Varnish from trunk past r4445, apply this session cache_waiter_poll patch to avoid stalled connections.

Update 2009-12-21: Still using Varnish in production. The site is working beautifully with the settings below.

Update (new): I think I figured out the last remaining piece of the puzzle. Switching Varnish’s default listener to poll fixed the long connection accept wait times.

Update: Monitor charts looked good, but persistent connections kept flaking under production traffic. I was forced to revert back to Squid 2.7. *Sigh* I think Squid might be the only viable option on Solaris when it comes to reverse proxy caching. The information below is useful if you still want to try out Varnish on Solaris.

I have finally wrangled Varnish to work reliably on Solaris without any apparent issues. The recent commit to trunk by phk (the creator) fixed the last remaining Solaris issue that I am aware of.

There are four requirements to get this working reliably on Solaris.

1. Run from trunk – r4508 is a known stable revision that works well. Remove the AZ() assert in cache_acceptor.c on line 163.

2. Set connect_timeout to 0. This works around a Varnish/Solaris TCP incompatibility in the timeout code of lib/libvarnish/tcp.c#TCP_connect.

3. Switch the default waiter to poll. Event ports seem buggy on OpenSolaris builds.

4. If you have issues starting Varnish, start it in the foreground via the -F argument.

Here is a Pingdom graph of our monitored service. Can you tell when Varnish was swapped in for Squid? Varnish does a better job of keeping content cached due to header normalization and larger cache size.

[Image: Varnish latency improvement]

There are a few “gotchas” to look out for to get it all running reliably. Here is the configuration that I used in production. I have annotated each setting with a brief description.

newtask -p highfile /opt/extra/sbin/varnishd
-f /opt/extra/etc/varnish/default.vcl
-a 0.0.0.0:82                             # IP/Port to listen on
-p listen_depth=8192                      # Connections kernel buffers before rejecting.
-p waiter=poll                            # Listener implementation to use.
-p thread_pool_max=2000                   # Max threads per pool
-p thread_pool_min=50                     # Min Threads per pool, crank this high
-p thread_pools=4                         # Thread Pool per CPU
-p thread_pool_add_delay=2ms              # Thread init delay, not to bomb OS
-p cc_command='cc -Kpic -G -m64 -o %o %s' # 64-Bit if needed
-s file,/sessions/varnish_cache.bin,512M  # Define cache size
-p sess_timeout=10s                       # Keep-Alive timeout
-p max_restarts=12                        # Amount of restart attempts
-p session_linger=120ms                   # Milliseconds to keep thread around
-p connect_timeout=0s                     # Important bug work around for Solaris
-p lru_interval=20s                       # LRU interval checks
-p sess_workspace=65536                   # Space for headers
-T 0.0.0.0:8086                           # Admin console
-u webservd                               # User to run varnish as

System configuration Optimizations

Solaris lacks the SO_{SND|RCV}TIMEO BSD socket flags, which define TCP timeout values per socket. Every other OS has them (Mac OS X, Linux, FreeBSD, AIX), but not Solaris, so Varnish cannot use custom timeout values there. You can do the next best thing on Solaris: optimize the TCP timeouts globally.

# Turn off Nagle. Nagle adds latency.
/usr/sbin/ndd -set /dev/tcp tcp_naglim_def 1

# 30 second TIME_WAIT timeout. (4 minutes default)
/usr/sbin/ndd -set /dev/tcp tcp_time_wait_interval 30000

# 15 min keep-alive (2 hour default)
/usr/sbin/ndd -set /dev/tcp tcp_keepalive_interval 900000

# 120 sec connect time out, 3 min default
ndd -set /dev/tcp tcp_ip_abort_cinterval 120000

# Send ACKs right away - less latency on bursty connections.
ndd -set /dev/tcp tcp_deferred_acks_max 0

# RFC says 1 segment, BSD/Win stack requires 2 segments.
/usr/sbin/ndd -set /dev/tcp tcp_slow_start_initial 2

Varnish Settings Dissected

Here are the most important settings to look out for when deploying Varnish in production.

File Descriptors

Run Varnish under a Solaris project that gives the proxy enough file descriptors to handle the concurrency. If Varnish can not allocate enough file descriptors, it can’t serve the requests.

# Paste into /etc/project
# Run the application with: newtask -p highfile
highfile:101::*:*:process.max-file-descriptor=(basic,32192,deny)

Threads

Give Varnish enough idle threads so it does not stall on requests. Thread creation is slow and expensive; idle threads are not. Don’t go cheap with threads: allocate a minimum of 200. Modern browsers use 8 concurrent connections by default, meaning Varnish will need 8 threads to handle a single page view.

thread_pool_max=2000   # 2000 max threads per pool
thread_pool_min=50     # 50 min threads per pool
                       # 50 threads x 4 Pools = 200 threads
thread_pools=4         # 4 Pools, Pool per CPU Core.
session_linger=120ms   # How long to keep a thread around
                       # To handle further requests.
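
The arithmetic behind those numbers, as a quick back-of-the-envelope sketch. The 8 connections per page view is the figure quoted above, and the variable names simply mirror the Varnish parameters. Nothing here is read from a running Varnish.

```ruby
# Values from the settings above.
thread_pools    = 4
thread_pool_min = 50
thread_pool_max = 2000

# The post's figure for concurrent connections opened by a modern browser.
connections_per_page_view = 8

idle_threads = thread_pools * thread_pool_min # threads kept warm at all times
max_threads  = thread_pools * thread_pool_max # hard ceiling across all pools

# Page views that can arrive at once before Varnish has to spawn threads.
page_views_on_idle_threads = idle_threads / connections_per_page_view

puts "idle threads: #{idle_threads}"
puts "max threads: #{max_threads}"
puts "simultaneous page views served by idle threads: #{page_views_on_idle_threads}"
```

So the 200 idle threads cover roughly 25 simultaneous page views before thread creation kicks in. Raise thread_pool_min if you expect sharper bursts than that.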