Letsgetdugg

Random tech jargon


I needed a thread-safe JSMin library for compressing JavaScript on the fly on UploadBooth, so I took an existing Ruby implementation and made it thread-safe. I don’t think a license was defined when I got it, so I am re-releasing it as-is.

require 'monitor'

class JSMin
  EOF = -1
  include MonitorMixin

  # jsmin — Copy the input to the output, deleting the characters which are
  # insignificant to JavaScript. Comments will be removed. Tabs will be
  # replaced with spaces. Carriage returns will be replaced with linefeeds.
  # Most spaces and linefeeds will be removed.
  # thread safe
  def minimize(jstext)
    synchronize do
      @theA = ""
      @theB = ""
      @current = 0
      @output = ""

      @text = jstext
      @theA = "\n"
      action(3)
      while (@theA != JSMin::EOF)
          case @theA
          when " "
              if (isAlphanum(@theB))
                  action(1)
              else
                  action(2)
              end
          when "\n"
              case (@theB)
              when "{","[","(","+","-"
                  action(1)
              when " "
                  action(3)
              else
                  if (isAlphanum(@theB))
                      action(1)
                  else
                      action(2)
                  end
              end
          else
              case (@theB)
              when " "
                  if (isAlphanum(@theA))
                      action(1)
                  else
                      action(3)
                  end
              when "\n"
                  case (@theA)
                  when "}","]",")","+","-","\"","\\", "'", '"'
                      action(1)
                  else
                      if (isAlphanum(@theA))
                          action(1)
                      else
                          action(3)
                      end
                  end
              else
                  action(1)
              end
          end
      end
      @output
    end
  end
 
  private
  # isAlphanum — return true if the character is a letter, digit, underscore,
  # dollar sign, or non-ASCII character
  def isAlphanum(c)
     return false if !c || c == JSMin::EOF
     # unpack('C') yields the first byte as an integer on Ruby 1.8 and 1.9+ alike
     return ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') ||
             (c >= 'A' && c <= 'Z') || c == '_' || c == '$' ||
             c == '\\' || c.unpack('C').first > 126)
  end

  # get — return the next character from stdin. Watch out for lookahead. If
  # the character is a control character, translate it to a space or linefeed.
  # thread safe
  def get
    return JSMin::EOF if @current>(@text.length-1)
    c = @text[@current]
    @current += 1
    c = c.chr
    return c if (c >= " " || c == "\n" || c.unpack("c") == JSMin::EOF)
    return "\n" if (c == "\r")
    return " "
  end

  # Get the next character without getting it.
  def peek
      lookaheadChar = @text[@current]
      # At end of input there is nothing left to look ahead to
      return JSMin::EOF if lookaheadChar.nil?
      return lookaheadChar.chr
  end

  # mynext — get the next character, excluding comments.
  # peek() is used to see if a '/' is followed by a '/' or '*'.
  def mynext
      c = get
      if (c == "/")
          if(peek == "/")
              while(true)
                  c = get
                  # Stop at end of line, or at EOF if the comment ends the input
                  if (c == JSMin::EOF || c <= "\n")
                      return c
                  end
              end
          end
          if(peek == "*")
              get
              while(true)
                  case get
                  when "*"
                     if (peek == "/")
                          get
                          return " "
                      end
                  when JSMin::EOF
                      raise "Unterminated comment"
                  end
              end
          end
      end
      return c
  end

  # action — do something! What you do is determined by the argument: 1
  # Output A. Copy B to A. Get the next B. 2 Copy B to A. Get the next B.
  # (Delete A). 3 Get the next B. (Delete B). action treats a string as a
  # single character. Wow! action recognizes a regular expression if it is
  # preceded by ( or , or =.
  def action(a)
      if(a==1)
          @output << @theA
      end
      if(a==1 || a==2)
          @theA = @theB
          if (@theA == "'" || @theA == "\"")
              while (true)
                  @output << @theA
                  @theA = get
                  break if (@theA == @theB)
                  raise "Unterminated string literal" if (@theA <= "\n")
                  if (@theA == "\\")
                      @output << @theA
                      @theA = get
                  end
              end
          end
      end
      if(a==1 || a==2 || a==3)
          @theB = mynext
          if (@theB == "/" && (@theA == "(" || @theA == "," || @theA == "=" ||
                               @theA == ":" || @theA == "[" || @theA == "!" ||
                               @theA == "&" || @theA == "|" || @theA == "?" ||
                               @theA == "{" || @theA == "}" || @theA == ";" ||
                               @theA == "\n"))
              @output << @theA
              @output << @theB
              while (true)
                  @theA = get
                  if (@theA == "/")
                      break
                  elsif (@theA == "\\")
                      @output << @theA
                      @theA = get
                  elsif (@theA <= "\n")
                      raise "Unterminated RegExp Literal" + @output
                  end
                  @output << @theA
              end
              @theB = mynext
          end
      end
  end
end
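
For readers unfamiliar with MonitorMixin, here is a minimal sketch of the locking pattern the class above relies on. SafeCollector and its collect method are hypothetical names made up for illustration; only MonitorMixin and synchronize come from the listing. Like minimize, the method keeps its working state in instance variables, so sharing one instance between threads is only safe if each call runs under the monitor.

```ruby
require 'monitor'

# Hypothetical class (not part of JSMin): per-call state lives in instance
# variables, so every call must hold the monitor while it runs.
class SafeCollector
  include MonitorMixin

  def initialize
    super()               # lets MonitorMixin set up its lock
    @buffer = String.new
  end

  # Copies text one character at a time through shared instance state.
  def collect(text)
    synchronize do
      @buffer = String.new
      text.each_char do |ch|
        @buffer << ch
        Thread.pass       # invite a context switch to provoke interleaving
      end
      @buffer.dup
    end
  end
end

collector = SafeCollector.new
inputs = %w[alpha bravo charlie delta]
threads = inputs.map { |word| Thread.new { collector.collect(word) } }
results = threads.map(&:value)
puts results.inspect
```

Without the synchronize block, concurrent calls could overwrite each other's @buffer halfway through, which is the same hazard minimize would face with @theA, @theB and @output.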


http://github.com/victori/perlbal-plugin-mogilefs

Key features

- Asynchronous, does not stall the Perlbal event loop.
- Converts URL paths to MogileFS fetch keys.
- Failover to filesystem if key fetch failed.
- Pretty statistics in Perlbal’s Management console.

It’s freaking awesome ;-)

On a side note, I have also updated my other two Perlbal plugins.

http://github.com/victori/perlbal-plugin-stickysessions

- Session affinity via Cookie.

http://github.com/victori/perlbal-plugin-backendheaders

- Appending Backend information on the served response.


*Update* Patches got accepted into MogileFS Trunk ;-)

Just go check out trunk, it has all my patches already included.

http://code.sixapart.com/svn/mogilefs/trunk/

The only thing you still need is my mogstored disk patch, which is pending. All the fixes for the PostgreSQL and Solaris issues are already included in trunk.


I fixed a few issues with MogileFS and Solaris. MogileFS should run wonderfully on Solaris with my patches applied.

Directory for all my patches: http://victori.uploadbooth.com/patches

http://victori.uploadbooth.com/patches/solaris-disk-du.patch

This patch fixes mogstored to work with Solaris’s df utility.

http://victori.uploadbooth.com/patches/store-max-requests.patch

This patch adds a new feature to the MogileFS Tracker – max_requests.

The default is 0, but I suggest setting max_requests to 1000 to avoid memory leaks.

The tracker will give out the database handle up to the max_requests limit before expiring the connection and opening a new one. This avoids memory leaks with long-running persistent connections. PostgreSQL has issues with long-lived persistent connections: the backend accumulates a lot of RAM and does not let go until the process/connection is killed off. This patch makes sure that the connection is expired after that many dbh handle requests.
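
The idea behind max_requests can be sketched in a few lines. This is not the patch itself (MogileFS is written in Perl); RecyclingHandle, connects and the connector block are hypothetical names, and treating 0 as "never recycle" is my assumption about what the default means.

```ruby
# Hypothetical sketch: hand out one handle up to max_requests times,
# then close it and open a fresh one.
class RecyclingHandle
  attr_reader :connects

  def initialize(max_requests, &connector)
    @max_requests = max_requests  # assumption: 0 means never recycle
    @connector = connector        # block that opens a new connection
    @handle = nil
    @served = 0
    @connects = 0
  end

  def get
    if @handle.nil? || (@max_requests > 0 && @served >= @max_requests)
      @handle.close if @handle.respond_to?(:close)
      @handle = @connector.call
      @served = 0
      @connects += 1
    end
    @served += 1
    @handle
  end
end

# Stand-in connector: a plain Object plays the role of a database handle.
pool = RecyclingHandle.new(3) { Object.new }
handles = Array.new(7) { pool.get }
puts "connections opened: #{pool.connects}"
```

With a limit of 3, seven requests are served by three different connections, so no single connection lives long enough to accumulate much memory.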

http://victori.uploadbooth.com/patches/mogilefs-sunos-pg.patch

This patch applies the InactiveDestroy argument to avoid the MogileFS Tracker locking up with the PostgreSQL store on Solaris.

http://victori.uploadbooth.com/patches/solaris-mogilefs-full.patch

This is the full patch for all my fixes.

I am slowly migrating our fab40 static asset data to MogileFS. I have imported >300,000 images, no issues with my patches so far.

/ PLUG go make an account on uploadbooth!

Enjoy ;-)

I just received my “Guide to Open-Source Operating Systems” comparing Solaris with Linux from Sun’s marketing department. Here are some of the facts that made me cringe due to blatant lying and half truths. Hey Sun, don’t let the facts get in your way.

Believe it or not but this is actually verbatim from the guide.

• Solaris runs on more hardware platforms.

• Solaris is supported by more applications.

• Solaris holds performance and price/performance world records that demonstrate its speed and scalability on a variety of systems.

• Solaris is supported by Sun, the company dedicated to UNIX for more than two decades.

1. Let’s see, the first fact is just blatant lying. Last I checked, Linux supported IA-32, MIPS, x86-64, SPARC, DEC Alpha, Itanium, PowerPC, ARM, m68k, PA-RISC, s390, SuperH, M32R and many more platforms, while Solaris only supports SPARC, IA-32 and x86-64. Does anyone at Sun’s marketing department care to fact check?

2. Depends on your definition of “supported.” Marketing is most likely referring to commercial support. I don’t have the facts to back this up, but I doubt this holds true against Linux in 2009; maybe they had a case back in 1999. The majority of open source applications are developed against Linux, and Solaris compatibility is just an afterthought.

3. You win http://www.tpc.org/tpcc/results/tpcc_perf_results.asp

Sun develops some of the best hardware and software on the market, but their marketing department is a disaster. There can only be one Steve Jobs and his reality distortion field.

Once again I have been blindsided by yet another conservative out-of-the-box setting. IPFilter is tuned far too conservatively when it comes to its state table size.

Here is how you can tell if you’re hitting any issues: run ipfstat and check for lost packets.

victori@opensolaris:~# ipfstat | grep lost
fragment state(in): kept 0 lost 0 not fragmented 0
fragment state(out): kept 0 lost 0 not fragmented 0
packet state(in): kept 798 lost 100
packet state(out): kept 612 lost 234

Notice that the in and out lost state lines have a non-zero value. This means IPFilter has been dropping client connections, bummer.
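
If you want to check this from a script, here is a small sketch that pulls the lost counters out of ipfstat output. It is only tested against the sample shown above; lost_states and SAMPLE are names I made up, and the regular expression assumes the line layout from that sample.

```ruby
# Sample copied from the ipfstat output in this post.
SAMPLE = <<~OUTPUT
  fragment state(in): kept 0 lost 0 not fragmented 0
  fragment state(out): kept 0 lost 0 not fragmented 0
  packet state(in): kept 798 lost 100
  packet state(out): kept 612 lost 234
OUTPUT

# Returns a hash such as { "packet state(in)" => 100, ... }.
def lost_states(ipfstat_output)
  ipfstat_output.each_line.each_with_object({}) do |line, lost|
    match = line.match(/^(\w+ state\((?:in|out)\)):.*?\blost\s+(\d+)/)
    lost[match[1]] = match[2].to_i if match
  end
end

dropping = lost_states(SAMPLE).select { |_name, count| count > 0 }
dropping.each { |name, count| puts "#{name}: #{count} lost" }
```

On a live box you would feed it the real output instead of the sample, for example lost_states(`ipfstat`).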

The default settings are quite conservative.

victori@opensolaris:~# ipf -T list | grep fr_state
fr_statemax min 0x1 max 0x7fffffff current 4096
fr_statesize min 0x1 max 0x7fffffff current 5002

You need to shut down IPFilter and apply larger table size limits.

victori@opensolaris:~# svcadm disable ipfilter
victori@opensolaris:~# /usr/sbin/ipf -T fr_statemax=18963,fr_statesize=27091

Let’s confirm that it works.

victori@opensolaris:~# ipf -T list | grep fr_state
fr_statemax min 0x1 max 0x7fffffff current 18963
fr_statesize min 0x1 max 0x7fffffff current 27091

Awesome, now all we need to do is enable IPFilter and no more lost packets.

victori@opensolaris:~# svcadm enable ipfilter

To make this persistent across reboots, edit ipf.conf.

victori@opensolaris:~# vi /usr/kernel/drv/ipf.conf
name="ipf" parent="pseudo" instance=0 fr_statemax=18963 fr_statesize=27091;

Then apply the updated driver configuration.

victori@opensolaris:~# devfsadm -i ipf

This can be applied to any OS that uses IPFilter.
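
A note on picking the numbers. The values used above fit a rule of thumb that is commonly quoted for IPFilter but not stated in this post, so treat it as an assumption: fr_statesize should be a prime number, and fr_statemax should be roughly 70% of it. The sketch below derives a table size from a desired maximum under that assumption; state_table_settings, prime? and next_prime are helper names invented here.

```ruby
# Trial division is plenty for numbers of this size.
def prime?(n)
  return false if n < 2
  (2..Integer.sqrt(n)).none? { |d| (n % d).zero? }
end

def next_prime(n)
  n += 1 until prime?(n)
  n
end

# Assumed rule of thumb: statemax is about 70% of a prime statesize.
def state_table_settings(statemax)
  { fr_statemax: statemax,
    fr_statesize: next_prime((statemax / 0.7).ceil) }
end

settings = state_table_settings(18963)
puts "ipf -T fr_statemax=#{settings[:fr_statemax]},fr_statesize=#{settings[:fr_statesize]}"
```

For a statemax of 18963 this arrives at the same fr_statesize used in this post.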

Recently a primary boot disk went bad on our server and I got blindsided by a non-bootable secondary mirror disk. All the data was intact, but I could not boot from it. This required a slow re-installation and migration process that took a very long time.

• EFI partitioned drives are not ZFS bootable.
• ZFS attach automatically partitions the drive as EFI.
• ZFS send/recv transfers on gzip-compressed data-slices are slow.

Here is the correct way of getting both the disks in the ZFS mirror to boot.

Plug the drive that you want to add to the ZFS mirror into the server. If you’re hot-swapping or adding a new drive while the server is still on, you need to use cfgadm to configure it.

victori@solaris:~# cfgadm -c configure sata1/1

Now that the drive is configured and seen by the server you need to repartition it with format so it can be used as a bootable device.

victori@solaris:~# format

AVAILABLE DISK SELECTIONS:
0. c4t0d0
/pci@0,0/pci8086,346c@1f,2/disk@0,0
1. c4t1d0
/pci@0,0/pci8086,346c@1f,2/disk@1,0
2. c4t2d0
/pci@0,0/pci8086,346c@1f,2/disk@2,0

* select your new drive *

# fdisk

* use fdisk to remove the EFI partition and add a solaris2 partition. *

Select the partition type to create:
1=SOLARIS2 2=UNIX 3=PCIXOS 4=Other
5=DOS12 6=DOS16 7=DOSEXT 8=DOSBIG
9=DOS16LBA A=x86 Boot B=Diagnostic C=FAT32
D=FAT32LBA E=DOSEXTLBA F=EFI 0=Exit?

This step is very important. If you do not repartition your drive, zpool attach will default the drive back to an EFI partition table, which is not bootable.

c4t0d0s2 — primary drive.
c4t1d0s2 — new drive that we are setting up.

victori@solaris:~# prtvtoc /dev/rdsk/c4t0d0s2 | fmthard -s - /dev/rdsk/c4t1d0s2

You should now be able to attach the secondary drive to your mirror using the identical slice.

zpool attach rpool c4t0d0s0 c4t1d0s0

Once the mirror is done synchronizing you need to install the bootloader on the drive.

victori@solaris:~# installgrub -m /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c4t1d0s0
Updating master boot sector destroys existing boot managers (if any).
continue (y/n)?y
stage1 written to partition 0 sector 0 (abs 16065)
stage2 written to partition 0, 267 sectors starting at 50 (abs 16115)
stage1 written to master boot sector
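
To keep the order of the non-interactive steps straight, here is a small sketch that just prints them for a given pair of disks. It does not run anything, and it does not cover the interactive format/fdisk step, which still has to be done by hand first. mirror_commands is a made-up helper; the command lines themselves are the ones used above.

```ruby
# Returns the commands, in order, for mirroring `primary` onto `secondary`.
# Disk names are given without a slice suffix, e.g. "c4t0d0".
def mirror_commands(primary, secondary, pool = 'rpool')
  [
    # 1. Copy the label from the primary disk to the new disk
    "prtvtoc /dev/rdsk/#{primary}s2 | fmthard -s - /dev/rdsk/#{secondary}s2",
    # 2. Attach the matching slice to the mirror
    "zpool attach #{pool} #{primary}s0 #{secondary}s0",
    # 3. After the resilver finishes, install the bootloader
    "installgrub -m /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/#{secondary}s0"
  ]
end

commands = mirror_commands('c4t0d0', 'c4t1d0')
puts commands
```

Printing rather than executing is deliberate: each step should be checked (especially the resilver) before moving on to the next.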

Troubleshooting

victori@solaris:~# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c4t1d0s0

raw device must be a root slice (not s2)

You did not re-partition the drive to a solaris2 partition. EFI partitions can’t be made bootable. Use the format tool to reconfigure the drive with a solaris2 partition.

zpool attach rpool c4t0d0s0 c4t1d0s0

cannot open/stat device /dev/rdsk/c1t0d0s0

You did not copy your label information from your primary to your secondary disk with prtvtoc and fmthard.


I have finally nailed down all our issues surrounding Varnish on Solaris, thanks to the help of sky from #varnish. Apparently Varnish uses a wrapper around connect() to drop stale connections and avoid thread pileups if the back-end ever dies. Setting connect_timeout to 0 forces Varnish to use connect() directly. This should eliminate all the 503 back-end issues under Solaris that I mentioned in an earlier blog post.
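
To illustrate the difference between the two code paths, here is a rough Ruby analogy. It is not Varnish's code (Varnish is written in C); connect_backend is a hypothetical helper that either does a plain blocking connect or goes through a connect-with-timeout wrapper, depending on the timeout value.

```ruby
require 'socket'

# Hypothetical helper: timeout 0 means "skip the wrapper, connect directly".
def connect_backend(host, port, timeout)
  if timeout.zero?
    TCPSocket.new(host, port)                         # plain blocking connect
  else
    Socket.tcp(host, port, connect_timeout: timeout)  # gives up after timeout
  end
end

# A throwaway local listener stands in for the back-end.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]

plain = connect_backend('127.0.0.1', port, 0)
timed = connect_backend('127.0.0.1', port, 0.5)
puts "plain: #{plain.class}, timed: #{timed.class}"

[plain, timed, server].each(&:close)
```

The timed path protects worker threads from hanging on a dead back-end, which is why dropping it is a workaround for the Solaris problem rather than a general recommendation.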

Here is our startup script for Varnish that works for our needs. Varnish is built as a 64-bit binary, hence the -m64 flag in the cc_command parameter.

#!/bin/sh

rm /sessions/varnish_cache.bin

newtask -p highfile /opt/extra/sbin/varnishd -f /opt/extra/etc/varnish/default.vcl -a 72.11.142.91:80 -p listen_depth=8192 -p thread_pool_max=2000 -p thread_pool_min=12 -p thread_pools=4 -p cc_command='cc -Kpic -G -m64 -o %o %s' -s file,/sessions/varnish_cache.bin,4G -p sess_timeout=10s -p max_restarts=12 -p session_linger=50s -p connect_timeout=0s -p obj_workspace=16384 -p sess_workspace=32768 -T 0.0.0.0:8086 -u webservd -F

I noticed Varnish had a particular problem of keeping connections around in the CLOSE_WAIT state for a long time, long enough to cause issues. I did some tuning on Solaris’s TCP stack so it is more aggressive in closing sockets after the work has been done.

Here are my aggressive TCP settings to force Solaris to close off connections in a short duration of time, to avoid file descriptor leaks. You can merge the following TCP tweaks with the settings I have posted earlier to handle more clients.

# 67.5 seconds, default 675 seconds
/usr/sbin/ndd -set /dev/tcp tcp_fin_wait_2_flush_interval 67500

# 30 seconds, aggressively close connections – default 4 minutes on solaris < 8
/usr/sbin/ndd -set /dev/tcp tcp_time_wait_interval 30000

# 1 minute, poll for dead connection - default 2 hours
/usr/sbin/ndd -set /dev/tcp tcp_keepalive_interval 60000
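
The ndd values are in milliseconds, which is easy to get wrong by a factor of ten. A small sketch that keeps the settings in seconds and generates the commands might look like this; TCP_TUNING and ndd_commands are names invented for the example, and the values are the ones above.

```ruby
# Intervals in seconds; ndd expects milliseconds.
TCP_TUNING = {
  'tcp_fin_wait_2_flush_interval' => 67.5,
  'tcp_time_wait_interval'        => 30,
  'tcp_keepalive_interval'        => 60
}.freeze

def ndd_commands(settings)
  settings.map do |name, seconds|
    "/usr/sbin/ndd -set /dev/tcp #{name} #{(seconds * 1000).round}"
  end
end

puts ndd_commands(TCP_TUNING)
```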

Last but not least, I have finally swapped out ActiveMQ for the FUSE message broker, an “enterprise” ActiveMQ distribution. Hopefully it won’t crash once a week like ActiveMQ does for us. The FUSE message broker is based off of ActiveMQ 5.3 sources that fix various memory leaks found in the current stable release of ActiveMQ 5.2 as of this writing.

If the FUSE message broker does not work out, I might have to give Kestrel a try. Hey, if it worked for twitter, it should work for us…right?

Want horoscopes on your site? Give our web service a try.

http://widgets.fabulously40.com/horoscope.json?sign=cancer
http://widgets.fabulously40.com/horoscope.yml?sign=cancer
http://widgets.fabulously40.com/horoscope.xml?sign=cancer

To pull by specific date…

Dates use the YYYY-MM-DD format.

http://widgets.fabulously40.com/horoscope.json?sign=aries&date=2009-05-03
http://widgets.fabulously40.com/horoscope.yml?sign=aries&date=2009-05-03
http://widgets.fabulously40.com/horoscope.xml?sign=aries&date=2009-05-03
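
A minimal client sketch, built only from the URL patterns above. horoscope_uri and fetch_horoscope are hypothetical helper names; the response layout is not described here, so the sketch returns the raw body and leaves parsing to you.

```ruby
require 'date'
require 'net/http'
require 'uri'

HOROSCOPE_HOST = 'widgets.fabulously40.com'

# Builds the request URL; format is one of 'json', 'yml' or 'xml'.
def horoscope_uri(sign, format: 'json', date: nil)
  params = { sign: sign }
  params[:date] = date.strftime('%Y-%m-%d') if date
  URI::HTTP.build(host: HOROSCOPE_HOST,
                  path: "/horoscope.#{format}",
                  query: URI.encode_www_form(params))
end

# Returns the raw response body as a string (performs a network request).
def fetch_horoscope(sign, **options)
  Net::HTTP.get(horoscope_uri(sign, **options))
end

puts horoscope_uri('cancer')
puts horoscope_uri('aries', format: 'xml', date: Date.new(2009, 5, 3))
```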

I have recently blogged about swapping malloc implementations for the JVM to help boost multi-threaded performance. Well, there is yet another malloc implementation that ships with Solaris and is optimized for single-threaded performance: bsdmalloc. I recently switched our Perl interpreter to bsdmalloc and got 33% faster performance from our Perlbal proxy.

You can try out multiple malloc implementations by setting the LD_PRELOAD environment variable.

LD_PRELOAD="/usr/lib/libbsdmalloc.so" perl somecode.pl

So here is the rule of thumb for which malloc implementation to use for your application.

libumem = For multithreaded applications. umem avoids thread heap contention and is highly optimized for multi-threaded applications.

bsdmalloc = For single threaded applications. PHP/Perl/Python and Ruby will fall into this category.

Applying the right malloc implementation to your resource-intensive application can yield a nice performance benefit.
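
If you want to compare allocators on your own workload, a rough timing harness could look like the sketch below. preload_env and time_command are made-up helpers, the libumem path is an assumption (only the bsdmalloc path appears above), and a single wall-clock run is a crude measurement, so repeat it a few times before drawing conclusions.

```ruby
require 'benchmark'
require 'rbconfig'

MALLOC_LIBS = {
  'default'   => nil,
  'bsdmalloc' => '/usr/lib/libbsdmalloc.so',  # path used in this post
  'libumem'   => '/usr/lib/libumem.so'        # assumed path, check your system
}.freeze

# Environment overrides for the child process.
def preload_env(library)
  library ? { 'LD_PRELOAD' => library } : {}
end

# Wall-clock seconds taken by the command under the given allocator.
def time_command(library, *command)
  Benchmark.realtime do
    system(preload_env(library), *command, out: File::NULL)
  end
end

# Demo run with the default allocator and a trivial child script.
baseline = time_command(nil, RbConfig.ruby, '-e', '100_000.times { "x" * 10 }')
puts format('default allocator: %.3fs', baseline)
```

On a Solaris box you would loop over MALLOC_LIBS and pass your real command, for example time_command(lib, 'perl', 'somecode.pl').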


I wrote a quick micro benchmark to test out Ruby threads. Apparently Ruby can’t make use of multiple CPUs with its threading implementation. I guess you have to resort to forking to scale up to multiple CPU cores while using MRI Ruby. However, there is an alternative solution: just use JRuby. JRuby utilizes all cores when running the benchmark.

Ruby 1.8.7 - Compiled with SunCC SSX0903 (-xO5 -fast -xipo)
Total number of insane floating point divisions in 10 seconds is 5969107

Ruby 1.9.1 - Compiled with GCC 4.3.2 (-O3 -fomit-frame-pointer)
Total number of insane floating point divisions in 10 seconds is 8596894

JRuby 1.3-dev / JDK7 b56
Ran as: jruby --fast cpuMax.rb
Total number of insane floating point divisions in 10 seconds is 15915896
167% increase in performance over Ruby 1.8.7

JRuby 1.3-dev / JDK7 b56
Ran as: jruby --fast -J-Djruby.compile.mode=JIT -J-Djruby.jit.threshold=0 -J-server cpuMax.rb
Total number of insane floating point divisions in 10 seconds is 28334441
374% increase in performance over Ruby 1.8.7

Looking at mpstat, I can see the MRI ruby implementation is not utilizing all 4 cores.

Ruby 1.8.7 MRI

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  828   0   13   428  204  487   26   93   16    0  1903    4   5   0  91
  1 2682   0    3    39    2  280   32   81   12    2  1189   13   2   0  85
  2 1902   0    0    34   11  259   16   57   13    0  1094   11   3   0  86
  3 1017   0    3   192  150  111   34   38    8    0   676   92   2   0   6
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    2   0    5   422  205  388   11   75    5    0   627    4   1   0  95
  1  161   0   13    20    2  196   11   61    8    0  1405    2   2   0  96
  2  292   0    6    32   15  272   10   57    7    0   700    2   1   0  97
  3    0   0    0   108   65   74   35   28    3    0   346   99   0   0   1

Now here is the JRuby implementation.

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  229   0  202   755  193 1977  365  329   94    1  2869   90   2   0   8
  1  328   0   86   371    1 1817  226  303  125    0  2809   86   2   0  12
  2  294   0  128   326    0 1771  248  287  109    0  2290   88   2   0  10
  3  320   0  172   402   62 1848  246  241  116    0  2238   86   3   0  11
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  317   0  297   700  192 2047  323  341  136    1  2819   89   2   0   9
  1  320   0   61   279    2 1611  134  195  130    0  1960   85   2   0  13
  2  288   0  235   379    0 1941  291  299  115    0  2462   87   2   0  11
  3  308   0   78   316   43 1688  142  159  104    0  1706   85   2   0  13

I think I will stick to JRuby for production use.

#!/usr/bin/ruby

require 'thread'
threads = []
counter = 0
mutex = Mutex.new

4.times do
     threads << Thread.new {
        x=0
        y=0
        time=Time.new

        while 1 do
                if Time.new - time >= 10 then
                        break
                else
                        x=1.00/24000000000.001
                        y+=1
                end
        end
        mutex.synchronize { counter+=y.to_i }
    }
end
threads.each { |t| t.join }

puts "Total number of insane floating point divisions in 10 seconds is "+counter.to_s
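
For comparison, here is a sketch of the forking approach mentioned at the top of this post. I did not benchmark this variant, so no numbers are claimed for it; forked_divisions is a made-up name, and it needs a platform where fork is available. Each child process counts on its own and reports its total back through a pipe.

```ruby
# Runs `workers` child processes for `seconds` each and sums their counts.
def forked_divisions(workers, seconds)
  readers = Array.new(workers) do
    reader, writer = IO.pipe
    fork do
      reader.close
      count = 0
      deadline = Time.now + seconds
      while Time.now < deadline
        1.00 / 24000000000.001   # same division as the threaded benchmark
        count += 1
      end
      writer.puts(count)
      writer.close
      exit!(0)                   # skip at_exit handlers in the child
    end
    writer.close                 # parent keeps only the read end
    reader
  end

  total = readers.sum { |reader| reader.read.to_i }
  readers.each(&:close)
  Process.waitall
  total
end

# Short demo run; use 10 seconds to match the benchmark above.
total = forked_divisions(4, 0.2)
puts "Total number of insane floating point divisions is #{total}"
```

Because every worker is a separate process, MRI's threading limitation no longer applies, at the cost of passing results through pipes instead of shared memory.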