I needed a thread-safe JSMin library for compressing JavaScript on the fly on UploadBooth, so I took an existing Ruby implementation and made it thread safe. I don’t think a license was defined when I got it, so I am re-releasing it as-is.
require 'monitor'
class JSMin
EOF = -1
include MonitorMixin
# jsmin — Copy the input to the output, deleting the characters which are
# insignificant to JavaScript. Comments will be removed. Tabs will be
# replaced with spaces. Carriage returns will be replaced with linefeeds.
# Most spaces and linefeeds will be removed.
# thread safe
def minimize(jstext)
synchronize do
@theA = ""
@theB = ""
@current = 0
@output = ""
@text = jstext
@theA = "\n"
action(3)
while (@theA != JSMin::EOF)
case @theA
when " "
if (isAlphanum(@theB))
action(1)
else
action(2)
end
when "\n"
case (@theB)
when "{","[","(","+","-"
action(1)
when " "
action(3)
else
if (isAlphanum(@theB))
action(1)
else
action(2)
end
end
else
case (@theB)
when " "
if (isAlphanum(@theA))
action(1)
else
action(3)
end
when "\n"
case (@theA)
when "}","]",")","+","-","\"","\\","'",'"'
action(1)
else
if (isAlphanum(@theA))
action(1)
else
action(3)
end
end
else
action(1)
end
end
end
@output
end
end
private
# isAlphanum — return true if the character is a letter, digit, underscore,
# dollar sign, or non-ASCII character
def isAlphanum(c)
return false if !c || c == JSMin::EOF
return ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') ||
(c >= 'A' && c <= 'Z') || c == '_' || c == '$' ||
c == '\\' || c.ord > 126)
end
# get — return the next character from stdin. Watch out for lookahead. If
# the character is a control character, translate it to a space or linefeed.
# thread safe
def get
return JSMin::EOF if @current>(@text.length-1)
c = @text[@current]
@current += 1
c = c.chr
return c if (c >= " " || c == "\n" || c.unpack("c") == JSMin::EOF)
return "\n" if (c == "\r")
return " "
end
# Get the next character without getting it.
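def peek
lookaheadChar = @text[@current]
# At end of input there is no next character; report EOF instead of calling chr on nil.
return JSMin::EOF if lookaheadChar.nil?
return lookaheadChar.chr
end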
# mynext — get the next character, excluding comments.
# peek() is used to see if a '/' is followed by a '/' or '*'.
def mynext
c = get
if (c == "/")
if(peek == "/")
while(true)
c = get
if (c == JSMin::EOF || c <= "\n")
return c
end
end
end
if(peek == "*")
get
while(true)
case get
when "*"
if (peek == "/")
get
return " "
end
when JSMin::EOF
raise "Unterminated comment"
end
end
end
end
return c
end
# action — do something! What you do is determined by the argument: 1
# Output A. Copy B to A. Get the next B. 2 Copy B to A. Get the next B.
# (Delete A). 3 Get the next B. (Delete B). action treats a string as a
# single character. Wow! action recognizes a regular expression if it is
# preceded by ( or , or =.
def action(a)
if(a==1)
@output << @theA
end
if(a==1 || a==2)
@theA = @theB
if (@theA == "\'" || @theA == "\"")
while (true)
@output << @theA
@theA = get
break if (@theA == @theB)
raise "Unterminated string literal" if (@theA <= "\n")
if (@theA == "\\")
@output << @theA
@theA = get
end
end
end
end
if(a==1 || a==2 || a==3)
@theB = mynext
if (@theB == "/" && (@theA == "(" || @theA == "," || @theA == "=" ||
@theA == ":" || @theA == "[" || @theA == "!" ||
@theA == "&" || @theA == "|" || @theA == "?" ||
@theA == "{" || @theA == "}" || @theA == ";" ||
@theA == "\n"))
@output << @theA
@output << @theB
while (true)
@theA = get
if (@theA == "/")
break
elsif (@theA == "\\")
@output << @theA
@theA = get
elsif (@theA <= "\n")
raise "Unterminated RegExp Literal" + @output
end
@output << @theA
end
@theB = mynext
end
end
end
end
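The thread-safety approach used above, wrapping the whole of minimize in synchronize from MonitorMixin, can be sketched in isolation. This is only an illustration and not part of the library: the Upcaser class and its process method are hypothetical stand-ins for JSMin and minimize, chosen because they keep per-call state in instance variables in the same way.

```ruby
require 'monitor'

# Hypothetical stand-in for JSMin: per-call state lives in instance
# variables, so concurrent calls on one shared object must be serialized.
class Upcaser
  include MonitorMixin

  def process(text)
    # Only one thread at a time may run this block on a given instance.
    synchronize do
      @output = String.new
      text.each_char do |ch|
        @output << ch.upcase
        Thread.pass # invite a context switch in the middle of the work
      end
      @output.dup
    end
  end
end

shared = Upcaser.new
inputs = %w[alpha beta gamma delta]
# Four threads share a single instance, as a web app might share one minifier.
results = inputs.map { |s| Thread.new { shared.process(s) } }.map(&:value)
puts results.inspect
```

Without the synchronize block, the threads would all write to the same @output and the results could interleave. The trade-off is that calls on a shared instance run one after another, not in parallel.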
http://github.com/victori/perlbal-plugin-mogilefs
Key features
- Asynchronous, does not stall the Perlbal event loop.
- Converts URL paths to MogileFS fetch keys.
- Failover to filesystem if key fetch failed.
- Pretty statistics in Perlbal’s Management console.
It’s freaking awesome
On a side note, I have also updated my other two Perlbal plugins.
http://github.com/victori/perlbal-plugin-stickysessions
- Session affinity via Cookie.
http://github.com/victori/perlbal-plugin-backendheaders
- Appending Backend information on the served response.
*Update* Patches got accepted into MogileFS Trunk
Just go check out trunk, it has all my patches already included.
http://code.sixapart.com/svn/mogilefs/trunk/
The only thing you still need is my mogstored disk patch, which is still pending. All the fixes for the PostgreSQL and Solaris issues are already included in trunk.
I fixed a few issues with MogileFS and Solaris. MogileFS should run wonderfully on Solaris with my patches applied.
Directory for all my patches: http://victori.uploadbooth.com/patches
http://victori.uploadbooth.com/patches/solaris-disk-du.patch
This patch fixes mogstored to work with Solaris’s df utility.
http://victori.uploadbooth.com/patches/store-max-requests.patch
This patch adds a new feature to the MogileFS Tracker – max_requests.
The default is 0, but I suggest setting max_requests to 1000 to avoid memory leaks.
The tracker will hand out the same database handle up to the max_requests limit before expiring the connection and opening a new one. This avoids memory leaks with long-running persistent connections. PostgreSQL has issues with long-lived persistent connections: the connection accumulates a lot of RAM and does not let go until the process/connection is killed off. This patch makes sure that the connection is expired after that many dbh requests.
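The recycling idea behind max_requests can be sketched as follows. This is a rough illustration in Ruby, not the patch itself (MogileFS is written in Perl): RecyclingHandle and FakeConn are hypothetical names, and treating a limit of 0 as "never recycle" is my assumption about what the default means.

```ruby
# Hypothetical wrapper: hands out one connection and replaces it after
# max_requests uses. A limit of 0 is assumed to mean "never recycle".
class RecyclingHandle
  attr_reader :connects

  def initialize(max_requests, &connect)
    @max_requests = max_requests
    @connect = connect # block that opens a fresh connection
    @uses = 0
    @connects = 0
    @handle = nil
  end

  # Return the current handle, reconnecting first if the limit was reached.
  def handle
    if @handle.nil? || (@max_requests > 0 && @uses >= @max_requests)
      @handle.close if @handle.respond_to?(:close)
      @handle = @connect.call
      @connects += 1
      @uses = 0
    end
    @uses += 1
    @handle
  end
end

# Fake connection used only for the demonstration below.
FakeConn = Struct.new(:id) do
  def close; end
end

next_id = 0
pool = RecyclingHandle.new(3) { FakeConn.new(next_id += 1) }
ids = 7.times.map { pool.handle.id }
puts ids.inspect # => [1, 1, 1, 2, 2, 2, 3]
```

With a limit of 3, seven requests are served by three connections, so no single connection lives long enough to accumulate much memory.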
http://victori.uploadbooth.com/patches/mogilefs-sunos-pg.patch
This patch applies the InactiveDestroy argument to avoid the MogileFS Tracker locking up with the PostgreSQL store on Solaris.
http://victori.uploadbooth.com/patches/solaris-mogilefs-full.patch
This is the full patch for all my fixes.
I am slowly migrating our fab40 static asset data to MogileFS. I have imported more than 300,000 images and have seen no issues with my patches so far.
/ PLUG go make an account on uploadbooth!
Enjoy
I just received my “Guide to Open-Source Operating Systems” comparing Solaris with Linux from Sun’s marketing department. Here are some of the facts that made me cringe due to blatant lying and half truths. Hey Sun, don’t let the facts get in your way.
Believe it or not but this is actually verbatim from the guide.
• Solaris is supported by more applications.
• Solaris holds performance and price/performance world records that demonstrate its speed and scalability on a variety of systems.
• Solaris is supported by Sun, the company dedicated to UNIX for more than two decades.
1. Let’s see, the first fact is just blatant lying. Last I checked, Linux supported IA-32, MIPS, x86-64, SPARC, DEC Alpha, Itanium, PowerPC, ARM, m68k, PA-RISC, s390, SuperH, M32R and many more platforms, while Solaris only supports SPARC, IA-32 and x86-64. Does anyone at Sun’s marketing department care to fact check?
2. Depends on your definition of “supported.” Marketing is most likely referring to commercial support. I don’t have the facts to back this up, but I doubt this holds true against Linux in 2009; maybe they had a case back in 1999. The majority of open source applications are developed against Linux, and Solaris compatibility is just an afterthought.
3. You win http://www.tpc.org/tpcc/results/tpcc_perf_results.asp
Sun develops some of the best hardware and software on the market, but their marketing department is a disaster. There can only be one Steve Jobs and his reality distortion field.
Once again I have been blindsided by yet another conservative out-of-the-box setting. IPFilter’s default state table size is far too conservative.
Here is how you can tell if you’re hitting this issue: run ipfstat and check for lost packets.
victori@opensolaris:~# ipfstat | grep lost
fragment state(in): kept 0 lost 0 not fragmented 0
fragment state(out): kept 0 lost 0 not fragmented 0
packet state(in): kept 798 lost 100
packet state(out): kept 612 lost 234
Notice that the packet state(in) and packet state(out) lines have a non-zero lost value. This means IPFilter has been dropping client connections. Bummer.
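If you want to automate that check, the output above is easy to parse. Here is a minimal sketch that works on the sample text from this post and does not call ipfstat itself; the lost_counts helper is a hypothetical name, and the line format is assumed to match what my box printed.

```ruby
# Sample taken from the ipfstat output shown above.
sample = <<~OUT
  fragment state(in): kept 0 lost 0 not fragmented 0
  fragment state(out): kept 0 lost 0 not fragmented 0
  packet state(in): kept 798 lost 100
  packet state(out): kept 612 lost 234
OUT

# Hypothetical helper: map each "... state(in|out)" line to its lost count.
def lost_counts(text)
  text.each_line.each_with_object({}) do |line, acc|
    m = line.match(/\A(\w+ state\((?:in|out)\)):\s+kept (\d+) lost (\d+)/)
    acc[m[1]] = m[3].to_i if m
  end
end

counts = lost_counts(sample)
# Keep only the tables that have actually dropped entries.
dropping = counts.select { |_name, lost| lost > 0 }
dropping.each { |name, lost| puts "#{name}: #{lost} lost" }
```

On the sample this reports only the two packet state lines, which is the sign that the state table limits need raising.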
The default settings are quite conservative.
fr_statemax min 0x1 max 0x7fffffff current 4096
fr_statesize min 0x1 max 0x7fffffff current 5002
You need to shut down IPFilter and apply larger table size limits.
victori@opensolaris:~# /usr/sbin/ipf -T fr_statemax=18963,fr_statesize=27091
Let’s confirm that it works.
fr_statemax min 0x1 max 0x7fffffff current 18963
fr_statesize min 0x1 max 0x7fffffff current 27091
Awesome, now all we need to do is enable IPFilter again and there should be no more lost packets.
To make this persistent across reboots edit ipf.conf
name="ipf" parent="pseudo" instance=0 fr_statemax=18963 fr_statesize=27091;
Then update the contents
This can be applied to any OS that uses IPFilter.
Recently a primary boot disk went bad on our server and I got blindsided by a non-bootable secondary mirror disk. All the data was intact but I could not boot it. This required a slow re-installation and migration process that took a very long time.
• ZFS attach automatically partitions the drive as EFI.
• ZFS send/recv transfers of gzip-compressed data slices are slow.
Here is the correct way of getting both the disks in the ZFS mirror to boot.
Plug the new drive that you want to add to the ZFS mirror into the server. If you’re hot swapping or adding a new drive while the server is still on, you need to use cfgadm to configure it.
Now that the drive is configured and seen by the server you need to repartition it with format so it can be used as a bootable device.
AVAILABLE DISK SELECTIONS:
0. c4t0d0
/pci@0,0/pci8086,346c@1f,2/disk@0,0
1. c4t1d0
/pci@0,0/pci8086,346c@1f,2/disk@1,0
2. c4t2d0
/pci@0,0/pci8086,346c@1f,2/disk@2,0
* select your new drive *
# fdisk
* use fdisk to remove the EFI partition and add a solaris2 partition. *
Select the partition type to create:
1=SOLARIS2 2=UNIX 3=PCIXOS 4=Other
5=DOS12 6=DOS16 7=DOSEXT 8=DOSBIG
9=DOS16LBA A=x86 Boot B=Diagnostic C=FAT32
D=FAT32LBA E=DOSEXTLBA F=EFI 0=Exit?
This step is very important. If you do not repartition your drive, zfs attach will default the drive back to an EFI partition table, which is not bootable.
c4t0d0s2 — primary drive.
c4t1d0s2 — new drive that we are setting up.
You should now be able to attach the secondary drive to your mirror using the identical slice.
Once the mirror is done synchronizing you need to install the bootloader on the drive.
Updating master boot sector destroys existing boot managers (if any).
continue (y/n)?y
stage1 written to partition 0 sector 0 (abs 16065)
stage2 written to partition 0, 267 sectors starting at 50 (abs 16115)
stage1 written to master boot sector
Troubleshooting
raw device must be a root slice (not s2)
You did not re-partition the drive to a solaris2 partition. EFI partitions can’t be made bootable. Use the format tool to reconfigure the drive with a solaris2 partition.
cannot open/stat device /dev/rdsk/c1t0d0s0
You did not copy your label information from your primary to your secondary disk with prtvtoc and fmthard.
I have finally nailed down all our issues surrounding Varnish on Solaris, thanks to the help of sky from #varnish. Apparently Varnish uses a wrapper around connect() to drop stale connections to avoid thread pileups if the back-end ever dies. Setting connect_timeout to 0 will force Varnish to use connect() directly. This should eliminate all the 503 back-end issues under Solaris that I mentioned in an earlier blog post.
Here is our startup script for Varnish that works for our needs. Varnish is a 64-bit binary, hence the -m64 flag in the cc_command parameter.
rm /sessions/varnish_cache.bin
newtask -p highfile /opt/extra/sbin/varnishd \
-f /opt/extra/etc/varnish/default.vcl \
-a 72.11.142.91:80 \
-p listen_depth=8192 \
-p thread_pool_max=2000 \
-p thread_pool_min=12 \
-p thread_pools=4 \
-p cc_command='cc -Kpic -G -m64 -o %o %s' \
-s file,/sessions/varnish_cache.bin,4G \
-p sess_timeout=10s \
-p max_restarts=12 \
-p session_linger=50s \
-p connect_timeout=0s \
-p obj_workspace=16384 \
-p sess_workspace=32768 \
-T 0.0.0.0:8086 \
-u webservd -F
I noticed Varnish had a particular problem of keeping connections around in the CLOSE_WAIT state for a long time, long enough to cause issues. I did some tuning on Solaris’s TCP stack so it is more aggressive in closing sockets after the work has been done.
Here are my aggressive TCP settings to force Solaris to close off connections in a short duration of time, to avoid file descriptor leaks. You can merge the following TCP tweaks with the settings I have posted earlier to handle more clients.
/usr/sbin/ndd -set /dev/tcp tcp_fin_wait_2_flush_interval 67500
# 30 seconds, aggressively close connections - default 4 minutes on Solaris < 8
/usr/sbin/ndd -set /dev/tcp tcp_time_wait_interval 30000
# 1 minute, poll for dead connection - default 2 hours
/usr/sbin/ndd -set /dev/tcp tcp_keepalive_interval 60000
Last but not least, I have finally swapped out ActiveMQ for the FUSE message broker, an “enterprise” ActiveMQ distribution. Hopefully it won’t crash once a week like ActiveMQ does for us. The FUSE message broker is based off of ActiveMQ 5.3 sources that fix various memory leaks found in the current stable release of ActiveMQ 5.2 as of this writing.
If the FUSE message broker does not work out, I might have to give Kestrel a try. Hey, if it worked for Twitter, it should work for us…right?
Want horoscopes on your site? Give our web service a try.
http://widgets.fabulously40.com/horoscope.json?sign=cancer
http://widgets.fabulously40.com/horoscope.yml?sign=cancer
http://widgets.fabulously40.com/horoscope.xml?sign=cancer
To pull by specific date…
Pass the date in YYYY-MM-DD format.
http://widgets.fabulously40.com/horoscope.json?sign=aries&date=2009-05-03
http://widgets.fabulously40.com/horoscope.yml?sign=aries&date=2009-05-03
http://widgets.fabulously40.com/horoscope.xml?sign=aries&date=2009-05-03
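For anyone calling the service from code, here is a small sketch of building those URLs in Ruby. The horoscope_url helper is a hypothetical name of my own; it only assembles the URL patterns listed above and does not fetch or parse anything, since the response layout is not described here.

```ruby
require 'uri'
require 'date'

# The three formats listed above.
FORMATS = %w[json yml xml].freeze

# Hypothetical helper: build a horoscope URL for a sign, with an optional date.
def horoscope_url(sign, format: "json", date: nil)
  raise ArgumentError, "unsupported format: #{format}" unless FORMATS.include?(format)

  params = { "sign" => sign }
  params["date"] = date.strftime("%Y-%m-%d") if date # YYYY-MM-DD
  URI::HTTP.build(
    host: "widgets.fabulously40.com",
    path: "/horoscope.#{format}",
    query: URI.encode_www_form(params)
  ).to_s
end

puts horoscope_url("cancer")
puts horoscope_url("aries", format: "xml", date: Date.new(2009, 5, 3))
```

From there the URL can be handed to whatever HTTP client you already use.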
Not all malloc implementations are created equal
I recently blogged about swapping malloc implementations for the JVM to help boost multi-threaded performance. Well, Solaris comes with yet another malloc implementation, one that is optimized for single-threaded performance: bsdmalloc. I just switched our Perl interpreter to use bsdmalloc and got 33% faster performance with our Perlbal proxy.
You can try out multiple malloc implementations by setting the LD_PRELOAD environment variable.
LD_PRELOAD="/usr/lib/libbsdmalloc.so" perl somecode.pl
So here is the rule of thumb for which malloc implementation to use for your application.
libumem = For multithreaded applications. umem avoids thread heap contention and is highly optimized for multi-threaded applications.
bsdmalloc = For single threaded applications. PHP/Perl/Python and Ruby will fall into this category.
Applying the right malloc implementation to your resource-intensive application can yield a nice performance benefit.
I wrote a quick micro benchmark to test out Ruby threads. Apparently Ruby can’t make use of multiple CPUs with its threading implementation. I guess you have to resort to forking to scale up to multiple CPU cores while using MRI Ruby. However, there is an alternative solution: just use JRuby. JRuby utilizes all cores when running the benchmark.
Ruby 1.8.7 - Compiled with SunCC SSX0903 (-xO5 -fast -xipo)
Total number of insane floating point divisions in 10 seconds is 5969107
Ruby 1.9.1 - Compiled with GCC 4.3.2 (-O3 -fomit-frame-pointer)
Total number of insane floating point divisions in 10 seconds is 8596894
Ran as: jruby --fast cpuMax.rb
177% increase in performance
JRuby 1.3-dev / JDK7 b56
Total number of insane floating point divisions in 10 seconds is 15915896
Ran as: jruby --fast -J-Djruby.compile.mode=JIT -J-Djruby.jit.threshold=0 -J-server cpuMax.rb
374% increase in performance
JRuby 1.3-dev / JDK7 b56
Total number of insane floating point divisions in 10 seconds is 28334441
Looking at mpstat, I can see the MRI ruby implementation is not utilizing all 4 cores.
Ruby 1.8.7 MRI
CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
  0  828   0   13  428  204  487   26   93   16   0  1903   4   5  0  91
  1 2682   0    3   39    2  280   32   81   12   2  1189  13   2  0  85
  2 1902   0    0   34   11  259   16   57   13   0  1094  11   3  0  86
  3 1017   0    3  192  150  111   34   38    8   0   676  92   2  0   6
CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
  0    2   0    5  422  205  388   11   75    5   0   627   4   1  0  95
  1  161   0   13   20    2  196   11   61    8   0  1405   2   2  0  96
  2  292   0    6   32   15  272   10   57    7   0   700   2   1  0  97
  3    0   0    0  108   65   74   35   28    3   0   346  99   0  0   1
Now here is the JRuby implementation.
CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
  0  229   0  202  755  193 1977  365  329   94   1  2869  90   2  0   8
  1  328   0   86  371    1 1817  226  303  125   0  2809  86   2  0  12
  2  294   0  128  326    0 1771  248  287  109   0  2290  88   2  0  10
  3  320   0  172  402   62 1848  246  241  116   0  2238  86   3  0  11
CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
  0  317   0  297  700  192 2047  323  341  136   1  2819  89   2  0   9
  1  320   0   61  279    2 1611  134  195  130   0  1960  85   2  0  13
  2  288   0  235  379    0 1941  291  299  115   0  2462  87   2  0  11
  3  308   0   78  316   43 1688  142  159  104   0  1706  85   2  0  13
I think I will stick to JRuby for production use.
require 'thread'
threads = []
counter = 0
mutex = Mutex.new
4.times do
threads << Thread.new {
x=0
y=0
time=Time.new
while 1 do
if Time.new - time >= 10 then
break
else
x=1.00/24000000000.001
y+=1
end
end
mutex.synchronize { counter+=y.to_i }
}
end
threads.each { |t| t.join }
puts "Total number of insane floating point divisions in 10 seconds is "+counter.to_s
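For completeness, here is roughly what the forking route mentioned earlier could look like on MRI. Treat it as a sketch under assumptions and not something I benchmarked: the pipe-per-child layout, the busy_loop helper and the shortened DURATION are all my own choices for illustration, and fork is not available on every platform (JRuby and Windows lack it).

```ruby
WORKERS = 4
DURATION = 0.2 # seconds; the threaded benchmark above ran for 10

# Same busy work as the threaded version: divide until the time is up.
def busy_loop(seconds)
  count = 0
  x = 0
  deadline = Time.now + seconds
  while Time.now < deadline
    x = 1.00 / 24000000000.001
    count += 1
  end
  count
end

# Start one child process per worker; each reports its count over a pipe.
children = WORKERS.times.map do
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.puts busy_loop(DURATION)
    writer.close
    exit!(0) # leave immediately, skipping at_exit handlers inherited from the parent
  end
  writer.close
  [pid, reader]
end

# Collect each child's count, then reap the process.
counts = children.map do |pid, reader|
  value = reader.read.to_i
  reader.close
  Process.wait(pid)
  value
end

total = counts.sum
puts "Total number of insane floating point divisions in #{DURATION} seconds is #{total}"
```

Each child is a separate process with its own interpreter, so the operating system is free to schedule them on different cores. The cost is that results have to travel back over a pipe and cannot simply be added to a shared counter.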

