I wrote a quick micro-benchmark to test out Ruby threads. Apparently MRI Ruby can’t make use of multiple CPUs with its threading implementation: 1.8 schedules green threads inside a single process, and 1.9 runs native threads under a global interpreter lock. I guess you have to resort to forking to scale across multiple CPU cores with MRI. There is an alternative, though: just use JRuby, which utilizes all cores when running the benchmark.


Ruby 1.8.7 - Compiled with SunCC SSX0903 (-xO5 -fast -xipo)
Total number of insane floating point divisions in 10 seconds is 5969107

Ruby 1.9.1 / Compiled with GCC 4.3.2 (-O3 -fomit-frame-pointer)
Total number of insane floating point divisions in 10 seconds is 8596894

Ran as: jruby --fast cpuMax.rb
167% increase in performance


JRuby 1.3-dev / JDK7 b56
Total number of insane floating point divisions in 10 seconds is 15915896

Ran as: jruby --fast -J-Djruby.compile.mode=JIT -J-Djruby.jit.threshold=0 -J-server cpuMax.rb
374% increase in performance


JRuby 1.3-dev / JDK7 b56
Total number of insane floating point divisions in 10 seconds is 28334441
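
The percentages can be rechecked from the totals above. This is a small sketch that assumes the Ruby 1.8.7 run is the baseline (the post does not say so explicitly, but the 374% figure is consistent with it). The labels and the percent_increase helper are made up for illustration.

```ruby
# Totals copied from the runs above; Ruby 1.8.7 is assumed to be the baseline.
BASELINE = 5_969_107

RESULTS = {
  "Ruby 1.9.1"                 => 8_596_894,
  "JRuby --fast"               => 15_915_896,
  "JRuby --fast, JIT, -server" => 28_334_441
}

# Percentage increase of value over baseline, as a float.
def percent_increase(baseline, value)
  (value - baseline) * 100.0 / baseline
end

RESULTS.each do |label, total|
  printf("%-28s %+.1f%%\n", label, percent_increase(BASELINE, total))
end
```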

Looking at mpstat, I can see that the MRI Ruby implementation is not utilizing all 4 cores: only CPU 3 is busy, while the other three sit at 85% idle or more.

Ruby 1.8.7 MRI

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  828   0   13   428  204  487   26   93   16    0  1903    4   5   0  91
  1 2682   0    3    39    2  280   32   81   12    2  1189   13   2   0  85
  2 1902   0    0    34   11  259   16   57   13    0  1094   11   3   0  86
  3 1017   0    3   192  150  111   34   38    8    0   676   92   2   0   6
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    2   0    5   422  205  388   11   75    5    0   627    4   1   0  95
  1  161   0   13    20    2  196   11   61    8    0  1405    2   2   0  96
  2  292   0    6    32   15  272   10   57    7    0   700    2   1   0  97
  3    0   0    0   108   65   74   35   28    3    0   346   99   0   0   1

Now here is the same run under JRuby, with all four CPUs at 85-90% usr.

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  229   0  202   755  193 1977  365  329   94    1  2869   90   2   0   8
  1  328   0   86   371    1 1817  226  303  125    0  2809   86   2   0  12
  2  294   0  128   326    0 1771  248  287  109    0  2290   88   2   0  10
  3  320   0  172   402   62 1848  246  241  116    0  2238   86   3   0  11
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  317   0  297   700  192 2047  323  341  136    1  2819   89   2   0   9
  1  320   0   61   279    2 1611  134  195  130    0  1960   85   2   0  13
  2  288   0  235   379    0 1941  291  299  115    0  2462   87   2   0  11
  3  308   0   78   316   43 1688  142  159  104    0  1706   85   2   0  13
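
To make the comparison less eyeball-driven, the idl column can be read out of the mpstat rows with a few lines of Ruby. This is only a sketch: the 50% idle threshold and the busy_cpus helper are assumptions for illustration, and the samples are the second mpstat interval from each run above.

```ruby
# Index of the idl column in the 16-field mpstat rows shown above.
IDL_COLUMN = 15

# Second sample of the Ruby 1.8.7 MRI run.
MRI_SAMPLE = <<SAMPLE
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    2   0    5   422  205  388   11   75    5    0   627    4   1   0  95
  1  161   0   13    20    2  196   11   61    8    0  1405    2   2   0  96
  2  292   0    6    32   15  272   10   57    7    0   700    2   1   0  97
  3    0   0    0   108   65   74   35   28    3    0   346   99   0   0   1
SAMPLE

# Second sample of the JRuby run.
JRUBY_SAMPLE = <<SAMPLE
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  317   0  297   700  192 2047  323  341  136    1  2819   89   2   0   9
  1  320   0   61   279    2 1611  134  195  130    0  1960   85   2   0  13
  2  288   0  235   379    0 1941  291  299  115    0  2462   87   2   0  11
  3  308   0   78   316   43 1688  142  159  104    0  1706   85   2   0  13
SAMPLE

# Return the ids of CPUs whose idle percentage is below the threshold.
def busy_cpus(sample, idle_threshold = 50)
  sample.lines
        .map(&:split)
        .reject { |fields| fields.empty? || fields.first == "CPU" }
        .select { |fields| fields[IDL_COLUMN].to_i < idle_threshold }
        .map { |fields| fields.first.to_i }
end

puts "MRI busy CPUs:   #{busy_cpus(MRI_SAMPLE).inspect}"
puts "JRuby busy CPUs: #{busy_cpus(JRUBY_SAMPLE).inspect}"
```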

I think I will stick to JRuby for production use.

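#!/usr/bin/ruby
# cpuMax.rb: count the floating point divisions 4 threads manage in 10 seconds.

require 'thread'

threads = []
counter = 0
mutex = Mutex.new

4.times do
  threads << Thread.new do
    x = 0
    y = 0 # divisions performed by this thread
    start = Time.now

    # Spin for 10 seconds, doing one division per iteration.
    while Time.now - start < 10
      x = 1.00 / 24000000000.001
      y += 1
    end

    # Add this thread's count to the shared total under the lock.
    mutex.synchronize { counter += y }
  end
end
threads.each { |t| t.join }

puts "Total number of insane floating point divisions in 10 seconds is " + counter.to_s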
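
For comparison, the forking route mentioned at the top could look roughly like the sketch below. I did not benchmark this, so there are no numbers for it. The count_divisions and forked_total helpers are hypothetical names, and the run is shortened to 1 second. It relies on Kernel#fork, so it needs a Unix-like system and MRI (JRuby does not support fork).

```ruby
# Hypothetical fork-based variant of the benchmark (not from the runs above).
# Each child process runs the division loop and reports its count to the
# parent through a pipe, so the work can spread across CPU cores.

# Same loop body as the threaded script, parameterized by duration.
def count_divisions(seconds)
  y = 0
  start = Time.now
  while Time.now - start < seconds
    x = 1.00 / 24000000000.001
    y += 1
  end
  y
end

# Fork the given number of workers and return the sum of their counts.
def forked_total(workers, seconds)
  readers = []
  pids = []

  workers.times do
    reader, writer = IO.pipe
    pids << fork do
      reader.close
      writer.puts count_divisions(seconds)
      writer.close
      exit!(0) # leave the child without running the parent's exit handlers
    end
    writer.close # the parent only reads from this pipe
    readers << reader
  end

  # Each read blocks until the corresponding child closes its end of the pipe.
  total = readers.inject(0) do |sum, reader|
    value = reader.read.to_i
    reader.close
    sum + value
  end
  pids.each { |pid| Process.wait(pid) }
  total
end

puts "Total number of divisions across 4 processes in 1 second is #{forked_total(4, 1)}"
```

Pipes are used here because forked children do not share the parent's counter variable, which is why the Mutex from the threaded version is not needed.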