Does ruby have real multithreading?

9

304

I know about the "cooperative" threading of Ruby using green threads. How can I create real "OS-level" threads in my application in order to make use of multiple CPU cores for processing?

Foregut answered 11/9, 2008 at 9:1 Comment(0)
620

Updated with Jörg's Sept 2011 comment

You seem to be confusing two very different things here: the Ruby Programming Language and the specific threading model of one specific implementation of the Ruby Programming Language. There are currently around 11 different implementations of the Ruby Programming Language, with very different and unique threading models.

(Unfortunately, only two of those 11 implementations are actually ready for production use, but by the end of the year that number will probably go up to four or five.) (Update: it's now 5: MRI, JRuby, YARV (the interpreter for Ruby 1.9), Rubinius and IronRuby).

  1. The first implementation doesn't actually have a name, which makes it quite awkward to refer to it and is really annoying and confusing. It is most often referred to as "Ruby", which is even more annoying and confusing than having no name, because it leads to endless confusion between the features of the Ruby Programming Language and a particular Ruby Implementation.

It is also sometimes called "MRI" (for "Matz's Ruby Implementation"), CRuby or MatzRuby.

MRI implements Ruby Threads as Green Threads within its interpreter. Unfortunately, it doesn't allow those threads to be scheduled in parallel; only one thread can run at a time.

However, any number of C Threads (POSIX Threads etc.) can run in parallel to the Ruby Thread, so external C Libraries, or MRI C Extensions that create threads of their own can still run in parallel.

  2. The second implementation is YARV (short for "Yet Another Ruby VM"). YARV implements Ruby Threads as POSIX or Windows NT Threads; however, it uses a Global Interpreter Lock (GIL) to ensure that only one Ruby Thread can actually be scheduled at any one time.

As in MRI, C Threads can actually run in parallel to Ruby Threads.

In the future, it is possible that the GIL might get broken down into more fine-grained locks, thus allowing more and more code to actually run in parallel, but that's so far away, it is not even planned yet.

  3. JRuby implements Ruby Threads as Native Threads, where "Native Threads" in case of the JVM obviously means "JVM Threads". JRuby imposes no additional locking on them. So, whether those threads can actually run in parallel depends on the JVM: some JVMs implement JVM Threads as OS Threads and some as Green Threads. (The mainstream JVMs from Sun/Oracle have used OS threads exclusively since JDK 1.3.)

  4. XRuby also implements Ruby Threads as JVM Threads. Update: XRuby is dead.

  5. IronRuby implements Ruby Threads as Native Threads, where "Native Threads" in case of the CLR obviously means "CLR Threads". IronRuby imposes no additional locking on them, so they should run in parallel, as long as your CLR supports that.

  6. Ruby.NET also implements Ruby Threads as CLR Threads. Update: Ruby.NET is dead.

  7. Rubinius implements Ruby Threads as Green Threads within its Virtual Machine. More precisely: the Rubinius VM exports a very lightweight, very flexible concurrency/parallelism/non-local control-flow construct, called a "Task", and all other concurrency constructs (Threads in this discussion, but also Continuations, Actors and other stuff) are implemented in pure Ruby, using Tasks.

Rubinius cannot (currently) schedule Threads in parallel; however, adding that isn't too much of a problem: Rubinius can already run several VM instances in several POSIX Threads in parallel, within one Rubinius process. Since Threads are actually implemented in Ruby, they can, like any other Ruby object, be serialized and sent to a different VM in a different POSIX Thread. (That's the same model the BEAM Erlang VM uses for SMP concurrency. It is already implemented for Rubinius Actors.)

Update: The information about Rubinius in this answer is about the Shotgun VM, which doesn't exist anymore. The "new" C++ VM does not use green threads scheduled across multiple VMs (i.e. Erlang/BEAM style), it uses a more traditional single VM with multiple native OS threads model, just like the one employed by, say, the CLR, Mono, and pretty much every JVM.

  8. MacRuby started out as a port of YARV on top of the Objective-C Runtime and CoreFoundation and Cocoa Frameworks. It has now significantly diverged from YARV, but AFAIK it currently still shares the same Threading Model with YARV. Update: MacRuby depends on Apple's garbage collector, which has been declared deprecated and will be removed in later versions of Mac OS X. MacRuby is undead.

  9. Cardinal is a Ruby Implementation for the Parrot Virtual Machine. It doesn't implement threads yet; however, when it does, it will probably implement them as Parrot Threads. Update: Cardinal seems very inactive/dead.

  10. MagLev is a Ruby Implementation for the GemStone/S Smalltalk VM. I have no information about what threading model GemStone/S uses, what threading model MagLev uses, or whether threads are implemented at all yet (probably not).

  11. HotRuby is not a full Ruby Implementation of its own. It is an implementation of a YARV bytecode VM in JavaScript. HotRuby doesn't support threads (yet?) and when it does, they won't be able to run in parallel, because JavaScript has no support for true parallelism. There is an ActionScript version of HotRuby, however, and ActionScript might actually support parallelism. Update: HotRuby is dead.

Unfortunately, only two of these 11 Ruby Implementations are actually production-ready: MRI and JRuby.

So, if you want true parallel threads, JRuby is currently your only choice – not that that's a bad one: JRuby is actually faster than MRI, and arguably more stable.
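A rough way to see which behaviour a given implementation has is to time the same CPU-bound work once sequentially and once in threads. The sketch below is only illustrative: the `burn` helper, the iteration count and the worker count are assumptions, not something from the answer above. On an implementation with a global lock the two timings would be expected to come out similar; on one with parallel threads and several cores, the threaded run would be expected to be shorter.

```ruby
require 'benchmark'

# CPU-bound busy work (hypothetical helper; the loop count is arbitrary).
def burn(iterations)
  sum = 0
  iterations.times { |i| sum += i * i }
  sum
end

BURN_ITERATIONS = 2_000_000
BURN_WORKERS = 4

# Run the work four times, one after another.
sequential = Benchmark.realtime do
  BURN_WORKERS.times { burn(BURN_ITERATIONS) }
end

# Run the same work in four Ruby threads and wait for all of them.
threaded = Benchmark.realtime do
  BURN_WORKERS.times.map { Thread.new { burn(BURN_ITERATIONS) } }.each(&:join)
end

engine = defined?(RUBY_ENGINE) ? RUBY_ENGINE : 'ruby'
puts "#{engine}: sequential #{sequential.round(2)}s, threaded #{threaded.round(2)}s"
```

The absolute numbers depend heavily on the machine and the Ruby version, so only the ratio between the two timings is of interest.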

Otherwise, the "classical" Ruby solution is to use processes instead of threads for parallelism. The Ruby Core Library contains the Process module with the Process.fork method which makes it dead easy to fork off another Ruby process. Also, the Ruby Standard Library contains the Distributed Ruby (dRuby / dRb) library, which allows Ruby code to be trivially distributed across multiple processes, not only on the same machine but also across the network.
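As a minimal sketch of the process-based approach, the following splits a sum over three child processes created with Process.fork and collects the partial results through pipes. The `sum_of_squares` helper and the chunk boundaries are made up for illustration, and the sketch assumes a Unix-like platform where fork is available (as a comment below notes, fork is a problem on JRuby).

```ruby
# Hypothetical CPU-bound task: sum of squares over a range.
def sum_of_squares(range)
  range.inject(0) { |acc, i| acc + i * i }
end

chunks = [(1..1000), (1001..2000), (2001..3000)]

# Fork one child per chunk; each child writes its result to its own pipe.
children = chunks.map do |chunk|
  reader, writer = IO.pipe
  pid = Process.fork do
    reader.close
    writer.write(sum_of_squares(chunk).to_s)
    writer.close
    exit!(0) # leave the child immediately, skipping at_exit handlers
  end
  writer.close # the parent only reads
  [pid, reader]
end

# Collect the partial results and reap the children.
partials = children.map do |pid, reader|
  value = Integer(reader.read)
  reader.close
  Process.wait(pid)
  value
end

total = partials.inject(:+)
puts total
```

Because every child is a separate OS process with its own interpreter, the operating system is free to schedule the children on different cores regardless of any interpreter lock.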

Wood answered 11/9, 2008 at 22:25 Comment(3)
but using fork will break usage on jruby... just saying – Rosariorosarium
This is a great answer. However it is subject to a lot of link rot. I don't know where these resources may have moved though. – Foxed
This is the best one I came across. I was just wondering if Ruby supports multithreading like Java; I had checked a few other blogs, and they all seem to describe only one specific implementation. I was confused about which blog to follow ;) Thank you very much @Jörg for such a comprehensive explanation. – Falcongentle
23

Ruby 1.8 only has green threads; there is no way to create a real "OS-level" thread. But Ruby 1.9 will have a new feature called fibers, which will allow you to create actual OS-level threads. Unfortunately, Ruby 1.9 is still in beta; it is scheduled to be stable in a couple of months.

Another alternative is to use JRuby. JRuby implements threads as OS-level threads; there are no "green threads" in it. The latest version of JRuby is 1.1.4 and is equivalent to Ruby 1.8.
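For reference, and in line with the corrections in the comments under this answer, here is a minimal sketch of how fibers behave in Ruby 1.9 and later. It only uses Fiber.new, Fiber#resume and Fiber.yield; the log messages are illustrative. Control moves between the caller and the fiber only at explicit resume/yield points, which is what makes fibers cooperative rather than a way to get OS-level threads.

```ruby
log = []

fiber = Fiber.new do
  log << 'fiber: step 1'
  Fiber.yield            # hand control back to the caller
  log << 'fiber: step 2'
  :done                  # value returned by the final resume
end

log << 'main: before first resume'
fiber.resume             # runs the fiber up to Fiber.yield
log << 'main: between resumes'
result = fiber.resume    # runs the rest of the fiber
log << "main: fiber returned #{result.inspect}"

puts log
```

Nothing in this sketch runs concurrently: the fiber makes progress only while the caller is inside resume.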

Possing answered 11/9, 2008 at 9:5 Comment(3)
It's false that Ruby 1.8 has only green threads, several implementations of Ruby 1.8 have native threads: JRuby, XRuby, Ruby.NET and IronRuby. Fibers don't allow the creation of native threads, they are more lightweight than threads. They are actually semi-coroutines, i.e. they are cooperative. – Operation
I think it's pretty obvious from Josh's answer that he means Ruby 1.8 the runtime, a.k.a. MRI, and not Ruby 1.8 the language, when he says Ruby 1.8. – Hauberk
@Hauberk It's also obvious that he mixes up concepts in his answer. Fibers are not a way to create native threads; as already mentioned, they are even more lightweight than threads, and current CRuby has native threads but with a GIL. – Yoghurt
9

It depends on the implementation:

  • MRI doesn't have real multithreading; YARV is closer.
  • JRuby and MacRuby do.

Ruby has closures as Blocks, lambdas and Procs. To take full advantage of closures and multiple cores in JRuby, Java's executors come in handy; for MacRuby I like GCD's queues.

Note that being able to create real "OS-level" threads doesn't imply that you can use multiple CPU cores for parallel processing. Look at the examples below.

This is the output of a simple Ruby program that uses 3 threads, run with Ruby 2.1.0:

(jalcazar@mac ~)$ ps -M 69877
USER     PID   TT   %CPU STAT PRI     STIME     UTIME COMMAND
jalcazar 69877 s002    0.0 S    31T   0:00.01   0:00.04 /Users/jalcazar/.rvm/rubies/ruby-2.1.0/bin/ruby threads.rb
   69877         0.0 S    31T   0:00.01   0:00.00 
   69877        33.4 S    31T   0:00.01   0:08.73 
   69877        43.1 S    31T   0:00.01   0:08.73 
   69877        22.8 R    31T   0:00.01   0:08.65 

As you can see here, there are four OS threads, however only the one with state R is running. This is due to a limitation in how Ruby's threads are implemented.
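The threads.rb script behind these listings is not shown. A hypothetical script that would produce a similar picture might look like the sketch below; the `busy_work` helper and the iteration count are assumptions, and the count would need to be raised considerably to keep the process alive long enough to inspect it with ps -M.

```ruby
# threads.rb (hypothetical reconstruction; the original script is not shown)

# CPU-bound loop that keeps one thread busy.
def busy_work(iterations)
  x = 0
  iterations.times { x += 1 }
  x
end

# Assumed value; increase it to have time to run `ps -M <pid>`.
ITERATIONS_PER_THREAD = 1_000_000

puts "pid: #{Process.pid}"

# Three worker threads, as in the listings.
workers = Array.new(3) { Thread.new { busy_work(ITERATIONS_PER_THREAD) } }

# Thread#value joins the thread and returns the block's result.
counts = workers.map(&:value)
puts counts.inspect
```

With three workers plus the main thread, this would account for four of the OS threads in the Ruby 2.1.0 listing.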



Same program, now with JRuby. You can see three threads with state R, which means they are running in parallel.

(jalcazar@mac ~)$ ps -M 72286
USER     PID   TT   %CPU STAT PRI     STIME     UTIME COMMAND
jalcazar 72286 s002    0.0 S    31T   0:00.01   0:00.01 /Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/bin/java -Djdk.home= -Djruby.home=/Users/jalcazar/.rvm/rubies/jruby-1.7.10 -Djruby.script=jruby -Djruby.shell=/bin/sh -Djffi.boot.library.path=/Users/jalcazar/.rvm/rubies/jruby-1.7.10/lib/jni:/Users/jalcazar/.rvm/rubies/jruby-1.7.10/lib/jni/Darwin -Xss2048k -Dsun.java.command=org.jruby.Main -cp  -Xbootclasspath/a:/Users/jalcazar/.rvm/rubies/jruby-1.7.10/lib/jruby.jar -Xmx1924M -XX:PermSize=992m -Dfile.encoding=UTF-8 org/jruby/Main threads.rb
   72286         0.0 S    31T   0:00.00   0:00.00 
   72286         0.0 S    33T   0:00.00   0:00.00 
   72286         0.0 S    31T   0:00.09   0:02.34 
   72286         7.9 S    31T   0:00.15   0:04.63 
   72286         0.0 S    31T   0:00.00   0:00.00 
   72286         0.0 S    31T   0:00.00   0:00.00 
   72286         0.0 S    31T   0:00.00   0:00.00 
   72286         0.0 S    31T   0:00.04   0:01.68 
   72286         0.0 S    31T   0:00.03   0:01.54 
   72286         0.0 S    31T   0:00.00   0:00.00 
   72286         0.0 S    31T   0:00.01   0:00.01 
   72286         0.0 S    31T   0:00.00   0:00.01 
   72286         0.0 S    31T   0:00.00   0:00.03 
   72286        74.2 R    31T   0:09.21   0:37.73 
   72286        72.4 R    31T   0:09.24   0:37.71 
   72286        74.7 R    31T   0:09.24   0:37.80 


The same program, now with MacRuby. There are also three threads running in parallel. This is because MacRuby threads are POSIX threads (real "OS-level" threads) and there is no GVL.

(jalcazar@mac ~)$ ps -M 38293
USER     PID   TT   %CPU STAT PRI     STIME     UTIME COMMAND
jalcazar 38293 s002    0.0 R     0T   0:00.02   0:00.10 /Users/jalcazar/.rvm/rubies/macruby-0.12/usr/bin/macruby threads.rb
   38293         0.0 S    33T   0:00.00   0:00.00 
   38293       100.0 R    31T   0:00.04   0:21.92 
   38293       100.0 R    31T   0:00.04   0:21.95 
   38293       100.0 R    31T   0:00.04   0:21.99 


Once again, the same program, now with good old MRI 1.8.7. Because this implementation uses green threads, only one thread shows up.

(jalcazar@mac ~)$ ps -M 70032
USER     PID   TT   %CPU STAT PRI     STIME     UTIME COMMAND
jalcazar 70032 s002  100.0 R    31T   0:00.08   0:26.62 /Users/jalcazar/.rvm/rubies/ruby-1.8.7-p374/bin/ruby threads.rb



If you are interested in Ruby multi-threading, you might find my report "Debugging parallel programs using fork handlers" interesting.
For a more general overview of the Ruby internals, "Ruby Under a Microscope" is a good read.
Also, "Ruby Threads and the Global Interpreter Lock in C" in Omniref explains in the source code why Ruby threads don't run in parallel.

Marysa answered 3/2, 2014 at 16:0 Comment(1)
By RMI, you mean MRI? – Xanthous
4

How about using drb? It's not real multi-threading but communication between several processes, but you can use it now in 1.8 and it's fairly low friction.
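A minimal sketch of that idea: one forked process exposes an object over DRb and the parent calls it as if it were local. The `SquareService` class is hypothetical, and the sketch assumes a Unix-like system with fork and a working loopback interface. Port 0 asks the operating system to pick a free port, and the child reports the resulting URI back through a pipe.

```ruby
require 'drb/drb'

# Hypothetical service object exposed to other processes.
class SquareService
  def square(n)
    n * n
  end
end

reader, writer = IO.pipe

server_pid = Process.fork do
  reader.close
  # Port 0 lets the OS choose a free port; DRb.uri then holds the real URI.
  DRb.start_service('druby://127.0.0.1:0', SquareService.new)
  writer.puts(DRb.uri)
  writer.close
  DRb.thread.join # keep serving until the process is killed
end

writer.close
uri = reader.gets.chomp # blocks until the server is listening
reader.close

remote = DRbObject.new_with_uri(uri)
answer = remote.square(12) # executed inside the server process
puts answer

# Stop the demo server abruptly; fine for a sketch, not for real services.
Process.kill('KILL', server_pid)
Process.wait(server_pid)
```

The call to `square` is marshalled, sent over a socket and executed in the other process, so several such server processes could work on different cores at the same time.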

Sear answered 11/9, 2008 at 11:57 Comment(0)
3

I'll let the "System Monitor" answer this question. I'm executing the same code (below, which calculates prime numbers) with 8 Ruby threads running on an i7 (4 hyperthreaded-core) machine in both cases... the first run is with:

jruby 1.5.6 (ruby 1.8.7 patchlevel 249) (2014-02-03 6586) (OpenJDK 64-Bit Server VM 1.7.0_75) [amd64-java]

The second is with:

ruby 2.1.2p95 (2014-05-08) [x86_64-linux-gnu]

Interestingly, CPU usage is higher for the JRuby threads, but the time to completion is slightly shorter for the interpreted Ruby. It's kind of difficult to tell from the graph, but the second (interpreted Ruby) run uses about 1/2 the CPUs (no hyperthreading?).

[Image: System Monitor CPU usage graph for the two runs]

def eratosthenes(n)
  nums = [nil, nil, *2..n]
  (2..Math.sqrt(n)).each do |i|
    (i**2..n).step(i){|m| nums[m] = nil}  if nums[i]
  end
  nums.compact
end

MAX_PRIME = 10_000_000
THREADS = 8

# Start THREADS threads that each run the same CPU-bound sieve.
threads = (1..THREADS).map do |num|
  puts "Starting thread #{num}"
  Thread.new { eratosthenes MAX_PRIME }
end

# Wait for every thread to finish.
threads.each(&:join)

Peirsen answered 22/2, 2015 at 23:7 Comment(0)
1

If you are using MRI, then you can write the threaded code in C either as an extension or using the ruby-inline gem.

Tipsy answered 11/9, 2008 at 18:50 Comment(0)
1

If you really need parallelism in Ruby for a production-level system (where you cannot employ a beta), processes are probably a better alternative.
But it is most definitely worth trying threads under JRuby first.

Also, if you are interested in the future of threading under Ruby, you might find this article useful.

Sheath answered 11/9, 2008 at 19:10 Comment(1)
JRuby is a good option. For parallel processing using processes I like github.com/grosser/parallel: Parallel.map(['a','b','c'], :in_processes=>3){... – Marysa
1

Here is some info on Rinda, which is a Ruby implementation of Linda (a parallel processing and distributed computing paradigm): http://charmalloc.blogspot.com/2009/12/linda-tuples-rinda-drb-parallel.html
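To give a flavour of the tuple-space model, here is a small sketch that uses a local Rinda::TupleSpace shared between the main thread and one worker thread. The tuple layouts (`[:task, n]` and `[:result, n, square]`) are invented for illustration; in a real distributed setup the space would typically be shared between processes over DRb rather than used within a single process.

```ruby
require 'rinda/tuplespace'

space = Rinda::TupleSpace.new

# Producer: post three tasks as tuples.
[3, 4, 5].each { |n| space.write([:task, n]) }

# Worker: take tasks out of the space and post results back.
worker = Thread.new do
  3.times do
    _tag, n = space.take([:task, nil]) # nil acts as a wildcard
    space.write([:result, n, n * n])
  end
end

# Consumer: take blocks until a matching tuple is available.
results = 3.times.map { space.take([:result, nil, nil]) }
worker.join

squares = results.map { |_tag, n, sq| [n, sq] }.sort
p squares
```

Producers and consumers never reference each other directly; they only agree on the shape of the tuples, which is the core of the Linda idea.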

Tapping answered 1/1, 2010 at 0:24 Comment(0)
1

I could not edit that answer, so I am adding a new reply here.

Update (2017-05-08)

That answer is very old, and its information no longer follows the current (2017) trends. Here are some additions:

  1. Opal is a Ruby-to-JavaScript source-to-source compiler. It also has an implementation of the Ruby corelib. It is under very active development, a good number of (frontend) frameworks are built on it, and it is production ready. Because it is based on JavaScript, it does not support parallel threads.

  2. TruffleRuby is a high-performance implementation of the Ruby programming language. Built on GraalVM by Oracle Labs, TruffleRuby is a fork of JRuby, combining it with code from the Rubinius project and also containing code from the standard implementation of Ruby, MRI. It is still under active development and not production ready. This Ruby seems to be built for performance; I don't know whether it supports parallel threads, but I think it should.

Rhinencephalon answered 8/5, 2017 at 14:26 Comment(0)
