Understanding Celluloid Concurrency
Here is my Celluloid code.

  1. client1.rb: one of the two clients (I call it client 1).

  2. client2.rb: the second of the two clients (client 2).

Note:

The only difference between the two clients is the text passed to the server ('client-1' and 'client-2' respectively).

I tested these two clients (running them side by side) against each of the following two servers, one at a time, and got very strange results.

  1. server1.rb (a basic example taken from the README.md of the celluloid-zmq)

    With this as the server, the requests from the two clients were processed in parallel.

OUTPUT

ruby server1.rb

Received at 04:59:39 PM and message is client-1
Going to sleep now
Received at 04:59:52 PM and message is client-2

Note:

The client2.rb message was processed while the client1.rb request was sleeping (a sign of parallelism).

  2. server2.rb

    With this as the server, the requests from the two clients were not processed in parallel.

OUTPUT

ruby server2.rb

Received at 04:55:52 PM and message is client-1
Going to sleep now
Received at 04:56:52 PM and message is client-2

Note:

client-2 had to wait 60 seconds because client-1 was sleeping (a 60-second sleep).

I ran the above tests multiple times, and every run showed the same behaviour.

Can anyone explain these results?

Question: Why does Celluloid wait 60 seconds before it can process the other request, as seen in the server2.rb case?

Ruby version

ruby -v

ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-darwin13.0]

Ageold answered 28/1, 2016 at 12:42 Comment(3)
What ruby engine do you use?Astrometry
Have you tried NOT using the DisplayMessage class in server2.rb? That's also a difference.Astrometry
@Astrometry Updated Ruby Version in the Post.Ageold

Using your gists, I verified that this issue can be reproduced in MRI 2.2.1 as well as JRuby 1.7.21 and Rubinius 2.5.8. The difference between server1.rb and server2.rb is the use of the DisplayMessage class and its message class method in the latter.


Use of sleep in DisplayMessage is out of Celluloid scope.

When sleep is used in server1.rb, it is actually Celluloid.sleep; when used in server2.rb, it is Kernel.sleep, which locks up the mailbox for Server until the 60 seconds have passed. This prevents further method calls on that actor from being processed until the mailbox is processing messages (method calls on the actor) again.
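To make the "locked mailbox" effect concrete, here is a minimal sketch that uses only Ruby's standard library. It is not Celluloid's implementation: the `run_mailbox` helper, the single worker thread, and the shortened 0.2-second nap are all assumptions made for illustration. It only shows that a blocking sleep inside the one thread draining a queue delays every later message.

```ruby
require "thread"

# Simplified model of one actor: a single thread draining a mailbox (a Queue).
# Hypothetical helper for illustration; this is NOT how Celluloid is written.
def run_mailbox(messages, nap)
  mailbox = Queue.new
  handled_at = {}
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

  actor = Thread.new do
    while (msg = mailbox.pop) != :stop
      # Record how long after start this message began to be handled.
      handled_at[msg] = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
      # Kernel#sleep blocks the whole thread, so nothing else is popped meanwhile.
      sleep nap if msg == "client-1"
    end
  end

  messages.each { |m| mailbox << m }
  mailbox << :stop
  actor.join
  handled_at
end

timings = run_mailbox(["client-1", "client-2"], 0.2)
puts format("client-2 was handled %.2fs after start", timings["client-2"])
```

In this model, client-2 is only handled once the nap for client-1 is over, which mirrors the 60-second gap seen with server2.rb.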

There are three ways to resolve this:

  • Use a defer {} or future {} block.

  • Explicitly invoke Celluloid.sleep rather than sleep (a bare sleep ends up calling Kernel.sleep, since DisplayMessage does not include Celluloid as Server does).

  • Bring the contents of DisplayMessage.message into handle_message as in server1.rb; or at least into Server, which is in Celluloid scope, and will use the correct sleep.


The defer {} approach:

def handle_message(message)
  defer {
    DisplayMessage.message(message)
  }
end

The Celluloid.sleep approach:

class DisplayMessage
  def self.message(message)
    #de ...
    Celluloid.sleep 60
  end
end

Not truly a scope issue; it's about asynchrony.

To reiterate, the deeper issue is not the scope of sleep ... that's why defer and future are my best recommendation. But to post something here that came out in my comments:

Using defer or future pushes a task that would otherwise tie up an actor into another thread. If you use future, you can get the return value once the task is done; if you use defer, you can fire and forget.

But better yet, create another actor for tasks that tend to get tied up, and even pool that other actor... if defer or future don't work for you.
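The difference between fire-and-forget and a future described above can be sketched with plain Ruby threads. This is only an analogy under stated assumptions: it does not use Celluloid's defer or future, and the variable names and the 0.1-second stand-in for slow work are hypothetical.

```ruby
require "thread"

# Fire and forget (roughly the defer idea): the caller does not wait for the work.
log = Queue.new
background = Thread.new do
  sleep 0.1                      # stand-in for a slow, blocking call
  log << "slow work finished"
end
log << "caller carried on"       # runs while the background work is still sleeping

# Future-like: run the work elsewhere, then ask for its result when it is needed.
pending = Thread.new do
  sleep 0.1
  6 * 7
end
answer = pending.value           # blocks only here, until the result is ready

background.join
order = [log.pop, log.pop]
puts order.inspect
puts answer
```

In both cases the slow call runs off the caller's thread; the only difference is whether the caller ever comes back for a result.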

I'd be more than happy to answer follow-up questions brought up by this question; we have a very active mailing list, and IRC channel. Your generous bounties are commendable, but plenty of us would help purely to help you.

Astrometry answered 30/1, 2016 at 17:52 Comment(12)
Sorry, but i noticed your answer after i had finished beautifying and posted mine. Anyways your answer has more approaches, so sounds better to me too. Here's my +1.Gomez
you might want to correct the statement in the 2nd bullet-point of the "three ways to resolve this" list. I have verified that invoking Celluloid.sleep() within DisplayMessage.message() does in-fact trigger the actor sleeping path. It appears to be that Celluloid can correctly determine the "actor" context even if methods of other objects are invoked by a Celluloid "actor".Gomez
@Gomez you mean change the wording in the parenthesis? Because invoking Celluloid.sleep does work, like in my code sample. The sleep method body I linked to does detect if it's being run from inside an actor, but it does need to be explicitly called ( if the call is made outside Celluloid scope )Astrometry
"which will end up calling Kernel.sleep" does not appear to be true. I added logs in my local copy of celluloid.rb and saw that invoking Celluloid.sleep() within DisplayMessage.message() does call actor.sleep (not kernel.sleep())Gomez
... we're saying the same thing ... and if we're not, I'm very confused about what you mean, because your entire answer reiterates my bullet point several times. If explicitly called, Celluloid.sleep will invoke the appropriate method, whereas sleep by itself, if called within DisplayMessage scope, will call Kernel.sleep ... I will try to tweak that line to make it clearer?Astrometry
I edited it, maybe that's clearer but the message is the same as I first wrote. But for the record, I favor defer { } over explicitly calling Celluloid.sleep because sometimes we do not have control over external classes ... or we don't wish to invade them to make them compatible with Celluloid ... defer {} is the universal cure, no matter whether one owns the offending class or not.Astrometry
Yep. Crystal clear now. Next to wait for OP to wake-up and [+300] your answer. (^.^)/ Cheers.Gomez
Thanks/good. Nice running into you -- followed you on Twitter. I am a Celluloid maintainer ( hence knowledge of defer which is obscure ) and would gladly invite you to work on Celluloid, as you apparently have put a lot of work into understanding it.Astrometry
Oh! but i had no idea celluloid existed before today. I was scanning for any unanswered/neglected questions and came across this. Kudos to the self-explanatory code and the fact that its on github.Gomez
@Astrometry So what I understand any potential IO that is not a part of Celluloid core classes would cause Celluloid to work seriallyAgeold
The purpose of Celluloid::IO is to prevent that, but when an actor gets tied up, its mailbox will not be processing new requests. They will be received when the actor is no longer tied up. Reason being, the actor and its mailbox are operating in one thread, and each method call against the actor is a task which occurs in a fiber... by default. It is possible to use tasks which are each themselves a thread, but that's not the default behavior. In general, it's best to assume that a long-running process will tie up an actor. This is why I recommend defer or future where needed.Astrometry
Using defer or future pushes a task that would cause an actor to become tied up into another thread. If you use future, you can get the return value once the task is done, if you use defer you can fire & forget. But better yet, you can create another actor for tasks that tend to get tied up, and even pool that other actor... if defer or future don't work for you. I'd be more than happy to answer follow-up questions brought up by this question, we have a very active mailing list, and IRC channel. Your generous bounties are commendable, but plenty of us would help purely to help you.Astrometry

Managed to reproduce and fix the issue, so I am deleting my previous answer. Apparently, the problem lies in sleep. I confirmed this by adding "actor/kernel sleeping" logs to sleep() in my local copy of celluloid.rb.


In server1.rb,

the call to sleep is within Server, a class that includes Celluloid.

Thus Celluloid's implementation of sleep overrides the native sleep.

class Server
  include Celluloid::ZMQ

  ...

  def run
    loop { async.handle_message @socket.read }
  end

  def handle_message(message)

    ...

    sleep 60
  end
end

Note the "actor sleeping" log from server1.rb (the log I added to sleep() in celluloid.rb).

This suspends only the current task inside the Celluloid actor, i.e. only the task handling client-1 sleeps, and the actor stays free to take other messages.


In server2.rb,

the call to sleep is within a different class DisplayMessage that does NOT include Celluloid.

Thus it is the native sleep itself.

class DisplayMessage
  def self.message(message)

    ...

    sleep 60
  end
end

Note the ABSENCE of any "actor sleeping" log from server2.rb.

This blocks the actor's whole thread, i.e. the Server actor itself sleeps (not just the single task handling client-1), so no other message is processed in the meantime.
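The reason a bare sleep means two different things in the two servers is ordinary Ruby method lookup. The sketch below uses a hypothetical `ActorLike` module as a stand-in for a library mixin that defines its own sleep; it is not Celluloid, and the class names only mirror the ones in the question.

```ruby
# Hypothetical stand-in for a library mixin that supplies its own sleep.
module ActorLike
  def sleep(interval)
    :library_sleep   # a real library would suspend only the current task here
  end
end

class Server
  include ActorLike

  def handle_message
    sleep 0          # resolves to ActorLike#sleep, found before Kernel#sleep
  end
end

class DisplayMessage
  def self.message
    sleep 0          # no mixin in the lookup path, so this is Kernel#sleep
  end
end

server_owner  = Server.new.method(:sleep).owner
display_owner = DisplayMessage.method(:sleep).owner
puts "Server uses #{server_owner}#sleep"
puts "DisplayMessage uses #{display_owner}#sleep"
```

Because the lookup depends on the receiver's ancestors and not on who called the method, being invoked from an actor does not change which sleep a bare call inside DisplayMessage reaches.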


The Fix?

In server2.rb, the appropriate sleep must be explicitly specified.

class DisplayMessage
    def self.message(message)
        puts "Received at #{Time.now.strftime('%I:%M:%S %p')} and message is #{message}"
        ## Intentionally added sleep to test whether Celluloid blocks the main process for 60 seconds or not.
        if message == 'client-1'
           puts 'Going to sleep now'.red

           # "sleep 60" will invoke the native sleep.
           # Use Celluloid.sleep to support concurrent execution
           Celluloid.sleep 60
        end
    end
end
Gomez answered 30/1, 2016 at 18:12 Comment(0)
