pykka -- Actors are slow?

I am currently experimenting with actor-based concurrency in Python because I want to learn more about it. I chose pykka for this, but when I test it, it is more than twice as slow as a normal function.

The code is only meant to check whether it works; it's not meant to be elegant. :)

Maybe I did something wrong?

from pykka.actor import ThreadingActor
import numpy as np

class Adder(ThreadingActor):
    def add_one(self, i):
        l = []
        for j in i:
            l.append(j+1)
        return l

if __name__ == '__main__':
    data = np.random.random(1000000)
    adder = Adder.start().proxy()
    adder.add_one(data)
    adder.stop()

This does not run very fast:

time python actor.py

real    0m8.319s
user    0m8.185s
sys     0m0.140s

And now the plain 'normal' function for comparison:

def foo(i):
    l = []
    for j in i:
        l.append(j+1)
    return l

if __name__ == '__main__':
    data = np.random.random(1000000)
    foo(data)

It gives this result:

real    0m3.665s
user    0m3.348s
sys     0m0.308s
Mcginn answered 1/12, 2011 at 9:53 Comment(0)

What is happening here is that your plain-function version creates two very large lists, and that accounts for the bulk of the time. When you introduce actors, mutable data such as lists must be copied before being sent to the actor to maintain proper concurrency. The list created inside the actor must also be copied when it is sent back to the sender. So instead of two very large lists being created, four are created.
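
The copy cost described above can be illustrated with a rough sketch. It is not Pykka code: `copy.deepcopy` stands in for a message copy on the way in and on the way out, and the names `add_one`, `copying_call`, and `timed` are made up for the illustration. Note that a comment on this answer points out that Pykka itself does not copy messages by default, so this models the general actor argument rather than what Pykka does.

```python
import copy
import time


def add_one(values):
    """Build a new list with every element incremented by one."""
    return [v + 1 for v in values]


def copying_call(values):
    """Same work, but copy the argument and the result (assumed message copies)."""
    sent = copy.deepcopy(values)      # copy made when the message is "sent"
    returned = add_one(sent)          # list built inside the "actor"
    return copy.deepcopy(returned)    # copy made when the reply is "sent back"


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    data = [float(i) for i in range(200_000)]
    _, direct = timed(add_one, data)
    _, copied = timed(copying_call, data)
    print(f"direct call:  {direct:.3f}s")
    print(f"with copies:  {copied:.3f}s")
```

Both functions return equal lists; the second one simply allocates two extra large lists along the way, which is the overhead the paragraph above describes.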

Consider designing things so that the data is constructed and maintained by the actor and then queried through calls to the actor, minimizing the size of the messages passed back and forth. Try to apply the principle of minimal data movement. Passing the list in the functional case is only efficient because the data does not actually move, thanks to the shared memory space. If the actor were on a different machine, we would not have the benefit of a shared memory space, even if the message data were immutable and didn't need to be copied.
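
As a minimal sketch of that design, the toy actor below builds and keeps its data inside its own thread and only exchanges small messages (a command name in, a number out). It uses only the standard library and is not Pykka's API; `DataActor`, `ask`, and the command strings are hypothetical names chosen for the example.

```python
import queue
import threading
from concurrent.futures import Future


class DataActor:
    """Toy actor that owns its data and answers small queries."""

    def __init__(self, size):
        self._size = size
        self._data = []
        self._inbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # The large list is created inside the actor and never leaves it.
        self._data = [float(i) for i in range(self._size)]
        while True:
            message = self._inbox.get()
            if message is None:  # stop signal
                break
            command, reply = message
            try:
                reply.set_result(self._handle(command))
            except Exception as exc:
                reply.set_exception(exc)

    def _handle(self, command):
        if command == "add_one":
            self._data = [x + 1 for x in self._data]
            return len(self._data)  # small reply instead of the whole list
        if command == "total":
            return sum(self._data)
        raise ValueError(f"unknown command: {command!r}")

    def ask(self, command):
        """Queue a command and return a future for the (small) reply."""
        reply = Future()
        self._inbox.put((command, reply))
        return reply

    def stop(self):
        self._inbox.put(None)
        self._thread.join()


if __name__ == "__main__":
    actor = DataActor(1_000_000)
    print(actor.ask("add_one").result())  # number of elements updated
    print(actor.ask("total").result())    # one float comes back, not a list
    actor.stop()
```

Only short strings and single numbers cross the thread boundary here, so the message size stays constant no matter how large the list held by the actor is.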

Dulciedulcify answered 1/12, 2011 at 11:0 Comment(4)
And how could I make this efficient? I made a tuple out of the numpy array, but the gain is not very high. (For example: distribute the loop efficiently over several actors.) - Mcginn
I updated my answer. Basically, it's more challenging to design, but it has more benefits, such as execution over a cluster of machines. - Dulciedulcify
Thank you, I'll try to build something now. - Mcginn
Agreed 100% with this answer, but specific to Pykka, the implementation does not copy messages by default. See the last line in this section of the documentation. - Patmore
