IPython.parallel not using multicore?
I am experimenting with IPython.parallel and just want to launch several shell commands on different engines.

I have the following Notebook:

Cell 0:

from IPython.parallel import Client
client = Client()
print len(client)
5

And launch the commands:

Cell 1:

%%px --targets 0 --noblock
!python server.py

Cell 2:

%%px --targets 1 --noblock
!python mincemeat.py 127.0.0.1

Cell 3:

%%px --targets 2 --noblock
!python mincemeat.py 127.0.0.1

This runs the mincemeat implementation of MapReduce. When I launch the first !python mincemeat.py 127.0.0.1 it uses roughly 100 % of one core; when I launch the second, each drops to 50 %. I have 4 cores (plus virtual cores) on the machine and can use them all when launching directly from the terminal, but not in the Notebook.

Is there something I am missing? I would like each !python mincemeat.py 127.0.0.1 command to use its own core.

EDIT:
For clarity, here's another thing that's not using multiple cores:

Cell 1:

%%px --targets 0 --noblock

a = 0
for i in xrange(100000):
    for j in xrange(10000):
        a += 1

Cell 2:

%%px --targets 1 --noblock

a = 0
for i in xrange(100000):
    for j in xrange(10000):
        a += 1

I suppose I am missing something. I believe those two cells should run on different cores if available. However, that does not seem to be the case: CPU usage shows that they share the same core and each uses 50 % of it. What did I do wrong?

Nigercongo answered 1/5, 2013 at 18:4 Comment(7)
I'm not sure what the point is of using IPython.parallel here, when you are just running single-line shell commands on one machine at a time, but it is unlikely that IPython.parallel has any ability to interfere with how many cores your subprocesses are using. What does it look like if you do this same example without IPython.parallel (since it's just three one-line shell calls)?Teshatesla
Hi @minrk. Without IPython.parallel the cells are blocking, which is much less interesting. To clarify: I don't want to run one process on several cores; I would rather have each process get its own core. That's why I assign each command to a different target. However, it seems that all the engines (targets 0 to 4) are running on the same core.Nigercongo
I mean do it in three plain terminal sessions - that's all you are doing right now, running a single shell command in three separate Sessions. IPython is not really involved at all.Teshatesla
The main thing I want to isolate is the fact that I doubt IPython actually has anything to do with this behavior. I would like you to demonstrate that your script will behave differently when run outside of IPython entirely, which I don't expect it will. You are not actually running any nontrivial code in an IPython process - ! starts a new shell subprocess.Teshatesla
@minrk, as I said in the post, when using different shells it works as I expect. I actually started by launching the commands in different terminal sessions and got 6x100% CPU usage (I actually launched 6 mappers). I would like to reproduce that in the notebook.Nigercongo
@minrk. Can you tell me what behaviour you get with the second example in IPython? The one with the two dummy loops.Nigercongo
let us continue this discussion in chatNigercongo

Summary of the chat discussion:

CPU affinity is a mechanism for pinning a process to a particular CPU core, and the issue here is that sometimes importing numpy can end up pinning Python processes to CPU 0, as a result of linking against particular BLAS libraries. You can unpin all of your engines by running this cell:

%%px
import os
import psutil
from multiprocessing import cpu_count

p = psutil.Process(os.getpid())
p.set_cpu_affinity(range(cpu_count()))  # psutil >= 2.0 renamed this to p.cpu_affinity(...)
print p.get_cpu_affinity()              # psutil >= 2.0: p.cpu_affinity() with no argument

This uses multiprocessing.cpu_count to get the number of CPUs, and then allows every engine to run on any of them.
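On Linux with Python 3, the same pin/unpin mechanism can also be sketched with the standard library alone, via os.sched_setaffinity / os.sched_getaffinity, if psutil is not available. This is a single-process illustration of what the BLAS pinning does and how the fix undoes it:

```python
import os

PID = 0  # 0 means "the calling process" for the sched_* functions

original = os.sched_getaffinity(PID)   # CPUs this process may currently run on

# Simulate the pinning a BLAS library can cause: restrict to one core.
os.sched_setaffinity(PID, {min(original)})
print(os.sched_getaffinity(PID))       # a single-element set, e.g. {0}

# Unpin: restore the full original CPU set, which is what the
# psutil cell above does for each engine.
os.sched_setaffinity(PID, original)
print(os.sched_getaffinity(PID))
```

Note that os.sched_getaffinity/os.sched_setaffinity are Linux-only; psutil is the portable option.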

An IPython notebook exploring the issue.

Teshatesla answered 2/5, 2013 at 4:9 Comment(0)
