Printing to stdout in IPython parallel processes

I'm new to IPython and would like to print intermediate results to stdout while running functions on an IPython parallel cluster. (I'm aware that with multiple processes this might mangle the output, but that's fine: it's just for testing/debugging, and the processes I'd be running are long enough that such a collision is unlikely.) I checked the IPython documentation but can't find an example where the parallelized function prints. Basically, I'm looking for a way to redirect the print output of the subprocesses to the main stdout, the IPython equivalent of

subprocess.Popen( ... , stdout=...)

Printing inside the process doesn't work:

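from ipyparallel import Client  # in 2013 this was: from IPython.parallel import Client
rc = Client()
dview = rc[:]  # DirectView on all engines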
def ff(x):
    print(x)
    return x**2
sync = dview.map_sync(ff,[1,2,3,4])
print('sync res=%s'%repr(sync))
async = dview.map_async(ff,[1,2,3,4])
print('async res=%s'%repr(async))
print(async.display_outputs())

returns

sync res=[1, 4, 9, 16]
async res=[1, 4, 9, 16]

So the computation executes correctly, but the print statement in the function ff is never printed, not even when all the processes have returned. What am I doing wrong? How do I get "print" to work?

Cephalic asked 8/3, 2013 at 7:51

It's actually more similar to subprocess.Popen( ... , stdout=PIPE) than you seem to be expecting. Just as the Popen object has a stdout attribute, which you can read to see the stdout of the subprocess, an AsyncResult has a stdout attribute that contains the stdout captured from the engines. It differs in that AsyncResult.stdout is a list of strings, where each item in the list is the stdout of a single engine.
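For comparison, here is a minimal sketch of the subprocess side of that analogy. The child command is made up for illustration; it simply prints two lines so there is something to capture.

```python
import subprocess
import sys

# Hypothetical child process that prints two lines.
# sys.executable keeps the example independent of how Python is installed.
child_code = "print(1); print(2)"
proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,  # capture the child's stdout instead of inheriting ours
    text=True,               # decode bytes to str
)
captured, _ = proc.communicate()  # wait for the child, then return its stdout
lines = captured.splitlines()
print(lines)
```

Nothing from the child appears on the parent's stdout until the parent reads the pipe and prints it, which is the same situation as with the engines.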

So, to start out:

import ipyparallel as parallel  # in 2013 this was: from IPython import parallel
rc = parallel.Client()
dview = rc[:]
def ff(x):
    print(x)
    return x**2
sync = dview.map_sync(ff,[1,2,3,4])
print('sync res=%r' % sync)
async = dview.map_async(ff,[1,2,3,4])
print('async res=%r' % async)
async.get()

gives

sync res=[1, 4, 9, 16]
async res=<AsyncMapResult: ff>

We can see the AsyncResult.stdout list of strings:

print(async.stdout)
['1\n2\n', '3\n4\n']

display_outputs() prints the same captured stdout, grouped by engine:

print('async output:')
async.display_outputs()

which prints:

async output:
[stdout:0] 
1
2
[stdout:1] 
3
4

And here is a notebook with all of this demonstrated.

Some things to note, based on your question:

  1. You have to wait for the AsyncResult to finish before the outputs are ready (async.get()).
  2. display_outputs() does not return anything. It does the printing/displaying itself, so print(async.display_outputs()) doesn't make sense.
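Putting the two notes together, a sketch might look like the following. Some assumptions: it uses the ipyparallel package (the current home of IPython.parallel), it names the result ar because async is a reserved word in Python 3.7 and later, and format_engine_stdout is a hypothetical helper that labels output by list position, only roughly imitating what display_outputs() shows. run_on_cluster needs a running cluster, so the last lines demonstrate the helper offline with the stdout list shown above.

```python
def format_engine_stdout(stdout_per_engine):
    """Label each engine's captured stdout by its position in the list."""
    chunks = []
    for engine_index, text in enumerate(stdout_per_engine):
        if text:  # skip engines that printed nothing
            chunks.append("[stdout:%d]\n%s" % (engine_index, text))
    return "".join(chunks)


def ff(x):
    print(x)
    return x ** 2


def run_on_cluster():
    """Needs ipyparallel installed and a cluster already running."""
    import ipyparallel as parallel

    rc = parallel.Client()
    dview = rc[:]
    ar = dview.map_async(ff, [1, 2, 3, 4])
    results = ar.get()      # note 1: block until the engines have finished
    print(format_engine_stdout(ar.stdout), end="")
    ar.display_outputs()    # note 2: prints by itself, so do not wrap it in print()
    return results


# Offline demonstration using the stdout list from the answer.
sample = ['1\n2\n', '3\n4\n']
formatted = format_engine_stdout(sample)
print(formatted, end="")
```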
Dismal answered 8/3, 2013 at 19:54
Comments (4):
Very helpful answer. Is there any way to see the stdout printouts while computation is happening? - Agosto
yes - for print statements, just do for out in asyncresult.stdout: print(out), which you can do at any time, even while the output is partial. - Dismal
Is there a way to achieve this without having access to the source code? There is a library that I am using that prints log messages in threads, and I'd like for it to print as it runs. Would I have to extend one of IPython's classes to do this? - Teetotum
display_outputs does not print in real time for long-running tasks. - Collectivity
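One way to approximate real-time output, following the suggestion in the comments to read the stdout list while it is still partial, is to poll it and print only what is new. This is a sketch, not part of IPython: drain_new_output and follow are hypothetical helpers, and follow assumes the object passed in has a ready() method and a stdout list that can be read before the result is complete.

```python
import time


def drain_new_output(stdout_per_engine, seen_lengths):
    """Return (index, new_text) pairs and update seen_lengths in place."""
    fresh = []
    for i, text in enumerate(stdout_per_engine):
        text = text or ""
        already = seen_lengths.get(i, 0)
        if len(text) > already:
            fresh.append((i, text[already:]))  # only the part not yet reported
            seen_lengths[i] = len(text)
    return fresh


def follow(async_result, poll_interval=1.0):
    """Print engine output as it arrives until the result is ready."""
    seen = {}
    while True:
        done = async_result.ready()  # check first so the final drain misses nothing
        for engine_index, chunk in drain_new_output(async_result.stdout, seen):
            print("[stdout:%d] %s" % (engine_index, chunk), end="")
        if done:
            break
        time.sleep(poll_interval)


# Offline check: plain lists stand in for successive snapshots of the stdout list.
seen = {}
first = drain_new_output(["1\n", ""], seen)
second = drain_new_output(["1\n2\n", "3\n"], seen)
third = drain_new_output(["1\n2\n", "3\n"], seen)
```

With a real cluster the call would be follow(dview.map_async(ff, [1, 2, 3, 4])). How promptly partial output shows up depends on how quickly the client receives it from the engines.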
