Import custom modules on IPython.parallel engines with sync_imports()

I've been playing around with IPython.parallel and wanted to use some custom modules of my own, but I haven't been able to import them the way the cookbook explains, using dview.sync_imports(). The only thing that has worked for me is something like

def my_parallel_func(args):
    import sys
    sys.path.append('/path/to/my/module')
    import my_module
    #and all the rest

and then, in the main script, just calling

if __name__ == '__main__':
    # set up dview...
    dview.map(my_parallel_func, my_args)

The correct way to do this would in my opinion be something like

with dview.sync_imports():
    import sys
    sys.path.append('/path/to/my/module')
    import my_module

but this throws an error saying there is no module named my_module.

So, what is the right way to do this using dview.sync_imports()?

Pitchman answered 2/9, 2013 at 9:42 Comment(1)
Right, but sys.path does. Corrected it! - Pitchman

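The problem is that you're changing the Python path (sys.path) only in the local process running the Client, and not in the remote engine processes started by ipcluster. sync_imports() replays only the import statements on the engines; the sys.path.append() call inside the with block runs locally and nowhere else.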

You can observe this behaviour by running the following code:

from IPython.parallel import Client

rc = Client()
dview = rc[:]

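with dview.sync_imports():
    # The import statement is repeated on every engine...
    import sys
    # ...but this assignment only runs in the local (Client) process.
    sys.path[:] = ['something']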
   
def parallel(x):
    import sys
    return sys.path

print 'Local: ', sys.path
print 'Remote: ', dview.map_sync(parallel, range(1))

Basically, every module that you want to import with sync_imports() must already be importable on the engines, that is, located in a directory that is on each engine's PYTHONPATH.

If a module is not on the PYTHONPATH, then you must add its directory to sys.path inside the function that you execute remotely, and import the module inside that function as well.
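
A minimal sketch of that second approach, written so that it can be tried without a cluster. The names demo_module and remote_task and the temporary directory are hypothetical stand-ins for the real module and path. The function is called locally here; with a real view it would be handed to dview.map_sync instead, assuming the module directory is also visible from the engines' filesystem.

```python
import os
import tempfile

# Hypothetical module directory, standing in for '/path/to/my/module'.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_module.py"), "w") as fh:
    fh.write("def double(x):\n    return 2 * x\n")


def remote_task(args):
    """Self-contained task: fix up sys.path, import, then do the work."""
    path, value = args
    import sys  # imported inside the function so it runs where the task runs
    if path not in sys.path:  # avoid growing sys.path on every call
        sys.path.append(path)
    import demo_module
    return demo_module.double(value)


# Local stand-in for: dview.map_sync(remote_task, task_args)
task_args = [(module_dir, n) for n in range(3)]
results = list(map(remote_task, task_args))
print(results)
```

The path travels with each task as an argument rather than as a global, because a function shipped to another process cannot rely on names defined in the sending script.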

Langrage answered 2/9, 2013 at 10:14 Comment(7)
OK, so I did that, but for some reason the remote engines are not getting the right PYTHONPATH. Which environment do the remote engines take their PYTHONPATH from? The one I ran the script from? The default shell for the system? Python config files? I ran your exact script, PYTHONPATH is correctly configured, and the local import works, but the remote import doesn't. - Pitchman
@AlexS How do you start the remote engines? ipcluster start -n 4? - Langrage
Yes, exactly. I managed to get around the problem by modifying the engine startup script to import sys and modify sys.path directly. Then the imports work normally. - Pitchman
@AlexS You should just modify the PYTHONPATH environment variable. I think it would be easier... If you're on a POSIX system, just run it as: PYTHONPATH=${PYTHONPATH}:/path/to/your/module ipcluster start -n 4 - Langrage
I'm on a Debian box, using zsh; both bash and zsh show the correct PYTHONPATH if I run echo $PYTHONPATH. Not sure what I'm doing wrong. Right, if I run the ipcluster command that way it should work. - Pitchman
Your module is on the Python path? Weird :-# - Langrage
Just a hint: the working directory from which you run ipcluster start -n 4 is on the PYTHONPATH. If all required custom modules are in the same directory, running ipcluster from that directory is a possible way to go. - Bilski
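
The PYTHONPATH advice in these comments rests on how a fresh Python process builds its sys.path: it reads the environment variable at startup and knows nothing about sys.path changes made in whichever process launched it. The sketch below illustrates this with a plain subprocess standing in for an engine; it does not involve ipcluster at all, and demo_env_module, run_child and the temporary directories are hypothetical names chosen for the illustration.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical module directory, standing in for '/path/to/your/module'.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_env_module.py"), "w") as fh:
    fh.write("VALUE = 42\n")

# Changing sys.path here affects this process only.
sys.path.append(module_dir)

probe = "import demo_env_module; print(demo_env_module.VALUE)"


def run_child(env):
    """Start a fresh interpreter and return (exit code, stripped stdout)."""
    proc = subprocess.run(
        [sys.executable, "-c", probe],
        env=env,
        cwd=tempfile.mkdtemp(),  # empty directory, so the cwd cannot help
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout.strip()


# Child 1: inherits the environment unchanged, so the import fails.
without_env = run_child(os.environ.copy())

# Child 2: same environment, with module_dir prepended to PYTHONPATH.
env = os.environ.copy()
existing = env.get("PYTHONPATH")
env["PYTHONPATH"] = module_dir if not existing else module_dir + os.pathsep + existing
with_env = run_child(env)

print("without PYTHONPATH:", without_env)
print("with PYTHONPATH:", with_env)
```

The first child exits with an error because it cannot find the module, while the second prints 42, which is why setting PYTHONPATH on the command line that starts the engines makes the directory visible to them.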
