I am trying to make use of my computer's multiple CPUs. However, the BeautifulSoup object that my function returns as part of an SQLAlchemy object is not picklable with pickle or cPickle, so I am using pathos, a fork of the multiprocessing package that uses dill so that it can pickle any Python object. I tested dill on the object that I could not pickle and it worked, so I thought my problem would be solved. However, when I use pathos' pool.map I have the same problem as before, namely that the function completes but the result is never returned. I confirmed this by calling results = pool.amap(myfunc, myarglist), which completes, followed by results.get(), which does not.

Unfortunately, I cannot post the HTML for the page (it is not publicly available), and I have been unable to produce a reproducible example of the problem. This answer includes a function for troubleshooting multiprocessing of large objects, but unfortunately it uses Queue, which does not seem to be implemented for pathos by itself (presumably only under the hood within the pool.map function).

I am using the 0.2a1.dev version of pathos (with dependencies installed with pip prior to compiling from source) on Python 2.7. Here is the traceback for the keyboard interrupt:
Process PoolWorker-2:
Process PoolWorker-7:
Traceback (most recent call last):
Process PoolWorker-8:Process PoolWorker-6:Process PoolWorker-3:Process PoolWorker-5:Process PoolWorker-4:Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/process.py", line 227, in _bootstrap
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/process.py", line 227, in _bootstrap
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/process.py", line 227, in _bootstrap
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/process.py", line 227, in _bootstrap
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/process.py", line 227, in _bootstrap
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/process.py", line 227, in _bootstrap
self.run()
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/process.py", line 85, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/pool.py", line 59, in worker
self.run()
self.run()
self.run()
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/process.py", line 85, in run
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/process.py", line 85, in run
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/process.py", line 85, in run
self._target(*self._args, **self._kwargs)
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/pool.py", line 54, in worker
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/pool.py", line 54, in worker
self._target(*self._args, **self._kwargs)
self.run()
self.run()
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/pool.py", line 54, in worker
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/process.py", line 85, in run
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/process.py", line 85, in run
put((job, i, result))
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/queue.py", line 339, in put
self._target(*self._args, **self._kwargs)
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/pool.py", line 54, in worker
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/pool.py", line 54, in worker
for job, i, func, args, kwds in iter(inqueue.get, None):
for job, i, func, args, kwds in iter(inqueue.get, None):
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/queue.py", line 325, in get
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/queue.py", line 325, in get
wacquire()
KeyboardInterrupt
for job, i, func, args, kwds in iter(inqueue.get, None):
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/queue.py", line 325, in get
racquire()
racquire()
for job, i, func, args, kwds in iter(inqueue.get, None):
for job, i, func, args, kwds in iter(inqueue.get, None):
KeyboardInterrupt
KeyboardInterrupt
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/queue.py", line 325, in get
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/queue.py", line 325, in get
racquire()
KeyboardInterrupt
racquire()
racquire()
KeyboardInterrupt
KeyboardInterrupt
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/process.py", line 227, in _bootstrap
self.run()
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/process.py", line 85, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/pool.py", line 54, in worker
for job, i, func, args, kwds in iter(inqueue.get, None):
File "/usr/local/lib/python2.7/dist-packages/processing-0.52_pathos-py2.7-linux-x86_64.egg/processing/queue.py", line 327, in get
return recv()
KeyboardInterrupt
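One way to turn the hang described above into an error is to bound the wait on the async result. This is only a minimal sketch using the standard library's multiprocessing rather than pathos. It assumes the object returned by pool.amap accepts get(timeout=...) in the same way, which is not confirmed here. slow_task and map_or_raise are hypothetical names for illustration (Python 3).

```python
import multiprocessing
import time


def slow_task(seconds):
    """Hypothetical stand-in for a worker whose result arrives late."""
    time.sleep(seconds)
    return seconds


def map_or_raise(pool, func, args, timeout):
    """Map func over args on pool, raising instead of blocking forever.

    `pool` is any object with the multiprocessing.Pool interface
    (map_async and terminate).
    """
    async_result = pool.map_async(func, args)
    try:
        # get(timeout=...) raises multiprocessing.TimeoutError when the
        # results have not arrived within `timeout` seconds.
        return async_result.get(timeout=timeout)
    except multiprocessing.TimeoutError:
        # Stop the workers so a stuck one cannot keep the script alive,
        # then let the caller see the error.
        pool.terminate()
        raise


if __name__ == "__main__":
    pool = multiprocessing.Pool(2)
    print(map_or_raise(pool, slow_task, [0.1, 0.2], timeout=10))
    try:
        map_or_raise(pool, slow_task, [5], timeout=0.5)
    except multiprocessing.TimeoutError:
        print("no result in time; pool terminated")
    pool.join()
```

Terminating the pool discards any unfinished work, so this only makes the failure visible. It does not explain why the result never arrives.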
… pathos.multiprocessing.Pool or ProcessingPool? Pool uses dill instead of pickle, but doesn't have the rest of the augments that ProcessingPool has. If your function is calling Queue as you have indicated elsewhere, you may be out of luck. You could possibly use shared memory in multiprocessing with ctypes. I don't know, hard to say without seeing your code. There is an option in dill that provides compression, but it's turned off at the moment... – Nilgai

… pathos.multiprocessing.ProcessingPool, which is used in the pathos documentation. Does that not use dill? At any rate I just tried pathos.multiprocessing.Pool and got the same result. – Raynold

… dill, both do. You are using it as intended, it seems. Sorry for my confusion. Looks like the size of the pickle causes an issue, as your trace says. dill and pathos have some compression options that I could try, given an example. There's also shared memory as I mentioned. – Nilgai

… pathos multiprocessing package to pickle large objects differently? The biggest issue for me is that because the script simply hangs I cannot figure out how to catch this as an error so my script crashes. – Raynold

… dill.dumps(myobject) it hangs. – Raynold

… dill, but it is exposed in another package. It's hard to tell if it's compression or size, or what, without seeing a sample. Would it be possible to post or send the code? – Nilgai

… dill just hang on a dump. There are several methods to try in dill.detect that give you information on what is happening. If you still can't post a reduced example of your code for whatever reason, you could at least try some of the dill.detect methods, and maybe find some clue to what the error is. – Nilgai

… dill or pathos.multiprocessing … cPickle would simply work. Sorry I couldn't be of more help. – Nilgai
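
The dill.detect suggestion can be tried on a small stand-in before pointing it at the real object. This is a rough sketch, assuming dill is installed. Holder and probe are hypothetical names, and a generator attribute is used only because dill's documentation lists generators among the types it cannot pickle. It reports objects that fail to pickle; it will not by itself diagnose a dump that hangs.

```python
import dill
import dill.detect


class Holder(object):
    """Hypothetical stand-in for the object that will not serialize."""

    def __init__(self):
        self.name = "page"
        # Generators are among the types dill documents as unpicklable.
        self.rows = (n for n in range(3))


def probe(obj):
    """Return (picklable, names of attributes that fail to pickle)."""
    # safe=True makes dill treat any exception as "does not pickle".
    ok = dill.pickles(obj, safe=True)
    # With depth=1, badobjects checks each attribute listed by dir(obj)
    # and returns a dict keyed by the names that fail.
    bad = {} if ok else dill.detect.badobjects(obj, depth=1, safe=True)
    return ok, sorted(bad)


if __name__ == "__main__":
    ok, bad_attrs = probe(Holder())
    print("picklable:", ok)
    print("attributes that fail to pickle:", bad_attrs)
    # dill.detect.trace(True) additionally logs each object dill visits
    # during a dump, which can show where a slow dump spends its time.
```

The list will usually include more than the obvious attribute, since anything that drags the instance along with it (its __dict__, bound methods) fails for the same reason.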