This issue may be related to https://github.com/ipython/ipyparallel/issues/207, which is not marked as solved yet.
I have also opened an issue for it here: https://github.com/ipython/ipyparallel/issues/286
I want to execute multiple tasks in parallel using Python and ipyparallel in a Jupyter notebook, with 4 local engines started by executing `ipcluster start` in a local console.
Although one could also use `DirectView`, I use `LoadBalancedView` to map a set of tasks. Each task takes around 0.2 seconds (this can vary), and each task runs a MySQL query that loads some data and then processes it.
Working with ~45000 tasks works fine, but memory usage grows very high. This is a problem because I want to run another experiment with over 660000 tasks, which I can no longer run: it exhausts my 16 GB of memory and the system starts swapping to my local drive. With `DirectView`, however, memory grows comparatively little and never fills up. But I actually need `LoadBalancedView`.
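For reference, here is a minimal sketch of how the two kinds of view mentioned above are obtained from a client. The helper names `make_views`, `square`, and `main` are hypothetical and only for illustration; `main` assumes a cluster is already running and is not called automatically.

```python
def make_views(client):
    """Return (direct_view, load_balanced_view) for an ipyparallel-style client.

    Assumes `client` behaves like ipyparallel.Client: indexing with [:]
    gives a DirectView over all engines, and load_balanced_view() gives a
    LoadBalancedView whose tasks are assigned to engines by the scheduler.
    """
    direct_view = client[:]
    balanced_view = client.load_balanced_view()
    return direct_view, balanced_view


def square(x):
    """Trivial stand-in for a real task."""
    return x * x


def main():
    # Requires a running cluster, e.g. started with `ipcluster start`.
    import ipyparallel as ipp

    rc = ipp.Client()
    dview, lview = make_views(rc)
    # DirectView: the sequence is partitioned across the targeted engines.
    print(dview.map_sync(square, range(8)))
    # LoadBalancedView: tasks go through the task scheduler.
    print(lview.map_sync(square, range(8)))
```

Calling `main()` from a notebook cell should print the same list of squares twice; only the way the work is distributed differs.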
This happens even when running a minimal working example without any database query (see below).
I am not very familiar with the ipyparallel library, but I have read that the ipcontroller keeps logs and caches, which may be the cause. I am still not sure whether this is a bug or whether I can change some settings to avoid the problem.
Running an MWE
For my Python 3.5.3 environment running on Windows 10 I use the following (recent) packages:
- ipython 6.1.0
- ipython_genutils 6.1.0
- ipyparallel 6.0.2
- jupyter 1.0.0
- jupyter_client 4.4.0
- jupyter_console 5.0.0
- jupyter_core 4.2.0
I would like the following example to work with `LoadBalancedView` without the immense memory growth (if possible at all):
- Run `ipcluster start` in a console.
- Run a Jupyter notebook with the following three cells:
<1st cell>

    import ipyparallel as ipp

    rc = ipp.Client()
    lview = rc.load_balanced_view()

<2nd cell>

    %%px --local
    import time

<3rd cell>

    def sleep_here(i):
        time.sleep(0.2)
        return 42

    amr = lview.map_async(sleep_here, range(660000))
    amr.wait_interactive()
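One variation that could be tried is submitting the tasks in smaller batches, so that only a bounded number of tasks is outstanding at any time. The sketch below is an assumption and not a verified fix: `chunks`, `map_in_batches`, and `batch_size` are hypothetical names, and whether batching reduces the memory growth described here has not been confirmed. It only relies on the view having `map_async` and on the returned object having `get()`.

```python
def chunks(seq, size):
    """Yield consecutive slices of `seq`, each with at most `size` items."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def map_in_batches(view, func, tasks, batch_size=1000):
    """Map `func` over `tasks` one batch at a time and collect the results.

    `view` is assumed to behave like an ipyparallel view: map_async(func, seq)
    returns an object whose get() blocks until the batch has finished and
    returns its results in order.
    """
    results = []
    for batch in chunks(tasks, batch_size):
        amr = view.map_async(func, batch)
        results.extend(amr.get())  # wait for this batch before sending the next
    return results
```

With the cells above, this would be called as `map_in_batches(lview, sleep_here, range(660000))`. The trade-off is that engines may sit idle at the end of each batch while the slowest task finishes.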