ipyparallel's LoadBalancedView bloats memory, how can I avoid that?
This issue may be related to https://github.com/ipython/ipyparallel/issues/207, which is not marked as solved yet.
I have opened an issue for it here: https://github.com/ipython/ipyparallel/issues/286
I want to execute multiple tasks in parallel using Python's ipyparallel in a Jupyter notebook, with 4 local engines started by executing ipcluster start in a local console. Besides a DirectView, one can use a LoadBalancedView to map a set of tasks, and that is what I use. Each task takes around 0.2 seconds (though this can vary), and each task is a MySQL query that loads data and processes it.

Working on ~45000 tasks works fine; however, memory usage grows very high. That is bad because I want to run an experiment on 660000 tasks, which I cannot run anymore: it exceeds my memory limit of 16 GB and swapping to the local drive starts. When using a DirectView, by contrast, memory growth stays relatively small and memory never fills up. But I need the LoadBalancedView.
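One mitigation I have been considering is to submit the work in batches and purge the cached results between batches, so that neither the client nor the hub accumulates every result. Below is an untested sketch; the batch size of 10000 is an arbitrary value I picked, and I am assuming Client.purge_results can be used this way:

```python
import ipyparallel as ipp

rc = ipp.Client()
lview = rc.load_balanced_view()

def sleep_here(i):
    import time          # imported on the engine
    time.sleep(0.2)
    return 42

BATCH = 10000            # arbitrary batch size, would need tuning
results = []
for start in range(0, 660000, BATCH):
    amr = lview.map_async(sleep_here, range(start, start + BATCH))
    results.extend(amr.get())        # block until this batch is done
    rc.purge_results(jobs='all')     # drop cached results on client and hub
```

I do not know yet whether this actually keeps the controller's memory flat, or only the client's.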
The same thing happens when running a minimal working example without the database query (see below).
I am not familiar with the internals of the ipyparallel library, but I have read that logs and caches in the ipcontroller may cause this. I am still not sure whether this is a bug or whether I can change settings to avoid the problem.
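If the Hub's task database is indeed the culprit, the ipyparallel docs mention that it can be disabled. A sketch of what I would put in ipcontroller_config.py (untested on my side):

```python
# ipcontroller_config.py
# By default the Hub stores every request and result in an in-memory
# DictDB, which grows with the number of submitted tasks. NoDB disables
# this, at the cost of not being able to fetch results from the Hub
# after the fact.
c.HubFactory.db_class = 'NoDB'
```

If I read the docs correctly, the same thing should be reachable from the command line via ipcontroller --nodb.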
Running the MWE
For a Python 3.5.3 environment running on Windows 10, I use the following (recent) packages:
- ipython 6.1.0
- ipython_genutils 6.1.0
- ipyparallel 6.0.2
- jupyter 1.0.0
- jupyter_client 4.4.0
- jupyter_console 5.0.0
- jupyter_core 4.2.0
I would like to get the following example working with a LoadBalancedView without immense memory growth (if that is possible at all):
- Start ipcluster start on a console.
- Run a Jupyter notebook with the following 3 cells:
1st cell:

```python
import ipyparallel as ipp

rc = ipp.Client()
lview = rc.load_balanced_view()
```

2nd cell:

```python
%%px --local
import time
```

3rd cell:

```python
def sleep_here(i):
    time.sleep(0.2)
    return 42

amr = lview.map_async(sleep_here, range(660000))
amr.wait_interactive()
```
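For reference, the DirectView variant I mentioned above, which does not show the same growth on my machine, only differs in how the view is created:

```python
dview = rc[:]            # DirectView over all 4 engines
amr = dview.map_async(sleep_here, range(660000))
amr.wait_interactive()
```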