How to run multiple julia functions from python multiprocessing pool using juliacall

I want to run Julia functions/scripts from within Python. I managed to call Julia scripts via the juliacall library. Now I want to parallelize this, so I created a Python multiprocessing Pool and call the Julia script from each worker. However, this fails with the following message:

python: /buildworker/worker/package_linux64/build/src/debuginfo.cpp:1634: void register_eh_frames(uint8_t*, size_t): Assertion `end_ip != 0' failed.

How can I further debug this? Here is my minimal working example:

import os
os.environ['PYTHON_JULIAPKG_EXE'] = "/home/user/.juliaup/bin/julia"
os.environ['PYTHON_JULIAPKG_OFFLINE'] = 'yes'
os.environ['PYTHON_JULIAPKG_PROJECT'] = '/home/user/.julia/environments/v1.6/'

from juliacall import Main as jl, convert as jlconvert

from multiprocessing import Pool
from tqdm import tqdm

import ipdb

def init_worker():
    import os
    os.environ['PYTHON_JULIAPKG_EXE'] = "/home/user/.juliaup/bin/julia"
    os.environ['PYTHON_JULIAPKG_OFFLINE'] = 'yes'
    os.environ['PYTHON_JULIAPKG_PROJECT'] = '/home/user/.julia/environments/v1.6/'

    from juliacall import Main as jl, convert as jlconvert

    print('in init_worker()...')
    jl.seval('using Pkg')
    jl.seval('Pkg.status()')
    print('...done')

def compute(jobid):
    print(f'in main({jobid})...')

    jl.seval('include("test_julia_simple.jl")')
    print('...done')
    return

def main():
    njobs = 10
    #start pool with init_worker() as initializer
    
    with Pool(2, initializer=init_worker) as p, tqdm(total=njobs) as pbar:
        res = []
        for jid in range(njobs):
            res.append(p.apply_async(compute, (jid,)))
        for r in res:
            r.get()
            pbar.update(1)


if __name__ == "__main__":
    main()

And the Julia script test_julia_simple.jl:

for i in 1:10
    println(i)
end
1+2

Additional info:

$ python --version
Python 3.9.7
$ pip freeze | grep julia
juliacall==0.9.10
juliapkg==0.1.9

$ julia --version
The latest version of Julia in the `1.6` channel is 1.6.7+0.x64.linux.gnu. You currently have `1.6.6+0~x64` installed. Run:                           

  juliaup update

to install Julia 1.6.7+0.x64.linux.gnu and update the `1.6` channel to that version.                                                                  
julia version 1.6.6


I am not sure whether this is related, but the error message in this issue is nearly identical: https://github.com/JuliaLang/julia/issues/44969


After a comment suggested it, I tried using a thread pool instead, but in that case Python fails with a segmentation fault:

import os
os.environ['PYTHON_JULIAPKG_EXE'] = "/home/user/.juliaup/bin/julia"
os.environ['PYTHON_JULIAPKG_OFFLINE'] = 'yes'
os.environ['PYTHON_JULIAPKG_PROJECT'] = '/home/user/.julia/environments/v1.6/'

from juliacall import Main as jl, convert as jlconvert

import concurrent.futures
from tqdm import tqdm

import ipdb

def compute(jobid):
    print(f'in main({jobid})...')

    print('in init_worker()...')
    jl.seval('using Pkg')
    jl.seval('Pkg.status()')
    print('...done')


    jl.seval('include("test_julia_simple.jl")')
    print('...done')
    return

def main():
    njobs = 10
    # submit all jobs to a thread pool
    
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:    
        with tqdm(total=njobs) as pbar:
            jobs = {executor.submit(compute, jid):jid for jid in range(njobs)}
            
            for future in concurrent.futures.as_completed(jobs):
                jid = jobs[future]
                try:
                    future.result()
                except Exception as exc:
                    print('%r generated an exception: %s' % (jid, exc))
                else:
                    # compute() returns None, so just report completion
                    print('%r finished' % jid)
                pbar.update(1)


if __name__ == "__main__":
    main()
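As a best-effort debugging aid for crashes like these, the standard-library `faulthandler` module can dump the Python traceback when the process receives a fatal signal such as SIGSEGV or SIGABRT. The sketch below writes the dump to one file per process; the helper name `enable_crash_log` and the file naming are hypothetical. Whether a dump actually appears may depend on how the embedded Julia runtime handles signals, so this is something to try, not a guaranteed diagnosis.

```python
import faulthandler
import os


def enable_crash_log(directory="."):
    """Dump Python tracebacks to a per-process file on fatal signals."""
    path = os.path.join(directory, f"crash_{os.getpid()}.log")
    log_file = open(path, "w")  # must stay open while faulthandler is enabled
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling `enable_crash_log()` at the top of the script (and in a pool initializer, for worker processes) should show which Python line was executing when the crash happened.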

Oria answered 14/2, 2023 at 17:18 Comment(3)
Do you get the same error if you switch to a multithreading pool instead? - Ampersand
@Ampersand With a thread pool I only get segmentation faults :/ - Oria
Unfortunately, as of today, Python-Julia integration segfaults from time to time. Consider parallelizing the task in Julia and calling it from Python via a web service API. As of today this is the only stable, production-ready approach, and it will work without any issues. - Threecolor
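A simpler variant of the same idea (keeping Julia out of the Python process) is to start one separate `julia` process per job. This is not the web service described in the comment above and it does not use juliacall at all; it is a minimal sketch, and the helper names `run_command` and `run_all` are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_command(cmd):
    """Run one external command and return its captured stdout."""
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout


def run_all(commands, max_workers=2):
    """Run commands concurrently; the threads only wait on child processes."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(run_command, commands))
```

Assuming `julia` is on the PATH, `run_all([["julia", "test_julia_simple.jl"]] * 10)` would run the script ten times, two at a time. Each job pays Julia's startup cost, so this fits best when individual jobs are long-running.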
