I'm working on some python code that extracts some image data from an ECW file using GDAL (http://www.gdal.org/) and its python bindings. GDAL was built from source to have ECW support.
The program is run on a cluster server that I ssh into. I have tested the program through the ssh terminal and it runs fine. However, I would now like to submit a job to the cluster using qsub, but it reports the following:
Traceback (most recent call last):
File "./gdal-test.py", line 5, in <module>
from osgeo import gdal
File "/home/h3/ctargett/.local/lib/python2.6/site-packages/GDAL-1.11.1-py2.6-linux-x86_64.egg/osgeo/__init__.py", line 21, in <module>
_gdal = swig_import_helper()
File "/home/h3/ctargett/.local/lib/python2.6/site-packages/GDAL-1.11.1-py2.6-linux-x86_64.egg/osgeo/__init__.py", line 17, in swig_import_helper
_mod = imp.load_module('_gdal', fp, pathname, description)
ImportError: /mnt/aeropix/prgs/.local/lib/libgdal.so.1: undefined symbol: H5Eset_auto2
I did a bit more digging and tried using LD_DEBUG=symbols
to try and work out where the difference was, but that's about as far as my knowledge/understanding has got me.
For reference, here's what happens with LD_DEBUG=symbols
and running the code in the ssh terminal (piping through grep H5Eset_auto2
to reduce some of the output):
Symbol debug output for code running in ssh terminal:
11359: symbol=H5Eset_auto2; lookup in file=/usr/bin/python26 [0]
11359: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libpython2.6.so.1.0 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libpthread.so.0 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libdl.so.2 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libutil.so.1 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libm.so.6 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libc.so.6 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
11359: symbol=H5Eset_auto2; lookup in file=/home/h3/ctargett/.local/lib/python2.6/site-packages/GDAL-1.11.1-py2.6-linux-x86_64.egg/osgeo/_gdal.so [0]
11359: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libpython2.6.so.1.0 [0]
11359: symbol=H5Eset_auto2; lookup in file=/mnt/aeropix/prgs/.local/lib/libgdal.so.1 [0]
11359: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libstdc++.so.6 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libm.so.6 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libgcc_s.so.1 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libpthread.so.0 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libc.so.6 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libdl.so.2 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libutil.so.1 [0]
11359: symbol=H5Eset_auto2; lookup in file=/mnt/aeropix/prgs/.local/lib/libhdf5.so.7 [0]
11359: symbol=H5Eset_auto2; lookup in file=/usr/bin/python26 [0]
11359: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libpython2.6.so.1.0 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libpthread.so.0 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libdl.so.2 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libutil.so.1 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libm.so.6 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libc.so.6 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
11359: symbol=H5Eset_auto2; lookup in file=/home/h3/ctargett/.local/lib/python2.6/site-packages/GDAL-1.11.1-py2.6-linux-x86_64.egg/osgeo/_gdal.so [0]
11359: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libpython2.6.so.1.0 [0]
11359: symbol=H5Eset_auto2; lookup in file=/mnt/aeropix/prgs/.local/lib/libgdal.so.1 [0]
11359: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libstdc++.so.6 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libm.so.6 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libgcc_s.so.1 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libpthread.so.0 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libc.so.6 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libdl.so.2 [0]
11359: symbol=H5Eset_auto2; lookup in file=/lib64/libutil.so.1 [0]
11359: symbol=H5Eset_auto2; lookup in file=/mnt/aeropix/prgs/.local/lib/libhdf5.so.7 [0]
Symbol debug output for code submitted using qsub:
16915: symbol=H5Eset_auto2; lookup in file=/usr/bin/python26 [0]
16915: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libpython2.6.so.1.0 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libpthread.so.0 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libdl.so.2 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libutil.so.1 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libm.so.6 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libc.so.6 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
16915: symbol=H5Eset_auto2; lookup in file=/home/h3/ctargett/.local/lib/python2.6/site-packages/GDAL-1.11.1-py2.6-linux-x86_64.egg/osgeo/_gdal.so [0]
16915: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libpython2.6.so.1.0 [0]
16915: symbol=H5Eset_auto2; lookup in file=/mnt/aeropix/prgs/.local/lib/libgdal.so.1 [0]
16915: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libstdc++.so.6 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libm.so.6 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libgcc_s.so.1 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libpthread.so.0 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libc.so.6 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libdl.so.2 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libutil.so.1 [0]
16915: symbol=H5Eset_auto2; lookup in file=/mnt/aeropix/prgs/.local/lib/libhdf5.so.7 [0]
16915: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libjpeg.so.62 [0]
16915: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libpng12.so.0 [0]
16915: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libpq.so.4 [0]
16915: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libcurl.so.3 [0]
16915: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libgssapi_krb5.so.2 [0]
16915: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libkrb5.so.3 [0]
16915: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libk5crypto.so.3 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libcom_err.so.2 [0]
16915: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libidn.so.11 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libssl.so.6 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libcrypto.so.6 [0]
16915: symbol=H5Eset_auto2; lookup in file=/mnt/aeropix/prgs/.local/lib/libNCSEcw.so.0 [0]
16915: symbol=H5Eset_auto2; lookup in file=/mnt/aeropix/prgs/.local/lib/libNCSEcwC.so.0 [0]
16915: symbol=H5Eset_auto2; lookup in file=/mnt/aeropix/prgs/.local/lib/libNCSCnet.so.0 [0]
16915: symbol=H5Eset_auto2; lookup in file=/mnt/aeropix/prgs/.local/lib/libNCSUtil.so.0 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/librt.so.1 [0]
16915: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libxml2.so.2 [0]
16915: symbol=H5Eset_auto2; lookup in file=/mnt/aeropix/prgs/.local/lib/libz.so.1 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libcrypt.so.1 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libresolv.so.2 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libnsl.so.1 [0]
16915: symbol=H5Eset_auto2; lookup in file=/usr/lib64/libkrb5support.so.0 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libkeyutils.so.1 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libselinux.so.1 [0]
16915: symbol=H5Eset_auto2; lookup in file=/lib64/libsepol.so.1 [0]
16915: /mnt/aeropix/prgs/.local/lib/libgdal.so.1: error: symbol lookup error: undefined symbol: H5Eset_auto2 (fatal)
ImportError: /mnt/aeropix/prgs/.local/lib/libgdal.so.1: undefined symbol: H5Eset_auto2
I guess I'm not sure why it seems to stop looking in libgdal.so.1 when submitted using qsub, when it continues to look when just run in the terminal. I also note that the qsub job is able to correctly locate libhdf5.so.7
(which is where it should find H5Eset_auto2
) as it can find a different symbol, H5Eprint
:
16915: symbol=H5Eprint; lookup in file=/usr/lib64/libpython2.6.so.1.0 [0]
16915: symbol=H5Eprint; lookup in file=/mnt/aeropix/prgs/.local/lib/libgdal.so.1 [0]
16915: symbol=H5Eprint; lookup in file=/usr/lib64/libstdc++.so.6 [0]
16915: symbol=H5Eprint; lookup in file=/lib64/libm.so.6 [0]
16915: symbol=H5Eprint; lookup in file=/lib64/libgcc_s.so.1 [0]
16915: symbol=H5Eprint; lookup in file=/lib64/libpthread.so.0 [0]
16915: symbol=H5Eprint; lookup in file=/lib64/libc.so.6 [0]
16915: symbol=H5Eprint; lookup in file=/lib64/libdl.so.2 [0]
16915: symbol=H5Eprint; lookup in file=/lib64/libutil.so.1 [0]
16915: symbol=H5Eprint; lookup in file=/mnt/aeropix/prgs/.local/lib/libhdf5.so.7 [0]
Any pointers on this would be incredibly useful at this stage (I hope that's enough information - I'm more than happy to provide more information, I'm just not sure what else might be useful at this stage).
EDIT:
It seems that the contents of /usr/bin
are different for jobs submitted using qsub
(specifically libtool
is missing). This is being investigated.
LD_LIBRARY_PATH
when you're running interactively and when it's running in the queue. Look for any discrepancies related to HDF5. – Spillerlibhdf5.so.7
as it correctly locates symbolH5Eprint
before getting stuck onH5Eset_auto2
As a minimum working example, the single linefrom osgeo import gdal
works when I run it through the ssh terminal, but fails when submitted using qsub. Also, I have setexport LD_LIBRARY_PATH=/mnt/aeropix/prgs/.local/lib
(which is wherelibhdf5.so.7
is) – Godoyenv >$HOME/myenvs
), then reproduce that same environment (env $(xargs <$HOME/myenvs) bash
) and see if it fails in the exact same way. If it does, examine and look for any suspicious discrepancies from your interactive environment. – Spillerlibhdf5.so.7
but then continues onward? – Godoyf, p, d = imp.find_module('_gdal', ['/home/h3/ctargett/.local/lib/python2.6/site-packages/GDAL-1.11.1-py2.6-linux-x86_64.egg/osgeo/']); print(f, p, d); imp.load_module('_gdal', f, p, d)
(Replace the semicolons with newlines if you wish.) The code essentially tries to replicate what is done by the GDAL bindings, and hopefully will help narrow down the problem. – Spillerimport imp;
) – SpillerLD_DEBUG
) Does loadinggdal
work if you execute it non-interactively? (E.g.ssh user@cluster 'python -c "import osgeo.gdal"'
) – Spillerqsub -I
), and that results in the same error (undefined symbol: H5Eset_auto2
) as submitting the job. – Godoyssh user@cluster 'export LD_LIBRARY_PATH=/mnt/aeropix/prgs/.local/lib ; python26 -c "from osgeo import gdal"'
works fine though – GodoyLD_LIBRARY_PATH
variable export in your.bashrc
and see if it affects the job? On another note, would you be able to paste the (full, ungrepped) logs from the dynamic loader? (useLD_DEBUG=all
) – SpillerLD_LIBRARY_PATH
in my.bashrc
but there was no change. Here are the links to the full debug logs (they're quite large). ssh output: drive.google.com/file/d/0By6cKos4YGCGYjl3Z2twaGVTYVE/… qsub output: drive.google.com/file/d/0By6cKos4YGCGT2t5ZGpycGYzaU0/… – Godoylibtool
is of the same version? – Spillernm -D /mnt/aeropix/prgs/.local/lib/libhdf5.so.7 | grep H5Eset_auto2
– Spillernm -D
using ssh gives0000003126a915b0 T H5Eset_auto2
, and in the queue gives nothing at all. – Godoylibtool --version
in the ssh terminal I get that it's version1.5.22 (1.1220.2.365)
, while in the queue I getlibtool: command not found
. I tried doingwhereis libtool
and got/usr/bin/libtool
for the ssh terminal, and nothing found when in the queue. Doingls /usr/bin
I find that they appear to be different for the ssh terminal and the queue - I am investigating this further. – Godoylibhdf5.so.7
somewhere else and see if the queue sees the same file? (If it does, try putting that new location inLD_LIBRARY_PATH
instead.) – Spillerlibhdf5.so.7
was symlinked to a location that was apparently different in the queue. I modified the symlink and now everything seems to be working correctly. Would you like to answer the question? You did most of the work so I'd say you earned it. – Godoy/usr/bin
was different for the queue. For further reference, after being symlinked all over the place,libhdf5.so.7
ended up pointing to/usr/lib64/libhdf5.so.7.0.1
. Thanks for your help, you're a champ. – Godoy