Why is my Linux application pulling in the wrong .so library?

I'm building an application that uses the NetCDF C++ library, and NetCDF in turn pulls in the HDF-4 library. However, it's pulling in the wrong HDF-4 library.

Here's how my app is linked:

/apps1/intel/bin/icpc -gxx-name=/apps1/gcc-4.5.0/bin/g++ -shared -o lib/libMyCustom.so
  -Llib  -L/apps1/boost-1.48.0/lib -Wl,-rpath=/apps1/boost-1.48.0/lib
  -L/apps1/gdal-1.8.0-jasper/lib -Wl,-rpath=/apps1/gdal-1.8.0-jasper/lib
  -L/new_apps1/hdf4/lib -Wl,-rpath=/new_apps1/hdf4/lib -L/new_apps1/netcdf/lib
  -Wl,-rpath=/new_apps1/netcdf/lib -lboost_system -lboost_serialization
  -lboost_date_time -lboost_thread -lgdal -ldf -lmfhdf -lnetcdf_c++ 
  MyProj/obj/ProjUtility.o  MyProj/obj/ProjMetadataException.o
  MyProj/obj/ProjTimestampUtil.o 

I have kept my LD_LIBRARY_PATH very short:

LD_LIBRARY_PATH=/new_apps1/hdf4/lib:/new_apps1/hdf5/lib:
  /apps1/intel/composerxe/lib/intel64:/apps1/gcc-4.5.0/lib64:/apps1/gcc-4.5.0/lib

And here is an excerpt from the ldd -v output:

    libdf.so.0 => /new_apps1/hdf4/lib/libdf.so.0 (0x00002af5baabc000)
    libmfhdf.so.0 => /new_apps1/hdf4/lib/libmfhdf.so.0 (0x00002af5bad61000)
    libnetcdf_c++.so.5 => /new_apps1/netcdf/lib/libnetcdf_c++.so.5 (0x00002af5baf85000)
    libhdf5.so.6 => /new_apps1/hdf5/lib/libhdf5.so.6 (0x00002af5bd1e7000)
    libgif.so.4 => /usr/lib64/libgif.so.4 (0x0000003a6bc00000)
    libpng12.so.0 => /usr/lib64/libpng12.so.0 (0x0000003a71000000)
    libnetcdf.so.6 => /new_apps1/netcdf/lib/libnetcdf.so.6 (0x00002af5bd682000)
    libhdf5_hl.so.6 => /new_apps1/hdf5/lib/libhdf5_hl.so.6 (0x00002af5be272000)

    /new_apps1/hdf4/lib/libdf.so.0:
            libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
            libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
    /new_apps1/hdf4/lib/libmfhdf.so.0:
            libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
    /new_apps1/netcdf/lib/libnetcdf_c++.so.5:
            libgcc_s.so.1 (GCC_3.0) => /apps1/gcc-4.5.0/lib64/libgcc_s.so.1
            libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
            libstdc++.so.6 (CXXABI_1.3) => /apps1/gcc-4.5.0/lib64/libstdc++.so.6
            libstdc++.so.6 (GLIBCXX_3.4) => /apps1/gcc-4.5.0/lib64/libstdc++.so.6
    /new_apps1/hdf5/lib/libhdf5.so.6:
            libm.so.6 (GLIBC_2.2.5) => /lib64/libm.so.6
            libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
            libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
    /new_apps1/netcdf/lib/libnetcdf.so.6:
            libm.so.6 (GLIBC_2.2.5) => /lib64/libm.so.6
            libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
            libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
    /new_apps1/hdf5/lib/libhdf5_hl.so.6:
            libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6

So far, everything in LD_LIBRARY_PATH, the rpath settings, and the ldd output indicates that the app points to the HDF library I want (/new_apps1/hdf4/lib/libmfhdf.so.0). But when I run it, Valgrind tells me it's dying in the OLD HDF-4 library (which is probably why it's segfaulting) instead of the HDF-4 library I'm trying to link against:

 Invalid read of size 4
    at 0x67CF765: NC_var_shape (in /apps1/hdf-4.2.6/lib/libmfhdf.so.0.0.0)
    by 0x91327CA: nc_get_NC (v1hpg.c:1113)
    by 0x91303C0: l3nc__open_mp (nc.c:1096)
    by 0x915B279: nc3d__open_mp (dapdispatch3.c:336)
    by 0x914A752: nc3d_open (ncdap3.c:94)
    by 0x911F8A2: l4nc_open_file (nc4file.c:2338)
    by 0x916A290: nc4d_open_file (ncdap4.c:122)
    by 0x911CDDF: nc__open (nc4file.c:2407)
    by 0x69E85F8: NcFile::NcFile(char const*, NcFile::FileMode, unsigned long*, unsigned long, NcFile::FileFormat) (netcdf.cpp:384)
    by 0x710F0B8: getData(std::string const&) (ProjTimestampUtil.cc:593)
    by 0x70E9BEA: (anonymous namespace)::parseOptions(int, char**) (ProjUtility.cc:190)
    by 0x70EAAFB: main(int, char**) (ProjUtility.cc:243)
  Address 0x1051 is not stack'd, malloc'd or (recently) free'd


 Process terminating with default action of signal 11 (SIGSEGV)
  Access not within mapped region at address 0x1051
    at 0x67CF765: NC_var_shape (in /apps1/hdf-4.2.6/lib/libmfhdf.so.0.0.0)
    by 0x91327CA: nc_get_NC (v1hpg.c:1113)
    by 0x91303C0: l3nc__open_mp (nc.c:1096)
    by 0x915B279: nc3d__open_mp (dapdispatch3.c:336)
    by 0x914A752: nc3d_open (ncdap3.c:94)
    by 0x911F8A2: l4nc_open_file (nc4file.c:2338)
    by 0x916A290: nc4d_open_file (ncdap4.c:122)
    by 0x911CDDF: nc__open (nc4file.c:2407)
    by 0x69E85F8: NcFile::NcFile(char const*, NcFile::FileMode, unsigned long*, unsigned long, NcFile::FileFormat) (netcdf.cpp:384)
    by 0x710F0B8: getData(std::string const&) (ProjTimestampUtil.cc:593)
    by 0x70E9BEA: (anonymous namespace)::parseOptions(int, char**) (ProjUtility.cc:190)
    by 0x70EAAFB: main(int, char**) (ProjUtility.cc:243)

Where else is my app getting path info when dynamically pulling in other libraries?

Lilalilac answered 2/3, 2012 at 19:57 Comment(0)

I'm not exactly sure of all the details of how -rpath and LD_LIBRARY_PATH work, and their precedence, but I did find some useful environment variables:

  • LD_DEBUG=all - This env variable turns on verbose dynamic linker debugging. With it set, running ldd on your app prints the details of how each dependency is located, including how those dependencies find their own dependencies.
  • LD_DEBUG_OUTPUT=<filename_prefix> - Used in conjunction with LD_DEBUG to specify output files to log the debugging info to.
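As a rough sketch of how the two variables combine, the snippet below traces library resolution for a single command and writes the trace to a log file. It assumes a glibc-based system; `ls` is only a stand-in for your own binary, and the `libs` category (a narrower alternative to `all`) and the `ld_debug` file prefix are choices made for this illustration.

```shell
#!/bin/sh
# Assumption: glibc's dynamic linker; ls stands in for your own application.
target="$(command -v ls)"
logdir="$(mktemp -d)"

# Set the variables for this one command only, so other programs stay quiet.
LD_DEBUG=libs LD_DEBUG_OUTPUT="$logdir/ld_debug" "$target" > /dev/null

# glibc appends the process ID to the prefix (ld_debug.<pid>).
# Show which libraries were requested and which paths and files were tried.
grep -h -E 'find library=|search path=|trying file=' "$logdir"/ld_debug.* | head -n 20
```

Because the variables are given as a prefix to the command rather than exported, they apply to that one process and do not affect later commands in the same shell.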

The LD_DEBUG env variable helped me track down that /apps1/gdal-1.8.0-jasper/lib/libgdal.so.1 was compiled with an -rpath option that was pulling the old (wrong) versions of my libraries. It gave this helpful debug output:

search path=/pathXYZ/lib/tls/x86_64:/pathXYZ/lib/tls:/pathXYZ/lib/x86_64:
  /pathABC/jasper/lib:/pathABC/hdf5/lib/tls/x86_64:/pathABC/hdf5/lib/tls:
  /pathABC/hdf5/lib/x86_64:/pathABC/hdf5/lib:/pathABC/netcdf/lib/tls/x86_64:
  /pathABC/netcdf/lib/tls:/pathABC/netcdf/lib/x86_64:/pathABC/netcdf/lib

          (RPATH from file /apps1/gdal-1.8.0-jasper/lib/libgdal.so.1)
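An embedded search path like this can also be checked without running anything, by reading the dynamic section of the library with readelf. The sketch below is one way to do that; `show_rpath` is a hypothetical helper name, and the fallback to `ls` is only there so the script runs when no file is given. As documented in the ld.so man page, a DT_RPATH entry is searched before LD_LIBRARY_PATH (when no DT_RUNPATH is present), while DT_RUNPATH is searched after it, which would explain an rpath overriding the environment.

```shell
#!/bin/sh
# show_rpath is a hypothetical helper: print the RPATH / RUNPATH entries
# recorded in the dynamic section of an ELF file, if there are any.
show_rpath() {
    readelf -d "$1" | grep -E '\((RPATH|RUNPATH)\)'
}

# Pass the library to inspect as the first argument; ls is only a fallback.
lib="${1:-$(command -v ls)}"
show_rpath "$lib" || echo "no RPATH/RUNPATH entry found in $lib"
```

For the case described here, the argument would be /apps1/gdal-1.8.0-jasper/lib/libgdal.so.1.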

So the rpath baked into the GDAL library when it was built seemed to be making an end-run around my LD_LIBRARY_PATH. Until I can get my lab team to rebuild libgdal correctly, I found this env var, which helped me load the "right" library versions that I wanted:

  • LD_PRELOAD=<path/to/libName.so> - Point this to the location of a library (or a space-separated list of libraries) that should be loaded before all others. See the ld.so man page.
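A minimal sketch of the workaround, using the HDF-4 library paths from the ldd output in the question. The application name `my_app` is hypothetical, and the run line is left commented out because those paths exist only on the original system.

```shell
#!/bin/sh
# Library paths are taken from the ldd output in the question.
HDF4_LIB=/new_apps1/hdf4/lib

# Space-separated list of libraries to load before all others.
preload="$HDF4_LIB/libdf.so.0 $HDF4_LIB/libmfhdf.so.0"
echo "LD_PRELOAD=$preload"

# Apply it to a single command rather than exporting it for the whole session
# (my_app is a hypothetical name for the application):
# LD_PRELOAD="$preload" ./my_app
```

Scoping the variable to one command keeps the preloaded libraries out of every other program started from the same shell.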
Lilalilac answered 2/3, 2012 at 22:33 Comment(3)
This is the most helpful solution I've seen for any problem of this sort, and I ended up having a very similar problem with the RPATH issue. Try readelf on your binaries to see if there is an RPATH defined and if it is incorrect. If it is incorrect, make sure your .profile/.bashrc aren't setting your PATH or LD_LIBRARY_PATH to something you don't expect. - Gove
Great solution. A quick addendum, since this is an old post and the information might help people: if you don't know how to export the env var, you can set it for a single ldd run with the following command, which should get you the info you need: LD_DEBUG=all ldd /path/to/binary. Also en.wikibooks.org/wiki/Linux_Applications_Debugging_Techniques/… - Tetrasyllable
Yes, this was very helpful; it's basically a debug mode for dynamic linking. I notice that ALL executables now output this info, even ls and similar. It showed me that my bins are getting linked to other .so files even though the compile used the local .so file in the same directory that I want. - Lenard
