Conda packaging and the C library as a dependency

Conda does a nice job of keeping track of a package's dependencies, but apparently most packages do not declare the C library as a traceable dependency. For example, let's install Gnuastro with this command:

conda install -c conda-forge gnuastro

Then, I look into the libraries that one of Gnuastro's programs links with (for example astnoisechisel):

$ ldd $(which astnoisechisel)
    linux-vdso.so.1 (0x00007ffdbd336000)
    libgnuastro.so.9 => /path/to/conda/install/envs/testenv/bin/../lib/libgnuastro.so.9 (0x00007fe039ce1000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fe039b86000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe0399c6000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe0399a5000)
    libgit2.so.28 => /path/to/conda/install/envs/testenv/bin/../lib/./libgit2.so.28 (0x00007fe039882000)
    libtiff.so.5 => /path/to/conda/install/envs/testenv/bin/../lib/./libtiff.so.5 (0x00007fe039800000)
    liblzma.so.5 => /path/to/conda/install/envs/testenv/bin/../lib/./liblzma.so.5 (0x00007fe0397d7000)
    libjpeg.so.9 => /path/to/conda/install/envs/testenv/bin/../lib/./libjpeg.so.9 (0x00007fe039799000)
    libwcs.so.5 => /path/to/conda/install/envs/testenv/bin/../lib/./libwcs.so.5 (0x00007fe03963e000)
    libcfitsio.so.8 => /path/to/conda/install/envs/testenv/bin/../lib/./libcfitsio.so.8 (0x00007fe039311000)
    libcurl.so.4 => /path/to/conda/install/envs/testenv/bin/../lib/./libcurl.so.4 (0x00007fe03928b000)
    libz.so.1 => /path/to/conda/install/envs/testenv/bin/../lib/./libz.so.1 (0x00007fe03926f000)
    libgsl.so.23 => /path/to/conda/install/envs/testenv/bin/../lib/./libgsl.so.23 (0x00007fe038fc6000)
    libcblas.so.3 => /path/to/conda/install/envs/testenv/bin/../lib/./libcblas.so.3 (0x00007fe03740d000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fe03a049000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fe037402000)
    libssl.so.1.1 => /path/to/conda/install/envs/testenv/bin/../lib/././libssl.so.1.1 (0x00007fe037372000)
    libcrypto.so.1.1 => /path/to/conda/install/envs/testenv/bin/../lib/././libcrypto.so.1.1 (0x00007fe0370c4000)
    libssh2.so.1 => /path/to/conda/install/envs/testenv/bin/../lib/././libssh2.so.1 (0x00007fe03708f000)
    libzstd.so.1 => /path/to/conda/install/envs/testenv/bin/../lib/././libzstd.so.1 (0x00007fe036fd3000)
    libbz2.so.1.0 => /path/to/conda/install/envs/testenv/bin/../lib/././libbz2.so.1.0 (0x00007fe036fbf000)
    libgssapi_krb5.so.2 => /path/to/conda/install/envs/testenv/bin/../lib/././libgssapi_krb5.so.2 (0x00007fe036f70000)
    libkrb5.so.3 => /path/to/conda/install/envs/testenv/bin/../lib/././libkrb5.so.3 (0x00007fe036e99000)
    libk5crypto.so.3 => /path/to/conda/install/envs/testenv/bin/../lib/././libk5crypto.so.3 (0x00007fe036e78000)
    libcom_err.so.3 => /path/to/conda/install/envs/testenv/bin/../lib/././libcom_err.so.3 (0x00007fe036e72000)
    libgfortran.so.4 => /path/to/conda/install/envs/testenv/bin/../lib/././libgfortran.so.4 (0x00007fe036d44000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fe036d3f000)
    libkrb5support.so.0 => /path/to/conda/install/envs/testenv/bin/../lib/./././libkrb5support.so.0 (0x00007fe036d31000)
    libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007fe036d17000)
    libquadmath.so.0 => /path/to/conda/install/envs/testenv/bin/../lib/././libquadmath.so.0 (0x00007fe036cdd000)
    libgcc_s.so.1 => /path/to/conda/install/envs/testenv/bin/../lib/././libgcc_s.so.1 (0x00007fe036cc9000)

All the higher-level libraries used are within the Conda environment, except for the components of the C library: libm.so.6, libc.so.6, libpthread.so.0, librt.so.1, libdl.so.2, libresolv.so.2 and of course ld-linux-x86-64.so.2.
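
To avoid reading the list by eye, the same check can be sketched as a small shell function. This is only a minimal sketch: the function name `libs_outside_prefix` and its prefix argument are made up for illustration, and it assumes the `ldd` output format shown above.

```shell
# Print the libraries in ldd output (read from stdin) that resolve to a
# path outside the given prefix. The function name and its argument are
# hypothetical, not part of Conda or ldd.
libs_outside_prefix() {
    prefix="$1"
    awk -v prefix="$prefix" '
        # Lines of the form "name => /path (address)".
        $2 == "=>" && index($3, prefix) != 1 { print $1 }
        # Lines of the form "/path (address)", such as the dynamic loader.
        $1 ~ /^\// && index($1, prefix) != 1 { print $1 }
    '
}
```

With an activated environment, it could be called as `ldd "$(which astnoisechisel)" | libs_outside_prefix "$CONDA_PREFIX"`. A library reported by `ldd` as "not found" would also be printed, since it does not resolve inside the prefix.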

I couldn't find the GNU C Library on conda-forge, but when I searched I found some other channels that have it. So, for example, I tried:

conda install -c neok.m4700 glibc

This installed GNU C Library 2.30 (Conda tarball created 3 months ago) and the ldd command above gave me a beautiful list with everything in the Conda installation. In one test Conda environment, the call to astnoisechisel --version gave a segmentation fault and in another it succeeded. Then I tried another Conda C library (in a clean environment):

conda install -c asmeurer glibc

This one is an older version of the C library (last updated 5 years ago: glibc 2.19). In this environment, my astnoisechisel --version command would only give a segmentation fault and crash.

In this conda-forge discussion, it is said that "glibc is something that is not good to ship, and if we can't use the system glibc, I'm worried about this package. It is strongly tied to kernel versions, and it's also a security risk to have old versions in use". So I guess it's not their policy to include the GNU C Library (at least on GNU/Linux systems).

I also see a similar issue with the "base" packages in Anaconda, for example when I check the linking flags of curl or zstd.

So my question is this: if the C library is not officially defined as a dependency (like all the other dependencies), how reliable are Conda packages (especially for older versions of software) in the not-too-distant future (like 5 years in the case above)?

On a similar note: Assume I need to manually fetch the proper C library that was used to build a Conda package (to be able to run the executable). Is the version of the C library used in the build of that package documented anywhere within the downloaded tarball?
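
One partial way to probe this from the binary itself, sketched here under stated assumptions: the dynamic symbol table records the versioned glibc symbols (GLIBC_x.y) that the executable needs. The highest such version is the minimum glibc required at run time, which is not necessarily the version that was installed on the build machine. The function name `max_glibc_symbol_version` is hypothetical, and `grep -o` and `sort -V` are assumed to be the GNU versions.

```shell
# Read text on stdin (for example the output of `objdump -T FILE`) and
# print the highest GLIBC_x.y symbol version that it mentions.
# The function name is hypothetical; `sort -V` is assumed to be GNU sort.
max_glibc_symbol_version() {
    grep -o 'GLIBC_[0-9][0-9.]*' | sed 's/^GLIBC_//' | sort -u -V | tail -n 1
}
```

It could be used as `objdump -T "$(which astnoisechisel)" | max_glibc_symbol_version`. Entries such as GLIBC_PRIVATE are skipped because the pattern requires a digit after the underscore.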

Hydrogen answered 19/12, 2019 at 17:50 Comment(4)
My understanding of this is that Conda packages should be built against the oldest reasonably available C library. I believe that the defaults channel uses a version of CentOS with a somewhat old glibc for their Linux packages. The advantage of this approach is that, in general, glibc is backwards compatible, so a package compiled against the older version will work on an OS with a newer glibc, but not the other way around. How this affects reproducibility is unanswered, AFAIK, but I don't think anyone has a good answer for this ¯\_(ツ)_/¯ - Sprightly
While libc is not shipped as a dependency, the package is still linked to it and it will find the system libc, instead of being tied to a shipped libc in a Conda environment. This is confirmed by the ldd output you have shown. @darthbith's comment is the proper answer to this question. - Flagship
I should also note that this is not unique to Conda packages; Python wheels suffer from the same problem. In fact, any binary dependency that depends on libc will have the same problem. The buck has to stop somewhere... This is one of the big advantages of source distributions: the user is able to compile against their local libc, so they know it will work on their computer. Of course, they have to have the compiler toolchain installed... - Sprightly
The packages built in the conda-forge ecosystem target a specific kernel version and glibc version using a sysroot on Linux. As to how old versions of packages will continue to work on modern systems 5 years down the line, give this a read: developers.redhat.com/blog/2019/08/01/… - Nettienetting
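
The compatibility rule described in these comments (a binary built against an older glibc runs on a newer one, but not the reverse) amounts to a version comparison. A minimal sketch, assuming GNU `sort -V` is available; the function name `glibc_at_least` is hypothetical:

```shell
# Succeed if the available glibc version ($2) is at least the required
# version ($1). The function name is hypothetical; `sort -V` is assumed
# to be GNU sort.
glibc_at_least() {
    required="$1"
    available="$2"
    # The lower of the two versions sorts first.
    lowest=$(printf '%s\n%s\n' "$required" "$available" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}
```

On a glibc-based system, the running version can be passed in with `glibc_at_least 2.17 "$(getconf GNU_LIBC_VERSION | awk '{print $2}')"`, where 2.17 is only an example value.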
