How to be able to "move" all necessary libraries that a script requires when moving to a new machine

Asked 15/9, 2016 at 13:39 Answered 25/9, 2016 at 15:56

linux shared-libraries cluster-computing static-libraries hpc

We work in scientific computing and regularly submit calculations to different computing clusters. To do so, we connect through a Linux shell and submit jobs through SGE, Slurm, etc. (it depends on the cluster). Our code consists of Python and bash scripts and several binaries. Some of them depend on external libraries such as matplotlib. Starting on a new cluster is a nightmare: we need to tell the admins all the libraries we need, and sometimes they cannot install all of them, or they only have old versions that cannot be upgraded. So we wonder what we could do here. I was wondering if we could somehow "pack" all the libraries we need along with our code. Do you think that is possible? Otherwise, how could we move to new clusters without needing the admins to install anything?

Freetown answered 15/9, 2016 at 13:39 Comment(0)

The key is to compile all the code you need by yourself, using the compiler/library/MPI toolchains installed by the admins of the clusters, so that

your software is compiled properly for the cluster hardware, and
you do not depend on the admin to install the software.

The following are very useful in this case:

Ansible, to upload/manage configuration files, rc files, set permissions, compile your binaries, etc. and deploy a new environment easily on new clusters
EasyBuild, to install your version of Python with all the needed dependencies, and install other scientific software thanks to the community-supported build procedures
CDE, to build a package with all dependencies for your binaries on your laptop and use it as-is on the clusters.

More specifically for Python, you can use

virtual envs to set up a consistent set of Python modules across all clusters, independently of the modules already installed; or
Anaconda or Canopy to use a Python scientific distribution

to have a consistent Python install across all clusters.
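
For the virtual env route, a minimal sketch might look like the following. It assumes Python 3 with the standard venv module is available on the cluster. The make_env helper, the DRY_RUN switch, and the example path and package names are illustrative, not part of any tool mentioned above.

```shell
# run CMD...: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# make_env ENV_DIR [PACKAGE...]
# Create a virtual env in ENV_DIR and install the listed packages into it,
# independently of the Python modules installed system-wide.
make_env() {
    env_dir=$1
    shift
    run python3 -m venv "$env_dir" || return 1
    if [ "$#" -gt 0 ]; then
        # Calling the env's own interpreter avoids having to "activate" it.
        run "$env_dir/bin/python" -m pip install "$@"
    fi
}

# Example (placeholder path and package list):
# make_env "$HOME/envs/analysis" numpy matplotlib
```

Setting DRY_RUN=1 first prints the commands without running them, which is handy for checking what would happen on a cluster you do not know yet.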

Maxama answered 19/9, 2016 at 13:22 Comment(0)

Don't get me wrong, but I think what you have to do is: stop behaving like amateurs.

Meaning: the integrity of your "system configuration" is one of the core assets of your "business". And you just told us that you are basically unable to easily reproduce your system configuration.

So, the real answer here can't be a recommendation to use this or that technology. The real answer is: you and the other teams involved in running your operations need to come together and define a serious strategy for fixing this.

Maybe you then decide that the way to go is for your development team to provide Dockerfiles, so that your operations team can easily create images on new machines. Or you decide that you need to use something like Ansible to enable centralized control over your complete environment.

Farfamed answered 15/9, 2016 at 13:56 Comment(4)

You need to realise that the operations team in charge of managing the clusters often does not belong to the scientists' organisation/institution; most of the time it belongs to other (sometimes foreign) universities/NGOs. And they allow access to their equipment to dozens or hundreds of different teams of scientists whose requirements are often very different, if not incompatible. This is not a one-to-one situation to deal with. – Maxama 19/9, 2016 at 13:38

@Maxama Sure. I see that point (now). Thanks for that hint (although it might still be true that he is simply working in some large org where people do not talk to each other). I have experienced that myself in the past. – Farfamed 19/9, 2016 at 18:22

Yes, you are right, I do not know the exact context of the question, so I hedged with 'often' in my comment. And I wholly agree with the 'stop behaving like amateurs'. Though in this scientific computing context, and as an HPC sysadmin myself, I believe it means 'try not to depend on the admins' :) – Maxama 20/9, 2016 at 7:36

You can rely on the admins, but often in HPC (research) there is no such thing as production: many users have very different requirements, and it is impossible for a single group of people to satisfy all the needs of the users. – Tolbert 25/9, 2016 at 16:0

That's what venv is for: it allows you to create a portable, customized environment easily, with exactly what you need and nothing more.

Fractostratus answered 21/9, 2016 at 12:50 Comment(0)

I completely agree with https://stackoverflow.com/users/1531124/ghostcat, but here is a really bad answer that will cause you a lot of problems in the near future:

If you need some dynamic libraries and you are not planning to upgrade them in the future, you can try copying all the needed libs to a folder in your app and use a script to launch the app:

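#!/bin/sh
# Append the bundled lib folder; ${VAR:+...} avoids a stray leading ":"
# (an empty entry would make the loader search the current directory).
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}/path/to/your/lib/folder"
# Replace the shell with the app and forward all arguments.
exec ./myAPP "$@"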

but keep in mind that this is bad practice.
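
For the copying step, here is a minimal sketch, assuming a Linux system where ldd is available. The collect_libs name and the lib folder are made up for illustration. Note that it copies everything ldd resolves, including the C library itself, which usually should not be carried over to a different system, so you may want to filter the list.

```shell
# collect_libs BINARY DEST_DIR
# Copy the shared libraries that ldd resolves for BINARY into DEST_DIR.
# Hypothetical helper for illustration; assumes Linux with ldd.
collect_libs() {
    binary=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # ldd prints lines such as "libm.so.6 => /lib/.../libm.so.6 (0x...)";
    # keep only the resolved absolute paths after "=>".
    ldd "$binary" | awk '$2 == "=>" && $3 ~ /^\// { print $3 }' |
    while read -r lib; do
        # -L follows symlinks so the real file is copied under the soname.
        cp -L "$lib" "$dest/" || exit 1
    done
}

# Example (paths are placeholders):
# collect_libs ./myAPP ./lib
```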

Armada answered 21/9, 2016 at 14:3 Comment(0)

Create a chroot image, like here - click. Install everything you need and then you can just chroot into it on any machine.

High answered 23/9, 2016 at 7:6 Comment(0)

I work on scientific clusters as well, and you are going to find that wherever you go.

I would only rely on the admins for installing the most basic stuff. That is:

Software necessary to build your software or run the most basic stuff: compilers and the most basic utilities (python, perl, binutils, autotools, cmake, etc.).
Software libraries that make use of I/O devices: MPI, file I/O libraries...
A queue system (they already have it most of the time).
Environment modules. This is not a must, but it really helps you get the job done, especially if you mess with different library versions or implementations (that's my case, for example).

From that point on, you can build and install on your own directories all the software you use most of the time.
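
As a rough sketch of that workflow, assuming an autotools-style package and a private prefix under your home directory: the $HOME/sw location and the use_prefix helper are illustrative choices, not something specific to any cluster.

```shell
# use_prefix DIR
# Put a user-owned install prefix ahead of the system locations, both for
# running programs (PATH, LD_LIBRARY_PATH) and for building against the
# libraries installed there (CPATH, LIBRARY_PATH, PKG_CONFIG_PATH).
use_prefix() {
    prefix=$1
    # ${VAR:+:$VAR} appends the old value only when it is non-empty,
    # which avoids leaving a stray ":" (an empty search-path entry).
    export PATH="$prefix/bin${PATH:+:$PATH}"
    export LD_LIBRARY_PATH="$prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export CPATH="$prefix/include${CPATH:+:$CPATH}"
    export LIBRARY_PATH="$prefix/lib${LIBRARY_PATH:+:$LIBRARY_PATH}"
    export PKG_CONFIG_PATH="$prefix/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"
}

# Typical use, with placeholder names:
#   use_prefix "$HOME/sw"
#   ./configure --prefix="$HOME/sw" && make && make install
```

Calling use_prefix from your shell startup file or at the top of a job script makes the same private installation visible both interactively and inside queued jobs.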

This does not mean that you cannot ask an admin to install some libraries. If you feel that many people are going to benefit from a library, then you should request its installation. In addition, you may need some specific version or some special features that are rarely used but that you really need. A very good example is the BLAS libraries (Basic Linear Algebra Subprograms):

You have lots of BLAS implementations available: the original BLAS, Intel MKL, OpenBLAS, ATLAS, cuBLAS
If that is not enough, the open source versions usually offer multiple configuration options: serial version, parallel version with PThreads, parallel version with OpenMP, parallel version with MPI...

In my particular case, most of the software that I felt was necessary for many users in the cluster ended up being installed by the admins without any problem (either I or other users requested it). But you also have to keep in mind that a cluster can have many users, and a single person or team cannot attend to every specific requirement you have, especially when you are able to take care of it yourself.

Tolbert answered 25/9, 2016 at 15:56 Comment(0)

I think you want to containerize your application in some way. Two main options (because docker/rkt and similar things are way too heavyweight for your task if I understand it correctly) in my opinion are runc and snappy.

Runc relies on the OCI runtime specification. You need to create an environment (very similar to a chroot environment, in that you need to copy everything your software uses into one directory), and then you'll be able to run your application with the runc tool. Runc itself is just one binary. At the moment it requires root privileges to run (hello, cluster admins), but there are patches that at least partly solve that, so if you build your own runc and nothing blocks you with respect to the root privilege requirements, you may be able to run your application with no administration overhead at all.

Snappy is similar in that you need to prepare a snap package for your application, this time using snapcraft as an assistant tool. Snappy is probably a bit easier for creating an application image, and IMO it is certainly better for long-term support because it clearly separates your application from the data (kinda W^X: the application image is a read-only squashfs file and the application can only write to a limited set of directories). But at the moment it will require your cluster admins to install snapd and to perform some operations, like snap installation, that require root privileges. Still, it should be better than your current situation, because that's just one non-intrusive package to install.

If these tools don't fit for some reason, there is always the option of making something of your own. That won't be easy, and there are many subtle details that can bite you, but it can be done: compile all of your dependencies and applications into some path, create wrapper scripts that set up the PATH and LD_LIBRARY_PATH environment for your components, bring that directory to the new cluster, and run the wrapper scripts instead of the target binaries. It's similar to what XAMPP does: they have quite a number of integrated things packaged into one directory that works across many distributions.
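
A hedged sketch of such a wrapper, assuming a bundle laid out as bin/ and lib/ under one directory. The make_wrapper helper and that layout are hypothetical. The generated script works out the bundle location from its own path, so the directory can be copied anywhere without editing it.

```shell
# make_wrapper BUNDLE_DIR PROGRAM
# Write BUNDLE_DIR/PROGRAM, a launcher for BUNDLE_DIR/bin/PROGRAM that
# sets PATH and LD_LIBRARY_PATH relative to wherever the bundle lives.
make_wrapper() {
    bundle=$1
    prog=$2
    # Quoted heredoc delimiter: nothing below is expanded at generation time.
    cat > "$bundle/$prog" <<'EOF'
#!/bin/sh
# Resolve the bundle directory from this script's own location.
here=$(cd "$(dirname "$0")" && pwd)
export PATH="$here/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$here/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
# The wrapper carries the same name as the real binary in bin/.
exec "$here/bin/$(basename "$0")" "$@"
EOF
    chmod +x "$bundle/$prog"
}

# Example (placeholder names):
# make_wrapper "$HOME/bundle" myAPP
```

Because the wrapper finds the real binary by its own name, one call per program is enough, and the job scripts only ever refer to the wrappers.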

update

Let's also add AppImage into the mix. Theoretically it can be a savior for your case, as it specifically does not require root privileges. It's kind of in between Snappy and rolling your own, as you need to prepare your application directory yourself (snappy can manage some of the dependencies with snapcraft when you just specify "I need this Ubuntu package"), add appropriate metadata, and then it can be packaged into a single executable.

Homophile answered 20/9, 2016 at 14:29 Comment(0)
