How should I persistently save Julia packages in a Docker container?

I'm running Julia on the Raspberry Pi 4. For what I'm doing, I need Julia 1.5, and thankfully there is a Docker image of it here: https://github.com/Julia-Embedded/jlcross

My challenge is that, because this is work-in-progress development, I find myself adding packages here and there as I work. What is the best way to persistently save the updated environment?

Here are my problems:

  1. I'm having a hard time wrapping my mind around how volumes can save the packages installed by Julia's package manager and keep them around the next time I run the container.

  2. It seems kludgy to commit my Docker container every time I install a package.

Is there a consensus on the best way, or is there another way to do what I'm trying to do?

Odell asked 27/9, 2020 at 21:54

You can persist the state of downloaded & precompiled packages by mounting a dedicated volume into /home/your_user/.julia inside the container:

$ docker run --mount source=dot-julia,target=/home/your_user/.julia [OTHER_OPTIONS]

Depending on how (and by which user) julia is run inside the container, you might have to adjust the target path above to point to the first entry in Julia's DEPOT_PATH.

You can control this path by setting it yourself via the JULIA_DEPOT_PATH environment variable. Alternatively, you can check whether it is in a nonstandard location by running the following command in a Julia REPL in the container:

julia> println(first(DEPOT_PATH))
/home/francois/.julia
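
The JULIA_DEPOT_PATH route described above can be sketched as a small shell wrapper. This is a minimal sketch, not a tested recipe for the jlcross image: the volume name dot-julia comes from the command above and the image tag is the one the asker reports using, while the mount point /depot, the helper names run and julia_in_docker, and the DRY_RUN switch are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: keep Julia's depot on a named Docker volume by pointing
# JULIA_DEPOT_PATH at the volume's mount point.
# Assumptions (not from the original answer): the mount point /depot,
# the helper names, and the DRY_RUN switch.

IMAGE="${IMAGE:-terasakisatoshi/jlcross:rpizero-v1.5.0}"
VOLUME="${VOLUME:-dot-julia}"
DEPOT="${DEPOT:-/depot}"

# Print the command when DRY_RUN=1, otherwise execute it.
# The printed form is for inspection only; quoting is not preserved.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Start Julia with its depot stored on the named volume.
# Any extra arguments are passed through to julia.
julia_in_docker() {
    run docker run -it --rm \
        --mount "source=$VOLUME,target=$DEPOT" \
        --env "JULIA_DEPOT_PATH=$DEPOT" \
        "$IMAGE" julia "$@"
}
```

With DRY_RUN=1 set, calling julia_in_docker only prints the docker command; without it, the call starts a REPL whose depot lives on the volume, so packages added in one session should still be there after the container is removed. Setting the variable explicitly avoids having to guess which home directory the container user has.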
Obolus answered 28/9, 2020 at 12:49 Comment(1)
You nailed it! Thanks for helping me find how to easily do this with Docker. For this particular case, this is what worked: $ docker run --mount source=juliadotfolder,target=/root/.julia -it terasakisatoshi/jlcross:rpizero-v1.5.0 julia - Odell

You can manage packages and their versions via a Julia Project.toml file. This file can hold both the list of your dependencies and their version constraints.

Here is a sample Julia session:

julia> using Pkg

julia> pkg"generate MyProject"
 Generating  project MyProject:
    MyProject\Project.toml
    MyProject\src/MyProject.jl

julia> cd("MyProject")

julia> pkg"activate ."
 Activating environment at `C:\Users\pszufe\myp\MyProject\Project.toml`

julia> pkg"add DataFrames"

The last step is to add package version information to your Project.toml file. We start by checking the version number that is known to work well:

julia> pkg"st DataFrames"
Project MyProject v0.1.0
Status `C:\Users\pszufe\myp\MyProject\Project.toml`
  [a93c6f00] DataFrames v0.21.7

Now edit the [compat] section of the Project.toml file to pin that version to exactly v0.21.7:

name = "MyProject"
uuid = "5fe874ab-e862-465c-89f9-b6882972cba7"
authors = ["pszufe <pszufe@******.com>"]
version = "0.1.0"

[deps]
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"

[compat]
DataFrames = "= 0.21.7"

Note that the last line contains the equals sign twice: the first is the TOML assignment, and the second, inside the string, is the equality specifier that pins the exact version number. See also https://julialang.github.io/Pkg.jl/v1/compatibility/.

To reuse that structure (e.g. in a different Docker container or when moving between systems), all you need to do is:

cd("MyProject")
using Pkg
pkg"activate ."
pkg"instantiate"
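
Applied to a container, the same activate-and-instantiate steps can be run non-interactively from the host. The following is a minimal sketch under several assumptions: the project directory is bind-mounted at /work (a hypothetical path), the depot sits at /root/.julia as the asker reported for the jlcross image, and the volume name dot-julia, the helper names run and instantiate_in_docker, and the DRY_RUN switch are illustrative only.

```shell
#!/bin/sh
# Sketch: instantiate a Julia project inside a container while keeping
# the depot (downloaded packages) on a named volume.
# Assumptions: /work as the in-container project path, /root/.julia as
# the depot location, and all helper and variable names.

IMAGE="${IMAGE:-terasakisatoshi/jlcross:rpizero-v1.5.0}"
PROJECT_DIR="${PROJECT_DIR:-$PWD/MyProject}"
VOLUME="${VOLUME:-dot-julia}"
DEPOT="${DEPOT:-/root/.julia}"

# Print the command when DRY_RUN=1, otherwise execute it.
# The printed form is for inspection only; quoting is not preserved.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Bind-mount the project, mount the depot volume, and run
# Pkg.instantiate() against the project's Project.toml.
instantiate_in_docker() {
    run docker run --rm \
        --mount "type=bind,source=$PROJECT_DIR,target=/work" \
        --mount "source=$VOLUME,target=$DEPOT" \
        "$IMAGE" julia --project=/work -e 'using Pkg; Pkg.instantiate()'
}
```

Because the depot is on a volume, the packages fetched by the first call should be reused on later runs instead of being downloaded again, while Project.toml stays on the host where it can be version-controlled.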

Additional note

Also have a look at the JULIA_DEPOT_PATH environment variable (https://docs.julialang.org/en/v1/manual/environment-variables/). When moving installations between Docker containers, it can be convenient to control where your packages are actually installed. For example, you might copy the JULIA_DEPOT_PATH folder between two containers that have the same Julia installation to avoid spending time reinstalling packages, or you might be building a Docker image without an internet connection.

Wyn answered 27/9, 2020 at 22:16 Comment(4)
It might also be interesting to version-control the Manifest.toml file instead of manually pinning dependencies to specific versions within Project.toml. - Greening
Yes! But somehow I feel that the Julia package manager is more robust when re-instantiating the project in each new environment. I have had some bad experiences recovering the state from Manifest.toml. Moreover, if you add a new package it will rewrite your Manifest each time, and package manager operations can update your versions. Docker environments are always more fragile, so I prefer to have the control myself. In the end it is of course a matter of taste ;-) - Wyn
Thanks for the detailed response! I'm still trying to figure out how to apply it to my Docker containers, though. Can you help me out there? I'm assuming that I wouldn't have to reinstall and precompile each package whenever I run the container. - Odell
All package installation state, along with the precompilation caches, is kept in JULIA_DEPOT_PATH. You could also build a Julia sysimage and distribute it across containers, but this is less convenient. - Wyn

In my Dockerfile I simply install the packages just like you would do with pip:

FROM jupyter/datascience-notebook

RUN julia -e 'using Pkg; Pkg.add.(["CSV", "DataFrames", "DataFramesMeta", "Gadfly"])'

Here I start with a base datascience notebook image, which includes Julia, and then call Julia from the command line, instructing it to execute the code needed to install the packages. The only downside for now is that package precompilation is triggered each time I load the container in VS Code.

If I need new packages, I simply add them to the list.

Jevons answered 19/2, 2023 at 9:59 Comment(0)
