Use the Julia language without an internet connection (mirror?)

Problem: I would like to make Julia available to the developers on our corporate network, which, because it holds sensitive data, has no internet access at all (not even through a proxy).

As far as I understand, Julia is designed to use GitHub. For instance, Pkg.init() at the julia> prompt tries to access git://github.com/JuliaLang/METADATA.jl

Example: I solved this problem for R by creating a local CRAN repository (rsync) and setting up a local web server. I solved it for Python the same way, with a local PyPI repository (bandersnatch) plus a web server.

Question: Is there a way to create a local repository for metadata and packages for Julia?

Thank you in advance. Roman

Koblenz answered 14/1, 2015 at 10:28 Comment(0)

Yes, one of the benefits of using the Julia package manager is that you should be able to fork METADATA and host it anywhere you like (and keep a branch where you can check new packages before allowing your clients to update). You might be one of the first people to actually set up such a system, so expect to submit some issues (or better yet, pull requests) to get everything working smoothly.

See the extra arguments to Pkg.init() where you specify the METADATA repo URL.
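The hosting side of this could be sketched as follows. This is a minimal sketch, assuming git is installed and a clone of METADATA.jl has already been brought onto the internal network; the function name mirror_metadata and all paths are hypothetical. It mirrors only the metadata, not the packages themselves.

```shell
#!/bin/bash
# Create or refresh a bare mirror of a METADATA clone for internal hosting.
# mirror_metadata, its arguments and every path used with it are hypothetical.
mirror_metadata() {
    local src="$1" dest="$2"
    if [ -d "$dest" ]; then
        # The mirror already exists: fetch new refs from its source.
        git --git-dir="$dest" remote update --prune
    else
        # First run: create a bare mirror containing all branches and tags.
        git clone --mirror "$src" "$dest"
    fi
}
```

A call such as mirror_metadata /data/incoming/METADATA.jl /srv/git/METADATA.jl.git (both paths made up) would produce a repository that can be served internally; its location should then be usable as the first argument to Pkg.init().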

If you want a solution that is simpler to manage, I would also consider a two-tier setup: install packages on one system (connected to the internet), then copy the resulting ~/.julia directory to the restricted system. If the packages you use have binary dependencies, you might run into problems if the systems on the two sides are not similar, or if some of the dependencies are installed globally, but Pkg.build("Pkgname") might help.

Ronaronal answered 14/1, 2015 at 11:42 Comment(1)
Thank you for your reply. As far as I understand, with the ~/.julia approach I can install packages on tier one (internet) using Pkg.add("Packagename"), move the directory to tier two (no internet), and set the JULIA_PKGDIR environment variable to point to .julia. I have to test it. Regarding your first option, I cloned METADATA to a local directory, and I could point to it using JULIA_PKGDIR. But then it's only the package metadata; everything else must still be downloaded from GitHub when I actually add a package. The optimal case would be a repo that includes everything, like a 100 GB CRAN mirror. - Koblenz

This is how I solved it (for now), using the second suggestion by ivarne. I use a two-tier setup with two networks: one connected to the internet (office network) and one air-gapped (development network).

System information: openSuSE-13.1 (both networks), julia-0.3.5 (both networks)

Tier one (office network)

  • installed julia on an NFS share, /sharename/local/julia.
  • soft linked /sharename/local/bin/julia to /sharename/local/julia/bin/julia
  • appended /sharename/local/bin/ to $PATH using a script in /etc/profile.d/scriptname.sh
  • created /etc/gitconfig on all office network machines: [url "https://"] insteadOf = git:// (to solve proxy server problems with github)
  • now every user on the office network can simply run # julia
  • Pkg.add("PackageName") is then used to install various packages.
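The /etc/gitconfig entry from the list above can also be written with git config instead of by hand. A minimal sketch, assuming git is installed; the function name set_https_rewrite and its optional file argument are hypothetical conveniences, not part of the original setup.

```shell
#!/bin/bash
# Make git rewrite git:// URLs to https://.
# With a file argument the setting goes into that file; with no argument it
# goes into the system-wide config (typically /etc/gitconfig, needs root).
# set_https_rewrite is a hypothetical helper name.
set_https_rewrite() {
    if [ "$#" -ge 1 ]; then
        git config --file "$1" url."https://".insteadOf "git://"
    else
        git config --system url."https://".insteadOf "git://"
    fi
}
```

Either form should produce a [url "https://"] section with insteadOf = git:// in the target file.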

The two networks are connected periodically, for a short period of time and with certain security measures (ssh, firewall, routing), for automated data exchange.

Tier two (development network)

  • installed julia on an NFS share, the same way as on tier one.
  • When the networks are connected I use a shell script with rsync -avz --delete to synchronize the .julia directory of tier one to tier two for every user.
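The synchronization step could look roughly like this. A minimal sketch, assuming rsync is installed; the function name sync_julia_dir and any host names or paths used with it are hypothetical, and only the rsync -avz --delete flags come from the description above.

```shell
#!/bin/bash
# Mirror one user's .julia directory from tier one to tier two.
# sync_julia_dir and its arguments are hypothetical.
sync_julia_dir() {
    local src="$1" dest="$2"
    # Trailing slashes make rsync copy the contents of src into dest.
    # --delete removes files from dest that no longer exist in src,
    # so packages removed on tier one also disappear on tier two.
    rsync -avz --delete "${src%/}/" "${dest%/}/"
}
```

Per user, it might be called as sync_julia_dir tier1host:/home/alice/.julia /home/alice/.julia (host and user made up), inside a loop over the accounts to synchronize.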

Conclusion (so far): It seems to work reasonably well. As ivarne suggested, there are problems if installing a package on tier one involves more than just copying files (compilation?): the package won't run on tier two. But this can be resolved with Pkg.build("Pkgname").

Koblenz answered 4/2, 2015 at 15:39 Comment(2)
Hi, it's been four years since you wrote this. Does the above workflow still work for you, or do you recommend something else? - Cay
I have not used it for about 3 years, so I cannot confirm that this still works. If Julia package management has not changed significantly, it should still be possible. - Koblenz

PackageCompiler.jl seems like the best tool for using modern Julia (v1.8) on secure systems. The following approach requires a build server with the same architecture as the deployment server, something your institution probably already uses for developing containers, etc.

  1. Build a sysimage with PackageCompiler's create_sysimage()
  2. Upload the build (sysimage and depot) along with the Julia binaries to the secure system
  3. Alias a script to julia, similar to the following example:
#!/bin/bash
set -Eeu -o pipefail

unset JULIA_LOAD_PATH

export JULIA_PROJECT=/Path/To/Project
export JULIA_DEPOT_PATH=/Path/To/Depot
export JULIA_PKG_OFFLINE=true

/Path/To/julia -J/Path/To/sysimage.so "$@"

I've been able to run a research pipeline on my institution's secure system, for which there is a public version of the approach.

Imagination answered 14/12, 2022 at 15:32 Comment(0)
