How to install and manage many versions of R packages

I am developing a framework for reproducible computing with R. One problem that I am struggling with is that some R code might run perfectly in version X.Y-Z of a package, but when you try to reproduce it 3 years later, the packages have been updated, some functions have changed, and the code doesn't run anymore. This problem also affects, for example, Sweave documents that use packages.

The only way to confidently reproduce the results is by installing the R version and version of the packages that were used by the original author. If this was a single case, one could pull stuff from the CRAN archives and install appropriate versions. But for my framework this is impractical, and I need to have the package versions preinstalled.

Assume for now that I restrict myself to a single version of R, e.g. 2.14. What would be a practical way to install many versions of R packages, so that I can load them on the fly? I suppose I can do something like creating separate library directories for every version of every package and then using custom lib.loc arguments while loading them. This is going to be messy though. Any tips or previous attempts to do something similar?

My framework runs on Ubuntu server.

Abase answered 14/1, 2012 at 7:25 Comment(9)
Are you familiar with dev_mode in the devtools package? IIRC it's tackling a similar problem. - Moreen
Not really. It just changes your libpath to some temporary sandbox dir, but it doesn't provide any system beyond that. - Abase
It is a duplicate. See my answer here: #8344186 - Coacervate
@Oz123: Not really. Your answer there is completely unrelated (and using --prefix is a pretty bad idea: if you want separate R version directories you should use rhome instead). This question is about versioning packages, not R installations. - Stockist
@simon urbanek, why is --prefix a bad idea? - Eskill
"Those who don't understand a package management system are doomed to reinvent it, poorly." With apologies to Herbert Spencer. Simon is correct below. You can handle all this via .libPaths() et al. - Aragon
@PaulHiemstra Because the prefix defines the system environment you're installing to (dependent libraries etc.), not the location of R. The location of R is defined by the rhome variable at install time. The typical setup for parallel R versions is to use a common prefix (typically the default /usr/local) and set rhome to version-specific directories (e.g. /usr/local/R/2.14). This is typically how R is used in organization-wide installations. - Stockist
I like the idea of project-specific libraries, so each project has its own library containing the correct version of each package. - Blunder
Following Gentleman and Temple Lang, a compendium package could be developed to provide a framework for managing multiple projects and their associated libraries. It would keep a record (database) of each project and its required dependencies and, e.g., switch the libpath to obtain a working and reproducible environment for that particular project. - Moreen

You could install packages into versioned directories (e.g. rename the installed directory to foo_1.0 instead of foo) and symlink the versions you need into one library to re-create a given R + packages snapshot. The packages could of course live in a separate tree, so you could have library.projectX/foo -> library.all/foo/1.0.
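
A minimal shell sketch of that layout, under some assumptions: the function names `link_package` and `make_snapshot` are made up for illustration, and each `<store>/<pkg>/<version>` directory is assumed to already hold one installed package (as in the `library.all/foo/1.0` example above). It only manages the symlinks; it does not install anything.

```shell
#!/bin/sh
# Sketch only: build a per-project library out of symlinks into a shared,
# versioned package store. Assumed layout (hypothetical names):
#   <store>/<pkg>/<version>/   one installed package per version directory
#   <project_lib>/<pkg>        symlink to exactly one of those directories

# link_package STORE PROJECT_LIB PKG VERSION
link_package() {
    store=$1; project_lib=$2; pkg=$3; version=$4
    # Resolve to an absolute path so the link works from any directory.
    target=$(cd "$store/$pkg/$version" 2>/dev/null && pwd) || {
        echo "missing: $pkg $version (expected $store/$pkg/$version)" >&2
        return 1
    }
    mkdir -p "$project_lib"
    # -f replaces an existing link; -n stops ln from following the old link
    # into its target directory (GNU/BSD ln, so fine on Ubuntu).
    ln -sfn "$target" "$project_lib/$pkg"
}

# make_snapshot STORE PROJECT_LIB PKG=VERSION [PKG=VERSION ...]
make_snapshot() {
    store=$1; project_lib=$2; shift 2
    for spec in "$@"; do
        # "foo=1.0" splits into package "foo" and version "1.0".
        link_package "$store" "$project_lib" "${spec%%=*}" "${spec#*=}" || return 1
    done
}
```

For example, `make_snapshot /srv/R/library.all /srv/R/library.projectX foo=1.0` (paths hypothetical) would leave `library.projectX/foo` pointing at `library.all/foo/1.0`. The project library can then be put on R's library path, for instance through the `R_LIBS` environment variable. Re-running with a different version just repoints the link, so the store itself is never modified.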

Stockist answered 14/1, 2012 at 8:36 Comment(1)
In addition, you could then set the environment variable R_LIBS to the appropriate directory for that project. - Eskill

The operating system gives you even more handles for complete separation, and the Debian / Ubuntu stack has a ton of those available. Two I have played with are

  • chroot environments: We use this to completely separate build environments from host machines. For example, all Debian uploads I produce are built in an i386 pbuilder chroot hosted on my amd64 Ubuntu server. Chroot is a very powerful Unix system call. Chroots, and particularly the pbuilder system built on top of them (for Debian package building), are meant to operate headless.

  • Virtual machines: This gives you full generality. My not-so-powerful box easily handles three virtual machines: Debian i386, Ubuntu i386 as well as Windoze XP. For this, I currently use KVM along with libvirt; this is Linux-specific. I have also used VirtualBox and VMware in the past.

Aragon answered 14/1, 2012 at 18:23 Comment(0)

I would try to modify the DESCRIPTION file and change the "Package" field there by appending the version number.

For example, you download the package source from its CRAN page (http://cran.r-project.org/web/packages/pls/) and unpack the compressed file (pls_2.3-0.zip) to a directory ("pls/"). The next steps are to change the package name in DESCRIPTION ("pls/DESCRIPTION") and install it with the command 'R CMD INSTALL pls/', where 'pls/' is the path to the package source with the modified DESCRIPTION file.

Playing with R library paths seems a dangerous thing to me.

Unsex answered 14/1, 2012 at 14:45 Comment(1)
Playing with package names is even more dangerous because you break all dependencies. Library paths are designed to be played with, unlike package names. - Stockist
