Offline installation of a list of packages: getting dependencies in order

I've got the source files for a bunch of packages and their dependencies, and I want to install them from a USB stick on computers that have no internet access. The install fails for some packages because their dependencies are not installed before the packages themselves. How can I get the dependencies installed in order, before the packages that need them?

Here's my current method to obtain the packages, their dependencies, and get them in the correct order:

# find the dependencies for the packages I want
# from https://mcmap.net/q/424691/-only-download-sources-of-a-package-and-all-dependencies
getPackages <- function(packs){
  packages <- unlist(
    tools::package_dependencies(packs, available.packages(),
                                which=c("Depends", "Imports"), recursive=TRUE)
  )
  packages <- union(packages, packs)
  packages
}

# packages I want 
my_packages <- c('stringr', 'devtools', 'ggplot2', 'dplyr', 'tidyr', 'rmarkdown', 'knitr', 'reshape2', 'gdata')

# get names of dependencies and try to get them in the right order, this seems ridiculous... 
my_packages_and_dependencies <- getPackages(my_packages)
dependencies_only <- setdiff(my_packages_and_dependencies, my_packages)
deps_of_deps <- getPackages(dependencies_only)
deps_of_deps_of_deps <- getPackages(deps_of_deps)
my_packages_and_dependencies <- unique(c(deps_of_deps_of_deps, deps_of_deps, dependencies_only, my_packages))

# where to keep the source?
local_CRAN <- paste0(getwd(), "/local_CRAN")
dir.create(local_CRAN, showWarnings = FALSE)  # download.packages() needs an existing directory

# get them from CRAN, source files
download.packages(pkgs = my_packages_and_dependencies, destdir = local_CRAN, type = "source")
# note that 'tools', 'methods', 'utils', 'stats', etc. are not on CRAN, but are part of base R

# from https://mcmap.net/q/321304/-offline-install-of-r-package-and-dependencies
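library(tools)
# type = "source" matters on Windows, where write_PACKAGES() defaults to indexing binary .zip files
write_PACKAGES(local_CRAN, type = "source")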
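
The note above about base packages can be handled in code rather than by eye. This is a minimal sketch, assuming it is enough to drop everything that R itself reports with priority "base"; the helper name drop_base and the example vector are made up for illustration.

```r
# Packages that ship with R itself (priority "base"); these are never downloaded from CRAN
base_pkgs <- rownames(installed.packages(priority = "base"))

# Hypothetical helper: remove base packages from a vector of package names
drop_base <- function(pkgs) setdiff(pkgs, base_pkgs)

# Illustrative input mixing base and CRAN package names
wanted <- c("methods", "tools", "bitops", "stats", "Rcpp", "utils")
to_download <- drop_base(wanted)
to_download
# "bitops" "Rcpp"
```

Passing only the filtered vector to download.packages() should avoid the "not available" warnings for packages such as 'tools' and 'methods'.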

Now assume I'm on another computer with a fresh install of R and RStudio (and Rtools or Xcode) and no internet connection. I plug in the USB stick, open the .Rproj file to set the working directory, and run this script:

#############################################################

## Install from source (Windows/OSX/Linux)

# What do I want to install?
my_packages_and_dependencies <- c("methods", "tools", "bitops", "stats", "colorspace", "graphics", 
                                  "tcltk", "Rcpp", "digest", "jsonlite", "mime", "RCurl", "R6", 
                                  "stringr", "brew", "grid", "RColorBrewer", "dichromat", "munsell", 
                                  "plyr", "labeling", "grDevices", "utils", "httr", "memoise", 
                                  "whisker", "evaluate", "rstudioapi", "roxygen2", "gtable", "scales", 
                                  "proto", "MASS", "assertthat", "magrittr", "lazyeval", "DBI", 
                                  "stringi", "yaml", "htmltools", "caTools", "formatR", "highr", 
                                  "markdown", "gtools", "devtools", "ggplot2", "dplyr", "tidyr", 
                                  "rmarkdown", "knitr", "reshape2", "gdata")

# where are the source files? 
local_CRAN <- paste0(getwd(), "/local_CRAN")

# scan all packages and get files names of wanted source pckgs
# I've got other things in this dir also
wanted_package_source_filenames <- list.files(local_CRAN, pattern = "tar.gz", full.names = TRUE)

# put them in order to make sure deps go first, room for improvement here...
trims <- c(local_CRAN, "/",  "tar.gz")
x1 <- gsub(paste(trims, collapse = "|"), "", wanted_package_source_filenames)
x2 <- sapply( strsplit(x1, "_"), "[[", 1)
idx <- match(my_packages_and_dependencies, x2)
wanted_package_source_filenames <- na.omit(wanted_package_source_filenames[idx])

install.packages(wanted_package_source_filenames, 
                 repos = NULL, 
                 dependencies = TRUE, 
                 contrib.url = local_CRAN, # I thought this would take care of getting dependencies automatically...
                 type  = "source" )
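
The file-name handling in the script above builds a regular expression out of the directory path, which can misbehave when the path contains regex metacharacters. A possibly more robust sketch uses basename() and the fact that CRAN source files are named package_version.tar.gz, with no underscore allowed in the package name. The paths and the wanted_order vector below are hypothetical.

```r
# Hypothetical file names, in the layout download.packages() produces
files <- c("/media/usb/local_CRAN/knitr_1.9.tar.gz",
           "/media/usb/local_CRAN/Rcpp_0.11.4.tar.gz",
           "/media/usb/local_CRAN/stringi_0.4-1.tar.gz")

# Package name = file name up to the first underscore
pkg_names <- sub("_.*$", "", basename(files))

# Reorder the files to follow a desired installation order;
# names without a matching file (here "notdownloaded") are dropped
wanted_order <- c("Rcpp", "stringi", "knitr", "notdownloaded")
ordered_files <- files[match(wanted_order, pkg_names)]
ordered_files <- ordered_files[!is.na(ordered_files)]
basename(ordered_files)
# "Rcpp_0.11.4.tar.gz" "stringi_0.4-1.tar.gz" "knitr_1.9.tar.gz"
```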

This works reasonably well, but still some packages fail to install:

sapply(my_packages_and_dependencies, require, character.only = TRUE) 

 methods        tools       bitops        stats 
        TRUE         TRUE         TRUE         TRUE 
  colorspace     graphics        tcltk         Rcpp 
        TRUE         TRUE         TRUE         TRUE 
      digest     jsonlite         mime        RCurl 
        TRUE         TRUE         TRUE        FALSE 
          R6      stringr         brew         grid 
        TRUE         TRUE         TRUE         TRUE 
RColorBrewer    dichromat      munsell         plyr 
        TRUE         TRUE         TRUE         TRUE 
    labeling    grDevices        utils         httr 
        TRUE         TRUE         TRUE        FALSE 
     memoise      whisker     evaluate   rstudioapi 
        TRUE         TRUE         TRUE         TRUE 
    roxygen2       gtable       scales        proto 
        TRUE         TRUE         TRUE         TRUE 
        MASS   assertthat     magrittr     lazyeval 
        TRUE         TRUE         TRUE         TRUE 
         DBI      stringi         yaml    htmltools 
        TRUE         TRUE         TRUE         TRUE 
     caTools      formatR        highr     markdown 
        TRUE         TRUE         TRUE         TRUE 
      gtools     devtools      ggplot2        dplyr 
        TRUE        FALSE        FALSE         TRUE 
       tidyr    rmarkdown        knitr     reshape2 
       FALSE        FALSE         TRUE         TRUE 
       gdata 
        TRUE 
Warning messages:
1: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘RCurl’
2: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘httr’
3: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘devtools’
4: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘ggplot2’
5: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘tidyr’
6: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘rmarkdown’
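
As an aside, the same check can be done without attaching every package, by comparing against the installed library. A small sketch, where not_installed is a hypothetical helper and the package names are only examples:

```r
# Hypothetical helper: which of `pkgs` are not installed in any library on .libPaths()?
not_installed <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# "stats" and "utils" ship with R; the last name is deliberately bogus
still_missing <- not_installed(c("stats", "utils", "aPackageThatDoesNotExist"))
still_missing
# "aPackageThatDoesNotExist"
```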

Seems that knitr must come before rmarkdown, reshape2 before tidyr and ggplot2, etc. etc.

There must be a simpler and more complete way to get the list of source files into the specific order needed, so that every dependency is installed before the packages that need it. What's the simplest way to do that (without using any contributed packages)?
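
For illustration, one base-R way to get such an order is a simple topological sort: repeatedly pick the packages whose dependencies have all been placed already. This is only a sketch under assumptions. The function install_order and the toy dependency list are made up here; on a machine with internet access, a real list of this shape could come from tools::package_dependencies(pkgs, db = available.packages(), which = c("Depends", "Imports", "LinkingTo"), recursive = FALSE).

```r
# Hypothetical helper: order packages so that each one comes after its dependencies.
# `deps` is a named list: names are packages, values are character vectors of
# direct dependencies (the same shape tools::package_dependencies() returns).
install_order <- function(deps) {
  pkgs <- names(deps)
  # ignore dependencies outside the set (e.g. base packages such as 'tools')
  deps <- lapply(deps, function(d) intersect(d, pkgs))
  ordered <- character(0)
  while (length(ordered) < length(pkgs)) {
    remaining <- setdiff(pkgs, ordered)
    # packages whose dependencies have all been placed already
    is_ready <- vapply(deps[remaining], function(d) all(d %in% ordered), logical(1))
    ready <- remaining[is_ready]
    if (length(ready) == 0) {
      stop("circular dependency among: ", paste(remaining, collapse = ", "))
    }
    ordered <- c(ordered, ready)
  }
  ordered
}

# Toy dependency list (illustrative only, not the real CRAN metadata)
toy_deps <- list(
  rmarkdown = c("knitr", "yaml", "tools"),
  knitr     = c("evaluate", "yaml"),
  evaluate  = character(0),
  yaml      = character(0)
)
install_order(toy_deps)
# "evaluate" "yaml" "knitr" "rmarkdown"
```

The resulting vector could then be used to order the source files before passing them to install.packages() with repos = NULL.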

This is the system I am currently working on. I'm using the source versions of packages to be prepared for whatever the offline computers run (OSX/Linux/Windows):

> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
 [1] tcltk     grid      tools     stats     graphics 
 [6] grDevices utils     datasets  methods   base     

other attached packages:
 [1] gdata_2.13.3       reshape2_1.4.1    
 [3] knitr_1.9          dplyr_0.4.1       
 [5] gtools_3.4.1       markdown_0.7.4    
 [7] highr_0.4          formatR_1.0       
 [9] caTools_1.17.1     htmltools_0.2.6   
[11] yaml_2.1.13        stringi_0.4-1     
[13] DBI_0.3.1          lazyeval_0.1.10   
[15] magrittr_1.5       assertthat_0.1    
[17] proto_0.3-10       scales_0.2.4      
[19] gtable_0.1.2       roxygen2_4.1.0    
[21] rstudioapi_0.2     evaluate_0.5.5    
[23] whisker_0.3-2      memoise_0.2.1     
[25] labeling_0.3       plyr_1.8.1        
[27] munsell_0.4.2      dichromat_2.0-0   
[29] RColorBrewer_1.1-2 brew_1.0-6        
[31] stringr_0.6.2      R6_2.0.1          
[33] mime_0.2           jsonlite_0.9.14   
[35] digest_0.6.8       Rcpp_0.11.4       
[37] colorspace_1.2-5   bitops_1.0-6      
[39] MASS_7.3-35       

loaded via a namespace (and not attached):
[1] parallel_3.1.2

EDIT: Following Andrie's helpful comment, I've had a go with miniCRAN. The bit that's missing from the vignette is how to actually install the packages from the local repo. This is what I've tried:

library("miniCRAN")

# Specify list of packages to download
pkgs <- c('stringr', 'devtools', 'ggplot2', 'dplyr', 'tidyr', 'rmarkdown', 'knitr', 'reshape2', 'gdata')

# Make list of package URLs
revolution <- c(CRAN="http://cran.revolutionanalytics.com")
pkgList <- pkgDep(pkgs, repos=revolution, type="source" )
pkgList

# Set location to store source files 
local_CRAN <- paste0(getwd(), "/local_CRAN")

# Make repo for source
makeRepo(pkgList, path = local_CRAN, repos = revolution, type = "source")

# install...
install.packages(pkgs, 
                 repos = local_CRAN, # do I really need "file:///"?
                 dependencies = TRUE, 
                 contrib.url = local_CRAN,
                 type  = "source" )

And the result is:

Installing packages into ‘C:/emacs/R/win-library/3.1’
(as ‘lib’ is unspecified)
Warning in install.packages :
  unable to access index for repository C:/Users/.../local_CRAN/src/contrib
Warning in install.packages :
  packages ‘stringr’, ‘devtools’, ‘ggplot2’, ‘dplyr’, ‘tidyr’, ‘rmarkdown’, ‘knitr’, ‘reshape2’, ‘gdata’ are not available (for R version 3.1.2)

What am I missing here?

EDIT Yes, I was missing proper use of file:///, which should be like this:

install.packages(pkgs, 
                 repos = paste0("file:///", local_CRAN),
                 type = "source")

That's moved me along heaps; it all basically works as expected now. Thanks very much. Now I just have this to look into: fatal error: curl/curl.h: No such file or directory, which is stopping RCurl and httr from installing.
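
For anyone repeating this on several operating systems, here is a small sketch of building the file:/// URL in a way that should work for both Windows paths (C:/...) and Unix paths (/home/...). The helper repo_url is hypothetical, and the temporary directory only stands in for the real repository location.

```r
# Hypothetical helper: turn a local directory into a file:// repository URL
repo_url <- function(path) {
  p <- normalizePath(path, winslash = "/", mustWork = FALSE)
  # Windows paths such as "C:/repo" need a leading slash; Unix paths already have one
  if (substr(p, 1, 1) != "/") p <- paste0("/", p)
  paste0("file://", p)
}

# Stand-in location for the repository
local_repo <- repo_url(file.path(tempdir(), "local_CRAN"))
local_repo

# install.packages() looks for the source package index under <repo>/src/contrib
source_index_url <- contrib.url(local_repo, type = "source")
source_index_url
```

The value of local_repo is what would go into the repos argument of install.packages().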

Wessex answered 5/3, 2015 at 18:43 Comment(5)
My package miniCRAN can help with this. You tell miniCRAN the list of packages you would ever want to install, then it downloads those packages and creates a repository on your local machine that behaves like CRAN, i.e. it respects install.packages() etc. See cran.r-project.org/web/packages/miniCRAN/index.html - Coridon
@Coridon Seems like a reasonable thing to post in an answer. - Amorist
Could you loop over the DESCRIPTION files, counting the number of dependencies? Then install all the packages with no dependencies first, the ones with only one dependency second (somehow sorting by whether their names were in the list of any other single dependency packages) and so on "down the line"? - Nudity
@BondedDust Yes, you could. That's exactly how I determine the recursive dependencies in miniCRAN, using the function pkgDep(). This function is essentially a wrapper around tools::package_dependencies(). I'll add that this took me ages to debug comprehensively. - Coridon
@Andrie: I'm not surprised it took debugging. I've never mastered recursive programming in R myself. Glad to know I don't need to do it in this instance. Thanks for your work. - Nudity

The package miniCRAN can help with this. You tell miniCRAN the list of packages you would ever want to install. It then figures out the dependencies, downloads those packages, and creates a repository on your local machine that behaves like CRAN, i.e. it respects install.packages() etc.

More information:

To install from the local miniCRAN repository, you have two options.

  1. Firstly, you can use the URI convention file:///. e.g.

    install.packages("ggplot2", repos="file:///path/to/file/")
    
  2. Alternatively, you can configure the destination as an HTTP server and make your repository available via a URL. In this case, your local repository will look and feel exactly like a CRAN mirror, other than it only contains your desired packages.
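
With either option, the repository URL can also be set once per session so that later install.packages() calls pick it up by default. A minimal sketch, reusing the placeholder path from option 1; the name LOCAL is an arbitrary label chosen for this example.

```r
# Point R at the local repository, remembering the previous setting
old <- options(repos = c(LOCAL = "file:///path/to/file/"))
current <- getOption("repos")
current

# With the option set, the repos argument can be omitted:
# install.packages("ggplot2", type = "source")

# Restore the previous repositories when done
options(old)
```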

Coridon answered 5/3, 2015 at 18:54 Comment(7)
Thanks very much, that looks very relevant. I've had a go with it and got a bit stuck. Would you mind checking the edit to my question? - Wessex
Thanks again, that's pretty much solved the problem. Just an RCurl puzzle to solve now, but that's another question... - Wessex
Is miniCRAN still alive? I tried to install it on 2 different machines with 2 different OSes (1 CentOS, 1 Debian), and on both I got: In install.packages("miniCRAN") : installation of package ‘miniCRAN’ had non-zero exit status - Arciniega
@LuisLeal You don't provide enough information to diagnose why this doesn't work for you. Please post a new question (and tag me) or post on the GitHub issue tracker at github.com/RevolutionAnalytics/miniCRAN/issues - Coridon
Thank you so much, and sorry I didn't provide many details. My question was whether the project was still under maintenance, because I tried to install it on 3 different machines (CentOS, Debian, Ubuntu) without success. After checking the error messages in detail and googling, I found that what I was missing (for CentOS) was: yum install libxml2; yum install libxml2-devel; yum install libcurl; yum install libcurl-devel; yum install openssl; yum install openssl-devel - Arciniega
@LuisLeal Good point. Those are system dependencies of curl and XML, so strictly speaking nothing to do with miniCRAN. However, it won't harm to add a note to this effect to miniCRAN. I've added an issue to the project at github.com/RevolutionAnalytics/miniCRAN/issues/98 - Coridon
Can this package also hold Bioconductor packages? @Coridon - Kayser
