R: source() and path to source files
D

6

19

There must be something that I don't understand about the source() command in R. I'm still new to it, but I cannot for the life of me understand where it gets its directories from! My problem is this:

I have a wrapper script, wrapper.R, and a source file containing some functions, functions.R. Both of these are in the same directory. If I call source('functions.R') inside the wrapper script while standing inside the directory where both files are located, everything is fine. However, I want to be able to run my wrapper.R script from some other directory, i.e. not the one where these scripts are located. If I run my wrapper from another directory, it doesn't work, and I get a "cannot open the file" error.

I googled and found lots of different threads, but this question seemed to be very clear. As I understand it, the way I'm doing it should work. Clearly, I'm misunderstanding something. My reading of that thread leads me to believe that source() works relative to the directory in which the file that calls source() is located. My reading also leads me to believe that I should not be using chdir = TRUE, as I want to keep the advertised relative directory.

Seeing as it doesn't work... what am I misunderstanding? How can I source files in the same directory as my wrapper script when called from somewhere else?

Damron answered 15/3, 2017 at 16:45 Comment(4)
This should all just come down to the working directory. R needs to know where to look for the files. You can find your current working directory by typing getwd() and you can reset it with setwd(). But you could always just do something like source("c:\...") and that should work.Swift
Sorry, I was being unclear. I can set the working directory, but what if I'm trying to distribute these scripts to a colleague? I won't know exactly where he put them. Is there a way to source the files without actually knowing the directory, and still call the wrapper function from some other directory?Damron
I believe if you write a bat file to run the scripts it will automatically use the directory it is in as the working directory.Swift
Linking related older question: stackoverflow.com/questions/7222107Lillalillard
M
22

If you are distributing a script to colleagues, you should really not be writing a script that sources other scripts. What if you want to rename or move functions.R in the future? What if you need to modify a function in functions.R, but wrapper.R relies on the older version of that function? It's a flimsy solution that will cause headache. I would recommend either of the following instead.

  1. Put everything needed into a single, self-contained script and distribute that.

  2. If you really want to separate code into different files, write a package. Might sound like overkill, but packages can actually be very simple and lightweight. In the simplest form a package is just a directory with a DESCRIPTION and NAMESPACE file along with an R/ directory. Hadley breaks this down nicely: https://r-pkgs.org/whole-game.html.

Malefactor answered 15/3, 2017 at 17:50 Comment(8)
Aaah, okay, that was a concept I didn't really envision... If I want to go the "package"-route, what if my project is a mixed-language project? If I have a small number of Python scripts in addition to the R ones? What would be a good way to distribute such a project?Damron
If you want to go the package route, create a directory in your package called inst/python/ and put the python scripts in there. More info from Hadley: r-pkgs.had.co.nz/inst.html#inst-other-langs.Malefactor
If you want to do a single R script followed by a single python script, feather is probably your best option. It is the best way to share data.frame-like objects between R and python (I'm assuming this is data-pipeline script): blog.rstudio.org/2016/03/29/featherMalefactor
@Sajber feel free to accept this answer if it helped :)Malefactor
It did help, thanks! Related question: now that I've gone the package route (it was easier than I initially thought, at least the R parts), how would I distribute the wrapper script? The wrapper should only be run from the command line, so it's not really part of the package itself (if I understand package structure correctly), but calls all the functions in the package. Where should I put the file? How do I make it available together with the package, but make it clear that it should be run from the command line?Damron
You should place it in exec/, and when you install the package, place a symlink in a directory on your $PATH that links to /path/to/your/installed/package/exec/your_script. Also make sure you learn how to use devtools, particularly load_all().Malefactor
"Put everything needed into a single, self-contained script and distribute that." Bad, bad, bad programming practice. Readability counts.Waldron
"In the simplest form a package is just a directory..." this is oversimplified. The source for a package is a directory with certain files and subdirs, but to actually load the package into R, it has to be compiled first. This can be tricky for a beginner.Spill
G
30

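You can do this using the here package. It uses the "current working directory at the time when the package is loaded". In other words, the directory you start your R session from. From there, here looks upward for a project root marker, such as an .Rproj or .here file, and builds paths relative to that root.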

In your case the code would be:

source(here::here('functions.R'))

This will work even if the wrapper script wrapper.R is in a different directory in the project.

If functions.R is in a subdirectory of the project, just add it to the call to here(), to complete the relative path:

source(here::here('subdirectory', 'functions.R'))
Granulose answered 3/2, 2018 at 22:36 Comment(2)
I just discovered the here package and it is solving all sorts of file path issues for me. I encourage folks to try it out.Faletti
"In other words, the directory you start your R session from." — Which is generally not the directory where the code resides. For interactive RStudio projects and exploratory notebooks it happens to be the case but for other types of project it isn’t; notably, for any kind of self-contained application (e.g. command line utilities). Those are almost always launched in a directory that is completely unrelated to the location of the code. here::here does not work in those cases.Pandarus
P
5

source fundamentally doesn’t support this. The other answers show some workarounds that work in limited cases but all fail in some (common) cases. In particular, chdir = TRUE isn’t a good option as you noted yourself.

A better solution is to use box::use from the ‘box’ package. This package allows you to treat R source code as proper modules. One property of this is that modules can load local modules.

Inside your wrapper.R, replace the source call by

box::use(./functions[...])

Or, if you want to export these functions from your wrapper.R module (rather than just using them internally), do the following instead:

#' @export
box::use(./functions[...])

And to load wrapper.R itself, use

box::use(project/wrapper)

Where project is the project name, and needs to correspond to the name of the folder that your wrapper.R script is saved in.

Please consult the Get started vignette for more information on the usage of ‘box’ modules.

Pandarus answered 4/6, 2021 at 15:39 Comment(0)
L
3

Maybe you can define a helper function in wrapper.R that will try to load other files from the same directory. For example

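source_here <- function(x, ...) {
    # Fall back to the current working directory
    dir <- "."
    if(sys.nframe()>0) {
        # Frame 1 belongs to the outermost call; when that call is source(),
        # it holds `ofile`, the path of the file being sourced
        frame <- sys.frame(1)
        if (!is.null(frame$ofile)) {
            dir <- dirname(frame$ofile)
        }
    }
    # Source x relative to the directory found above
    source(file.path(dir, x), ...)
}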

Then you would call

# inside wrapper.R
source_here("functions.R")

Then you would just have to source wrapper.R, and it will look for functions.R in the same directory.
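
The helper above depends on wrapper.R itself being run through source(). When the script is instead launched from the command line with Rscript, a commonly used complement is to read the --file= entry that Rscript adds to commandArgs(). A minimal sketch, assuming a hypothetical helper name script_dir_from_args that is not part of the original answer:

```r
# Hypothetical helper (not from the original answer): find the directory of the
# script that Rscript was asked to run, based on the "--file=" argument.
# Returns NULL when no such argument is present (e.g. in an interactive session).
script_dir_from_args <- function(args = commandArgs(trailingOnly = FALSE)) {
  file_arg <- grep("^--file=", args, value = TRUE)
  if (length(file_arg) == 0) {
    return(NULL)
  }
  script_path <- sub("^--file=", "", file_arg[1])
  dirname(normalizePath(script_path, winslash = "/", mustWork = FALSE))
}

# Possible use inside wrapper.R (left commented out so this block has no side effects):
# dir <- script_dir_from_args()
# if (is.null(dir)) dir <- "."
# source(file.path(dir, "functions.R"))
```

This only covers the Rscript case; combining it with the ofile check above would be one way to handle both sourcing and command-line use.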

Limestone answered 15/3, 2017 at 17:28 Comment(4)
I'm trying to understand your code a little better. What is the point of having the if(sys.nframe()>0)?Swift
It's probably overkill. I was at first running the code outside a function so there wasn't always a parent frame.Limestone
If source_here is in a different file, wrapper.R, then how do you source it?Albuminoid
@Albuminoid source_here will only work for the file it's contained in. It cannot be included in a separate wrapper.R file or it will only look in the folder where that file is located.Limestone
W
2

One answer I didn't see yet is to just use absolute paths. When you call source("myfunctions.R"), it resolves the relative path implicitly against getwd(). Use the full path to avoid problems when you change the working directory. However, when you share the work, others will have to change all the paths themselves.
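
The behaviour described here can be checked with a small experiment. The following is a sketch only, using throwaway directories under tempdir(); the directory, file, and function names are illustrative assumptions:

```r
# Set up a throwaway "project" directory containing functions.R
proj <- file.path(tempdir(), "source_demo")
dir.create(proj, showWarnings = FALSE)
writeLines("square <- function(x) x^2", file.path(proj, "functions.R"))

# Move to an unrelated working directory, remembering the previous one
other <- file.path(tempdir(), "elsewhere")
dir.create(other, showWarnings = FALSE)
old_wd <- setwd(other)

# A relative path is resolved against getwd(), so the file is not found here
found_relative <- file.exists("functions.R")

# A full path works regardless of the working directory
source(file.path(proj, "functions.R"))
result <- square(4)

# Restore the original working directory
setwd(old_wd)
```

Here found_relative ends up FALSE, while sourcing through the full path defines square() as expected.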

Wineshop answered 12/5, 2020 at 20:24 Comment(1)
Using absolute paths solves some problems, but creates others when the code is run in a slightly different tree. Other languages offer a little more, though I'm interested in looking at the "here" package mentioned above.Neuralgia
L
0

I know it's a bit late, but I ran into this answer on getting the directory where the script being sourced is.

I think you should be able to do this without installing additional packages:

cur_dir = utils::getSrcDirectory(function(){})[1]
source(file.path(cur_dir, 'functions.R'))
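
One caveat worth hedging: getSrcDirectory() depends on source references, which are kept only when the keep.source option is TRUE. That option defaults to interactive(), so under Rscript the call can come back empty unless source references are requested explicitly. A minimal sketch with throwaway files (the directory and function names are illustrative assumptions):

```r
# Create a throwaway directory holding wrapper.R and functions.R
demo_dir <- file.path(tempdir(), "srcdir_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("add_one <- function(x) x + 1", file.path(demo_dir, "functions.R"))
writeLines(c(
  "cur_dir <- utils::getSrcDirectory(function() {})[1]",
  "source(file.path(cur_dir, 'functions.R'))"
), file.path(demo_dir, "wrapper.R"))

# keep.source = TRUE makes sure the source reference that getSrcDirectory()
# relies on is recorded, even in a non-interactive session
source(file.path(demo_dir, "wrapper.R"), keep.source = TRUE)

result <- add_one(1)
```

Without keep.source = TRUE in a non-interactive run, cur_dir may be NA, so checking it before building the path would be a sensible safeguard.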
Lillalillard answered 14/9, 2023 at 20:51 Comment(0)
