R: source() and path to source files


There must be something that I don't understand about the source() command in R. I'm still new to it, but I cannot for the life of me understand where it gets its directories from! My problem is this:

I have a wrapper script, wrapper.R, and a source file containing some functions, functions.R. Both of these are in the same directory. If I call source('functions.R') inside the wrapper script while standing inside the directory where both files are located, everything is fine. However, I want to be able to run my wrapper.R script from some other directory, i.e. not the one where these scripts are located. If I run my wrapper from another directory, it doesn't work, and I get a "cannot open the file" error.

I googled and found lots of different threads, but this question seemed to be very clear. The way I understand it, the way I'm doing it should work. Clearly, I'm misunderstanding something. My reading of that thread leads me to believe that source() resolves paths relative to the directory containing the file that calls source(). My reading also leads me to believe that I should not be using chdir = TRUE, as I want to keep the advertised relative directory.

Seeing as it doesn't work... what am I misunderstanding? How can I source files in the same directory as my wrapper script when called from somewhere else?

Damron answered 15/3, 2017 at 16:45 Comment(4)

This should all just come down to the working directory. R needs to know where to look for the files. You can find your current working directory by typing getwd() and you can reset it with setwd(). But you could always just do something like source("c:\...") and that should work. – Swift 15/3, 2017 at 16:54
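To make the working-directory point concrete, here is a minimal sketch in base R. The directory names (wd_demo, wd_demo_elsewhere) and the add_one() function are made up for illustration. It shows that a bare relative path is looked up in the current working directory, while an explicit path works from anywhere.

```r
# Hypothetical layout: a temporary "project" directory holding functions.R,
# and a second, empty directory to stand in while calling source().
proj <- file.path(tempdir(), "wd_demo")
elsewhere <- file.path(tempdir(), "wd_demo_elsewhere")
dir.create(proj, showWarnings = FALSE)
dir.create(elsewhere, showWarnings = FALSE)
writeLines("add_one <- function(x) x + 1", file.path(proj, "functions.R"))

old_wd <- setwd(elsewhere)   # stand somewhere other than the project directory

# A bare relative path is resolved against getwd(), so it is not found here.
found_relative <- file.exists("functions.R")

# An explicit path works regardless of the working directory.
source(file.path(proj, "functions.R"))
result <- add_one(1)

setwd(old_wd)                # restore the original working directory
```

Calling source("functions.R") from the second directory would fail in the same way as in the question, because the file is searched for relative to getwd(), not relative to the script that contains the call.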

Sorry, I was being unclear. I can set the working directory, but what if I'm trying to distribute these scripts to a colleague? I won't know exactly where he put them. Is there a way to source the files without actually knowing the directory, and still call the wrapper function from some other directory? – Damron 15/3, 2017 at 16:58

I believe if you write a bat file to run the scripts it will automatically use the directory it is in as the working directory. – Swift 15/3, 2017 at 17:0

Linking related older question: stackoverflow.com/questions/7222107 – Lillalillard 14/9, 2023 at 21:8


If you are distributing a script to colleagues, you should really not be writing a script that sources other scripts. What if you want to rename or move functions.R in the future? What if you need to modify a function in functions.R, but wrapper.R relies on the older version of that function? It's a flimsy solution that will cause headaches. I would recommend either of the following instead.

1. Put everything needed into a single, self-contained script and distribute that.
2. If you really want to separate code into different files, write a package. Might sound like overkill, but packages can actually be very simple and lightweight. In the simplest form a package is just a directory with a DESCRIPTION and NAMESPACE file along with an R/ directory. Hadley breaks this down nicely: https://r-pkgs.org/whole-game.html.
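As a rough illustration of how small such a package directory can be, the sketch below builds one with base R file functions. The package name mypkg, the add_one() function and the DESCRIPTION field values are placeholders, not anything from the original project.

```r
# Build a minimal package skeleton in a temporary directory.
pkg <- file.path(tempdir(), "mypkg")   # 'mypkg' is a hypothetical name
dir.create(file.path(pkg, "R"), recursive = TRUE, showWarnings = FALSE)

# DESCRIPTION: package metadata (all values here are placeholders).
writeLines(c(
  "Package: mypkg",
  "Title: Helper Functions for the Wrapper Script",
  "Version: 0.0.1",
  "Author: Your Name",
  "Maintainer: Your Name <you@example.com>",
  "Description: Functions that used to live in functions.R.",
  "License: GPL-3"
), file.path(pkg, "DESCRIPTION"))

# NAMESPACE: which functions the package makes available to users.
writeLines("export(add_one)", file.path(pkg, "NAMESPACE"))

# R/: the code that previously lived in functions.R.
writeLines("add_one <- function(x) x + 1", file.path(pkg, "R", "functions.R"))

pkg_files <- list.files(pkg, recursive = TRUE)
```

A directory like this still has to be installed before library(mypkg) works, for example with R CMD INSTALL on the directory, or loaded during development with devtools::load_all().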

Malefactor answered 15/3, 2017 at 17:50 Comment(8)

Aaah, okay, that was a concept I didn't really envision... If I want to go the "package"-route, what if my project is a mixed-language project? If I have a small number of Python scripts in addition to the R ones? What would be a good way to distribute such a project? – Damron 15/3, 2017 at 20:40

If you want to go the package route, create a directory in your package called inst/python/ and put the python scripts in there. More info from Hadley: r-pkgs.had.co.nz/inst.html#inst-other-langs. – Malefactor 15/3, 2017 at 21:2

If you want to do a single R script followed by a single python script, feather is probably your best option. It is the best way to share data.frame-like objects between R and python (I'm assuming this is a data-pipeline script): blog.rstudio.org/2016/03/29/feather – Malefactor 15/3, 2017 at 21:4

@Sajber feel free to accept this answer if it helped :) – Malefactor 16/3, 2017 at 20:41

It did help, thanks! Related question: now that I've gone the package route (it was easier than I initially thought, at least the R parts), how would I distribute the wrapper script? The wrapper should only be run from the command line, so it's not really part of the package itself (if I understand package structure correctly), but calls all the functions in the package. Where should I put the file? How do I make it available together with the package, but make it clear that it should be run from the command line? – Damron 17/3, 2017 at 8:56

You should place it in exec/, and when you install the package, place a symlink in your $PATH that links to /path/to/your/installed/package/exec/your_script. Also make sure you learn how to use devtools, particularly load_all(). – Malefactor 17/3, 2017 at 13:42
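Once a package is installed, files shipped in exec/ or under inst/ can be located with base R's system.file(), which returns an empty string when nothing is found. The wrapper below, installed_file(), is a hypothetical helper, and mypkg, your_script and helper.py are placeholder names. The demonstration uses the stats package only because it is always installed.

```r
# Hypothetical helper: look up a file shipped inside an installed package
# and fail loudly instead of returning "" when it is missing.
installed_file <- function(..., package) {
  path <- system.file(..., package = package)
  if (!nzchar(path)) {
    stop("file not found in package '", package, "'")
  }
  path
}

# Demonstrated with the base package 'stats'.
desc_path <- installed_file("DESCRIPTION", package = "stats")

# For a package of your own, the calls might look like this
# (placeholder names; contents of inst/ are copied to the package root):
# installed_file("exec", "your_script", package = "mypkg")
# installed_file("python", "helper.py", package = "mypkg")  # from inst/python/
```

The path returned for the exec/ script is the one a symlink in $PATH would point to.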

"Put everything needed into a single, self-contained script and distribute that." Bad, bad, bad programming practice. Readability counts. – Waldron 16/1, 2018 at 14:36

"In the simplest form a package is just a directory..." this is oversimplified. The source for a package is a directory with certain files and subdirs, but to actually load the package into R, it has to be compiled first. This can be tricky for a beginner. – Spill 11/3, 2020 at 19:16


You can do this using the here package. It uses the "current working directory at the time when the package is loaded". In other words, the directory you start your R session from.

In your case the code would be:

source(here::here('functions.R'))

This will work even if the wrapper script wrapper.R is in a different directory in the project.

If functions.R is in a subdirectory of the project, just add it to the call to here(), to complete the relative path:

source(here::here('subdirectory', 'functions.R'))

Granulose answered 3/2, 2018 at 22:36 Comment(2)

I just discovered the here package and it is solving all sorts of file path issues for me. I encourage folks to try it out. – Faletti 6/5, 2021 at 16:36

"In other words, the directory you start your R session from." — Which is generally not the directory where the code resides. For interactive RStudio projects and exploratory notebooks it happens to be the case but for other types of project it isn’t; notably, for any kind of self-contained application (e.g. command line utilities). Those are almost always launched in a directory that is completely unrelated to the location of the code. here::here does not work in those cases. – Pandarus 3/4, 2023 at 7:37
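For the command-line case raised in the last comment, one commonly used base R workaround (not something proposed in the answers above) is to read the script's own path from the --file= argument that Rscript passes to R. The sketch below assumes the wrapper is started as Rscript path/to/wrapper.R, and script_dir() is a hypothetical helper name.

```r
# Return the directory of the script being run via `Rscript path/to/wrapper.R`,
# or NA when no --file= argument is present (e.g. in an interactive session).
script_dir <- function(args = commandArgs(trailingOnly = FALSE)) {
  file_arg <- grep("^--file=", args, value = TRUE)
  if (length(file_arg) == 0) {
    return(NA_character_)
  }
  script_path <- sub("^--file=", "", file_arg[1])
  dirname(normalizePath(script_path, mustWork = FALSE))
}
```

Inside wrapper.R this could then be used as source(file.path(script_dir(), "functions.R")), which does not depend on the directory the script was launched from. It will not help when the file is run by other means, such as source() from an interactive session.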
