R equivalent of Stata local or global macros

I am a Stata user trying to learn R.

I have a couple of lengthy folder paths which, in my Stata code, I stored as local macros. I have multiple files in both those folders to use in my analysis.

I know that in R I can change the working directory each time I want to refer to a file in one of the folders, but that is not a good way to do it. Even if I store the folder paths as strings in R, I can't figure out how to refer to them. For example, in Stata I would use `folder1'.

I am wondering if trying to re-write Stata code line by line in R is not the best way to learn R.

Can someone please help?

Appomattox answered 28/3, 2013 at 20:19 Comment(5)
I think you're looking for list.files(.). Look here. Also check ?list.files for all possible options.Hoff
"Even if I store the folder paths as strings in R, I can't figure out how to refer to those (like using `folder1' in Stata)." Can you give a concrete example of this problem, with code?Adroit
@Adroit folder1 is the name of the variable. Surrounding it with backtick/tick resolves the name and returns the value. Thinking about Stata is going to give me nightmares...Mayotte
@Adroit folder1 is the name of the local. An example is local folder1 "Z:/Project/Data/Raw". Suppose this folder Raw has a bunch of datasets I need to use. Each time I want to load a dataset, I don't want to repeat "Z:/Project/Data/Raw". Instead, in Stata I stored it as a local and say use "`folder1'/file1.dta".Appomattox
I think the short answer is that there is no one-to-one equivalent of Stata's local macros in R, so you need to learn how to do things differently, and in fact more directly.Clemenciaclemency
A
7

First, as a former Stata user, let me recommend R for Stata Users. There is also this article on Macros in R. I think @Nick Cox is right that you need to learn to do things differently. But like you (at least in this case), I often find myself starting a new task with my prior knowledge of how to do it in Stata and going from there. Sometimes the approaches are similar. Sometimes I can make R act like Stata even when a different approach would be better (e.g., loops vs. vectorization).

I'm not sure if I will capture your question with the following, but let me try.

In Stata, it would be common to write:

global mydata "path to my data directory/"

To import the data, I would just type:

insheet using "${mydata}myfile.csv"

As a former Stata user, I want to do something similar in R. Here is what I do:

mydata <- "path to my data directory/"

To import a csv file located in this directory and create a data frame called myfile, I would use:

myfile <- read.csv(paste(mydata, "myfile.csv", sep=""))

or, more concisely:

myfile <- read.csv(paste0(mydata, "myfile.csv"))

I'm not a very efficient R user yet, so maybe others will see some flaws in this approach.
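
One possible weak spot in the approach above is that paste0() relies on the stored path ending in a slash. A minimal sketch of the alternative, assuming a made-up folder and file created under tempdir() so it can run anywhere (the names my_data_dir, x and y are hypothetical, not from the question): file.path() inserts the separator itself, so the stored path does not need a trailing slash.

```r
# Hypothetical folder, created in a temporary location so the sketch is runnable.
mydata <- file.path(tempdir(), "my_data_dir")   # note: no trailing slash
dir.create(mydata, showWarnings = FALSE)

# Write a small example file so there is something to read back.
write.csv(data.frame(x = 1:3, y = c("a", "b", "c")),
          file.path(mydata, "myfile.csv"), row.names = FALSE)

# With paste0() the separator has to be supplied by hand;
# file.path() adds "/" between its arguments.
path_paste <- paste0(mydata, "/", "myfile.csv")
path_fp    <- file.path(mydata, "myfile.csv")

# Both routes give the same string, so either can be passed to read.csv().
same_path <- identical(path_paste, path_fp)
myfile <- read.csv(path_fp)
```

Either form works; file.path() just removes the need to remember whether the stored folder string ends in a slash.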

Azerbaijani answered 28/3, 2013 at 23:6 Comment(3)
Stata calls retrieval of a named string (i.e. character vector) a 'macro'?Yepez
It is an example of a use of a global macro. See here. There are many more interesting uses.Azerbaijani
Ah. My impression is that R uses list structures more than Stata. There is a function called modifyList, which I have mainly seen used in lattice graphics, that might allow similar uses. There are also expressions and the substitute function in the language manipulation domain that might be needed to get something like that functionality. It appears that Stata presumes it will get ordered text arguments without as many separators, while R has a greater degree of separation of character vectors from actual language elements.Yepez
C
7

Maybe you want file.path()?

a <- "c:"
b <- "users"
c <- "charles"
d <- "desktop"

setwd(file.path(a,b,c,d))
getwd()
#----
# [1] "c:/users/charles/desktop"

You can wrap source or read.XXX or whatever else around that to do what you want.
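
As a rough sketch of that wrapping, using the folder from the question's comments: the path is stored once in an ordinary variable, and a small helper (in_folder1 is a hypothetical name, as is the subdir/file2.csv example) builds full paths from it. This is only one way to get close to the Stata local; nothing is read from disk here because the files are assumed, not real.

```r
# Folder path as given in the question's comments; adjust to your own machine.
folder1 <- "Z:/Project/Data/Raw"

# Hypothetical helper: joins folder1 with whatever path pieces are passed in.
in_folder1 <- function(...) file.path(folder1, ...)

p1 <- in_folder1("file1.dta")
p2 <- in_folder1("subdir", "file2.csv")

# Reading a file would then look like this (commented out, the file is hypothetical):
# dat <- read.csv(in_folder1("subdir", "file2.csv"))
```

Here p1 plays the role of "`folder1'/file1.dta" in Stata: the folder is written once and reused by name.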

Candelabrum answered 28/3, 2013 at 20:35 Comment(5)
@Appomattox I'm glad you got an answer that solved your problem! It helps improve the quality of the site if you indicate this by clicking the check mark by the answer that solved your problem. (You are never under any obligation to do so, but it helps signal to others which answer actually solved your problem.)Adroit
@Adroit Sorry about that. I joined Stack Overflow just a couple of days ago. I have used it on and off when it popped up in my search results while working in Stata, but I never had an account until now. I still don't have enough reputation to upvote or downvote anything. I will remember to revisit these answers to upvote when I get the reputation needed.Appomattox
@Appomattox You don't need any rep to click on the check mark to indicate which answer solved your problem. In fact, doing so will earn you some rep!Adroit
@Adroit I checked the most appropriate answer. You are right. It improved my reputation, although I still don't have enough to vote up or down anything.Appomattox
@Appomattox Now you do! :)Adroit
Y
1

I'm guessing from context that the term "local" when applied to files means that they have been loaded into memory for efficiency purposes? If so, then you need to realize that pretty much all ordinary R objects are handled that way. See ?read.table and ?load. The only way data can remain non-local is to have it reside in a database that has an interface package that supports SQL queries or use specialized packages such as ff or bycol.

Other than that and Chase's idea to use file.path(), any reference to files or connections is done using the proper read/load/scan functions to which character values are given as (variously named) arguments. You can see a variety of low-level functions with ?file and perhaps following some of the additional links from that help page. You could store one or more results of a file.path construction in a character vector which could be named for easy reference.

pathvecs <- c(User= "~/", hrtg="~/Documents/Heritage/")
pathvecs
#                   User                    hrtg 
#                   "~/" "~/Documents/Heritage/" 
pathvecs["hrtg"]
#                   hrtg 
#"~/Documents/Heritage/" 
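
A short sketch of how such a named vector might then be used to build file names. The vector paths, its entries raw and out, and the file names are illustrative assumptions (the raw folder is borrowed from the question's comments). Double brackets extract the bare string without its name, and file.path() is vectorized over the file names.

```r
# Hypothetical named vector of project folders (no trailing slashes).
paths <- c(raw = "Z:/Project/Data/Raw", out = "Z:/Project/Output")

# [[ ]] returns the element itself, without the name attribute that [ ] keeps.
raw_dir <- paths[["raw"]]

# file.path() recycles the folder across several file names at once.
raw_files <- file.path(raw_dir, c("file1.dta", "file2.dta"))
```

With this layout, every folder lives in one object at the top of the script and each file reference stays short.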
Yepez answered 28/3, 2013 at 20:44 Comment(2)
By local, I mean the macro local in Stata. I am sorry but I guess I wasn't clear with my question. I know how to load the data. I am figuring out how to avoid repeating lengthy file paths by storing them as a small "local" and using the local-name instead.Appomattox
That does not help me understand. I cannot figure out why a named character vector is not an effective way to document and store paths.Yepez
