In Stata the lookfor command offers a quick way to search for variables in a dataset (it searches both the variable names and the labels). So lookfor education quickly finds variables related to education. Is there an equivalent shortcut function in R?
You can simply grep the data.frame for the information you need. That gives you much more than a list of variable names that match, because the values in each column are searched too. You can also use regular expressions, which widens your search capabilities. Here is an example of a function that does what you want (it works with a data.frame only):
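lookfor <- function(pattern, data, ...) {
  # Match positions within each column's values; ... is forwarded to grep
  l <- lapply(data, function(x) grep(pattern, x, ...))
  # Flag the columns whose names match
  res <- rep(FALSE, ncol(data))
  res[grep(pattern, names(data), ...)] <- TRUE
  # Keep a column if its values or its name matched
  res <- sapply(l, length) > 0 | res
  names(res) <- names(data)
  names(res)[res]
}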
First I grep each column, then I grep the column names. For each column I keep only whether grep matched anything. Through ... you can pass any additional arguments to grep. If you omit them, grep runs with its defaults, that is, case-sensitive regular-expression matching.
Here is an example:
> dt<- data.frame(y=1:10,x=letters[1:10],a=rnorm(10))
> lookfor("a",dt)
[1] "x" "a"
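To illustrate how extra arguments travel through ... to grep, here is a small sketch. It uses a compact restatement of the function (built on grepl rather than grep) so the block runs on its own; the survey data frame and its columns are made up for the example.

```r
# Compact restatement of the lookfor() idea above, so this block is self-contained.
lookfor <- function(pattern, data, ...) {
  # TRUE for each column whose values contain at least one match
  in_values <- vapply(data, function(x) any(grepl(pattern, x, ...)), logical(1))
  # TRUE for each column whose name matches
  in_names <- grepl(pattern, names(data), ...)
  names(data)[in_values | in_names]
}

# Hypothetical data, invented for illustration only
survey <- data.frame(
  Education = c("primary", "secondary", "tertiary"),
  income    = c(100.5, 200, 300),
  notes     = c("left EDUCATION early", "none", "none")
)

lookfor("education", survey)                      # case-sensitive by default
lookfor("education", survey, ignore.case = TRUE)  # ignore.case is passed on to grepl
lookfor(".", survey, fixed = TRUE)                # treat "." as a literal dot
```

Without fixed = TRUE the last call would treat "." as "any character" and return every column, which is worth keeping in mind when the search term contains regex metacharacters.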
How about this as a one-liner, which I run at the start of a session:
lkf <- function(d,p) names(d)[grep(p,names(d))]
where d is your data.frame and p is the pattern. So:
d <- data.frame(a=letters[1:10],b=1:10,c=month.name[1:10])
lkf(d,'c')
# [1] "c"
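The one-liner searches names only, whereas Stata's lookfor also searches variable labels. Base R data frames have no built-in labels, so the sketch below assumes each label is stored as a "label" attribute on its column (some import packages use this convention, but check your own data). The function name lookfor_labels and the example data are hypothetical.

```r
# Search column names and, where present, a "label" attribute on each column.
# ASSUMPTION: labels live in attr(column, "label"); adjust if yours are stored elsewhere.
lookfor_labels <- function(data, pattern, ...) {
  labels <- vapply(
    data,
    function(x) {
      lab <- attr(x, "label", exact = TRUE)
      if (is.null(lab)) "" else as.character(lab)[1]
    },
    character(1)
  )
  # A column is a hit if either its name or its label matches
  hit <- grepl(pattern, names(data), ...) | grepl(pattern, labels, ...)
  data.frame(
    variable = names(data)[hit],
    label = unname(labels[hit]),
    stringsAsFactors = FALSE
  )
}

# Hypothetical data with hand-attached labels
d <- data.frame(educ = c(12, 16, 9), inc = c(30, 55, 21))
attr(d$educ, "label") <- "Years of education"
attr(d$inc, "label") <- "Annual income (thousands)"

lookfor_labels(d, "education")                   # found via the label, not the name
lookfor_labels(d, "INCOME", ignore.case = TRUE)  # extra arguments go to grepl
```

Returning the label next to the variable name mirrors what Stata prints, which makes it easier to see why a column matched.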
And here's a version that doesn't require you to quote the search term:
lookfor <- function(string_to_find, data) {
  # Extract the arguments and force conversion to string
  pars <- as.list(match.call()[-1])
  data.name <- as.character(pars$data)
  var <- as.character(pars$string_to_find)
  # Regular expression search through names
  result <- names(data)[grep(var, names(data))]
  if (length(result) == 0) {
    warning(paste(var, "not found in", data.name))
    return(NULL)
  } else {
    return(result)
  }
}
If you just need to search through the list of variables to find the one you are looking for, you can use the code completion feature in RStudio (v0.99 onwards). Simply start typing and you will get a list of possible matches. So in your case type education$ and a list of the variables contained in the data frame will appear. Scroll through these and select the one you want.
which() command with the names() command for this if you're working with a data frame, or colnames() if you're working with a matrix – Comanche
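One possible reading of this comment, as a minimal sketch: wrap a pattern test on the names in which() to get the matching column positions. Using grepl() as the test is an assumption, since the comment does not say how the match is made, and the data below are invented.

```r
# Data frame: which() over names() gives the positions of matching columns
df <- data.frame(educ_years = 1:3, income = 4:6, educ_level = c("a", "b", "c"))
idx <- which(grepl("educ", names(df)))
idx             # column positions
names(df)[idx]  # column names

# Matrix: same idea, but the column names come from colnames()
m <- matrix(1:6, nrow = 2,
            dimnames = list(NULL, c("educ_years", "income", "educ_level")))
colnames(m)[which(grepl("educ", colnames(m)))]
```

The positions in idx can be used directly for subsetting, for example df[, idx].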