I am currently writing a function for an R package. Part of what this function should do is (a) take data as an input and (b) check one of its columns against a list of acceptable values.
These acceptable values are provided by another organization in a .csv file. I would like to load this .csv file and use it as a reference to check whether the user's column contains only valid values.
For example, let's say the user has these data:
set.seed(1839)
user <- data.frame(x=sample(letters,10),
                   y=rnorm(10))
user
   x          y
1  v -0.7025836
2  p -1.4586245
3  f  0.1987113
4  y  1.0544690
5  o -0.7112214
6  m  0.2956671
7  b  0.3016737
8  a -0.0945271
9  x -0.2790357
10 c  0.1681388
And the .csv contains many (useful) columns, but I only care about one (z) for the moment:
ref <- data.frame(z=letters[1:4], a=rnorm(4), b=(rnorm(4)))
ref
  z          a          b
1 a -0.3563105  1.4536406
2 b  1.6841862  1.3232985
3 c  1.3073516 -0.6978598
4 d  0.4352904 -0.3971175
The code I would like to run is (note: I am not calling library in the actual function, I am just doing it here for simplicity's sake):
library(dplyr)
valid_values <- ref %>%
  select(z) %>%
  unname() %>%
  unlist() %>%
  as.character()
summary <- user %>%
  mutate(x_valid=ifelse(x %in% valid_values, TRUE, FALSE))
summary tells me which values of x in user are valid:
   x          y x_valid
1  v -0.7025836   FALSE
2  p -1.4586245   FALSE
3  f  0.1987113   FALSE
4  y  1.0544690   FALSE
5  o -0.7112214   FALSE
6  m  0.2956671   FALSE
7  b  0.3016737    TRUE
8  a -0.0945271    TRUE
9  x -0.2790357   FALSE
10 c  0.1681388    TRUE
Now, what do I replace ref with in my function code? Where should I store this data in my package? How do I load it? And what type of file should I convert it to?
The function should look something like:
x_check <- function(data) {
  # get valid values
  valid_values <- ??? %>%
    select(z) %>%
    unname() %>%
    unlist() %>%
    as.character()
  # compare against valid values
  return(
    data %>%
      mutate(x_valid=ifelse(x %in% valid_values, TRUE, FALSE))
  )
}
What do I replace the ??? with to get my data? I do not care much whether or not the user is able to see this ref data I wish to load in.
I am using devtools::load_all("directory/for/my/package") to test my package. Relevant session information:
R version 3.4.0 (2017-04-21)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server 7.3 (Maipo)
other attached packages:
[1] roxygen2_6.0.1 devtools_1.13.2
[...] data/ folder, you load it using data() (if it's not lazy loaded). And you can use devtools::use_data() to set that up for you. – Manille
[...] data/ folder and tried to use devtools::use_data(admit_source.RData), where admit_source is the name of the file, but I received the error: Error: Could not find package root. – Ogle
[...] DESCRIPTION file has also specified LazyData: true – Ogle
[...] ?use_data - you should give use_data an R object, it will take care of creating the RData file. And if you have errors like that, maybe your working directory isn't set to the package folder? It seems like your question would be "why isn't use_data working? How can I avoid this error?" All the stuff about your function seems unrelated. – Manille
[...] devtools::use_data; I just want to figure out a way to access that data when someone runs the function. I may be just confused, but it seems like Hadley specifically says to give it an .RData file generated using save(). I wasn't sure if use_data is what I wanted anyways, because the documentation asks for an existing object, which corresponds to why his example involves creating an object x <- c(1:10). If use_data takes an existing object, how do I actually put the file into an R object? That's what I want, anyways. – Ogle
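For what it's worth, the route the comments describe might look roughly like the sketch below: read the .csv into an ordinary R object first, then hand that object (not a file name) to use_data. This is only an illustration under some assumptions. The temporary csv_path file and its contents stand in for the real .csv, the reference argument of x_check is a hypothetical addition so the function can be tried outside a package, and the use_data call is commented out because it has to be run once from the package root. With internal = TRUE, use_data writes the object to R/sysdata.rda, which makes ref visible to the package's own functions by name without exporting it to users.

```r
# Hypothetical stand-in for the .csv supplied by the other organization.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(z = letters[1:4], a = 1:4, b = 5:8),
          csv_path, row.names = FALSE)

# Step 1: read the file into an ordinary R object.
ref <- read.csv(csv_path, stringsAsFactors = FALSE)

# Step 2: run once, with the working directory set to the package root
# (not inside the function). Commented out here so the sketch runs anywhere.
# devtools::use_data(ref, internal = TRUE)  # saves ref to R/sysdata.rda
# In later releases the same function is available as usethis::use_data().

# Step 3: inside the package, ref can then be referred to by name.
# The `reference` argument is an assumption made for this sketch only.
x_check <- function(data, reference = ref) {
  # %in% already returns a logical vector, so no ifelse() is needed
  data$x_valid <- data$x %in% as.character(reference$z)
  data
}

user <- data.frame(x = c("v", "b", "a", "c"), y = 1:4)
x_check(user)
```

If the reference data should instead be visible to users, the same object could be saved without internal = TRUE, in which case it goes into the data/ folder mentioned in the first comment.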