Include data examples in developing R packages

I am eager to learn how to incorporate data examples as comments written above the functions, such as:

##' @examples 
##' ## Set working directory...
##' ## Load data into R session:
##' data <- system.file("extdata", "data.txt", package="...", sep="\t", header=TRUE, stringsAsFactors = FALSE)
##'
##' ## For reproducible results:
##' set.seed(999)

I put my "data.txt" file in the directory /pkg_Name/inst/extdata/. However, R CMD check reports an error at this step. If I proceed to R CMD build and R CMD INSTALL, then after loading the package I cannot get the data into the R session. Could anyone tell me what went wrong? Is this the correct way to include data examples at the end of a function's help document?

Thanks a lot!

Peccant answered 12/9, 2012 at 15:5 Comment(0)

Please look at CRAN packages that include data and copy their approach. I just added data to an at-work-only package a few weeks back and it just works...

For what it is worth, the manual ("Writing R Extensions") has a section, 1.1.5 "Data in packages", which explains it.
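
For the raw-file approach in the question (a file under inst/extdata/), here is a minimal sketch of how the call could be split. system.file() only returns a path, so the reading arguments (sep, header, stringsAsFactors) belong to read.table(). The package name "yourpkg" is a hypothetical placeholder, not the asker's actual package.

```r
# Hypothetical package name; replace with the name of your own package.
pkg <- "yourpkg"

# system.file() only locates the file; it returns "" when nothing is found.
path <- system.file("extdata", "data.txt", package = pkg)

if (nzchar(path)) {
  # The reading options go to read.table(), not to system.file().
  dat <- read.table(path, sep = "\t", header = TRUE, stringsAsFactors = FALSE)
  str(dat)
} else {
  message("File not found: is the package installed, with the file in inst/extdata/?")
}
```

Files in inst/extdata/ are copied to extdata/ in the installed package, which is why the path passed to system.file() omits "inst".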

Dropline answered 12/9, 2012 at 15:16 Comment(0)

Hadley Wickham has a chapter in his book "R Packages" on how to incorporate data into an R Package.

Dirk points to the official documentation on data in packages.

Alternatively, the ggplot2 package shows one way to incorporate data using rda files and roxygen.

The data directory of the ggplot2 package stores each dataset in a separate rda file (e.g., generated using save(foo, file='foo.rda')).
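
As a rough sketch of that step (not ggplot2's own build script), the following saves one object to data/foo.rda and loads it back to check the contents. The object foo and the package root "mypkg" are hypothetical, and a temporary directory stands in for a real package source tree.

```r
# Hypothetical example object; any R object can be stored this way.
foo <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))

# In a real package the target would be <package root>/data/foo.rda;
# a temporary directory stands in for the package root here.
pkg_root <- file.path(tempdir(), "mypkg")
dir.create(file.path(pkg_root, "data"), recursive = TRUE, showWarnings = FALSE)
rda_path <- file.path(pkg_root, "data", "foo.rda")
save(foo, file = rda_path)

# Load into a fresh environment to confirm which objects the file holds.
env <- new.env()
loaded <- load(rda_path, envir = env)
print(loaded)
```

Keeping one object per file, with the file named after the object, matches the layout described above.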

See the file data.r for the roxygen comments that generate the Rd help files for the data, e.g.:

#' Prices of 50,000 round cut diamonds
#'
#' A dataset containing the prices and other attributes of almost 54,000
#'  diamonds. The variables are as follows:
#'
#' @format A data frame with 53940 rows and 10 variables:
#' \itemize{
#'   \item price: price in US dollars (\$326--\$18,823)
#'   \item carat: weight of the diamond (0.2--5.01)
#'   \item cut: quality of the cut (Fair, Good, Very Good, Premium, Ideal)
#'   \item color: diamond colour, from J (worst) to D (best)
#'   \item clarity: a measurement of how clear the diamond is
#'      (I1 (worst), SI1, SI2, VS1, VS2, VVS1, VVS2, IF (best))
#'   \item x: length in mm (0--10.74)
#'   \item y: width in mm (0--58.9)
#'   \item z: depth in mm (0--31.8)
#'   \item depth: total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43--79)
#'   \item table: width of top of diamond relative to widest point (43--95)
#' }
"diamonds"
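
To connect this back to the question, here is a hedged sketch of an @examples block that uses a dataset stored in data/. The function col_mean is hypothetical, and mtcars (from base R's datasets package) stands in for your own dataset. This assumes LazyData: true in DESCRIPTION, so that the dataset is available by name once the package is attached; otherwise, call data() in the example first.

```r
#' Mean of a numeric column
#'
#' @param df A data frame.
#' @param col Name of a numeric column in `df`.
#' @return A single numeric value.
#' @examples
#' ## mtcars stands in for a dataset shipped in your package's data/ directory.
#' col_mean(mtcars, "mpg")
col_mean <- function(df, col) {
  mean(df[[col]])
}

# Running the example line directly:
col_mean(mtcars, "mpg")
```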
Myrmecophagous answered 7/2, 2014 at 8:23 Comment(2)
Note that Hadley's book on package development now also contains a chapter on the various ways of adding data to a package: r-pkgs.had.co.nz/data.html - Contrived
Dataset descriptions have moved to data.R. - Indeterminate
x <- sample(1000)
usethis::use_data(x, mtcars)

http://r-pkgs.had.co.nz/data.html

Incipit answered 2/8, 2017 at 5:21 Comment(4)
Thank you for the code snippet, which might provide some limited, immediate help. A proper explanation would greatly improve its long-term value by describing why this is a good solution to the problem, and would make it more useful to future readers with other similar questions. Please edit your answer to add some explanation, including the assumptions you've made. - Diarist
Should be usethis::use_data(x, mtcars), as the one shown is deprecated. - Gurtner
@Incipit Do you want to update devtools::use_data(x, mtcars) to usethis::use_data(x, mtcars)? - Stoneblind
It's done! Edited! - Incipit
