How to compress saves in R package build

I'm trying to include a (somewhat) large dataset in an R package. During the check in RStudio I keep getting a warning saying that I could save space with better compression:

* checking data for ASCII and uncompressed saves ... WARNING

  Note: significantly better compression could be obtained
        by using R CMD build --resave-data
          old_size new_size compress
  slp.rda    499Kb    310Kb    bzip2
  sst.rda    1.3Mb    977Kb       xz

I've tried adding --resave-data to RStudio's "Configure Build Tools", to no effect.


Tremblay answered 16/9, 2015 at 10:8 Comment(0)
A
10

The devtools function use_data takes a parameter for the type of compression and makes adding data to packages much easier in general. Whether you use it or call save yourself, use xz compression when you save your data (for save, that is the compress parameter, e.g. compress = "xz"; compression_level only sets the level).
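As a minimal sketch of the save step, the snippet below writes the same object with the default gzip compression and with xz, then compares the file sizes. The object sst here is a made-up stand-in for the real dataset and the files go to a temporary directory; in a package they would live in data/. The commented-out usethis call assumes you run it from the package root.

```r
# Hypothetical stand-in for a package dataset; replace with your own object.
set.seed(1)
sst <- matrix(round(rnorm(50000), 2), ncol = 50)

# Temporary location for the demo; in a package this would be data/.
out_dir <- tempdir()
gz_file <- file.path(out_dir, "sst_gzip.rda")
xz_file <- file.path(out_dir, "sst_xz.rda")

save(sst, file = gz_file)                   # default compression (gzip)
save(sst, file = xz_file, compress = "xz")  # compression type set via 'compress'

# Compare the resulting sizes in bytes (gzip first, then xz).
file.size(c(gz_file, xz_file))

# Equivalent with usethis, run from the package root:
# usethis::use_data(sst, compress = "xz", overwrite = TRUE)
```

Which method wins depends on the data, so it is worth comparing sizes rather than assuming xz is always smallest.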

If you want to use --resave-data, pass --resave-data=best explicitly (R CMD build --resave-data=best). When the option is not supplied, R CMD build falls back to gzip, which gains you pretty much nothing in this case.

See Building package tarballs for more information.

Arieariel answered 16/9, 2015 at 10:32 Comment(3)
Thanks for your answer - I have tried save with compression. The compression warning is now gone, but now I get a different one: Warning: package needs dependence on R (>= 2.10). Any experience with that? - Tremblay
That's due to the extra compression. Add Depends: R (>= 2.10) to your DESCRIPTION file. - Arieariel
Thanks! In the meantime, the use_data function has been moved from devtools to usethis. - Uncle
A
12

Another alternative, if you have a large dataset that you don't want to re-create, is to use tools::resaveRdaFiles from within R. Point it at the dataset file, or the entire data directory, and it will compress your data in a format of your choosing. See its manual page for more information.
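A rough sketch of that approach, assuming a throwaway data directory under tempdir() and a made-up slp object in place of the real files: tools::checkRdaFiles reports the current size and compression of each file, and tools::resaveRdaFiles rewrites them in place. With compress = "auto" it picks a method for you; you can also pass "xz" or "bzip2" directly.

```r
# Hypothetical demo setup: an uncompressed .rda in a temporary data directory.
set.seed(1)
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)
slp <- data.frame(
  value = round(rnorm(20000), 1),
  group = rep(c("a", "b"), 10000)
)
save(slp, file = file.path(data_dir, "slp.rda"), compress = FALSE)

# Inspect size and compression before resaving.
before <- tools::checkRdaFiles(data_dir)

# Resave every .rda/.RData file in the directory, in place.
tools::resaveRdaFiles(data_dir, compress = "auto")

# Inspect again to see what changed.
after <- tools::checkRdaFiles(data_dir)
print(before[, c("size", "compress")])
print(after[, c("size", "compress")])
```

In a real package you would point both calls at the package's data directory (or at a single file) instead of the temporary one.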

Amortize answered 2/11, 2017 at 12:1 Comment(0)
