What type of object is an R package?

4

22

Probably a pretty basic question, but a friend and I tried to run str(package_name) and R threw an error. Now that I'm looking at it, I'm wondering whether an R package is like a .zip file, in that it is a collection of objects (say, pictures and songs) but not a picture or song itself.

If I tried to open a zip of pictures with an image viewer, it wouldn't know what to do until I unzipped it - just like I can't call str(forecast) but I can call str(ts) once I've loaded the forecast package into my library...

Can anyone set me straight?

Skvorak answered 13/1, 2015 at 16:12 Comment(10)
You might be more impressed with ls.str("package:packageName") - Sinnard
A package is just a bundle of R functions (with documentation) glued together and organized by DESCRIPTION and NAMESPACE files. A package itself is not an R object. - Dipody
Well, sometimes more than strictly functions. Sometimes there are also data sets and other non-function objects necessary to make the package run. - Sinnard
@Dipody so if it's not an object... what is it? - Skvorak
@RichardScriven Sure, but let's cover the most basic case first. - Dipody
@RichardScriven nope. library(fpp) followed by ls.str(fpp) gives: Error in ls.str(fpp) : object 'fpp' not found - Skvorak
@Canuckish - you have to type it as I did, ls.str("package:fpp"). The function ls.str needs to know that you want to view the package contents. - Sinnard
@Canuckish I'm not sure there really is an object type for packages, but along the lines of @RichardScriven's comment, I would guess it most closely resembles an environment, at least in the sense that you can call things like ls(name="package:ggplot2") or ls.str(name="package:ggplot2"). - Muth
You might find r-pkgs.had.co.nz/package.html helpful. - Corvine
Awesome - thanks @hadley! (Someone also referenced this excellent resource, below.) - Skvorak
22

R packages are generally distributed as compressed bundles of files. They can be in "binary" form, which has been preprocessed at a repository to compile any C or Fortran source and create the proper headers, or in source form, where the required files are available for the installation process. Installing from source requires that users have the necessary compilers and tools installed where the R build process can find them.

If you read the documentation for a package at CRAN, you will see that it is distributed in a set of compressed formats that vary with the target OS:

Package source:     Rcpp_0.11.3.tar.gz  # the Linux/UNIX targets
Windows binaries:   r-devel: Rcpp_0.11.3.zip, r-release: Rcpp_0.11.3.zip, r-oldrel: Rcpp_0.11.3.zip
OS X Snow Leopard binaries:     r-release: Rcpp_0.11.3.tgz, r-oldrel: Rcpp_0.11.3.tgz
OS X Mavericks binaries:    r-release: Rcpp_0.11.3.tgz
Old sources:    Rcpp archive   # not really a file but a web link

Once installed, an R package has a specified directory structure. The DESCRIPTION file is a text file with specific entries for components that determine whether the local installation meets the dependencies of the package. There are NAMESPACE, LICENSE, and INDEX files. There are directories named '/help', '/html', '/Meta', '/R', and possibly '/libs', '/demo', '/data', '/unitTests', and others.

This is the tree at the top of the ../library/Rcpp package directory:

$ ls
CITATION    NAMESPACE   THANKS      examples    libs
DESCRIPTION NEWS.Rd     announce    help        prompt
INDEX       R       discovery   html        skeleton
Meta        README      doc     include     unitTests
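
The same directory can be inspected from inside R. This is only a minimal sketch: it uses the stats package because it ships with every R installation (my choice, not part of the listing above), and the exact files you see will differ by package and platform.

```r
# Find where an installed package lives on disk.
pkg_dir <- find.package("stats")

# List the top-level files and directories of the installed package.
pkg_files <- list.files(pkg_dir)
print(pkg_files)

# Read the DESCRIPTION file as a list-like object.
desc <- packageDescription("stats")
print(desc$Package)

# The libraries (directories of installed packages) that R searches.
print(.libPaths())
```

Note that what you get back here are character vectors and a description record, not a "package object": the package is still just files in a library directory.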

So in the "life-cycle" of a package, there is initially a series of required and optional files, which get processed by the BUILD and CHECK mechanisms into an installed package, which can then be compressed for distribution and later unpacked into a specified directory tree on the user's machine. See these help pages:

?.libPaths  # also describes .Library
?package.skeleton
?install.packages
?INSTALL

And of course read Writing R Extensions, a document that ships with every installation of R.

Quoits answered 13/1, 2015 at 16:34 Comment(0)
19

Your question is:

What type of object is an R package?

Somehow, I’m still missing an answer to this exact question. So here goes:

As far as R is concerned, an R package is not an object. That is, it’s not an object in R’s type system. R is being a bit difficult, because it allows you to write

library(pkg_name)

without requiring you to define pkg_name anywhere beforehand. In contrast, other objects which you are using in R have to be defined somewhere – either by you, or by some package that’s loaded either explicitly or implicitly.

This is unfortunate, and confuses people. Therefore, when you see library(pkg_name), think

library('pkg_name')

That is, imagine the package name in quotes. This does in fact work just as expected. The fact that the code also works without quotes is a peculiarity of the library function, known as non-standard evaluation. In this case, it’s mostly an unfortunate design decision (but there are reasons).
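
To make that concrete, here is a small sketch of the three ways of naming a package in a library call. The variable name pkg is my own, and stats is used only because it is always installed.

```r
# Unquoted: library() captures the name itself instead of evaluating it.
library(stats)

# Quoted: the same call with an ordinary character string.
library("stats")

# If the package name is stored in a variable, library() must be told
# to evaluate the argument; otherwise it would look for a package
# literally called "pkg".
pkg <- "stats"
library(pkg, character.only = TRUE)

# After any of these, the package appears on the search path.
print("package:stats" %in% search())
```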

So, to repeat the answer: a package isn’t a type of R object [1]. For R, it’s simply a name which refers to a known location in the file system, similar to what you’ve assumed. BondedDust’s answer goes into detail to explain that structure, so I shan’t repeat it here.

[1] For super technical details, see Joshua’s and Richard’s comments below.
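
A rough way to see this from the console (a sketch only; some_pkg_name_not_defined is a made-up name standing in for any package name you have not assigned to a variable):

```r
library(stats)

# A bare package-like name is just an undefined symbol, so evaluating
# it fails. Here the error is caught and its message is kept.
res <- tryCatch(
  str(some_pkg_name_not_defined),
  error = function(e) conditionMessage(e)
)
print(res)

# What R does create for an attached package is an environment on the
# search path, reachable by the string "package:<name>".
env <- as.environment("package:stats")
print(is.environment(env))

# The namespace of a loaded package is an environment as well.
ns <- asNamespace("stats")
print(is.environment(ns))
```

So the closest thing to a "type" you can get hold of is an environment that holds the package's objects, which matches the guess made in the comments on the question.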

Cardoon answered 13/1, 2015 at 16:49 Comment(6)
Tsk, tsk... s/lib_name/pkg_name. :) The only thing I might add is that pkg_name is an object (an unbound symbol in the pairlist containing the function arguments)... though that might be too technical. - Faun
I think this is a damn fine answer, and I'm stuck between checking it, or @BondedDust's above. As his was the first I checked, I'm going to put it back there. But I do really like this response. Much thanks. - Skvorak
Just a note, the TypeTable structure found in src/main/util.c shows all the base types. Package is not one of them. Thought that might be useful to someone. :) - Sinnard
Agree the answer is useful. It is written from the perspective of someone viewing the world from an R console and interpreting the errors one gets. Mine was written from the perspective of someone using R in one of the three target OSes. - Quoits
@Joshua To be honest, I’m unhappy with that aspect of my answer, in particular since library isn’t the only relevant place where you may encounter this (think pkg::obj). I’m still mulling over whether to update my answer, or whether this would be more confusing than helpful. - Cardoon
Yeah, and pkg::obj is even worse because it's less obvious that :: is a function call. - Faun
5

From R's own documentation:

Packages provide a mechanism for loading optional code, data and documentation as needed. … A package is a directory of files which extend R, a source package (the master files of a package), or a tarball containing the files of a source package, or an installed package, the result of running R CMD INSTALL on a source package. On some platforms (notably OS X and Windows) there are also binary packages, a zip file or tarball containing the files of an installed package which can be unpacked rather than installing from sources. A package is not a library.

So yes, a package is not the functions within it; it is a mechanism that lets R use the functions or data that make up the package. Thus, it needs to be loaded first.

Zurn answered 13/1, 2015 at 16:30 Comment(3)
Very useful. But as @Dipody noted, it's not an object - so is it simply a directory? - Skvorak
@Canuckish, no, the directory in which the package lives is called the library. Which is confusing, as packages are loaded using the library(foo) function call. I've fixed the hyperlink in the above answer to point to the proper manual page. - Zurn
Really was trying to get at the "Type" of a package - it's interesting to me that even though "everything in R is a vector", I have to use a special (list structure?) call like ls.str("package:package_name") as recommended above. - Skvorak
4

I am reading Hadley's book Advanced-R (Chapter 6.3 - functions, p.79), and I think this quote answers your question:

Every operation is a function call
“To understand computations in R, two slogans are helpful:

Everything that exists is an object.
Everything that happens is a function call.”
— John Chambers

According to that, library(name_of_library) is a function call that loads the package. Every piece that has been loaded, i.e. each function or data set, is an object which you can use by calling other functions. In that sense, a package is not an object in any of R's environments until it is loaded. After that, you can think of it as the collection of the objects it contains and which are loaded.
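
As a small illustration of "a collection of objects" (a sketch using stats and datasets, which I picked because they come with R; median and mtcars are just convenient examples):

```r
library(stats)
library(datasets)

# Names of the objects that the attached package makes available.
objs <- ls("package:stats")
print(head(objs))

# Each of those names refers to an ordinary R object, e.g. a function...
f <- get("median", envir = as.environment("package:stats"))
print(class(f))

# ...or a data set.
d <- get("mtcars", envir = as.environment("package:datasets"))
print(class(d))
```

str() works on f and on d, because those are objects; it is only the package name itself that has nothing bound to it.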

Impressive answered 13/1, 2015 at 16:23 Comment(5)
Somewhat along the lines of what I'm looking for, but why would str(package_name) throw an error if str is to "Compactly display the internal structure of an R object"? - Skvorak
Because it is all about the environments. You need to have a look at the above link on the environments chapter. If you load a package, its environment is added to the search path, the chain of environments that R goes through after the global environment when it looks up an object. Unless you load a package, R is not able to find where that name is and hence you get an error. - Impressive
I don’t think this quote really helps OP. In fact, it’s actually quite misleading because “everything that exists is an object” but, despite appearances to the contrary, the package_name in str(package_name) does not exist as far as R is concerned (and isn’t an object), unless OP has defined it previously. - Cardoon
@KonradRudolph But I am mentioning above that a package is a collection of objects and not an object itself. - Impressive
Ultimately, the OP was concerned with what type of object the package is. "Collection of objects" is a fine answer, but doesn't answer the question. That said, your response was illustrative, and I value your input. - Skvorak
