What is the practical use of the identity function in R?
Base R defines identity, "a trivial identity function returning its argument" (quoting from ?identity).

It is defined as:

identity <- function (x){x}

Why would such a trivial function ever be useful? Why would it be included in base R?

Barfuss answered 18/8, 2011 at 14:14 Comment(4)
I've seen it used in the context of curve(identity(x)) (rather than the slightly (?) more opaque curve(x*1) or curve(x+0)) ...Exfoliate
@BenBolker Why not simply curve(x)?Barfuss
try it -- it doesn't work (Error in eval(expr, envir, enclos) : could not find function "x") because curve uses funny evaluation rules ...Exfoliate
The first few answers allude to functional programming. Some useful questions about R and functional programming: stackoverflow.com/q/4874867/602276, https://mcmap.net/q/409897/-efficient-functional-programming-using-mapply-in-r-for-a-quot-naturally-quot-procedural-problem/602276 and https://mcmap.net/q/407203/-higher-level-functions-in-r-is-there-an-official-compose-operator-or-curry-function/602276Barfuss
S
18

Don't know about R, but in a functional language one often passes functions as arguments to other functions. In such cases, the constant function (which returns the same value for any argument) and the identity function play a similar role as 0 and 1 in multiplication, so to speak.
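To make the analogy concrete in R: identity is the neutral element of function composition, much as 1 is the neutral element of multiplication. The following is only a minimal sketch; compose2 and pipeline are hypothetical helper names and not part of base R.

```r
# compose2() is a hypothetical helper: it returns the function x -> g(f(x)).
compose2 <- function(f, g) {
  force(f); force(g)          # evaluate both arguments now, not lazily
  function(x) g(f(x))
}

# Fold a list of functions into a single function, starting from identity,
# in the same way that a running product starts from 1.
pipeline <- function(fs) Reduce(compose2, fs, identity)

steps <- list(sqrt, function(x) x + 1)
pipeline(steps)(16)    # sqrt(16) + 1, i.e. 5
pipeline(list())(16)   # no steps at all: the input comes back unchanged, 16
```

Because the fold starts from identity, an empty list of steps needs no special case.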

Selby answered 18/8, 2011 at 14:25 Comment(8)
Can you please explain this a bit more? I understand the bit about being able to pass functions as arguments. But what do you mean when you say it plays the role of 0 and 1?Barfuss
And R is a relatively functional language. At least, it has first class functions, closures, focus on immutable data structures, maps and filters, anonymous functions etc. So, yes, this is probably why identity is included.Oreilly
A good example of functions that take functions as arguments in R is the apply family of functions: ats.ucla.edu/stat/r/library/advanced_function_r.htm#apply . These functions are very powerful for manipulating data sets.Oreilly
@Barfuss - regarding 0 and 1: think multiplication. 0*n = 0, 1*n = nSelby
@Selby I am familiar with the mathematical concept of identity. I have also used many of the functions in R that make use of passing functions as arguments (apply, ddply, aggregate, outer and a multitude of others). But I have never had the need for the identity function myself. So my question is still: in functional programming, why would you need an identity function? What is the practical use of it? PS. Your answer doesn't have to be specifically about R, since this seems like a more generic concept.Barfuss
The practical use of such a thing can be, for example, as a default value for a parameter that is of type 'transformation'. This is similar to having a parameter of type 'multiplicative factor' and setting its default value to 1.0. In both cases, the default value has the effect of being a no-op. It is useful not to have to specify this explicitly (say by setting a boolean do_not_transform), but rather implicitly, simply as a property of the parameter (namely, it acts as the identity operator).Outrange
+1 @Selby and @Outrange Thank you. It's starting to make sense to me, especially when read together with the apply example provided by @gsk3Barfuss
@Barfuss - well I can give an example in Haskell. Say I have a value of type Maybe Int, and I am interested in the number. There is a function maybe d f x = case x of { Nothing -> d; Just i -> f i } which I can use to extract it. It gives me the result of applying a function to the payload or a default value if x was Nothing. So, if I want the value unchanged, I just pass id (which is the name of the identity function in Haskell). I know this sounds contrived to someone who never felt the need, but then, maybe you are just not programming "functional" enough yet.Selby
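The "default value for a transformation parameter" idea from the comments above could look roughly like this in R. This is a sketch only; summarise_values and its transform argument are hypothetical names invented for illustration.

```r
# summarise_values() is a hypothetical function used only for illustration.
# `transform` defaults to identity, so callers who want no transformation
# need neither a boolean flag nor a special case inside the function.
summarise_values <- function(x, transform = identity) {
  mean(transform(x))
}

summarise_values(c(1, 10, 100))                     # mean of the raw values: 37
summarise_values(c(1, 10, 100), transform = log10)  # mean of 0, 1, 2: 1
```

The body of the function always calls transform(x); the no-op case falls out of the default rather than from an if branch.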
M
13

I use it from time to time with the apply family of functions.

For instance, you could write t() as:

dat <- data.frame(x=runif(10),y=runif(10))
apply(dat,1,identity)

       [,1]      [,2]      [,3]      [,4]      [,5]      [,6]       [,7]
x 0.1048485 0.7213284 0.9033974 0.4699182 0.4416660 0.1052732 0.06000952
y 0.7225307 0.2683224 0.7292261 0.5131646 0.4514837 0.3788556 0.46668331
       [,8]      [,9]      [,10]
x 0.2457748 0.3833299 0.86113771
y 0.9643703 0.3890342 0.01700427
Mirandamire answered 18/8, 2011 at 14:32 Comment(3)
+1 In other words, identity is used as a no-operation switch to really make use of the transforming qualities of the function being called. Nice.Barfuss
@Barfuss Exactly. And stated more elegantly than I could have mustered :-)Mirandamire
It's not me, guv. I'm paraphrasing from a deleted answer by @tripleee which I actually found very useful to give context to the other answers.Barfuss
A
9

One use that appears on a simple code base search is as a convenience for the most basic type of error handling function in tryCatch.

tryCatch(...,error = identity)

which is identical (ha!) to

tryCatch(...,error = function(e) e)

So this handler would catch the error (as a condition object, not just its message) and then simply return it.
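A small sketch of what that looks like in practice; the error text and the variable name res are made up for the example.

```r
# The handler receives the condition object; identity hands it back unchanged.
res <- tryCatch(stop("something went wrong"), error = identity)

inherits(res, "error")   # TRUE: res is a condition object, not a string
conditionMessage(res)    # "something went wrong"

# When no error occurs, the handler is never called and the value passes through.
ok <- tryCatch(1 + 1, error = identity)
ok                       # 2
```

The caller can then test the result with inherits(res, "error") to decide what to do next.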

Anthropocentric answered 18/8, 2011 at 14:44 Comment(1)
I'll have to do some more thinking before this makes sense to me. I've always avoided learning how try and tryCatch work. Thanks for the example and reference.Barfuss
E
6

For whatever it's worth, it is located in funprog.R (the functional programming stuff) in the source of the base package, and it was added as a "convenience function" in 2008: I can imagine (but can't give an immediate example!) that there would be some contexts in the functional programming approach (i.e. using Filter, Reduce, Map etc.) where it would be convenient to have an identity function ...

r45063 | hornik | 2008-04-03 12:40:59 -0400 (Thu, 03 Apr 2008) | 2 lines

Add higher-order functions Find() and Position(), and convenience
function identity().
Exfoliate answered 18/8, 2011 at 14:31 Comment(1)
It's also the same as forceOverreach
R
2

Stepping away from functional programming, identity is also used in another context in R, namely statistics. Here, it is used to refer to the identity link function in generalized linear models. For more details about this, see ?family or ?glm. Here is an example:

> x <- rnorm(100)
> y <- rpois(100, exp(1+x))
> glm(y ~x, family=quasi(link=identity))

Call:  glm(formula = y ~ x, family = quasi(link = identity))

Coefficients:
(Intercept)            x
      4.835        5.842

Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
Null Deviance:      6713
Residual Deviance: 2993         AIC: NA

However, in this case passing it as a string instead of a function will achieve the same: glm(y ~x, family=quasi(link="identity"))

EDIT: As noted in the comments below, the function base::identity is not what is used by the link constructor, and it is just used for parsing the link name. (Rather than deleting this answer, I'll leave it to help clarify the difference between the two.)

Rachael answered 18/8, 2011 at 15:35 Comment(2)
This was mentioned in a (now deleted) answer...identity in this context apparently does not actually refer to base::identity. See the code in make.link; it's just matching the name "identity".Anthropocentric
@Rachael : Indeed. The answer was mine and I deleted it, because it is incorrect. The family constructors for glm don't use the identity function, they evaluate a string (even though you can pass the argument unquoted).Baptistry
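The point made in these comments can be checked with a short sketch. It uses the gaussian family rather than quasi purely for brevity; lnk is a name chosen for the example.

```r
# Quoted or unquoted, the family constructor only records the link's name.
gaussian(link = identity)$link     # "identity"
gaussian(link = "identity")$link   # "identity"

# The actual link functions are built from that name by stats::make.link().
lnk <- make.link("identity")
lnk$name           # "identity"
lnk$linkfun(1:3)   # 1 2 3
lnk$linkinv(1:3)   # 1 2 3
```

So the unquoted identity here is matched by name and never called as base::identity.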
C
1

I just used it like this:

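fit_model <- function(lots, of, parameters, error_silently = TRUE) {
  # Pick the wrapper with a scalar if/else; ifelse() is meant for vectors.
  wrapper <- if (error_silently) tryNA else identity
  # The call is passed unevaluated (lazy evaluation), so tryNA() can catch
  # errors raised inside fit_model_(); purrr::compose() would run
  # fit_model_() before tryNA() ever sees it.
  wrapper(fit_model_(lots, of, parameters))
}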

tryNA <- function(expr) {
  suppressWarnings(tryCatch(expr = expr,
                            error = function(e) NA,
                            finally = NA))
}
Clamshell answered 7/3, 2019 at 21:48 Comment(0)
A
1

As this question has already been viewed 8k times, it may be worth updating even 9 years after it was written.

In a blog post called "Simple tricks for Debugging Pipes (within magrittr, base R or ggplot2)" the author points out how identity() can be very useful at the end of different kinds of pipes. The blog post with examples can be found here: https://rstats-tips.net/2021/06/06/simple-tricks-for-debugging-pipes-within-magrittr-base-r-or-ggplot2/

If pipe chains are written so that each "pipe" symbol is at the end of a line, you can exclude any line from execution by commenting it out, except for the last line. If you add identity() as the last line, there will never be a need to comment that one out. So you can temporarily exclude any line that changes the data by commenting it out.
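A minimal sketch of the trick, assuming R >= 4.1 for the native pipe |> (the same layout works with magrittr's %>%). The mtcars steps and the kpl column are chosen only for illustration.

```r
# Every step ends with a pipe, and identity() closes the chain.
# Any step, including the last real one, can be commented out safely.
result <- mtcars |>
  subset(cyl == 4) |>
  # subset(mpg > 30) |>            # temporarily disabled while debugging
  transform(kpl = mpg * 0.425) |>
  identity()

nrow(result)   # 11 four-cylinder cars, since the mpg filter is switched off
```

Without the trailing identity(), commenting out the transform() line would leave a dangling pipe and a syntax error.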

Ambie answered 8/6, 2021 at 10:50 Comment(0)
T
0

Here is a usage example (in Java):

    Map<Integer, Long> m = Stream.of(1, 1, 2, 2, 3, 3)
            .collect(Collectors.groupingBy(Function.identity(),
                    Collectors.counting()));
    System.out.println(m);
    output:
    {1=2, 2=2, 3=2}

Here we are grouping ints into an int/count map. Collectors.groupingBy accepts a Function. In our case we need a function which returns its argument. Note that we could use the lambda e -> e instead.

Thunderstorm answered 23/10, 2018 at 5:39 Comment(0)
F
0

I have just used it to split a matrix into the list of its columns. An example:

(m <- matrix(1:9, 3))
#      [,1] [,2] [,3]
# [1,]    1    4    7
# [2,]    2    5    8
# [3,]    3    6    9

apply(m, 2, identity, simplify = FALSE)
# [[1]]
# [1] 1 2 3
#
# [[2]]
# [1] 4 5 6
#
# [[3]]
# [1] 7 8 9

Here, apply applies the identity function to each column of the matrix m, i.e. it just returns the columns. simplify = FALSE (an argument available from R 4.1.0 onward) is needed to keep the result as a list rather than simplifying it back to an array.
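For comparison, a short sketch of what happens when the result is simplified, plus one way to get the same list without the simplify argument. The names simplified and cols are made up for the example.

```r
m <- matrix(1:9, 3)

# With the default simplification, the columns are glued back into a matrix.
simplified <- apply(m, 2, identity)
dim(simplified)   # 3 3

# A plain lapply() over the column indices gives the list of columns
# without relying on the simplify argument of apply().
cols <- lapply(seq_len(ncol(m)), function(j) m[, j])
cols[[2]]         # 4 5 6
```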

Fribble answered 23/1 at 20:26 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.Provincetown
