Apply function to a row in a data.frame using dplyr
In base R I would do the following:

d <- data.frame(a = 1:4, b = 4:1, c = 2:5)
apply(d, 1, which.max)

With dplyr I could do the following:

library(dplyr)
d %>% mutate(u = purrr::pmap_int(list(a, b, c), function(...) which.max(c(...))))

If there’s another column in d, I need to specify it explicitly, but I want this to work with an arbitrary number of columns.

Conceptually, I’d like something like

pmap_int(list(everything()), ...)
pmap_int(list(.), ...)

But this obviously does not work. How would I solve that canonically with dplyr?

Grizzly answered 3/4, 2021 at 19:16 Comment(0)

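We just need to pass the data itself, i.e. the dot placeholder (.), because a data.frame is a list whose elements are its columns, so pmap_int() can iterate over it directly. If we wrap it as list(.), it becomes a nested list (a list containing a single data.frame), which is not what pmap_int() expects.

library(dplyr)
library(purrr)  # pmap_int() comes from purrr, not dplyr
d %>% 
  mutate(u = pmap_int(., ~ which.max(c(...))))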
#  a b c u
#1 1 4 2 2
#2 2 3 3 2
#3 3 2 4 3
#4 4 1 5 3
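
To see why c(...) is needed: for each row, purrr's formula lambda receives one argument per column, so ... holds every value in the row, while .x refers only to the first argument. A minimal sketch using the same d (the names all_cols and first_only are only for illustration):

```r
library(purrr)

d <- data.frame(a = 1:4, b = 4:1, c = 2:5)

# `...` captures every column value of the current row
all_cols <- pmap_int(d, ~ which.max(c(...)))

# `.x` is only the first argument (column a), a single value,
# so which.max() can only ever return position 1
first_only <- pmap_int(d, ~ which.max(.x))

print(all_cols)
print(first_only)
```

This is why replacing c(...) with .x does not give the same result.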

Or we can use cur_data()

d %>%
   mutate(u = pmap_int(cur_data(), ~ which.max(c(...))))
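
Note that cur_data() was deprecated in dplyr 1.1.0 in favour of pick(). Assuming dplyr 1.1.0 or later is installed, a sketch of the equivalent call might look like this:

```r
library(dplyr)
library(purrr)

d <- data.frame(a = 1:4, b = 4:1, c = 2:5)

# pick(everything()) returns the current data as a tibble,
# which pmap_int() then iterates over row by row
res <- d %>%
  mutate(u = pmap_int(pick(everything()), ~ which.max(c(...))))

print(res)
```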

Or, if we want to use everything(), place it inside select(). On its own, list(everything()) fails because everything() is a selection helper and has no data to select from outside a selecting function.

d %>% 
   mutate(u = pmap_int(select(., everything()), ~ which.max(c(...))))

Or using rowwise

d %>%
    rowwise %>% 
    mutate(u = which.max(cur_data())) %>%
    ungroup
# A tibble: 4 x 4
#      a     b     c     u
#  <int> <int> <int> <int>
#1     1     4     2     2
#2     2     3     3     2
#3     3     2     4     3
#4     4     1     5     3
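
With rowwise, dplyr also provides c_across(), which combines the selected values of the current row into a plain vector, so which.max() is not handed a one-row data frame. A sketch, assuming dplyr 1.0.0 or later (the name res is only for illustration):

```r
library(dplyr)

d <- data.frame(a = 1:4, b = 4:1, c = 2:5)

res <- d %>%
  rowwise() %>%
  # c_across(everything()) gives the row's values as one vector
  mutate(u = which.max(c_across(everything()))) %>%
  ungroup()

print(res)
```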

Or, more efficiently, use the vectorised base R function max.col, which avoids looping over rows

max.col(d, 'first')
#[1] 2 2 3 3
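
The 'first' argument matters because max.col() breaks ties at random by default, whereas which.max() always returns the first maximum. Row 2 of d (2, 3, 3) contains a tie, which makes the difference visible in this small sketch:

```r
d <- data.frame(a = 1:4, b = 4:1, c = 2:5)

first_max <- max.col(d, ties.method = "first")  # same tie rule as which.max()
last_max  <- max.col(d, ties.method = "last")   # picks column c for row 2

print(first_max)
print(last_max)
```

Leaving ties.method at its default could therefore give results that differ from apply(d, 1, which.max) between runs.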

Or with collapse

library(collapse)
dapply(d, which.max, MARGIN = 1)
#[1] 2 2 3 3

The max.col approach can be included in a dplyr pipeline as

d %>% 
    mutate(u = max.col(cur_data(), 'first'))
Glyoxaline answered 3/4, 2021 at 19:17 Comment(9)
I could have sworn that I tried pmap_int(., ...), but overall cur_data() is what I was eventually looking for (also for some other use cases). Thx! - Grizzly
@Grizzly the ... will be within a lambda call - Glyoxaline
Thank you dear @Glyoxaline for this thorough explanation. I could've solved this, but only in a single pmap form. Your explanations gave me a lot of insight into this kind of solution. Thank you very much. - Slide
You almost covered everything and left no room for other answers :) Cool! +1 - Epizoon
Haha, I forgot data.table here. Thanks for the reminder! I added that :) - Epizoon
@akrun, of course. I was just too lazy to type the full function call, hence I abbreviated it to (., ...), being well aware that this is not the proper code ;) - Grizzly
Can you explain the c(...)? - Sodalite
Is this equivalent to d %>% mutate(u = pmap_int(., ~ which.max(.x)))? - Sodalite
Apparently the code above is not equivalent to your solution. - Sodalite

Here are some data.table options

library(data.table)
# one group per row, so .SD holds a single row each time
setDT(d)[, u := which.max(unlist(.SD)), by = 1:nrow(d)]

or

setDT(d)[, u := max.col(.SD, "first")]
Epizoon answered 3/4, 2021 at 21:4 Comment(0)
