How would I write a recursive version of purrr::keep?
Asked Answered
G

1

6

Say I have a nested list with a bunch of data frames at different levels. I want to extract out flattened list of just the data frames. How might I write this using purrr functions? Should I be looking at reduce?

For example, given the data:

s <- list(x = 1:10,
          data = data.frame(report = LETTERS[1:5],
                            value = rnorm(5, 20, 5)),
          report = list(A = data.frame(x = 1:3, y = c(2, 4, 6)),
                        B = data.frame(x = 1:3, y = c(3, 6, 9)),
                        z = 4:10,
                        other = data.frame(w = 3:5,
                                       color = c("red", "green", "blue"))))

I'd like the function to return:

list(data = data.frame(report = LETTERS[1:5],
                       value = rnorm(5, 20, 5)),
     `report$A` = data.frame(x = 1:3, y = c(2, 4, 6)),
     `report$B` = data.frame(x = 1:3, y = c(3, 6, 9)),
     `report$other` = data.frame(w = 3:5,
                                 color = c("red", "green", "blue")))

I've written a recursive function:

recursive_keep <- function(.x, .f) {
  loop <- function(.y) {
    if(is.list(.y)) {
      c(keep(.y, .f), flatten(map(discard(.y, .f), loop)))
    } else if(.f(.y)) {
      .y
    } else {
      NULL
    }
  }
  loop(.x)
}

It can be called as:

recursive_keep(s, is.data.frame)

It seems to work on this example, but it doesn't keep the name information. I'm looking to keep enough information that I could pluck the data from the original object. Maybe that is an easier question to answer?

Gorizia answered 3/1, 2018 at 18:3 Comment(2)
Re your code: why the intermediary function loop() and not recursive_keep <- function(.y, .f) { if(is.list(.y)) { c(keep(.y, .f), flatten(map(discard(.y, .f), recursive_keep, .f))) } else if(.f(.y)) { .y } else { NULL } }?Cordless
keep(rlang::squash(s), is.data.frame) is not an answer to your question, but gives the same output as your recursive_keep().Cordless
D
4

This recursive function with one-line body retains names and uses no packages:

rec <- function(x, FUN = is.data.frame)
  if (FUN(x)) list(x) else if (is.list(x)) do.call("c", lapply(x, rec, FUN))

str(rec(s))  # test

giving (continued after output):

List of 4
 $ data        :'data.frame':   5 obs. of  2 variables:
  ..$ report: Factor w/ 5 levels "A","B","C","D",..: 1 2 3 4 5
  ..$ value : num [1:5] 29.1 19.9 21.2 13 25.2
 $ report.A    :'data.frame':   3 obs. of  2 variables:
  ..$ x: int [1:3] 1 2 3
  ..$ y: num [1:3] 2 4 6
 $ report.B    :'data.frame':   3 obs. of  2 variables:
  ..$ x: int [1:3] 1 2 3
  ..$ y: num [1:3] 3 6 9
 $ report.other:'data.frame':   3 obs. of  2 variables:
  ..$ w    : int [1:3] 3 4 5
  ..$ color: Factor w/ 3 levels "blue","green",..: 3 2 1

Regarding getting, say, A from within report from the original object s:

s[["report"]][["A"]]

or

ix <- c("report", "A")
s[[ix]]
Dichogamy answered 3/1, 2018 at 21:57 Comment(3)
Nice! May I suggest rec <- function(x, p) if (p(x)) list(x) else if (is.list(x)) do.call("c", lapply(x, rec, p)) ; rec(s, is.data.frame) to imitate keep() behavior?Cordless
@G. Grothendieck: Thanks, this looks great except for that you are passing your default FUN argument implicitly to the recursive call.Gorizia
OK. Have made these changes.Dichogamy

© 2022 - 2025 — McMap. All rights reserved.