How to unnest a list column?

I have a tibble like:

tibble(a = c('first', 'second'), 
       b = list(c('colA' = 1, 'colC' = 2), c('colA'= 3, 'colB'=2))) 

# A tibble: 2 x 2
  a      b        
  <chr>  <list>   
1 first  <dbl [2]>
2 second <dbl [2]>

Which I would like to turn into this form:

# A tibble: 2 x 4
  a       colA  colB  colC
  <chr>  <dbl> <dbl> <dbl>
1 first     1.   NA     2.
2 second    3.    2.   NA 

I tried to use unnest(), but I am having issues preserving the elements' names from the nested values.

Baliol answered 18/4, 2018 at 0:16

You can do this by coercing the elements in the list column to data frames arranged as you like, which will unnest nicely:

library(tidyverse)

tibble(a = c('first', 'second'), 
       b = list(c('colA' = 1, 'colC' = 2), c('colA'= 3, 'colB'=2))) %>% 
    mutate(b = invoke_map(tibble, b)) %>% 
    unnest()
#> # A tibble: 2 x 4
#>   a       colA  colC  colB
#>   <chr>  <dbl> <dbl> <dbl>
#> 1 first     1.    2.   NA 
#> 2 second    3.   NA     2.

Doing the coercion is a little tricky, though: passing a named vector straight to tibble() gives a 2x1 data frame (one column, one row per element), not the 1x2 row you want. There are various ways around this, but a direct route is purrr::invoke_map, which calls a function with purrr::invoke (like do.call) on each element in a list, so each name becomes its own column.
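To make the 2x1 versus 1x2 point concrete, here is a minimal base R sketch. It uses data.frame as a stand-in for tibble and do.call as a stand-in for purrr::invoke; both substitutions are mine for illustration, not the code from the answer.

```r
# One element of the list column: a named numeric vector
x <- c(colA = 1, colC = 2)

# Passing the vector as a single argument stacks the values in one column
tall <- data.frame(x)

# Spreading the elements as separate named arguments gives one row,
# with one column per name (this is what do.call / invoke does)
wide <- do.call(data.frame, as.list(x))

dim(tall)   # rows, columns of the stacked version
dim(wide)   # rows, columns of the one-row version
names(wide)
```

Only the one-row form lets the element names survive as column names, which is what unnest needs in order to line the rows up.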

Amaty answered 18/4, 2018 at 0:28 Comment(2)
I've noticed this works only with the same number of columns. Could you adapt your answer for, say, b = list(c('colA' = 1, 'colC' = 2), c('colA'= 3, 'colB'=2, 'colD' = 1)))? - Baliol
That works fine for me? As long as each element is coercible to a data frame, unnest will line up matching names and fill missing values with NA. - Amaty
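As a rough check of that claim, here is a sketch using the uneven input from the comment above. It assumes tidyr 1.0.0 or later (for the cols argument) and swaps invoke_map for map() with as_tibble(as.list(x)), because invoke_map is deprecated as of purrr 1.0.0. That swap is my assumption about an equivalent, not the answer's original code.

```r
library(dplyr)
library(purrr)
library(tibble)
library(tidyr)

# Second element has three names, first has two
df <- tibble(
  a = c("first", "second"),
  b = list(c(colA = 1, colC = 2), c(colA = 3, colB = 2, colD = 1))
)

res <- df %>%
  # as.list() turns each named vector into named arguments,
  # so as_tibble() builds a one-row tibble per element
  mutate(b = map(b, function(x) as_tibble(as.list(x)))) %>%
  # one-row tibbles are bound by column name; absent names become NA
  unnest(cols = b)

res
```

Names that appear in only one element (colB, colC, colD here) should still get their own column, with NA in the rows that lack them.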

With tidyr 1.0.0 or later, we can use unnest_wider to turn the named elements directly into new columns.

tidyr::unnest_wider(df,b)
# A tibble: 2 x 4
#  a       colA  colC  colB
#  <chr>  <dbl> <dbl> <dbl>
#1 first      1     2    NA
#2 second     3    NA     2

data

df <- tibble(a = c('first', 'second'), 
   b = list(c('colA' = 1, 'colC' = 2), c('colA'= 3, 'colB'=2)))
Rhinestone answered 22/11, 2019 at 0:31

You can use hoist to pull individually named components out into their own columns. This uses purrr::pluck syntax, so names have to be quoted:

library(tidyr) 

df |> 
  hoist(b, "colA", "colB", "colC")
Unseat answered 31/5, 2024 at 22:38
