Join list to dataframe in R
Asked Answered
B

5

12

I have the following dataframe df and list l (dput below):

> df
  group value
1     A     1
2     B     2
3     C     3
> l
$A
[1] 999

$B
[1] 55

I would like to join the values of the list to the dataframe based on the names in the list with the group variable of the dataframe and call it "value_l". The expected output should look like this:

  group value value_l
1     A     1     999
2     B     2      55
3     C     3      NA

So I was wondering if anyone knows how to join a list to a dataframe based on their names?


dput df and l:

df <- structure(list(group = c("A", "B", "C"), value = c(1, 2, 3)), class = "data.frame", row.names = c(NA, 
-3L))
l <- list(A = 999, B = 55)
Biosphere answered 15/1, 2023 at 18:54 Comment(2)
Do you want to join a list to a data.frame or is it also fine to unlist the list to a vector and join it then?Relucent
Hi @GKi, preferably a list but vector is also okay. Thank you for the great answers!Biosphere
R
10

You can use match. In case df$group is a character (what is here the case) it could be directly used to subset the list.

df$value_l <- l[match(df$group, names(l))]
#df$value_l <- l[df$group]  #Short alternative by @akrun works only in case df$group is a character, but not for factor or numeric
#df$value_l <- l[as.character(df$group)] #Maybe more secure

df
#  group value value_l
#1     A     1     999
#2     B     2      55
#3     C     3    NULL

In case there is a need for NA, instead of NULL use in addition:

df$value_l[vapply(df$value_l, is.null, TRUE)] <- NA

df
#  group value value_l
#1     A     1     999
#2     B     2      55
#3     C     3      NA

Or make it in single steps:

. <- match(df$group, names(l))
df$value_l <- l[.]
is.na(df$value_l) <- is.na(.)

Here we have joined a list to a data.frame.

str(df)
#'data.frame':   3 obs. of  3 variables:
# $ group  : chr  "A" "B" "C"
# $ value  : num  1 2 3
# $ value_l:List of 3
#  ..$ A : num 999
#  ..$ B : num 55
#  ..$ NA: logi NA

In case the List can be trasfomed to a vector you can use unlist before (thanks to @G. Grothendieck for the comment). But here we have then joined a vector to the data.frame.

df$value_l <- unlist(l)[match(df$group, names(l))]
#df$value_l <- unlist(l)[as.character(df$group)] #Option like shown above

df
#  group value value_l
#1     A     1     999
#2     B     2      55
#3     C     3      NA

str(df)
#'data.frame':   3 obs. of  3 variables:
# $ group  : chr  "A" "B" "C"
# $ value  : num  1 2 3
# $ value_l: num  999 55 NA

Another option, also joined a vector to the data.frame will be using merge.

merge(df, unlist(l), by.x="group", by.y=0, all.x = TRUE)
#  group value   y
#1     A     1 999
#2     B     2  55
#3     C     3  NA

Note: For the given list the results look similar but this will not be the case if the list looks e.g. like:

l <- list(A = 999, B = c(7, 55), A = 9)

A potential solution might be:
Taking first match:

df$value_l <- l[as.character(df$group)]

df
#  group value value_l
#1     A     1     999
#2     B     2   7, 55
#3     C     3    NULL

Making a left Join

merge(df, list2DF(list(group = names(l), value_l = l)), all.x=TRUE)
#merge(df, data.frame(group = names(l), value_l = I(l)), all.x=TRUE) #Alternative
#  group value value_l
#1     A     1     999
#2     A     1       9
#3     B     2   7, 55
#4     C     3      NA

Other options.

merge(df, list2DF(list(group = names(l), value_l = l)))             #Inner
merge(df, list2DF(list(group = names(l), value_l = l)), all=TRUE)   #Outer
merge(df, list2DF(list(group = names(l), value_l = l)), all.y=TRUE) #Right

For other options please have a look at How to join (merge) data frames (inner, outer, left, right).

Relucent answered 15/1, 2023 at 19:13 Comment(0)
C
6

You can do:

library(tidyverse)

l |> 
  as.data.frame() |> 
  pivot_longer(cols      = everything(),
               names_to  = "group",
               values_to = "value_1") |> 
  left_join(x  = df,
            y  = _,
            by = "group")

which gives:

  group value value_1
1     A     1     999
2     B     2      55
3     C     3      NA
Click answered 15/1, 2023 at 18:58 Comment(0)
U
6

Use merge from base R

 merge(df, stack(l), by.x = 'group', by.y = 'ind', all.x = TRUE)
  group value values
1     A     1    999
2     B     2     55
3     C     3     NA

Or with dplyr

library(dplyr)
df %>%
   rowwise %>%
   mutate(value_l = if(group %in% names(l)) l[[group]] else NA) %>% 
   ungroup

-output

# A tibble: 3 × 3
  group value value_l
  <chr> <dbl>   <dbl>
1 A         1     999
2 B         2      55
3 C         3      NA

Or using enframe/unnest

library(tidyr)
library(tibble)
enframe(l, name = 'group', value = 'value_l') %>%
  unnest(value_l) %>% 
  left_join(df, .)
  group value value_l
1     A     1     999
2     B     2      55
3     C     3      NA

Or if it can be a list column

df$value_l <- l[df$group]
> df
  group value value_l
1     A     1     999
2     B     2      55
3     C     3    NULL
Uwton answered 15/1, 2023 at 19:43 Comment(0)
B
5

Update:

Maybe this one:

library(dplyr)

stack(unlist(l)) %>% 
  full_join(df, by=c("ind"="group"))

  values ind value
1    999   A     1
2     55   B     2
3     NA   C     3 

First answer: Slightly different:

library(dplyr)
library(tidyr)
bind_rows(l) %>% 
  pivot_longer(everything()) %>% 
  full_join(df, by=c("name"="group")) %>% 
  select(name, value = value.y, value_l=value.x)
 name  value value_l
  <chr> <dbl>   <dbl>
1 A         1     999
2 B         2      55
3 C         3      NA
Burette answered 15/1, 2023 at 19:4 Comment(0)
P
5

This is a simpler version of what GKi suggested with unlist(). If your list always has a name and a single numeric value, you can convert it to a named vector and then use it as a lookup vector, which is simpler than doing merges or matches:

temp_vec = unlist(l)
df$l_value = temp_vec[df$group]

df
  group value l_value
1     A     1     999
2     B     2      55
3     C     3      NA

Without the intermediate variable for a single line solution:

df$l_value = unlist(l)[df$group]

df
  group value l_value
1     A     1     999
2     B     2      55
3     C     3      NA

Depending on what else you need the list for, it may even make sense just to use a named vector instead of a list in the first place.

Printable answered 16/1, 2023 at 3:35 Comment(1)
This is much better optionUwton

© 2022 - 2024 — McMap. All rights reserved.