Combining column values with column names using tidyr unite

I have a data.frame with several columns:

set.seed(1)
df <- data.frame(cluster=LETTERS[1:4],group=c(rep("m",2),rep("f",2)),point=rnorm(4),err=runif(4,0.1,0.3))

and I'd like to add another column that concatenates all the columns of each row, separated by "\n", with each value preceded by its column name.

I know this:

library(tidyr)
library(dplyr)
tidyr::unite(df,text,sep="\n")

gives me this tibble:

                                         text
1  A\nm\n0.487429052428485\n0.286941046221182
2  B\nm\n0.738324705129217\n0.142428504256532
3  C\nf\n0.575781351653492\n0.230334753217176
4 D\nf\n-0.305388387156356\n0.125111019192263

But what I want is this tibble:

                                         text
1  cluster: A\ngroup: m\npoint: 0.487429052428485\nerr: 0.286941046221182
2  cluster: B\ngroup: m\npoint: 0.738324705129217\nerr: 0.142428504256532
3  cluster: C\ngroup: f\npoint: 0.575781351653492\nerr: 0.230334753217176
4 cluster: D\ngroup: f\npoint: -0.305388387156356\nerr: 0.125111019192263

Any idea?

Thomasthomasa answered 2/7, 2018 at 16:46 Comment(0)

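We can use Map with do.call: Map pastes each column name onto that column's values, and do.call(paste, ...) then joins the resulting vectors row-wise with "\n" as the separator.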

data.frame(text = do.call(paste, c(Map(function(x, y) 
                 paste(x, y, sep=": "), names(df), df), sep="\n")))
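Broken into steps, the same idea might look like the sketch below. The intermediate names `labelled` and `text` are only for illustration, and attaching the result with `df$text` is an assumption based on the question asking for an added column rather than a one-column result.

```r
set.seed(1)
df <- data.frame(cluster = LETTERS[1:4],
                 group = c(rep("m", 2), rep("f", 2)),
                 point = rnorm(4),
                 err = runif(4, 0.1, 0.3))

# Step 1: prefix every value with its column name,
# giving one character vector per column
labelled <- Map(function(name, values) paste(name, values, sep = ": "),
                names(df), df)

# Step 2: paste the labelled vectors together row-wise, separated by newlines
text <- do.call(paste, c(labelled, sep = "\n"))

# Step 3 (assumption): keep the original columns and add the text as a new one
df$text <- text

# Print the first entry with the newlines rendered
cat(df$text[1], "\n")
```

Because `labelled` is built before `text` is attached, the new column is not itself included in the concatenation.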

Or, using the tidyverse, loop over the columns with imap (which passes each column as .x and its name as .y), convert the result to a tibble, and then unite:

library(tidyverse)
imap(df, ~ paste(.y, .x, sep=": ")) %>%
              as_tibble %>%
              unite(text, sep="\n")
# A tibble: 4 x 1
#  text                                                                     
#  <chr>                                                                    
#1 "cluster: A\ngroup: m\npoint: -0.626453810742332\nerr: 0.225822808779776"
#2 "cluster: B\ngroup: m\npoint: 0.183643324222082\nerr: 0.112357254093513" 
#3 "cluster: C\ngroup: f\npoint: -0.835628612410047\nerr: 0.14119491497986" 
#4 "cluster: D\ngroup: f\npoint: 1.59528080213779\nerr: 0.135311350505799"  

Or as @DanChaltiel mentioned

imap_dfr(df, ~ paste(.y, .x, sep = ": ")) %>%
      unite(text, sep = "\n")
Nuri answered 2/7, 2018 at 16:50 Comment(3)
We can also avoid the looping by using aggregate(do.call(paste,c(sep=": ",rev(stack(df)))),list(c(row(df))),paste,collapse="\n") - Cleanup
@Cleanup Yes, but if you check those functions, aggregate, stack, etc. do their own looping internally, which would add more overhead. - Nuri
You can even save a call by using imap_dfr() and removing as_tibble. - Alumna

Thanks to @jared_mamrot's solution here, another option would be to use across to transform the columns to 'column name + column value' first and then unite as follows:

df %>% mutate(across(names(df), ~paste0(cur_column(), ": ", .x))) %>% unite(text, sep = "\n")

                                                                     text
1 cluster: A\ngroup: m\npoint: -0.626453810742332\nerr: 0.225822808779776
2  cluster: B\ngroup: m\npoint: 0.183643324222082\nerr: 0.112357254093513
3  cluster: C\ngroup: f\npoint: -0.835628612410047\nerr: 0.14119491497986
4   cluster: D\ngroup: f\npoint: 1.59528080213779\nerr: 0.135311350505799

This option also makes it easy to label and unite only the columns of interest. Refer to @jared_mamrot's solution.
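As a rough sketch of that column selection, assuming only `cluster` and `point` are wanted (my choice, purely for illustration) and a dplyr version that provides `across()` and `cur_column()` (1.0.0 or later):

```r
library(dplyr)
library(tidyr)

set.seed(1)
df <- data.frame(cluster = LETTERS[1:4],
                 group = c(rep("m", 2), rep("f", 2)),
                 point = rnorm(4),
                 err = runif(4, 0.1, 0.3))

# Label only `cluster` and `point`, then unite just those two columns.
# `group` and `err` are left untouched and stay in the result.
res <- df %>%
  mutate(across(c(cluster, point), ~ paste0(cur_column(), ": ", .x))) %>%
  unite(text, cluster, point, sep = "\n")

res
```

By default unite() drops the columns it combines; passing remove = FALSE should keep the labelled `cluster` and `point` columns alongside `text` if they are still needed.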

Wing answered 22/6, 2023 at 13:48 Comment(0)
