Merge rows in one data.frame

This is very similar to the question merge-two-rows-in-one-dataframe, but I have string variables and just want to collapse the rows that share the same country name. I adapted the MWE:

data<-data.frame(code= c(345, 346), name= "Yemen", v1= c("", "text1"), v2= c("text2", ""),v3= c("text3", ""),v4= c("", "text4"))
code  name    v1    v2    v3    v4
345   Yemen         text2 text3      
346   Yemen   text1             text4

aggregate(x=data[c("v1","v2","v3","v4")], by=list(name=data$name), paste)
name v1.1  v1.2  v2.1 v2.2  v3.1 v3.2 v4.1  v4.2
1 Yemen      text1 text2      text3           text4

I was hoping paste would work as a function to combine the empty cell with the text of the other row, but instead I get one row with extra columns v1.1, v1.2 and so on.

Approach answered 22/10, 2015 at 16:13 Comment(1)
A dataframe with one row is the expected result of your call to aggregate. Could you be more clear on what result you are looking for here? - Ripply

We could use data.table. We convert the 'data.frame' to a 'data.table' with setDT(data), unlist the columns specified in .SDcols within each 'name' group, and then paste the non-empty values together.

library(data.table)
setDT(data)[, unlist(.SD), name, .SDcols=v1:v4][V1!='', paste(V1, collapse=', '), name]

Since the expected output is not shown, this may also be what you want:

setDT(data)[, lapply(.SD, function(x) paste(x[x!=''], collapse='')) , name, .SDcols= v1:v4]
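To see how the two calls differ, here is a minimal self-contained sketch on the question's example data. It assumes data.table is installed; `dt`, `cols`, `per_country` and `per_column` are illustrative names that are not in the original, and `as.data.table()` is used instead of `setDT()` so that `data` itself stays a data.frame.

```r
library(data.table)

# Example data from the question; stringsAsFactors = FALSE (an assumption)
# keeps v1..v4 as character rather than factor
data <- data.frame(code = c(345, 346), name = "Yemen",
                   v1 = c("", "text1"), v2 = c("text2", ""),
                   v3 = c("text3", ""), v4 = c("", "text4"),
                   stringsAsFactors = FALSE)

dt <- as.data.table(data)          # a copy; `data` is left untouched
cols <- c("v1", "v2", "v3", "v4")

# Variant 1: one string per country, pooling every non-empty cell
per_country <- dt[, .(text = {
  x <- unlist(.SD)
  paste(x[x != ""], collapse = ", ")
}), by = name, .SDcols = cols]

# Variant 2: one row per country, each column collapsed on its own
per_column <- dt[, lapply(.SD, function(x) paste(x[x != ""], collapse = "")),
                 by = name, .SDcols = cols]

print(per_country)
print(per_column)
```

The first variant gives a single text column per country, while the second keeps v1 to v4 as separate columns, which is closer to the layout the question starts from.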

Update

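Based on the expected output, we convert the 'factor' columns (v1 to v4) to 'character' class, then use the formula method of aggregate to paste the non-empty values of each column, grouped by 'name'. This assumes 'data' is still a plain data.frame (i.e. setDT(data) has not been run on it), because data[3:6] and data[-1] select columns only on a data.frame; on a data.table they index rows.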

data[3:6] <- lapply(data[3:6], as.character)
aggregate(.~name, data[-1], FUN=function(x) paste(x[x!=''], collapse=', '))
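The extra columns in the question (v1.1, v1.2, ...) appear because paste without collapse returns one element per input row, so each group yields a length-2 vector and aggregate stores it as a matrix column. The base R sketch below illustrates this on the example data. `collapse_nonempty`, `wide` and `merged` are hypothetical names, and `stringsAsFactors = FALSE` is an assumption that makes the factor conversion unnecessary.

```r
# Example data from the question, with character (not factor) columns
data <- data.frame(code = c(345, 346), name = "Yemen",
                   v1 = c("", "text1"), v2 = c("text2", ""),
                   v3 = c("text3", ""), v4 = c("", "text4"),
                   stringsAsFactors = FALSE)
cols <- c("v1", "v2", "v3", "v4")

# Original call: paste returns two values per group, so every
# aggregated column is a matrix rather than a plain vector
wide <- aggregate(x = data[cols], by = list(name = data$name), FUN = paste)
is.matrix(wide$v1)

# Hypothetical helper: drop empty strings, then collapse to one value
collapse_nonempty <- function(x) paste(x[x != ""], collapse = ", ")

# Same call with the helper: one plain value per column and country
merged <- aggregate(x = data[cols], by = list(name = data$name),
                    FUN = collapse_nonempty)
print(merged)
```

Passing the grouping variable through `by` gives the same grouping as the formula call above; note that the formula method additionally drops rows containing NA by default (na.action = na.omit).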
Hendecagon answered 22/10, 2015 at 16:16 Comment(3)
Thanks, the second one worked. It would have been interesting to see whether it also works with the aggregate command; I don't quite understand what I'm doing now :) - Approach
@MaxM So you are looking to combine within each column, instead of the whole rows. - Hendecagon
No, it's fine the way it is. For some reason my data is structured so that there are several rows per country, but with only one entry per column v1-v4. The above code collapses my data so that I have at most one row for each country. Yes, the second command worked. - Approach
