I have a dataset that looks like this:
before <- data.frame(
  diag1 = c(1, NA, 1, NA, NA, 1),
  diag2 = c(NA, NA, NA, 2, NA, NA),
  diag3 = c(3, NA, NA, NA, 3, 3),
  diag4 = c(4, 4, NA, NA, 4, NA)
)
  diag1 diag2 diag3 diag4
1     1    NA     3     4
2    NA    NA    NA     4
3     1    NA    NA    NA
4    NA     2    NA    NA
5    NA    NA     3     4
6     1    NA     3    NA
I have been trying to find a solution that produces a new column named "diagnoses" that looks like this:
  diagnoses
1     1,3,4
2         4
3         1
4         2
5       3,4
6       1,3
This is a much smaller example of my real problem. The dataset I am working on has over 70 diagnosis columns, with no more than 3 numeric values in each row. I have tried the strsplit, separate, and unite functions, but I still haven't found an elegant solution.
I have also used apply with paste:
dat$diagnoses <- apply(dat[, cols], 1, function(x) paste(na.omit(x), collapse = ", "))
However, it yields strings with many extra commas.
I tried gsub to remove the extra commas, but I still have not been able to get the result I hoped for.
This is the output I have been able to get: "1,,3,4,," ",,,4,," " 1,,,,," ",2,,,," ",,3,4,," "1,,3,,,"
"" instead of NAs – Langland
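Following that hint, here is a minimal sketch, assuming the real columns hold empty (or whitespace-only) strings where the example has NA. In that case na.omit() has nothing to drop, which would explain the runs of commas. The data frame below is a hypothetical stand-in for the real data, and the "diag" column-name pattern used to build cols is also an assumption.

```r
# Hypothetical stand-in for the real data: blanks are "" rather than NA
dat <- data.frame(
  diag1 = c("1", "", "1", "", "", "1"),
  diag2 = c("", "", "", "2", "", ""),
  diag3 = c("3", "", "", "", "3", "3"),
  diag4 = c("4", "4", "", "", "4", ""),
  stringsAsFactors = FALSE
)

# Assumed naming scheme: every diagnosis column is called diag<number>
cols <- grep("^diag[0-9]+$", names(dat), value = TRUE)

# Turn empty or whitespace-only strings into real NA values
dat[cols] <- lapply(dat[cols], function(x) {
  x <- trimws(as.character(x))
  x[x %in% ""] <- NA_character_
  x
})

# Collapse the non-missing values of each row into one string
dat$diagnoses <- apply(dat[cols], 1, function(x) paste(na.omit(x), collapse = ","))
```

Once the blanks are real NA values, the original apply/paste approach should work unchanged, so no gsub cleanup is needed. If the columns really do contain NA, as in the small example, the conversion step is harmless and can stay.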