How to spread or cast multiple values in R [duplicate]

Here is a toy data set for this example:

data <- data.frame(x=rep(c("red","blue","green"),each=4), y=rep(letters[1:4],3), value.1 = 1:12, value.2 = 13:24)

       x y value.1 value.2
1    red a       1      13
2    red b       2      14
3    red c       3      15
4    red d       4      16
5   blue a       5      17
6   blue b       6      18
7   blue c       7      19
8   blue d       8      20
9  green a       9      21
10 green b      10      22
11 green c      11      23
12 green d      12      24

How can I cast or spread variable y, to produce the following wide data.frame:

     x a.value.1 b.value.1 c.value.1 d.value.1 a.value.2 b.value.2 c.value.2 d.value.2
1  blue         5         6         7         8        17        18        19        20
2 green         9        10        11        12        21        22        23        24
3   red         1         2         3         4        13        14        15        16
Echols answered 24/9, 2014 at 14:46 Comment(2)
If the real data has more variables, why would you want it in wide format? It's easier to read in the long format. – Lobbyism
@RichardScriven, in genomics, for example, some downstream analyses require data in wide format. – Juryrig

We could do this using dplyr/tidyr. We reshape 'data' from 'wide' to 'long' format with gather, specifying the columns (starts_with("value")) to be combined into a key/value column pair ('Var'/'val'); unite the 'Var' and 'y' columns into a single 'Var1' column; and convert back to 'wide' format with spread.

 library(dplyr)
 library(tidyr)
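 data %>%
      gather(Var, val, starts_with("value")) %>%  # wide -> long: key 'Var', value 'val'
      unite(Var1, Var, y) %>%                     # paste 'Var' and 'y' into 'Var1', e.g. value.1_a
      spread(Var1, val)                           # long -> wide: one column per 'Var1'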

 #      x value.1_a value.1_b value.1_c value.1_d value.2_a value.2_b   value.2_c
 #1   blue         5         6         7         8        17        18        19
 #2  green         9        10        11        12        21        22        23
 #3    red         1         2         3         4        13        14        15
 #    value.2_d
 #1        20
 #2        24
 #3        16
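
For readers on a newer tidyr (1.0.0 or later), the same gather/unite/spread reshape can probably be written as a single pivot_wider call. This is only a sketch, not part of the original answer: it assumes pivot_wider and its names_glue argument are available in the installed tidyr, and it uses a glue pattern chosen to reproduce the a.value.1 style names from the question.

```r
library(tidyr)

# Toy data from the question
data <- data.frame(x = rep(c("red", "blue", "green"), each = 4),
                   y = rep(letters[1:4], 3),
                   value.1 = 1:12, value.2 = 13:24)

# Spread both value columns across the levels of y in one step.
# names_glue builds each new column name from y and the source
# column name (.value), e.g. "a.value.1".
wide <- pivot_wider(data,
                    names_from  = y,
                    values_from = c(value.1, value.2),
                    names_glue  = "{y}.{.value}")

print(wide)
```

Unlike spread, pivot_wider keeps the rows in order of first appearance rather than sorting them by x.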

Update

(After 6 months)

Reshaping multiple value columns to wide is now possible with dcast from data.table 1.9.5, without calling melt first. We can install the devel version from here

 library(data.table)
 dcast(setDT(data), x~y, value.var=c('value.1', 'value.2'))
 #       x a_value.1 b_value.1 c_value.1 d_value.1 a_value.2 b_value.2 c_value.2
 #1:  blue         5         6         7         8        17        18        19
 #2: green         9        10        11        12        21        22        23
 #3:   red         1         2         3         4        13        14        15
 #   d_value.2
 #1:        20
 #2:        24
 #3:        16
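
As far as I know, casting multiple value.var columns has since reached the CRAN releases of data.table, so the devel version should no longer be needed. Below is a self-contained sketch under that assumption. It copies the data with as.data.table instead of converting it in place with setDT, and passes sep = "." to change the separator in the generated names. The exact order of the name parts may differ from the output above depending on the data.table version, so the code prints the names rather than assuming them.

```r
library(data.table)

# Toy data from the question
data <- data.frame(x = rep(c("red", "blue", "green"), each = 4),
                   y = rep(letters[1:4], 3),
                   value.1 = 1:12, value.2 = 13:24)

# Cast both value columns at once; as.data.table() leaves 'data' untouched
wide <- dcast(as.data.table(data), x ~ y,
              value.var = c("value.1", "value.2"), sep = ".")

# Inspect the generated column names for the installed version
print(names(wide))
print(wide)
```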
Corriecorriedale answered 24/9, 2014 at 15:11 Comment(4)
Hi, I am trying to do this based on your updated suggestion but get an error: "Error in .subset2(x,I,exact=exact): subscript out of bounds" Help? – Veolaver
@Veolaver Are you using the devel version of data.table? I am not getting any error with the example you provided – Corriecorriedale
I don't think I am...can you tell me how to install that perhaps? Sorry to bother you – Veolaver
@Veolaver Instructions to install are here – Corriecorriedale

Melt first, then dcast:

library(reshape2)
data1 <- melt(data, id.vars = c("x", "y"))
dcast(data1, x ~ variable + y)
#      x value.1_a value.1_b value.1_c value.1_d value.2_a value.2_b value.2_c value.2_d
#1  blue         5         6         7         8        17        18        19        20
#2 green         9        10        11        12        21        22        23        24
#3   red         1         2         3         4        13        14        15        16
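
If adding a package is not an option, base R's reshape() can produce a similar wide table. This is a sketch and not part of the answer above: reshape() names the new columns value.1.a, value.2.a and so on, so the optional sub() call swaps the parts to get the a.value.1 style from the question. The regular expression assumes the y levels are single lowercase letters, as in the toy data.

```r
# Toy data from the question
data <- data.frame(x = rep(c("red", "blue", "green"), each = 4),
                   y = rep(letters[1:4], 3),
                   value.1 = 1:12, value.2 = 13:24)

# One row per x; every remaining column is spread across the levels of y
wide <- reshape(data, idvar = "x", timevar = "y", direction = "wide")

# Optional: turn "value.1.a" into "a.value.1"
names(wide) <- sub("^(value\\.[0-9]+)\\.([a-z])$", "\\2.\\1", names(wide))

print(wide)
```

The columns come out grouped by y level (a.value.1, a.value.2, b.value.1, ...) rather than by value column, so reorder them afterwards if the layout matters.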
Uniat answered 24/9, 2014 at 15:06 Comment(0)
