Original data frame:
v1 = sample(letters[1:3], 10, replace=TRUE)
v2 = sample(letters[1:3], 10, replace=TRUE)
df = data.frame(v1,v2)
df
   v1 v2
1   b  c
2   a  a
3   c  c
4   b  a
5   c  c
6   c  b
7   a  a
8   a  b
9   a  c
10  a  b
New data frame:
new_df = data.frame(row.names=rownames(df))
for (i in colnames(df)) {
  for (x in letters[1:3]) {
    #new_df[x] = as.numeric(df[i] == x)
    new_df[paste0(i, "_", x)] = as.numeric(df[i] == x)
  }
}
   v1_a v1_b v1_c v2_a v2_b v2_c
1     0    1    0    0    0    1
2     1    0    0    1    0    0
3     0    0    1    0    0    1
4     0    1    0    1    0    0
5     0    0    1    0    0    1
6     0    0    1    0    1    0
7     1    0    0    1    0    0
8     1    0    0    0    1    0
9     1    0    0    0    0    1
10    1    0    0    0    1    0
For small datasets this is fine, but it becomes slow for much larger datasets.
Does anyone know of a way to do this without looping?
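One possible direction, as a rough sketch rather than a benchmarked solution: compare each column against all levels at once with outer(), then bind the resulting indicator matrices together in a single step instead of assigning columns into the data frame one at a time. The helper name one_hot and its levs argument are made up for illustration. It still iterates over the columns with lapply(), so it removes only the inner loop over levels and the repeated column assignment.

```r
# Hypothetical helper (not from any package): builds 0/1 indicator columns
# named <column>_<level> for every column of df and every value in levs.
one_hot <- function(df, levs) {
  cols <- lapply(names(df), function(i) {
    # outer() compares every value of the column with every level at once,
    # giving a logical matrix; adding 0 converts TRUE/FALSE to 1/0.
    m <- outer(as.character(df[[i]]), levs, "==") + 0
    colnames(m) <- paste0(i, "_", levs)
    m
  })
  # Bind all indicator matrices side by side and convert once at the end.
  as.data.frame(do.call(cbind, cols), row.names = rownames(df))
}

# Same setup as in the question
v1 = sample(letters[1:3], 10, replace=TRUE)
v2 = sample(letters[1:3], 10, replace=TRUE)
df = data.frame(v1,v2)

new_df = one_hot(df, letters[1:3])
new_df
```

Whether this is actually faster on a large data frame would need to be timed on the real data; the assumption here is that most of the cost comes from growing new_df one column at a time.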