I have a large data frame in which I am multiplying two columns together to get another column. At first I was running a for-loop, like so:
for (i in 1:nrow(df)) {
  df$new_column[i] <- df$column1[i] * df$column2[i]
}
but this takes like 9 days.
Another alternative was plyr, and I actually might be using the variables incorrectly:
new_df <- ddply(df, .(column1,column2), transform, new_column = column1 * column2)
but this is taking forever
Why not df$new_column <- df$column1 * df$column2? How big is your data frame? – Passage
Operations in R are vectorized, so you can multiply vectors by vectors and it will multiply entries of the same index together. The problem with the for-loop is that R creates a new data frame for every iteration of the loop. The solution I suggested creates just one new data frame instead of 400K new data frames. – Passage
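To illustrate the suggestion in the comments, here is a minimal sketch on a made-up four-row data frame. The column names follow the question, but the values are invented for illustration. It runs the vectorized multiplication and, for comparison, a loop that fills a plain vector, then checks that both give the same result.

```r
# Toy data frame standing in for the real one (values are made up)
df <- data.frame(column1 = c(1, 2, 3, 4), column2 = c(10, 20, 30, 40))

# Vectorized: one multiplication over the whole columns, one assignment
df$new_column <- df$column1 * df$column2

# Loop version for comparison; it writes into a preallocated vector
# rather than into the data frame on every iteration
looped <- numeric(nrow(df))
for (i in seq_len(nrow(df))) {
  looped[i] <- df$column1[i] * df$column2[i]
}

# Check that both approaches produce the same values
same_result <- identical(df$new_column, looped)
```

On a real data frame with hundreds of thousands of rows, the vectorized line should be the one to use. The loop is only there to show that the two are equivalent.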