I have a dataset that looks like this toy example. The data describes the location a person has moved to and the time since this relocation happened. For example, person 1 started out in a rural area, but moved to a city 463 days ago (2nd row), and 415 days ago he moved from this city to a town (3rd row), etc.
set.seed(123)
df <- as.data.frame(sample.int(1000, 10))
colnames(df) <- "time"
df$destination <- as.factor(sample(c("city", "town", "rural"), size = 10, replace = TRUE, prob = c(.50, .25, .25)))
df$user <- sample.int(3, 10, replace = TRUE)
df[order(df[,"user"], -df[,"time"]), ]
The data:
time destination user
526 rural 1
463 city 1
415 town 1
299 city 1
179 rural 1
938 town 2
229 town 2
118 city 2
818 city 3
195 city 3
I wish to aggregate this data into the format below. That is, I want to count each type of relocation (from one destination to the next) within each user and sum the counts across users into one matrix. How do I achieve this (preferably without writing loops)?
from to count
city city 1
city town 1
city rural 1
town city 2
town town 1
town rural 0
rural city 1
rural town 0
rural rural 0
data.table. And just a side note for future visitors: this solution works great, but note that it relies on the data being ordered by user and time, as in the example. – Cephalochordate
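To make the ordering caveat concrete, here is a minimal base R sketch that sorts explicitly before pairing consecutive rows. It is not the data.table solution the comment refers to, just one possible way to get the same counts. It assumes the toy data from the question, typed in by hand so the result does not depend on the random number generator; the helper names lv, same_user, from, to and counts are made up for illustration.

```r
# Toy data copied by hand from the table in the question.
lv <- c("city", "town", "rural")
df <- data.frame(
  time = c(526, 463, 415, 299, 179, 938, 229, 118, 818, 195),
  destination = factor(
    c("rural", "city", "town", "city", "rural",
      "town", "town", "city", "city", "city"),
    levels = lv
  ),
  user = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3)
)

# Shuffle the rows on purpose, to show that the explicit sort below matters.
df <- df[sample(nrow(df)), ]

# Sort by user, then by time descending (largest "days ago" = earliest move).
df <- df[order(df$user, -df$time), ]

# Pair each row with the next one and keep only pairs within the same user.
n <- nrow(df)
same_user <- df$user[-1] == df$user[-n]
from <- df$destination[-n][same_user]
to <- df$destination[-1][same_user]

# Cross-tabulate; the factor levels keep zero-count combinations in the result.
counts <- as.data.frame(table(from = from, to = to), responseName = "count")
counts <- counts[order(counts$from, counts$to), ]
print(counts)
```

Because the rows are sorted inside the snippet, the input order no longer matters, and for the toy data the counts match the desired table in the question. No explicit loop is needed: the shifted vectors do the pairing and table() does the counting.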