I have two data frames with a range of cases. One from time 1 and one from time 2. I am looking for a way to quickly identify cases where changes have occurred between time1 and time 2 and I am kinda stuck.
Here is an example. So, I have a data frame from time 1,
df.t1 <- data.frame(id = c(1,1,1,2,2,3,3,5,5,6), ABC = LETTERS[1:10], Num = 101:110)
and it looks like this,
df.t1
id ABC Num
1 1 A 101
2 1 B 102
3 1 C 103
4 2 D 104
5 2 E 105
6 3 F 106
7 3 G 107
8 5 H 108
9 5 I 109
10 6 J 110
time two rolls around
df.t2 <- df.t1
and some changes occur,
df.t2[3,3] <- 104
df.t2[2,2] <- "H"
df.t2[8,3] <- 999
df.t2[10,3] <- NA
df.t2[11,] <- c(3, "J", 107)
this is time 2,
df.t2
id ABC Num
1 1 A 101
2 1 H 102
3 1 C 104
4 2 D 104
5 2 E 105
6 3 F 106
7 3 G 107
8 5 H 999
9 5 I 109
10 6 J <NA>
11 3 J 107
I'm now looking for a quick way to remove cases, all id's, for cases where NO changes have occurred in the case (any row) between time1 and time 2. In the specific example it is only with id # 2 that no changes occurred between time 1 and time 2.
I am looking for an end result that looks like this,
(df <- subset(df.t2, id != 2))
id ABC Num
1 1 A 101
2 1 H 102
3 1 C 104
6 3 F 106
7 3 G 107
8 5 H 999
9 5 I 109
10 6 J <NA>
11 3 J 107
any help would be appreciated.
id=3
there is a change (a new entry). You don't consider them? – Higherdf
where changes have occurred between time 1 and time 2. Please let me know if it's still unclear! – Zoa