R's duplicated
returns a vector showing whether each element of a vector or data frame is a duplicate of an element with a smaller subscript. So if rows 3, 4, and 5 of a 5-row data frame are the same, duplicated
will give me the vector
FALSE, FALSE, FALSE, TRUE, TRUE
But in this case I actually want to get
FALSE, FALSE, TRUE, TRUE, TRUE
that is, I want to know whether a row is duplicated by a row with a larger subscript too.
x <- c(1:9, 7:10, 5:22); y <- c(letters, letters[1:5]); test <- data.frame(x, y); test[duplicated(test$x) | duplicated(test$x, fromLast=TRUE), ]
Returned all three of he copies of 7, 8, and 9. Why does that work? – Rosemare