R: find nearest index
I have two vectors with a few thousand points, but generalized here:

A <- c(10, 20, 30, 40, 50)
b <- c(13, 17, 20)

How can I get the indices of A that are nearest to each element of b? The expected result is c(1, 2, 2).

I know that findInterval only returns the index of the interval's lower bound, not the nearest element, and I'm aware that which.min(abs(b[2] - A)) is getting warmer, but I can't figure out how to vectorize it to work with long vectors for both A and b.

Emelia answered 15/4, 2012 at 7:54 Comment(0)
12

You can wrap your code in sapply. I think this runs at about the same speed as a for loop, so it isn't technically vectorized:

sapply(b,function(x)which.min(abs(x - A)))
Prefect answered 15/4, 2012 at 8:2 Comment(1)
Do note that which.min() only returns the first match. There might be other elements that are equally close. (Prefect)
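To make the tie case from the comment above concrete, here is a minimal sketch that returns every index of A at the minimum distance instead of only the first. `nearest_all` is a hypothetical helper name, not a base R function, and the exact `==` comparison assumes distances that are exactly representable (for general doubles a tolerance may be needed).

```r
A <- c(10, 20, 30, 40, 50)
b <- c(13, 15, 20)   # 15 is equally close to 10 and 20

# nearest_all is a hypothetical helper: all indices of A tied for nearest to x
nearest_all <- function(x, A) {
  d <- abs(x - A)
  which(d == min(d))
}

# one list element per value of b, since the number of ties can vary
ties <- lapply(b, nearest_all, A = A)
```

The result is a list rather than a vector because each element of b can have a different number of equally near neighbours.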
11

findInterval gets you very close. You just have to choose between the index it returns and the next one:

# returns the index of the element of vec nearest to each x (vec must be sorted in increasing order)
nearest.vec <- function(x, vec)
{
    smallCandidate <- findInterval(x, vec, all.inside=TRUE)
    largeCandidate <- smallCandidate + 1
    #nudge is TRUE if large candidate is nearer, FALSE otherwise
    nudge <- 2 * x > vec[smallCandidate] + vec[largeCandidate]
    return(smallCandidate + nudge)
}

nearest.vec(b,A)

This returns c(1, 2, 2) and should be comparable to findInterval in performance.
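One caveat worth spelling out: findInterval requires its second argument to be sorted in non-decreasing order, so the function above only applies when vec is sorted. As a rough sanity check (my own sketch, not part of the answer), the function can be compared against the brute-force which.min approach on random sorted data. The sizes and the seed here are arbitrary assumptions, and the values are integers with no duplicates so the arithmetic is exact.

```r
nearest.vec <- function(x, vec) {
  smallCandidate <- findInterval(x, vec, all.inside = TRUE)
  largeCandidate <- smallCandidate + 1
  # nudge is TRUE if the large candidate is nearer, FALSE otherwise
  nudge <- 2 * x > vec[smallCandidate] + vec[largeCandidate]
  smallCandidate + nudge
}

set.seed(1)                                     # arbitrary seed, for reproducibility
vec <- sort(sample(1:100000, 1000))             # sorted, no duplicates
x   <- sample(-100:100100, 5000, replace = TRUE) # includes values outside range(vec)

fast <- nearest.vec(x, vec)
slow <- sapply(x, function(xi) which.min(abs(xi - vec)))

all(fast == slow)
```

Both approaches resolve exact ties toward the smaller index, which is why they are expected to agree here. With duplicated values in vec they can return different (equally near) indices.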

Cautionary answered 15/1, 2013 at 12:25 Comment(1)
Really useful, thank you. I'm surprised that there isn't something to do this in base. Which probably means that there is and I'm not aware of it...! (Wes)
0

Here's a solution that uses R's often overlooked outer function. Not sure if it'll perform better, but it does avoid sapply.

A <- c(10, 20, 30, 40, 50)
b <- c(13, 17, 20)

dist <- abs(outer(A, b, '-'))
result <- apply(dist, 2, which.min)

# [1] 1 2 2
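Note that outer builds a full length(A) by length(b) matrix, so memory use grows with the product of the two lengths. If that is acceptable, the per-column apply can also be replaced with base R's max.col. This is only a sketch of a variation on the idea above, not a claim about speed; `d` is just an illustrative variable name.

```r
A <- c(10, 20, 30, 40, 50)
b <- c(13, 17, 20)

# one row per element of b, one column per element of A
d <- abs(outer(b, A, "-"))

# max.col finds the column of each row-wise maximum, so negate the distances;
# ties.method = "first" mirrors which.min's first-match behaviour
result <- max.col(-d, ties.method = "first")
```

The arguments to outer are swapped relative to the code above so that each row corresponds to one element of b, which is the layout max.col works on.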
Evangelize answered 5/10, 2017 at 0:45 Comment(0)
