Based on the comments, I assume you are defining "first nearest neighbors" as all cells with a Euclidean distance of 1 or less (excluding self), "second nearest neighbors" as those with a distance of 2 or less, etc. Given your assertion in a comment on @evan058's answer that "for (1,1,1) the first level neighbors is 2,4,5,10,11,13", I'm actually interpreting this to include the immediate diagonals (with a distance of 1.414) but not further diagonals (in your example, 14 would be a further diagonal with a distance of 1.732).
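To make those distances concrete, here is a small sketch that computes the Euclidean distance from (1,1,1) to each cell mentioned above in a 3x3x3 array. It uses only base R (`arrayInd`, `sweep`, `rowSums`); the variable names are mine and purely illustrative.

```r
dims  <- c(3, 3, 3)
elem  <- c(1, 1, 1)
cells <- c(2, 4, 5, 10, 11, 13, 14)   # linear indices from the comment, plus 14

# convert linear indices to (dim1, dim2, dim3) subscripts, one row per cell
idx <- arrayInd(cells, .dim = dims)

# Euclidean distance of each cell from 'elem'
d <- sqrt(rowSums(sweep(idx, 2, elem)^2))
round(setNames(d, cells), 3)
# cells 2, 4, 10 are at distance 1; 5, 11, 13 at 1.414; 14 at 1.732
```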
The function below accepts either a pre-defined array (`ary`) or the dimensions to make one (`dims`).
    nearestNeighbors(dims = c(3,3,3), elem = c(1,1,1), dist = 1)
    #      dim1 dim2 dim3
    # [1,]    2    1    1
    # [2,]    1    2    1
    # [3,]    1    1    2
    nearestNeighbors(dims = c(3,3,3), elem = c(1,1,1), dist = 1,
                     return_indices = FALSE)
    # [1]  2  4 10
    nearestNeighbors(dims = c(3,3,3), elem = c(1,1,1), dist = 2,
                     return_indices = FALSE)
    # [1]  2  3  4  5  7 10 11 13 14 19
    nearestNeighbors(ary = array(27:1, dim = c(3,3,3)), elem = c(1,1,1), dist = 2)
    #       dim1 dim2 dim3
    #  [1,]    2    1    1
    #  [2,]    3    1    1
    #  [3,]    1    2    1
    #  [4,]    2    2    1
    #  [5,]    1    3    1
    #  [6,]    1    1    2
    #  [7,]    2    1    2
    #  [8,]    1    2    2
    #  [9,]    2    2    2
    # [10,]    1    1    3
    nearestNeighbors(ary = array(27:1, dim = c(3,3,3)), elem = c(1,1,1), dist = 2,
                     return_indices = FALSE)
    # [1] 26 25 24 23 21 18 17 15 14  9
The function:
    #' Find nearest neighbors.
    #'
    #' @param ary array. Only one of \code{ary} and \code{dims} needs to
    #'   be provided.
    #' @param elem integer vector indicating the indices on array from
    #'   which all nearest neighbors will be found; must be the same
    #'   length as \code{dims} (or \code{dim(ary)}).
    #' @param dist numeric, the max distance from \code{elem}, not
    #'   including the 'self' point.
    #' @param dims integer vector indicating the dimensions of the array.
    #'   Only one of \code{ary} and \code{dims} needs to be provided.
    #' @param return_indices logical, whether to return a matrix of
    #'   indices (as many columns as dimensions) or the values from
    #'   \code{ary} of the nearest neighbors
    #' @return either matrix of indices (one column per dimension) if
    #'   \code{return_indices == TRUE}, or the appropriate values in
    #'   \code{ary} otherwise.
    nearestNeighbors <- function(ary, elem, dist, dims, return_indices = TRUE) {
      if (missing(dims)) dims <- dim(ary)
      tmpary <- array(1:prod(dims), dim = dims)
      if (missing(ary)) ary <- tmpary
      if (length(elem) != length(dims))
        stop("'elem' needs to have the same length as 'dims' (or 'dim(ary)')")
      # work on a subset of the whole array; floor() keeps the candidate
      # indices whole numbers when `dist` is fractional (e.g., sqrt(2))
      reach <- floor(dist)
      usedims <- mapply(function(el, d) {
        seq(max(1, el - reach), min(d, el + reach))
      }, elem, dims, SIMPLIFY = FALSE)
      # name the columns dim1, dim2, ... (expand.grid defaults to Var1, Var2, ...)
      names(usedims) <- paste0("dim", seq_along(dims))
      df <- as.matrix(do.call('expand.grid', usedims))
      # now, df is only as big as we need to possibly satisfy `dist`
      ndist <- sqrt(apply(df, 1, function(x) sum((x - elem)^2)))
      ret <- df[which(ndist > 0 & ndist <= dist),,drop = FALSE]
      if (return_indices) {
        return(ret)
      } else {
        return(ary[ret])
      }
    }
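The function only measures distances inside a box around `elem` rather than across the whole array. As a rough illustration (variable names are mine, not part of the function), this compares the number of cells in a 256x256x256 array with the largest possible box for a distance of 2:

```r
dims <- c(256, 256, 256)
dist <- 2

# cells a whole-array search would have to measure
whole_array <- prod(dims)

# upper bound on the cells in the box around 'elem': at most
# floor(dist) steps either side, plus 'elem' itself, in each dimension
box_cells <- (2 * floor(dist) + 1)^length(dims)

c(whole_array = whole_array, box_cells = box_cells)
```

The box is clipped at the array edges, so for an element in a corner it is smaller still.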
Edit: changed the code for a "slight" speed improvement: using a 256x256x256 array and a distance of 2 previously took ~90 seconds on my machine. Now it takes less than 1 second. Even a distance of 5 (same array) takes less than a second. Not fully tested, please verify it is correct.
Edit: Removed the extra { on the fifth line of the function.
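Since the faster version is not fully tested, one way to check it is against a brute-force reference that measures every cell in the array. The sketch below is only that reference: `bruteNeighbors` is a hypothetical helper of mine, not part of the function above, and it is only practical for small arrays.

```r
# Hypothetical helper (not part of the answer's function): brute-force
# reference that returns the linear indices of all cells within 'dist'
# of 'elem', excluding 'elem' itself.
bruteNeighbors <- function(dims, elem, dist) {
  # subscripts of every cell, one row per cell, in linear-index order
  all_idx <- arrayInd(seq_len(prod(dims)), .dim = dims)
  d <- sqrt(rowSums(sweep(all_idx, 2, elem)^2))
  which(d > 0 & d <= dist)
}

bruteNeighbors(c(3, 3, 3), c(1, 1, 1), 1)
bruteNeighbors(c(3, 3, 3), c(1, 1, 1), 2)
bruteNeighbors(c(3, 3, 3), c(1, 1, 1), sqrt(2))
```

For a given `dims`, `elem`, and `dist`, the result can be compared (for example with `all.equal`) to `nearestNeighbors(dims = dims, elem = elem, dist = dist, return_indices = FALSE)`, since the default array holds each cell's linear index.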