List of n first Neighbors from a 3d Array R

Asked 21/6, 2016 at 3:58 Answered 21/6, 2016 at 14:33

Solved arrays r matlab 3d nearest-neighbor

Lets say we have a 3d array:

my.array <- array(1:27, dim=c(3,3,3))

I would like to create a list of the n first neighbors.

Example: Lets get my.array[2,2,2]=14, so the first neighbors of 14 is:

list[14] = [1 to 27] - 14

I also would like to do the same for second, third, n closest neighbors using R, C or Matlab.

Thanks

Salicylate answered 21/6, 2016 at 3:58 Comment(3)

So the "first neighbors of 14" are a vector ranging from -13 to 13? What are the second neighbors of 14? Third? (I do not understand what you mean or are requesting.) – Lens 21/6, 2016 at 4:36

the first neighbor for (1,1,1) is 2 and so on. the second is 2,3 and so on... – Salicylate 21/6, 2016 at 11:38

"2 and so on" is not at all useful. Are "first level neighbors" those with a distance of 1? I'll assume euclidean distance, not Manhattan or other. If so, then the neighbors of [1,1,1] should be 2, 4, and 10, not including 5, 11, or 13 (dist 1.4). Please clear up your definition. – Lens 21/6, 2016 at 13:16

Based on the comments, I assume you are defining "first nearest neighbor" as all cells with a euclidean distance of 1 or less (excluding self), "second nearest neighbors" as those with 2 or less, etc. Your assertion in a comment in @evan058's answer that "for (1,1,1) the first level neighbors is 2,4,5,10,11,13", I'm actually interpreting this to include the immediate diagonals (with a distance of 1.414) but not further diagonals (in your example, 14 would be a further diagonal with a distance of 1.732).

This function accepts either a pre-defined array (ary) or the dimensions to make one (dims).

nearestNeighbors(dims = c(3,3,3), elem = c(1,1,1), dist = 1)
#      dim1 dim2 dim3
# [1,]    2    1    1
# [2,]    1    2    1
# [3,]    1    1    2
nearestNeighbors(dims = c(3,3,3), elem = c(1,1,1), dist = 1,
                 return_indices = FALSE)
# [1]  2  4 10
nearestNeighbors(dims = c(3,3,3), elem = c(1,1,1), dist = 2,
                 return_indices = FALSE)
#  [1]  2  3  4  5  7 10 11 13 14 19

nearestNeighbors(ary = array(27:1, dim = c(3,3,3)), elem = c(1,1,1), dist = 2)
#       dim1 dim2 dim3
#  [1,]    2    1    1
#  [2,]    3    1    1
#  [3,]    1    2    1
#  [4,]    2    2    1
#  [5,]    1    3    1
#  [6,]    1    1    2
#  [7,]    2    1    2
#  [8,]    1    2    2
#  [9,]    2    2    2
# [10,]    1    1    3
nearestNeighbors(ary = array(27:1, dim = c(3,3,3)), elem = c(1,1,1), dist = 2,
                 return_indices = FALSE)
#  [1] 26 25 24 23 21 18 17 15 14  9

The function:

#' Find nearest neighbors.
#'
#' @param ary array
#' @param elem integer vector indicating the indices on array from
#'   which all nearest neighbors will be found; must be the same
#'   length as \code{dims} (or \code{dim(ary)}). Only one of
#'   \code{ary} and \code{dim} needs to be provided.
#' @param dist numeric, the max distance from \code{elem}, not
#'   including the 'self' point.
#' @param dims integer vector indicating the dimensions of the array.
#'   Only one of \code{ary} and \code{dim} needs to be provided.
#' @param return_indices logical, whether to return a matrix of
#'   indices (as many columns as dimensions) or the values from
#'   \code{ary} of the nearest neighbors
#' @return either matrix of indices (one column per dimension) if
#'   \code{return_indices == TRUE}, or the appropriate values in
#'   \code{ary} otherwise.
nearestNeighbors <- function(ary, elem, dist, dims, return_indices = TRUE) {
  if (missing(dims)) dims <- dim(ary)
  tmpary <- array(1:prod(dims), dim = dims)
  if (missing(ary)) ary <- tmpary
  if (length(elem) != length(dims))
      stop("'elem'' needs to have the same dimensions as 'ary'")
  # work on a subset of the whole matrix
  usedims <- mapply(function(el, d) {
    seq(max(1, el - dist), min(d, el + dist))
  }, elem, dims, SIMPLIFY=FALSE)
  df <- as.matrix(do.call('expand.grid', usedims))
  # now, df is only as big as we need to possibly satisfy `dist`
  ndist <- sqrt(apply(df, 1, function(x) sum((x - elem)^2)))
  ret <- df[which(ndist > 0 & ndist <= dist),,drop = FALSE]
  if (return_indices) {
    return(ret)
  } else {
    return(ary[ret])
  }
}

Edit: changed the code for a "slight" speed improvement: using a 256x256x256 array and a distance of 2 previously took ~90 seconds on my machine. Now it takes less than 1 second. Even a distance of 5 (same array) takes less than a second. Not fully tested, please verify it is correct.

Edit: Removed the extra { on the fifty line of the function.

Lens answered 21/6, 2016 at 14:33 Comment(6)

It works, but I am dealing with a 256x256x256, for each element is taking 15 seconds. Need to do the same for 16.777.216 points :( Is there a way to make it faster. I wont need more than 5 for distance. Thanks r2evans. – Salicylate 21/6, 2016 at 19:53

You can do similar things much faster with Rcpp. It'll deal with the dimensionality without problem, and with its "sugar", much of the code you need is very similar to the code above. – Lens 21/6, 2016 at 20:2

You should be able to speed things up by restricting yourself to a submatrix of the whole. That is, you know that you can limit yourself to +/- dist cells in each direction, so you can reduce the 256x256x256 array to a 11x11x11 array, calculate the neighbors, then re-offset based on the center cell. Since this reduces the matrix size by a factor of no less than 12,600 (worst case), it should speed up quite a bit. – Lens 21/6, 2016 at 20:5

You can speed it up just a little more if you (a) hard-code which of ary or dims you will be providing all the time, and (b) you take out the length check. This doesn't save you much, but if you're dealing with 256^3 checks, any microsecond may help. – Lens 21/6, 2016 at 20:56

Good catch ({), I was in the middle of improving the readability when "work" caught up to me :-) – Lens 21/6, 2016 at 21:36

Could you please help me on this one? You are the guy! #38001494 @Lens – Salicylate 25/6, 2016 at 3:17

I think something along these lines will do the trick:

nClosest <- function(pts, pt, n)
{
  # Get the target value
  val <- pts[pt[1], pt[2], pt[3]]
  # Turn the matrix into a DF
  ptsDF <- adply(pts, 1:3)
  # Create Dist column for distance to val
  ptsDF$Dist <- abs(ptsDF$V1 - val)
  # Order by the distance to val
  ptsDF <- ptsDF[with(ptsDF, order(Dist)),]
  # Split into groups:
  sp <-  split(ptsDF, ptsDF$Dist)
  # Get max index
  topInd = min(n+1, length(sp))
  # Agg the split dfs into a single df
  rbind.fill(sp[2:topInd])
}

Output:

> nClosest(my.array, c(1,2,2), 3)
  X1 X2 X3 V1 Dist
1  3  1  2 12    1
2  2  2  2 14    1
3  2  1  2 11    2
4  3  2  2 15    2
5  1  1  2 10    3
6  1  3  2 16    3

Coreycorf answered 21/6, 2016 at 5:4 Comment(3)

Almost there, but for (1,1,1) the first level neighbors is 2,4,5,10,11,13 and second level neighbors 2,3,4,5,7,9,10,19... – Salicylate 21/6, 2016 at 11:35

So by n nearest you mean all points within distance n of the given point? Or do you want the n groups of nearest neighbors? (grouped by distance) – Coreycorf 21/6, 2016 at 14:48

@Salicylate check out my latest version – Coreycorf 21/6, 2016 at 15:35

Recommended topics

Hot tags