I'm not sure if this is helpful (or even stupid), but I thought of this:
You use a sort-function to sort ALL elements in the grid and then pick the first k
elements. If you consider a sorting algorithm like recursive merge-sort, you have something like this:
- Split collection in half
- Recurse on both halves
- Merge both sorted halves
Maybe you could optimize such a function for your needs. The merging part normally merges all elements from both halves, but you are only interested in the first k that result from the merging. So you could merge only until you have k elements and ignore the rest.
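To make the truncated merge concrete, here is a minimal sketch on plain values rather than grid coordinates. The name mergeFirstK and its signature are made up for illustration; it assumes both inputs are already sorted.

```scala
import scala.annotation.tailrec

// Hypothetical helper: merge two already-sorted sequences,
// but stop as soon as k elements have been emitted.
def mergeFirstK[A](left: IndexedSeq[A], right: IndexedSeq[A], k: Int)(
    implicit ord: Ordering[A]): Vector[A] = {
  val out = Vector.newBuilder[A]

  @tailrec
  def loop(i: Int, j: Int): Unit =
    if (i + j < k) { // i + j is the number of elements emitted so far
      if (i < left.size && (j >= right.size || ord.lteq(left(i), right(j)))) {
        out += left(i)
        loop(i + 1, j)
      } else if (j < right.size) {
        out += right(j)
        loop(i, j + 1)
      }
      // otherwise both inputs are exhausted and the loop ends
    }

  loop(0, 0)
  out.result()
}
```

For example, mergeFirstK(Vector(1, 4, 9), Vector(2, 3, 10), 4) yields Vector(1, 2, 3, 4); the elements 9 and 10 are never looked at.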
So in the worst case, where k >= n (n is the size of the grid), you would still only have the complexity of merge sort: O(n log n).
To be honest, I'm not able to determine the complexity of this solution relative to k (too tired for that at the moment).
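For what it's worth, a rough count suggests O(n log k). This is a sketch rather than a proof, and it assumes that n is a power of two, that k >= 1, that every merge is cut off after k elements, and that one comparison costs O(1). When two sorted halves of size s are merged, there are n/(2s) such merges on that level, and each emits at most min(2s, k) elements:

```latex
T(n,k) \;\le\; \sum_{s \in \{1,2,4,\dots,n/2\}} \frac{n}{2s}\,\min(2s,\,k)
       \;=\; \underbrace{\sum_{2s \le k} n}_{\le\; n\log_2 k}
       \;+\; \underbrace{\sum_{2s > k} \frac{nk}{2s}}_{<\; 2n}
       \;=\; O(n \log k)
```

The second sum is geometric: its largest term is below n, and each further term halves. For k >= n only the first sum remains, which gives back the O(n log n) of ordinary merge sort.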
Here is an example implementation of that solution (it's definitely not optimal and not generalized):
    import scala.annotation.tailrec

    // Returns at most k elements of seq, ordered by distance to x (nearest first).
    def minK(seq: IndexedSeq[coord], x: coord, k: Int): IndexedSeq[coord] = {
      val dist = (c: coord) => c.dist(x)

      // Merge sort in which every merge is cut off after k elements.
      def sort(seq: IndexedSeq[coord]): IndexedSeq[coord] = seq.size match {
        case 0 | 1 => seq
        case size =>
          val (left, right) = seq.splitAt(size / 2)
          merge(sort(left), sort(right))
      }

      def merge(left: IndexedSeq[coord], right: IndexedSeq[coord]): IndexedSeq[coord] = {
        val leftF = left.lift   // index => Option[coord], None past the end
        val rightF = right.lift
        val builder = IndexedSeq.newBuilder[coord]

        @tailrec
        def loop(leftIndex: Int = 0, rightIndex: Int = 0): Unit = {
          if (leftIndex + rightIndex < k) { // stop once k elements are emitted
            (leftF(leftIndex), rightF(rightIndex)) match {
              case (Some(leftCoord), Some(rightCoord)) =>
                if (dist(leftCoord) < dist(rightCoord)) {
                  builder += leftCoord
                  loop(leftIndex + 1, rightIndex)
                } else {
                  builder += rightCoord
                  loop(leftIndex, rightIndex + 1)
                }
              case (Some(leftCoord), None) =>
                builder += leftCoord
                loop(leftIndex + 1, rightIndex)
              case (None, Some(rightCoord)) =>
                builder += rightCoord
                loop(leftIndex, rightIndex + 1)
              case _ => // both halves exhausted
            }
          }
        }

        loop()
        builder.result()
      }

      // take(k) covers inputs of size 0 or 1, which never pass through merge.
      sort(seq).take(k)
    }
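The post does not show the coord type, so to try minK you need something like the following. This is only a sketch: the case class below is an assumed shape for coord (two Double fields and a Euclidean dist), and minKBySorting is a made-up name for the plain "sort everything, take the first k" baseline, which is handy as a reference to compare against.

```scala
// Assumed shape of `coord`; the original post does not define it.
case class coord(x: Double, y: Double) {
  // Euclidean distance to another point
  def dist(other: coord): Double = math.hypot(x - other.x, y - other.y)
}

// Baseline: sort ALL elements by distance to x, then keep the first k.
def minKBySorting(seq: IndexedSeq[coord], x: coord, k: Int): IndexedSeq[coord] =
  seq.sortBy(_.dist(x)).take(k)

// A 3x3 grid and the three points nearest to the origin.
val grid = for (i <- 0 to 2; j <- 0 to 2) yield coord(i.toDouble, j.toDouble)
val nearest = minKBySorting(grid, coord(0.0, 0.0), 3)
```

With the same coord definition in scope, minK(grid, coord(0.0, 0.0), 3) should return the same three points, although points at equal distance may come out in a different order.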
val nearest = grid.minBy(p => p.dist(x)) and then remove that element from the list and try again. Works if k is a small number, like 3. This is not worthy of an answer. I suspect a bitwise operation somewhere could speed this up. – Patric

def distSquare(c: coord) = Math.pow(x-c.x, 2) + Math.pow(y-c.y, 2) as the measure (which basically saves you calculating .sqrt each time). – Mussulman

top method that does what you want. Maybe you can repurpose the source for that? spark.apache.org/docs/0.8.1/api/core/org/apache/spark/rdd/… – Oscaroscillate