sort numpy array elements by the value of a condition on the elements
A

3

9

I need to sort a numpy array of points by increasing distance from another point.

import numpy as np

def dist(i,j,ip,jp): 
    return np.sqrt((i-ip)**2+(j-jp)**2)

arr = np.array([[0,0],[1,2],[4,1]])

What I would like to do is use dist(1,1,ip,jp) to compute the distance between a fixed point [i,j]=[1,1] and each ordered pair [ip,jp] in arr, and then return arr with its rows sorted from nearest to farthest from [i,j]. Does anyone have a quick solution for this?

The output I want is new_arr = np.array([[1,2],[0,0],[4,1]])

I have some ideas, but they all seem wildly inefficient.

Thanks!

Attemper answered 5/12, 2016 at 22:31 Comment(1)
This is one way: np.array(sorted(arr, key=lambda x: dist(1,1,x[0], x[1]))). (Phalanstery)
A
1

I realize now this is a popular question, so years later, here's my own answer, which uses scipy.spatial. Here, scipy.spatial.distance.cdist is used to do the distance computations. This is fast because the distances are computed in compiled code rather than in a Python loop, and it avoids any "convert to list and convert back" hackery:

from scipy.spatial.distance import cdist
import numpy as np

# EXAMPLE DATA
arr = 20*np.random.random(size=(5000000,2))-10 # testing data
pt = np.array([[1,1]]) # the point to eval proximity to

# THE SOLUTION
out = arr[np.argsort(cdist(arr,pt).squeeze())]

Here, cdist returns an (N, 1) array of distances from each row of arr to pt, squeeze drops the trailing length-1 dimension, argsort returns the indices that would sort those distances in ascending order, and arr[...] reorders the rows of arr by those indices.
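To make each step visible, here is a minimal sketch on the three points from the question. It assumes plain Euclidean distance and swaps cdist for np.linalg.norm with broadcasting so that it runs without SciPy; the names d and order are illustrative only.

```python
import numpy as np

arr = np.array([[0, 0], [1, 2], [4, 1]])  # points from the question
pt = np.array([1, 1])                     # reference point

# Broadcasting subtracts pt from every row; the norm along axis=1
# gives one Euclidean distance per point.
d = np.linalg.norm(arr - pt, axis=1)

order = np.argsort(d)  # indices that sort the distances in ascending order
out = arr[order]       # rows of arr, nearest first

print(order)
print(out)
```

For these points the nearest row is [1, 2], followed by [0, 0] and [4, 1], which matches the output requested in the question.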

Attemper answered 19/7, 2022 at 8:22 Comment(0)
C
13

There seem to be two ways to do this:

  1. Convert the whole numpy array into a Python list, and sort it using Python's sort method with a key function.

     l = list(arr)
     l.sort(key=lambda coord: dist(1, 1, coord[0], coord[1]))
     arr = np.array(l)
    
  2. Create a second numpy array by mapping dist() over the original array, use .argsort() to get the sorted order, then apply that to the original array.

     # dist() uses only element-wise numpy operations, so it accepts whole columns
     arr2 = dist(1, 1, arr[:, 0], arr[:, 1])
     arr3 = np.argsort(arr2)
     arr = arr[arr3]
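As a quick check that the two approaches agree, here is a small self-contained sketch on the data from the question. It assumes that the "map dist() over the array" step is done by passing whole columns to dist, which is one way to realize it; the names via_list and via_argsort are illustrative only.

```python
import numpy as np

def dist(i, j, ip, jp):
    return np.sqrt((i - ip)**2 + (j - jp)**2)

arr = np.array([[0, 0], [1, 2], [4, 1]])

# Approach 1: Python-level sort of the rows with a key function.
via_list = np.array(sorted(arr, key=lambda c: dist(1, 1, c[0], c[1])))

# Approach 2: compute every distance in one call, then argsort.
d = dist(1, 1, arr[:, 0], arr[:, 1])
via_argsort = arr[np.argsort(d)]

print(np.array_equal(via_list, via_argsort))
```

The first approach calls dist once per row from Python, while the second makes a single call on whole columns, so the second would be expected to scale better on large arrays.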
    
Carbonado answered 5/12, 2016 at 22:41 Comment(0)
W
1

You can also write @user3030010's second approach with numpy.lexsort(), using the distances returned by dist() as the key and then applying the resulting index array to arr itself:

import numpy as np

# dist() and arr are as defined in the question
my_key = dist(1, 1, arr[:, 0], arr[:, 1])
inds = np.lexsort(keys=[my_key])
arr = arr[inds]

With a single key this does the same job as argsort, so it is at best a minor improvement, but the method is particularly useful if you then add more keys for sorting.
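A minimal sketch of sorting with more than one key, assuming you want ties in distance broken by the x coordinate. The points in pts are hypothetical and chosen so that three of them are equally far from [1, 1]. Note that np.lexsort treats the last key in the sequence as the primary one.

```python
import numpy as np

pts = np.array([[1, 2], [2, 1], [0, 0], [0, 1]])  # hypothetical points

# Distance of every point from the fixed point [1, 1].
d = np.hypot(pts[:, 0] - 1, pts[:, 1] - 1)

# Keys are listed from least to most significant:
# sort by distance first, then by x among points with equal distance.
inds = np.lexsort((pts[:, 0], d))
out = pts[inds]

print(out)
```

Here the three points at distance 1 come first, ordered by increasing x, and [0, 0] comes last.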

Whopping answered 16/12, 2021 at 17:16 Comment(0)
