I would like to calculate K-nearest neighbours in Python. What library should I use?
I think you should use scikit ann.
There is a good tutorial about nearest neighbour search here.
According to the documentation:
ann is a SWIG-generated python wrapper for the Approximate Nearest Neighbor (ANN) Library (http://www.cs.umd.edu/~mount/ANN/), developed by David M. Mount and Sunil Arya. ann provides an immutable kdtree implementation (via ANN) which can perform k-nearest neighbor and approximate k
Here is a script comparing scipy.spatial.cKDTree and pyflann.FLANN. See for yourself which one is faster for your application.
import cProfile
import numpy as np
import os
import pyflann
import scipy.spatial
# Config params
dim = 4
data_size = 1000
test_size = 1
# Generate data
np.random.seed(1)
dataset = np.random.rand(data_size, dim)
testset = np.random.rand(test_size, dim)
def test_pyflann_flann(num_reps):
    flann = pyflann.FLANN()
    for rep in range(num_reps):
        # Build the index, then query the 5 nearest neighbours
        params = flann.build_index(dataset, target_precision=0.0, log_level='info')
        result = flann.nn_index(testset, 5, checks=params['checks'])

def test_scipy_spatial_kdtree(num_reps):
    for rep in range(num_reps):
        # Build the tree, then query the 5 nearest neighbours
        kdtree = scipy.spatial.cKDTree(dataset, leafsize=10)
        result = kdtree.query(testset, 5)

num_reps = 1000
cProfile.run('test_pyflann_flann(num_reps); test_scipy_spatial_kdtree(num_reps)', 'out.prof')
os.system('runsnake out.prof')
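If pyflann or runsnake is not installed, a rough comparison can still be made with the standard library alone. The following is only a sketch, not part of the script above: it reuses the same data sizes, checks cKDTree against a plain NumPy brute-force search, and times both with timeit. The helper names knn_brute_force and knn_ckdtree are made up for this illustration.

```python
import timeit

import numpy as np
from scipy.spatial import cKDTree

# Same sizes as the script above: 1000 points in 4 dimensions, 1 query point
np.random.seed(1)
dataset = np.random.rand(1000, 4)
testset = np.random.rand(1, 4)
k = 5


def knn_brute_force(data, queries, k):
    """Hypothetical helper: exact k-NN by computing every pairwise distance."""
    diff = queries[:, np.newaxis, :] - data[np.newaxis, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)          # shape (n_queries, n_data)
    return np.argsort(sq_dist, axis=1)[:, :k]    # indices of the k closest points


def knn_ckdtree(data, queries, k):
    """Hypothetical helper: build a cKDTree and query it, as in the script above."""
    tree = cKDTree(data, leafsize=10)
    _, idx = tree.query(queries, k)
    return idx


brute_idx = knn_brute_force(dataset, testset, k)
tree_idx = knn_ckdtree(dataset, testset, k)
same_neighbors = np.array_equal(brute_idx, tree_idx)

# Time 100 repetitions of each approach (tree construction included)
brute_time = timeit.timeit(lambda: knn_brute_force(dataset, testset, k), number=100)
tree_time = timeit.timeit(lambda: knn_ckdtree(dataset, testset, k), number=100)

print("same neighbours:", same_neighbors)
print("brute force: %.4f s, cKDTree: %.4f s" % (brute_time, tree_time))
```

The timings depend heavily on data size, dimension and how often the tree is rebuilt, so treat the numbers as a starting point for your own data rather than a verdict.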
scipy.spatial.cKDTree is fast and solid. For an example of using it for NN interpolation, see (ahem) inverse-distance-weighted-idw-interpolation-with-python on SO.
(If you could say e.g. "I have 1M points in 3d, and want k=5 nearest neighbors of 1k new points",
you might get better answers or code examples.
What do you want to do with the neighbors once you've found them?)
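For a case like the one described above (points in 3d, k=5 neighbors of a batch of new points), a minimal cKDTree sketch might look like this. The array sizes are scaled down and the data is random, purely for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree

# Illustrative data: 10,000 known points and 100 query points in 3d
rng = np.random.RandomState(0)
points = rng.rand(10000, 3)
queries = rng.rand(100, 3)

# Build the tree once, then query all new points in a single call
tree = cKDTree(points)
distances, indices = tree.query(queries, k=5)

# distances[i, j] is the distance from queries[i] to its j-th nearest neighbor,
# sorted nearest first; indices[i, j] is that neighbor's row in `points`
neighbors = points[indices]  # shape (100, 5, 3)

print(distances.shape, indices.shape, neighbors.shape)
```

Building the tree once and reusing it for many queries is usually where a kd-tree pays off.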
A kd-tree is available natively in SciPy, if that is the approach you are looking for: http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.KDTree.html#scipy.spatial.KDTree