K nearest neighbour in python [closed]

4

13

I would like to compute the k nearest neighbours in Python. Which library should I use?

Mckissick answered 6/4, 2011 at 11:56 Comment(0)
22

I think that you should use scikit ann.

There is a good tutorial about nearest-neighbour search here.

According to the documentation:

ann is a SWIG-generated python wrapper for the Approximate Nearest Neighbor (ANN) Library (http://www.cs.umd.edu/~mount/ANN/), developed by David M. Mount and Sunil Arya. ann provides an immutable kdtree implementation (via ANN) which can perform k-nearest neighbor and approximate k

Rip answered 6/4, 2011 at 12:0 Comment(4)
+1 this library is very easy to work with. - Haemic
scikit.ann is not the same as scikit-learn. scikit.ann is hard to compile even with easy_install (it requires SWIG), so scikit-learn is a better solution. - Coeducation
The scikit ann link is broken. - Googolplex
ANN is not the same as KNN (which is what the question was originally about). - Oviparous
5

Here is a script comparing scipy.spatial.cKDTree and pyflann.FLANN. See for yourself which one is faster for your application.

import cProfile
import numpy as np
import os
import pyflann
import scipy.spatial

# Config params
dim = 4
data_size = 1000
test_size = 1

# Generate data
np.random.seed(1)
dataset = np.random.rand(data_size, dim)
testset = np.random.rand(test_size, dim)

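def test_pyflann_flann(num_reps):
    """Build a FLANN index and run one 5-NN query, num_reps times."""
    flann = pyflann.FLANN()
    for rep in range(num_reps):
        # Rebuild the index on every repetition so build time is included
        params = flann.build_index(dataset, target_precision=0.0, log_level='info')
        result = flann.nn_index(testset, 5, checks=params['checks'])

def test_scipy_spatial_kdtree(num_reps):
    """Build a cKDTree and run one 5-NN query, num_reps times."""
    for rep in range(num_reps):
        # Rebuild the tree on every repetition, as in the FLANN test
        kdtree = scipy.spatial.cKDTree(dataset, leafsize=10)
        result = kdtree.query(testset, 5)

num_reps = 1000
# Profile both tests into out.prof
cProfile.run('test_pyflann_flann(num_reps); test_scipy_spatial_kdtree(num_reps)', 'out.prof')
# View the profile with RunSnakeRun (has to be installed separately)
os.system('runsnake out.prof')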
Uzzi answered 15/7, 2011 at 5:41 Comment(0)
4

scipy.spatial.cKDTree is fast and solid. For an example of using it for NN interpolation, see (ahem) inverse-distance-weighted-idw-interpolation-with-python on SO.

(If you could say e.g. "I have 1M points in 3D, and want the k=5 nearest neighbors of 1k new points", you might get better answers or code examples.
What do you want to do with the neighbors once you've found them?)
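
For what it's worth, here is a minimal sketch of that kind of query with scipy.spatial.cKDTree. The data is random and scaled down to 10k reference points, and the variable names are made up for illustration; the brute-force line at the end is only a sanity check on the first query point.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical data: 10k random reference points in 3D, 1k query points
rng = np.random.default_rng(0)
points = rng.random((10_000, 3))
queries = rng.random((1_000, 3))

# Build the tree once, then ask for the k=5 nearest neighbours of every query
tree = cKDTree(points)
distances, indices = tree.query(queries, k=5)

# distances[i, j] is the Euclidean distance from queries[i] to its j-th
# nearest neighbour, and indices[i, j] is that neighbour's row in `points`
print(distances.shape, indices.shape)

# Sanity check for the first query: brute-force distances to every point
brute = np.sort(np.linalg.norm(points - queries[0], axis=1))[:5]
print(np.allclose(distances[0], brute))
```

Each row of the result is sorted from nearest to farthest, so column 0 is always the single nearest neighbour.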

Sebrinasebum answered 6/4, 2011 at 16:0 Comment(0)
4

This is built into SciPy if you want a kd-tree approach: http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.KDTree.html#scipy.spatial.KDTree
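
If the goal is k-NN classification rather than just finding neighbours (the question doesn't say, so this is an assumption), a rough sketch on top of scipy.spatial.KDTree could look like the following. The two-cluster toy data and the helper name knn_predict are invented for the example; labels are assumed to be small non-negative integers.

```python
import numpy as np
from scipy.spatial import KDTree

# Hypothetical toy data: two well-separated 2D clusters labelled 0 and 1
rng = np.random.default_rng(1)
class0 = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
class1 = rng.normal(loc=5.0, scale=0.5, size=(50, 2))
train_points = np.vstack([class0, class1])
train_labels = np.array([0] * 50 + [1] * 50)

tree = KDTree(train_points)

def knn_predict(query_points, k=5):
    """Label each query point by majority vote among its k nearest neighbours."""
    _, idx = tree.query(query_points, k=k)   # idx has shape (n_queries, k)
    neighbour_labels = train_labels[idx]
    # bincount counts the votes per label; argmax picks the most common one
    return np.array([np.bincount(row).argmax() for row in neighbour_labels])

predictions = knn_predict(np.array([[0.1, -0.2], [4.8, 5.3]]))
print(predictions)
```

An odd k avoids tied votes when there are only two classes.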

Anodic answered 8/6, 2012 at 19:29 Comment(0)
