I had the same moment of disbelief when reading that axiom; a parameter whose higher values decrease complexity seems counterintuitive at first.
To build an intuition, let's compare a 1-nearest-neighbour model with a k >> 1 nearest-neighbours one, using a simplified 2D plot (a two-feature dataset) with binary classification (each "point" has a class, or label, of either A or B).
With the 1-nearest-neighbour model, each example in the training set is potentially the center of an area predicting class A or B, while many of its neighbours are centers of areas predicting the other class. The plot might look like one of those maps of ethnicity, language or religion in regions of the world where they are deeply intertwined (the Balkans or the Middle East come to mind): small patches of complex shapes and alternating colors, with no discernible logic, and thus "high complexity".
If you increase k, the areas predicting each class become smoother, since it is the majority of the k nearest neighbours that decides the class of any point. The areas will therefore be fewer in number, larger, and probably simpler in shape, like the political maps of country borders in those same regions of the world. Thus "less complexity".
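If you want to see this numerically rather than visually, here is a minimal sketch (my own illustration, not from the course): it fits k-NN with k=1 and a much larger k on a noisy synthetic two-feature dataset, predicts over a dense grid, and counts the contiguous decision regions as a rough proxy for how fragmented the decision surface is. The dataset and the region-counting trick are just assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neighbors import KNeighborsClassifier
from scipy.ndimage import label

# Noisy two-class, two-feature dataset standing in for the 2D plot above
X, y = make_moons(n_samples=300, noise=0.4, random_state=0)

# Dense grid covering the feature space, on which we read off predictions
xx, yy = np.meshgrid(np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 300),
                     np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 300))
grid = np.c_[xx.ravel(), yy.ravel()]

for k in (1, 51):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    Z = clf.predict(grid).reshape(xx.shape)
    # Count connected patches of each predicted class: many small patches
    # means a fragmented ("complex") decision surface, few large ones a smooth one.
    n_regions = label(Z == 0)[1] + label(Z == 1)[1]
    print(f"k={k:>2}: {n_regions} contiguous decision regions")
```

On data like this you should see many small regions for k=1 and only a handful for the larger k, which is exactly the "patchwork map" versus "political map" picture described above.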
(Intuition and source from this course.)