What is the difference between "sklearn.cluster.k_means" and "sklearn.cluster.KMeans", and when should I use each?
I am confused about the difference between "sklearn.cluster.k_means" and "sklearn.cluster.KMeans". When should I use each of them?

Coralline answered 23/12, 2017 at 21:30 Comment(1)
Use the latter. They do the same thing, but the latter uses sklearn's estimator API, while the former is just a function. Zack

From the sklearn glossary: "[w]e provide ad hoc function interfaces for many algorithms, while estimator classes provide a more consistent interface." k_means() is just a wrapper that returns the result of KMeans.fit() as a plain tuple whose entries correspond to these fitted attributes:

  • cluster_centers_,
  • labels_,
  • inertia_,
  • n_iter_ (returned only when return_n_iter=True)
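As a rough illustration of that correspondence, the sketch below runs both interfaces on the same data. The toy data, the variable names, and the choice of n_clusters=2, n_init=10 and random_state=0 are assumptions made for this example, not part of the original answer.

```python
import numpy as np
from sklearn.cluster import KMeans, k_means

# Toy data (made up for illustration): two well-separated blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.1, size=(20, 2)),
    rng.normal(5.0, 0.1, size=(20, 2)),
])

# Functional interface: the results come back as a plain tuple.
centers, labels, inertia, n_iter = k_means(
    X, n_clusters=2, n_init=10, random_state=0, return_n_iter=True
)

# Class interface: the same quantities are stored on the fitted estimator.
est = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(centers.shape, est.cluster_centers_.shape)
print(labels.shape, est.labels_.shape)
print(inertia, est.inertia_)
```

With the function you only keep the returned arrays; with the class you keep a fitted object that can be reused later.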

KMeans is a class designed following the developer guide for sklearn objects. KMeans, like other estimator objects in sklearn, implements methods including:

  • fit(),
  • transform(), and
  • score().

It also implements other methods like predict(). The main benefit of using KMeans over k_means() is that you have easy access to the other methods implemented in KMeans. For example, if you want to use your trained model to predict which cluster unseen data belongs to:

from sklearn.cluster import KMeans

est = KMeans()
est.fit(X_train)  # learn the cluster centers from the training data
cluster_labels = est.predict(X_test)  # assign each unseen sample to its nearest center

If you use the functional API, you only get arrays back, so to apply the prediction to new data you would have to look under the hood of KMeans.predict() and reimplement the cluster assignment yourself.
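For plain k-means, that manual step amounts to assigning each new sample to its nearest center. A minimal sketch, assuming Euclidean distance and made-up toy data; using pairwise_distances_argmin here is my choice for the example, not necessarily how KMeans.predict() is implemented internally:

```python
import numpy as np
from sklearn.cluster import KMeans, k_means
from sklearn.metrics import pairwise_distances_argmin

# Toy data (made up for illustration): two well-separated blobs.
rng = np.random.default_rng(0)
X_train = np.vstack([
    rng.normal(0.0, 0.1, size=(20, 2)),
    rng.normal(5.0, 0.1, size=(20, 2)),
])
X_test = np.array([[0.05, -0.02], [4.9, 5.1], [0.2, 0.1]])

# Functional API: only the centers, labels and inertia come back.
centers, _, _ = k_means(X_train, n_clusters=2, n_init=10, random_state=0)

# Manual "predict": index of the nearest center for each test sample.
manual_labels = pairwise_distances_argmin(X_test, centers)

# Class API: the fitted estimator does the same assignment for you.
est = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_train)
class_labels = est.predict(X_test)

print(manual_labels, class_labels)
```

The class version is shorter and keeps the centers and the prediction logic together in one object.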

The functional design is not implemented for all sklearn objects, but you can easily implement this yourself using other examples from sklearn to guide you.
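One way to do that is to wrap the estimator in a small function, mirroring what k_means() does for KMeans. The sketch below uses AgglomerativeClustering as the example; the function name agglomerative() is hypothetical and is not part of sklearn.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering


def agglomerative(X, n_clusters=2, linkage="ward"):
    """Hypothetical functional wrapper (not part of sklearn).

    Fits an AgglomerativeClustering estimator and returns only the
    cluster label of each sample, discarding the fitted object.
    """
    est = AgglomerativeClustering(n_clusters=n_clusters, linkage=linkage)
    est.fit(X)
    return est.labels_


# Toy data (made up for illustration): two well-separated blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.1, size=(20, 2)),
    rng.normal(5.0, 0.1, size=(20, 2)),
])

labels = agglomerative(X, n_clusters=2)
print(labels)
```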

Fendig answered 13/6, 2020 at 18:59 Comment(0)
