Many algorithms for clustering are available. A popular algorithm is the K-means where, based on a given number of clusters, the algorithm iterates to find best clusters for the objects.
What method do you use to determine the number of clusters in the data in k-means clustering?
Does any package available in R contain the V-fold cross-validation
method for determining the right number of clusters?
Another well used approach is Expectation Maximization (EM) algorithm which assigns a probability distribution to each instance which indicates the probability of it belonging to each of the clusters.
Is this algorithm implemented in R?
If it is, does it have the option to automatically select the optimum number of clusters by cross validation?
Do you prefer some other clustering method instead?
between class distance
and thewithin class distance
. See, for example the method described in paragraphOptimal Number of Clusters
here: sandro.saitta.googlepages.com/… – Westwardly