Convergence criterion for (batch) SOM (Self-Organizing Map, aka "Kohonen Map")?

I'd like to stop execution when my batch SOM has converged. What error function can I use to determine convergence?

Sidero answered 31/3, 2010 at 23:54 Comment(0)

When talking about convergence for SOMs, for a given map size (n x m), you want to know whether sufficient iterations of the algorithm have run to ensure the map is "stable". Loosely speaking: if the map is retrained many times, do new inputs (observations) get placed at the same neurons/codebook vectors? (This ignores the fact that the arrangement of the map may switch around each time it is trained, which is fine as long as the clusters are still arranged in a stable way.)
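
The retraining check described above can be sketched in a few lines. This is my own minimal illustration, not code from the papers below: the function names and the pair-agreement measure are assumptions. Comparing *pairs* of inputs (do they land on the same neuron or not?) sidesteps the problem of the map's arrangement switching between runs.

```python
import numpy as np

def bmu_indices(X, codebook):
    # Best-matching unit (nearest codebook vector) for each input row.
    d = np.linalg.norm(X[:, None, :] - codebook[None, :, :], axis=2)
    return d.argmin(axis=1)

def pairwise_stability(X, codebook_a, codebook_b):
    # Compare two independently trained maps without worrying about the
    # arrangement switching around: check, for each pair of inputs, whether
    # the "same neuron / different neuron" relation agrees between the maps.
    a = bmu_indices(X, codebook_a)
    b = bmu_indices(X, codebook_b)
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    return (same_a == same_b).mean()  # 1.0 = perfectly stable clustering
```

A score near 1.0 across several retrainings suggests the map has run for enough iterations; a score that keeps fluctuating suggests it has not.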

To help answer the question of whether enough iterations have run, see the academic papers listed below. Both also touch on the issue of what map size is appropriate (which n x m values help ensure convergence of the SOM).

One of the traditional approaches that has been popular in papers is given here:

Statistical tools to assess the reliability of self-organizing maps (de Bodt, Cottrell, Verleysen)

More recently, this method has appeared, and it looks rather promising:

A convergence criterion for self-organizing maps, master's thesis, Benjamin H. Ott (University of Rhode Island)

This thesis, in my opinion, was really well written and a pleasure to read. What's also nice is that this research has been implemented as a SOM convergence test in a (rather little-known) R package called popsom. Check it out:

popsom

Swage answered 26/6, 2014 at 23:33 Comment(0)

I'm pretty sure you mean cost function rather than error function.

SOM does not require an error function (nor a cost function).

At the top level of the Machine Learning taxonomy, SOM is an unsupervised learning technique--no target vector, and therefore no "target-vector" minus "value_at_the_current_iteration" to minimize.

Another way to think of it: The role of a cost function is to minimize some cost; in ML, it's the delta between model calculation and supplied data. In SOM, no data is supplied to the algorithm for this purpose.

(I realize that this is somewhat confusing because the input data from which the network is created is often referred to as "training data"--probably because that's the role of input data in supervised ML techniques, which are far more common than unsupervised ones. It's probably also confusing because Teuvo Kohonen, the person credited with 'inventing' SOM, originally referred to them as a class of neural networks--and of course the typical NN is a supervised technique that does rely on a cost function, usually minimized via gradient descent.)

Finally, just to make sure, I checked my own SOM code as well as the code from the ML textbook by Marsland, "Machine Learning: An Algorithmic Perspective". In both my code and his, the only stopping criterion is whatever value for "maximum iterations" the user passes in when calling the main function.
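
To make that concrete, here is a minimal batch-SOM training loop with exactly that fixed-iteration stopping criterion. This is a sketch of my own, not Marsland's code; the Gaussian neighborhood function and the exponential decay schedule are illustrative assumptions.

```python
import numpy as np

def batch_som(X, n, m, max_iters=50, sigma0=1.0, rng=None):
    # Minimal batch SOM on an n x m grid; stops after max_iters, full stop.
    rng = np.random.default_rng(rng)
    grid = np.array([(i, j) for i in range(n) for j in range(m)], dtype=float)
    codebook = X[rng.choice(len(X), n * m, replace=False)].astype(float)
    for t in range(max_iters):                    # the only stopping criterion
        sigma = sigma0 * np.exp(-t / max_iters)   # shrinking neighborhood width
        d = np.linalg.norm(X[:, None, :] - codebook[None, :, :], axis=2)
        bmu = d.argmin(axis=1)                    # best-matching unit per input
        # Neighborhood weight of every unit relative to each input's BMU.
        gdist = np.linalg.norm(grid[bmu][:, None, :] - grid[None, :, :], axis=2)
        h = np.exp(-gdist**2 / (2 * sigma**2))    # shape (n_inputs, n_units)
        # Batch update: each codebook vector becomes a weighted mean of inputs.
        codebook = (h.T @ X) / h.sum(axis=0)[:, None]
    return codebook
```

Note that nothing inside the loop measures convergence; as described above, the loop simply runs for the requested number of iterations.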

Rhnegative answered 1/4, 2010 at 1:3 Comment(3)
Thanks very much. I wonder if you know of any good reference for a batch SOM implementation. I can find many serial SOM implementations on the net, but no batch one. In my implementation, the average cost (distance between training data and neurons) decreases, but the cost increases for some neurons... The code must be missing some piece...Sidero
here's one in C++ (haven't looked at it): cis.hut.fi/research/som_lvq_pak.shtml; and here's a Python implementation (www-ist.massey.ac.nz/smarsland/MLBook.html); the link is to the home page for the ML textbook by Stephen Marsland; on that page, he posts all of the code used in the book (scroll to ch. 9 and you'll see the SOM code).Rhnegative
There is also nicely written C code for SOMs in the R open source packages class (basic version), kohonen, popsom ...Swage
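
The "average cost (distance between training data and neurons)" mentioned in the comments is usually called the quantization error. It is not a cost function the SOM algorithm itself minimizes, as the answer above explains, but it is a common heuristic to monitor per iteration. A minimal sketch, assuming `X` holds the inputs and `codebook` the neuron weight vectors:

```python
import numpy as np

def quantization_error(X, codebook):
    # Mean distance from each input to its best-matching (nearest) neuron;
    # the "average cost" referred to in the comments.
    d = np.linalg.norm(X[:, None, :] - codebook[None, :, :], axis=2)
    return d.min(axis=1).mean()
```

Note that this averages over the *nearest* neuron for each input, so the per-neuron cost rising for some neurons (as in the comment) can coexist with a falling average.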
