How do I make a U-matrix?

Asked 29/11, 2012 at 17:48 Answered 6/2, 2014 at 16:34

Solved machine-learning neural-network som self-organizing-maps

How exactly is a U-matrix constructed in order to visualise a self-organizing map? More specifically, suppose that I have an output grid of 3x3 nodes (that have already been trained); how do I construct a U-matrix from this? You can, for example, assume that the neurons (and inputs) have dimension 4.

I have found several resources on the web, but they are not clear or they are contradictory. For example, the original paper is full of typos.

Duplet answered 29/11, 2012 at 17:48 Comment(0)

A U-matrix is a visual representation of the distances between neurons in the input data space. That is, you calculate the distance between adjacent neurons using their trained vectors. If your input dimension is 4, then each neuron in the trained map also corresponds to a 4-dimensional vector. Let's say you have a 3x3 hexagonal map.

[Figure: map lattice]

The U-matrix will be a 5x5 matrix with interpolated elements for each connection between two neurons, like this:

[Figure: u-mat lattice]

The {x,y} elements are the distances between neurons x and y, and the values in the {x} elements are the mean of the surrounding values. For example, {4,5} = distance(4,5) and {4} = mean({1,4}, {2,4}, {4,5}, {4,7}). To calculate the distance, you use the trained 4-dimensional vector of each neuron and the same distance formula that you used for training the map (usually Euclidean distance). So the values of the U-matrix are plain numbers (not vectors). You can then assign a light gray colour to the largest of these values, a dark gray to the smallest, and corresponding shades of gray to the values in between. Use these colours to paint the cells of the U-matrix and you have a visual representation of the distances between neurons.
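The construction above can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the method of any particular toolbox: it assumes a rectangular, 4-connected grid rather than the hexagonal one in the figures (a hexagonal map has additional diagonal connections), and the names `u_matrix_rect` and `to_gray` are hypothetical. On a rectangular grid the cells that sit between four neurons have no direct connection to represent, so they are filled here with the mean of the surrounding distance cells, which is just one possible convention.

```python
import numpy as np

def u_matrix_rect(weights):
    """U-matrix for a rectangular, 4-connected SOM (an assumed topology).

    weights: array of shape (rows, cols, dim) with the trained vectors.
    Returns an array of shape (2*rows - 1, 2*cols - 1).
    """
    rows, cols, _ = weights.shape
    u = np.zeros((2 * rows - 1, 2 * cols - 1))

    # {x,y} cells between horizontal neighbours (Euclidean distance).
    u[0::2, 1::2] = np.linalg.norm(weights[:, 1:] - weights[:, :-1], axis=2)
    # {x,y} cells between vertical neighbours.
    u[1::2, 0::2] = np.linalg.norm(weights[1:] - weights[:-1], axis=2)

    # Remaining cells: the {x} cells (on the neurons) and the cells between
    # four neurons. Each becomes the mean of the distance cells around it.
    height, width = u.shape
    for i in range(height):
        for j in range(width):
            if (i + j) % 2 == 0:
                around = [u[a, b]
                          for a, b in ((i - 1, j), (i + 1, j),
                                       (i, j - 1), (i, j + 1))
                          if 0 <= a < height and 0 <= b < width]
                u[i, j] = np.mean(around) if around else 0.0
    return u

def to_gray(u):
    """Scale to [0, 1]: smallest value -> 0 (dark), largest -> 1 (light)."""
    span = u.max() - u.min()
    return (u - u.min()) / span if span > 0 else np.zeros_like(u)

# Random vectors stand in for a trained 3x3 map with 4-dimensional neurons.
rng = np.random.default_rng(0)
weights = rng.random((3, 3, 4))
u = u_matrix_rect(weights)
gray = to_gray(u)
print(u.shape)  # (5, 5)
```

The `gray` array can be passed to any image-plotting routine with a grayscale colour map to paint the cells.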

Also have a look at this web article.

Sybarite answered 30/11, 2012 at 9:26 Comment(10)

+1 great explanation. Alternatively, only the average values of the inter-node distances are shown (i.e. only the {x} elements are visualized). I think this was already mentioned in one of the posts linked above (although in less detail) – Saltant 30/11, 2012 at 16:32

+1000 ... Please do the human race a favor and publish a paper/blog post with this, because it has been harrowing trying to get a proper explanation of it. Now for some follow-up questions: 1) As @Saltant has mentioned, the alternative is to just visualize the {x} elements and leave out the inter-node distances you describe. What is the advantage of one over the other? 2) I know that this German author, Ultsch, created the U-matrix, but why does one include the inter-node distances as described here? I mean, what is the reasoning behind it? 3) How did you make this diagram on the fly? Thanks so much! – Duplet 30/11, 2012 at 17:21

@Learnaholic: to me both conventions serve a very similar purpose: to visualize the clusters covered by the SOM nodes using a low-dimensional mapping of the original features. You would expect to see strongly connected areas/zones (small inter-node distances) separated by regions of weak connections (large distances). There are many other possible visualizations; this page lists a few. – Saltant 30/11, 2012 at 17:35

@Saltant One point: could you please update the answer to add the case where the grid is square? For example, are my diagonal neighbors in a square grid considered 'neighbors'? In other words, what would {4} be in the square case? Would it be mean({4,1}, {4,2}, {4,5}, {4,7}, {4,8})? Thanks. (I am asking about the square case because I have to make this in MATLAB and I do not think I can do hexagons.) – Duplet 30/11, 2012 at 18:10

@Learnaholic: it depends on your topology. You could have the 2D lattice 4-connected (up/down/left/right) or 8-connected (in all eight directions); the latter is often used. By the way, a hexagonal layout can be viewed as a regular grid with every other column shifted by half a unit (see this post) – Saltant 30/11, 2012 at 18:17

@Saltant Thanks so much for your help. That link was very useful. Of course, the display in MATLAB would still be as a rectangular grid, no? – Duplet 30/11, 2012 at 18:33

@Learnaholic: you could always draw your own polygons using the PATCH function, but I'll leave that to you :) Also, why not look at how SOM Toolbox implements the drawing part? It is released under the GPL license. – Saltant 30/11, 2012 at 18:38

@Learnaholic: Skipping the inter-node distances is of course correct, but it also lowers the representational strength of the diagram. The U-matrix with inter-node distances can more easily reveal the underlying structure of the input data (which is its purpose, as Amro correctly pointed out). I did the diagrams with MS Visio ;) but I heavily use the SOM Toolbox mentioned by Amro. If you are working on SOMs with MATLAB, it is definitely a must. – Sybarite 1/12, 2012 at 9:49

@Sybarite Thank you very much, I have learned a lot from you. :-) – Duplet 1/12, 2012 at 17:8

@Sybarite Thanks again. I would appreciate any insights you might have on this matter here, since you seem to be well versed in this. – Duplet 3/12, 2012 at 16:13

The original paper cited in the question states:

A naive application of Kohonen's algorithm, although preserving the topology of the input data is not able to show clusters inherent in the input data.

Firstly, that's true; secondly, it is a deep misunderstanding of the SOM; thirdly, it is also a misunderstanding of the purpose of calculating the SOM.

Just take the RGB color space as an example: are there 3 colors (RGB), or 6 (RGBCMY), or 8 (+BW), or more? How would you define that independently of the purpose, i.e. as inherent in the data itself?

My recommendation would be not to use maximum likelihood estimators of cluster boundaries at all (not even such primitive ones as the U-matrix), because the underlying argument is already flawed. No matter which method you then use to determine the clusters, you would inherit that flaw. More precisely, the determination of cluster boundaries is not interesting at all, and it loses information regarding the true intention of building a SOM. So, why do we build SOMs from data? Let us start with some basics:

1. Any SOM is a representative model of a data space, because it reduces the dimensionality of the latter. Because it is a model, it can be used as a diagnostic as well as a predictive tool. Yet neither use is justified by some universal objectivity. Instead, models are deeply dependent on the purpose and on the accepted associated risk of errors.
2. Let us assume for a moment that the U-matrix (or similar) would be reasonable. So we determine some clusters on the map. The issue is not only how to justify the criterion for it (outside of the purpose itself); it is also problematic because any further calculation destroys some information (it is a model about a model).
3. The only interesting thing about a SOM is the accuracy itself, i.e. the classification error, not some estimation of it. Thus, the assessment of the model in terms of validation and robustness is the only thing that is interesting.
4. Any prediction has a purpose, and the acceptance of the prediction is a function of the accuracy, which in turn can be expressed by the classification error. Note that the classification error can be determined for 2-class models as well as for multi-class models. If you don't have a purpose, you should not do anything with your data.
5. Conversely, the concept of "number of clusters" is completely dependent on the criterion "allowed divergence within clusters", so it masks the most important aspect of the structure of the data. It is also dependent on the risk and the risk structure (in terms of type I/II errors) you are willing to accept.
6. So, how could we determine the number of classes on a SOM? If there is no external a priori reasoning available, the only feasible way would be an a posteriori check of the goodness of fit. On a given SOM, impose different numbers of classes and measure the deviations in terms of misclassification cost, then choose (subjectively) the most pleasing one (using some fancy heuristics, like Occam's razor).

Taken together, the U-matrix pretends to an objectivity where none can exist. It is a serious misunderstanding of modeling altogether. IMHO it is one of the greatest advantages of the SOM that all the parameters implied by it are accessible and open to being parameterized. Approaches like the U-matrix destroy just that, by disregarding this transparency and closing it off again with opaque statistical reasoning.

Thessalonians answered 6/2, 2014 at 16:34 Comment(2)

Hello monnoo, thanks for your input; however, I am not entirely following what you are saying. I have already used the U-matrix to cluster via SOM successfully. Perhaps I do not understand what you mean. Thanks. – Duplet 12/2, 2014 at 16:30

@monnoo, @Learnaholic, may I suggest that you illustrate your explanation with one or more examples, probably using code and figures? This will make it clearer. – Liberia 16/5, 2016 at 10:33
