Training a single map

Here's a better explanation of how the individual maps are trained.

We will have 50 neurons in our map, arranged 10x5. On the left you see the hexagon-based map of the kind I used for the MCWoP talk, on the right the coordinate map. These are two representations of the same structure.

In these maps, the neurons have no values yet. That's why the cells are empty on the left. There is structure, though. Remember that on the left, we said there is no real meaning to the horizontal and vertical axes. What does matter is which cells are next to each other: neurons with adjacent edges are neighbors. Even without values, we know which neurons neighbor which.

The same is true of the map on the right. The neurons are arranged as we asked, in a 10x5 rectangle, although the placement has no meaning here either. The lines tell us who the neighbors are. You can see that each neuron away from the edges has 6 neighbors, just like in the map on the left.
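To make the connection structure concrete, here is a minimal sketch of how a 10x5 hexagonal neighbor table could be built. It assumes neurons are numbered row by row and that odd rows are shifted half a cell to the right (the "odd-r" layout); the layout, the numbering and the name hex_neighbors are illustrative choices, not necessarily what produced the plots.

```python
COLS, ROWS = 10, 5

# (d_col, d_row) steps to the six hexagonal neighbors.
# Assumed "odd-r" layout: odd rows sit half a cell to the right of even rows.
EVEN_ROW = [(1, 0), (-1, 0), (0, -1), (-1, -1), (0, 1), (-1, 1)]
ODD_ROW = [(1, 0), (-1, 0), (1, -1), (0, -1), (1, 1), (0, 1)]


def hex_neighbors(cols, rows):
    """Map each neuron index to the sorted indices of its grid neighbors."""
    neighbors = {}
    for r in range(rows):
        for c in range(cols):
            offsets = ODD_ROW if r % 2 else EVEN_ROW
            neighbors[r * cols + c] = sorted(
                (r + dr) * cols + (c + dc)
                for dc, dr in offsets
                # Drop steps that would leave the rectangle.
                if 0 <= c + dc < cols and 0 <= r + dr < rows
            )
    return neighbors


neighbors = hex_neighbors(COLS, ROWS)
print(len(neighbors[14]))  # a neuron in the middle of the map
print(len(neighbors[0]))   # a corner neuron
```

Nothing in this table depends on what the neurons contain, which is the point: the neighbor relations exist before any values are assigned.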

When we do the training, the neighborhoods are defined by these connections rather than by the values of the neurons. That's why the connections are needed. As in my MCWoP presentation, we'll first initialize the map with random values. Each neuron will hold a two-dimensional value.
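As a rough sketch of that initialization, assuming the random values are drawn uniformly from [0, 1) (the actual range and random generator are not stated here), each of the 50 neurons gets a pair of numbers. The helper grid_position is a hypothetical name; it shows that a neuron's place on the grid comes from its index alone.

```python
import random

COLS, ROWS = 10, 5
rng = random.Random(0)  # fixed seed so the sketch is repeatable

# One two-dimensional value per neuron, drawn uniformly from [0, 1) (assumed range).
weights = [(rng.random(), rng.random()) for _ in range(COLS * ROWS)]


def grid_position(index, cols=COLS):
    """Return the (row, col) cell of a neuron; this depends on the index only."""
    return divmod(index, cols)


# Neurons 0 and 1 sit side by side on the grid whatever their values are.
print(grid_position(0), grid_position(1))
print(weights[0], weights[1])
```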

On the left, the colors represent the value of the first variable. We could show a similar map for the second variable, but there's no need. On the right, the two values are interpreted as coordinates, so the neurons are scattered all around the plot. The lines look crazy because the neurons stay connected to their neighbors, and the neighbors are still defined the way they were in the empty plots. Essentially, neighborhoods are determined by the index of the neuron, not its value. Why? Here's what happens when we present one data point:

On the left, the best matching unit (BMU) was in the lower right, so its color changed, and so did the colors in a neighborhood around it. That neighborhood is something like "within a 4 neuron radius of the BMU". The same thing happened on the right. The best matching unit was found, its value changed, and all of the neurons in its neighborhood changed somewhat too. Most of the neurons were 'pulled' toward that point. The connections themselves never change; the tangle simply starts to unravel as neighboring neurons are pulled together. This way the algorithm doesn't have to calculate the distances between all pairs of neurons at every presentation. It knows how large the neighborhood radius is, meaning "how many connections away", so it's easy to find the right neurons by following the connections that are already there. Presenting the rest of the data:

The left map shows an even distribution of values, and the right map shows the approximate distribution of the data in two dimensions. The data I used is the pentagon data from here.
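The single presentation and the pass over the rest of the data can be sketched as follows. This is only an illustration under several assumptions: an "odd-r" hexagonal layout with its own neighbor table so the block runs on its own, a fixed radius of 4 connections, a fixed learning rate of 0.5, every neuron inside the radius moving by the same fraction, and uniform random points standing in for the pentagon data. The names hops_from and present are made up for this sketch.

```python
import random
from collections import deque

COLS, ROWS = 10, 5
# Assumed "odd-r" hexagonal layout: odd rows sit half a cell to the right.
EVEN_ROW = [(1, 0), (-1, 0), (0, -1), (-1, -1), (0, 1), (-1, 1)]
ODD_ROW = [(1, 0), (-1, 0), (1, -1), (0, -1), (1, 1), (0, 1)]


def hex_neighbors(cols, rows):
    """Map each neuron index to the indices of its grid neighbors."""
    return {
        r * cols + c: [
            (r + dr) * cols + (c + dc)
            for dc, dr in (ODD_ROW if r % 2 else EVEN_ROW)
            if 0 <= c + dc < cols and 0 <= r + dr < rows
        ]
        for r in range(rows)
        for c in range(cols)
    }


def hops_from(start, neighbors, max_hops):
    """Breadth-first search: neurons within max_hops connections of start."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand past the neighborhood radius
        for nxt in neighbors[current]:
            if nxt not in hops:
                hops[nxt] = hops[current] + 1
                queue.append(nxt)
    return hops


def present(point, weights, neighbors, radius=4, rate=0.5):
    """Update weights in place for one data point and return the BMU index."""
    # The BMU is the neuron whose value is closest to the data point.
    bmu = min(
        range(len(weights)),
        key=lambda i: sum((w - x) ** 2 for w, x in zip(weights[i], point)),
    )
    # The neighborhood is found by following connections, not by comparing values.
    for i in hops_from(bmu, neighbors, radius):
        weights[i] = [w + rate * (x - w) for w, x in zip(weights[i], point)]
    return bmu


rng = random.Random(0)
neighbors = hex_neighbors(COLS, ROWS)
weights = [[rng.random(), rng.random()] for _ in range(COLS * ROWS)]

# Stand-in data: uniform points in the unit square, NOT the pentagon data.
data = [[rng.random(), rng.random()] for _ in range(200)]

first_bmu = present(data[0], weights, neighbors)  # one presentation
for point in data[1:]:                            # the rest of the data
    present(point, weights, neighbors)
```

Many SOM implementations shrink the radius and the learning rate as training goes on; both are kept fixed here to keep the sketch short.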