
E. Merényi, ELEC / COMP 502, 3/30/10

Homework 8
Posted March 30, 2010, due April 8, 2010
Maximum available points: 100, 100% = 60 points

Problem 1. (40 points)


Consider the classification problem shown in the figure below.

• Design a Learning Vector Quantization (LVQ, or LVQ1) network to perform this
classification. Show the classification results on the training data by generating a plot
of the different classification regions. You do not need to plot stars, boxes, etc.;
any discriminating representation will do. An example is shown below, where the three
classes are coded with the values 1, 2, and 3. (A minimal sketch of the LVQ1 rule
follows the figure.)

[Figure: example classification-region plot, with the three classes coded as the values 1, 2, and 3; both axes run from -4 to 4.]
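For orientation, here is a minimal sketch of the LVQ1 training rule. It is illustrative
only; the posted lvq1.m is the implementation to use. X (N x 2 inputs), labels (N x 1
class indices), W (prototype vectors), Wclass (their fixed class assignments), the
learning rate, and the epoch count are all assumptions of this sketch.

    eta = 0.05;                                    % assumed initial learning rate
    for epoch = 1:200                              % assumed number of passes
        for n = randperm(size(X,1))                % present inputs in random order
            d = sum(bsxfun(@minus, W, X(n,:)).^2, 2);  % squared distances to prototypes
            [~, k] = min(d);                           % winning prototype
            if Wclass(k) == labels(n)
                W(k,:) = W(k,:) + eta * (X(n,:) - W(k,:));  % correct class: pull closer
            else
                W(k,:) = W(k,:) - eta * (X(n,:) - W(k,:));  % wrong class: push away
            end
        end
        eta = 0.99 * eta;                          % slowly decay the learning rate
    end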

a) Test the classification boundaries placed by the network in the following way (a sketch of these steps follows the list):
• Generate the test input points on a finer grid than the one shown in the figure.
• Present the test input points to the network and let the network classify the test data.
• Generate a plot showing the different classification regions.
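One possible realization of these steps (the grid range and step size are assumptions
chosen to match the figure; W and Wclass are the converged prototypes and their classes,
as in the sketch above):

    step = 0.1;                                % finer than the training grid
    [gx, gy] = meshgrid(-4:step:4, -4:step:4);
    Xtest = [gx(:) gy(:)];
    ytest = zeros(size(Xtest,1), 1);
    for n = 1:size(Xtest,1)
        d = sum(bsxfun(@minus, W, Xtest(n,:)).^2, 2);
        [~, k] = min(d);
        ytest(n) = Wclass(k);                  % class of the nearest prototype
    end
    imagesc(-4:step:4, -4:step:4, reshape(ytest, size(gx)));
    axis xy; colorbar;                         % regions appear as values 1, 2, 3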

You can use the MATLAB implementation of the LVQ1 code (lvq1.m), posted in the
data directory http://argus.ece.rice.edu/ANNclass502/. In the same folder there is also
another script, LVQ-misc.m, which helps generate the target data and classify data into
the target categories using the converged weights of the LVQ1 net.

Problem 2. (60 points)

Use the 10 x 10 Kohonen SOM that you developed in HW7 to cluster the iris data. Use both the
iris-train.txt and iris-test.txt data sets, since here you will not use the class labels for training the
SOM; you will only use the 4-dimensional input vectors. (Read in the class labels and store them,
however, for subsequent verification as described below.) Monitor the progress of the network
and, when it is stable, identify clusters.
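One way to read and combine the data (this sketch assumes each file is plain numeric text
with the four inputs in columns 1-4 and the class label in column 5; check the actual
layout of iris-train.txt and iris-test.txt):

    train = load('iris-train.txt');
    test  = load('iris-test.txt');
    data  = [train; test];                 % all samples are used for training
    X      = data(:, 1:4);                 % 4-D inputs: the only training information
    labels = data(:, 5);                   % stored only for the verification below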

This is a 4-D data set, so looking at the weights in the input space is not possible. To see the
progression and the cluster formation, you need to represent the differences of the weights on the
SOM grid. You can do that in the way we showed in class: compute the Euclidean distance of
each pair of weight vectors that belong to adjacent SOM PEs, and visualize that distance as a
“fence” between the two adjacent PEs, with the shade of the fence proportional to the distance
value. (For example, using a grey scale, black can mean “high fence”, i.e., a large distance
between the weights, and white can mean “low fence”, i.e., zero distance, with the other
distances scaled proportionally to grey shades.)
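A possible sketch of this fence computation and display, assuming the converged SOM
weights are stored in a 10 x 10 x 4 array named W (the array name and layout are
assumptions of this sketch, not of the HW7 code):

    [R, C, D] = size(W);                  % 10 x 10 grid of 4-D weight vectors
    U = zeros(2*R-1, 2*C-1);              % PEs at odd positions, fences between them
    for i = 1:R
        for j = 1:C
            if j < C                      % fence to the right-hand neighbor
                U(2*i-1, 2*j) = norm(squeeze(W(i,j,:) - W(i,j+1,:)));
            end
            if i < R                      % fence to the neighbor below
                U(2*i, 2*j-1) = norm(squeeze(W(i,j,:) - W(i+1,j,:)));
            end
        end
    end
    imagesc(U); axis image;
    colormap(flipud(gray));               % dark = high fence, light = low fence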
By plotting this representation, you can watch the progression of the SOM, including the
formation of “fenced-off” areas of PEs, which would indicate clusters. You can also show the
density information, layered with the fences, for more complete knowledge of what the SOM
learned. Once the SOM seems stabilized, try to determine the clusters in the SOM by visual
inspection. (To that end, and for this HW, you may simply circle by hand where you see clusters.)

After this, superimpose the class labels (our external knowledge) on the SOM cells; i.e., find a
way to show which samples (from which of the three classes) are mapped to each cell. You can do
this in several ways. For example, assign colors to the three classes and color each SOM cell with
the color of the majority of the samples in that cell, or plot a colored density histogram of the
samples in each cell. (I showed you examples of both of these in class; they are in the copy of my
SOM slides.)
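For the majority-color option, one possible sketch (bmu_row and bmu_col are hypothetical
vectors holding each sample's winning cell coordinates, found with the same nearest-weight
search used during training; R and C are the grid dimensions as above):

    cellClass = zeros(R, C);                   % 0 marks an empty cell
    for i = 1:R
        for j = 1:C
            hits = labels(bmu_row == i & bmu_col == j);
            if ~isempty(hits)
                cellClass(i,j) = mode(hits);   % majority class among the samples
            end
        end
    end
    imagesc(cellClass); axis image; caxis([0 3]);
    colormap([1 1 1; 1 0 0; 0 1 0; 0 0 1]);    % white = empty, one color per class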

Now you can see what the SOM tells you about the iris data set, and may get an explanation of
why those few samples were usually misclassified when we did supervised classification on this
data set. Tell me what you see.
