v2 = [0, 0, 1, 1, 0, 1]
are the voting records on bills one and two, then the relevant matrix is found
by first centering each vector around its mean by subtracting the average:
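\[
w_1 = [1, 0, 0, 0, 0, 0] - \left[\tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6}\right]
    = \left[\tfrac{5}{6}, -\tfrac{1}{6}, -\tfrac{1}{6}, -\tfrac{1}{6}, -\tfrac{1}{6}, -\tfrac{1}{6}\right]
\]
and
\[
w_2 = [0, 0, 1, 1, 0, 1] - \left[\tfrac{1}{2}, \tfrac{1}{2}, \tfrac{1}{2}, \tfrac{1}{2}, \tfrac{1}{2}, \tfrac{1}{2}\right]
    = \left[-\tfrac{1}{2}, -\tfrac{1}{2}, \tfrac{1}{2}, \tfrac{1}{2}, -\tfrac{1}{2}, \tfrac{1}{2}\right].
\]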
The covariance matrix is obtained by multiplying the matrix A whose
rows are w1 and w2 by its transpose (i.e. the matrix whose columns are w1
and w2). Here we obtain
\[
M = AA^{T} = \begin{bmatrix} \tfrac{5}{6} & -\tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{3}{2} \end{bmatrix}.
\]
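The centering step and the product AA^T can be reproduced exactly with rational arithmetic. The following is a minimal sketch, not part of the original computation; the helper names `center` and `dot` are ours, and, following the text, the product is not divided by the number of legislators.

```python
from fractions import Fraction

def center(votes):
    """Subtract the mean vote from every entry, keeping exact fractions."""
    mean = Fraction(sum(votes), len(votes))
    return [Fraction(v) - mean for v in votes]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Voting records from the text (1 = yea, 0 = nay).
v1 = [1, 0, 0, 0, 0, 0]
v2 = [0, 0, 1, 1, 0, 1]

# Rows of A are the centered vectors w1 and w2.
A = [center(v1), center(v2)]

# M = A A^T: entry (i, j) is the dot product of rows i and j of A.
M = [[dot(ri, rj) for rj in A] for ri in A]

for row in M:
    print([str(entry) for entry in row])
```

The diagonal entries come out as 5/6 and 3/2 and the off-diagonal entries as -1/2, in agreement with the variances and covariance discussed in the text.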
We interpret the diagonal entries as the variance of votes on each bill: there
was much more variance, i.e. less agreement, on the second bill (3/2) than
on the first (5/6). The off-diagonal entry is the covariance of these two bills:
roughly, whether voting for one was correlated more with voting for the
other or with voting against it. Here the negative value (-1/2) indicates that voting
for the Kansas-loot bill was correlated with voting against
abortion rights, which makes sense in the data since the only legislator to
vote for the Kansas bill also voted against abortion rights.
Intuitively, there was little disagreement over the Kansas bloat bill (five
of six legislators were against it), while there was a lot of disagreement on
the abortion rights bill (a 50/50 split). So if we want to explain the differences
between legislators, their vote on abortion rights should be a much more
significant indicator than their vote on the Kansas bill, on which they mostly
agreed.
To quantify this, let's find the principal component of M. Its eigenvalue/eigenvector pairs are (using technology)
\[
\lambda_1 \approx 1.77,\quad e_1 = \begin{bmatrix} -0.535 \\ 1 \end{bmatrix},
\qquad
\lambda_2 \approx 0.57,\quad e_2 = \begin{bmatrix} 1.87 \\ 1 \end{bmatrix}.
\]
Since the former has the larger eigenvalue, we say that e1 = (-0.535, 1) is
the principal component of the covariance. Projecting each legislator's (centered) data point onto the e1 direction will then give us a one-dimensional
viewpoint of their differences.
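These eigenpairs can be checked without special software, because a symmetric 2x2 matrix has a closed-form solution. The sketch below assumes M has diagonal entries 5/6 and 3/2 and off-diagonal entry -1/2, as computed from w1 and w2; the function name `eigenpairs_sym2` is ours, and the eigenvectors are scaled so that their second entry is 1.

```python
import math

def eigenpairs_sym2(a, b, d):
    """Eigenpairs of the symmetric matrix [[a, b], [b, d]], largest eigenvalue first.

    Assumes b != 0. Each eigenvector is returned as (x, 1).
    """
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    pairs = []
    for lam in (mean + radius, mean - radius):
        # First row of (M - lam*I) v = 0 with v = (x, 1): (a - lam) * x + b = 0.
        x = -b / (a - lam)
        pairs.append((lam, (x, 1.0)))
    return pairs

(lam1, e1), (lam2, e2) = eigenpairs_sym2(5 / 6, -1 / 2, 3 / 2)

print(f"lambda1 = {lam1:.3f}, e1 = ({e1[0]:.3f}, {e1[1]:.0f})")
print(f"lambda2 = {lam2:.3f}, e2 = ({e2[0]:.3f}, {e2[1]:.0f})")
```

Because the covariance is negative, the two entries of e1 have opposite signs. An eigenvector is only determined up to a nonzero multiple, so (0.535, -1) describes the same principal direction.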
Graphically: if each legislator's votes on these two bills (after subtracting
the average vote to center them) were plotted as a point in the xy-plane, they
would appear as the dark blue points in Figure 1.
If (x', y') are the coordinates of each point with respect to the basis {e1, e2}
of eigenvectors of M, then any data set whose total variance is equal to 1
(for example) and whose covariances behave identically to our data set
will lie within the circle x'^2 + y'^2 = 1. This can be plotted in the xy-plane
using a change of coordinates to have the equation
\[
\begin{bmatrix} x & y \end{bmatrix} M^{-1} \begin{bmatrix} x \\ y \end{bmatrix} = 1 \tag{1}
\]
\[
\tfrac{3}{2}x^{2} + xy + \tfrac{5}{6}y^{2} = 1. \tag{2}
\]
This ellipse is plotted in blue in Figure 1, and represents the maximum
amount of variability in each direction, given a data set with the same covariance matrix as ours. In other words, if these 6 legislators are representative of all 535 legislators in the U.S. Congress, and we plotted the data
points for all 535, the image we get would be the same as Figure 1.
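The step from Equation (1) to Equation (2) can be verified by hand. The following worked computation assumes M has diagonal entries 5/6 and 3/2 and off-diagonal entry -1/2, as obtained from w1 and w2.

```latex
\[
\det M = \tfrac{5}{6}\cdot\tfrac{3}{2} - \left(-\tfrac{1}{2}\right)^{2}
       = \tfrac{5}{4} - \tfrac{1}{4} = 1,
\qquad
M^{-1} = \frac{1}{\det M}
         \begin{bmatrix} \tfrac{3}{2} & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{5}{6} \end{bmatrix}
       = \begin{bmatrix} \tfrac{3}{2} & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{5}{6} \end{bmatrix},
\]
\[
\begin{bmatrix} x & y \end{bmatrix} M^{-1} \begin{bmatrix} x \\ y \end{bmatrix}
  = \tfrac{3}{2}x^{2} + 2\cdot\tfrac{1}{2}\,xy + \tfrac{5}{6}y^{2}
  = \tfrac{3}{2}x^{2} + xy + \tfrac{5}{6}y^{2}.
\]
```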
Meanwhile, the eigenvectors e1 , e2 of M are the principal axes of the quadratic form in Equation (2), with the e1 axis being the principal component
of M since its eigenvalue is larger. Geometrically, this corresponds to the
semimajor axis, or the largest diameter of the ellipse. Intuitively, if we view
the ellipse from a viewing angle that is perpendicular to this direction, the
ellipse will appear as wide as possible.
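The claim that e1 gives the longest diameter can be checked numerically. For an ellipse of the form in Equation (1), the half-length of the axis along an eigenvector of M with eigenvalue lambda is sqrt(lambda). The sketch below, with variable names of our own choosing, places a point at that distance along each eigenvector and confirms that it lies on the curve of Equation (2).

```python
import math

# Entries of M = [[a, b], [b, d]] as computed from w1 and w2.
a, b, d = 5 / 6, -1 / 2, 3 / 2

def ellipse_lhs(x, y):
    """Left-hand side of Equation (2)."""
    return 1.5 * x * x + x * y + (5 / 6) * y * y

mean = (a + d) / 2
radius = math.hypot((a - d) / 2, b)

axes = []  # (eigenvalue, semi-axis length, value of Equation (2) at the axis endpoint)
for lam in (mean + radius, mean - radius):
    ex, ey = -b / (a - lam), 1.0           # eigenvector with second entry 1
    norm = math.hypot(ex, ey)
    half_length = math.sqrt(lam)           # semi-axis length along this eigenvector
    px, py = half_length * ex / norm, half_length * ey / norm
    axes.append((lam, half_length, ellipse_lhs(px, py)))

for lam, half_length, value in axes:
    print(f"eigenvalue {lam:.3f}: semi-axis {half_length:.3f}, Eq. (2) gives {value:.6f}")
```

The larger eigenvalue produces the longer semi-axis, which is why the principal component is the direction of greatest width.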
But, viewing the ellipse and the data points in this fashion also visualizes
the projection of each data point onto this principal axis. The coordinate
obtained in this projection is called the principal factor score of each data
point, and these may be arranged along the principal axis as shown in Figure
1. We have therefore reduced the dimension of our data set from 2 to 1,
replacing each legislator's actual vote total by their factor score (coordinate
along the principal axis).
Because we have selected the maximal diameter of the ellipse to visualize, we realize the maximal spread along the axis between distinct data
points in the set. We know, for instance, that in the original 2-dimensional
data, Kansas and New York, whose votes differed on both bills, are much
further apart than Kansas and Louisiana, who agreed on the
abortion rights bill. This difference from the original data set is maximally
preserved in the one-dimensional factor score.
We then interpret this factor score as follows. The principal direction e1
is more closely aligned (makes a narrower angle) with the y-axis than with
the x-axis, meaning that the vote on the abortion-rights bill weighs more
heavily in the factor score, and hence explains more of the variance between
legislators, than the vote on the Kansas bill. But the vote on the Kansas
bill is not completely omitted, since the principal direction still contains a
nonzero component in the x-direction. This presumably allows the factor
score to differentiate the legislators from Louisiana and Ohio, who
both voted with Kansas on the abortion-rights bill, from the legislator from
Kansas, who departed from them on the Kansas bill.
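The two angles can be made concrete. Taking the components of e1 to have magnitudes 0.535 (x-direction) and 1 (y-direction), and writing theta_y and theta_x (our notation) for the angles between the principal direction and the y- and x-axes:

```latex
\[
\tan\theta_y = \frac{0.535}{1} = 0.535
\;\Longrightarrow\; \theta_y \approx 28^\circ,
\qquad
\theta_x = 90^\circ - \theta_y \approx 62^\circ.
\]
```

So the principal direction sits roughly 28 degrees from the abortion-rights axis and roughly 62 degrees from the Kansas-bill axis.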
The factor score for each legislator is then a single number that we may
interpret as the amount to which this legislator combines opposition to
abortion rights (in large measure) with support for Kansas (in smaller measure). The Kansas legislator has the highest combination of these characteristics and so the highest factor score of all six; Louisiana and Ohio score
slightly lower because they did not go along on the Kansas bill; and the
other three states score much lower (negative, in fact), since they voted in
opposition to Kansas and in support of abortion rights.
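The factor scores themselves can be computed by projecting each centered data point onto a unit vector along the principal direction. The sketch below rests on two assumptions of ours: the six legislators are listed in the order KS, LA, MD, NY, OH, PA (consistent with the votes described in the text), and the principal direction is oriented so that Kansas receives a positive score, since an eigenvector is only determined up to sign.

```python
import math

# Assumed ordering of the six legislators (not stated explicitly in the text).
states = ["KS", "LA", "MD", "NY", "OH", "PA"]
v1 = [1, 0, 0, 0, 0, 0]  # votes on the Kansas bill
v2 = [0, 0, 1, 1, 0, 1]  # votes on the abortion rights bill

def center(votes):
    """Subtract the mean vote from every entry."""
    mean = sum(votes) / len(votes)
    return [v - mean for v in votes]

w1, w2 = center(v1), center(v2)

# Entries of M = [[a, b], [b, d]].
a = sum(x * x for x in w1)
b = sum(x * y for x, y in zip(w1, w2))
d = sum(y * y for y in w2)

# Principal direction: eigenvector of the larger eigenvalue, scaled to unit length.
lam1 = (a + d) / 2 + math.hypot((a - d) / 2, b)
ex, ey = -b / (a - lam1), 1.0
norm = math.hypot(ex, ey)
ux, uy = ex / norm, ey / norm

# Orientation choice (an assumption): make the Kansas legislator's score positive.
if w1[0] * ux + w2[0] * uy < 0:
    ux, uy = -ux, -uy

# Factor score = coordinate of each centered point along the principal direction.
scores = {s: x * ux + y * uy for s, x, y in zip(states, w1, w2)}

for state, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{state}: {score:+.3f}")
```

With this orientation the ordering matches the interpretation above: Kansas is highest, Louisiana and Ohio share a smaller positive score, and Maryland, New York, and Pennsylvania share a negative one.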
3. NOMINATE
Scaling this method up, NOMINATE: (a) combines votes on a much
larger number of issues, so uses a much larger N than we did here; and
(b) instead of selecting only the one principal component, selects the top
two principal components so as to plot each legislator on a two-dimensional
plane using two factor scores instead of one.
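To illustrate the two-component idea on a small scale, the sketch below adds one hypothetical third bill (invented here purely for illustration) to the two bills from the text and extracts the top two principal components by power iteration with deflation. This is a swapped-in textbook technique chosen to keep the example dependency-free; it is not a description of how the NOMINATE software works, and all function names are ours.

```python
import math

def center(votes):
    """Subtract the mean vote from every entry."""
    mean = sum(votes) / len(votes)
    return [v - mean for v in votes]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(M, v):
    return [dot(row, v) for row in M]

def top_eigenpair(M, iterations=200):
    """Power iteration for the largest eigenvalue of a symmetric positive semidefinite matrix."""
    v = [1.0] * len(M)
    for _ in range(iterations):
        w = matvec(M, v)
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    return dot(v, matvec(M, v)), v  # Rayleigh quotient and unit eigenvector

def top_two_components(M):
    lam1, u1 = top_eigenpair(M)
    n = len(M)
    # Deflation: remove the first component so the second becomes dominant.
    deflated = [[M[i][j] - lam1 * u1[i] * u1[j] for j in range(n)] for i in range(n)]
    lam2, u2 = top_eigenpair(deflated)
    return (lam1, u1), (lam2, u2)

votes = [
    [1, 0, 0, 0, 0, 0],  # Kansas bill (from the text)
    [0, 0, 1, 1, 0, 1],  # abortion rights bill (from the text)
    [1, 1, 0, 1, 0, 0],  # hypothetical third bill, invented for illustration
]
A = [center(row) for row in votes]
M = [[dot(ri, rj) for rj in A] for ri in A]

(lam1, u1), (lam2, u2) = top_two_components(M)

# Two factor scores per legislator: coordinates along u1 and u2.
scores = [(dot(col, u1), dot(col, u2)) for col in zip(*A)]
for pair in scores:
    print(f"({pair[0]:+.3f}, {pair[1]:+.3f})")
```

Each legislator is now a point in a plane rather than on a line. With many more bills the matrix M grows accordingly, but the selection of the two leading directions proceeds in the same way.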
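[Figure 1. Centered votes of the six legislators, shown as points labeled KS, LA/OH, and MD/NY/PA, with axes running from Nay to Yea (the labeled axis is "(Centered) Vote on Abortion Rights Bill") and the principal direction drawn in. The factor scores are arranged along the principal axis, which runs from "More opposition to Kansas bill & support for abortion rights bill" (MD/NY/PA) to "More support for Kansas bill & opposition to abortion rights bill" (LA/OH, then KS).]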