Pre-trained embeddings from US Senate voting records
I am using GovTrack (Congressional web service) to bulk download all US Senate and House of Representatives votes for the past 10 sessions of Congress. Each voting record is stored in a JSON file.
Each Senator votes Yea or Nay (or abstains) on each bill or vote within each session of Congress, and we assign the values +1 and -1 (and 0) respectively. We collect all these values into a “Voting Matrix.”
Printout of Voting Matrix (+1, 0, -1 represent Yea, N/A, Nay respectively)
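As a rough illustration of the encoding described above, here is a minimal sketch that turns a handful of vote records into a Voting Matrix. The record layout, the senator names, and the `build_voting_matrix` helper are all made up for this example; they are not the GovTrack JSON schema, which you would need to parse into a similar shape first.

```python
import numpy as np

# Hypothetical, simplified records: one dict per roll-call vote, mapping a
# senator's name to the position they took. NOT the real GovTrack schema.
votes = [
    {"Smith": "Yea", "Jones": "Nay", "Lee": "Yea"},
    {"Smith": "Nay", "Jones": "Nay"},  # Lee has no recorded position
    {"Smith": "Yea", "Jones": "Not Voting", "Lee": "Nay"},
]

ENCODING = {"Yea": 1, "Nay": -1}  # anything else (or missing) is encoded as 0


def build_voting_matrix(votes):
    """Return (senators, X) where X[i, j] is senator i's encoded vote on roll call j."""
    senators = sorted({name for vote in votes for name in vote})
    row = {name: i for i, name in enumerate(senators)}
    # Float dtype so the matrix can be mean-centered later without casting issues.
    X = np.zeros((len(senators), len(votes)), dtype=float)
    for j, vote in enumerate(votes):
        for name, position in vote.items():
            X[row[name], j] = ENCODING.get(position, 0)
    return senators, X


senators, X = build_voting_matrix(votes)
```

Rows are senators and columns are individual votes, which matches the [N x D] layout assumed in the rest of the post.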
We can see the overlap across Sessions of Congress when plotting the Voting Matrix.
Now let us try to make a meaningful embedding space. First we need the covariance matrix of the voting records:
    import numpy as np

    # Assume input data matrix X of size [N x D]
    X = X - np.mean(X, axis=0)  # zero-center the data (important); not in-place, so an integer X is safe
    cov = np.dot(X.T, X) / X.shape[0]  # get the data covariance matrix, size [D x D]
The (i,j) element of the data covariance matrix contains the covariance between the i-th and j-th dimensions of the data. In particular, the diagonal of this matrix contains the variances. Furthermore, the covariance matrix is symmetric and positive semi-definite.
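These properties are easy to check numerically. The sketch below uses a random stand-in for the Voting Matrix (synthetic data, not real votes) and confirms that the covariance matrix is symmetric, that its diagonal holds the per-column variances, and that its eigenvalues are non-negative up to floating-point error.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the Voting Matrix: 50 rows, 8 columns of -1/0/+1.
X = rng.choice([-1.0, 0.0, 1.0], size=(50, 8))

X = X - X.mean(axis=0)       # zero-center each column
cov = X.T @ X / X.shape[0]   # [D x D] covariance, normalized by N

is_symmetric = np.allclose(cov, cov.T)

# The diagonal holds each column's variance (population variance, ddof=0).
diag_is_variance = np.allclose(np.diag(cov), X.var(axis=0))

# Positive semi-definite: every eigenvalue is >= 0 (small tolerance for round-off).
eigenvalues = np.linalg.eigvalsh(cov)
is_psd = bool(np.all(eigenvalues >= -1e-10))

# Same matrix as NumPy's built-in estimator with the 1/N normalization.
matches_np_cov = np.allclose(cov, np.cov(X, rowvar=False, bias=True))
```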
Now we run SVD on the covariance matrix to get the eigenvectors that describe the Voting Matrix:
    U, S, V = np.linalg.svd(cov)
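where the columns of U are the eigenvectors and S is a 1-D array of the singular values. Because the covariance matrix is symmetric and positive semi-definite, these singular values equal its eigenvalues, and np.linalg.svd returns them in descending order, so the eigenvectors are sorted by how much of the data's variance each one captures.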
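To make the link between the SVD and the eigendecomposition concrete, here is a small check on synthetic data (again a random stand-in, not real votes). It compares the singular values against the eigenvalues from `np.linalg.eigvalsh`, verifies that each column of U behaves as an eigenvector, and computes the fraction of variance each component explains.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.choice([-1.0, 0.0, 1.0], size=(40, 6))  # synthetic [N x D] data
X = X - X.mean(axis=0)
cov = X.T @ X / X.shape[0]

U, S, Vt = np.linalg.svd(cov)

# eigvalsh returns eigenvalues in ascending order; reverse to match the SVD order.
eigenvalues = np.linalg.eigvalsh(cov)[::-1]
same_spectrum = np.allclose(S, eigenvalues)

# Each column u_k of U satisfies cov @ u_k = S[k] * u_k.
is_eigenbasis = np.allclose(cov @ U, U * S)

# Fraction of total variance captured by each component, largest first.
explained = S / S.sum()
```

Looking at `explained` is one way to judge how many components are worth keeping before projecting.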
Finally, project the original voting matrix onto a reduced eigenbasis, here the first 10 eigenvectors:
    Xrot_reduced = np.dot(X, U[:, :10])  # Xrot_reduced becomes [N x 10]
Plotting the first two components, we can already see the spectrum of Senators, with the two political parties at opposite extremes. Here the scatter plot is annotated with each Senator’s name.
Projection of Senators into 2 dimensions
We can take it a step further and look at how the “median” representative of each party changes from one session of Congress to the next.
Blue lines point to the median of each Session, Red lines point to the median counterpart
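One possible way to compute such medians is sketched below. It assumes a component-wise median over the 2-D embedding, with one row per senator per session and parallel arrays of party and session labels. The numbers, the labels, and the `party_medians` helper are invented for illustration, and the post's actual definition of the "median" representative may differ.

```python
import numpy as np

# Hypothetical 2-D embeddings, one row per (senator, session) pair.
embedding = np.array([
    [-1.0, 0.2], [-0.8, 0.1], [-0.9, 0.4],   # session 113, party "D"
    [0.9, 0.0], [1.1, 0.3], [1.0, -0.3],     # session 113, party "R"
    [-1.2, 0.1], [-1.0, 0.3],                # session 114, party "D"
    [1.3, 0.2], [1.1, 0.0],                  # session 114, party "R"
])
party = np.array(["D"] * 3 + ["R"] * 3 + ["D"] * 2 + ["R"] * 2)
session = np.array([113] * 6 + [114] * 4)


def party_medians(embedding, party, session):
    """Component-wise median embedding for every (session, party) pair."""
    medians = {}
    for s in np.unique(session):
        for p in np.unique(party):
            mask = (session == s) & (party == p)
            if mask.any():
                medians[(int(s), str(p))] = np.median(embedding[mask], axis=0)
    return medians


medians = party_medians(embedding, party, session)

# How far party "D"'s median moved between the two sessions.
drift_d = medians[(114, "D")] - medians[(113, "D")]
```

Drawing a line between consecutive medians of the same party gives the kind of session-to-session trajectory described above.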
Pre-trained vectors can be found here: sughodke/Senator2Vec (Senator2Vec - US Senator embeddings) on github.com
Similar Senators