bagheri365/bagheri365.github.io

Eigenvector Index should have been in each column instead of row

Hi Alireza,

I am writing to inform you of a mathematical error in your Blog post about “Python implementations of Principal Component Analysis”.

First, thank you for this tutorial. It helped me understand PCA.

The error is in the attached blog post, in “Step 4: Rearrange the eigenvectors and eigenvalues” (code chunk 6, line 2).

Instead of:

# We first make a list of (eigenvalue, eigenvector) tuples
eig_pairs = [(np.abs(eig_vals[i]), eig_vecs[i,:]) for i in range(len(eig_vals))]

It should have been:

# the index i should select a column of eig_vecs, not a row
eig_pairs = [(np.abs(eig_vals[i]), eig_vecs[:,i]) for i in range(len(eig_vals))]

The i-th eigenvector is the i-th column of eig_vecs, not the i-th row, so “i” should be the column index.
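For reference, here is a minimal sketch of the corrected pairing followed by a sort on the eigenvalue. The 3x3 matrix is a made-up stand-in for cov_mat, not the data from the blog post, and the `key=` sort is just one way to do the ordering.

```python
import numpy as np

# Hypothetical symmetric, positive-definite matrix standing in for cov_mat.
cov_mat = np.array([[4.0, 2.0, 0.6],
                    [2.0, 3.0, 0.4],
                    [0.6, 0.4, 1.0]])

eig_vals, eig_vecs = np.linalg.eig(cov_mat)

# Pair each eigenvalue with its eigenvector, taken as a COLUMN of eig_vecs.
eig_pairs = [(np.abs(eig_vals[i]), eig_vecs[:, i]) for i in range(len(eig_vals))]

# Sort by eigenvalue, largest first. Using key= avoids comparing the
# eigenvector arrays themselves if two eigenvalues happen to tie.
eig_pairs.sort(key=lambda pair: pair[0], reverse=True)

for val, vec in eig_pairs:
    # Each pair should satisfy cov_mat @ v = lambda * v up to round-off.
    print(val, np.allclose(cov_mat @ vec, val * vec))
```

With the row-indexed version, the printed check would generally be False for each pair.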

There is a simple way to check whether the eigenvectors are stored in rows or in columns: an eigenvector v with eigenvalue λ must satisfy cov_mat @ v = λ v.

## assume eigenvectors are in columns: this assertion passes
assert np.allclose(cov_mat @ eig_vecs[:,0], eig_vecs[:,0] * eig_vals[0])
## assume eigenvectors are in rows: this assertion fails in general
assert np.allclose(cov_mat @ eig_vecs[0,:], eig_vecs[0,:] * eig_vals[0])
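The same check as a self-contained script, in case it is useful. The data is randomly generated for illustration (it is not the dataset from the blog post), and `is_eigenpair` is a helper name I made up. It compares with a tolerance rather than `==`, since floating-point round-off can make an exact comparison fail even for a correct eigenvector.

```python
import numpy as np

# Made-up data: 50 samples, 3 features (not the blog's dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
cov_mat = np.cov(X, rowvar=False)

eig_vals, eig_vecs = np.linalg.eig(cov_mat)

def is_eigenpair(mat, val, vec):
    """Return True if mat @ vec equals val * vec up to round-off."""
    return bool(np.allclose(mat @ vec, val * vec))

n = len(eig_vals)
# Eigenvectors read from columns: expected to pass for every i.
cols_ok = all(is_eigenpair(cov_mat, eig_vals[i], eig_vecs[:, i]) for i in range(n))
# Eigenvectors read from rows: expected to fail for a generic matrix.
rows_ok = all(is_eigenpair(cov_mat, eig_vals[i], eig_vecs[i, :]) for i in range(n))

print("columns:", cols_ok, "rows:", rows_ok)
```

The row check can only pass in special cases, for example when eig_vecs happens to be symmetric, which is why a small symmetric example can hide the bug.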

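This is also consistent with NumPy's official documentation for np.linalg.eig, which states that the column v[:,i] of the returned eigenvector array is the eigenvector corresponding to the eigenvalue w[i].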

Hope this helps!