Scripts with my own implementations of the PCA, EVD, and MDS methods, used for visualization.
I've compared the results of several dimensionality reduction methods:
- Custom MDS
- Sklearn MDS
- Sklearn Isomap
- Sklearn t-SNE
- Sklearn Locally Linear Embedding
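To give an idea of what a custom MDS built on an eigenvalue decomposition can look like, here is a minimal sketch of classical (Torgerson) MDS in NumPy. It is an illustration under my own assumptions, not necessarily the implementation in these scripts; the function name `classical_mds` and its parameters are hypothetical.

```python
import numpy as np

def classical_mds(D, n_components=2):
    """Embed points from a matrix of pairwise Euclidean distances D (n x n)."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Double centering turns squared distances into a Gram (inner product) matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    # B is symmetric, so eigh applies; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip tiny negative eigenvalues caused by floating point error.
    vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(vals)

# Usage: recover a 2D layout from distances between random 2D points.
rng = np.random.default_rng(0)
X = rng.random((6, 2))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
Y = classical_mds(D, n_components=2)
```

The recovered coordinates `Y` match `X` only up to rotation, reflection and translation, but the pairwise distances are preserved.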
I used Kaggle's "Weedle's Cave" dataset to visualize distances between Pokémon. To make use of Pokémon types, I converted the raw type strings such as "Grass" or "Poison" into strengths and weaknesses against all other types. As a result, water and fire types, for example, end up close to each other because Pokémon of these types are strong against fire and ground but weak against grass.
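The type conversion can be sketched roughly as follows. This is only an illustration: it uses a three-type subset of the type chart, and the names `TYPES`, `CHART` and `encode_types`, as well as the way dual types are combined (best multiplier for attack, product for defense), are my assumptions rather than the exact encoding used in the scripts.

```python
# Illustrative subset of the type chart: CHART[attacker][defender] = damage multiplier.
TYPES = ["Fire", "Water", "Grass"]
CHART = {
    "Fire":  {"Fire": 0.5, "Water": 0.5, "Grass": 2.0},
    "Water": {"Fire": 2.0, "Water": 0.5, "Grass": 0.5},
    "Grass": {"Fire": 0.5, "Water": 2.0, "Grass": 0.5},
}

def encode_types(pokemon_types):
    """Turn a list of type names into numeric strength/weakness features."""
    # Strengths: best multiplier any of the Pokémon's types has against each type.
    offense = [max(CHART[t][target] for t in pokemon_types) for target in TYPES]
    # Weaknesses: multipliers of all the Pokémon's types combine by multiplication.
    defense = []
    for attacker in TYPES:
        multiplier = 1.0
        for t in pokemon_types:
            multiplier *= CHART[attacker][t]
        defense.append(multiplier)
    return offense + defense

water = encode_types(["Water"])
water_grass = encode_types(["Water", "Grass"])
```

Vectors like these can be compared with an ordinary distance, unlike the raw type strings.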
The visualization shows that successive evolutions of a Pokémon are often near each other. "Mega" Pokémon are also close together in the visualization even though the dataset contains no direct information about that, which is really fascinating.
SVD can also be used to compress images. We first construct the matrices U, T and Vt such that:
A = U T Vt
where each matrix has a defined size:
A (m x n)
U (m x m)
T (m x n)
Vt (n x n)
To compress the data, we can keep only the first k columns of U, the first k singular values in T, and the first k rows of Vt, which gives the sizes:
A (m x n)
U (m x k)
T (k x k) (diagonal matrix)
Vt (k x n)
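The truncation above can be sketched with NumPy's `np.linalg.svd`. This is a minimal example on a random matrix, not the code from compress.py; the helper names `truncated_svd` and `reconstruct` are hypothetical.

```python
import numpy as np

def truncated_svd(A, k):
    """Return the rank-k factors U (m x k), singular values (k,), Vt (k x n)."""
    # full_matrices=False returns the reduced decomposition; singular values
    # come back as a 1-D array sorted in descending order.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def reconstruct(U_k, s_k, Vt_k):
    """Rebuild the (m x n) approximation of A from the truncated factors."""
    # Scaling the columns of U_k by s_k equals multiplying by the diagonal T.
    return (U_k * s_k) @ Vt_k

rng = np.random.default_rng(0)
A = rng.random((6, 4))
U_k, s_k, Vt_k = truncated_svd(A, k=2)
A_k = reconstruct(U_k, s_k, Vt_k)
```

The reconstruction `A_k` still has size m x n, but it is built from far fewer stored numbers. For a colour image, the same steps would be applied to each channel separately.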
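In summary, if an image has 3 channels, we can compress each channel this way and store only the truncated U, T and Vt matrices, which takes less space than the original image when k is small enough.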
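As a rough check of the savings, the number of stored values can be counted directly. The sketch below assumes that T is stored as its k diagonal entries only and that every channel uses the same k; the function names are hypothetical.

```python
def original_values(m, n, channels=3):
    """Number of values in the uncompressed image."""
    return channels * m * n

def compressed_values(m, n, k, channels=3):
    """Values stored per channel: U (m*k) + diagonal of T (k) + Vt (k*n)."""
    return channels * (m * k + k + k * n)

m, n, k = 512, 512, 50
before = original_values(m, n)
after = compressed_values(m, n, k)
ratio = after / before
```

For a 512 x 512 image with k = 50 this stores 153750 values instead of 786432. Under these assumptions the compressed form is smaller whenever k < m*n / (m + n + 1).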
Keep in mind that dedicated image compression algorithms such as JPEG can do this better.
usage: compress.py [-h] -f INPUT_FILE [-out OUTPUT_FILE]
[-svd {sklearn,custom,numpy}] [-k K]
compress.py: error: the following arguments are required: -f