SPCA

This repo contains the R code, figures, datasets, report, and slides for the project of the course MTH514A: Multivariate Analysis at IIT Kanpur during the academic year 2022-2023.

Project Members

Project Title

A Brief Review of Sparse Principal Components Analysis and its Generalization [Report] [Slides]

Abstract

Principal Component Analysis (PCA) is a widely studied and useful technique for dimension reduction. In this report, we discuss Sparse Principal Component Analysis (SPCA), a modification of PCA that addresses the difficulty of interpreting ordinary principal components by producing components with sparse loadings. The main idea of SPCA comes from the relationship between the PCA problem and regression analysis. We also discuss GAS-PCA, a generalization of SPCA that performs better than SPCA, even in finite-sample cases. Our report is mainly based on [1] and its extension [2].
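
The link between PCA and regression mentioned above can be illustrated with a small sketch. It is not taken from the project code: the simulated data, the variable names, and the value of `lambda` are assumptions chosen for illustration. It checks numerically a result established in [1]: regressing the first principal component scores on the data with a ridge penalty, then rescaling the coefficients to unit length, recovers the first PCA loading vector. SPCA builds on this regression form by adding an L1 (lasso) penalty, which is what drives some loadings to exactly zero.

```r
# Minimal sketch: PCA loadings recovered by a ridge regression (illustrative data only).
set.seed(1)

# Simulated data: n = 200 observations, p = 5 variables with unequal variances.
n <- 200
p <- 5
X <- matrix(rnorm(n * p), n, p) %*% diag(c(3, 2, 1.5, 1, 0.5))
X <- scale(X, center = TRUE, scale = FALSE)  # column-centre the data

# Ordinary PCA: first loading vector and first principal component scores.
pca <- prcomp(X, center = FALSE, scale. = FALSE)
v1 <- pca$rotation[, 1]
z1 <- pca$x[, 1]

# Ridge regression of the scores on X, solved in closed form.
lambda <- 1  # assumed value; any positive lambda gives the same direction
beta <- solve(crossprod(X) + lambda * diag(p), crossprod(X, z1))

# Rescaling the ridge coefficients to unit length gives back the loading vector.
v1_ridge <- as.numeric(beta / sqrt(sum(beta^2)))
max_abs_diff <- max(abs(v1_ridge - v1))
print(max_abs_diff)
```

The printed difference should be at the level of floating-point rounding error. The ridge penalty only shrinks the coefficients along the direction of the loading vector, so normalizing removes its effect.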

Table of Contents

Section Topic
1 Introduction
2 The LASSO and Elastic Net
3 SPCA
    3.1 Direct Sparse Approximation
    3.2 SPCA Criterion
    3.3 Numerical Solution
    3.4 Adjusted Total Variance
    3.5 Computational Complexity
4 GAS-PCA
    4.1 Asymptotic Properties of GAS-PCA
    4.2 Optimal Choice of the Kernel Matrix, $\tilde{\Omega}$
5 Examples
    5.1 Synthetic Data Analysis
    5.2 Real Data Analysis
      5.2.1 Pitprops Data
      5.2.2 Teaching Data
6 Conclusion

Primary References

[1] Hui Zou, Trevor Hastie & Robert Tibshirani (2006) Sparse Principal Component Analysis, Journal of Computational and Graphical Statistics, 15:2, 265-286, DOI: 10.1198/106186006X113430

[2] Chenlei Leng & Hansheng Wang (2009) On General Adaptive Sparse Principal Component Analysis, Journal of Computational and Graphical Statistics, 18:1, 201-215, DOI: 10.1198/jcgs.2009.0012