MaxHalford/prince

FAMD explained inertia == PCA explained inertia

orbisvicis opened this issue · 1 comment

The explained inertia from FAMD is almost identical to the explained variance from PCA on the one-hot-encoded data.

Given a mixed categorical/numeric dataframe where categorical columns are encoded as str/object and numeric columns as int64:

  • The total inertia from FAMD only reaches 1 when the number of components equals the number of columns after one-hot encoding.
  • The explained inertia from prince.FAMD nearly matches the explained variance from sklearn.decomposition.PCA (see the reproduction sketch after this list).
  • prince.FAMD.column_correlations lists the one-hot-encoded columns rather than the original ones.
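
Here is roughly what I'm doing, as a minimal sketch: the dataframe is a made-up stand-in for my data, and I'm assuming a prince version around 0.7 where FAMD exposes explained_inertia_ (newer releases name the attribute differently, e.g. percentage_of_variance_):

```python
import numpy as np
import pandas as pd
import prince
from sklearn.decomposition import PCA

# Made-up mixed dataframe: one int64 column, two str/object columns.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(18, 80, size=500),         # numeric (int64)
    "city": rng.choice(["NYC", "LA", "SF"], 500),  # categorical (object)
    "plan": rng.choice(["free", "pro"], 500),      # categorical (object)
})

# Number of columns after one-hot encoding the categoricals.
n_dummies = pd.get_dummies(df).shape[1]

# FAMD on the mixed dataframe.
famd = prince.FAMD(n_components=n_dummies, random_state=0).fit(df)

# PCA on the standardized one-hot-encoded dataframe.
X = pd.get_dummies(df).astype(float)
X = (X - X.mean()) / X.std()
pca = PCA(n_components=n_dummies).fit(X)

# On my data these come out nearly identical.
print(famd.explained_inertia_)
print(pca.explained_variance_ratio_)
```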

Surely this can't be right? I'd compare the eigenvectors, but I don't think prince exposes them... so I can't see any reason to use FAMD over PCA.
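
For what it's worth, the closest comparison I could manage without eigenvectors is to compare variable-component correlations on both sides. This continues from the sketch above, and column_correlations being a method taking the dataframe is again an assumption tied to the prince version I'm on:

```python
# Continues from the sketch above (famd, pca, X, df already defined).
import numpy as np
import pandas as pd

# For standardized data, PCA loadings scaled by the square root of the
# eigenvalue are the variable-component correlations.
pca_corr = pd.DataFrame(
    pca.components_.T * np.sqrt(pca.explained_variance_),
    index=X.columns,
)

# prince's column correlations; note the index is the one-hot-encoded
# columns, not the original three.
famd_corr = famd.column_correlations(df)

# If FAMD really reduces to PCA here, these should agree up to sign.
print(pca_corr.iloc[:, :2].round(2))
print(famd_corr.iloc[:, :2].round(2))
```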

The results from prince.FAMD match those from R's FactoMineR, so this doesn't look like a bug in prince. Does FAMD approach PCA as the number of samples increases, or do I just have one of those (difficult) datasets? Mine has a large sample size and is mostly categorical, with many categories per variable and no categorical outliers.