Store scaling parameters when doing PCA with scaling
bapike opened this issue · 0 comments
When scaling, stats::prcomp stores the calculated scaling values in the returned object, while irlba::prcomp_irlba only stores scale=TRUE. This behavior doesn't match up with the documentation for irlba::prcomp_irlba.
Storing the scaling values makes it possible to apply the fitted PCA model object to other datasets.
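To illustrate why the stored values matter, here is a minimal sketch using only stats::prcomp. The names train, new_data, and fit are made up for this example. It projects a second dataset with the stored center and scale vectors and checks the result against predict():

```r
set.seed(42)
# Hypothetical training data and a second dataset with the same columns
train    <- matrix(runif(100 * 10), nrow = 100, ncol = 10)
new_data <- matrix(runif(20 * 10),  nrow = 20,  ncol = 10)

fit <- prcomp(train, rank. = 4, center = TRUE, scale. = TRUE)

# Manual projection: subtract the stored centers, divide by the stored
# scaling values, then rotate onto the principal components
manual <- sweep(sweep(new_data, 2, fit$center), 2, fit$scale, FUN = `/`) %*%
  fit$rotation

# predict() for prcomp objects relies on the same stored vectors
via_predict <- predict(fit, newdata = new_data)

max_abs_diff <- max(abs(manual - via_predict))
print(max_abs_diff)
```

If fit$scale were just the logical TRUE, the manual projection would not be possible without going back to the training data.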
I'm happy to write a patch, though it looks like pull request #52 rewrites irlba::prcomp_irlba; I haven't checked to see if the problem exists there.
Some code to see the difference:
library(irlba)
set.seed(1234)
n_rows<-100L
n_cols<-10L # avoid naming a variable `c`, which masks base::c
M<-matrix(data=runif(n_rows*n_cols),nrow=n_rows,ncol=n_cols)
# scaling and centering
builtin<-prcomp(M,rank.=4,center=TRUE,scale.=TRUE)
str(builtin$scale) # a numeric vector
summary(builtin$x-( sweep(sweep(M,2,builtin$center),2,builtin$scale,FUN=`/`) %*% builtin$rotation ))
packaged<-prcomp_irlba(M,n=4,center=TRUE,scale.=TRUE)
str(packaged$scale) # the logical TRUE
scaling<-apply(M,2,sd)
summary(packaged$x-( sweep(sweep(M,2,packaged$center),2,scaling,FUN=`/`) %*% packaged$rotation ))
# scaling only (no centering): columns are divided by their root mean square (RMS)
RMS <- function (v) sqrt(sum(v^2)/(length(v)-1))
builtin<-prcomp(M,rank.=4,center=FALSE,scale.=TRUE)
str(builtin$scale) # a numeric vector
summary(builtin$x-( sweep(M,2,builtin$scale,FUN=`/`) %*% builtin$rotation ))
packaged<-prcomp_irlba(M,n=4,center=FALSE,scale.=TRUE)
str(packaged$scale) # the logical TRUE
scaling<-apply(M,2,RMS)
summary(packaged$x-( sweep(M,2,scaling,FUN=`/`) %*% packaged$rotation ))
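Until the values are stored, they can be recomputed from the original matrix. The sketch below uses a hypothetical helper, scaling_values (not part of irlba or stats), that follows the two conventions used above: the column standard deviation when centering, and the column RMS otherwise. It is checked here only against stats::prcomp; that prcomp_irlba uses the same conventions is an assumption based on the comparisons above.

```r
# Hypothetical helper: recompute the per-column scaling values
scaling_values <- function(M, center = TRUE) {
  if (center) {
    # with centering, the scaling value is the column standard deviation
    apply(M, 2, sd)
  } else {
    # without centering, it is the root mean square with an n - 1 denominator
    apply(M, 2, function(v) sqrt(sum(v^2) / (length(v) - 1)))
  }
}

set.seed(1234)
M <- matrix(runif(100 * 10), nrow = 100, ncol = 10)

centered   <- prcomp(M, center = TRUE,  scale. = TRUE)
uncentered <- prcomp(M, center = FALSE, scale. = TRUE)

# Both comparisons should agree up to floating-point tolerance
ok_centered   <- isTRUE(all.equal(scaling_values(M, TRUE),  centered$scale))
ok_uncentered <- isTRUE(all.equal(scaling_values(M, FALSE), uncentered$scale))
print(c(ok_centered, ok_uncentered))
```

This only works when the original matrix is still available, which is the limitation that storing the values in the returned object would remove.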