ck37/varimpact

Error message: unused argument (X = train)

hanson1005 opened this issue · 12 comments

Hi. While using your package, I got an error message below:

train <- as.data.frame(train)
Q_lib <- c("SL.randomForest", "SL.glmnet", "SL.svm")
g_lib <- c("SL.randomForest", "SL.glmnet", "SL.svm")
vim <- varimpact(Y = label, X = train, Q.library = Q_lib, g.library = g_lib)
Error in varimpact(Y = label, X = train, Q.library = Q_lib, g.library = g_lib) :
unused argument (X = train)

Y is a numeric vector of length 1000. X has 1000 observations of 1132 variables. Each variable is a word from the raw documents, and all entries are numeric.

I used the same data (X) and the dependent variable (Y) when fitting a Super Learner model:

SLmodel <- SuperLearner(Y = label, X = train, newX=m113_114,
SL.library=c("SL.randomForest", "SL.glmnet", "SL.svm"),
method = "method.NNLS", verbose=TRUE)

It worked fine!
Could you tell me how to fix this issue? What's the problem with my "train" data?
Thank you!

ck37 commented

Hello,

Please specify the training data using the "data = " function argument rather than "X = ". You can see that in the examples here: https://github.com/ck37/varimpact#examples

Thanks,
Chris

Hi, Chris.
Oh right. Sorry for the mistake. I just realized it and was going to fix the question.
So, I did that. Since my Y variable is continuous, I specified the family as "gaussian" and got the following error message:

vim <- varimpact(Y = label, data = train, Q.library = Q_lib, g.library = g_lib, family="gaussian")
Error in arules::discretize(Xt, method = "frequency", categories = bins_numeric, :
Some breaks are not unique, use fewer breaks for the data.

Could you please help me with solving this problem? Thank you.

ck37 commented

Yeah, an update to the arules package is causing that and I'm trying to fix it. Does installing this beta version of varimpact resolve that?

devtools::install_github("ck37/varimpact@revamp")

Otherwise I might need another week or two to get it working.
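
For context, here is a minimal sketch of how a "breaks are not unique" error can arise with sparse data such as word counts. It uses only base R and assumes that equal-frequency binning places its breakpoints at quantiles; it is not the actual arules or varimpact code, and the vector x is a made-up example.

```r
# Hypothetical sparse word-count column: 90 zeros and 10 nonzero values,
# similar in shape to one column of a document-term matrix.
x <- c(rep(0, 90), 1:10)

# Equal-frequency binning into 10 bins needs 11 quantile breakpoints.
breaks <- quantile(x, probs = seq(0, 1, length.out = 11))

# Because most values are 0, many breakpoints collapse onto the same value.
has_duplicates <- anyDuplicated(breaks) > 0

# Base R's cut() refuses duplicated breaks; capture the message instead of stopping.
result <- tryCatch(
  cut(x, breaks = breaks, include.lowest = TRUE),
  error = function(e) conditionMessage(e)
)

# One possible workaround (an assumption, not what varimpact does internally):
# keep only the distinct breakpoints, which yields fewer bins.
binned <- cut(x, breaks = unique(breaks), include.lowest = TRUE)
table(binned)
```

If this is what is happening with the 1132 word variables, columns that are almost entirely zero would be the ones triggering the error.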

Still not working...

ck37 commented

Are you getting the same error or a different error?

More importantly though, I should clarify that this package is not intended to provide variable importance for a particular SuperLearner fit; it addresses a different question.

This package might be more relevant to what you're looking to do though: https://github.com/bdwilliamson/vimp

ck37 commented

OK, you might try restarting your R session (Session -> Restart R in RStudio) and trying one more time. Often, when R reinstalls a package that is already loaded, it keeps using the old version until the session is restarted.
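
A quick way to check which version R will pick up after the restart is sketched below. The helper name installed_version is hypothetical; packageVersion(), requireNamespace(), and loadedNamespaces() are standard R functions.

```r
# Hypothetical helper: return the installed version of a package as a string,
# or NA if the package is not installed.
installed_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

# Run this in a fresh session, after reinstalling and restarting R.
print(installed_version("varimpact"))

# TRUE here means the package namespace is loaded in the current session.
print("varimpact" %in% loadedNamespaces())
```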

ck37 commented

Closing this for now - feel free to re-open if more discussion is helpful.

ck37 commented

Oh, it doesn't have to be closed, but GitHub repository maintainers will usually close troubleshooting issues that have been inactive for a while, as a way to stay organized.