glmnet not supported
jkylearmstrong opened this issue · 7 comments
library('OddsPlotty')
library('glmnet')
library('mlbench')
data("BreastCancer", package = "mlbench")
#Use complete cases of breast cancer
breast <- BreastCancer[complete.cases(BreastCancer), ] #Create a copy
breast <- breast[, -1]
head(breast, 10)
#Convert the class to a factor - Beningn (0) and Malignant (1)
breast$Class <- factor(breast$Class)
str(breast)
for(i in 1:9) {
breast[, i] <- as.numeric(as.character(breast[, i]))
}
#Loops through the first columns - 1 to 9 and changes them from factors to a numerical representation
str(breast)
library('caret')
glm_model <- caret::train(Class ~ .,
data = breast,
method = "glmnet",
family = "binomial")
OddsPlotty::odds_plot(glm_model$finalModel,
title = "Odds Plot",
subtitle = "Showing odds of cancer based on various factors")
#> Error in UseMethod("vcov") : no applicable method for 'vcov' applied to an object of class "c('lognet', 'glmnet')"
Yes, I just checked the CRAN version downloading it and you need to make sure that when you train your model you use caret. Plus, the method="glmnet" should be just "glm".
glmnet is a logistic regression model - this example will compute the odds ratio and errors:
glm_model$finalModel %>%
broom::tidy() %>%
group_by(term) %>%
summarise(
mean_est = mean(estimate , na.rm=TRUE),
sd = sd(estimate , na.rm = TRUE),
.groups = 'keep') %>%
ungroup() %>%
mutate(Odds_Ratio = if_else(term != '(Intercept)', exp(mean_est), as.numeric(NA))) %>%
ggplot(aes(x= Odds_Ratio , y= term)) +
geom_point() +
geom_errorbarh(aes(xmax = exp(mean_est + sd), xmin = exp(mean_est - sd), height = .2)) +
ggtitle(expression("Comparing model estimates of Odds Ratios" ~ exp(beta[i])))
There is also an issue with method = "bayesglm"
glm_model <- caret::train(Class ~ .,
data = breast,
method = "bayesglm")
the broom
package provides a warning:
glm_model$finalModel %>%
broom::tidy() %>%
mutate(Odds_Ratio = if_else(term != '(Intercept)', exp(estimate), as.numeric(NA))) %>%
ggplot(aes(x= Odds_Ratio , y= term)) +
geom_point() +
geom_errorbarh(aes(xmax = exp(estimate + std.error), xmin = exp(estimate - std.error), height = .2)) +
ggtitle(expression("Comparing model estimates of Odds Ratios" ~ exp(beta[i])))
but if we perform the computation by grabbing all of the elements things look the same:
tibble(
term = names(glm_model$finalModel$coefficients),
mean_est = glm_model$finalModel$coefficients,
sd = glm_model[["finalModel"]][["prior.sd"]]
) %>%
mutate(Odds_Ratio = if_else(term != '(Intercept)', exp(mean_est), as.numeric(NA))) %>%
ggplot(aes(x= Odds_Ratio , y= term)) +
geom_point() +
geom_errorbarh(aes(xmax = exp(mean_est + sd), xmin = exp(mean_est - sd), height = .2)) +
ggtitle(expression("Comparing model estimates of Odds Ratios" ~ exp(beta[i])))
@jkylearmstrong the glmnet bootstrapping confidence intervals are not correct for this function. I think by averaging the standard error across each interval is incorrect. I have managed to fix the bayesglm issues, but the confidence limits look very different to the logistic regression: