Quantile regression with lightgbm not possible
Quantile regression in `lgb.train()` requires `objective = 'quantile'`, which does not work with {treesnip} because the `objective` supplied in `set_engine()` is not respected.
```r
library(magrittr)

# boost_tree
model <- parsnip::boost_tree(mtry = NULL, trees = 5, mode = "regression", learn_rate = .1)

library(treesnip)
model <- parsnip::set_engine(model, "lightgbm", objective = "quantile", metric = "l2", alpha = 0.1)

set.seed(4)
data <- matrix(rnorm(1000), ncol = 4) %>%
  tibble::as_tibble() %>%
  dplyr::mutate(y = sample(1000 / 4))

fit <- model %>%
  parsnip::fit(y ~ ., data = data)

predict(fit, data)
```
The problem is that `objective` and other arguments are overwritten in `train_lightgbm()`. I think arguments such as `objective` passed via `...` to `train_lightgbm()` should always take precedence in this function over the ones derived from other variables.
What was the rationale behind giving `others` lower precedence in https://github.com/curso-r/treesnip/blob/master/R/lightgbm.R#L237? I think `others` should have precedence; we could achieve this by replacing that line with `arg_list <- modifyList(arg_list, others)` and removing the merging of the two later in the code. I can make a PR if you want.
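To illustrate the suggestion, here is a minimal sketch of how base R's `modifyList()` gives the second argument precedence; the `arg_list` and `others` values below are made-up stand-ins for the variables of the same names in `train_lightgbm()`:

```r
# arg_list: defaults derived inside train_lightgbm()
arg_list <- list(objective = "regression", learning_rate = 0.1)

# others: engine arguments the user supplied in set_engine()
others <- list(objective = "quantile", alpha = 0.1)

# modifyList() keeps entries of arg_list but overwrites any name that
# also appears in others, and appends names only present in others
merged <- modifyList(arg_list, others)

str(merged)
# objective = "quantile", learning_rate = 0.1, alpha = 0.1
```

So the user-supplied `objective = "quantile"` wins, while defaults not touched by `set_engine()` are preserved.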
I don't think we have a strong argument for giving `others` lower precedence. I think we followed what parsnip does for xgboost, see: https://github.com/tidymodels/parsnip/blob/master/R/boost_tree.R#L382-L384
I agree with your suggestion to give `others` higher precedence, so you can use whichever `objective` you want. We might also need to implement a `predict_quantile()` method, as in https://github.com/tidymodels/parsnip/blob/9333a0d08764c28eb12337e0bc95160a20462356/R/predict_quantile.R#L9.
Well, I have the apprehension that Max had a reason to do it that way. Maybe he wanted to make sure you can't override things in `set_engine()` that are not engine-specific. The same problem probably exists when you want to use an objective other than 'regression' or 'classification' in {parsnip} with xgboost; there are quite a number of objectives in xgboost.
The quantile method sounds very cool too 🎉. I am not familiar enough with parsnip though to contribute that now unfortunately.
Two things:
- I don't think implementing a quantile method as you suggested in #24 (comment) will work here, because lightgbm's quantile regression is different from other algorithms that implement quantile methods: the quantile is not an argument to `predict(lightgbm_model, ...)` but needs to be set at training time.
- For consistency, I opened an issue in https://github.com/tidymodels/parsnip to discuss the general handling of this: tidymodels/parsnip#403
Update: My suggested changes were incorporated into parsnip for xgboost in tidymodels/parsnip#403 and @topepo said he'd also submit a PR for this here.
I've reproduced @topepo's logic to allow passing `objective` to `set_engine()` for both catboost and lightgbm! Thank you!
tidymodels/parsnip#403