XGB2Regressor vs XGBRegressor
srbPhy opened this issue · 3 comments
srbPhy commented
Hi, it seems a model trained with XGB2Regressor differs slightly from one trained with the regular XGBRegressor. For instance, when I run the following code, I get slightly different predictions on the test data. I am sure I am missing something, but I can't figure out what. Could you please help?
from piml import Experiment
from piml.models import XGB2Regressor
from xgboost import XGBRegressor
exp = Experiment(highcode_only=True)
exp.data_loader(data='BikeSharing', silent=True)
exp.data_prepare(target='cnt', task_type='regression', test_ratio=0.2, random_state=0, silent=True)
model1 = XGB2Regressor()
exp.model_train(model=model1, name='XGB2')
model2 = XGBRegressor(max_depth=2)
exp.model_train(model=model2, name='XGB2-default')
print(model1.predict(exp.get_data(test=True)[0]))
print(model2.predict(exp.get_data(test=True)[0]))
[-0.04393188 0.03837352 0.4268577 ... 0.02106261 -0.00260242
0.34881094]
[-0.03740007 0.03996139 0.42402536 ... 0.02290548 0.0015662
0.3511871 ]
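To put a number on the gap, here is a minimal sketch that compares only the six entries visible in the printed output above (the elided middle of each array is not included). The helper name `max_abs_diff` is made up for illustration and is not part of piml or xgboost.

```python
# Visible entries copied from the two printed arrays above; the "..." part is omitted.
xgb2_preds = [-0.04393188, 0.03837352, 0.4268577, 0.02106261, -0.00260242, 0.34881094]
xgb_preds = [-0.03740007, 0.03996139, 0.42402536, 0.02290548, 0.0015662, 0.3511871]


def max_abs_diff(a, b):
    """Return the largest element-wise absolute difference (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


gap = max_abs_diff(xgb2_preds, xgb_preds)
print(f"max |diff| over visible entries: {gap:.6f}")
```

On these six values the largest gap is about 0.0065, so the two models are close but clearly not identical.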
yodiaditya commented
Confirmed, I get the same result:
[-0.04393188 0.03837352 0.4268577 ... 0.02106261 -0.00260242
0.34881094]
[-0.03740007 0.03996139 0.42402536 ... 0.02290548 0.0015662
0.3511871 ]
ZebinYang commented
Hi @yodiaditya and @srbPhy
The difference in results is due to different default hyperparameters.
You should get the same results with the following code, which passes the fitted XGB2 model's parameters to XGBRegressor.
from piml import Experiment
from piml.models import XGB2Regressor
from xgboost import XGBRegressor
exp = Experiment(highcode_only=True)
exp.data_loader(data='BikeSharing', silent=True)
exp.data_prepare(target='cnt', task_type='regression', test_ratio=0.2, random_state=0, silent=True)
model1 = XGB2Regressor()
exp.model_train(model=model1, name='XGB2')
params = exp.get_model("XGB2").estimator.estimator_.get_params()
model2 = XGBRegressor(**params)
exp.model_train(model=model2, name='XGB2-default')
print(model1.predict(exp.get_data(test=True)[0]))
print(model2.predict(exp.get_data(test=True)[0]))
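To see which hyperparameters actually differ, one option is to diff the two `get_params()` dictionaries. The sketch below is a plain-Python helper, not something piml or xgboost provides: the name `diff_params` is an assumption, and the two example dictionaries hold invented values purely for illustration, not the real defaults of either model.

```python
import math


def _same(a, b):
    """Treat two NaN floats as equal, since NaN != NaN would otherwise report a false difference."""
    if isinstance(a, float) and isinstance(b, float) and math.isnan(a) and math.isnan(b):
        return True
    return a == b


def diff_params(params_a, params_b):
    """Map each key whose value differs, or is absent on one side, to (value_a, value_b)."""
    absent = object()  # sentinel so a real None value is not confused with a missing key
    diffs = {}
    for key in sorted(set(params_a) | set(params_b)):
        va = params_a.get(key, absent)
        vb = params_b.get(key, absent)
        if va is absent or vb is absent or not _same(va, vb):
            diffs[key] = (None if va is absent else va, None if vb is absent else vb)
    return diffs


# Hypothetical parameter dictionaries (values invented for illustration only).
example_a = {"max_depth": 2, "n_estimators": 100, "missing": float("nan")}
example_b = {"max_depth": 2, "n_estimators": 50, "missing": float("nan"), "learning_rate": 0.3}

for name, (va, vb) in diff_params(example_a, example_b).items():
    print(f"{name}: {va!r} vs {vb!r}")

# With the objects from the snippet above, the call would look like:
#   diff_params(params, XGBRegressor(max_depth=2).get_params())
```

Any key this reports for the real models is a setting that `XGBRegressor(max_depth=2)` does not share with the fitted XGB2 estimator.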
srbPhy commented
Thank you very much for your quick response. That makes sense.