Different results between Optuna best value and re-training
Closed this issue · 3 comments
Data split:
data = normalized_df_ros.to_numpy()
target = y_train_oversample
train_x, valid_x, train_y, valid_y = train_test_split(data, target, test_size=0.25, random_state = 52)
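As a side note, with a fixed random_state the split itself is deterministic, so the data split cannot explain a score difference between runs. A minimal self-contained check (synthetic arrays stand in for normalized_df_ros and y_train_oversample, whose real shapes are assumed here):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-ins (assumed shapes) for the real feature and target arrays.
data = np.arange(40).reshape(20, 2)
target = np.array([0, 1] * 10)

# Two calls with the same random_state produce identical splits.
a_train, a_valid, ay_train, ay_valid = train_test_split(
    data, target, test_size=0.25, random_state=52
)
b_train, b_valid, by_train, by_valid = train_test_split(
    data, target, test_size=0.25, random_state=52
)
assert np.array_equal(a_train, b_train) and np.array_equal(a_valid, b_valid)
```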
I have used the code below for Optuna hyperparameter tuning:
def objective_ros(trial):
    dtrain = xgb.DMatrix(train_x, label=train_y)
    dvalid = xgb.DMatrix(valid_x, label=valid_y)
    param = {
        "objective": "binary:logistic",
        "booster": trial.suggest_categorical("booster", ["gbtree", "gblinear", "dart"]),
        "lambda": trial.suggest_loguniform("lambda", 1e-8, 1.0),
        "learning_rate": trial.suggest_loguniform("alpha", 1e-8, 1.0),
        "subsample": trial.suggest_float("subsample", 0.05, 1.0),
    }
    if param["booster"] == "gbtree" or param["booster"] == "dart":
        param["max_depth"] = trial.suggest_int("max_depth", 1, 9)
        param["eta"] = trial.suggest_loguniform("eta", 1e-8, 1.0)
        param["gamma"] = trial.suggest_loguniform("gamma", 1e-8, 1.0)
        param["grow_policy"] = trial.suggest_categorical("grow_policy", ["depthwise", "lossguide"])
    if param["booster"] == "dart":
        param["sample_type"] = trial.suggest_categorical("sample_type", ["uniform", "weighted"])
        param["normalize_type"] = trial.suggest_categorical("normalize_type", ["tree", "forest"])
        param["rate_drop"] = trial.suggest_loguniform("rate_drop", 1e-8, 1.0)
        param["skip_drop"] = trial.suggest_loguniform("skip_drop", 1e-8, 1.0)
    bst = xgb.XGBClassifier(**param, random_state=52)
    bst.fit(valid_x, valid_y)
    preds = bst.predict(valid_x)
    f1 = f1_score(valid_y, preds, average='micro')
    return f1
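For reference, the usual objective pattern fits on the training split and evaluates on the held-out split. A minimal self-contained sketch of that pattern (under assumptions: scikit-learn's GradientBoostingClassifier and synthetic data stand in for xgboost and the real dataset, and `params` is a plain dict rather than Optuna trial suggestions):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data (hypothetical, for illustration only).
rng = np.random.default_rng(52)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
train_x, valid_x, train_y, valid_y = train_test_split(
    X, y, test_size=0.25, random_state=52
)

def objective(params):
    clf = GradientBoostingClassifier(random_state=52, **params)
    clf.fit(train_x, train_y)      # fit on the training split ...
    preds = clf.predict(valid_x)   # ... and score on the held-out split
    return f1_score(valid_y, preds, average="micro")

score = objective({"max_depth": 3})
assert 0.0 <= score <= 1.0
```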
if __name__ == "__main__":
    """
    Optuna creates the study to optimize the objective function for the given
    number of trials and returns the best trial, i.e. the one with the maximum
    F1 score, along with that trial's hyperparameters.
    """
    study = optuna.create_study(
        pruner=optuna.pruners.MedianPruner(n_warmup_steps=5), direction="maximize"
    )
    study.optimize(objective_ros, n_trials=3)
    print(study.best_trial)
    print("Number of finished trials: {}".format(len(study.trials)))
    print("Best trial:")
    trial = study.best_trial
    print("  F1 Score: {}".format(trial.value))
    print("  Best Hyperparameters: ")
    for key, value in trial.params.items():
        print("    {}: {}".format(key, value))
Output:
FrozenTrial(number=2, state=TrialState.COMPLETE, values=[0.9856801909307876], datetime_start=datetime.datetime(2024, 5, 8, 15, 35, 16, 222522), datetime_complete=datetime.datetime(2024, 5, 8, 15, 36, 2, 629269), params={'booster': 'dart', 'lambda': 0.002311280531979015, 'alpha': 0.0002916755277581915, 'subsample': 0.9746150609390157, 'max_depth': 6, 'eta': 1.2123850647079977e-05, 'gamma': 7.277285195329304e-05, 'grow_policy': 'lossguide', 'sample_type': 'weighted', 'normalize_type': 'tree', 'rate_drop': 0.0010080385982105485, 'skip_drop': 0.002532401998679777}, user_attrs={}, system_attrs={}, intermediate_values={}, distributions={'booster': CategoricalDistribution(choices=('gbtree', 'gblinear', 'dart')), 'lambda': FloatDistribution(high=1.0, log=True, low=1e-08, step=None), 'alpha': FloatDistribution(high=1.0, log=True, low=1e-08, step=None), 'subsample': FloatDistribution(high=1.0, log=False, low=0.05, step=None), 'max_depth': IntDistribution(high=9, log=False, low=1, step=1), 'eta': FloatDistribution(high=1.0, log=True, low=1e-08, step=None), 'gamma': FloatDistribution(high=1.0, log=True, low=1e-08, step=None), 'grow_policy': CategoricalDistribution(choices=('depthwise', 'lossguide')), 'sample_type': CategoricalDistribution(choices=('uniform', 'weighted')), 'normalize_type': CategoricalDistribution(choices=('tree', 'forest')), 'rate_drop': FloatDistribution(high=1.0, log=True, low=1e-08, step=None), 'skip_drop': FloatDistribution(high=1.0, log=True, low=1e-08, step=None)}, trial_id=2, value=None)
Number of finished trials: 3
Best trial:
F1 Score: 0.9856801909307876
Best Hyperparameters:
booster: dart
lambda: 0.002311280531979015
alpha: 0.0002916755277581915
subsample: 0.9746150609390157
max_depth: 6
eta: 1.2123850647079977e-05
gamma: 7.277285195329304e-05
grow_policy: lossguide
sample_type: weighted
normalize_type: tree
rate_drop: 0.0010080385982105485
skip_drop: 0.002532401998679777
Using the best hyperparameters from Optuna, I re-trained the 'XGBClassifier'.
Code:
clf = xgb.XGBClassifier(**best_params, random_state = 52)
clf.fit(train_x, train_y)
valid_pred_ros = clf.predict(valid_x)
f1 = f1_score(valid_y, valid_pred_ros, average='micro')
print("validation f1_score : ", f1)
Output:
validation f1_score : 0.937947494033413
Now, you can see the best value from Optuna is F1 Score: 0.9856801909307876, whereas the validation F1-score of the model trained with the same hyperparameters and the same XGBClassifier is validation f1_score : 0.937947494033413.
My question is: why are the values different here even though the parameters are the same?
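A likely contributor, going by the code above rather than a confirmed answer: the objective fits the model on valid_x and also predicts on valid_x, so Optuna's best value is an in-sample score, while the re-trained model is fit on train_x and evaluated out-of-sample on valid_x. (A second subtlety: the objective maps the suggested name "alpha" to the "learning_rate" key, so passing the best params dict straight to XGBClassifier sets alpha rather than learning_rate, which also changes the effective configuration.) A minimal sketch of the in-sample inflation effect, with synthetic data and scikit-learn's GradientBoostingClassifier standing in for XGBClassifier:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic noisy binary labels (hypothetical data, for illustration only).
rng = np.random.default_rng(52)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=300) > 0).astype(int)
train_x, valid_x, train_y, valid_y = train_test_split(
    X, y, test_size=0.25, random_state=52
)

params = {"max_depth": 6, "random_state": 52}

# In-sample: fit AND evaluate on the same validation split (as the objective does).
in_sample = GradientBoostingClassifier(**params).fit(valid_x, valid_y)
f1_in = f1_score(valid_y, in_sample.predict(valid_x), average="micro")

# Out-of-sample: fit on the training split, evaluate on the held-out split.
out_sample = GradientBoostingClassifier(**params).fit(train_x, train_y)
f1_out = f1_score(valid_y, out_sample.predict(valid_x), average="micro")

assert f1_in >= f1_out  # the in-sample score is typically inflated
```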
Could you post the question on https://github.com/optuna/optuna/discussions, since it is not related to this repo, optuna-example?
Done.
Thanks!