jeongyoonlee/Kaggler

ValueError: For early stopping, at least one dataset and eval metric is required for evaluation

yassineAlouini opened this issue · 3 comments

When I run AutoLGB with objective="regression" and metric="neg_mean_absolute_error", I get the following error: ValueError: For early stopping, at least one dataset and eval metric is required for evaluation.
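A minimal reproduction sketch (assuming X_train and y_train are a pandas DataFrame and Series; the import path follows the kaggler.model.automl module shown in the traceback):

```python
# Minimal reproduction sketch -- assumes X_train / y_train hold numeric
# features and a regression target, as in the original notebook cell.
from kaggler.model.automl import AutoLGB

model = AutoLGB(metric="neg_mean_absolute_error",
                objective="regression")
model.tune(X_train, y_train)  # raises the ValueError below
model.fit(X_train, y_train)
```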
Here is the complete stacktrace:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-13-4052be1dbbca> in <module>
      3 model = AutoLGB(metric="neg_mean_absolute_error", 
      4                 objective="regression")
----> 5 model.tune(X_train, y_train)
      6 model.fit(X_train, y_train)

/opt/conda/lib/python3.6/site-packages/kaggler/model/automl.py in tune(self, X, y)
    114             self.features = self.select_features(X_s,
    115                                                  y_s,
--> 116                                                  n_eval=self.n_fs)
    117             logger.info('selecting {} out of {} features'.format(
    118                 len(self.features), X.shape[1])

/opt/conda/lib/python3.6/site-packages/kaggler/model/automl.py in select_features(self, X, y, n_eval)
    164             random_cols.append(random_col)
    165 
--> 166         _, trials = self.optimize_hyperparam(X.values, y.values, n_eval=n_eval)
    167 
    168         feature_importances = self._get_feature_importance(

/opt/conda/lib/python3.6/site-packages/kaggler/model/automl.py in optimize_hyperparam(self, X, y, test_size, n_eval)
    258         best = hyperopt.fmin(fn=objective, space=self.space, trials=trials,
    259                              algo=tpe.suggest, max_evals=n_eval, verbose=1,
--> 260                              rstate=self.random_state)
    261 
    262         hyperparams = space_eval(self.space, best)

/opt/conda/lib/python3.6/site-packages/hyperopt/fmin.py in fmin(fn, space, algo, max_evals, trials, rstate, allow_trials_fmin, pass_expr_memo_ctrl, catch_eval_exceptions, verbose, return_argmin, points_to_evaluate, max_queue_len, show_progressbar)
    387             catch_eval_exceptions=catch_eval_exceptions,
    388             return_argmin=return_argmin,
--> 389             show_progressbar=show_progressbar,
    390         )
    391 

/opt/conda/lib/python3.6/site-packages/hyperopt/base.py in fmin(self, fn, space, algo, max_evals, max_queue_len, rstate, verbose, pass_expr_memo_ctrl, catch_eval_exceptions, return_argmin, show_progressbar)
    641             catch_eval_exceptions=catch_eval_exceptions,
    642             return_argmin=return_argmin,
--> 643             show_progressbar=show_progressbar)
    644 
    645 

/opt/conda/lib/python3.6/site-packages/hyperopt/fmin.py in fmin(fn, space, algo, max_evals, trials, rstate, allow_trials_fmin, pass_expr_memo_ctrl, catch_eval_exceptions, verbose, return_argmin, points_to_evaluate, max_queue_len, show_progressbar)
    406                     show_progressbar=show_progressbar)
    407     rval.catch_eval_exceptions = catch_eval_exceptions
--> 408     rval.exhaust()
    409     if return_argmin:
    410         return trials.argmin

/opt/conda/lib/python3.6/site-packages/hyperopt/fmin.py in exhaust(self)
    260     def exhaust(self):
    261         n_done = len(self.trials)
--> 262         self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
    263         self.trials.refresh()
    264         return self

/opt/conda/lib/python3.6/site-packages/hyperopt/fmin.py in run(self, N, block_until_done)
    225                     else:
    226                         # -- loop over trials and do the jobs directly
--> 227                         self.serial_evaluate()
    228 
    229                     try:

/opt/conda/lib/python3.6/site-packages/hyperopt/fmin.py in serial_evaluate(self, N)
    139                 ctrl = base.Ctrl(self.trials, current_trial=trial)
    140                 try:
--> 141                     result = self.domain.evaluate(spec, ctrl)
    142                 except Exception as e:
    143                     logger.info('job exception: %s' % str(e))

/opt/conda/lib/python3.6/site-packages/hyperopt/base.py in evaluate(self, config, ctrl, attach_attachments)
    846                 memo=memo,
    847                 print_node_on_error=self.rec_eval_print_node_on_error)
--> 848             rval = self.fn(pyll_rval)
    849 
    850         if isinstance(rval, (float, int, np.number)):

/opt/conda/lib/python3.6/site-packages/kaggler/model/automl.py in objective(hyperparams)
    248                               valid_data,
    249                               early_stopping_rounds=self.n_stop,
--> 250                               verbose_eval=0)
    251 
    252             score = (model.best_score["valid_0"][self.params["metric"]] *

/opt/conda/lib/python3.6/site-packages/lightgbm/engine.py in train(params, train_set, num_boost_round, valid_sets, valid_names, fobj, feval, init_model, feature_name, categorical_feature, early_stopping_rounds, evals_result, verbose_eval, learning_rates, keep_training_booster, callbacks)
    231                                         begin_iteration=init_iteration,
    232                                         end_iteration=init_iteration + num_boost_round,
--> 233                                         evaluation_result_list=evaluation_result_list))
    234         except callback.EarlyStopException as earlyStopException:
    235             booster.best_iteration = earlyStopException.best_iteration + 1

/opt/conda/lib/python3.6/site-packages/lightgbm/callback.py in _callback(env)
    209     def _callback(env):
    210         if not cmp_op:
--> 211             _init(env)
    212         if not enabled[0]:
    213             return

/opt/conda/lib/python3.6/site-packages/lightgbm/callback.py in _init(env)
    190             return
    191         if not env.evaluation_result_list:
--> 192             raise ValueError('For early stopping, '
    193                              'at least one dataset and eval metric is required for evaluation')
    194 

ValueError: For early stopping, at least one dataset and eval metric is required for evaluation

The pandas version is: 0.23.4
The lightgbm version is: 2.2.3
Could the error be due to the lightgbm version?

Thanks for filing the issue, @yassineAlouini.

It was caused by the metric name, which is not a standard LightGBM metric name but an alias for one. To accommodate both the standard and alias names, I added _get_metric_alias_minimize() to model.AutoLGB in d5fe02b.

With the change, you can use any of mae, l1, or mean_absolute_error as the metric.
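For example, a sketch assuming the same X_train / y_train as in the report:

```python
from kaggler.model.automl import AutoLGB

# 'mae' could equally be 'l1' or 'mean_absolute_error'.
model = AutoLGB(metric="mae", objective="regression")
model.tune(X_train, y_train)
model.fit(X_train, y_train)
```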

Please upgrade kaggler to 0.8.2 using pip install -U kaggler and see if it works.

Thanks @jeongyoonlee, it works with the metrics listed above.
However, I still get the same error when using neg_mean_absolute_error.
The fix is enough for me, of course, but would it be a good idea to raise an error when the metric name isn't one of the expected ones?
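For reference, such a check could be a simple guard before tuning; a hypothetical sketch (illustrative names only, not the actual Kaggler code):

```python
# Hypothetical validation sketch -- not the actual Kaggler implementation.
SUPPORTED_METRICS = {"mae", "l1", "mean_absolute_error"}  # illustrative subset

def check_metric(metric):
    if metric not in SUPPORTED_METRICS:
        raise ValueError(
            "Unsupported metric '{}'; expected one of {}".format(
                metric, sorted(SUPPORTED_METRICS)))
```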

That makes sense. I will open another ticket for it. Thanks!