`NeptuneCallback` produces lots of `X-coordinates (step) must be strictly increasing` errors
iirekm opened this issue · 1 comment
Bug description
When Optuna is run in parallel mode (`n_jobs=-1`) with `NeptuneCallback`, I get:
[neptune] [error ] Error occurred during asynchronous operation processing: X-coordinates (step) must be strictly increasing for series attribute: trials/values. Invalid point: 0.0
It's normal for results to arrive out of order during parallel or distributed hyperparameter optimization. Either Neptune should support adding steps out of order, or `NeptuneCallback` should handle it somehow (e.g. by using an artificial step number).
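To illustrate the "artificial step number" idea, here is a minimal sketch in plain Python. It is not part of Neptune or of `NeptuneCallback`: the class `MonotonicStepLogger`, the `sink` callable, and `run_demo` are hypothetical names invented for this example. The sink stands in for whatever call actually writes a point to a series. The step is allocated and the point is emitted under a single lock, so points reach the sink in the same order as their step numbers, regardless of which trial finishes first.

```python
import itertools
import threading
from concurrent.futures import ThreadPoolExecutor


class MonotonicStepLogger:
    """Hypothetical wrapper that assigns an artificial, strictly increasing step."""

    def __init__(self, sink):
        self._sink = sink  # any callable taking (step, value); an assumption of this sketch
        self._lock = threading.Lock()
        self._counter = itertools.count()

    def log(self, value):
        # Allocate the step and emit the point while holding the lock, so the
        # sink never sees a smaller step after a larger one.
        with self._lock:
            step = next(self._counter)
            self._sink(step, value)
        return step


def run_demo(n_trials=50, n_workers=8):
    """Log values from several threads and return the (step, value) pairs as received."""
    received = []
    logger = MonotonicStepLogger(lambda step, value: received.append((step, value)))
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Workers call log() in an arbitrary order, mimicking parallel trials.
        list(pool.map(logger.log, range(n_trials)))
    return received
```

The trade-off is that the step no longer equals the trial number, so the trial number would have to be logged separately if it is needed for analysis.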
What version are you seeing the problem on?
v1.x
How to reproduce the bug
study.optimize(..., callbacks=[NeptuneCallback(run)], n_jobs=-1)
Error messages and logs
[neptune] [error ] Error occurred during asynchronous operation processing: X-coordinates (step) must be strictly increasing for series attribute: trials/values. Invalid point: 0.0
Environment
Any multi-threaded environment.
More info
No response
Hi @iirekm. I had the same problem when working with Neptune. I was logging metrics during the train, val, and test phases, and later realized that I was using the same names for the metrics in the metrics dictionary. Sometimes I was even using the same Torchmetrics instance in all the phases. Perhaps you're doing the same; could you check again? I am not a pro at this, just hoping that it is the same gotcha as mine. Sorry if it doesn't work.
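The gotcha described in this comment can be sketched without any logging library. The helper names below (`namespaced`, `collect_steps`, `is_strictly_increasing`) and the sample `EVENTS` are made up for illustration and are not taken from Neptune or Torchmetrics. When every phase writes to the same metric name, the steps seen by that one series restart at each phase. Prefixing the name with the phase gives each phase its own series.

```python
from collections import defaultdict


def namespaced(phase, metrics):
    """Prefix every metric name with its phase, e.g. 'acc' -> 'train/acc'."""
    return {f"{phase}/{name}": value for name, value in metrics.items()}


def is_strictly_increasing(steps):
    """Return True if each step is larger than the one before it."""
    return all(a < b for a, b in zip(steps, steps[1:]))


def collect_steps(events, use_namespace):
    """Group the steps seen by each series name, in arrival order."""
    series = defaultdict(list)
    for phase, step, metrics in events:
        named = namespaced(phase, metrics) if use_namespace else metrics
        for name in named:
            series[name].append(step)
    return dict(series)


# Illustrative data: each phase counts its own steps starting from 0.
EVENTS = [
    ("train", 0, {"acc": 0.50}),
    ("train", 1, {"acc": 0.60}),
    ("val", 0, {"acc": 0.40}),
    ("val", 1, {"acc": 0.55}),
]

shared = collect_steps(EVENTS, use_namespace=False)
split = collect_steps(EVENTS, use_namespace=True)
print(is_strictly_increasing(shared["acc"]))
print({name: is_strictly_increasing(steps) for name, steps in split.items()})
```

This only covers the shared-name case from this comment. It does not address out-of-order steps caused by parallel trials in the original report.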