ValueError exception in loss function

Question

jcnossen opened this issue 3 years ago · 1 comment

Line 230 of /decode/neuralfitter/loss.py sometimes raises "ValueError: The parameter probs has invalid values" because of numerical inaccuracies in a sum on CUDA. The inaccuracy trips the Simplex constraint check in the Categorical distribution:

return torch.all(value >= 0, dim=-1) & ((value.sum(-1) - 1).abs() < 1e-6)

This can be worked around by setting validate_args=False, which skips the check:

mix = distributions.Categorical(prob_normed[p_inds].reshape(batch_size, -1), validate_args=False)
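If switching validation off entirely feels too blunt, one possible alternative is to pass log-probabilities instead of probabilities. This is only a sketch under assumptions: the helper name categorical_from_probs is hypothetical and not part of DECODE, and it has not been tried inside loss.py. With logits, Categorical normalises internally via logsumexp and validates against a real-vector constraint instead of the Simplex constraint, so small rounding errors in the sum no longer raise, while NaN values are still caught.

```python
import torch
from torch import distributions


def categorical_from_probs(probs: torch.Tensor) -> distributions.Categorical:
    """Hypothetical helper: build a Categorical from (roughly) normalised probs.

    Assumption: `probs` has the categories on the last dimension, e.g. the
    reshaped `prob_normed[p_inds]` tensor from the snippet above.
    """
    # Clamp to the smallest positive normal float so log() never returns -inf.
    # Note that this also silently lifts small negative entries to `tiny`.
    tiny = torch.finfo(probs.dtype).tiny
    logits = probs.clamp_min(tiny).log()

    # Validation stays on: NaN inputs still raise a ValueError, but the
    # sum-to-one tolerance of the Simplex check is no longer applied.
    return distributions.Categorical(logits=logits, validate_args=True)


if __name__ == "__main__":
    torch.manual_seed(0)
    p = torch.rand(4, 1000)
    p = p / p.sum(-1, keepdim=True)
    mix = categorical_from_probs(p)
    print(mix.probs.sum(-1))  # each row sums to approximately 1
```

In the loss, the call would then read mix = categorical_from_probs(prob_normed[p_inds].reshape(batch_size, -1)). Whether this changes training behaviour compared with validate_args=False would still need to be checked.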

Answer 1 · 2021-11-16T13:09:12.000Z

Hey,

Thanks for reporting this.
In my experience this happens when the simulation parameters are somewhat unusual. Do you have a parameter file that reproduces this error reliably?

Did the training run stably after you disabled argument validation?