ValueError exception in loss function
jcnossen opened this issue · 1 comment
jcnossen commented
Line 230 of /decode/neuralfitter/loss.py sometimes raises "ValueError: The parameter probs has invalid values" because of numerical inaccuracies in a sum on CUDA. The inaccuracy trips the Simplex constraint check in the Categorical distribution:
return torch.all(value >= 0, dim=-1) & ((value.sum(-1) - 1).abs() < 1e-6)
This can be fixed by setting validate_args=False:
mix = distributions.Categorical(prob_normed[p_inds].reshape(batch_size, -1), validate_args=False)
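If switching validation off entirely feels too blunt, one possible alternative is to keep the check and normalise the probabilities in float64 first, so that the rounding error of the sum stays well below the 1e-6 tolerance. This is only a sketch: `safe_categorical` is a hypothetical helper, not part of DECODE, the random tensor is a toy stand-in for `prob_normed[p_inds].reshape(batch_size, -1)`, and it assumes the failure really is float32 rounding rather than NaNs or negative values (those are reported separately).

```python
import torch
from torch import distributions


def safe_categorical(probs: torch.Tensor) -> distributions.Categorical:
    """Hypothetical helper: build a Categorical with validation left on."""
    # Report genuinely broken inputs with a clearer message than the
    # generic "parameter probs has invalid values".
    if not torch.isfinite(probs).all():
        raise ValueError("probs contains NaN or Inf")
    if (probs < 0).any():
        raise ValueError("probs contains negative entries")
    if (probs.sum(-1) <= 0).any():
        raise ValueError("at least one row of probs sums to zero")

    # Normalise in float64 so the row sums are accurate far beyond the
    # 1e-6 tolerance used by the simplex constraint.
    probs64 = probs.double()
    probs64 = probs64 / probs64.sum(-1, keepdim=True)
    return distributions.Categorical(probs=probs64, validate_args=True)


# Toy stand-in for the reshaped probability map: 2 frames, 4096 pixels each.
torch.manual_seed(0)
p = torch.rand(2, 4096)
mix = safe_categorical(p)
row_sums = mix.probs.sum(-1)
```

Note that the resulting distribution holds float64 probabilities, so any log-probabilities derived from it will be float64 as well; whether that matters for the rest of the loss would need to be checked.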
Haydnspass commented
Hey,
Thanks for reporting this.
In my experience this happens when the simulation parameters are somewhat odd. Do you have a parameter file that can reproduce this error reliably?
Did the training run stably once you disabled argument validation?