StatMixedML/XGBoostLSS

Distribution error: Tensor of shape..

Closed this issue · 16 comments

First of all, thanks for the new PyTorch version.
I've been using the previous versions, and when I saw the new release today I wanted to give it a try with the data and code that worked fine in the previous version.

After I edited the old code to match the new version, following the examples, I noticed some problems with the distributions. When I pass my label data to the optimization or training, I get something along the lines of:

Expected value argument (Tensor of shape (8700,)) to be within the support (GreaterThan(lower_bound=0.0)) of the distribution LogNormal(), but found invalid values:
tensor([0.2782, 0.3064, 0.3202,  ..., 0.3338, 0.3202, 0.3202])

or

Expected parameter scale (Tensor of shape ()) of distribution Normal(loc: -974.14453125, scale: -1210.6866455078125) to satisfy the constraint GreaterThan(lower_bound=0.0), but found invalid values:
-1210.6866455078125

This happens with every distribution that I've tried.
My dataset is between 0.09 and 0.9 in value; I've tried similar datasets and got similar results. With one dataset I managed to run the model by multiplying the values by 10, but for other datasets that does not work.

Note that all of these datasets worked fine with the previous version. Do you know what the reason might be?

@sirdawar Thanks for the interest in the project.

My dataset is between 0.09 and 0.9 in value, ....

This looks like a Beta-distributed dataset, so fitting a continuous real-valued distribution like the LogNormal/Normal can cause stability issues. But since you've tried other distributions as well, this might not be the reason. The reason you receive the error is that the parameter constraint for the scale parameter is not met: it has to be positive, but in your example it has negative values.
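For illustration (not from the thread): PyTorch validates distribution parameters on construction by default, so a negative scale fails immediately with exactly this kind of constraint message. A minimal sketch:

```python
import torch
from torch.distributions import Normal

# PyTorch checks parameter constraints when the distribution is built,
# so a negative scale raises a ValueError right away.
try:
    Normal(loc=torch.tensor(-974.14), scale=torch.tensor(-1210.69))
except ValueError as err:
    print(err)  # "Expected parameter scale ... GreaterThan(lower_bound=0.0) ..."
```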

I suspect it has to do with the starting values. I'll look into it and report back.

Hi there, I believe I may have run into the same issue with the predicted distribution shape. I've been using this library for a few months with the "Expectile" distribution. After the upgrade to PyTorch, I noticed a dramatic shift in the scale and location of the predicted data distributions. It seems that test predictions are being re-centered at zero, and the expectiles also flip in orientation, i.e., the 0.95 expectile sits below the 0.05 expectile.

I made some adjustments to the synthetic Gaussian simulation example and found the same to be true.


@maxfield-green Thanks for your good analysis.

Would you be so kind as to share the notebook, so that I can look into it and replicate the issue? Thanks!

It looks like the start_values are not used correctly: to initialize the models, we use the unconditional MLE. For this to work, we set base_score=0 as shown here: base_margin. We then initialize the train and test set with the start_values, as shown here: https://github.com/StatMixedML/XGBoostLSS/blob/master/xgboostlss/model.py#L124. I suppose this missing use of the start_values causes the predictions to be centered around 0.
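To make the mechanics concrete, here is a minimal sketch (the start values are made up; only xgboost's own API is used): with base_score=0, the unconditional MLE estimates have to be injected as a base_margin on each DMatrix, otherwise boosting starts from 0.

```python
import numpy as np
import xgboost as xgb

# Hypothetical unconditional MLE start values, one per distributional parameter.
start_values = np.array([0.5, 1.2])

X = np.random.rand(100, 4)
dtrain = xgb.DMatrix(X)

# Replicate the start values across all rows and set them as base_margin,
# so boosting starts from the unconditional fit rather than from base_score=0.
init_margin = np.tile(start_values, (dtrain.num_row(), 1))
dtrain.set_base_margin(init_margin.flatten())

# If this step is skipped for the test DMatrix, predictions start from 0,
# which matches the re-centering described above.
```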

Here is a link to the notebook : https://github.com/maxfield-green/XGBoostLSS/blob/master/examples/simulation_example_Expectile_v2.ipynb

Thank you for your quick response. I'm not seeing that parameter used in the example, but from a theoretical standpoint this makes sense. I hadn't been taking any steps to set base_score=0 in past modeling work with the library.

@maxfield-green Ok cool, thanks for the notebook.

Setting base_margin is not something you do yourself; it happens implicitly in the code. So no worries, it's nothing you could have accounted for.

@sirdawar, @maxfield-green I'll report back once your problems are fixed.

@sirdawar, @maxfield-green I have made changes to the corresponding functions. Can you please re-install the package and test whether it is working now? Thanks.

@StatMixedML Just tried again using your distribution-selection example and also the optimization example, and I'm still getting similar errors.
I've attached an example of X_train and Y_train data that you can check.

Expected parameter loc (Tensor of shape (1, 1)) of distribution Normal(loc: tensor([[nan]], grad_fn=), scale: tensor([[nan]], grad_fn=)) to satisfy the constraint Real(), but found invalid values:
tensor([[nan]], grad_fn=)

Traindata_for xgblss_test.zip

@sirdawar I have made changes to the corresponding functions. Can you please re-install the package and test whether it is working now? Also, please use the following notebook:
How To Choose A Distribution.zip

That works now for choosing a distribution, but I still can't get the hyperparameter tuning to run.


Expected parameter concentration (Tensor of shape (13850, 1, 2)) of distribution Dirichlet(concentration: torch.Size([13850, 1, 2])) to satisfy the constraint IndependentConstraint(GreaterThan(lower_bound=0.0), 1), but found invalid values:
tensor([[[0.0302,    nan]],

        [[0.7349,    nan]],

        [[2.3822,    nan]],

        ...,

        [[1.7256,    nan]],

        [[0.0510,    nan]],

        [[0.0316,    nan]]])

@sirdawar Can you re-install the package and try again with the following notebook:
Beta Example.zip

This now works; I tried Beta, LogNormal and Gaussian.

Beta and LogNormal work fine. With Gaussian the optimization runs, but when running predict it again gives the error:

Expected parameter loc (Tensor of shape (4210,)) of distribution Normal(loc: torch.Size([4210]), scale: torch.Size([4210])) to satisfy the constraint Real(), but found invalid values:
tensor([nan, nan, nan,  ..., nan, nan, nan])

@sirdawar Glad that the Beta and LogNormal are now working.

Given the skewness and the support (0, 1) of the data, I wouldn't recommend using a symmetric distribution like the Gaussian, since this very likely leads to numerical instabilities. This is also what you see in your output.

In case you still want to use the Gaussian, you can try to

  • change the response function to softplus
  • transform the response, e.g., y*10, and see if this runs

I have attached an example that runs end-to-end using the Gaussian. I used softplus instead of exp when initializing the distribution; a sketch of that change is below.
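As a rough sketch of that change (the exact keyword arguments are from memory of the v0.2 API and may differ slightly):

```python
from xgboostlss.model import XGBoostLSS
from xgboostlss.distributions.Gaussian import Gaussian

# softplus maps the raw scale output to (0, inf) like exp does, but grows
# linearly instead of exponentially, which tends to be more stable.
xgblss = XGBoostLSS(
    Gaussian(
        stabilization="None",
        response_fn="softplus",  # instead of the default "exp"
        loss_fn="nll",
    )
)
```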

Thanks again for using the package!

Gaussian Example.zip

@maxfield-green Do you still have the problems when estimating expectiles?

Thanks for this, yes, with softplus it works.
Could you explain in more detail how the new version differs from the old one? Can we expect different results with different versions?

@sirdawar The previous version mostly used analytical gradients and hessians.

With the new release v0.2.1, we now rely exclusively on PyTorch and its automatic differentiation of gradients and hessians. Automatic differentiation enables efficient computation of gradients and hessians and also offers greater flexibility to incorporate custom loss functions into XGBoostLSS workflows. Hence, it is easier to add new distributions, and the structure across distributions is more unified. A minimal illustration is below.
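As a small illustration of the principle (generic PyTorch, not XGBoostLSS internals): the gradient and hessian of the negative log-likelihood with respect to the raw predicted parameters fall out of autograd directly:

```python
import torch
from torch.distributions import Normal

y = torch.tensor([0.3, 0.5, 0.7])
raw = torch.zeros(2, requires_grad=True)     # raw (unconstrained) parameters

loc, scale = raw[0], torch.exp(raw[1])       # response fn maps scale to > 0
nll = -Normal(loc, scale).log_prob(y).sum()  # negative log-likelihood

# First derivatives, keeping the graph so we can differentiate again.
grad = torch.autograd.grad(nll, raw, create_graph=True)[0]

# Diagonal of the hessian via a second pass of autograd.
hess_diag = torch.stack([
    torch.autograd.grad(g, raw, retain_graph=True)[0][i]
    for i, g in enumerate(grad)
])
print(grad.detach(), hess_diag.detach())
```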

As far as varying results are concerned: there are some differences between scipy and PyTorch distributions, e.g., in how distributions are parameterized (see the sketch below). Yet, we shouldn't expect to see large differences.
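For example, the same LogNormal density is parameterized differently in scipy and PyTorch, which is the kind of difference meant here:

```python
import numpy as np
from scipy.stats import lognorm
import torch
from torch.distributions import LogNormal

mu, sigma, x = 0.2, 0.5, 1.3

# torch parameterizes by the mean/std of log(X); scipy by shape s=sigma
# and scale=exp(mu). Both give the same density.
p_torch = LogNormal(mu, sigma).log_prob(torch.tensor(x)).exp().item()
p_scipy = lognorm.pdf(x, s=sigma, scale=np.exp(mu))
assert np.isclose(p_torch, p_scipy)
```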

Closing this issue.