Equiprobable discretization method doesn't work for lognormal distribution

Question

Equiprobable discretization method doesn't work for lognormal distribution

ezgioz opened this issue 5 years ago · 4 comments

I created a test file to compare the expected value generated by random draws with the expected value with our grids. (I chose the polynomial x^2)

For Normal distribution, I also check whether the expected value of our grids matches the exact integral of Normal distribution.

For Uniform and Normal distribution, the equiprobable method works quite well (Gauss-hermite even better for Normal). The results are closer with increased grid number.

There is an issue with Log-normal, we should check whether we use ppf function from scipy correctly in the processes_iid.py file

See these links for the lognormal functions in scipy

Related files are on my fork:

processes_iid.py
test_iid_processes.py --> Runs the test to compare results, works
temp_test_iid_processes.py --> This one is with plots and for looking in detail

Answer 1 · 2019-07-11T20:00:11.000Z

@ezgioz : thank you very much for the contribution.
I just looked at your code. Basically it seems all correct to me.
The issue (I think) with the lognormal test is that when mu and sigma are not very small, the lognormal distribution has a very high variance so all integration methods have very high variance (because of the exponential). So you should take smaller baseline values for the test.

Answer 2 · 2019-07-11T20:00:37.000Z

BTW, this should maybe be a PR rather than an issue ? ;-)

Answer 3 · 2019-07-17T12:44:08.000Z

@albop I added beta distribution as well and sent the PR.

For lognormal distribution yes, smaller baseline values give closer numbers for the test however the skewness of the plots (discretized vs continuous distribution) look very different for smaller values. (This is not the case for larger values)

Please see "temp_test_iid_processes.py" on my fork (or after the merge with the new PR) where I do some tests manually and do the plots.

Answer 4 · 2019-07-18T09:48:48.000Z

I'm not sure what the issue is, but it is probably a matter of different conventions. If you write line 164:

xvec = np.linspace(0,1,100)
ppf0 = scipy.stats.lognorm.ppf(xvec,s=σ, loc=μ, scale=np.exp(μ) )
ppf1 = distLog.ppf(xvec)

plt.plot(xvec, ppf0)
plt.plot(xvec, ppf1)

you see the ppf of distlog and that from scipy.stats don't match at all.