Noncentral F distribution CDF edge cases

Question

Noncentral F distribution CDF edge cases

Opened this issue 3 months ago · 3 comments

SciPy is thanks to Boost finally able to remove parts of the ancient cdflib Fortran library. Recently, the noncentral F distribution CDF was fully replaced by the boost implementation in scipy/scipy#21505, resulting in a far more accurate results for a wide range of inputs. Still, edge cases were uncovered which we would like to share with you. They occur for very large and very tiny values for the degrees of freedom. Note that for the second case even Mathematica is not able to compute a result, so maybe we do not need to thrive for a feasible value but raise an error or return nan. @mdhaber found a limiting formula that might be helpful: scipy/scipy#21505 (comment)

v1	v2	l	x	Boost result
1e-100	3	1.5	1e100	nan
1e20	1e20	1	1	-124.40303872682864

Answer 1 · 2024-09-13T18:06:41.000Z

Reproduced, first one is an easy fix. Not so sure about the second yet...

Answer 2 · 2024-10-03T18:47:17.000Z

Update: working on this now, there are some easy to fix numerical-stability issues which sort of fix both, but ideally we need some asymptotic expansions for the incomplete beta with large parameters: these exist, and I have them somewhat working, but they are deeply numerically unstable, especially near the saddle point. Hopefully more soon...

Answer 3 · 2024-11-21T11:45:01.000Z

FYI: We found a likely related edge for the noncentral t distribution CDF in scipy/scipy#21728 (comment) .

For df=980, noncentrality=38, x=1.5 boost computes a negative value of the order of -1e-70.