lbl-srg/funnel

compareAndReport overrides small atoly values


If you pass compareAndReport an atoly that is zero (or smaller than about 1e-10), an atoly of 0.1 is used instead. (This issue may affect other tolerance parameters, but I have not checked.)

Running the following from ./tests/test_bin I cannot reproduce this issue.

In [5]: >>> ref = pd.read_csv('trended.csv')
   ...: >>> test = pd.read_csv('simulated.csv')
   ...: >>> pyfunnel.compareAndReport(xReference=ref.iloc(axis=1)[0], yReference=ref.iloc(axis=1)[1],
   ...: ... xTest=test.iloc(axis=1)[0], yTest=test.iloc(axis=1)[1], atolx=0.002, atoly=0.0)
Output directory not specified: results are stored in subdirectory `results` by default.

In [7]: cat results/upperBound.csv
x,y
-0.002000,0.000000
0.048000,0.000000
0.098000,1.000000
0.202000,1.000000
0.249000,0.060000
0.298000,3.000000
0.352000,3.000000
0.402000,0.000000
0.448000,0.000000
0.498000,4.000000
0.552000,4.000000
0.602000,0.000000
0.648000,0.000000
0.698000,0.500000
0.752000,0.500000
0.802000,0.000000
1.002000,0.000000

In [8]: >>> pyfunnel.compareAndReport(xReference=ref.iloc(axis=1)[0], yReference=ref.iloc(axis=1)[1],
   ...: ... xTest=test.iloc(axis=1)[0], yTest=test.iloc(axis=1)[1], atolx=0.002, atoly=0.1)
Output directory not specified: results are stored in subdirectory `results` by default.

In [9]: cat results/upperBound.csv
x,y
-0.002000,0.100000
0.048000,0.100000
0.098000,1.100000
0.202000,1.100000
0.249000,0.160000
0.298000,3.100000
0.352000,3.100000
0.402000,0.100000
0.448000,0.100000
0.498000,4.100000
0.552000,4.100000
0.602000,0.100000
0.648000,0.100000
0.698000,0.600000
0.752000,0.600000
0.802000,0.100000
1.002000,0.100000

Could you provide a minimum example or the code snippet that may explain this issue?

Sorry for the imprecision. I hadn't run every combination of tolerance arguments.

If I run your example passing only the tolerance argument ltoly=0.1, then +/-0.1 becomes the absolute y tolerance wherever the reference value is zero (instead of a tolerance of zero). But if I also pass, for example, atoly=1e-6, then it correctly uses that value.

In my brief look at the Python and C code I don't see the cause of this strange behavior.

That happens here: https://github.com/lbl-srg/funnel/blob/master/src/tubeSize.c#L122-L142
The tube size is built by taking the maximum over all the tolerance arguments.
However, to keep the relative tolerance from vanishing in the vicinity of zero, the relative tolerance is treated as an absolute tolerance whenever the resulting tube size is zero.
Do you see a better way to guard against a zero tube size near zero when using a relative tolerance?
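
To make the behavior described above concrete, here is a minimal Python sketch of the rule as I understand it from this thread. It is not a transcription of tubeSize.c: the function name `tube_half_height` is hypothetical, and I am assuming that ltoly scales with the local reference value and rtoly with the range of the reference data.

```python
def tube_half_height(y_ref, y_range, atoly=0.0, ltoly=0.0, rtoly=0.0):
    """Hypothetical sketch of the tube size in y at one reference point.

    Assumptions (not taken from tubeSize.c): ltoly is relative to the local
    value |y_ref|, rtoly is relative to the range of the reference data.
    """
    # Take the largest tolerance implied by any of the arguments.
    size = max(atoly, ltoly * abs(y_ref), rtoly * y_range)
    if size == 0.0:
        # Fallback described in the thread: the relative tolerance is
        # reinterpreted as an absolute one so the tube never collapses.
        size = max(ltoly, rtoly)
    return size


# Only ltoly given, reference value is zero: the fallback kicks in.
print(tube_half_height(0.0, 4.0, ltoly=0.1))
# A small atoly is also given: the tube size is nonzero, so no fallback.
print(tube_half_height(0.0, 4.0, atoly=1e-6, ltoly=0.1))
```

Under these assumptions the sketch reproduces both observations reported above: with only ltoly=0.1 the tolerance at a zero reference value is 0.1, and adding atoly=1e-6 makes 1e-6 the tolerance there.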

My inclination would be to use the tolerances exactly as specified and adapt the logic, if needed, to allow a zero tube size. If anything gets divided by the tube size, then that will need a special case.

Using the relative tolerance as the minimum absolute tolerance, but only when the value is near zero, seems confusing and surprising.
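
As a rough illustration of that suggestion, here is a Python sketch in which the tube size is allowed to be zero and the only special case sits at a division. Both function names are hypothetical, the meaning of ltoly (relative to the local value) and rtoly (relative to the data range) is my assumption, and I have not checked whether funnel actually divides by the tube size anywhere.

```python
import math


def strict_tube_half_height(y_ref, y_range, atoly=0.0, ltoly=0.0, rtoly=0.0):
    """Hypothetical: use the specified tolerances as given, zero allowed."""
    return max(atoly, ltoly * abs(y_ref), rtoly * y_range)


def normalized_error(y_test, y_ref, size):
    """Hypothetical: error measured in units of the tube size.

    A zero tube size means an exact match is required, so any nonzero
    deviation is reported as infinitely far outside the tube.
    """
    err = abs(y_test - y_ref)
    if size == 0.0:
        return 0.0 if err == 0.0 else math.inf
    return err / size


size = strict_tube_half_height(0.0, 4.0, ltoly=0.1)
print(size)                              # zero tube at a zero reference value
print(normalized_error(0.0, 0.0, size))  # exact match is still inside
print(normalized_error(0.5, 0.0, size))  # any deviation is outside
```

The trade-off is that a purely relative tolerance then demands an exact match wherever the reference is zero, so users who want slack near zero would need to pass atoly explicitly.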