loosolab/TF-COMB

TFcomb changes numpy error handling

Closed this issue · 4 comments

NumPy has a default for handling floating-point errors. You can view the current setting with:

import numpy as np
np.geterr()
> {'divide': 'warn', 'over': 'warn', 'under': 'ignore', 'invalid': 'warn'}

After importing tfcomb, the settings are changed:

import tfcomb
np.geterr()
> {'divide': 'raise', 'over': 'raise', 'under': 'raise', 'invalid': 'raise'}

With these settings, each floating-point event raises an error. This is a problem because every function that uses numpy, even through a sub-dependency, can now start raising errors.
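To illustrate the effect described above, here is a minimal sketch that does not need tfcomb at all. It applies the same `all='raise'` setting by hand and then triggers an underflow in an ordinary array multiplication. The variable names (`previous`, `underflow_raised`) are only for illustration.

```python
import numpy as np

# Switch every floating-point event to 'raise', mimicking the state after `import tfcomb`.
previous = np.seterr(all='raise')  # seterr returns the settings it replaced

try:
    # 1e-200 * 1e-200 is too small to represent in float64, so this underflows.
    np.array([1e-200]) * np.array([1e-200])
    underflow_raised = False
except FloatingPointError as err:
    underflow_raised = True
    print("Raised:", err)
finally:
    np.seterr(**previous)  # restore the earlier settings
```

With the NumPy defaults (`'under': 'ignore'`) the same multiplication silently returns `0.0`, which is why code that used to work can start failing once the global setting is changed.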

For example, I ran a marker gene analysis in a session where tfcomb happened to be imported, and it suddenly failed with an underflow exception. This was really confusing, since I had run the same analysis with exactly the same versions and the same data a few weeks earlier.

I don't know where or why these settings are changed, but this is confusing behavior that we should change sooner rather than later. Maybe we can limit this error handling to stay within tfcomb? Or remove it altogether?

The "where" is easily found in the code here:

np.seterr(all='raise') # raise errors for runtimewarnings

For the "why", I believe we added it to ensure that we can catch exceptions in downstream modules, for example here:

TF-COMB/tfcomb/utils.py

Lines 1982 to 1987 in 2e52fbe

#Catch any exceptions from fitting
try:
    params = distribution.fit(data_finite)
except Exception as e:
    logger.error("Exception ({0}) occurred while fitting data to '{1}' distribution; skipping this distribution. Error message was: {2} ".format(e.__class__.__name__, distribution.name, e))
    continue

I agree, the global setting of errors is not nice. A good solution would be to make this local in the function where it is needed. I think it is possible to do something like:

with np.errstate(all='raise'):
      <code>

Can you locate any code-snippets where this might be important?
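A scoped version could look roughly like the sketch below. It uses `np.errstate`, NumPy's context manager for temporarily changing error handling, so the `'raise'` behavior only applies inside the `with` block. The helper `safe_ratio` is hypothetical and simply stands in for a call such as the distribution fit; it is not part of tfcomb.

```python
import numpy as np

def safe_ratio(numerator, denominator):
    """Hypothetical stand-in for a numpy-backed step that may fail numerically."""
    try:
        # Floating-point events raise only inside this block.
        with np.errstate(all='raise'):
            return numerator / denominator
    except FloatingPointError:
        # The caller can log and skip, much like the try/except around the fit.
        return None

before = np.geterr()
result = safe_ratio(np.array([1.0]), np.array([0.0]))  # division by zero
after = np.geterr()

print(result)           # None, because the division raised inside the block
print(before == after)  # True, the global settings were left untouched
```

Because the previous settings are restored when the block exits, other code that uses numpy in the same session keeps whatever error handling it had before.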

Thanks for showing where this is set! I couldn't find it earlier. Sure I will have a look at where this is important. I guess I will start to look at all the locations where the objects are used.

Great, thanks! Yes I think anywhere there is a try/except which could be related to back-end numpy is a good start. But I think it was mainly the example above as far as I remember.

Fixed with #59