Priesemann-Group/covid19_inference

NaN errors in example notebooks


When executing covid19_inference/scripts/interactive/example_paper_scenarios.ipynb on Google Colab (and on Kaggle as well), I get the following error:

Auto-assigning NUTS sampler...
INFO     [pymc3] Auto-assigning NUTS sampler...
Initializing NUTS using advi+adapt_diag...
INFO     [pymc3] Initializing NUTS using advi+adapt_diag...
WARNING (theano.tensor.blas): We did not find a dynamic library in the library_dir of the library we use for blas. If you use ATLAS, make sure to compile it with dynamics library.
WARNING  [theano.tensor.blas] We did not find a dynamic library in the library_dir of the library we use for blas. If you use ATLAS, make sure to compile it with dynamics library.
WARNING (theano.tensor.blas): We did not find a dynamic library in the library_dir of the library we use for blas. If you use ATLAS, make sure to compile it with dynamics library.
WARNING  [theano.tensor.blas] We did not find a dynamic library in the library_dir of the library we use for blas. If you use ATLAS, make sure to compile it with dynamics library.
Average Loss = 777.63:   0%|          | 386/200000 [00:01<16:12, 205.30it/s]

---------------------------------------------------------------------------

FloatingPointError                        Traceback (most recent call last)

<ipython-input-10-c20ebc4c3196> in <module>()
----> 1 trace = pm.sample(model=this_model, tune=500, draws=1000, init="advi+adapt_diag")

4 frames

/usr/local/lib/python3.7/dist-packages/pymc3/variational/inference.py in _iterate_with_loss(self, s, n, step_func, progress, callbacks)
    216                     except IndexError:
    217                         pass
--> 218                     raise FloatingPointError('\n'.join(errmsg))
    219                 scores[i] = e
    220                 if i % 10 == 0:

FloatingPointError: NaN occurred in optimization. 
The current approximation of RV `lambda_0_log_`.ravel()[0] is NaN.
The current approximation of RV `lambda_3_log_`.ravel()[0] is NaN.
The current approximation of RV `transient_day_3`.ravel()[0] is NaN.
The current approximation of RV `lambda_2_log_`.ravel()[0] is NaN.
The current approximation of RV `offset_modulation_rad_circular__`.ravel()[0] is NaN.
The current approximation of RV `weekend_factor_log`.ravel()[0] is NaN.
The current approximation of RV `delay_log`.ravel()[0] is NaN.
The current approximation of RV `transient_len_1_log_`.ravel()[0] is NaN.
The current approximation of RV `transient_day_1`.ravel()[0] is NaN.
The current approximation of RV `mu_log__`.ravel()[0] is NaN.
The current approximation of RV `transient_len_2_log_`.ravel()[0] is NaN.
The current approximation of RV `transient_len_3_log_`.ravel()[0] is NaN.
The current approximation of RV `I_begin_ratio_log`.ravel()[0] is NaN.
The current approximation of RV `lambda_1_log_`.ravel()[0] is NaN.
The current approximation of RV `transient_day_2`.ravel()[0] is NaN.
Try tracking this parameter: http://docs.pymc.io/notebooks/variational_api_quickstart.html#Tracking-parameters

The error is raised by the line trace = pm.sample(model=this_model, tune=500, draws=1000, init="advi+adapt_diag")

The same happens if I run python.exe .\example_paper_scenarios.py locally:

WARNING (theano.configdefaults): g++ not available, if using conda: `conda install m2w64-toolchain`
WARNING (theano.configdefaults): g++ not detected ! Theano will be unable to execute optimized C-implementations (for both CPU and GPU) and will default to Python implementations. Performance will be severely degraded. To remove this warning, set Theano flags cxx to an empty string.
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
INFO     [covid19_inference.data_retrieval._JHU] Successfully downloaded new files.
INFO     [covid19_inference.data_retrieval._JHU] Local backup to [...] successful.
INFO     [covid19_inference.model.spreading_rate] Lambda_t with sigmoids
INFO     [covid19_inference.model.model] relative_to_previous was set to default value False
INFO     [covid19_inference.model.model] pr_factor_to_previous was set to default value 1
INFO     [covid19_inference.model.model] relative_to_previous was set to default value False
INFO     [covid19_inference.model.model] pr_factor_to_previous was set to default value 1
INFO     [covid19_inference.model.model] relative_to_previous was set to default value False
INFO     [covid19_inference.model.model] pr_factor_to_previous was set to default value 1
INFO     [covid19_inference.model.compartmental_models] Uncorrelated prior_I
INFO     [covid19_inference.model.compartmental_models] SIR
INFO     [covid19_inference.model.delay] Delaying cases
INFO     [covid19_inference.model.week_modulation] Week modulation
.\example_paper_scenarios.py:126: FutureWarning: In v4.0, pm.sample will return an `arviz.InferenceData` object instead of a `MultiTrace` by default. You can pass return_inferencedata=True or return_inferencedata=False to be safe and silence this warning.
  trace = pm.sample(model=this_model, tune=500, draws=1000, init="advi+adapt_diag")
Auto-assigning NUTS sampler...
INFO     [pymc3] Auto-assigning NUTS sampler...
Initializing NUTS using advi+adapt_diag...
INFO     [pymc3] Initializing NUTS using advi+adapt_diag...
[....]
|--------------------------------------------------------------------------------| 0.74% [1474/200000 17:12<38:36:54 Average Loss = 709.21]C:\Users\Maximilian\Anaconda3\envs\covid19_inference2\lib\site-packages\theano\scalar\basic.py:1955: RuntimeWarning: invalid value encountered in true_divide
  return x / y
C:\Users\Maximilian\Anaconda3\envs\covid19_inference2\lib\site-packages\theano\tensor\elemwise.py:826: RuntimeWarning: invalid value encountered in impl (vectorized)
  variables = ufunc(*ufunc_args, **ufunc_kwargs)
 |--------------------------------------------------------------------------------| 0.74% [1475/200000 17:12<38:36:50 Average Loss = 709.21]Traceback (most recent call last):
  File ".\example_paper_scenarios.py", line 126, in <module>
    trace = pm.sample(model=this_model, tune=500, draws=1000, init="advi+adapt_diag")
  File "C:\Users\Maximilian\Anaconda3\envs\covid19_inference2\lib\site-packages\pymc3\sampling.py", line 496, in sample
    start_, step = init_nuts(
  File "C:\Users\Maximilian\Anaconda3\envs\covid19_inference2\lib\site-packages\pymc3\sampling.py", line 2121, in init_nuts
    approx = pm.fit(
  File "C:\Users\Maximilian\Anaconda3\envs\covid19_inference2\lib\site-packages\pymc3\variational\inference.py", line 832, in fit
    return inference.fit(n, **kwargs)
  File "C:\Users\Maximilian\Anaconda3\envs\covid19_inference2\lib\site-packages\pymc3\variational\inference.py", line 150, in fit
    state = self._iterate_with_loss(0, n, step_func, progress, callbacks)
  File "C:\Users\Maximilian\Anaconda3\envs\covid19_inference2\lib\site-packages\pymc3\variational\inference.py", line 238, in _iterate_with_loss
    raise FloatingPointError("\n".join(errmsg))
FloatingPointError: NaN occurred in optimization.
The current approximation of RV `lambda_1_log_`.ravel()[0] is NaN.
The current approximation of RV `transient_day_1`.ravel()[0] is NaN.
The current approximation of RV `transient_len_3_log_`.ravel()[0] is NaN.
The current approximation of RV `lambda_2_log_`.ravel()[0] is NaN.
The current approximation of RV `mu_log__`.ravel()[0] is NaN.
The current approximation of RV `delay_log`.ravel()[0] is NaN.
The current approximation of RV `transient_len_1_log_`.ravel()[0] is NaN.
The current approximation of RV `transient_day_2`.ravel()[0] is NaN.
The current approximation of RV `transient_len_2_log_`.ravel()[0] is NaN.
The current approximation of RV `weekend_factor_log`.ravel()[0] is NaN.
The current approximation of RV `offset_modulation_rad_circular__`.ravel()[0] is NaN.
The current approximation of RV `transient_day_3`.ravel()[0] is NaN.
The current approximation of RV `lambda_3_log_`.ravel()[0] is NaN.
The current approximation of RV `lambda_0_log_`.ravel()[0] is NaN.
The current approximation of RV `I_begin_ratio_log`.ravel()[0] is NaN.
Try tracking this parameter: http://docs.pymc.io/notebooks/variational_api_quickstart.html#Tracking-parameters
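
To compare failures like the two above, it can help to pull the variable names out of the error text. The following is only a minimal sketch, not part of covid19_inference or PyMC3: `nan_variables` is a hypothetical helper, and it assumes the message keeps the "RV `name`.ravel()[0] is NaN" format shown in the tracebacks.

```python
import re

# Matches lines such as:
#   The current approximation of RV `delay_log`.ravel()[0] is NaN.
# (format assumed from the FloatingPointError messages in this thread)
_NAN_RV = re.compile(r"RV `([^`]+)`\.ravel\(\)\[\d+\] is NaN")


def nan_variables(message):
    """Return the sorted, de-duplicated names of variables reported as NaN."""
    return sorted(set(_NAN_RV.findall(message)))


# Shortened copy of the error text, for illustration only.
example = """FloatingPointError: NaN occurred in optimization.
The current approximation of RV `lambda_0_log_`.ravel()[0] is NaN.
The current approximation of RV `delay_log`.ravel()[0] is NaN.
The current approximation of RV `mu_log__`.ravel()[0] is NaN.
Try tracking this parameter: ..."""

print(nan_variables(example))
```

Applied to the two full messages above, this gives the same 15 variables for the Colab run and the local run, only listed in a different order. That is consistent with the whole approximation turning NaN at once rather than a single parameter misbehaving.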

This is a problem with the priors; maybe it was a typo. I changed the values to the default ones (see e2b330a), and it is working for me now. Please give it another try :)

Best,
Sebastian

Seems to work now! Thank you very much for the quick fix!