cerfacs-globc/icclim

BUG: RuntimeWarning: All-NaN slice encountered

pagecp opened this issue · 2 comments

  • icclim version: 5.4.0
  • Python version: 3.9

Description

Sometimes during a calculation I get an "All-NaN slice encountered" warning. Afterwards the process gets Killed, but I cannot confirm the two are related, since only a warning is raised. I will investigate further by running manually on the input file.

Sorry for the sparse information in this issue... it runs in batch, so I have few details until I run it manually.

Minimal reproducible example

/home/jovyan/work/data/CMIP6/ScenarioMIP/CNRM-CERFACS/CNRM-CM6-1-HR/ssp585/r1i1p1f2/CSU/gr/v20191202/CSU_day_CNRM-CM6-1-HR_ssp585_r1i1p1f2_gr_20650101-21001231.nc
Processing CSU and creating /home/jovyan/work/data/CMIP6/ScenarioMIP/CNRM-CERFACS/CNRM-CM6-1-HR/ssp585/r1i1p1f2/CSU/gr/v20191202/CSU_day_CNRM-CM6-1-HR_ssp585_r1i1p1f2_gr_20650101-21001231.nc

Output received

2022-10-13 12:23:43,995 Calculating climate index: CSU
/opt/conda/lib/python3.9/site-packages/dask/array/reductions.py:608: RuntimeWarning: All-NaN slice encountered
return np.nanmax(x_chunk, axis=axis, keepdims=keepdims)
Killed

bzah commented

I don't think the warning is related to the process being killed.
The "All-NaN slice encountered" message is displayed by numpy when np.nanmax (or another NaN-aware function) is computed over an array made only of NaNs.
See:

>>> import numpy as np
>>> np.nanmax(np.asanyarray([np.nan]))
<ipython-input-173-445f05e52077>:1: RuntimeWarning: All-NaN slice encountered
  np.nanmax(np.asanyarray([np.nan]))
Out[173]: nan
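To check in a batch run whether the warning only reflects all-NaN cells in the input (for example masked grid cells), something like the following sketch could help. The toy array and the variable names are hypothetical and are not taken from icclim or from the input file in this issue.

```python
import warnings

import numpy as np

# Hypothetical toy data: 3 time steps x 2 grid cells.
# The second grid cell is NaN at every time step.
data = np.array([[1.0, np.nan],
                 [3.0, np.nan],
                 [2.0, np.nan]])

# Record warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = np.nanmax(data, axis=0)

# Boolean mask of the grid cells that contain only NaNs along the time axis.
all_nan_cells = np.isnan(data).all(axis=0)

print(result)
print(all_nan_cells)
print([str(w.message) for w in caught])
```

The computation still completes: the all-NaN cell yields NaN in the result, and the warning is only informational.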

I haven't played too much with dask recently, but I guess it's memory related.
You may try giving dask fewer workers and/or fewer threads per worker, or a larger memory pool. Careful though: on a LocalCluster/Client the pool is per worker, so make sure you have n_worker * mem_pool memory available.
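As a minimal sketch of that suggestion, assuming dask.distributed is installed: the helper names `total_memory_gib` and `make_local_client` and the default values are hypothetical and would need tuning for the actual machine. `LocalCluster` accepts `n_workers`, `threads_per_worker`, and a per-worker `memory_limit`.

```python
def total_memory_gib(n_workers: int, memory_limit_gib: float) -> float:
    """Memory the machine should have free: the limit applies per worker."""
    return n_workers * memory_limit_gib


def make_local_client(n_workers: int = 2,
                      threads_per_worker: int = 1,
                      memory_limit_gib: float = 4):
    """Start a small local dask cluster (hypothetical helper, assumed values)."""
    # Imported here so the memory arithmetic above works without dask installed.
    from dask.distributed import Client, LocalCluster

    cluster = LocalCluster(
        n_workers=n_workers,
        threads_per_worker=threads_per_worker,
        memory_limit=f"{memory_limit_gib}GiB",  # per worker, not in total
    )
    return Client(cluster)


# With 2 workers limited to 4 GiB each, 8 GiB should be available in total.
print(total_memory_gib(2, 4))
```

Calling `make_local_client()` before starting the computation should make later dask computations in that process use this cluster; whether it avoids the kill depends on the real memory footprint of the job.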

Also, I don't know which machine you run this on, but if it's one from Cerfacs, it could also be due to the file creation issue that Laurent talked about earlier. Dask can create quite a few files, especially when memory is limited.

Yes, it may be a memory leak in long-running scripts. It runs on the CMCC cluster. I will investigate, and I guess we can close the issue for now, until I have more info.