can't save to netcdf
Hi, I'm new to Python, so maybe the question is very simple.
I built a Binder on GitHub using Pangeo Binder, which can handle the huge MITgcm LLC4320 dataset. When I use to_netcdf, only 2D (i, j) data can be transferred; when the data include k, like 2D (i, k) or 3D (i, j, k), it stops at 83% each time. I'm not sure what happened, can anyone help me? jupyter notebook address
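For context, here is a minimal sketch of the workflow described above (the model class, variable name, and region are assumptions; the actual code is in the linked notebook):

```python
import xmitgcm.llcreader as llcreader

# Open the LLC4320 model output lazily (assumed entry point; the notebook may differ)
model = llcreader.ECCOPortalLLC4320Model()
ds = model.get_dataset(varnames=['Theta'], k_chunksize=90)

# Select a small region and the top 5 vertical levels, then save.
# Saving 2D (i, j) slices works, but anything involving k stalls at ~83%.
subset = ds.Theta.isel(time=0, k=slice(0, 5),
                       face=1, i=slice(0, 500), j=slice(0, 500))
subset.to_netcdf('theta_subset.nc')
```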
Perhaps you are running out of memory? Does your session crash, or does it just hang here forever?
I can download more than 1 GB of 2D (i, j) data; this one is really small compared to that.
I think you are confusing the final file size with the intermediate memory usage. Because of the way the LLC data are stored, you may need to download a large amount of data before you can subset it.
I noticed you are using k_chunksize=90. This means that you are downloading all 90 vertical levels, then subsetting the top 5. Try instead with k_chunksize=5 or even k_chunksize=1.
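Roughly, the change being suggested (a sketch, assuming the dataset is opened through llcreader.get_dataset as in the notebook):

```python
# With k_chunksize=90, each chunk spans all 90 vertical levels, so even
# selecting k=0..4 forces full-depth chunks to be downloaded first.
# A smaller k_chunksize keeps the intermediate download close to what
# is actually needed.
ds = model.get_dataset(varnames=['Theta'], k_chunksize=5)
subset = ds.Theta.isel(time=0, k=slice(0, 5),
                       face=1, i=slice(0, 500), j=slice(0, 500))
subset.to_netcdf('theta_top5.nc')
```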
I can confirm it works for me if you change k_chunksize=90 to k_levels=[0, 1, 2, 3, 4].
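In other words (a sketch under the same assumptions as above), only the listed levels are ever requested:

```python
# k_levels restricts the dataset to the listed vertical levels up front,
# so no full-depth chunks need to be fetched before subsetting.
ds = model.get_dataset(varnames=['Theta'], k_levels=[0, 1, 2, 3, 4])
ds.Theta.isel(time=0, face=1,
              i=slice(0, 500), j=slice(0, 500)).to_netcdf('theta_top5.nc')
```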
I have a further question on this front. It seems that for me things work when k_levels is small in number (around 10) or when k_chunksize is not too small (it fails if this value is 1 or 2). I tried k_levels=range(0, 56) and that also failed. Things fail at the get_dataset step itself.
Any idea why?
Can you post the full error and traceback for "things failed"?
There is no error; the kernel just crashes/restarts. (I am using Pangeo Google Cloud, large size.)