NOAA-GFDL/FRE-NCtools

make_solo_mosaic error when getting value of variable from file

Hi All,

We are trying to use make_solo_mosaic to generate a mosaic file for MOM6 at extremely high resolution (a 36000x27000 horizontal grid), but we get the following message when we do this:

```
../fre_nctools/tools/make_solo_mosaic/make_solo_mosaic --num_tiles 1 --dir . --mosaic_name ocean_mosaic --tile_file ocean_hgrid.nc --periodx 360.
Error from pe 0: mpp_io(mpp_get_var_value): Error in getting value of variable x from file ./ocean_hgrid.nc: NetCDF: HDF error
```

I checked the make_solo_mosaic.c file, and it seems that the error occurs in this code segment:

```c
/*First read all the grid files.*/
  nxp = (int *)malloc(ntiles*sizeof(int));
  nyp = (int *)malloc(ntiles*sizeof(int));
  x = (double **)malloc(ntiles*sizeof(double *));
  y = (double **)malloc(ntiles*sizeof(double *));
  for(n=0; n<ntiles; n++) {
    char filepath[512];
    int fid, vid;
    sprintf(filepath, "%s%s",dir, tilefile[n]);
    fid = mpp_open(filepath, MPP_READ);
    nxp[n] = mpp_get_dimlen(fid, "nxp");
    nyp[n] = mpp_get_dimlen(fid, "nyp");
    x[n] = (double *)malloc(nxp[n]*nyp[n]*sizeof(double));
    y[n] = (double *)malloc(nxp[n]*nyp[n]*sizeof(double));
    vid = mpp_get_varid(fid, "tile");
    mpp_get_var_value(fid, vid, tile_name[n]);
    vid = mpp_get_varid(fid, "x");
    mpp_get_var_value(fid, vid, x[n]);
    vid = mpp_get_varid(fid, "y");
    mpp_get_var_value(fid, vid, y[n]);
    mpp_close(fid);
  }
```
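
For reference, a minimal standalone reader using the plain NetCDF C API (not the tool's mpp_io wrappers) would look something like the sketch below; it reads x in row chunks (the 1000-row chunk size is arbitrary) and keeps all index arithmetic in size_t, since for this grid nxp * nyp = 72001 * 54001 exceeds INT_MAX:

```c
#include <stdio.h>
#include <stdlib.h>
#include <netcdf.h>

/* Abort with a readable message on any NetCDF error. */
#define CHECK(e) do { int _s = (e); if (_s != NC_NOERR) { \
  fprintf(stderr, "NetCDF error: %s\n", nc_strerror(_s)); exit(1); } } while (0)

int main(void) {
  int ncid, varid, ydim, xdim;
  size_t nyp, nxp;

  CHECK(nc_open("ocean_hgrid.nc", NC_NOWRITE, &ncid));
  CHECK(nc_inq_dimid(ncid, "nyp", &ydim));
  CHECK(nc_inq_dimid(ncid, "nxp", &xdim));
  CHECK(nc_inq_dimlen(ncid, ydim, &nyp));   /* 54001 for this grid */
  CHECK(nc_inq_dimlen(ncid, xdim, &nxp));   /* 72001 for this grid */
  CHECK(nc_inq_varid(ncid, "x", &varid));

  /* ~31 GB for this grid; check the result instead of assuming success. */
  double *x = malloc(nyp * nxp * sizeof(double));
  if (x == NULL) {
    fprintf(stderr, "malloc of %zu bytes failed\n", nyp * nxp * sizeof(double));
    return 1;
  }

  /* Read 1000 rows per request instead of the whole variable at once. */
  const size_t rows = 1000;
  for (size_t j = 0; j < nyp; j += rows) {
    size_t start[2] = { j, 0 };
    size_t count[2] = { (j + rows <= nyp) ? rows : nyp - j, nxp };
    CHECK(nc_get_vara_double(ncid, varid, start, count, x + j * nxp));
  }

  CHECK(nc_close(ncid));
  free(x);
  return 0;
}
```

If this standalone reader also fails at the malloc or at the first chunked read, that would point at memory rather than at the file itself.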

The format of ocean_hgrid.nc is NetCDF-4.

I converted the ocean_hgrid file from NetCDF-4 to CDF-5 using nccopy and reran make_solo_mosaic; the error message now looks like this:

```
# fre_nctools/tools/make_solo_mosaic/make_solo_mosaic --num_tiles 1 --dir . --mosaic_name ocean_mosaic --tile_file ocean_hgrid.nc.cdf5 --periodx 360.
make_solo_mosaic: putget.c:7274: getNCvx_double_double: Assertion `value != NULL' failed.
Aborted (core dumped)
```

Is this error due to the NetCDF library, or does it have another cause?

Thank you!

@camscmip6 is there any way you could share the ocean_hgrid.nc file?

The ocean_hgrid.nc file is about 186 GB. Here is the header information for the file:
```
# ncdump -h ocean_hgrid.nc
netcdf ocean_hgrid {
dimensions:
	nyp = 54001 ;
	nxp = 72001 ;
	ny = 54000 ;
	nx = 72000 ;
	string = 255 ;
variables:
	double angle_dx(nyp, nxp) ;
		angle_dx:units = "degrees" ;
	double area(ny, nx) ;
		area:units = "m2" ;
	double dx(nyp, nx) ;
		dx:units = "meters" ;
	double dy(ny, nxp) ;
		dy:units = "meters" ;
	char tile(string) ;
	double x(nyp, nxp) ;
		x:units = "degrees" ;
	double y(nyp, nxp) ;
		y:units = "degrees" ;

// global attributes:
		:_NCProperties = "version=1|netcdflibversion=4.4.1|hdf5libversion=1.8.17" ;
}
```

I'll try to share this file.

My guess is that the size of the ocean_hgrid.nc file is causing a memory issue in the code. Sharing a file of that size will be difficult.
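
As a back-of-the-envelope check against the header above: each of the x and y variables is 72001 × 54001 doubles, so 72001 × 54001 × 8 bytes ≈ 31 GB per array, and the two coordinate arrays that make_solo_mosaic allocates together come to roughly 62 GB, each as a single contiguous allocation.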

@camscmip6
Handling arrays of the size you are interested in may require a few things from both our ends. Working with arrays of this size may also affect other NCTools applications (some of which may be in your workflow), and I think we need to understand a little more to plan development and scheduling, or even uncover workarounds.

  1. Most probably, the immediate cause of the failure is that malloc could not return a valid contiguous 32 GB (gigabytes) of memory for array x. This could be coded more defensively, such as by checking the status that malloc returns (see the sketch after this list), but even so, if the memory is unavailable there is no way around it. Note also that even after array x is allocated, other similarly large arrays must be allocated. Can you determine how much memory you have on your system? If you have more than 70 GB or so, is there a time you can run at which that memory may actually be available to the program?

  2. Are there other NCTools apps that you plan to use with the very large datasets?
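
A minimal sketch of the kind of status check mentioned in point 1, with the dimensions taken from the ncdump header above (illustrative only, not a patch to make_solo_mosaic.c):

```c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  /* Dimensions from the ncdump header above. */
  size_t nxp = 72001, nyp = 54001;
  /* Keep the size arithmetic in size_t: 72001 * 54001 = 3,888,126,001
   * overflows a 32-bit int; times 8 bytes this is ~31 GB. */
  size_t nbytes = nxp * nyp * sizeof(double);

  double *x = malloc(nbytes);
  if (x == NULL) {
    /* Fail with a clear message instead of passing NULL downstream. */
    fprintf(stderr, "could not allocate %zu bytes (%.1f GB) for array x\n",
            nbytes, nbytes / 1e9);
    return 1;
  }
  printf("allocated %.1f GB for array x\n", nbytes / 1e9);
  free(x);
  return 0;
}
```

A failed, unchecked allocation here would also be consistent with the "Assertion `value != NULL' failed" report from the CDF-5 run above, where a NULL buffer appears to reach the NetCDF read call.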

Out of curiosity, is your work related to the NCAR MUSICA work?
Thanks
Mike Zuniga