error processing netcdf file
fipoucat opened this issue · 8 comments
I am using metR to process a NetCDF file but I am getting an error message:
slp <- ReadNetCDF(file = slp_file, vars = "msl", out = "data.frame") %>%
  setNames(c("time", "latitude", "longitude", "value")) %>%
  select(lon, lat, time, value) %>% mutate(time = as.Date(time))
Error in ReadNetCDF(file = slp_file, vars = "msl", out = "data.frame") %>% :
1 assertions failed:
- Variable 'file': All elements must have at least 1 characters, but element
- 1 has 0 characters.
What does this mean, and is there any hint on how to fix it?
Cheers
What does slp_file contain? The error implies that it's an empty string.
The NetCDF file looks like this:
ncdump -h mslp_1991-2022.nc
netcdf mslp_1991-2022 {
dimensions:
        longitude = 281 ;
        latitude = 221 ;
        time = 47104 ;
variables:
        float longitude(longitude) ;
                longitude:units = "degrees_east" ;
                longitude:long_name = "longitude" ;
        float latitude(latitude) ;
                latitude:units = "degrees_north" ;
                latitude:long_name = "latitude" ;
        int time(time) ;
                time:units = "hours since 1900-01-01 00:00:00.0" ;
                time:long_name = "time" ;
                time:calendar = "gregorian" ;
        short msl(time, latitude, longitude) ;
                msl:scale_factor = 0.127916469564952 ;
                msl:add_offset = 100097.936041765 ;
                msl:_FillValue = -32767s ;
                msl:missing_value = -32767s ;
                msl:units = "Pa" ;
                msl:long_name = "Mean sea level pressure" ;
                msl:standard_name = "air_pressure_at_mean_sea_level" ;
So maybe the file is not being read correctly?
Could it be the file size (5 GB)?
No, I mean literally what the variable slp_file is storing. It should be the path to your file, but the error you are seeing suggests that it's an empty string.
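A quick way to check this before calling ReadNetCDF is sketched below in base R. The path is a hypothetical placeholder and is_valid_path is a helper made up for illustration, not part of metR:

```r
# Hypothetical placeholder path; replace with the real location of the file.
slp_file <- "path/to/mslp_1991-2022.nc"

# TRUE only for a single, non-missing, non-empty character string.
is_valid_path <- function(x) {
  is.character(x) && length(x) == 1 && !is.na(x) && nzchar(x)
}

is_valid_path(slp_file)  # should be TRUE before calling ReadNetCDF()
is_valid_path("")        # FALSE: the case the assertion complains about
file.exists(slp_file)    # should also be TRUE once the path points at a real file
```

If the first check fails, the problem is in how slp_file was assigned, not in the NetCDF file itself.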
Evidently the problem was the NetCDF file itself. I downloaded it with another method and can open it now. Another error occurred, though, which may be related to the size:
slp_file <- nc_open("/home/sarr/work/DIN/ERA5-mslp_1991-2022.nc",write=FALSE, readunlim=TRUE, verbose=FALSE, auto_GMT=TRUE, suppress_dimvals=FALSE, return_on_error=FALSE )
slp <- ReadNetCDF(file = slp_file, vars = "msl", out = "data.frame") %>%
  setNames(c("time", "lat", "lon", "value")) %>%
  select(lon, lat, time, value) %>% mutate(time = as.Date(time))
Error in (function (..., sorted = TRUE, unique = FALSE) :
Cross product of elements provided to CJ() would result in 2526733664 rows which exceeds .Machine$integer.max == 2147483647
Yes, that might be related to the size of the file. You might be able to read it with out = "array" or by using the subset argument to read only part of the file. Otherwise, you might need to use other tools that don't read the whole file at once.
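To see why the size matters, here is a rough back-of-the-envelope check in base R. It uses the dimension lengths from the ncdump header posted earlier in the thread, so the total will not match the number in the error message exactly (that came from the re-downloaded file), but the reasoning is the same: a data.frame needs one row per longitude × latitude × time combination.

```r
# Dimension lengths taken from the ncdump header earlier in this thread.
dims <- c(longitude = 281, latitude = 221, time = 47104)

# prod() works in double precision, so the product itself does not overflow.
n_rows <- prod(dims)

# TRUE means the full table would exceed the maximum number of rows.
n_rows > .Machine$integer.max

# Largest number of time steps that keeps the row count within the limit
# for this grid; a time subset would need to stay at or below this.
max_steps <- floor(.Machine$integer.max /
                     (dims[["longitude"]] * dims[["latitude"]]))
max_steps
```

Reducing the number of time steps (for example by subsetting or averaging) is what brings the row count back under the limit.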
I computed daily averages with cdo and was then able to read the file. I want to subset the time dimension to keep only the months May, June, July, August, September and October. Is it possible with metR? If so, how?
Thank you
Yes, but you need to split your subset into contiguous chunks, so your subset would look like this:
subset = list(time = list(c("2000-05-01", "2000-11-01"),
                          c("2001-05-01", "2001-11-01")))
... and so forth. So your time element needs to be a list, and each element of that list is a contiguous chunk of time. (This is due to the way the NetCDF file format works: it can only read contiguous chunks in any dimension, so to read many separate chunks you need one read operation per chunk, which ReadNetCDF does automatically with this syntax.)
One way of building this subset programmatically is this:
years <- 1979:2008
chunks <- lapply(years, function(y) paste0(y, c("-05-01", "-11-01")))
subset = list(time = chunks)
(of course, change years to the years you need).
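To check what the generated list looks like before passing it to ReadNetCDF, a small base R sketch follows. The years are examples only, and the second variant ending on 31 October rests on an assumption (that the end of each range is included in the read), so verify it against your own data:

```r
years <- 1991:1993  # example years only; use the years in your file

# One c(start, end) pair per year, built the same way as in the snippet above.
chunks <- lapply(years, function(y) paste0(y, c("-05-01", "-11-01")))
str(chunks)  # a list with one character vector of length 2 per year

# Variant ending on 31 October. Assumption: the end date of each range is
# included in the read, so "-11-01" would also pull in 1 November.
chunks_oct <- lapply(years, function(y) paste0(y, c("-05-01", "-10-31")))
subset <- list(time = chunks_oct)
```

Either list can then be supplied as the subset argument in place of the hand-written one.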