NJ 5 fails to find coord system where NJ 4.6 does
rschmunk opened this issue · 20 comments
A user just provided me a sample dataset in which the 5D variable of interest has 1D dimensions of lon, lat, time, level and tau. There are matching coordinate variables for lon, lat and time with appropriate units. Thus...
float clisccp(time=8, tau=7, plev7=7, lat=90, lon=144);
Using netCDF-Java 5.0 (beta2) to query VariableDS.getCoordinateSystems() on this variable, the response is a 0-length array.
However, if I use an old copy of my app based on NJ 4.6.12, a usable coordinate system is returned.
We wondered if this was the result of a change 5.0 branch to enforce dimension ordering following convention, but it seems that except for the additional inserted tau dimension, the ordering is fine.
Hi @msdsoftware - it's hard to tell exactly what could be going on here. What class are you using to open the file?
DatasetUrl durl = DatasetUrl.findDatasetUrl (urlString);
NetcdfDataset result = NetcdfDataset.acquireDataset (durl, true, null);
I wonder if this is creating a CoverageDataset
object, which has replaced GridDataset
...and by replace, I mean is all new, likely has bugs, and you probably just found one :-/
Any chance I could get a sample dataset?
Thanks! I'll take a look and see what I can find out.
@lesserwhirls, Did you have any luck with this?
I just had another user report trouble with a coordinate system not being found, but the new case is a WRF dataset using conformal conic gridding. The only difference btw his "bad" file and a very similar "good" file is that the bad also includes a time dimension, albeit a dimension of length 1.
Ping! @lesserwhirls, I'm debating whether to revert my code to use 4.6.12-snapshot, and would appreciate any update you might have on whether fixing this issue is in the works, and if so when it might reasonably happen. Thx.
@msdsoftware - many apologies for the silence. I was out all last week and am still playing catch-up.
Last I looked, I though I had a good idea of where things were going wrong, so my hope is that a fix could happen quickly. However, the WRF file issue seems to be different. Would it be possible to get a copy of that file, too?
It will be quite difficult for me to get to it this week, but I could dedicate Saturday to it. If I was to give you a status update Saturday evening, would that be soon enough for for you to make the call in releasing using 4.6/5.0?
@lesserwhirls, See
https://www.dropbox.com/s/k5lgmnn2jga6jxa/T2_avghst_C.nc?dl=1 and
https://www.dropbox.com/s/vz9y4ryaa6ozjhy/gfdl_T2_1970-1999_07_avg.nc?dl=1
These two WRF files are almost identical except that in gfdl_T2_1970-1999_07_avg.nc the quantity has the extra time dimension of length 1.
@lesserwhirls, I figured you were on vacation given the lack of activity around the repo. I am about to go on leave myself, so I will be making a decision later today whether to continue with NJ 5 or go back to 4.6 when I post a Panoply bugfix release tomorrow. Other cases people have reported with coordinate systems not being recognized I have been able to code around, but this WRF case is the one I've been unable to do anything with.
—
EtA: I opted to stick 5.0 for this release. Hopefully, I won't receive (many) more reports of unrecognized/unreported coordinate systems.
Ok, I think I understand what is happening here. It's not a case of ordering, but rather completeness. In 5.0, the CoordSysBuilder
became a bit more strict. Specifically, the makeCoordinateSystemsMaximal
method started requiring the the coordinate systems be complete (that is, a coordinate variable for every dimension must exist) for them to be registered. Here is the magic:
Both the check to isComplete()
and the bool
useCompleteCoordSys
(statically set to true
) keep coordinate systems from being attached to the variables for these files. A better name for the variable would probably be useCompleteCoordSysOnly
. The variable is something newly added in v5
that was not there in v4
; the isComplete()
method exists in v4
, but was not used as part of the makeCoordinateSystemsMaximal
method. Because of this change, the individual coordinate axes still get attached to the variable, but are only gathered together and registered as a coordinate system for the variable if they form a complete set.
The first case (1949.clisccp.nc
) isn't considered complete due to lack of coordinate variables for the dimensions tau
and plev7
. The second case isn't considered complete due to the lack of a coordinate variable for the dimension Time
.
Thinking out loud here...
Perhaps the idea was to allow the code implementing individual conventions to accept incomplete coordinate systems via extensions to the CoordSysBuilder
class? If that's correct, then the right thing to do for the WRF files (second case) is to extend the code in ucar.nc2.dataset.conv.WRFConvention
to build a time
variable, if one does not exist (would the global attribute START_DATE
be the thing to use?). The file in the first case does not appear to be associated with a convention, and there isn't enough info in the file to create coordinate variables for tau
and plev7
, so a one-off CoordSysBuilder
would need to be written for it and its siblings (but I don't see anything to ID by in the global metadata, like a model ID or the likes).
I can see this being a cleaner approach code-wise, while minimizing assumptions for the general case, but certainly does cause some issues (one of which cannot currently be worked around for things that used to work).
Another option would be to break up the Enhance
Enum
CoordSystems
into CoordSystemsComplete
and CoordSystemsIncomplete
. For backwards compatibility, Enhanse.CoordSystems
would alias to setting both of those; however, only CoodSystemsComplete
would be set in the defaultEnhanceMode
. Then, to allow incomplete coordinate systems, one would need to create a Set<Enum>
for use then they call NetcdfDataset
that included CoordSystemsIncomplete
; for maximum backwards compatibility, you'd probably go with the built-in set of all enhancements using NetcdfDataset.getEnhanceAll
.
@lesserwhirls, I don't know if you've been waiting for a response from me on this, but I guess after a few weeks I probably ought to say something.
Since a fair number of users could be generating their own data files and likely not being strict about following a convention (sometimes I have to tell Panoply users that there are conventions) — or else obtaining and trying to plot from such "carelessly" created datasets — the problem of a coordinate system being "incomplete" is going to be an ongoing problem.
The idea of there being a configurable option in the enhancement settings seems more workable. I am currently using NetcdfDataset.acquireDataset(DatasetUrl, true, CancelTask)
. I take your suggestion to mean that I might then have to follow that up with NetcdfDataset.enhance(Set<NetcdfDataset.Enhance>)
to change the enhancement scheme to not be so restrictive about coordinate system completeness?
Just got another query from a user about this, @lesserwhirls, in connection to an HDF-EOS dataset. The variables in question are on an auxiliary loa-lat grid, but have an extra dimension for "channel". The vars are reported as having no coordinate system, apparently because there is no coordinate var for channel.
Hi @msdsoftware - sorry for the delay. I think I have an enhancement that will allow for the use of incomplete coordinate systems. The code will look something like this:
Set<NetcdfDataset.Enhance> defaultEnhanceMode = NetcdfDataset.getDefaultEnhanceMode();
EnumSet<NetcdfDataset.Enhance> enhanceMode = EnumSet.copyOf(defaultEnhanceMode);
enhanceMode.add(NetcdfDataset.Enhance.IncompleteCoordSystems);
DatasetUrl durl = DatasetUrl.findDatasetUrl (urlString);
NetcdfDataset ncd = NetcdfDataset.acquireDataset(durl, enhanceMode, null);
After digging into this more, I have a feeling that our ucar.nc2.dataset.conv.WRFConvention
class isn't doing the right thing when constructing coordinate systems. Of course, that does not account for failures for HDF-EOS, but perhaps that too is a sign that the conventions class for HDF-EOS isn't doing the right thing (looks like we have two of those coordinate system builders). Ultimately, I'd like to get those convention classes working correctly, but for now, this will give a workaround.
I'm currently running our full test suite on my changes, and depending on those results, I should have a fix committed within the next day.
@lesserwhirls, FWIW, I've added some code to my classes to try a little harder looking for lon and lat coordinate vars in swath data when there is no coordinate system reported. It seems to handle this particular HDF-EOS case pretty well. I just need to do some testing against how other sample HDF-EOS files are handled.
Just compiled from the latest snapshot. Commit #1154 seems to have solved the problem with the recent HDF-EOS example file. I'll have to dig up some of the prior cases and see how it does there.
But if I understand you correctly, @lesserwhirls,, it doesn't work for the GRIB example?
Good to hear, @msdsoftware! In terms of GRIB, which example? I'm not seeing a GRIB file here on this issue. I'll take a look once I have a file. Thanks for your help on shaking out these issues!
Ooops, sorry. I meant the WRF file, @lesserwhirls .
Ah, no worries! The WRF file is what I used to test this with, so that should be good to go. My comment about the WRF file is that our coordinate system builder code that handles WRF files is lacking and should be beefed up independently of the the IncompleteCoordSystems
enhancement. However, using the new enhancement will at least give you a coordinate system to work with.
Calling this closed until such time as I receive another bug report about datasets being treated differently by my NJ-using code.
Re-opened this as I ran into a problem with an old WRF file using the Lambert conformal grid. Failure pattern is not quite the same as before, however, so I'm trying to make sense of what's happening before I add detail to this issue.