cedadev/cf-checker

Add support for checking files using groups

Opened this issue · 11 comments

erget commented

CF-1.8 will add support for using groups as described in cf-convention/cf-conventions#144. This means that the Checker will need to be able to check that this part of the Standard is being used correctly.

I discussed this with David at the CF-WMO meeting in Exeter and can contribute to this but I want to agree the best approach before proceeding.

A simple approach would be that if a file that contains groups is passed to the Checker, it gets flattened before getting checked. Probably a function plus a single line inserted about here.
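To make the idea concrete, here is a minimal sketch of the flatten-then-check step. It is an illustration under stated assumptions rather than the Checker's actual code: the nested-dict model of a file, the `"__"` separator, and the names `flatten_group` and `check_file` are all hypothetical.

```python
def flatten_group(group, path="", sep="__"):
    """Return {flat_name: variable} for a nested group tree.

    `group` is a hypothetical dict model of a file:
    {"variables": {...}, "groups": {name: subgroup, ...}}.
    The separator is an assumption, not taken from the CF draft.
    """
    flat = {}
    # Variables in this group keep their name, prefixed by the group path.
    for name, var in group.get("variables", {}).items():
        flat[path + name] = var
    # Recurse into child groups, extending the path prefix.
    for child_name, child in group.get("groups", {}).items():
        flat.update(flatten_group(child, path + child_name + sep, sep))
    return flat


def check_file(root, run_checks):
    """Flatten first if the file has groups, then hand over to the checks."""
    if root.get("groups"):
        return run_checks(flatten_group(root))
    return run_checks(root.get("variables", {}))


example = {
    "variables": {"time": "time(time)"},
    "groups": {
        "forecast": {
            "variables": {"temp": "temp(time)"},
            "groups": {"model": {"variables": {"bias": "bias(time)"}}},
        }
    },
}
flat_example = flatten_group(example)
print(sorted(flat_example))
```

In this sketch the existing checks only ever see a flat set of variables, so the single inserted line would be the call that decides whether to flatten.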

Additionally, one could consider applying this only if the file claims compliance with CF-1.8. This could be done in one fell swoop, but it might make more sense to have a more general handling of different versions of the standard.
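A version gate of that kind could look roughly like the sketch below. It assumes the file's global `Conventions` attribute is available as a string such as `"CF-1.8"`; the helper names `cf_version` and `should_flatten` are hypothetical and not part of the Checker.

```python
import re


def cf_version(conventions):
    """Extract (major, minor) from a Conventions string such as 'CF-1.8'.

    Returns None if no CF version is found.
    """
    match = re.search(r"CF-(\d+)\.(\d+)", conventions or "")
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def should_flatten(conventions, has_groups):
    """Flatten only when the file has groups and claims CF-1.8 or later."""
    version = cf_version(conventions)
    # Tuples compare element-wise, so (1, 10) correctly ranks above (1, 8).
    return has_groups and version is not None and version >= (1, 8)
```

Comparing integer tuples rather than strings avoids ranking a version like 1.10 below 1.8.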

How would you like me to proceed?

@erget - Yes, this certainly needs thinking about. The first thing that needs to happen is for the conformance document to be updated with the additional checks required for groups. Do you want to update your PR (cf-convention/Conformance#8) with the comments? Then hopefully we can get that approved and in.

Simple is good. If we flatten the file, I assume we don't lose any information needed to perform the lateral search algorithm, for example? If you are happy to write a function to flatten the file, that's fine with me - did I understand correctly from David that you had already started looking at this for something else?
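As a rough illustration of why the information need not be lost: if the flattener encodes each variable's group path in its flat name, a lookup can still walk up the ancestors and then fall back to a search across groups. The sketch below assumes a `"__"` separator and a hypothetical `resolve` helper, and it is deliberately simplified; the restrictions the CF-1.8 draft places on lateral search are not modelled here.

```python
SEP = "__"  # hypothetical separator between group names in a flat name


def resolve(flat_names, referring_group, name):
    """Find `name` as seen from `referring_group` (a tuple of group names).

    Returns the matching flat name, or None if nothing matches.
    """
    names = set(flat_names)
    # Upward search: the referring group first, then each ancestor up to the root.
    for depth in range(len(referring_group), -1, -1):
        candidate = SEP.join(referring_group[:depth] + (name,))
        if candidate in names:
            return candidate
    # Simplified lateral fallback: any group holding `name`, shallowest first.
    matches = [n for n in names if n.split(SEP)[-1] == name]
    return min(matches, key=lambda n: (n.count(SEP), n), default=None)


flat = ["time", "forecast__temp", "forecast__model__bias", "obs__lat"]
found_up = resolve(flat, ("forecast", "model"), "time")
found_lateral = resolve(flat, ("forecast",), "lat")
```

This only works if the separator cannot appear inside ordinary group or variable names, which a real flattener would have to guarantee or record separately.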

On your last point, I agree it would be good to have a more general approach to handling different versions. At the moment, different versions are handled by if statements. With the current layout of the code it will be virtually impossible to do this, as things are so intertwined; however, my plans for redeveloping the checker, which I am planning to start in December, will most definitely do this.

Cheers,
Ros.

erget commented

@RosalynHatcher excellent, I'll take a look at cf-convention/Conformance#8 - I had assumed that you could just commit on top of my PR rather than hunting and pecking the changes in, but as soon as I get some time I'll do that.

For the moment, then, I'll consider the scope of the code contribution to be a flattener that matches the algorithm described in the 1.8 draft. Handling different versions of CF is a separate issue that, it sounds like, will be addressed separately in a consolidated fashion in any case. Thanks for the input.

@erget I'm happy to try committing on top of your PR if you'd prefer - I just assumed I couldn't!

erget commented

Ok, then I'll hold off and approve our joint contribution :D In case it's not possible to commit on top you can always send a PR to erget/Conformance:master and I'll merge it in, so it flows into the cf-convention/Conformance#8

erget commented

@RosalynHatcher as @davidhassell and I discussed at the WMO/CF meeting this summer, EUMETSAT is happy to contribute an extension to this software to accommodate Groups. cf-convention/Conformance#8 hasn't been merged yet, but that's probably only because it's been overlooked. I've proposed merging it now.

In order to keep this from dragging out until late in 2020, I'd like to start updating the software soon, if you think that's a good idea. My proposal would be to proceed as we'd discussed before and flatten incoming files before feeding them into the Checker. I'll leave checking the Conventions version out of scope. Let me know if you prefer to adopt another approach; otherwise I'll be sending a PR beginning of next year if I'm lucky :)

@erget @RosalynHatcher - I would also be interested in this capability and can help with this.

erget commented

@piyushrpt thanks for reaching out - unfortunately, we're just wrapping this activity up, the code is pretty much complete already so this issue is close to resolution. It's great to know you're in the game though!

@erget - would be happy to help. We have a slew of SAR/InSAR products that use groups and are looking for a more automated compliance checker - here is an example: https://aria.jpl.nasa.gov/node/97

https://grfn.asf.alaska.edu/door/download/S1-GUNW-D-R-087-tops-20200125_20200119-161629-20645N_18638N-PP-7bc1-v2_0_2.nc

On a related note, are you aware of tools that extend these compliance checks to HDF5 (now that groups are supported)? Would there be any interest in such a contribution to this toolset? It should not be too hard to use h5py instead of netCDF4 to perform the same checks. That would be really useful for NISAR mission products - https://nisar.jpl.nasa.gov/ - which plan to use CF conventions in HDF5 format.

erget commented

@piyushrpt I don't know of any tools for checking HDF5 files for netCDF compliance but it might be straightforward to build one. There will be a CF community meeting this year and I would be interested in exploring options for extending CF support to other data formats, as currently we are very focused on netCDF. There's been very good and interesting work on separating the concepts of CF from those of netCDF and if the community is interested it could be beneficial (although it might also be a lot of work to map to additional data formats). It's some nice off-topic food for thought :)

@erget @RosalynHatcher Any update on this? This was a few months ago now. I understand that many things have got delayed recently, but it'd be good to hear how close to merge-able the new code is.

@jshholland: The NetCDF flattener code referred to by @erget was released a couple of months ago (https://gitlab.eumetsat.int/open-source/netcdf-flattener). I will be using this to enable implementation of the groups conformance checks and hope to have this complete within the next month.