cf-convention/cf-conventions

Axis Order for CRS-WKT grid mappings

Closed this issue · 54 comments

marqh commented

Title: Specify Axis Order for CRS-WKT grid mappings

Moderator: @dblodgett-usgs

Moderator Status Review [last updated: 2020/01/05]:
A clear use case has emerged: if coordinate variable values and a CRS-WKT string are to be passed to a coordinate projection library, then the coordinate variables must be in the right order for the CRS-WKT string. Parsing the AXIS order of the CRS-WKT string and inferring how it relates to coordinate variable names is non-trivial, if not impossible, to automate in all cases. An explicit statement of coordinate variable axis order in a grid_mapping attribute of the form {variable}:grid_mapping = "{grid mapping}: {coordinate variable axis 1} {coordinate variable axis 2} ..." should be required when using CRS-WKT. (@marqh's comment here is a good place to look for more.)

I don’t see any strong disagreement to this being a valuable addition and think @marqh should go ahead and prepare a pull request. I would request that any modifications to the original proposal be applied to the description here at the top.

Summary of comments. A couple of key comments are below. @davidhassell asked: Are you saying the CRS-WKT stored as an attribute of the grid_mapping variable can contain the actual coordinate values in plain text, as well as the projection definition?

@marqh responded more here:
The CRS definition does not contain any actual coordinate values. It is only the CRS definition.
The CRS-WKT definition includes a definition of a Coordinate System, a system of coordinates, which is a component part of a Coordinate Reference System. (Careful with the similar but distinct terminology.)

A minimal CDL example is:

dimensions:
  lat = 18 ;
  lon = 36 ;
variables:
  double lat(lat) ;
  double lon(lon) ;
  float temp(lat, lon) ;
    temp:grid_mapping = "crs:lat lon" ;
  int crs ;
    crs:grid_mapping_name = "latitude_longitude" ;
    crs:semi_major_axis = 6378137 ;
    crs:inverse_flattening = 298.257223563 ;
    crs:crs_wkt = "GEODCRS["WGS 84",
  DATUM["World Geodetic System 1984",
    ELLIPSOID["WGS 84",6378137,298.257223563,LENGTHUNIT["metre",1.0]]],
  CS[ellipsoidal,2],
    AXIS["latitude",north,ORDER[1]],
    AXIS["longitude",east,ORDER[2]],
    ANGLEUNIT["degree",0.01745329252], ID["EPSG",4326]]"

And @marqh stated:
This syntax is already available in CF, but it does not yet carry the meaning which I am suggesting that we add.

In a [useful response](https://github.com/cf-convention/cf-conventions/issues/223#issuecomment-570227603) @marqh stated:

All instances of CRS WKT string will be expected to have an AXIS list.

The intent of the extra interpretation is to ensure that a map between variables in the file and the concepts in the CRS WKT is explicit.

In the case of two grid mappings and two sets of (related) coordinate variables, the grid_mapping attribute would look like: temp:grid_mapping = "crs:x y crs_wgs84:lat lon"
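To make the extended syntax concrete, here is a minimal Python sketch of how a reader might split such an attribute value into grid mapping variables and their ordered coordinate variable lists. The function name `parse_grid_mapping` is hypothetical and not part of any CF library, and the sketch assumes variable names contain no colons or blanks.

```python
def parse_grid_mapping(value):
    """Split an extended-form grid_mapping attribute into an ordered mapping.

    Returns {grid_mapping_variable: [coordinate_variable, ...]}; each list
    keeps the order in which the names appear in the attribute.
    """
    mapping = {}
    current = None
    # Put a blank after every colon so "crs:x" and "crs: x" tokenise alike.
    for word in value.replace(":", ": ").split():
        if word.endswith(":"):
            current = word[:-1]
            mapping[current] = []
        elif current is None:
            # Simple form: a lone grid mapping variable name, no coordinates.
            mapping[word] = []
        else:
            mapping[current].append(word)
    return mapping


result = parse_grid_mapping("crs:x y crs_wgs84:lat lon")
print(result)  # {'crs': ['x', 'y'], 'crs_wgs84': ['lat', 'lon']}
```

Under the proposal, the list stored for each grid mapping variable is the order in which values would be combined into coordinate tuples for that CRS.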

It was noted that examples will be crucial here.

Requirement Summary:

Using Coordinate Reference System Well Known Text to define coordinate reference systems requires that coordinate value sets be presented as ordered tuples, where each tuple represents a location with respect to that coordinate reference system.

Currently CF does not provide a mechanism to specify axis order when linking a data variable to a CRS-WKT instance via a grid mapping.

Technical Proposal Summary:

Benefits: Who or what will benefit from this proposal?

Users of CRS-WKT will benefit from this proposal as it will ensure that data sets comply with the OGC mandate to provide coordinate tuple value ordering as part of the reference between

Whilst axis order can sometimes be inferred from deep inspection of the crs_wkt string and comparison to standard names of variables, this is a challenging inference to make and prone to significant problems, especially in automation scenarios.

I think that treating the WKT string as a parcel for parsing by other systems is much safer.

Status Quo:

CF has adopted CRS-WKT from OGC but has not provided a mechanism to state how the variables containing coordinate values need to be structured in order to comply with the OGC mandates on implementation of Referencing by Coordinates in general, of which CRS-WKT is an implementation.

Detailed Proposal:

Provide an explicit mechanism for defining axis order, to enable individual coordinate variables to be unambiguously expanded into coordinate tuple collections.

The expanded form of the grid_mapping attribute is

a blank-separated list of words "grid_mapping_variable: coordinate_variable [coordinate_variable …​] [grid_mapping_variable: …​]", which identifies one or more grid mapping variables, and with each grid mapping associates one or more coordinate_variables, i.e. coordinate variables or auxiliary coordinate variables.

This existing feature is a convenient place to add the capability.

As the extended form already explicitly references individual variables to use as coordinate values, the order of these values may be used to explicitly state the axis order of the coordinate values defined with respect to the coordinate reference system.

Explicitly, the following text may be added as following paragraphs.

Where an extended "grid_mapping_variable: coordinate_variable [coordinate_variable …​]" entity is defined, then the order of the coordinate variable definitions provides an explicit order for these coordinate value variables to be combined into coordinate tuples.

The order of the coordinate_variable instances defines the order of elements within a coordinate value tuple. This enables an application reading the data from a file to construct an array of coordinate value tuples, where each tuple is ordered to match the specification of the coordinate reference system being used.

This explicit 'axis order' is important when the grid_mapping_variable contains an attribute crs_wkt. It is mandated by the OGC CRS-WKT standard that coordinate tuples with correct axis order are provided as part of the reference to a Coordinate Reference System.

I would also suggest including text within the later Use of the CRS Well-known Text Format section:

Where crs_wkt is added to a grid_mapping, the extended syntax for the grid_mapping attribute enables the axis order for the coordinates being referenced to be explicitly stated. The explicit definition of axis order is expected by the OGC standards for referencing by coordinates.

Hi Mark,

I'm not a WKT expert (or even an amateur), so it'd be very useful if you could clarify something for me.

Are you saying the CRS-WKT stored as an attribute of the grid_mapping variable can contain the actual coordinate values in plain text, as well as the projection definition?

If not, then I'm confused as to what the axis order refers to.

If so, then I'll follow up with some more questions!

David

marqh commented

Hello @davidhassell

Are you saying the CRS-WKT stored as an attribute of the grid_mapping variable can contain the actual coordinate values in plain text, as well as the projection definition?

I think no, that is not what I am saying. The CRS definition does not contain any actual coordinate values. It is only the CRS definition.

The CRS-WKT definition includes a definition of a Coordinate System, a system of coordinates: which is a component part of a Coordinate Reference System. (careful with the similar but distinct terminology)

For example,
https://www.epsg-registry.org/export.htm?wkt=urn:ogc:def:crs:EPSG::4326
contains the definition

  CS[ellipsoidal,2],
    AXIS["latitude",north,ORDER[1]],
    AXIS["longitude",east,ORDER[2]],

This includes a dimensionality, in this case 2, and an ordered list of AXIS definitions, which shall be the size of the dimensionality.

The key information here is that a single position is defined with respect to the CRS by providing that position in the form of a coordinate tuple, which is ordered. Thus, a coordinate value defined with respect to this CRS shall be of the form

  • ( <latitude_value> , <longitude_value> )

It is fine to provide a list, or array, of coordinate tuples, as long as individual tuple order is consistent, e.g.:

  • [ ( <latitude_value> , <longitude_value> ) , ( <latitude_value> , <longitude_value> ) ]

It is not correct to provide the following (axis inversion is banned when referencing coordinate values to a CRS):

  • ( <longitude_value> , <latitude_value> )

It is not correct to provide

  • [<longitude_value> , <longitude_value> ...]

This is only of import when providing coordinate values and CRS-WKT to an application which is aware of these technologies and trying to process the spatial information.

Given that a significant majority of CF data will have separate variables storing an x and a y coordinate (let's call them lat and lon), the only additional information I want to provide here is an explicit statement of whether the coordinate values for a single location should be presented to CRS-WKT aware software as (lat, lon) or (lon, lat).

It would be really useful to be explicit about providing this single piece of relation information for implementers and users of CRS-WKT.

in this case, I would like to see a netCDF file encoded (pseudo CDL)

dimensions:
  lat = 18 ;
  lon = 36 ;
variables:
  double lat(lat) ;
  double lon(lon) ;
  float temp(lat, lon) ;
    temp:long_name = "temperature" ;
    temp:units = "K" ;
    temp:grid_mapping = "crs:lat lon" ;
  int crs ;
    crs:grid_mapping_name = "latitude_longitude" ;
    crs:semi_major_axis = 6378137 ;
    crs:inverse_flattening = 298.257223563 ;
    crs:crs_wkt = "GEODCRS["WGS 84",
  DATUM["World Geodetic System 1984",
    ELLIPSOID["WGS 84",6378137,298.257223563,LENGTHUNIT["metre",1.0]]],
  CS[ellipsoidal,2],
    AXIS["latitude",north,ORDER[1]],
    AXIS["longitude",east,ORDER[2]],
    ANGLEUNIT["degree",0.01745329252],
  ID["EPSG",4326]]"

the line:

    temp:grid_mapping = "crs:lat lon" ;

is the crucial one, providing an explicit definition that i am relating (lat, lon) coordinate value pairs, from the lat and lon variables, not (lon,lat) coordinate value pairs.
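As a rough illustration of what a reader could do with that line, the Python sketch below builds one coordinate tuple per grid point from two 1-D coordinate variables, using the declared order. The coordinate values are made up for the example, and `axis_order` is assumed to have been read from the grid_mapping attribute.

```python
from itertools import product

# Hypothetical values standing in for the 1-D lat and lon coordinate variables.
coords = {"lat": [-10.0, 0.0, 10.0], "lon": [0.0, 90.0]}

# Order as declared by the data producer, e.g. via "crs:lat lon".
axis_order = ["lat", "lon"]

# One tuple per grid point; element order follows axis_order, not dict order.
coord_tuples = list(product(*(coords[name] for name in axis_order)))
print(coord_tuples[:3])  # [(-10.0, 0.0), (-10.0, 90.0), (0.0, 0.0)]
```

Every tuple here is (latitude, longitude), which is the order the EPSG:4326 definition above expects. No inspection of the WKT string is needed to get there.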

This syntax is already available in CF, but it does not yet carry the meaning which I am suggesting that we add.

mark

Hi Mark,

Many thanks for the explanation - just what I needed - I now wholly understand the reason for the proposal.

In your CDL example, you also have the axis order encoded in the CRS-WKT string:

    AXIS["latitude",north,ORDER[1]],
    AXIS["longitude",east,ORDER[2]],

Was that intentional? If it is there, then should the variable order in the grid_mapping be ignored?

It might be worth also mentioning that if any N-d coordinates (N>1) are listed in the grid_mapping attribute, then they must be ignored when ascertaining the order.

marqh commented

Hello @JonathanGregory

I think your proposal makes sense, if you're sure that the variables listed in the grid_mapping attribute (in the extended form) always correspond to the coordinate variables of the OGC CRS, do they?

I think that they have to. I think that for a CRS to be used in its WKT form, then the variables listed in the grid_mapping attribute must correspond, otherwise the mismatch cannot be handled by downstream software.

This is an onus on the data provider, they are asserting this by providing the WKT encoding for the CRS.

In your text, I think it would be good to say that the order of the variables in the grid_mapping attribute is significant only if crs_wkt is also specified, because it doesn't have a meaning for the CF metadata, in which order is not defined.

I agree, perhaps:

5.6. Horizontal Coordinate Reference Systems, Grid Mappings, and Projections section:

Where an extended "grid_mapping_variable: coordinate_variable [coordinate_variable …​]" entity is defined, then the order of the coordinate variable references within the definition provides an explicit order for these coordinate value variables, if they are to be combined into individual coordinate tuples.

This order is only significant if crs_wkt is also specified within the referenced grid mapping variable. Explicit 'axis order' is important when the grid_mapping_variable contains an attribute crs_wkt as it is mandated by the OGC CRS-WKT standard that coordinate tuples with correct axis order are provided as part of the reference to a Coordinate Reference System.

5.6.1 Use of the CRS Well-known Text Format section:

Where crs_wkt is added to a grid_mapping, the extended syntax for the grid_mapping attribute enables the axis order for the coordinates being referenced to be explicitly stated. The explicit definition of axis order is expected by the OGC standards for referencing by coordinates.

The order of the coordinate variable references within the grid_mapping attribute definition defines the order of elements within a derived coordinate value tuple. This enables an application reading the data from a file to construct an array of coordinate value tuples, where each tuple is ordered to match the specification of the coordinate reference system being used.

For example, a file has two coordinate variables, lon and lat, and a grid mapping variable crs with an associated crs_wkt attribute; the WKT definition defines the AXIS order as ["latitude", "longitude"]. The grid_mapping attribute is thus given a value crs:lat lon to define that where coordinate pairs are required, these shall be ordered (lat, lon), to be consistent with the provided crs_wkt string (and not order inverted).

Is this a slight improvement on the previous text?

thank you
mark

marqh commented

Hello @davidhassell

In your CDL example, you also have the axis order encoded in the CRS-WKT string:

    AXIS["latitude",north,ORDER[1]],
    AXIS["longitude",east,ORDER[2]],

Was that intentional? If it is there, then should the variable order in the grid_mapping be ignored?

This is intentional. All instances of CRS WKT string will be expected to have an AXIS list.

The intent of the extra interpretation is to ensure that a map between variables in the file and the concepts in the CRS WKT is explicit.

There's a lot of risk in relying on parsing the WKT and trying to match strings in AXIS names in WKT to strings in variable names, or variable metadata. In many cases, matching becomes fragile or impossible.

e.g.

AXIS
  "easting"
  "northing"

ncVars
   x
   y

this is about making consistency explicit.
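A small Python sketch of why name matching is fragile, using the names from the example above. The WKT fragment is a hypothetical one written for illustration, and the regular expression is a deliberately naive way to pull out AXIS names, not a real WKT parser.

```python
import re

# Hypothetical WKT fragment; the AXIS names are free text chosen by its author.
wkt_cs = (
    'CS[Cartesian,2],'
    'AXIS["easting",east,ORDER[1]],'
    'AXIS["northing",north,ORDER[2]]'
)

# Names of the netCDF variables holding the coordinate values.
nc_vars = ["x", "y"]

# Pull out the quoted AXIS names in the order they appear in the string.
axis_names = re.findall(r'AXIS\["([^"]*)"', wkt_cs)

# Naive matching by name finds nothing to pair up.
matched = [name for name in nc_vars if name in axis_names]

# The explicit form, e.g. grid_mapping = "crs:x y", states the order directly.
explicit_order = "crs:x y".split(":", 1)[1].split()

print(axis_names)      # ['easting', 'northing']
print(matched)         # []
print(explicit_order)  # ['x', 'y']
```

Any smarter matching would have to encode guesses such as "easting means x", which is the kind of inference the explicit ordering is meant to avoid.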

I have not grasped the intent of:

It might be worth also mentioning that if any N-d coordinates (N>1) are listed in the grid_mapping attribute, then they must be ignored when ascertaining the order.
(sorry)
please could you restate or explain further?

thank you
mark

Re. the order in being in the CRS-WKT - that makes sense, thanks.

My other comment on N-d variables was referring to the case when dimension coordinate and auxiliary coordinate variables are attached to the same grid_mapping, which I think is allowed:

dimensions:
  x = 18 ;
  y = 36 ;
variables:
  double x(x) ;
  double y(y) ;
  double lat(y, x) ;
  double lon(y, x) ;
  float temp(y, x) ;
    temp:long_name = "temperature" ;
    temp:units = "K" ;
    temp:grid_mapping = "crs: lat lon x y" ;

marqh commented

Hi @davidhassell

I don't understand the last example you presented. please could you explain in words and/or examples what is meant by four different coordinates all 'grid mapping' to the same CRS?

thank you
mark

@davidhassell If I am understanding @marqh's proposal correctly, you should have only x and y for the coordinates associated with crs in the grid_mapping attribute in your example. The implication of the coordinate variables x and y in your example is that the coordinate system is a projected coordinate system. In that case, the axes in the WKT string would not be latitude and longitude.

@marqh It might be useful to provide an example that used a projected CRS WKT.

marqh commented

Hello @JimBiardCics

It might be useful to provide an example that used a projected CRS WKT.

There is a useful example in the conventions at example 5.12 in section 5.6.1 http://cfconventions.org/cf-conventions/cf-conventions#use-of-the-crs-well-known-text-format

this would be updated, with the line:

  • temp:grid_mapping = "crs" ;

replaced with the line

  • temp:grid_mapping = "crs:x y" ;

If a data producer chose to include 2D latitude and longitude coordinates, and defined a CRS definition for these, let's call that var: crs_wgs84 as in example 5.11, then the grid_mapping attribute would be updated to read

  • temp:grid_mapping = "crs:x y crs_wgs84:lat lon" ;

I would expect to update examples 5.11 and 5.12 as part of this change

Jim: Do you think it worthwhile including this example:

  1. in the conventions?
  2. as an alteration to 5.12?
  3. as an abridged example, combining 5.11 and 5.12 in short form as a new example?

many thanks
mark

marqh commented

There seems broad support for this change and so far limited concern.

Shall I proceed with making a targeted Pull Request with the changes as discussed here, to enable us to iterate on the fine detail with review comments?

Please may I have a volunteer to adopt this ticket as moderator?

thank you
mark

I think that it's valid CF to include the auxiliary coordinates in the extended grid_mapping format: Temperature:grid_mapping = "Lambert_Conformal: lat lon x y";, as is done in the example below.

dimensions:
  y = 228;
  x = 306;
variables:
  int Lambert_Conformal;
    Lambert_Conformal:grid_mapping_name = "lambert_conformal_conic";
    Lambert_Conformal:standard_parallel = 25.0;
    Lambert_Conformal:longitude_of_central_meridian = 265.0;
    Lambert_Conformal:latitude_of_projection_origin = 25.0;
  double y(y);
    y:standard_name = "projection_y_coordinate";
  double x(x);
    x:standard_name = "projection_x_coordinate";
  double lat(y, x);
    lat:standard_name = "latitude";
  double lon(y, x);
    lon:standard_name = "longitude";
  float Temperature(y, x);
    Temperature:units = "K";
    Temperature:coordinates = "lat lon";
    Temperature:grid_mapping = "Lambert_Conformal: lat lon x y";

I see now that my language was wrong when I talked about "ascertaining" the order. I meant that any N-d auxiliary coordinate variables given by the grid_mapping attribute are to be ignored when mapping the named variables to the CRS-WKT axes.

However, I have checked back with your text, and see that you have written that it is the order of the coordinate variables that is used, so my point about ignoring auxiliary coordinates is already catered for, so I'm happy. Sorry for the diversion!

marqh commented

@davidhassell

there is text in 5.6 that states

which identifies one or more grid mapping variables, and with each grid mapping associates one or more coordinate_variables, i.e. coordinate variables or auxiliary coordinate variables.

so this is already intended for use with coordinate variables and auxiliary coordinate variables.

I think that the example you presented is mis-encoded. I think that the projected coordinates are defined with respect to a different CRS instance from the geodetic coordinates.

I think the example you present is better encoded as:

dimensions:
  y = 228;
  x = 306;
variables:
  int Lambert_Conformal;
    Lambert_Conformal:grid_mapping_name = "lambert_conformal_conic";
     ...
    Lambert_Conformal:crs_wkt = ...
  int geodetic;
    geodetic:grid_mapping_name = "latitude_longitude" ;
    ...
    geodetic:crs_wkt = ...
  double y(y);
    y:standard_name = "projection_y_coordinate";
  double x(x);
    x:standard_name = "projection_x_coordinate";
  double lat(y, x);
    lat:standard_name = "latitude";
  double lon(y, x);
    lon:standard_name = "longitude";
  float Temperature(y, x);
    Temperature:units = "K";
    Temperature:coordinates = "lat lon";
    Temperature:grid_mapping = "Lambert_Conformal: x y geodetic: lat lon";

Does this make sense as a preferred encoding for the example you have presented?

thank you
mark

@marqh That makes much better sense. I think we need to make sure that we provide clear examples, as this can get confusing quickly.

I want to make sure of the reason for this proposed change. This change is being made to provide a way for software to automatically order the coordinate variables correctly when handing a WKT string and coordinate values to a function that would do coordinate transformation. We are mapping variables by name to CRS axes. Is that right?

As a comment on this, we are probably going to find that people will get this wrong a lot.

If you dig into the WKT string to find the axes, shouldn't there be a pretty clean mapping between the WKT axis names and the coordinate variable standard names? What is the particular reason why people feel that this is insufficient?

Hi @marqh, I wasn't suggesting that you can't have both dimension coordinate and auxiliary coordinate variables explicitly associated with a grid mapping - quite the opposite: given that you can I was at first concerned on how the presence of N-d variables affected the mapping to CRS-WKT axes - a concern I have already withdrawn.

You're right, though, that my example was a bit lacking; but I still think that it is possible to have 2-d latitude, longitude and 1-d (projection) coordinates explicitly linked to the same CRS.

Consider:

  char rotated_pole ;
    rotated_pole:grid_mapping_name = "rotated_latitude_longitude" ;
    rotated_pole:grid_north_pole_latitude = 32.5 ;
    rotated_pole:grid_north_pole_longitude = 170. ;
    rotated_pole:earth_radius = 6200000. ;

Then it would make sense to explicitly link (they're already implicitly linked) all four coordinates to "rotated_pole" grid mapping - the lat and lon variables can make use of the spherical earth definition.

Thanks, David

@davidhassell It seems to me that the issue is figuring out which variable maps to which part. The rotated pole projection is a particularly thorny example. If the goal is to allow for automated ordering, I think we'd have to lay out some pretty specific syntax rules if we aren't depending on looking inside the WKT string.

@marqh -- I've not had time to really take this in yet, but I'll gladly act as moderator. I'll edit the proposal by adding a summary of discussion above as I have time -- should be later today or this weekend.

marqh commented

Hi @JimBiardCics

If you dig into the WKT string to find the axes, shouldn't there be a pretty clean mapping between the WKT axis names and the coordinate variable standard names? What is the particular reason why people feel that this is insufficient?

This is indeed the concern I am raising.

That the pretty clean mapping does not exist in many cases. Understanding relations may require partial string matching, interpretation or inference, all difficult to encode into software. The vocabularies within CF and within CRS-WKT are similar but there are plenty of differences to manage, and I don't think it is worth the effort.

They also imply the parsing of the CRS-WKT to comprehend this aspect.

For example, this example of a CS for a Projected CRS is stated in the WKT-CRS standard:

CS[Cartesian,2],
                AXIS["(E)",east],
                AXIS["(N)",north],
                LENGTHUNIT["metre",1.0]

The text in the quotes is optional and not controlled.

In CF terms in this case, the coordinate variable definitions may use the standard_name projection_x_coordinate and projection_y_coordinate but these are optional.

Given two sets of optional strings, one of which is not a controlled vocabulary, implementing reference ordering based on them seems to me to be too much of a minefield. Hence I have come to the view that explicit is better here.

This is where the broader conversations within #222 re-triggered this topic and motivated this activity.

I prefer to use the already in place syntax and provide clear interpretation to enable a data producer to provide this information explicitly.
This should enable my software to be set up to simply pass coordinate values and CRS-WKT strings to a suitable application, written by specialists, which can provide me with all of the rich functionality that referencing by coordinates delivers. There's minimal complication for me at this point.

I hope that this will allow us to provide a set of good practice templates to provide to data producers to enable them to adopt.

I want to make sure of the reason for this proposed change. This change is being made to provide a way for software to automatically order the coordinate variables correctly when handing a WKT string and coordinate values to a function that would do coordinate transformation. We are mapping variables by name to CRS axes. Is that right?

yes, this is my interpretation as well.

I hope this is helpful clarification

mark

marqh commented

@marqh -- I've not had time to really take this in yet, but I'll gladly act as moderator. I'll edit the proposal by adding a summary of discussion above as I have time -- should be later today or this weekend.

thank you
mark

The description has been updated with my summary. I think this issue is at a point where a pull request with suggested modifications would be helpful.

marqh commented

The Pull Request #224 aims to close this issue. Please target all fine detail comments at the Pull Request as review comments to the proposed text.

Please continue to use this issue for wider comments, concerns about the principle of the change, and so on.

thank you
mark

Thanks for the PR, Mark.

I would still like it to be clear that all this only applies to coordinate variables (as opposed to auxiliary coordinate variables). I think this could be done with a very simple change to the proposed text (old, new):

Where an extended "grid_mapping_variable: coordinates_variable [coordinates_variable]" entity is defined, then the order of the coordinates_variable coordinate variable references within the definition provides an explicit order for these coordinate value variables, used if they are to be combined into individual coordinate tuples.

Would that be OK?

Thanks, David

marqh commented

Hello @davidhassell

I would still like it to be clear that all this only applies to coordinate variables (as opposed to auxiliary coordinate variables).

I disagree with this point. This capability applies to coordinate variables and auxiliary coordinate variables equally.
That is a key part of the purpose of introducing this syntax in the first place and remains a key capability.

The current published version, 1.7, explicitly states:

In the second format, it is a blank-separated list of words "grid_mapping_variable: coordinate_variable [coordinate_variable …​] [grid_mapping_variable: …​]", which identifies one or more grid mapping variables, and with each grid mapping associates one or more coordinate_variables, i.e. coordinate variables or auxiliary coordinate variables.

stating that this applies equally to coordinate variables and auxiliary coordinate variables.

I aim to keep this as published in the previous version. Indeed, I think that making this apply to coordinate variables only would result in certain CF1.7 datasets being deemed not conforming to the conventions for CF1.8, which I think is a result that is to be avoided.

Please may I ask?
What is the aim of limiting this capability to netCDF Coordinate Variable instances only?
Is exploring such a limitation part of this ticket on stating Axis order for CRS WKT? (or is it an independent discussion topic?)

thank you
mark

I must not be understanding how CRS-WKT works, but I am stuck on the fact that 2-d auxiliary coordinate variables span two axes, thus making the mapping of those coordinates to one axis impossible. Does that make sense?

marqh commented

@davidhassell that's okay, I'll try to explain things as I see them in different ways and see if I can help.

Be careful of reused terminology, which doesn't always mean the same thing in different contexts.

There is no 'mapping' of 2D variables representing coordinate values to one axis.

I'll try name-spacing, to see if this helps. A cf_axis is not the same concept as a crs_wkt_axis.

A crs_wkt_axis is a coordinate reference system concept and is all about how the basis vectors in the CRS are defined and how individual positions are represented as an ordered tuple of coordinate values.

A cf_axis is a data structure concept which is all about how the data and locational metadata are laid down as a set of structured arrays with dimension references.

Given:

  • two 2-d auxiliary coordinate variables, each of which spans the same two cf_axes.

Wanted:

  • pairs of values (tuples) from each of these auxiliary coordinate variables
  • within each pair, the order of values is defined by the crs_wkt_axis order

So, we take the two 2-d variables and for each index location, make a tuple, ordered by the crs_wkt_axis order

e.g.
CDL

foo:grid_mapping = "crs:x y"

x = [[1, 2, 3],
     [4, 5, 6]]
y = [[11, 12, 13],
     [14, 15, 16]] 

crs_wkt application ready output

coord_tuples = [[(1, 11), (2, 12), (3, 13)],
                [(4, 14), (5, 15), (6, 16)]]
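The same construction written as a short Python sketch, assuming the two 2-d variables have already been read into nested lists and that the order is taken from the grid_mapping attribute string. The dictionary `variables` is just a stand-in for the file contents.

```python
# Values of the two 2-d auxiliary coordinate variables from the example.
x = [[1, 2, 3],
     [4, 5, 6]]
y = [[11, 12, 13],
     [14, 15, 16]]
variables = {"x": x, "y": y}

# Tuple element order comes from the attribute, here "crs:x y".
order = "crs:x y".split(":", 1)[1].split()

# For each index location, take one value from each variable in that order.
coord_tuples = [
    list(zip(*(variables[name][i] for name in order)))
    for i in range(len(x))
]
print(coord_tuples)
# [[(1, 11), (2, 12), (3, 13)], [(4, 14), (5, 15), (6, 16)]]
```

The array shape follows the cf_axes (two rows of three), while the order inside each tuple follows the crs_wkt_axis order. The two concepts stay separate.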

does this help to demonstrate the different concepts at work here?

mark

@davidhassell I agree with @marqh. All of this does, in a sense, represent a generalization from the way that CF historically made use of the grid_mapping attribute and variable. In the past, this information was considered to provide information about how to map from x,y to lat,lon when x and y were 1D and lat and lon were 2D. (The insufficiency of the information provided to actually accomplish this to a high degree of accuracy is a different issue.) In truth you could use the same information to perform the reverse mapping, but we weren't thinking about it that way.

A full and proper representation of any set of coordinate value tuples, regardless of array dimensionality, requires an associated CRS. The expanded grid_mapping scheme makes this possible. This is why it's important to allow both "true" coordinates and auxiliary coordinates to be specified in the attribute.

Mark - thank you very much for taking the time to explain this. "A cf_axis is not the same concept as a crs_wkt_axis" is a fact that I had not realized.

The text as you wrote it is indeed fine, but I'm not sure what the difference between coordinate_variable and coordinates_variable is (https://github.com/cf-convention/cf-conventions/pull/224/files#diff-0eab4e85fe4c323f70ce4bce0229dbe6L210-R212).

I presume that there is no requirement for the CF checker to get involved with this, as it ignores the content of CRS-WKT attributes, so there's no need to pursue the case when more coordinate variables are listed by the grid_mapping attribute than there are CRS-WKT axes.

So, I'm happy with this proposal, pending what the extra s really means in my question above!

Thank you again to everyone who helped me understand more about WKT - much appreciated.

marqh commented

Hi @davidhassell

you're welcome

The text as you wrote it is indeed fine, but I'm not sure what the difference between coordinate_variable and coordinates_variable

I have introduced the s into coordinates_variable in the text to try and contrast, to state more explicitly that this word is different from the widely used coordinate variable concept, which is widespread in the conventions document. I hope that this may help to avoid the interpretation that this is only for the coordinate variable concept but is trying to refer to any netCDF variable which may contain some coordinate values, be they auxiliary or not.

I'm not wedded to this change, and I think the doc is okay as it is now, but this change may just help to make a reader think 'is this a different concept, should I read this again carefully'. As the information is contained within these two proximate sections in the conventions doc, I thought it might be a worthwhile editorial tweak.

mark

@marqh Is there an actual attribute named coordinates_variable? If not, I would avoid this usage.

@davidhassell I think there is some level on which the checker needs to get involved. It needs to check consistency in the grid_mapping attribute string - verifying that the CRS (grid mapping) variables exist and the variable names associated with each CRS variable exist and have matching dimensionality. The checker could even verify that the variables associated with a given CRS represent a proper set using standard names and/or units. None of this would require parsing the WKT string.
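The kind of check described above could be sketched roughly as follows. This is only an illustration, not the CF checker's implementation: the function name `check_grid_mapping`, the dict-based stand-in for a netCDF file, and the reading of "matching dimensionality" as "same number of dimensions" are all assumptions made for the example. It assumes the attribute string has already been split into a mapping from each grid mapping variable name to its associated variable names, and it never looks at the WKT string.

```python
def check_grid_mapping(associations, variables):
    """Return a list of problem messages; an empty list means all checks passed.

    associations: {grid_mapping_variable_name: [associated_variable_name, ...]}
    variables:    {variable_name: tuple_of_dimension_names}
                  (hypothetical stand-in for the variables in a netCDF file)
    """
    problems = []
    for crs_name, coord_names in associations.items():
        # The grid mapping (CRS) variable itself must exist.
        if crs_name not in variables:
            problems.append(f"grid mapping variable '{crs_name}' does not exist")

        # Every associated variable must exist.
        present = []
        for name in coord_names:
            if name in variables:
                present.append(name)
            else:
                problems.append(
                    f"variable '{name}' associated with '{crs_name}' does not exist"
                )

        # Assumption: "matching dimensionality" means the same number of dimensions.
        ranks = {len(variables[name]) for name in present}
        if len(ranks) > 1:
            problems.append(
                f"variables associated with '{crs_name}' differ in number of dimensions"
            )
    return problems


# Hypothetical file contents: a scalar CRS variable, 1-D and 2-D coordinates.
example_variables = {
    "crs": (),
    "x": ("x",),
    "y": ("y",),
    "lat": ("y", "x"),
    "lon": ("y", "x"),
}
print(check_grid_mapping({"crs": ["x", "y"]}, example_variables))
```

Checks on standard names or units could be added in the same style, again without parsing the WKT.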

marqh commented

@JimBiardCics
as in
https://github.com/cf-convention/cf-conventions/pull/224/files#diff-0eab4e85fe4c323f70ce4bce0229dbe6L210-R212

it is a placeholder in a string definition that is to be replaced by a variable name within the conventions text. So, it's not an attribute, it's defining the syntax for this complicated string representation.

I'm open to any discussion on how to put this text into the conventions document to try and be clear and explicit and avoid confusion where possible.

these labels are not marked up as though they are CF attribute names in the text

marqh commented

@davidhassell I think there is some level on which the checker needs to get involved. It needs to check consistency in the grid_mapping attribute string - verifying that the CRS (grid mapping) variables exist and the variable names associated with each CRS variable exist and have matching dimensionality. The checker could even verify that the variables associated with a given CRS represent a proper set using standard names and/or units. None of this would require parsing the WKT string.

I agree with the principle here @JimBiardCics

Myself, I'd put less onus on the checker, as long as the variables exist, I'd let this pass. I expect this to be the limit of the current conformance expectations.

Checking for matching dimensionality, standard name, units and so on feels to me (at first glance) like adding complication to the checker that is difficult to encode to meet all cases. A balance between good protection and minimal false alerts may be difficult to strike in these cases, I fear.

As always, 'What is proportionate?' is a useful question for us to reach consensus on within this ticket, and I welcome this aspect of the discussion.

cheers
mark

The use of underscores like that can be problematic, for sure. Fixing that overall would be a long-term project. What about some markup like this?

In the second format, it is a blank-separated list of words "grid mapping variable: associated variable [associated variable …​] [grid mapping variable: …​]", which identifies one or more grid mapping variables, and associates one or more coordinate or auxiliary coordinate variables with each grid mapping variable.
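To illustrate how a reader might split such a string, here is a minimal sketch. The function name `parse_grid_mapping` is hypothetical, and the sketch assumes that each grid mapping variable name is immediately followed by a colon and that the colon is followed by whitespace; it is not taken from any existing library.

```python
def parse_grid_mapping(value):
    """Split a grid_mapping attribute value into {grid_mapping_variable: [variables]}.

    First format:  "crs"                   -> {"crs": []}
    Second format: "crs1: x y crs2: lat lon"
    """
    tokens = value.split()
    if not tokens:
        raise ValueError("empty grid_mapping attribute")

    # First format: a single grid mapping variable name, no colon anywhere.
    if not any(token.endswith(":") for token in tokens):
        if len(tokens) != 1:
            raise ValueError("expected a single grid mapping variable name")
        return {tokens[0]: []}

    # Second format: a token ending in ':' starts a new grid mapping group.
    result = {}
    current = None
    for token in tokens:
        if token.endswith(":"):
            current = token[:-1]
            if not current:
                raise ValueError("missing grid mapping variable name before ':'")
            result.setdefault(current, [])
        elif current is None:
            raise ValueError(f"'{token}' appears before any grid mapping variable")
        else:
            result[current].append(token)

    # Each grid mapping variable needs at least one associated variable.
    for name, coords in result.items():
        if not coords:
            raise ValueError(f"no variables associated with '{name}'")
    return result


print(parse_grid_mapping("crs_osgb: x y crs_wgs84: lat lon"))
```

Because the associated variables are kept as an ordered list, the order in which they are written in the attribute is preserved for any later comparison with the CRS-WKT axes.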

marqh commented

@JimBiardCics my concern around this presentation is the use of space characters in labels that are parts of a space-separated list. How do I separate the elements when reading?

is there a different separator that I could use, instead of the underscore '_' ?
perhaps a character that is not allowed by the netCDF attribute naming syntax

now where's that allowed character list?

@marqh I wondered at the use case where a grid mapping would have a single coordinate variable associated with it, but then I thought of the example of zonal data that had latitudes but no longitudes. It's a somewhat odd case, but it represents a valid identification.

@marqh You could use CamelCasing. In my own work, I tend to use the construction <short name> to indicate a placeholder.

marqh commented

I'd be happy to replace

"grid_mapping_variable: coordinate_variable [coordinate_variable ...] [grid_mapping_variable: ...]"

with

"gridMappingVariable: coordinatesVariable [coordinatesVariable ...] [gridMappingVariable: coordinatesVariable ...]"

or with

"<gridMappingVariable>: <coordinatesVariable> [<coordinatesVariable> ...] [<gridMappingVariable>: <coordinatesVariable> ...]"

any further preferences? objections? suggestions?

@marqh Not from me! Either looks good.

Following on from #223 (comment) the case when the grid_mapping variable has more than two variables associated still needs addressing, too.

marqh commented

@davidhassell
I think that line 474 in the PR
https://github.com/cf-convention/cf-conventions/pull/224/files#diff-0eab4e85fe4c323f70ce4bce0229dbe6R474

explicitly addressed this, placing the expectation on the data producer that, if both the CRS WKT and the coordinates are provided, they are consistent with each other.

It also notes that this is not a conformance requirement due to the consistency being contingent on the CRS WKT definition
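As a rough illustration of what that consistency means for a data producer, the sketch below pairs the variables listed in the grid_mapping attribute, in order, with the AXIS entries of a CRS-WKT string. The function name `pair_axes`, the regular expression, and the shortened WKT fragment are all made up for the example. It is deliberately naive: it collects every AXIS entry in the string, so it would miscount for a compound CRS or for a WKT whose base geographic CRS carries its own AXIS entries. A real tool should use a WKT-aware library instead.

```python
import re

# Matches AXIS["<name>",<direction> in both WKT1 and WKT2 style strings.
AXIS_PATTERN = re.compile(r'AXIS\s*\[\s*"([^"]*)"\s*,\s*([A-Za-z_]+)')


def pair_axes(crs_wkt, coordinate_names):
    """Pair listed coordinate variable names with CRS-WKT axes, in order.

    Returns a list of (variable_name, axis_name, axis_direction) tuples.
    """
    axes = AXIS_PATTERN.findall(crs_wkt)
    if len(axes) != len(coordinate_names):
        raise ValueError(
            f"{len(coordinate_names)} variables listed but {len(axes)} AXIS entries found"
        )
    return [
        (name, axis_name, direction.lower())
        for name, (axis_name, direction) in zip(coordinate_names, axes)
    ]


# Shortened, illustrative WKT fragment (not a complete CRS definition).
example_wkt = (
    'PROJCS["example",UNIT["metre",1],'
    'AXIS["Easting",EAST],AXIS["Northing",NORTH]]'
)
print(pair_axes(example_wkt, ["x", "y"]))
```

A mismatch in the number of entries is only reported here; whether the listed order is actually the right one still depends on the CRS WKT definition, which is why the consistency is left to the data producer.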

I would recommend adding the direction to the axis order as well.

For example NORTH_POLE_EASTING_SOUTH_NORTHING_SOUTH or SOUTH_POLE_EASTING_NORTH_NORTHING_NORTH:
See axis maps here: https://pyproj4.github.io/pyproj/latest/_modules/pyproj/crs/coordinate_system.html

@snowman2 I read the definitions at your link. It seems to me that this is out of scope. Could you please explain what is gained by specifying direction (by some as yet unknown means) with the coordinate name tuple? I would assume that the directionality is implicit in the particular coordinate system chosen.

@JimBiardCics How about specifying the axis attributes in the coordinate variable attributes?

For example:

  double y(y);
    y:standard_name = "projection_y_coordinate";
    y:direction = "north";  // proposed attribute
    y:units = "metre";
  double x(x);
    x:standard_name = "projection_x_coordinate";
    x:direction = "east";  // proposed attribute
    x:units = "metre";
  float Temperature(y, x);
    Temperature:units = "K";
    ....
    Temperature:grid_mapping = "Lambert_Conformal: x y";

Side note: I have found in my experience that explicit is better than implicit. See Zen of Python.

Note: Edited to reduce redundancy.

@snowman2 I can relate to your desire and I like the idea of the attribute on the coordinate variable. I don't think this issue is the place for this proposal. If you'd like to open a new issue for this, I think that would be worthwhile.

@marqh @JimBiardCics I like the text in line 474 in the PR (https://github.com/cf-convention/cf-conventions/pull/224/files#diff-0eab4e85fe4c323f70ce4bce0229dbe6R474), and so I'm happy with the proposal. Thanks for adding that.

(If there's any discussion still to be had on general grid mapping syntax, it can happen at another time and place.)

It looks like we have sufficient agreement to start the clock on agreeing to merge #224. If there are no substantive modifications or objections, it can be merged in three weeks per the contribution guidelines.

marqh commented

hello @dblodgett-usgs

Given the imminent release of CF 1.8 and the discussion and support for this change, do you feel it is reasonable and appropriate to consider
#223 (comment)
(19 days ago)
as the start of the 3 weeks to merge?

I know that
#223 (comment)
was only added 7 days ago, but this was simply a confirmation of the previous update.

This functionality is potentially really useful and it would be of great value to have this within the 1.8 release.

It seems to me worth raising whether this can be expedited in this case in order to deliver this into the release?

many thanks
mark

Yes, I think we should merge this in a few days to get it into 1.8 if possible.

I've got no objection. We just didn't want to hold up 1.8.

No objection from me, either.

I volunteer to do the merge, and update the history appendix, unless anyone else was already planning to do so. If it hasn't happened by tomorrow afternoon (UTC), I'll go ahead. Thanks.