stac-extensions/eo

Additional cover fields

matthewhanson opened this issue · 8 comments

Following stac-extensions/landsat#2

In reviewing the Sentinel-specific metadata, I found some additional fields and have been thinking through which ones are more general.

Among the Sentinel-2 fields, these are all 0-100% values, calculated as a percentage of pixels relative to the total number of data pixels.

  • nodata (calculated as fraction of nodata to total number of pixels, nodata and data)
  • saturated/defective
  • cloud shadow
  • vegetation, and not vegetation
  • water
  • cloud cover (of medium probability clouds)
  • cloud cover (of high probability clouds)
  • cloud cover (cirrus)
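
To make the calculation concrete, here is a minimal sketch of how such percentages could be derived from a scene classification. The class codes and field names are made up for illustration and are not taken from the Sentinel-2 product; the only things taken from the description above are that nodata is relative to all pixels and the other classes are relative to data pixels.

```python
from collections import Counter

# Hypothetical class codes, for illustration only.
NODATA = 0
CLASS_NAMES = {
    1: "saturated_defective",
    3: "cloud_shadow",
    4: "vegetation",
    6: "water",
}


def cover_percentages(classes):
    """Return cover percentages (0-100) from a flat list of class codes."""
    total = len(classes)
    if total == 0:
        raise ValueError("no pixels to classify")
    counts = Counter(classes)
    nodata_count = counts.get(NODATA, 0)
    data_count = total - nodata_count
    # nodata is a fraction of all pixels (nodata and data).
    result = {"nodata": 100.0 * nodata_count / total}
    # Every other class is a fraction of the data pixels only.
    for code, name in CLASS_NAMES.items():
        result[name] = 100.0 * counts.get(code, 0) / data_count if data_count else 0.0
    return result


pixels = [0, 0, 4, 4, 4, 6, 3, 1, 4, 6]
cover = cover_percentages(pixels)
print(cover)
```

Note the two different denominators: a scene that is half nodata and otherwise fully vegetated reports 50% nodata but 100% vegetation.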

Other than nodata, and possibly cloud shadows (though I'm not sure this is even populated in S2), I think these probably belong in a different statistics-related extension - especially those based on scene classification (water, vegetation).

More generally, however, these types of fields are good examples of how data providers may calculate different types of statistics depending on the use case. I'm not sure it's worth capturing all these different variations. Sensor-specific metadata, as in the Landsat and Sentinel data, offers an easy way to surface these values and make them searchable.

Are there plans to make nodata part of this extension? I think that would make sense on a per-band basis. It's currently set separately from this extension in https://github.com/crim-ca/dlm-extension?tab=readme-ov-file#data-object

@rbavery nodata is actually in the raster extension. This does allow it to be specified per band through the bands field in the raster extension, however, note there is a new change upcoming in STAC 1.1 that will allow any fields to be specified at the Item, Asset, or Bands level.
As a result of this change there will be updates to this extension (to remove eo:bands) and raster extension (to remove raster:bands) as this will be handled by the core bands field.
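
As a rough illustration of what "any field at the Item, Asset, or Bands level" could mean in practice, here is a sketch using plain dictionaries. The item content and the `resolve_field` helper are hypothetical, and the lookup order (band first, then asset, then item properties) is my assumption about how such fields would be resolved, not something quoted from the spec.

```python
# Hypothetical, trimmed-down item; only the parts relevant to the lookup.
item = {
    "properties": {"nodata": 0},
    "assets": {
        "visual": {
            "bands": [
                {"name": "B04"},                 # falls back to a higher level
                {"name": "B03", "nodata": 255},  # overrides at band level
            ],
        },
        "thumbnail": {},
    },
}


def resolve_field(item, asset_key, field, band_index=None):
    """Look up a field on the band, then the asset, then the item properties."""
    asset = item["assets"][asset_key]
    if band_index is not None:
        band = asset.get("bands", [])[band_index]
        if field in band:
            return band[field]
    if field in asset:
        return asset[field]
    return item["properties"].get(field)


print(resolve_field(item, "visual", "nodata", band_index=0))
print(resolve_field(item, "visual", "nodata", band_index=1))
```

With a lookup like this, a per-band nodata value no longer needs `raster:bands`; it sits next to the other band fields.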

gotcha thanks! I just found this discussion radiantearth/stac-spec#1213 (comment)

I will work on updating DLM to match STAC 1.1's treatment of bands as a core field, replacing the custom data object. cc @fmigneault-crim

m-mohr commented

Just to make this clear, nodata will move to common metadata, so that you can use it flexibly, also in bands.

I think these fields need to be partially in raster and partially in eo, and this will hopefully not feel awkward thanks to the changes we are making in the context of the bands RFC.

For example, nodata (not the value but the cover, maybe inverted: data_cover) would probably go into raster because it's not EO-specific. Saturated also seems more raster than EO.

On the other hand, all cloud-related fields seem to belong more in the EO domain.
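
A minimal sketch of what such a split could look like. `eo:cloud_cover` already exists in the eo extension; `raster:data_cover` and `raster:saturated_cover` are hypothetical names used only to illustrate the idea, and `split_cover_fields` is an invented helper.

```python
def split_cover_fields(nodata_cover, saturated_cover, cloud_cover):
    """Group cover percentages by the extension they might belong to."""
    for value in (nodata_cover, saturated_cover, cloud_cover):
        if not 0 <= value <= 100:
            raise ValueError("cover values must be percentages between 0 and 100")
    return {
        # Hypothetical raster fields: not EO-specific.
        # data_cover is the inverse of the nodata cover.
        "raster:data_cover": 100.0 - nodata_cover,
        "raster:saturated_cover": saturated_cover,
        # Existing eo field for the cloud-related value.
        "eo:cloud_cover": cloud_cover,
    }


properties = split_cover_fields(nodata_cover=20.0, saturated_cover=12.5, cloud_cover=37.5)
print(properties)
```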

A number of additional cover-related fields seem also to be present in Sentinel-3...

Wondering whether we could just put all the fields with mission specific names into the new statistics field in STAC 1.1...

I think mission-specific fields should remain in specific extensions. Otherwise, statistics will become a big blob of "any key" values, and search across collections for similar properties using different mission-specific conventions will become cumbersome. If some properties can be aligned between missions (eg: cloud cover), IMO it is much better to align them toward a common and well established field to increase chances of interoperability regardless of the mission.
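
To illustrate the interoperability concern, here is a small sketch comparing a cross-collection filter with and without an aligned field. The mission names and mission-specific keys are invented for the example; only `eo:cloud_cover` is an existing field.

```python
# Hypothetical per-mission keys for the same concept (cloud cover in percent).
MISSION_KEYS = {
    "mission-a": "a:cloudy_pixel_percentage",
    "mission-b": "b:cloud_fraction_pct",
}

items = [
    {"id": "a1", "collection": "mission-a",
     "properties": {"a:cloudy_pixel_percentage": 10, "eo:cloud_cover": 10}},
    {"id": "b1", "collection": "mission-b",
     "properties": {"b:cloud_fraction_pct": 60, "eo:cloud_cover": 60}},
    {"id": "b2", "collection": "mission-b",
     "properties": {"b:cloud_fraction_pct": 5, "eo:cloud_cover": 5}},
]


def filter_unaligned(items, limit):
    """Each collection uses its own key, so the client needs a lookup table."""
    kept = []
    for item in items:
        key = MISSION_KEYS.get(item["collection"])
        value = item["properties"].get(key) if key else None
        if value is not None and value <= limit:
            kept.append(item["id"])
    return kept


def filter_aligned(items, limit):
    """All collections share one field, so no per-mission knowledge is needed."""
    return [
        item["id"]
        for item in items
        if item["properties"].get("eo:cloud_cover", 101) <= limit
    ]


print(filter_unaligned(items, 20))
print(filter_aligned(items, 20))
```

Both functions return the same items here, but the unaligned version silently drops any collection missing from the lookup table, which is the maintenance burden described above.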

Agreed, but in the last 1-2 years no convention has evolved outside of cloud and snow/ice. As such, statistics is at least a way to specify additional coverages/percentages somewhat consistently. It's at least better than having all the Sentinel extensions (e.g. 2, 3), I think.

Highly related PR with example: radiantearth/stac-spec#1319