stac-extensions/processing

Add a `Software` object


This extension is used to describe processing history of both assets and the STAC metadata. For STAC metadata, the software that was used to create the data is often public (at least in part). We could provide more guidance/structure around how to provide that information. I haven't fully thought this through, but that might look like this:

| Field Name | Type | Description |
| --- | --- | --- |
| processing:software | List\<Software> | Software that was used to create assets or the STAC object itself, in order from the first software used to the last. |

And then the Software object:

| Field Name | Type | Description |
| --- | --- | --- |
| name | string | **REQUIRED.** The name of the software package or script that was used, e.g. `stactools-landsat`. |
| href | string | An href to the source code, e.g. https://github.com/stactools-packages/landsat. This could also be the href from a package management system, e.g. https://pypi.org/project/stactools-landsat/. |
| version | string | A version identifier, e.g. `v1.2.3`. This could also be a SHA from a version control system, e.g. `6dcb09b5b57875f334f61aebed695e2e4193db5e`. |
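
To make the proposed shape concrete, here is a minimal Python sketch of what `processing:software` could look like under the two tables above, together with a small check that only `name` is required. The helper `check_software_list` and the second entry (`my-postprocessing-script`) are hypothetical and exist only for illustration; nothing here is part of the extension.

```python
import json

# Hypothetical item properties using the proposed fields. Entries are ordered
# from the first software used to the last, as the proposal describes.
properties = {
    "processing:software": [
        {
            "name": "stactools-landsat",
            "href": "https://github.com/stactools-packages/landsat",
            "version": "v1.2.3",
        },
        # Hypothetical second entry: only `name` is required in the proposal.
        {"name": "my-postprocessing-script"},
    ]
}


def check_software_list(entries):
    """Return a list of problems found in a list of proposed Software objects."""
    problems = []
    for i, entry in enumerate(entries):
        name = entry.get("name")
        if not isinstance(name, str) or not name:
            problems.append(f"entry {i}: missing required 'name'")
        # `href` and `version` are optional, but must be strings when present.
        for key in ("href", "version"):
            if key in entry and not isinstance(entry[key], str):
                problems.append(f"entry {i}: '{key}' must be a string")
    return problems


print(json.dumps(properties, indent=2))
print(check_software_list(properties["processing:software"]))
```
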

m-mohr commented

If the href is the only thing you need in addition, why not just provide links?

> why not just provide links?

href is not required in my proposal, but wouldn't it be if we used links?

m-mohr commented

But if you don't have an href for a link, just don't add a Link Object to the links array? There would be no close relation between the software and the link though. I'm a bit hesitant of the breaking change...

> But if you don't have an href for a link, just don't add a Link Object to the links array? There would be no close relation between the software and the link though. I'm a bit hesitant of the breaking change...

I'm a little less worried because there will be a good forward migration path (use the current key as the name and the current value as the version). Your comment did make me realize that it should be a list, not a dictionary, and I've updated my proposal accordingly.
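
The migration path mentioned above could be sketched roughly as follows. This assumes the current `processing:software` value is a dictionary mapping a software name to a version string; the function name `migrate_software` is hypothetical.

```python
def migrate_software(old):
    """Convert a {name: version} mapping into a list of proposed Software objects.

    Assumes `old` is the current dictionary form of `processing:software`.
    Python dictionaries keep insertion order, so the resulting list follows
    the order of the original keys.
    """
    return [{"name": name, "version": version} for name, version in old.items()]


# Example with a single entry taken from the proposal's own examples.
old_value = {"stactools-landsat": "v1.2.3"}
new_value = migrate_software(old_value)
print(new_value)
```

Note that the dictionary form does not carry an explicit processing order, so whether the key order matches "first software used to last" is something the migrating party would need to confirm.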

@gadomski I've added a relation type `processing-software` so that this can be externalized, see #32.
The link could point to something even more detailed, e.g. a Pipfile.lock, a package-lock.json, or something custom like what you proposed.
I'm not quite sure we should have this level of detail in the STAC Items themselves, and it's still breaking ;-)
Maybe the rel type is a good enough compromise for now?
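
As a rough illustration of the link-based alternative, the sketch below builds a Link Object with the `processing-software` relation type. The helper `software_link`, the example.com URL, and the choice of media type are assumptions made for this example, not something defined in the thread.

```python
def software_link(href, media_type=None, title=None):
    """Build a STAC Link Object dict pointing at an external software description."""
    link = {"rel": "processing-software", "href": href}
    # `type` and `title` are optional Link Object fields; add them only if given.
    if media_type:
        link["type"] = media_type
    if title:
        link["title"] = title
    return link


# Hypothetical usage: point an Item at a lock file describing the environment.
item_links = [
    software_link(
        "https://example.com/processing/Pipfile.lock",  # placeholder URL
        media_type="application/json",
        title="Locked dependencies used for processing",
    )
]
print(item_links)
```
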