eurec4a/meta

platform metadata does not match schema

Closed this issue · 6 comments

d70-t commented

The current platform metadata does not conform to the schema definition (https://github.com/eurec4a/meta/blob/schema/schema/main_schema.yaml).

Validation errors are shown here: https://github.com/eurec4a/meta/runs/641175121#step:5:19

In particular the issues are:

  • ci is not yet listed as a personal tag. Should it be one? This may go along with the tags branch, which adds a more dynamic way to define tags: https://github.com/eurec4a/meta/blob/tags/tags.yaml
  • instrument_IDs is spelled instrument_ids in the YAML comment. Which spelling should be used?
  • several aliases are numeric ids, but the current schema suggests that aliases should be strings. Maybe it is a good idea to enforce the usage of strings in this place.

I have come to the conclusion that the ci tag is too restrictive, and that a person's function in the higher-level files should really be based on their being identified as the lc (lead contact) in the lower-level file.

I had been trying to use keyword-ID; barring a reason to choose something else, I suggest we adopt that.

I am unsure about the numeric ids in aliases. Normally they are only numbers that appear together with a character, but for the drifters there may be some pure numbers. In either case I would prefer to interpret these 'numbers' as strings.

d70-t commented

Then I'll remove the ci tags from where they are currently used.

The issue with instrument-ID might be that the minus sign can cause tricky behaviour in some programming languages. For example, JavaScript usually allows access to the keys of a mapping as a.key, but if the key contains a minus sign (platform.instrument-ID), the expression is read as a subtraction, and that is a problem. Apart from that, accessing it as platform["instrument-ID"] should work in any case. We should adhere to a consistent notation, though, and use it throughout all documents (including README.md). The current options are:

  • instrument_id
  • instrument_ID
  • instrument-ID

Personally I'd prefer instrument_id, but will adapt to any other choice as well.
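
To illustrate the concern about the minus sign, here is a minimal sketch in Python rather than JavaScript (the same parsing rule applies to attribute access there). The record and its value "radar" are hypothetical and not taken from the repository; supports_dot_access is a helper name made up for this example.

```python
from types import SimpleNamespace


def supports_dot_access(key: str) -> bool:
    """Return True if `key` can be used as obj.key (hypothetical helper)."""
    return key.isidentifier()


# Hypothetical platform record carrying both spellings of the key.
record = {"instrument_id": "radar", "instrument-ID": "radar"}

# Mapping-style access works for either spelling.
same_value = record["instrument_id"] == record["instrument-ID"]

# Attribute-style access only works with the underscore spelling.
# Writing platform.instrument-ID would be parsed as
# (platform.instrument) - (ID), i.e. a subtraction.
platform = SimpleNamespace(instrument_id="radar")
print(platform.instrument_id)

for key in ("instrument_id", "instrument_ID", "instrument-ID"):
    print(key, supports_dot_access(key))
```

Both underscore spellings remain usable with dot-style access; only the hyphenated one forces the bracket notation.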

Enforcing strings for the aliases is fine with me as well. I'll add quotes in the YAML file to disambiguate them and keep the parsers happy.
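
As a rough sketch of why the quotes matter: a YAML parser typically loads an unquoted 4101234 as an integer and a quoted "4101234" as a string. The check below works on the already-parsed data and uses only the standard library. The drifter entry, its alias values, and the function name non_string_aliases are all made up for illustration; the actual validation in this repository is done by the schema.

```python
def non_string_aliases(platform: dict) -> list:
    """Return every alias of a parsed platform entry that is not a string."""
    return [alias for alias in platform.get("aliases", [])
            if not isinstance(alias, str)]


# Hypothetical entry as a parser would return it: the first alias was
# written without quotes (parsed as int), the second with quotes (str).
drifter = {"name": "example drifter", "aliases": [4101234, "4101234"]}

offending = non_string_aliases(drifter)
print(offending)
```

An empty result would mean every alias already satisfies the "aliases are strings" rule.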

Yes, I had that in mind, but the solution "keyword-ID" was my way out, as you suggest. The problem I had with _id or _ID is that we were previously using hyphens for compound names and underscores as name delimiters, e.g., WP-3D_Track ... But maybe I am hanging on too much to the desire to parse filenames. Then again, since more of the EUREC4A community is behind me in this regard, maybe we should keep this capability... I really am undecided. Could we wait for other opinions?

d70-t commented

Why do you need the hyphen in the identifier? If the naming convention were:
<campaign_id> _ <platform_id>
then the actual resulting name would be:
EUREC4A_HALO
or similar. So apart from some difficulties in writing down the file naming conventions, I see no clash in notation...
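
A small sketch of that argument, assuming the two-field convention written above: the field names campaign_id and platform_id only label the parts and never appear in the file name itself, so their underscores cannot collide with the delimiter. The function parse_name and the second example name are hypothetical.

```python
def parse_name(name: str) -> dict:
    """Split '<campaign_id>_<platform_id>' at the first underscore."""
    campaign_id, platform_id = name.split("_", 1)
    return {"campaign_id": campaign_id, "platform_id": platform_id}


print(parse_name("EUREC4A_HALO"))
# A hyphenated compound value survives, because only "_" is a delimiter.
print(parse_name("EUREC4A_WP-3D"))
```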

Yes, I was just trying to be consistent in the metatext and the meta-metatext. But that makes the problem too stiff. Your point is a good one, so let's switch to _id or ID for everything.

d70-t commented

I changed it to the ..._id notation. With #4, these issues should be resolved and it could be merged. #3 is now in conflict with the schema branch.