Add `value_presence` slot to `slot_expression`
pkalita-lbl opened this issue · 2 comments
Similar to slots like `equals_string` and `equals_number`, the `value_presence` slot would be useful for describing class rules, i.e. expressing rules like "if slot `a` has any value, then slot `b` must also have a value".
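To make the intended semantics concrete, here is a minimal Python sketch of how such a rule might be evaluated against a record. It is not LinkML's implementation: the helper names `has_value` and `presence_rule_holds` are hypothetical, and treating `None` and an empty list as "no value" is an assumption.

```python
from typing import Any, Mapping


def has_value(record: Mapping[str, Any], slot: str) -> bool:
    """Assumption: a slot is 'present' when it is neither None nor an empty list."""
    value = record.get(slot)
    return value is not None and value != []


def presence_rule_holds(record: Mapping[str, Any], if_present: str, then_present: str) -> bool:
    """Return True unless `if_present` has a value while `then_present` does not."""
    if not has_value(record, if_present):
        return True  # precondition not met, so the rule does not apply
    return has_value(record, then_present)
```

With this sketch, `presence_rule_holds({"a": 1}, "a", "b")` fails the rule, while a record that omits `a` passes regardless of `b`.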
I don't mean to introduce complexity, but I'm just noting here the possibility of having a categorical "slot status" that covers this. It could have a "value present" value, as well as missing-value terms like those in https://www.insdc.org/submitting-standards/missing-value-reporting/
At the moment, for NCBI BioSample submission, a field can have, in addition to whatever data type it is, an extra categorical list of INSDC metadata values in its range. But such metadata would be shunted into the kind of slot/field you propose before going into a database. Rules could then be based on these values too, e.g. if field x is missing, don't make field y required.
This issue is about a metamodel predicate that allows us to do the equivalent of is null / is not null checks as part of conditional rule evaluation.
I think the enums that we might want for encoding missing values in sample data are a different use case, but we can look at this. I think the two-value tuple with conditional logic you describe could be represented as:
```yaml
classes:
  Sample:
    attributes:
      age:
        range: integer
      age_collection_missing_status:
        range: MissingValueEnum
    rules:
      # pseudocode:
      #
      # IF age_collection_missing_status IS NULL:
      #   THEN age is required
      - preconditions:
          slot_conditions:
            age_collection_missing_status:
              value_presence: ABSENT
        postconditions:
          slot_conditions:
            age:
              required: true
```
This is a little unintuitive/meta at first, as we are mixing the concept of null values at different levels.
It might be more intuitive to force an explicit collection status:
```yaml
classes:
  Sample:
    attributes:
      age:
        range: integer
      age_collection_status:
        range: CollectionStatusEnum
        required: true
    rules:
      - preconditions:
          slot_conditions:
            age_collection_status:
              equals_string: "COLLECTED"
        postconditions:
          slot_conditions:
            age:
              required: true
```
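As a rough illustration of what the explicit-status variant asks of the data, the following Python sketch checks the same two constraints by hand. It is only a sketch under assumptions: `validate_sample` is a hypothetical helper, and `NOT_COLLECTED` is an invented enum member (only `COLLECTED` appears in the thread).

```python
from enum import Enum
from typing import List, Optional


class CollectionStatus(Enum):
    COLLECTED = "COLLECTED"          # the only value named in the thread
    NOT_COLLECTED = "NOT_COLLECTED"  # hypothetical, for illustration only


def validate_sample(age: Optional[int],
                    age_collection_status: Optional[CollectionStatus]) -> List[str]:
    """Return a list of error messages; an empty list means the sample passes."""
    errors = []
    if age_collection_status is None:
        # The status slot itself is always required in this variant.
        errors.append("age_collection_status is required")
    elif age_collection_status is CollectionStatus.COLLECTED and age is None:
        # Rule: COLLECTED status makes age required.
        errors.append("age is required when age_collection_status is COLLECTED")
    return errors
```

Here the rule never has to ask whether a slot is null: the status is always present, so the condition is an ordinary equality check.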
Yet another way:
```yaml
classes:
  Sample:
    attributes:
      age:
        any_of:
          - range: integer
          - range: MissingDataReasonEnum
```
This has a straightforward mapping to something like Python (`Union`), but introduces a bit of a mismatch when mapping to a stringly typed relational database representation.
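A minimal sketch of that trade-off, assuming a hand-written dataclass rather than any generated code: the `MissingDataReason` members are illustrative only (the thread does not list them), and `age_to_column` is a hypothetical helper standing in for serialization to a text column.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class MissingDataReason(Enum):
    # Illustrative values only; the thread does not list the enum's members.
    NOT_COLLECTED = "not collected"
    NOT_PROVIDED = "not provided"


@dataclass
class Sample:
    # In Python the union is direct: age is either a number or a reason.
    age: Union[int, MissingDataReason]


def age_to_column(sample: Sample) -> str:
    """Flatten the union into one text column, losing the type distinction."""
    if isinstance(sample.age, MissingDataReason):
        return sample.age.value
    return str(sample.age)
```

In Python, `isinstance` is enough to tell the two cases apart. Once flattened to a single string column, a reader has to parse the text to recover whether a value is a number or a missing-data reason, which is the mismatch mentioned above.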