phenopackets/phenopacket-tools

duchenne.json does not conform to specification

cmungall opened this issue · 2 comments

I am using the experimental LinkML rendering of phenopackets, which has more rigorous validation than protobuf.

It finds an error with:

https://github.com/phenopackets/phenopacket-tools/blob/gh-pages/examples/phenopackets/duchenne.json

specifically the variantInterpretation

"variantInterpretation": {
  "acmgPathogenicityClassification": "PATHOGENIC",
  "variationDescriptor": {
    "variation": {
      "copyNumber": {
        "allele": {
          "sequenceLocation": {
            "sequenceId": "refseq:NC_000023.11",
            "sequenceInterval": {
              "startNumber": {
                "value": "31774144"
              },
              "endNumber": {
                "value": "31785736"
              }
            }
          }
        },
        "number": {
          "value": "1"
        }
      }
    },
    "geneContext": {
      "valueId": "HGNC:2928",
      "symbol": "DMD"
    },
    "expressions": [{
      "syntax": "hgvs.c",
      "value": "NM_004006.3:c.7310-11543_7359del"
    }, {
      "syntax": "transcript_reference",
      "value": "NM_004006.3"
    }],
    "moleculeContext": "genomic",
    "allelicState": {
      "id": "GENO:0000134",
      "label": "hemizygous"
    }
  }
}

The variationDescriptor lacks an id, as specified in the comments here: https://phenopacket-schema.readthedocs.io/en/latest/variant.html
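For anyone who wants to spot this without the LinkML tooling, here is a minimal sketch of a standalone check. It is not the LinkML validator or the phenopacket-tools validator; the helper name `find_descriptors_missing_id` is hypothetical, and the only assumption is that every `variationDescriptor` object should carry a non-empty `id`. The embedded fragment is a trimmed copy of the snippet above.

```python
import json


def find_descriptors_missing_id(node, path="$"):
    """Return JSON paths of variationDescriptor objects with no non-empty 'id'.

    Hypothetical helper for illustration; it only checks for the presence
    of 'id' and performs no other schema validation.
    """
    missing = []
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}"
            if (
                key == "variationDescriptor"
                and isinstance(value, dict)
                and not value.get("id")
            ):
                missing.append(child_path)
            # Keep descending so nested descriptors are also inspected.
            missing.extend(find_descriptors_missing_id(value, child_path))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            missing.extend(find_descriptors_missing_id(item, f"{path}[{index}]"))
    return missing


# Trimmed copy of the duchenne.json fragment quoted in this issue.
fragment = json.loads("""
{
  "variantInterpretation": {
    "acmgPathogenicityClassification": "PATHOGENIC",
    "variationDescriptor": {
      "geneContext": {"valueId": "HGNC:2928", "symbol": "DMD"},
      "moleculeContext": "genomic"
    }
  }
}
""")

print(find_descriptors_missing_id(fragment))
```

To run it against a whole file, load the phenopacket with `json.load` and pass the resulting object to the same function.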

The same applies to:

  • retinoblastoma
  • holoprosencephaly5

ielis commented

Hi @cmungall
thanks for pointing this out. Yes, the code that works with the VRS-like elements is a weak spot of the library/app.
We may need to adjust the builders as we go, because I am having serious trouble understanding the VRS specification. I come from a VCF/HGVS world and I am not sure how to map the variant types commonly encountered in those nomenclatures to VRS.

In #150 I reworked the examples to ensure they are valid according to the phenopacket-tools base validator.