microsoft/fhir-codegen

Add syntax checks to the deserializer

ewoutkramer opened this issue · 3 comments

Currently the deserializer is mostly concerned with the happy path. Add code that performs the same checks as the current SDK:

Xml syntactic checks

  • incorrect/empty namespace . P, R:namespace ignored
  • empty attributes . P, R: whitespace,null then value=null
  • no attribute, no elements P ->S, R:empty child list, value=null SKIP?
  • repeating elements inconsecutively P,->S
  • Invalid FHIR Xhtml P ->T
  • container resources with attributes . P, R:no contained found
  • contained resources with attributes . P, R:no contained found
  • container resources with multiple children . P, R:no contained found
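To make one of the XML checks above concrete, here is a minimal sketch of detecting "repeating elements inconsecutively". It is written in Python with the standard library purely for illustration and is not the SDK's or the codegen's implementation; the function name `nonconsecutive_repeats` is hypothetical.

```python
import xml.etree.ElementTree as ET


def nonconsecutive_repeats(parent):
    """Return child tags that reappear after a different tag intervened.

    Hypothetical helper: FHIR XML expects repeats of an element to be
    adjacent, so <name/><active/><name/> is reported, while
    <name/><name/><active/> is not.
    """
    seen = set()
    previous = None
    offenders = []
    for child in parent:
        tag = child.tag  # includes the namespace, e.g. "{http://hl7.org/fhir}name"
        if tag != previous and tag in seen and tag not in offenders:
            offenders.append(tag)
        seen.add(tag)
        previous = tag
    return offenders


bad = ET.fromstring(
    '<Patient xmlns="http://hl7.org/fhir">'
    '<name/><active value="true"/><name/>'
    '</Patient>'
)
good = ET.fromstring(
    '<Patient xmlns="http://hl7.org/fhir">'
    '<name/><name/><active value="true"/>'
    '</Patient>'
)
```

Whether such a finding is then skipped, reported, or thrown is a separate decision, which is what the per-item annotations in the list above record.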

Json syntactic checks

  • incorrect use of null P, R: skip
  • incorrect use/combinations of name and _name . P, R: return main
  • incorrect use of arrays for name and _name . P, R:return main
  • invalid json token type . P: skip unless some useful content found
  • empty string values ->T, P: return null
  • non-arrays with null values . P, R:return null
  • empty objects . P, R:return null
  • obsolete fhir_comments use . P, R:skip
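A rough sketch of a few of the JSON checks above (null outside an array, empty strings, empty objects, and mismatched array use between `name` and `_name`). This is illustrative Python, not the SDK's logic; the function name and the issue codes are made up for the example.

```python
import json


def check_primitive_pair(obj, name):
    """Return a list of (hypothetical) issue codes for `name` and `_name`."""
    issues = []
    shadow_name = "_" + name
    has_value = name in obj
    has_shadow = shadow_name in obj
    value = obj.get(name)
    shadow = obj.get(shadow_name)

    if has_value and value is None:
        issues.append("NULL_OUTSIDE_ARRAY")
    if value == "":
        issues.append("EMPTY_STRING")
    if has_shadow and shadow == {}:
        issues.append("EMPTY_OBJECT")
    if has_value and has_shadow and value is not None:
        # Both properties must agree on being arrays, and on their length.
        if isinstance(value, list) != isinstance(shadow, list):
            issues.append("ARRAY_MISMATCH")
        elif isinstance(value, list) and len(value) != len(shadow):
            issues.append("ARRAY_LENGTH_MISMATCH")
    return issues


name_part = json.loads('{"given": ["Ann", "B"], "_given": [null]}')
empty_part = json.loads('{"family": "", "_family": {}}')
null_part = json.loads('{"text": null}')
```

The recovery for each code (skip, return the main property, return null) would follow the annotations in the list above; the sketch only detects.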

For completeness, I'll add the semantic checks done in subsequent phases in the SDK:

Type checks

  • complex nodes with values ->T
  • unparseable primitive values .
  • incorrect primitive value ->T
  • NEW empty string in primitive value ->T
  • NEW no value and no children ->T
  • missing/superfluous contained resources ->T
  • element names missing suffix for choice types .
  • choice types with incorrect type ->T

for xml:

  • elements out of order ->T
  • incorrect use of xml nodes (attributes instead of elements) .

for json:

  • missing/superfluous use of arrays .
  • invalid xml in narrative ->T
  • Invalid FHIR Xhtml in narrative ->T

While discussing this, we could decide which of these errors need to be configurable. Alternatively, we could allow a callback so that the user can decide what to do with each error, or even insert correction logic.
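One possible shape for the callback idea, sketched in Python for illustration only. Every name here (`Issue`, `DeserializationError`, `strict`, `make_lenient`, `read_string`) is hypothetical and nothing below reflects an existing API. The deserializer reports each problem to a handler, which either raises or returns to let parsing continue with a fallback value.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Issue:
    code: str      # e.g. "EMPTY_STRING" (made-up code)
    path: str      # location in the instance, e.g. "Patient.id"
    message: str


class DeserializationError(Exception):
    pass


IssueHandler = Callable[[Issue], None]


def strict(issue: Issue) -> None:
    """Handler that turns every issue into an exception."""
    raise DeserializationError(f"{issue.path}: {issue.message}")


def make_lenient(sink: List[Issue], ignorable: Set[str]) -> IssueHandler:
    """Handler that records ignorable issues and raises on the rest."""
    def handler(issue: Issue) -> None:
        if issue.code in ignorable:
            sink.append(issue)
            return
        strict(issue)
    return handler


def read_string(raw: str, path: str, on_issue: IssueHandler) -> Optional[str]:
    """Toy deserializer step: an empty string is reported, then read as None."""
    if raw == "":
        on_issue(Issue("EMPTY_STRING", path, "empty string value"))
        return None
    return raw


collected: List[Issue] = []
lenient = make_lenient(collected, {"EMPTY_STRING"})
result = read_string("", "Patient.id", lenient)  # continues; issue is recorded
```

A handler that returns a replacement value instead of `None` would cover the "correction logic" case, at the cost of a more complicated contract between deserializer and caller.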

I forgot one semantic check that is currently not done: all elements of an array should be of the same type.
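A minimal sketch of that array check, assuming "same type" is read at the level of JSON token kinds and that `null` entries are tolerated as placeholders. Both are assumptions made for this example, and the helper names are hypothetical.

```python
def json_kind(value):
    """Classify a decoded JSON value by its token kind."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # must come before the number test
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def array_is_homogeneous(items):
    """True if all non-null items share one JSON kind."""
    kinds = {json_kind(item) for item in items if item is not None}
    return len(kinds) <= 1
```

For example, `["a", None, "b"]` passes, while `["a", {"id": "x"}]` does not. A check on the FHIR type of each element rather than its JSON kind would need the model metadata and is not shown here.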