mekomsolutions/openmrs-module-initializer

Add conceptsets domain to Initializer

mseaton opened this issue · 7 comments

The existing Concepts domain currently supports associating members with concept sets, and answers with concept questions, via single columns that expect a semicolon-delimited list of references. This poses several management challenges, namely:

  1. It's unwieldy. A diagnosis set with 500 concepts, or even a question with a few dozen answers, is difficult and cumbersome to manage in this way.

  2. It imposes nested dependencies that dictate the order in which Concepts must be listed in the file: a Concept that is an answer to, or a set member of, another Concept must be loaded in a row prior to the one in which it is referenced. This means that files cannot be organized in ways that facilitate their usability, e.g. alphabetically by English FSN.

In OCL, Collections of Concepts are exported in a JSON file that has 2 member properties:

  1. concepts - contains an entry for each Concept, without answers or set members
  2. mappings - all concepts and mappings are present here, and link set concepts with member concepts, and question concepts with answer concepts

Proposal:

Create a new conceptsets domain which is used for all types of sets (including answer sets). The format of this will be a CSV with the following headers:

| Concept | Member | Member Type (optional) | Sort Weight (optional) |
| --- | --- | --- | --- |
| Civil Status | Married | Q-AND-A | 1 |
| Civil Status | CIEL:1058 | Q-AND-A | 2 |
| NCD diagnoses | SNOMED CT:195967001 | CONCEPT-SET | 1 |
| NCD diagnoses | SNOMED CT:73211009 | CONCEPT-SET | 2 |

Member Type is optional, as in most cases it can be inferred from the Concept. If the Concept is a Set, then members are set members. If the Concept is a Coded Question, then members are answers. If a Concept is marked both as a Set and as a Question concept, then Member Type is required; if it is not present, the import will fail for those rows.
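
To make the inference rule concrete, here is a minimal Python sketch of how it might be expressed. This is not Initializer code: the function name `resolve_member_type` and its boolean parameters are hypothetical, and the behaviour for a Concept that is neither a Set nor a Coded Question is an assumption, since the proposal does not cover that case.

```python
from typing import Optional

# Member Type values as they appear in the example table.
Q_AND_A = "Q-AND-A"
CONCEPT_SET = "CONCEPT-SET"


def resolve_member_type(is_set: bool,
                        is_coded_question: bool,
                        member_type: Optional[str] = None) -> str:
    """Return the member type for one row, inferring it when the column is blank."""
    explicit = (member_type or "").strip().upper()
    if explicit:
        if explicit not in (Q_AND_A, CONCEPT_SET):
            raise ValueError(f"Unknown Member Type: {member_type!r}")
        return explicit
    if is_set and is_coded_question:
        # Ambiguous: the proposal says the row must fail without a Member Type.
        raise ValueError("Member Type is required for a Concept that is both "
                         "a Set and a Coded Question")
    if is_set:
        return CONCEPT_SET
    if is_coded_question:
        return Q_AND_A
    # Assumption: a Concept that is neither cannot take members.
    raise ValueError("Concept is neither a Set nor a Coded Question")
```

With this shape, only the ambiguous rows need the extra column filled in, which matches the intent that most files can leave Member Type out.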

Sort Weight is optional, and will default to the order in which the rows exist in the file if not explicitly specified.
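
A rough Python sketch of the default could look like the following. The header names are taken from the table above with the "(optional)" annotations dropped, which is an assumption, as is counting file order separately per Concept rather than across the whole file; `assign_sort_weights` is a hypothetical helper, not part of Initializer.

```python
import csv
import io
from typing import Dict, List


def assign_sort_weights(csv_text: str) -> List[Dict[str, object]]:
    """Fill in a blank Sort Weight with the row's position within its Concept."""
    rows: List[Dict[str, object]] = list(csv.DictReader(io.StringIO(csv_text)))
    position: Dict[str, int] = {}  # rows seen so far, per Concept
    for row in rows:
        concept = str(row["Concept"])
        position[concept] = position.get(concept, 0) + 1
        raw = str(row.get("Sort Weight") or "").strip()
        # An explicit value wins; otherwise fall back to file order.
        row["Sort Weight"] = float(raw) if raw else float(position[concept])
    return rows
```

For example, two "Civil Status" rows with a blank Sort Weight would receive 1.0 and 2.0 in the order they appear, while a row that specifies 5 keeps 5.0.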

@mks-d - let me know what you think of this proposal. The other obvious design consideration would be to have two separate domains - one for set members, and another for answers to questions - but I chose this design initially given the amount of overlapping functionality, and given that this is consistent with the way OCL manages things.

mks-d commented

@mseaton I prefer one file and using the Member Type column, but ultimately I will defer this to those who are manipulating concepts, @reagan-meant?

Otherwise this makes a lot of sense. There is however one challenge: what would be the process to remove a member? Maybe through a special value for sort weight (like 0 or -1)?

@mks-d good point. Using sort weight in this way could certainly work. If you prefer the consistency of the "Void/Retire" column, that could possibly also be added as an option.

mks-d commented

@mseaton yes indeed, Void/Retire is the way to go then.
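
For illustration only, the Void/Retire option discussed above might behave like this Python sketch. It models memberships as a plain dictionary rather than OpenMRS objects; `apply_rows` is a hypothetical helper, and treating the literal value "true" as the removal marker is an assumption.

```python
from typing import Dict, List


def apply_rows(existing: Dict[str, List[str]],
               rows: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Apply conceptsets rows to a copy of the existing memberships."""
    result = {concept: list(members) for concept, members in existing.items()}
    for row in rows:
        members = result.setdefault(row["Concept"], [])
        member = row["Member"]
        # Assumption: "true" in the Void/Retire column marks a removal.
        retire = (row.get("Void/Retire") or "").strip().lower() == "true"
        if retire:
            if member in members:
                members.remove(member)
        elif member not in members:
            members.append(member)
    return result
```

Note that members not mentioned in the file are left untouched here, so removal always has to be stated explicitly, one row per member.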

Thanks @mks-d @mseaton for these suggestions. Indeed it simplifies managing the concepts, which will imply removing the _order header from the concepts domain files and adding it to the conceptsets domain, since sets can be added to sets.

@reagan-meant great to hear this will work for you. I don't think we'll need the _order, as although there can be sets of sets, all of the concept references will be defined first, and then the set memberships will be defined second.

> There is however one challenge: what would be the process to remove a member?

@mks-d I have a PR for review for this issue. We will see how this performs. I implemented the Void/Retire as we agreed above, however there is no clean way right now to tell Iniz to only keep the specified set members and answers and to remove any not specified in the file. This may or may not be a common use case, but ideally I think it would be good to support it. One thought I had was that, if a "concepts.csv" domain file has an "Answers" or "Members" column defined, and if that column is empty for a given Concept row, then clear existing Answers/Members as appropriate. If the CSV does not have this column, do nothing. But if the CSV has the column and a Concept leaves values blank, then infer that this means it should have no members/answers. Thoughts on this modification?

This would basically allow us to load the concepts domain in first, have it clear out all existing answers and set members, and then load the conceptsets domain in next and have it populate these as appropriate. WYSIWYG.
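
The three cases for the concepts domain (column absent, column present but blank, column present with values) can be sketched in Python as follows. `members_after_concepts_load` is a hypothetical helper, and the assumption that a non-blank value replaces the current list (rather than adding to it) is mine, not something stated in this thread.

```python
from typing import Dict, List


def members_after_concepts_load(current: List[str],
                                row: Dict[str, str],
                                column: str = "Members") -> List[str]:
    """Return the members (or answers) of a Concept after its concepts.csv row loads."""
    if column not in row:
        # The CSV has no such column: do nothing.
        return list(current)
    value = (row[column] or "").strip()
    if not value:
        # Column present but blank: clear existing members/answers.
        return []
    # Assumption: a semicolon-delimited list replaces the current members.
    return [ref.strip() for ref in value.split(";") if ref.strip()]
```

Under this reading, a concepts.csv with a blank "Members" or "Answers" column would reset each Concept, and the conceptsets domain loaded afterwards would repopulate exactly what is listed in its file.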

@mseaton your proposal for clearing Answers or Members makes sense to me at first glance...

... and I'm late to the party, but overall this proposal makes sense...