Add translations for code systems
The ontology produced by the ontology generator is still missing the translations for the different languages.
These translations, which can be downloaded from the MII terminology server, should be added to the Elasticsearch files when the generator creates them.
The structure should change: currently the translations are either an array or the display does not exist at all. Instead there should be a display attribute with exactly three entries (original, en, de):
{
  "display": {
    "original": "Geriatrie",
    "en": "Geriatrix",
    "de": "Geriatrie"
  }
}
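A minimal sketch of a check for the target structure, useful when verifying generated files. The function name `is_valid_display` is hypothetical and not part of the generator; the only rule it encodes is the one stated above (exactly the keys original, en and de), plus the assumption that every value is a string.

```python
REQUIRED_KEYS = {"original", "en", "de"}


def is_valid_display(display: object) -> bool:
    """Return True if display is a dict with exactly the keys
    original, en and de, each holding a string (assumed value type)."""
    if not isinstance(display, dict):
        return False
    if set(display) != REQUIRED_KEYS:
        return False
    return all(isinstance(value, str) for value in display.values())


# The old array form is rejected, the new object form is accepted.
print(is_valid_display({"original": "Geriatrie", "en": "Geriatrix", "de": "Geriatrie"}))
print(is_valid_display(["Geriatrie"]))
```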
This should be changed for all files, i.e. for both index file inputs (onto_es__codeable_concept_* and onto_es__ontology_*).
Note that the files for the respective indices look different.
For the codeable concept the translations should be added to each object as follows:
**Codeable concept example changes**
IS:
{
  "termcode": {
    "code": "0200",
    "display": "Geriatrie",
    "system": "http://fhir.de/CodeSystem/dkgev/Fachabteilungsschluessel",
    "version": 2099
  },
  "value_sets": [
    "http://fhir.de/ValueSet/dkgev/Fachabteilungsschluessel"
  ]
}
SHOULD:
{
  "termcode": {
    "code": "0200",
    "display": "Geriatrie",
    "system": "http://fhir.de/CodeSystem/dkgev/Fachabteilungsschluessel",
    "version": 2099
  },
  "value_sets": [
    "http://fhir.de/ValueSet/dkgev/Fachabteilungsschluessel"
  ],
  "display": {
    "original": "Geriatrie",
    "en": "Geriatrix",
    "de": "Geriatrie"
  }
}
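The IS/SHOULD change for a codeable concept entry could be sketched as follows. The function name `add_display` and the shape of the `translations` argument (language code to text) are assumptions made for illustration. The fallback to the original display when a translation is missing is also an assumption, since the issue does not say what to do in that case.

```python
import copy


def add_display(entry: dict, translations: dict) -> dict:
    """Return a copy of a codeable concept entry with a top-level display object.

    translations maps language codes ("en", "de") to translated text.
    Assumption: a missing translation falls back to the original display.
    """
    result = copy.deepcopy(entry)  # leave the input untouched
    original = entry["termcode"]["display"]
    result["display"] = {
        "original": original,
        "en": translations.get("en", original),
        "de": translations.get("de", original),
    }
    return result
```

The nested termcode.display is kept unchanged, as in the SHOULD example.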
For the ontology the translations should be added to each object as follows:
Here the "name" attribute is removed and the display attribute with the translations is added instead.
**ontology example changes**
IS:
{
  "name": "1,1-Dimethoxy-(9Z)octadecene (DMA 18:1)/Oleate (C18:1w9) [Mass Ratio] in Fibroblast",
  "availability": 0,
  "terminology": "http://loinc.org",
  "termcode": "74620-6",
  "selectable": true,
  "context": {
    "system": "fdpg.mii.cds",
    "code": "Laboruntersuchung",
    "display": "Laboruntersuchung",
    "version": "1.0.0"
  },
  "termcodes": [
    {
      "system": "http://loinc.org",
      "code": "74620-6",
      "display": "1,1-Dimethoxy-(9Z)octadecene (DMA 18:1)/Oleate (C18:1w9) [Mass Ratio] in Fibroblast",
      "version": "2.78"
    }
  ],
  "criteria_sets": [],
  "translations": [],
  "parents": [],
  "children": [],
  "related_terms": [],
  "kds_module": "Labor"
}
SHOULD:
{
  "availability": 0,
  "terminology": "http://loinc.org",
  "termcode": "74620-6",
  "selectable": true,
  "context": {
    "system": "fdpg.mii.cds",
    "code": "Laboruntersuchung",
    "display": "Laboruntersuchung",
    "version": "1.0.0"
  },
  "termcodes": [
    {
      "system": "http://loinc.org",
      "code": "74620-6",
      "display": "1,1-Dimethoxy-(9Z)octadecene (DMA 18:1)/Oleate (C18:1w9) [Mass Ratio] in Fibroblast",
      "version": "2.78"
    }
  ],
  "criteria_sets": [],
  "display": {
    "original": "1,1-Dimethoxy-(9Z)octadecene (DMA 18:1)/Oleate (C18:1w9) [Mass Ratio] in Fibroblast",
    "en": "1,1-Dimethoxy-(9Z)octadecene (DMA 18:1)/Oleate (C18:1w9) [Mass Ratio] in Fibroblast",
    "de": "1,1-Dimethoxy-(9Z)octadecene (DMA 18:1)/Oleate (C18:1w9) [Mass Verhältnis] in Fibroblas"
  },
  "parents": [],
  "children": [],
  "related_terms": [],
  "kds_module": "Labor"
}
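A rough sketch of the ontology entry conversion, under the same assumptions as for the codeable concept (hypothetical function name, missing translations fall back to the original text). Judging from the example, the old "translations" array also disappears, so the sketch drops it together with "name".

```python
def convert_ontology_entry(entry: dict, translations: dict) -> dict:
    """Return a new ontology entry where name is replaced by a display object.

    Assumption: the former name becomes display.original, and a missing
    translation falls back to that original text.
    """
    # Drop "name" and the old "translations" array; keep everything else.
    result = {k: v for k, v in entry.items() if k not in ("name", "translations")}
    original = entry["name"]
    result["display"] = {
        "original": original,
        "en": translations.get("en", original),
        "de": translations.get("de", original),
    }
    return result
```

The display object ends up as the last key here rather than after criteria_sets. JSON object key order carries no meaning, so this should not matter for indexing.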
Downloading the translations from the terminology server
The translations for a given code of a code system can be downloaded from the terminology server as follows.
Where the translation is part of the code system:
Example call for SNOMED CT (sct):
https://onto-server-base-url/fhir/ValueSet/$expand?url=https%3A%2F%2Fwww.medizininformatik-initiative.de%2Ffhir%2Fcore%2Fmodul-diagnose%2FValueSet%2Fdiagnoses-sct&displayLanguage=de&force-system-version=http%3A%2F%2Fsnomed.info%2Fsct%7Chttp%3A%2F%2Fsnomed.info%2Fsct%2F11000274103&includeDesignations=true
Example call for LOINC:
https://onto-server-base-url/fhir/ValueSet/$expand?url=https%3A%2F%2Fwww.medizininformatik-initiative.de%2Ffhir%2Fext%2Fmodul-icu%2FValueSet%2FCode-Monitoring-und-Vitaldaten-LOINC&includeDesignations=true&designation=urn%3Aietf%3Abcp%3A47%7Cde-DE
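The two example calls can be assembled and evaluated roughly like this. It is a sketch only: the helper names are hypothetical, and the parsing relies on the standard FHIR layout of an expansion (expansion.contains[].designation[] with language and value). Nested contains entries are not traversed.

```python
from typing import Dict, Tuple
from urllib.parse import urlencode


def build_expand_url(base_url: str, value_set_url: str, params: Dict[str, str]) -> str:
    """Build a ValueSet/$expand URL with percent-encoded query parameters."""
    query = urlencode({"url": value_set_url, **params})
    return f"{base_url}/ValueSet/$expand?{query}"


def designations_by_coding(expansion: dict, language: str) -> Dict[Tuple[str, str], str]:
    """Map (system, code) to the first designation in the given language."""
    result: Dict[Tuple[str, str], str] = {}
    for concept in expansion.get("expansion", {}).get("contains", []):
        for designation in concept.get("designation", []):
            if designation.get("language") == language:
                result[(concept["system"], concept["code"])] = designation["value"]
                break
    return result


# Reproduces the LOINC example call from above.
print(build_expand_url(
    "https://onto-server-base-url/fhir",
    "https://www.medizininformatik-initiative.de/fhir/ext/modul-icu/ValueSet/Code-Monitoring-und-Vitaldaten-LOINC",
    {"includeDesignations": "true", "designation": "urn:ietf:bcp:47|de-DE"},
))
```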
Where the translation is added using an FDPG translation supplement:
Note that the supplement URL has to be known in order to add the supplement.
@jpwiedekopf suggested there will be a registry document on the Ontoserver, which can be called as follows:
https://onto-server-base-url/fhir/CodeSystem/fdpg-supplement-registry
@paulolaup TODO - check how to correctly load a supplement
https://onto-server-base-url/fhir/ValueSet/$expand?url=https://www.medizininformatik-initiative.de/fhir/core/modul-person/ValueSet/Vitalstatus&useSupplement=https://example.org/fhir/CodeSystem/KDS/Person/Vitalstatus/translations|1.0.0
@paulolaup, @Frontman50 - make sure to consider that some code systems (like sct) need to have the version explicitly set in order to expand the designations, e.g.:
https://onto-server-base-url/fhir/ValueSet/$expand?url=https%3A%2F%2Fwww.medizininformatik-initiative.de%2Ffhir%2Fcore%2Fmodul-diagnose%2FValueSet%2Fdiagnoses-sct&displayLanguage=de&force-system-version=http%3A%2F%2Fsnomed.info%2Fsct%7Chttp%3A%2F%2Fsnomed.info%2Fsct%2F11000274103&includeDesignations=true
Yes. The question is whether we need some initial configuration in the ontology generator, loaded by a designation resolver, so that it can determine whether a version is needed to resolve the value sets for a code system it encounters, or whether the FDPG designation supplement concept map provides this information for us.
Current implementation plan:
Implement a separate TermcodeDesignationResolver class handling the designation resolution logic:
- Download and cache the supplement registry CodeSystem resource from the specified terminology server (the system URL of this CodeSystem resource should ideally be provided by some externalized configuration, e.g. a config file).
- For every term code encountered during the Elasticsearch file generation, check whether the corresponding value set expansion was already retrieved previously:
  - If yes, look up the designations for the coding.
  - If no, call ValueSet-expand using the mapping in the supplement registry, cache the result, and look up the designation afterward.
- Generate the display entry.
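The plan above might look roughly like the following sketch. Several things are assumptions and not taken from the thread: the registry is modelled as a plain dict from code system URL to value set URL, the actual server call is injected as an `expand` callable so the caching logic can be shown without network access, and missing translations fall back to the original display.

```python
from typing import Callable, Dict, Tuple

Coding = Tuple[str, str]  # (system, code)


class TermcodeDesignationResolver:
    """Resolve display translations, expanding each value set at most once."""

    def __init__(self, registry: Dict[str, str], expand: Callable[[str], dict]) -> None:
        self._registry = registry  # assumed shape: code system URL -> value set URL
        self._expand = expand      # performs the ValueSet expansion call
        self._cache: Dict[str, Dict[Coding, Dict[str, str]]] = {}

    def _designations(self, value_set_url: str) -> Dict[Coding, Dict[str, str]]:
        if value_set_url not in self._cache:
            # Not retrieved yet: expand once and index the designations.
            expansion = self._expand(value_set_url)
            index: Dict[Coding, Dict[str, str]] = {}
            for concept in expansion.get("expansion", {}).get("contains", []):
                by_language = index.setdefault((concept["system"], concept["code"]), {})
                for designation in concept.get("designation", []):
                    if "language" in designation:
                        by_language.setdefault(designation["language"], designation["value"])
            self._cache[value_set_url] = index
        return self._cache[value_set_url]

    def resolve(self, system: str, code: str, original: str) -> Dict[str, str]:
        """Generate the display entry for one term code."""
        display = {"original": original, "en": original, "de": original}
        value_set_url = self._registry.get(system)
        if value_set_url is None:
            return display  # no mapping known, keep the fallback
        found = self._designations(value_set_url).get((system, code), {})
        for language in ("en", "de"):
            if language in found:
                display[language] = found[language]
        return display
```

Injecting `expand` keeps the HTTP details (authentication, version parameters) out of the resolver and makes the cache behaviour easy to test.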
Open questions:
- Since both the supplement registry (as per the proposal) and the operation call refer to ValueSet resources, we likely have to resolve the correct ValueSet instance using the information provided during the generation process. @paulolaup
- See comment above @paulolaup
Correction: you do not explicitly have to specify the version of SNOMED CT. The full version of the current German Edition is http://snomed.info/sct/11000274103/version/20240515. The example specified by @juliangruendner provides only the edition URI http://snomed.info/sct/11000274103, which works correctly by "just" using the most current version of the German Edition indexed by the server (which I'll keep current when new versions are released, so you can take advantage of new translations ASAP).
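To make the difference between the two URI forms concrete, here is a small sketch that tells an edition URI from a full version URI. The helper and type names are hypothetical; the pattern follows the two URIs quoted above (module ID, optionally followed by /version/ and an eight-digit date).

```python
import re
from typing import NamedTuple, Optional

_SCT_URI = re.compile(
    r"^http://snomed\.info/sct/(?P<module>\d+)(?:/version/(?P<version>\d{8}))?$"
)


class SctVersion(NamedTuple):
    module_id: str          # e.g. the German Edition module
    version: Optional[str]  # YYYYMMDD, or None for an edition-only URI


def parse_sct_uri(uri: str) -> Optional[SctVersion]:
    """Split a SNOMED CT edition or version URI; return None for anything else."""
    match = _SCT_URI.match(uri)
    if match is None:
        return None
    return SctVersion(match.group("module"), match.group("version"))


print(parse_sct_uri("http://snomed.info/sct/11000274103"))
print(parse_sct_uri("http://snomed.info/sct/11000274103/version/20240515"))
```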
Addition to specification above:
The current implementation should be extended to populate the translations from two different sources.
- The initial map of code systems and their translations should be populated from the value sets of code systems where translations exist.
For this, a mapping file should contain, for each code system, the information on how to resolve the language-specific information.
The information should be configurable via JSON:
{
  "code_system_translations": {
    "http://snomed.info/sct": {
      "parameters": [
        {
          "name": "version",
          "valueUri": "http://snomed.info/sct/11000274103"
        },
        {
          "name": "property",
          "valueString": "designation"
        },
        {
          "name": "displayLanguage",
          "valueUri": "de"
        }
      ]
    },
    "http://loinc.org": {
      "parameters": [
        {
          "name": "property",
          "valueString": "property"
        }
      ]
    }
  }
}
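One possible way to turn such a config entry into the input of a lookup for a single code is sketched below. The function name is hypothetical. The sketch assumes that the configured parameters are passed through unchanged and that system and code are added as the standard FHIR lookup input parameters.

```python
from typing import Optional


def lookup_parameters(config: dict, system: str, code: str) -> Optional[dict]:
    """Build a FHIR Parameters resource for looking up one code.

    Returns None when the code system is not listed in the config.
    """
    entry = config["code_system_translations"].get(system)
    if entry is None:
        return None
    return {
        "resourceType": "Parameters",
        "parameter": [
            {"name": "system", "valueUri": system},
            {"name": "code", "valueCode": code},
            *entry["parameters"],  # passed through as configured
        ],
    }


config = {
    "code_system_translations": {
        "http://loinc.org": {
            "parameters": [{"name": "property", "valueString": "property"}]
        }
    }
}
print(lookup_parameters(config, "http://loinc.org", "74620-6"))
```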
Based on this information, all codes should be looked up, but in batches, as follows:
https://documenter.getpostman.com/view/145584/SWTD6wPM#97fb3c6e-2241-46d4-b4d1-382f46a98933
Note that this will have to be done for all entries in all ui_trees and all entries in all value sets.
For each ui_tree and each value_set, check whether the code system is in the JSON config above:
if yes: create the lookup in batches
if no: skip the entry
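The batching step could be sketched as below, assuming the lookups are sent as FHIR batch Bundles whose entries each POST a Parameters resource to CodeSystem/$lookup. The function name, the default batch size of 100, and the flat (system, code) input are assumptions; collecting the codings from the ui_trees and value sets is left out.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def _bundle(entries: List[dict]) -> dict:
    return {"resourceType": "Bundle", "type": "batch", "entry": entries}


def lookup_batches(
    codings: Iterable[Tuple[str, str]], config: dict, batch_size: int = 100
) -> Iterator[dict]:
    """Yield batch Bundles with one lookup request per configured coding."""
    configured: Dict[str, dict] = config["code_system_translations"]
    entries: List[dict] = []
    for system, code in codings:
        if system not in configured:
            continue  # code system not in the config: skip the entry
        parameters = {
            "resourceType": "Parameters",
            "parameter": [
                {"name": "system", "valueUri": system},
                {"name": "code", "valueCode": code},
                *configured[system]["parameters"],
            ],
        }
        entries.append(
            {"resource": parameters, "request": {"method": "POST", "url": "CodeSystem/$lookup"}}
        )
        if len(entries) == batch_size:
            yield _bundle(entries)
            entries = []
    if entries:  # final, possibly smaller batch
        yield _bundle(entries)
```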
- Add translations from supplements
As in the current implementation, the translations from all supplements should be added to the resolver.
Additionally, there should be an option to download new translation supplements (update_translation_supplements). When enabled, it first looks up all code systems in the FDPG supplement registry (URL: https://ontoserver-base-url/fhir/CodeSystem/fdpg-supplement-registry), then downloads all FDPG supplements and writes them to a folder in the repository on the local file system. This folder is listed in .gitignore.
- Additional algorithm information
The algorithm should consider the following:
a. Supplements are weaker translations than the ones that are directly part of a code system (code system translations).
-> When adding supplement translations, they should only be added if no code system translation is available.
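The precedence rule can be illustrated with a small merge function. The data layout ((system, code) mapped to language and text) and the function name are assumptions. The sketch also reads "no code system translation is available" per language, which the issue does not spell out.

```python
from typing import Dict, Tuple

Translations = Dict[Tuple[str, str], Dict[str, str]]  # (system, code) -> {language: text}


def merge_translations(code_system: Translations, supplements: Translations) -> Translations:
    """Merge both sources; a supplement never overrides a code system translation."""
    merged: Translations = {key: dict(value) for key, value in code_system.items()}
    for key, by_language in supplements.items():
        target = merged.setdefault(key, {})
        for language, text in by_language.items():
            # Only fill the gap; keep an existing code system translation.
            target.setdefault(language, text)
    return merged
```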
I'd propose to use a proper Parameters resource as the value for each code system URL.
Where should the config go? Should it be located in the example directory similar to the translations?
I modified the initial issue. All instances of "de-DE" shall be replaced with simply "de" (and "en-US" with just "en")
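If language tags still arrive in the regional form, for example from server responses, a tiny normalization step along these lines might help. It is a sketch with a hypothetical function name; it only maps the two tags named above and leaves everything else unchanged.

```python
# Only the two replacements named in the comment above are applied.
_LANGUAGE_ALIASES = {"de-DE": "de", "en-US": "en"}


def normalize_language(tag: str) -> str:
    """Map de-DE to de and en-US to en; return any other tag unchanged."""
    return _LANGUAGE_ALIASES.get(tag, tag)


print(normalize_language("de-DE"))
print(normalize_language("en"))
```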