EHDEN/ETL-UK-Biobank

ICD codes not found in OMOP vocabulary

After a run on a real data subset (25-01), 89 ICD9CM codes and 272 ICD10 codes were not found in the OMOP vocabulary.

2021-01-25 16:36:08,259 - INFO - Building mapping dictionary for vocabularies: ICD9CM
2021-01-25 16:36:09,559 - WARNING - No mapping to standard concept_id could be generated for 89/503 codes: {'E86.819', '733.46', 'E95.02', 'E88.89', 'E92.089', 'E90.609', '340.9', 'E88.499', '757.68', '492.9', 'E87.99', 'E93.00', 'E88.604', 'E96.89', '920.9', 'E96.82', 'E93.61', 'E88.799', '430.9', 'E87.82', 'E91.70', 'E96.00', 'E91.700', 'E81.91', 'E88.29', 'E85.04', 'E87.81', '486.9', 'E95.03', 'E91.69', 'E91.704', '727.44', '321.5', '799.93', 'E96.69', 'E92.89', 'E90.199', 'E81.82', '226.9', '217.9', '986.9', 'E87.98', 'V25.89', '631.9', 'E95.09', '757.61', 'E88.79', '321.7', 'E91.799', '605.9', 'E90.109', 'E88.11', 'E88.595', '714.09', 'E95.00', '470.9', 'E88.60', 'E91.190', '998.19', 'E87.88', 'E87.86', 'E81.36', '462.9', '729.59', 'E95.04', '727.47', 'E91.796', 'E93.22', 'E84.29', 'V66.01', '463.9', 'E92.59', 'E81.20', 'E81.30', '493.99', '380.19', '626.09', '280.99', '412.9', '728.94', 'E91.89', '541.9', 'E95.05', 'E81.99', '752.50', '250.09', '599.79', 'E81.47', '702.9'}
2021-01-25 16:36:09,589 - INFO - Building mapping dictionary for vocabularies: ICD10
2021-01-25 16:36:11,334 - WARNING - No mapping to standard concept_id could be generated for 272/4414 codes: {'W18.2', 'M77.47', 'X84.9', 'M21.67', 'M77.02', 'X00.0', 'X45.0', 'X61.5', 'I70.21', 'U51.0', 'W29.9', 'M77.03', 'W17.8', 'M76.67', 'X23.0', 'X46.0', 'S42.00', 'M54.64', 'W29.0', 'W60.0', 'W23.2', 'M70.69', 'X51.8', 'M54.46', 'X58.2', 'M94.09', 'M79.46', 'W45.8', 'M54.57', 'Y00.9', 'W10.8', 'W10.2', 'W21.3', 'W27.9', 'I70.00', 'W11.8', 'X41.9', 'X41.0', 'J96.99', 'J96.11', 'M77.37', 'Y04.8', 'M65.43', 'M72.27', 'X42.9', 'I70.20', 'W07.2', 'Y04.0', 'W50.3', 'W01.3', 'W54.4', 'X65.5', 'M54.47', 'W20.5', 'M70.65', 'W25.6', 'M54.39', 'M85.20', 'W19.2', 'M54.55', 'Y08.30', 'X40.9', 'W12.9', 'W13.9', 'W64.3', 'X51.1', 'M76.87', 'W03.5', 'M54.37', 'W01.8', 'X61.9', 'S06.00', 'M54.38', 'W20.9', 'X50.9', 'W19.4', 'W06.9', 'W18.5', 'W02.8', 'W00.8', 'W50.9', 'W45.0', 'X12.0', 'X61.0', 'M70.46', 'I70.10', 'X50.0', 'X59.4', 'X49.2', 'W19.6', 'X58.3', 'M21.37', 'W18.0', 'X51.9', 'X68.9', 'W06.0', 'W49.9', 'W01.31', 'X41.8', 'M65.34', 'W01.9', 'W19.0', 'X78.0', 'M77.57', 'W07.9', 'W19.8', 'W50.8', 'W18.9', 'X59.2', 'W50.4', 'M72.14', 'W54.09', 'M91.19', 'W02.0', 'W64.9', 'W01.2', 'W01.48', 'W01.5', 'W01.0', 'W08.0', 'X91.9', 'J96.01', 'X40.0', 'M72.04', 'W23.8', 'X64.5', 'W25.0', 'M91.15', 'W45.9', 'W17.0', 'M76.66', 'W31.6', 'X64.4', 'W23.9', 'M54.22', 'S72.00', 'W00.4', 'M21.76', 'X65.9', 'W23.0', 'W10.0', 'W06.1', 'W19.9', 'Y28.9', 'J96.91', 'W10.3', 'X65.0', 'X99.9', 'J96.90', 'W02.3', 'M21.75', 'W16.8', 'Y29.9', 'M54.36', 'W55.8', 'X64.0', 'W19.1', 'M54.33', 'X99.0', 'M54.23', 'W79.9', 'W31.2', 'X51.4', 'Y04.9', 'W22.9', 'W44.8', 'W29.8', 'W22.0', 'W27.99', 'X60.4', 'M76.69', 'W26.2', 'X63.0', 'X16.0', 'M54.49', 'M72.09', 'X44.0', 'W01.4', 'M45.X9', 'W20.0', 'W64.2', 'S12.00', 'M54.59', 'W88.2', 'W22.5', 'W50.1', 'W57.8', 'W54.9', 'X64.9', 'W84.2', 'M54.58', 'W19.3', 'S02.00', 'W18.3', 'X49.9', 'W25.9', 'X50.3', 'M76.85', 'W07.0', 'W18.4', 'W10.9', 'W17.9', 'X61.8', 
'W22.6', 'W23.52', 'W18.8', 'X45.9', 'W79.2', 'W13.0', 'W02.9', 'W64.5', 'X11.0', 'X62.0', 'W23.5', 'X58.9', 'X62.4', 'W21.30', 'M54.50', 'W07.8', 'X60.9', 'X46.6', 'W00.9', 'X59.3', 'W00.0', 'W51.3', 'W00.3', 'X50.5', 'W54.0', 'M77.12', 'W00.5', 'X62.9', 'W79.0', 'W57.9', 'W86.2', 'W55.9', 'W27.0', 'W49.0', 'W18.6', 'W19.5', 'X44.9', 'X49.0', 'W18.1', 'W25.8', 'X16.9', 'J96.00', 'W78.9', 'U51.1', 'X60.0', 'Y04.4', 'X23.9', 'W11.0', 'W44.0', 'W06.2', 'W23.6', 'W31.9', 'M71.26', 'Y00.5', 'X50.8', 'M21.64', 'X58.5', 'M99.81', 'W54.8', 'W11.6', 'M54.56', 'M70.04', 'Y09.9', 'W44.9', 'W11.9', 'W28.1', 'W25.5', 'W10.5', 'X42.0'}

Why are these ICD codes not mapped?

ICD9: some examples. The code on the left is the format we currently produce; the code on the right is the one that should be used:

  • 'E87.86', E878.6 (Removal of other organ (partial) (total) causing abnormal patient reaction, or later complication, without mention of misadventure at time of operation Abn reac-organ rem NEC)
  • '470.9', 470 (Deviated nasal septum)
  • '463.9', 463 (Acute tonsillitis)
  • 'E88.799'/'E88.79', E887 (Fracture, cause unspecified)
  • '799.93', 7999 (Other unknown and unspecified causes of morbidity and mortality)
  • '462.9', 462 (Acute pharyngitis)
  • '321.5', 072.1 (Mumps meningitis)
  • 'E91.89', E918 (Caught accidentally in or between objects)
  • 'E95.09', E950.9 (Suicide and self-inflicted poisoning by other and unspecified solid and liquid substances)
  • 'E95.03', E950.3 (Suicide and self-inflicted poisoning by tranquilizers and other psychotropic agents)
  • 'E91.70'/'E91.704'/'E91.700', E917.0 (Striking against or struck accidentally by objects or persons in sports)
  • '321.7', 321.2 (Meningitis due to viruses not elsewhere classified)
  • 'V66.01', V66.0 (Convalescence following surgery)

ICD10:

  • 'W18.2', ICD10CM (Fall in (into) shower or empty bathtub)
  • 'M77.47', ICD10CM/ICD10 M77.4 (Metatarsalgia)
  • 'X84.9', ICD10 X84 (Intentional self-harm by unspecified means)
  • 'M21.67', ICD10 M21.6 (Other acquired deformities of ankle and foot)
  • 'M77.02', ICD10CM (Medial epicondylitis, left elbow) or ICD10 M77.0 (Medial epicondylitis)
  • 'X00.0', ICD10 X00 (Exposure to uncontrolled fire in building or structure)
  • 'X45.0', ICD10 X45 (Accidental poisoning by and exposure to alcohol)
  • 'X61.5', ICD10 X61 (Intentional self-poisoning by and exposure to antiepileptic, sedative-hypnotic, antiparkinsonism and psychotropic drugs, not elsewhere classified)
  • 'I70.21', ICD10 I70.2 (Atherosclerosis of arteries of extremities)

Two fixes we can implement:

  • For ICD9: when a code starts with 'E', move the dot so that it follows the fourth character ('E87.86' becomes 'E878.6').
  • For ICD10: keep only the first four characters, not counting the dot ('I70.21' becomes 'I70.2').

Potentially an additional rule:

  • Keep only the first three characters for ICD10 codes starting with W, X or Y ('X84.9' becomes 'X84'). It still needs to be checked whether ICD10 codes in these chapters ever have a third digit in the vocabulary.
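One way to check the open question above, as a minimal sketch: it assumes the ICD10 concept codes have already been loaded into a list, for example from the concept_code column of the OMOP CONCEPT table filtered on vocabulary_id = 'ICD10'. The function name `wxy_codes_with_third_digit` is hypothetical and not part of the ETL.

```python
from typing import Iterable, List


def wxy_codes_with_third_digit(concept_codes: Iterable[str]) -> List[str]:
    """Return the codes starting with W, X or Y that are longer than three characters.

    An empty result means that, in the supplied vocabulary codes, these
    chapters never carry a third digit, so truncating to three characters
    loses no mappable detail.
    """
    return sorted(
        code
        for code in concept_codes
        if code[:1].upper() in ("W", "X", "Y")
        and len(code.replace(".", "")) > 3
    )


# Toy input for illustration only; real input would come from the vocabulary.
print(wxy_codes_with_third_digit(["W18", "X84", "I70.2"]))    # []
print(wxy_codes_with_third_digit(["W18", "W18.2", "I70.2"]))  # ['W18.2']
```

If the list comes back empty for the full vocabulary, the three-character rule is safe to apply to every W, X and Y code.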

The following rules are now implemented.

For ICD9 codes:

  • E chapters map to format EXXX.X
  • V chapters map to format VXX.X
  • Numeric codes with four or five digits map to their first three digits ('470.9' becomes '470').
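The three ICD9 rules above can be sketched as follows. This is an illustration of the rules as listed, not the code in the ETL; the function name `normalize_icd9` is hypothetical, and the choice to return codes that match none of the rules unchanged is an assumption.

```python
def normalize_icd9(code: str) -> str:
    """Reformat a source ICD9 code according to the three rules listed above."""
    cleaned = code.strip().upper()
    compact = cleaned.replace(".", "")  # drop the misplaced dot

    if compact.startswith("E"):
        # E chapter: format EXXX.X (dot after the fourth character)
        head, tail = compact[:4], compact[4:5]
        return f"{head}.{tail}" if tail else head

    if compact.startswith("V"):
        # V chapter: format VXX.X (dot after the third character)
        head, tail = compact[:3], compact[3:4]
        return f"{head}.{tail}" if tail else head

    if compact.isdigit() and len(compact) in (4, 5):
        # Numeric codes with four or five digits: keep the first three digits
        return compact[:3]

    # Assumption: anything else is passed through unchanged
    return cleaned


for source_code in ("E87.86", "E91.704", "V66.01", "470.9"):
    print(source_code, "->", normalize_icd9(source_code))
# E87.86 -> E878.6
# E91.704 -> E917.0
# V66.01 -> V66.0
# 470.9 -> 470
```

Applied literally, these rules turn 'E88.799' into 'E887.9' and '799.93' into '799', which differs from the targets suggested in the example list earlier in this issue (E887 and 7999), so a few codes may still need a fallback.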

For ICD10 codes:

  • Keep only the first three characters for ICD10 codes starting with W, X or Y.
  • Keep only the first four characters, not counting the dot ('I70.21' becomes 'I70.2').
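A minimal sketch of the two ICD10 rules, assuming the W/X/Y rule takes precedence over the four-character rule. The function name `normalize_icd10` is hypothetical; this is not the code in the ETL.

```python
def normalize_icd10(code: str) -> str:
    """Truncate a source ICD10 code according to the two rules listed above."""
    cleaned = code.strip().upper()
    compact = cleaned.replace(".", "")

    if compact[:1] in ("W", "X", "Y"):
        # Codes starting with W, X or Y: keep the first three characters only
        return compact[:3]

    if len(compact) > 4:
        # All other codes: keep four characters, with the dot after the third
        return f"{compact[:3]}.{compact[3]}"

    # Already short enough; leave as is
    return cleaned


for source_code in ("I70.21", "M77.47", "W18.2", "X84.9"):
    print(source_code, "->", normalize_icd10(source_code))
# I70.21 -> I70.2
# M77.47 -> M77.4
# W18.2 -> W18
# X84.9 -> X84
```

Truncation trades detail for coverage: 'M77.02' loses its laterality and maps to the general M77.0 concept, as in the ICD10 examples listed earlier in this issue.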