CUD2V/pccc

ICD 10 DX codes for Malignancy

dewittpe opened this issue · 2 comments

Documentation for malignancy ICD10 codes includes "C00-C96" (https://github.com/CUD2V/pccc/blob/master/inst/pccc_references/Categories_of_CCCv2_and_Corresponding_ICD.docx)

In the src code for the package, however, only "C" is defined.

dx_malignancy = {"C","D00","D01","D02","D03","D04","D05","D06","D07","D08","D09","D37","D38",

Do we need to explicitly define C00, C01, C02, C03, C04, ..., C96? I think so, for two reasons:

  1. Note that the D01-D09 codes are explicitly defined.
  2. Here is an example of an errant mapping.
library(pccc)
packageVersion("pccc")
# [1] ‘1.0.5’

# id2 has a made up code "CB" which should not match anything, but returns true
# for malignancy
eg_data <- data.frame(id = c("id1", "id2", "id3"),
                      dx1 = c("NOTACODE", "NOTACODE", "notacode"),
                      dx2 = c("C00", "E75", "NOTACODE"),
                      dx3 = c("A", "CB", "C"))

ccc(eg_data, dx_cols = dplyr::starts_with("dx"), icdv = 10)
#   neuromusc cvd respiratory renal gi hemato_immu metabolic congeni_genetic malignancy neonatal tech_dep transplant ccc_flag
# 1         0   0           0     0  0           0         0               0          1        0        0          0        1
# 2         1   0           0     0  0           0         0               0          1        0        0          0        1
# 3         0   0           0     0  0           0         0               0          1        0        0          0        1
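
To illustrate why the bare "C" entry is too broad, here is a minimal sketch in plain R. It assumes the package matches codes by prefix, which is what the output above suggests; `matches_any_prefix` is a hypothetical helper written for this comment and is not part of pccc.

```r
# Hypothetical helper (not part of pccc): TRUE when a code starts with
# at least one of the given prefixes.
matches_any_prefix <- function(codes, prefixes) {
  vapply(codes,
         function(x) any(startsWith(x, prefixes)),
         logical(1),
         USE.NAMES = FALSE)
}

# Codes taken from the example data above.
codes <- c("C00", "CB", "C", "E75")

# Assumed current behaviour: the single bare prefix "C".
loose <- matches_any_prefix(codes, "C")

# Proposed behaviour: explicit prefixes C00, C01, ..., C96.
explicit_prefixes <- sprintf("C%02d", 0:96)
strict <- matches_any_prefix(codes, explicit_prefixes)

print(data.frame(code = codes, loose = loose, strict = strict))
```

With the bare prefix, "CB" and "C" are both flagged; with the explicit prefixes only "C00" is. One caveat with this sketch: any valid code that starts with "C" but does not begin with two digits would also stop matching, so the explicit list may need entries beyond the C00-C96 pattern.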

@dewittpe - Thanks for finding this bug. I think your suggestion for more explicit mapping is good. As an alternative or in addition to your suggestion, we could check input codes against the list of known ICD10 codes and throw a warning about invalid codes.
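
A rough sketch of what the warning could look like, assuming we had a character vector of known codes to compare against. Both `warn_unknown_codes` and the toy `known_codes` list are hypothetical and only for illustration; a real list would have to come from an ICD-10 release.

```r
# Hypothetical helper (not part of pccc): warn about input codes that are
# absent from a reference list and return them invisibly.
warn_unknown_codes <- function(codes, known_codes) {
  unknown <- setdiff(unique(codes), known_codes)
  if (length(unknown) > 0) {
    warning("Unrecognized ICD-10 codes: ",
            paste(unknown, collapse = ", "),
            call. = FALSE)
  }
  invisible(unknown)
}

# Toy reference list, an assumption for this example only.
known_codes <- c("C00", "E75")

# Capture the warning message so the example runs quietly.
msg <- tryCatch(
  warn_unknown_codes(c("C00", "CB", "E75", "CB"), known_codes),
  warning = function(w) conditionMessage(w)
)
print(msg)
```

This only checks exact membership, so it would have to run on codes in the same format as the reference list (for example, with or without the decimal point).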

Now the real hard part - when to fit this change in with all the other work going on...

I have a simple patch on my fork for this right now. There is another issue that comes up -- the tests fail. I've tracked it down to the existence of a "C7AB" code in the test set, which is not a currently valid code.

I have some ideas for patches to this package and the icd_file_generator to verify valid codes. It gets difficult because the list of codes is updated yearly.