Handle site-specific codes in fields
WerthPADOH opened this issue · 4 comments
Codes in some fields are interpreted differently depending on the tumor's site. Wherever it makes sense, use the tumor data to replace these alphanumerical codes with English phrases. Prioritize fields which are still recommended by NAACCR. Leave alone fields where English phrases wouldn't add much value.
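A minimal sketch of the idea, in Python for illustration only. The schema IDs, codes, and phrases are made-up placeholders, not real NAACCR or AJCC values, and `SITE_SPECIFIC_LABELS` and `decode_field` are hypothetical names. The point is only that the lookup key has to include the tumor's schema as well as the field:

```python
from typing import Optional

# Hypothetical lookup table: (field, schema ID) -> {code: English phrase}.
# Every schema ID, code, and phrase below is a placeholder, not real data.
SITE_SPECIFIC_LABELS = {
    ("csSiteSpecificFactor1", "schemaA"): {
        "000": "placeholder phrase A-000",
        "010": "placeholder phrase A-010",
    },
    ("csSiteSpecificFactor1", "schemaB"): {
        "000": "placeholder phrase B-000",
    },
}


def decode_field(field: str, schema_id: str, code: Optional[str]) -> Optional[str]:
    """Return the English phrase for a code, or the stripped code if no phrase is known."""
    if code is None:
        return None
    key = code.strip()
    labels = SITE_SPECIFIC_LABELS.get((field, schema_id), {})
    return labels.get(key, key)


# The same code can decode differently depending on the tumor's schema.
record = {"schemaId": "schemaA", "csSiteSpecificFactor1": "000"}
decoded = decode_field(
    "csSiteSpecificFactor1", record["schemaId"], record["csSiteSpecificFactor1"]
)
print(decoded)
```

Falling back to the raw code keeps fields usable when no phrase is available for a given schema.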
FWIW, I've done some related work to build an i2b2 ontology:
https://github.com/kumc-bmi/naaccr-tumor-data/blob/heron/heron_staging/tumor_reg/csterms.py
https://informatics.gpcnetwork.org/trac/Project/ticket/150
Awesome! I had no idea CS had their site-specific factors in XML files. Thanks!
Site-specific fields (related fields bundled together):
- schemaId, schemaDiscriminator1, schemaDiscriminator2, schemaDiscriminator3
- rxHospSurgPrimSite, rxHospSurgSite9802, rxHospScopeReg9802, rxHospSurgOth9802, rxSummSurgPrimSite, rxSummSurgicalApproch, rxSummReconstruct1st, rxSummSurgSite9802, rxSummScopeReg9802, rxSummSurgOth9802
- eodPrimaryTumor, eodRegionalNodes, eodExtension, eodMets, eodExtensionProstPath, eodLymphNodeInvolv, eodOld13Digit, eodOld2Digit, eodOld4Digit, eodTumorSize, derivedEod2018T, derivedEod2018M, derivedEod2018N, derivedEod2018StageGroup
- tnmPathT, tnmPathN, tnmPathM, tnmPathStageGroup, tnmClinT, tnmClinN, tnmClinM, tnmClinStageGroup
- ajccId, ajccTnmClinT, ajccTnmClinN, ajccTnmClinM, ajccTnmClinStageGroup, ajccTnmPathT, ajccTnmPathN, ajccTnmPathM, ajccTnmPathStageGroup, ajccTnmPostTherapyT, ajccTnmPostTherapyN, ajccTnmPostTherapyM, ajccTnmPostTherapyStageGroup
- derivedAjcc6T, derivedAjcc6N, derivedAjcc6M, derivedAjcc6StageGrp, derivedAjcc7T, derivedAjcc7N, derivedAjcc7M, derivedAjcc7StageGrp, derivedPrerx7T, derivedPrerx7N, derivedPrerx7M, derivedPrerx7StageGrp, derivedPostrx7T, derivedPostrx7N, derivedPostrx7M, derivedPostrx7StgeGrp
- csTumorSize, csExtension, csTumorSizeExtEval, csLymphNodes, csLymphNodesEval, csMetsAtDx, csMetsAtDxBone, csMetsAtDxBrain, csMetsAtDxLiver, csMetsAtDxLung, csMetsEval
- csSiteSpecificFactor1 through csSiteSpecificFactor25
- derivedSeerCombinedT, derivedSeerCombinedN, derivedSeerCombinedM
- npcrDerivedAjcc8TnmClinStgGrp, npcrDerivedAjcc8TnmPathStgGrp, npcrDerivedAjcc8TnmPostStgGrp, npcrDerivedClinStgGrp, npcrDerivedPathStgGrp
- seerSiteSpecificFact1 through seerSiteSpecificFact6
- gradeClinical, gradePathological, gradePostTherapy
- prostatePathologicalExtension
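The two "through" ranges in the list can be expanded mechanically. A small Python sketch, where `expand_range` is a hypothetical helper and the only assumption is that the numbered fields follow the prefix-plus-integer pattern shown above:

```python
from typing import List


def expand_range(prefix: str, first: int, last: int) -> List[str]:
    """Build numbered field names from prefix + first up to prefix + last (inclusive)."""
    return [f"{prefix}{i}" for i in range(first, last + 1)]


# Field-name prefixes are taken from the list above.
cs_factors = expand_range("csSiteSpecificFactor", 1, 25)
seer_factors = expand_range("seerSiteSpecificFact", 1, 6)
print(len(cs_factors), cs_factors[0], cs_factors[-1])
print(len(seer_factors), seer_factors[0], seer_factors[-1])
```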
Closed until #38 is resolved for AJCC, since they hold the copyright for most of this data.