Are the columns selected here missing the HADM_ID column?

clinical-outcome-prediction/tasks/adm_dis_match_mimic/adm_dis_match_mimic.py

Line 108 in b376f4b

df=mimic_notes[['ROW_ID', 'SUBJECT_ID', 'TEXT_ADMISSION', 'TEXT_DISCHARGE']],

Also, how do I pre-train on the Wiki data before the diagnosis task?

Hi,
since these notes are only used for the pre-training task, we don't necessarily need the HADM_ID column here. Or did you run into an issue because the column is missing?
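If you do need HADM_ID downstream, a minimal sketch of keeping it in the selection could look like the following. This is only an illustration on toy data: the `select_columns` helper and its `keep_hadm_id` flag are hypothetical and not part of the repository, and the toy `mimic_notes` frame merely stands in for the one built in adm_dis_match_mimic.py.

```python
import pandas as pd

# Toy stand-in for the merged notes frame (assumption: the real one
# contains at least these columns, including HADM_ID).
mimic_notes = pd.DataFrame({
    "ROW_ID": [1, 2],
    "SUBJECT_ID": [10, 11],
    "HADM_ID": [100, 101],
    "TEXT_ADMISSION": ["admission note a", "admission note b"],
    "TEXT_DISCHARGE": ["discharge note a", "discharge note b"],
})

# The four columns currently selected for pre-training.
BASE_COLUMNS = ["ROW_ID", "SUBJECT_ID", "TEXT_ADMISSION", "TEXT_DISCHARGE"]


def select_columns(df, keep_hadm_id=False):
    """Hypothetical helper: return the pre-training columns,
    optionally adding HADM_ID when it is present in the frame."""
    columns = list(BASE_COLUMNS)
    if keep_hadm_id and "HADM_ID" in df.columns:
        # Place HADM_ID right after SUBJECT_ID.
        columns.insert(2, "HADM_ID")
    return df[columns]


pretrain_df = select_columns(mimic_notes)                      # current behaviour
with_hadm_df = select_columns(mimic_notes, keep_hadm_id=True)  # HADM_ID kept
```

With the default, the result matches the current selection; the extra column only appears when explicitly requested, so the pre-training pipeline itself would not change.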

For pre-training on WIKI data, you can first create the data via adm_dis_match_wiki.py and then use the resulting data set for outcome pre-training via outcome_pretraining.py.

Hope this is helpful!
Best regards
Betty