mahmoodlab/HIPT

Setting patient_strat=FALSE in subtyping

HanwenXuTHU opened this issue · 3 comments

Thank you for this great work! I noticed that this codebase sets patient_strat=FALSE, and I wonder why patient-level splitting isn't used here. If two slides from the same patient end up in the training and test sets respectively, could the model simply memorize the labels and similar structures? Is slide-level stratification the community standard?
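To make the concern concrete, here is a minimal sketch of how one could check a given split for patient overlap. The function name `patients_in_both`, the slide IDs, and the `slide_to_patient` mapping are all hypothetical and not part of this repository.

```python
def patients_in_both(train_slides, test_slides, slide_to_patient):
    """Return the sorted patient IDs that have slides in both splits."""
    train_patients = {slide_to_patient[s] for s in train_slides}
    test_patients = {slide_to_patient[s] for s in test_slides}
    return sorted(train_patients & test_patients)


# Hypothetical toy data: patient P1 contributes two slides.
slide_to_patient = {
    "P1_slide_a": "P1",
    "P1_slide_b": "P1",
    "P2_slide_a": "P2",
    "P3_slide_a": "P3",
}

# A slide-level split that happens to separate P1's two slides.
train_slides = ["P1_slide_a", "P2_slide_a"]
test_slides = ["P1_slide_b", "P3_slide_a"]

leaked = patients_in_both(train_slides, test_slides, slide_to_patient)
print(leaked)
```

A non-empty result means at least one patient appears on both sides of the split, which is the situation described above.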

Hi @HanwenXuTHU - Thank you for your comments. The code is a bit unclear because we created the splits beforehand without using the create_splits function. We are working on an update to make the code clearer and consistent with our other repositories!

Hi Richard, thank you for the quick reply! Would you suggest patient-based or slide-based stratification?

For slide classification tasks, I would suggest slide-level stratification; for survival analysis, I would suggest patient-based stratification.
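For the patient-based case, a minimal sketch of what such a split could look like is given below. It is not the repository's create_splits logic: the function `patient_level_split`, its parameters, and the toy slide IDs are assumptions for illustration, and it groups by patient only, without balancing class labels.

```python
import random
from collections import defaultdict


def patient_level_split(slide_to_patient, test_frac=0.2, seed=0):
    """Split slides so that all slides of a patient land in the same set."""
    # Group slide IDs by their patient ID.
    by_patient = defaultdict(list)
    for slide, patient in slide_to_patient.items():
        by_patient[patient].append(slide)

    # Shuffle patients (not slides) with a fixed seed for reproducibility.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    # Hold out a fraction of patients, at least one.
    n_test = max(1, round(test_frac * len(patients)))
    test_patients = set(patients[:n_test])

    train = sorted(s for p in patients if p not in test_patients for s in by_patient[p])
    test = sorted(s for p in test_patients for s in by_patient[p])
    return train, test


# Hypothetical toy data: five patients, some with more than one slide.
toy_slide_to_patient = {
    "P1_a": "P1", "P1_b": "P1",
    "P2_a": "P2",
    "P3_a": "P3", "P3_b": "P3",
    "P4_a": "P4",
    "P5_a": "P5",
}

train_split, test_split = patient_level_split(toy_slide_to_patient, test_frac=0.2, seed=0)
```

Because the shuffle operates on patient IDs, two slides from the same patient can never be separated across the training and test sets.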