
Number of LUAD, LUSC and Normal images


I have a question about the number of .svs files used for each of the LUAD, LUSC, and normal classes.
The paper says the numbers are LUAD --> 567, LUSC --> 609, and normal --> 459.
But on the GitHub page the numbers are different: LUAD --> 823, LUSC --> 753, and normal --> 591.

Why did the numbers change?


Hi - The paper was based on the images we were able to download a while ago from the legacy website. For the GitHub page, we used the full dataset we were able to download from the latest GDC-TCGA website.

Hi Nicolas,

I have a question regarding this issue. The number of LUAD slides is 567 (403+85+79) in the LUAD vs LUSC classification, while it's 382 (320+62) in the gene mutation prediction. Is that just caused by removing the diagnostic slides (FFPE slides), right?


We indeed only used frozen slides, not FFPE.
You may want to look at the updated example page, which uses all the frozen slides available on TCGA as of Jan 2020. Note that in step 3.2, when you do the segmentation of the tiles, the function "" has a "threshold" option that allows you to select all tiles from a given class above a certain threshold. The number of tiles (and slides) included after segmentation can therefore vary depending on your threshold as well.
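
To illustrate why the threshold changes the counts, here is a minimal sketch. It is not the repository's actual function: the names `select_tiles` and `count_tiles_and_slides`, the data layout, and the probabilities are all made up for illustration, and it assumes "above" means strictly greater than the threshold.

```python
# Hypothetical sketch: how a per-class probability threshold changes the
# number of tiles (and slides) that survive selection. Names and data are
# assumptions for illustration, not the repository's real code.

def select_tiles(tile_probs, target_class, threshold):
    """Keep (slide_id, tile_id) pairs whose probability for target_class
    is strictly above the threshold."""
    return [
        (slide_id, tile_id)
        for slide_id, tile_id, probs in tile_probs
        if probs[target_class] > threshold
    ]


def count_tiles_and_slides(selected):
    """Return (number of tiles, number of distinct slides) in a selection."""
    slides = {slide_id for slide_id, _ in selected}
    return len(selected), len(slides)


# Made-up per-tile class probabilities: (slide_id, tile_id, probabilities).
tile_probs = [
    ("slide_A", 0, {"LUAD": 0.95, "LUSC": 0.03, "Normal": 0.02}),
    ("slide_A", 1, {"LUAD": 0.60, "LUSC": 0.30, "Normal": 0.10}),
    ("slide_B", 0, {"LUAD": 0.55, "LUSC": 0.40, "Normal": 0.05}),
    ("slide_C", 0, {"LUAD": 0.10, "LUSC": 0.20, "Normal": 0.70}),
]

loose = count_tiles_and_slides(select_tiles(tile_probs, "LUAD", 0.5))
strict = count_tiles_and_slides(select_tiles(tile_probs, "LUAD", 0.9))

print("threshold 0.5 -> tiles, slides:", loose)
print("threshold 0.9 -> tiles, slides:", strict)
```

With this toy data, raising the threshold from 0.5 to 0.9 drops slide_B entirely, because none of its tiles remain. A slide only counts as included if at least one of its tiles passes the threshold.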
