mskcc/RNAseqDB

Duplicated sample IDs with different expression of genes in TCGA BRCA data

mxdeluca opened this issue · 1 comment

Hi, after downloading the data you provided on figshare (brca-rsem-count-tcga.txt.gz and brca-rsem-count-tcga-t.txt.gz), I noticed duplicated sample IDs in the data, but with completely different gene expression patterns (the duplicates are not even close together in a PCA/t-SNE visualization). Any idea what may be happening there?
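
For anyone who wants to reproduce the check, here is a minimal sketch of one way to list repeated sample IDs. It assumes the file is tab-separated with sample IDs as column names on the first line; that layout, the helper names `duplicated_ids` and `duplicated_ids_in_file`, and the example header are all assumptions for illustration, not taken from the actual files.

```python
import gzip
from collections import Counter


def duplicated_ids(header_line, sep="\t"):
    """Return the column names that occur more than once in a header line."""
    names = header_line.rstrip("\n").split(sep)
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


def duplicated_ids_in_file(path, sep="\t"):
    """Hypothetical helper: check only the first line of a gzipped text file."""
    # Reading the raw header avoids pandas.read_csv, which by default renames
    # repeated column names (e.g. "X" and "X.1") and so hides the duplicates.
    with gzip.open(path, "rt") as handle:
        return duplicated_ids(handle.readline(), sep)


# Made-up header for illustration only (not copied from the real data).
example = "gene\tSAMPLE-A\tSAMPLE-B\tSAMPLE-A\n"
print(duplicated_ids(example))
```

Calling `duplicated_ids_in_file("brca-rsem-count-tcga.txt.gz")` would then list any sample IDs that appear more than once, provided the assumed layout holds.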

Check out this: