mskcc/RNAseqDB

duplicated entrez gene ids


Hi, first of all, thanks for working on this. It is a very useful resource.

I noticed that some rows in the RPKM files have duplicated Entrez IDs. For example, thyroid-rsem-fpkm-gtex.txt.gz contains these two rows:

     Hugo_Symbol Entrez_Gene_Id GTEX.Y111.1926.SM.4SOIS
1343        CSH2           1442                       0
9482        CSH1           1442                       0

Both rows have the same Entrez_Gene_Id (1442, which is CSH1). However, the correct Entrez_Gene_Id for CSH2 would be 1443, not 1442. Any suggestions on where this is coming from?
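
In case it helps with reproducing this, here is a rough sketch of how the duplicated IDs can be listed. It assumes the file is tab-delimited with the Hugo_Symbol and Entrez_Gene_Id header columns shown above; the function name is just something I made up, and the sample rows are a small in-memory stand-in (the TP53 row and its value are invented for contrast), not real file contents.

```python
import csv
import io
from collections import defaultdict


def find_duplicated_entrez_ids(handle, id_col="Entrez_Gene_Id",
                               symbol_col="Hugo_Symbol"):
    """Return {entrez_id: [symbols]} for IDs that appear on more than one row.

    Assumes a tab-delimited file with a header row (an assumption about
    the file layout, based on the column names in the excerpt above).
    """
    symbols_by_id = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        symbols_by_id[row[id_col]].append(row[symbol_col])
    # Keep only the IDs shared by two or more rows.
    return {gid: syms for gid, syms in symbols_by_id.items() if len(syms) > 1}


# In-memory stand-in for the real file; the TP53 row is made up for contrast.
sample = (
    "Hugo_Symbol\tEntrez_Gene_Id\tGTEX.Y111.1926.SM.4SOIS\n"
    "CSH2\t1442\t0\n"
    "CSH1\t1442\t0\n"
    "TP53\t7157\t0\n"
)

duplicates = find_duplicated_entrez_ids(io.StringIO(sample))
print(duplicates)  # {'1442': ['CSH2', 'CSH1']}
```

For the actual file, the same function should work on a handle from `gzip.open("thyroid-rsem-fpkm-gtex.txt.gz", "rt")`, assuming the layout matches.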