reneshbedre/bioinfokit

Normalization on TCGA dataset

vappiah opened this issue · 2 comments

Hello bioinfokit team, I have downloaded raw HTSeq gene expression counts from TCGA and would like to perform FPKM normalization. According to the documentation, the commands require gene lengths, but I don't have that information. Is there a way to normalize without supplying the gene length? Thanks

@vappiah
FPKM normalizes by gene length, so it cannot be calculated without it. Check the formula here. If you do not have gene lengths, you can use CPM or DESeq2 normalization instead, since neither requires them. Check here.
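To illustrate the difference, here is a minimal sketch in plain pandas (not bioinfokit's own API): CPM only needs the per-sample library size, while FPKM additionally divides by gene length in kilobases. The function names `cpm` and `fpkm`, the toy count matrix, and the gene lengths are hypothetical and chosen only for illustration.

```python
import pandas as pd


def cpm(counts: pd.DataFrame) -> pd.DataFrame:
    """Counts per million: scale each sample (column) by its library size."""
    library_size = counts.sum(axis=0)  # total counts per sample
    return counts.div(library_size, axis=1) * 1e6


def fpkm(counts: pd.DataFrame, gene_length_bp: pd.Series) -> pd.DataFrame:
    """FPKM: CPM further divided by gene length in kilobases."""
    return cpm(counts).div(gene_length_bp / 1e3, axis=0)


# Hypothetical toy matrix: rows are genes, columns are samples.
# HTSeq summary rows such as "__no_feature" should be dropped beforehand.
counts = pd.DataFrame(
    {"sample_A": [10, 30, 60], "sample_B": [5, 5, 90]},
    index=["gene1", "gene2", "gene3"],
)

cpm_values = cpm(counts)  # works without gene lengths

# FPKM cannot be computed without this extra input (made-up lengths in bp).
gene_length_bp = pd.Series([1000, 2000, 500], index=counts.index)
fpkm_values = fpkm(counts, gene_length_bp)
```

After CPM, every sample column sums to one million, which makes samples with different sequencing depths comparable. It does not correct for gene length, so it is not suitable for comparing different genes within a sample.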

Thanks @reneshbedre, I will do that.