PayneLab/cptac

Units of protein and RNA expression data

seunghun23 opened this issue · 2 comments

Hi,

Thank you for building a nice package and tutorial notebooks.
Could you tell me what the units are for the CPTAC protein expression and RNA expression data
accessible via your package?

Thanks

The RNA-seq data has been processed in different ways depending on the source (see https://www.cell.com/cancer-cell/fulltext/S1535-6108(23)00219-2). The 'unit' of measurement is a transform of the read count, and which transform depends on the quantification method or normalization used (e.g. RSEM, FPKM). For mass spectrometry, the 'unit' is a relative measure of the number of ions detected by the mass spectrometer. That original number has been scaled and transformed, depending on the algorithm.

Both the protein and RNA values are relative measures of abundance: a bigger number means more, a smaller number means less. None of these measurements claims to be an absolute measure directly corresponding to copies per cell. Please read through the link above and the supplement describing our data processing pipelines.
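
To make "a transform of read count" and "scaled and transformed" concrete, here is a minimal sketch of two common transforms of this kind. It is not the CPTAC pipeline; the function names and all numbers are made up for illustration, and the actual processing for each dataset is described in the paper and its supplement.

```python
import math
from statistics import median


def fpkm(read_count, gene_length_bp, total_mapped_reads):
    """Standard FPKM formula: fragments per kilobase of transcript
    per million mapped reads. (Hypothetical helper, illustration only.)"""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_ratio_to_median(intensities):
    """Express each sample's ion intensity as a log2 ratio to the
    median across samples. (Assumed example of a relative scale,
    not necessarily what any CPTAC dataset uses.)"""
    reference = median(intensities)
    return [math.log2(x / reference) for x in intensities]


# Made-up inputs: one gene's read count, and one protein's raw
# intensities in three samples.
rna_value = fpkm(read_count=500, gene_length_bp=2000, total_mapped_reads=20_000_000)
protein_values = log2_ratio_to_median([1.0e6, 2.0e6, 4.0e6])

print(rna_value)
print(protein_values)
```

In both cases the output only says how a value compares with other values processed the same way. Neither number can be read as copies per cell.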

Thank you so much for the prompt response. I will take a look at the paper.