Proteomics dataframes between "umich" and "bcm"
sunhuaiyu commented
It seems that when BRCA proteomics data are loaded from "umich", a reference value is subtracted from every column; however, when proteomics data are loaded from "bcm", the resulting dataframe contains the original values. Here is the code that generated the two dataframes:
# cptac==1.15.10
import cptac
import os
import pandas as pd
brca = cptac.Brca()
umich_prot = brca.get_dataframe("proteomics", "umich")
bcm_prot = brca.get_dataframe("proteomics", "bcm")
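One quick way to see this kind of difference without reading the source is to compare per-protein medians of the two tables. The sketch below uses made-up numbers rather than the real data; `median_summary`, `toy_ratio`, and `toy_raw` are hypothetical names and not part of cptac. The same helper could be called on `umich_prot` and `bcm_prot`, assuming both have samples as rows and proteins as columns.

```python
import pandas as pd

def median_summary(df: pd.DataFrame) -> pd.Series:
    # Median of each column (protein), skipping NaNs, then summary statistics
    return df.median(axis=0, skipna=True).describe()

# Toy stand-ins: one table centered near zero, one shifted by a constant offset
toy_ratio = pd.DataFrame(
    {"P1": [-0.5, 0.0, 0.5], "P2": [-1.0, 0.0, 1.0]},
    index=["S1", "S2", "S3"],
)
toy_raw = toy_ratio + 20.0

print(median_summary(toy_ratio)["mean"])
print(median_summary(toy_raw)["mean"])
```

If one table has medians clustered around zero and the other does not, that would be consistent with only one source having a reference subtracted.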
This is where the subtraction is done in cptac/cancers/umich/umichbrca.py:
ref_intensities = df.loc["ReferenceIntensity"] # get reference intensities to use to calculate ratios
df = df.subtract(ref_intensities, axis="columns") # subtract reference intensities from all the values
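To illustrate what those two lines do, here is a minimal sketch on a toy dataframe. The values and the names `toy`, `ref`, and `ratios` are invented for illustration; the layout (a "ReferenceIntensity" row alongside sample rows, proteins as columns) is an assumption inferred from the snippet above, not taken from the actual file.

```python
import pandas as pd

# Toy table: first row holds the per-protein reference intensity
toy = pd.DataFrame(
    {"P1": [20.0, 21.0, 19.5], "P2": [15.0, 14.0, 16.0]},
    index=["ReferenceIntensity", "S1", "S2"],
)

# Series indexed by column name (one reference value per protein)
ref = toy.loc["ReferenceIntensity"]

# axis="columns" aligns the Series with the column labels, so each row
# has the matching reference value subtracted element by element
ratios = toy.subtract(ref, axis="columns")

print(ratios)
```

After this step every value is expressed relative to its column's reference, and the "ReferenceIntensity" row itself becomes all zeros.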
Is this difference between the two sources intended? Thank you!