Suggestions for API improvements to facilitate competition experiment analysis
From our discussion today, here is what we decided we would need for input:
- Compound stock dictionary. A master data structure listing every compound stock, e.g.
compound_stocks = [
    {
        'id': 'BOS001',
        'compound_name': 'bosutinib',
        'compound_mw': 530.446,    # g/mol
        'compound_mass': 12.45,    # mg compound dispensed
        'compound_purity': 0.96,   # purity (fraction)
        'solvent_mass': 12.52,     # g solvent dispensed
    },
    ...
]
The compound stock dictionary could be used for all experiments, and would eventually be retrieved from a lightweight database running on Amazon EC2 or someplace.
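To make the intended use of these fields concrete, here is a minimal sketch of how a stock concentration could be derived from one entry. The function names are hypothetical, and the molarity conversion assumes the solvent is DMSO with a density of roughly 1.10 g/mL and neglects the volume contributed by the compound; neither assumption is stated above.

```python
# Assumption: the solvent is DMSO, density ~1.10 g/mL near room temperature.
DMSO_DENSITY_G_PER_ML = 1.10


def stock_moles(stock):
    """Moles of active compound in the stock, corrected for purity."""
    compound_g = stock['compound_mass'] * 1e-3 * stock['compound_purity']  # mg -> g
    return compound_g / stock['compound_mw']


def stock_molality(stock):
    """Moles of active compound per kg of solvent (no density needed)."""
    solvent_kg = stock['solvent_mass'] * 1e-3  # g -> kg
    return stock_moles(stock) / solvent_kg


def stock_molarity(stock, solvent_density=DMSO_DENSITY_G_PER_ML):
    """Approximate mol/L, ignoring the volume added by the compound itself."""
    solvent_l = stock['solvent_mass'] / solvent_density * 1e-3  # g -> mL -> L
    return stock_moles(stock) / solvent_l


bosutinib = {
    'id': 'BOS001',
    'compound_name': 'bosutinib',
    'compound_mw': 530.446,   # g/mol
    'compound_mass': 12.45,   # mg compound dispensed
    'compound_purity': 0.96,  # purity (fraction)
    'solvent_mass': 12.52,    # g solvent dispensed
}

print(f"{stock_molality(bosutinib) * 1e3:.3f} mmol/kg solvent")
print(f"{stock_molarity(bosutinib) * 1e3:.3f} mM (assuming DMSO density)")
```

Working in mol/kg avoids the density assumption entirely, which may matter if the error model should propagate weighing uncertainty only.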
- D300 XML file specifying compound and DMSO concentrations in each well
- For each assay plate:
  - A list of compound stock ids used for the rows (e.g. ['BOS001', 'GEF002', 'ERL001', 'BSI004'])
  - The protein name pipetted into rows and its stated concentration
  - The Infinite XML file after the plate has been read
- An error assumption dictionary, which would contain:
  - D300 dispense CVs
  - protein concentration uncertainties
  - any other assumptions that go into error modeling
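As a rough illustration of how these inputs might fit together, here is a sketch of one experiment record plus a sanity check that every row's stock id exists in the compound stock list. All key names, filenames, the protein name, and the numeric values are hypothetical placeholders, not an agreed schema.

```python
def missing_stock_ids(plate, compound_stocks):
    """Return the row stock ids of `plate` that are absent from `compound_stocks`."""
    known = {stock['id'] for stock in compound_stocks}
    return [sid for sid in plate['row_stock_ids'] if sid not in known]


# Trimmed stock list; real entries would carry the full set of fields.
compound_stocks = [{'id': 'BOS001'}, {'id': 'GEF002'}, {'id': 'ERL001'}]

experiment = {
    'd300_xml': 'example_d300.xml',              # hypothetical filename
    'plates': [
        {
            'row_stock_ids': ['BOS001', 'GEF002', 'ERL001', 'BSI004'],
            'protein_name': 'example-protein',   # placeholder name
            'protein_concentration': 0.5,        # placeholder value; units to be agreed
            'infinite_xml': 'example_plate.xml', # hypothetical filename
        },
    ],
    'error_assumptions': {
        'd300_dispense_cv': 0.08,                # placeholder, not a measured value
        'protein_concentration_cv': 0.10,        # placeholder, not a measured value
    },
}

for plate in experiment['plates']:
    missing = missing_stock_ids(plate, compound_stocks)
    if missing:
        print(f"Unknown compound stock ids: {missing}")
```

Here 'BSI004' is reported as unknown because the trimmed stock list above does not include it.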
For those interested, this document has some good statistics on the reliability of protein concentrations as a function of measured absorbance. The CV seems to be as high as 13.1% at low concentration (0.227 mg/mL), 3.7% at medium-low concentration (0.898 mg/mL), and better than 1% at high concentration (>2 mg/mL).
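One way these numbers could feed the error assumption dictionary is a conservative step lookup: between two reported points, take the larger CV. This is only a sketch; the function names are hypothetical, the step rule is an assumption rather than anything from that document, and below 0.227 mg/mL the true CV may well exceed 13.1%.

```python
def protein_cv(conc_mg_per_ml):
    """Conservative CV (as a fraction) for a stated protein concentration.

    Assumption: between reported points, use the CV of the lower-concentration
    point, since CV decreases as concentration increases.
    """
    if conc_mg_per_ml <= 0:
        raise ValueError("concentration must be positive")
    if conc_mg_per_ml > 2.0:
        return 0.01    # "better than 1%" treated as an upper bound
    if conc_mg_per_ml >= 0.898:
        return 0.037   # reported at 0.898 mg/mL
    return 0.131       # reported at 0.227 mg/mL; likely optimistic below that


def protein_sigma(conc_mg_per_ml):
    """Absolute standard deviation (mg/mL) implied by the CV lookup."""
    return protein_cv(conc_mg_per_ml) * conc_mg_per_ml


for conc in (0.25, 1.0, 3.0):
    print(f"{conc} mg/mL: CV {protein_cv(conc):.1%}, sigma {protein_sigma(conc):.4f} mg/mL")
```

A smooth interpolation could replace the step rule once more calibration points are available.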
We'll need to collect at least one example of all of this data for development and testing.
Would be useful to have someone (@sonyahanson @Lucelenie @MehtapIsik) check in such an example, especially a coherent set of
- which compound stock id was used in each row
- HP D300 XML files
- Infinite XML files from the plate read
- protein info and stated concentration
I think I can generate the rest!