[MICOM 1.0 API] Proposed new format for fluxes
cdiener opened this issue · 2 comments
This is a proposal for a new format for fluxes slated for MICOM 1.0. Feel free to comment.
Checklist
- There are no similar issues or pull requests for this yet.
- The request is not specific to the MICOM Qiime 2 plugin (q2-micom)
Current state
The current format for fluxes returned by MICOM is a table in wide format:
In [1]: from micom import Community
In [2]: from micom.data import test_taxonomy
In [3]: com = Community(test_taxonomy())
Building โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ 100% 0:00:00
In [7]: sol = com.cooperative_tradeoff(fluxes=True)
In [8]: sol.fluxes
Out[8]:
reaction ACALD ACALDt ACKr ACONTa ACONTb ACt2r ADK1 ... SUCDi SUCOAS TALA THD2 TKT1 TKT2 TPI
compartment ...
Escherichia_coli_1 0.049190 -0.008897 -0.004224 5.999485 5.999485 -0.004224 3.388665e-11 ... 5.017641 -5.017641 1.489184 1.924736e-10 1.489184 1.173698 7.513137
Escherichia_coli_2 -0.079989 -0.115231 0.072559 6.001066 6.001066 0.072559 4.264225e-11 ... 5.033051 -5.033051 1.491048 1.924125e-10 1.491048 1.175562 7.495742
Escherichia_coli_3 0.102350 0.197394 -0.100513 6.004985 6.004985 -0.100513 3.662292e-11 ... 5.083935 -5.083935 1.506075 1.926208e-10 1.506075 1.190589 7.460396
Escherichia_coli_4 -0.071551 -0.073266 0.032177 6.023463 6.023463 0.032177 4.133342e-11 ... 5.122875 -5.122875 1.501628 1.926284e-10 1.501628 1.186143 7.440253
medium NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN
[5 rows x 115 columns]
This has resulted in some issues:
- It is incompatible with cobra.Solution.fluxes, which breaks a lot of the cobra functionality, for instance the summary methods
- It can be pretty sparse for very divergent models (many NA entries)
- It mixes medium and taxa fluxes
- It does not specify whether exchange fluxes denote import or export, which is one of the most common help requests we receive
- Basically all methods using flux results in MICOM will convert them to a long format
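To make the wide-to-long conversion mentioned above concrete, here is a minimal, dependency-free sketch. It is not MICOM's implementation: the nested dict stands in for the wide table, the numbers are made up, and `wide_to_long` is a hypothetical helper. With pandas the same reshaping is usually done with `DataFrame.melt` followed by dropping the NaN rows.

```python
import math

# Toy wide-format table: one entry per compartment, one key per reaction.
# Values are made up; NaN marks reactions that a compartment does not have.
wide = {
    "Escherichia_coli_1": {"ACALD": 0.05, "EX_ac_m": math.nan},
    "Escherichia_coli_2": {"ACALD": -0.08, "EX_ac_m": math.nan},
    "medium": {"ACALD": math.nan, "EX_ac_m": 1.5},
}


def wide_to_long(table):
    """Flatten {compartment: {reaction: flux}} into row dicts, dropping NaN entries."""
    rows = []
    for compartment, fluxes in table.items():
        for reaction, flux in fluxes.items():
            if math.isnan(flux):
                continue  # padding introduced by the wide layout, not a real flux
            rows.append({"reaction": reaction, "taxon": compartment, "flux": flux})
    return rows


long_rows = wide_to_long(wide)
for row in long_rows:
    print(row)
```

The long form only keeps fluxes that exist, so the sparsity problem disappears and medium rows can be filtered out by the `taxon` column.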
Proposed new API for fluxes
CommunitySolution.fluxes will retain the cobrapy format and will be superseded by new accessors that all return fluxes in long format:
CommunitySolution.exchange_fluxes
Similar to the previous one but with the taxa annotated.
      reaction                     name   taxon          flux direction    micom_id
0      EX_ac_m     ac_m medium exchange  medium  1.814984e-11    export     EX_ac_m
1   EX_acald_m  acald_m medium exchange  medium  1.328645e-11    export  EX_acald_m
2     EX_akg_m    akg_m medium exchange  medium  3.225128e-12    export    EX_akg_m
3     EX_co2_m    co2_m medium exchange  medium  2.280983e+01    export    EX_co2_m
4    EX_etoh_m   etoh_m medium exchange  medium  1.515389e-11    export   EX_etoh_m
..         ...                      ...     ...           ...       ...         ...
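As a rough illustration of what the direction column encodes, the sketch below labels exchange fluxes by sign. It assumes the cobrapy convention that a positive exchange flux is an export and a negative one an import, which matches the positive "export" rows in the table above. `flux_direction` is a hypothetical helper, `EX_glc__D_m` is an illustrative reaction ID, and the flux values are made up.

```python
def flux_direction(flux):
    """Label an exchange flux by sign (assumption: negative means import)."""
    return "import" if flux < 0 else "export"


# Made-up medium exchange fluxes in long format.
rows = [
    {"reaction": "EX_co2_m", "taxon": "medium", "flux": 22.8},
    {"reaction": "EX_glc__D_m", "taxon": "medium", "flux": -10.0},
]

# Copy each row and add the direction label without mutating the input.
annotated = [dict(row, direction=flux_direction(row["flux"])) for row in rows]
for row in annotated:
    print(row["reaction"], row["flux"], row["direction"])
```

Having the label as its own column means users no longer need to know the sign convention to read the table.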
CommunitySolution.internal_fluxes
   reaction                                               name               taxon           flux                    micom_id
0     ACALD           Acetaldehyde dehydrogenase (acetylating)  Escherichia_coli_1   1.312146e+00   ACALD__Escherichia_coli_1
1    ACALDt                  Acetaldehyde reversible transport  Escherichia_coli_1   3.236132e+00  ACALDt__Escherichia_coli_1
2      ACKr                                     Acetate kinase  Escherichia_coli_1  -1.304078e+00    ACKr__Escherichia_coli_1
3    ACONTa   Aconitase (half-reaction A, Citrate hydro-lyase)  Escherichia_coli_1   5.987675e+00  ACONTa__Escherichia_coli_1
4    ACONTb  Aconitase (half-reaction B, Isocitrate hydro-l...  Escherichia_coli_1   5.987675e+00  ACONTb__Escherichia_coli_1
This will consolidate GrowthResults and CommunitySolution and give a more readable format. All of these properties are generated on the fly when they are accessed.
Additionally, we may also want to save the annotations in the solution, but they may be large, so it might be better to have a property on the model class like Community.annotations.
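One way the on-the-fly accessors could work is sketched below. This is only an illustration under stated assumptions, not the planned MICOM code: `SolutionSketch` is a hypothetical class, the `annotations` mapping stands in for something like Community.annotations, and its keys (`reaction`, `taxon`, `exchange`) are invented for the example.

```python
class SolutionSketch:
    """Hypothetical stand-in for a community solution; not MICOM's actual class."""

    def __init__(self, fluxes, annotations):
        # Flat mapping micom_id -> flux, analogous to a cobrapy flux series.
        self.fluxes = dict(fluxes)
        # Kept by reference so large annotations are not copied per solution.
        self._annotations = annotations

    def _rows(self, want_exchange):
        # Build the long-format rows fresh on every call.
        rows = []
        for micom_id, flux in self.fluxes.items():
            info = self._annotations[micom_id]
            if info["exchange"] == want_exchange:
                rows.append({
                    "reaction": info["reaction"],
                    "taxon": info["taxon"],
                    "flux": flux,
                    "micom_id": micom_id,
                })
        return rows

    @property
    def exchange_fluxes(self):
        return self._rows(want_exchange=True)

    @property
    def internal_fluxes(self):
        return self._rows(want_exchange=False)


# Illustrative annotations and made-up fluxes.
annotations = {
    "EX_ac_m": {"reaction": "EX_ac_m", "taxon": "medium", "exchange": True},
    "ACALD__Escherichia_coli_1": {
        "reaction": "ACALD", "taxon": "Escherichia_coli_1", "exchange": False,
    },
}
sol = SolutionSketch({"EX_ac_m": 1.8e-11, "ACALD__Escherichia_coli_1": 1.31}, annotations)
print(sol.exchange_fluxes)
print(sol.internal_fluxes)
```

Because nothing derived is cached, the solution only stores the flat fluxes, and the long tables always reflect their current values.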
Additional context
A similar format change is planned for Community.knockout_taxa. elasticities already uses a long format.
Sorry if this question is outdated.
I am a graduate student working with MES scores from Marcelino et al., Nature Communications 2023.
I am currently implementing the MES framework for my metagenome data and am confused about res.exchanges['flux'] from the micom.workflow.grow function.
If I want to get the total production flux or consumption flux of a certain metabolite, should I weight the flux of each metabolite by the relative abundance of each species? Or are the fluxes of each metabolite already weighted by each species' abundance?
I looked at CD_focus/MetModels_summarize_total_produc_consump.py and R_scripts_4_figs/sulfur_stats_He2017.R and it seemed to me that it was the former case, but I just wanted to be sure.
Yes, exactly, it would be scaled by relative abundance, though you can also use the production_rates function, which does that for you. Also note that MES scores have been part of MICOM since version 0.35.0. See the MES function and the new visualizations as well.
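For readers who want to do the weighting by hand, here is a minimal sketch of the abundance scaling described in the reply. It assumes the per-taxon fluxes are not yet scaled by abundance, as stated above. The row layout, the `metabolite` and `direction` keys, the taxa, and all numbers are made up, and `weighted_totals` is a hypothetical helper, so the real res.exchanges table may be laid out differently.

```python
from collections import defaultdict

# Made-up per-taxon exchange fluxes and relative abundances.
exchanges = [
    {"taxon": "A", "metabolite": "ac", "flux": 2.0, "direction": "export"},
    {"taxon": "B", "metabolite": "ac", "flux": 1.0, "direction": "export"},
    {"taxon": "A", "metabolite": "glc", "flux": -2.0, "direction": "import"},
    {"taxon": "B", "metabolite": "glc", "flux": -4.0, "direction": "import"},
]
abundance = {"A": 0.25, "B": 0.75}


def weighted_totals(rows, abundance):
    """Sum |flux| * relative abundance per (metabolite, direction)."""
    totals = defaultdict(float)
    for row in rows:
        key = (row["metabolite"], row["direction"])
        totals[key] += abs(row["flux"]) * abundance[row["taxon"]]
    return dict(totals)


totals = weighted_totals(exchanges, abundance)
print(totals[("ac", "export")])   # total production of ac: 0.25*2.0 + 0.75*1.0
print(totals[("glc", "import")])  # total consumption of glc: 0.25*2.0 + 0.75*4.0
```

In practice the production_rates function mentioned above is probably the safer route, since it applies the same scaling inside MICOM.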