SBRG/cobrame

Runtime error when attempting to remove genes from model

djinnome opened this issue · 3 comments

Hi folks,

If you do the following in the Docker image:

import pickle, cobrame
with open('/home/meuser/me_models/iJL1678b.pickle', 'rb') as f:
     me = pickle.load(f)
     me.remove_genes_from_model( ['b0002'] )

Then I get the following error:

Complex (ASPKINIHOMOSERDEHYDROGI-CPLX) removed from model
Traceback (most recent call last):
  File "/source/cobrame/cobrame/core/model.py", line 385, in remove_genes_from_model
    from optlang.interface import SymbolicParameter
ImportError: cannot import name 'SymbolicParameter'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/source/cobrame/cobrame/core/model.py", line 387, in remove_genes_from_model
    protein.remove_from_model(method='destructive')
  File "/usr/lib/python3.6/site-packages/cobra-0.5.11-py3.6-linux-x86_64.egg/cobra/core/Metabolite.py", line 142, in remove_from_model
    for x in self._reaction:
RuntimeError: Set changed size during iteration

Please advise

A bit more context:

I would like to comprehensively knock out genes in the ME model, so if there is a workaround to perform this operation, that would be great.

CobraPy has remove_from_model(), which is actually called during the CobraME remove_genes_from_model() method, but a reasonable workaround for me would be just a CobraME version of the CobraPy gene.knock_out()

The problem is that the pickled ME object does not contain any genes

>>> me.genes
[]

Perhaps something as simple as this:

def knock_out_genes_from_model( self, gene_list ): 
   for gene in gene_list:
            # Find all complexes the gene product is part of and knock out the associated reactions
            protein = self.metabolites.get_by_id('protein_'+gene)
            for cplx in protein.complexes:
                print('Complex (%s) knocked out in model' % cplx.id)
                for rxn in cplx.metabolic_reactions:
                    rxn.bounds = (0,0)

Hi Jeremy,

The issue here is that there is a bug in COBRApy 0.5.11 that breaks the remove_from_model(method='destructive) call. I've known about this but forgot to address it so thanks for the reminder!

While I'm correcting this, you could avoid this problem by just changing the bounds of each gene's translation reaction (if the gene is translated into a protein) or, more generally, by changing the bounds of each reaction that the gene participates as a reactant in (shown in the example below). You can do this more effectively using a function like below (which was modified from https://github.com/coltonlloyd/cobrame_supplement/blob/master/Table_3_model_essentiality/essentiality.py):

import json
import ecolime
from cobrame.core.reaction import TranscriptionReaction
from qminospy.me1 import ME_NLP1

def compute_gene_essentiality_at_growth_rate(me, gr, out_location):
    me_nlp = ME_NLP1(me, growth_key='mu')
    expressions = me_nlp.compile_expressions()
    me_nlp.compiled_expressions = expressions

    hs = None

    all_genes = me.metabolites.query(re.compile("^RNA_b[0-9]"))

    results = {}
    for gene_RNA in list(all_genes):

        default_bounds = {}
        for r in gene_RNA.reactions:
            if not r.id.startswith("DM") and not \
                    isinstance(r, TranscriptionReaction):
                default_bounds[r] = (r.lower_bound, r.upper_bound)
                r.knock_out()
        x, status, hs = me_nlp.solvelp(gr, basis=hs)

        if status == 'optimal':
            results[gene_RNA.id] = 'NONESSENTIAL'
        else:
            results[gene_RNA.id] = 'ESSENTIAL'

        print("%s\t%s" % (gene_RNA.id.split("_")[1], str(status)))

        with open("%s/iJL1678b_essentiality_%.2f_gr.json" % (out_location, gr),
                  "w") as outfile:
            json.dump(results, outfile, indent=True)

        # Reset bounds
        for r in default_bounds:
            r.lower_bound = default_bounds[r][0]
            r.upper_bound = default_bounds[r][1]

If you use a small enough growth rate (I think I used .1 in manuscript), this will save you from having to run a full bisection for each gene. This also compiles all of the symbolic expressions upfront and reuses the previous simulations basis vector to speed up computations.

Hope this helps!

Thanks for the quick response and the detailed solution. I will definitely try it out and let you know how it works.!