bjodah/chempy

underdetermined set of equations giving negative numbers

saraaamin opened this issue · 4 comments

In the code, this docstring states what happens when setting underdetermined=None:

underdetermined : bool
Allows to find a non-unique solution (in addition to a constant factor
across all terms). Set to False to disallow (raise ValueError) on
e.g. "C + O2 -> CO + CO2". Set to None if you want the symbols replaced
so that the coefficients are the smallest possible positive (non-zero) integers.

Yet I ran several examples where I get back negative values.
Example 1 of a result:

{'C10H15N5O10P2': -57, 'CH4NO5P': 70, 'C3H6O3': 35, 'H3PO4': -88}
{'C3H7NO2': 5, 'H2O': 5, 'HCO3': 30, 'C10H16N5O13P3': -44}

Example 2:

{'C10H14N5O7P': -6, 'C3H6O3': 8, 'H4P2O7': 15, 'C4H8N2O3': 43}
{'C3H7NO2': 8, 'H2O': 64, 'C4H7NO4': 8, 'C10H16N5O13P3': 8}

Yes, it's a known issue. The code path for underdetermined=None currently uses SymPy to solve the system of linear equations. However, SymPy's assumption system is not powerful enough to handle non-negativity well. I have been meaning to switch to an integer linear programming solver that solves the system of equations while minimizing the sum of coefficients (the minimization requirement is what allows us to have a canonical representation). It looks as if cvxpy should be able to solve this kind of problem, but I have yet to find the time to leverage it in the balancing code.
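To make the "positive integers with minimal sum" requirement concrete, here is a minimal sketch that uses exhaustive search over small coefficients in place of a real integer linear programming solver. It is not chempy's implementation and it does not use cvxpy; the function name `balance_min_sum`, the `max_coeff` bound and the hand-written composition dicts are all hypothetical, chosen only for illustration.

```python
from itertools import product


def balance_min_sum(reactants, products, max_coeff=10):
    """Brute-force stand-in for an ILP solver (hypothetical helper).

    reactants / products map a species name to its composition,
    e.g. {"CO2": {"C": 1, "O": 2}}. Returns two dicts of strictly
    positive integer coefficients whose total sum is minimal.
    """
    names = list(reactants) + list(products)
    # Reactants count positively and products negatively in each element balance.
    signed = [(comp, 1) for comp in reactants.values()]
    signed += [(comp, -1) for comp in products.values()]
    elements = sorted({el for comp, _ in signed for el in comp})

    def is_balanced(coeffs):
        return all(
            sum(sign * k * comp.get(el, 0) for k, (comp, sign) in zip(coeffs, signed)) == 0
            for el in elements
        )

    best = None
    for coeffs in product(range(1, max_coeff + 1), repeat=len(names)):
        # Only candidates with a strictly smaller sum can improve on the best so far.
        if best is not None and sum(coeffs) >= sum(best):
            continue
        if is_balanced(coeffs):
            best = coeffs
    if best is None:
        raise ValueError("no positive solution with coefficients <= %d" % max_coeff)

    n = len(reactants)
    return dict(zip(names[:n], best[:n])), dict(zip(names[n:], best[n:]))


# The underdetermined example from the docstring: C + O2 -> CO + CO2
reac, prod = balance_min_sum(
    {"C": {"C": 1}, "O2": {"O": 2}},
    {"CO": {"C": 1, "O": 1}, "CO2": {"C": 1, "O": 2}},
)
print(reac, prod)  # {'C': 3, 'O2': 2} {'CO': 2, 'CO2': 1}
```

The search cost grows exponentially with the number of species, so this only illustrates the objective and constraints; a proper ILP solver would be needed for reactions of the size reported above.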

Out of curiosity: what's the use case for determining coefficients for an underdetermined system? (I had education in mind as a possible use case when I added the feature.)

I ask because, if it's only for verifying that reaction formulae are balanced, you can e.g. do this:

In [2]: chempy.ReactionSystem.from_string("H2O -> H+ + OH-")
Out[2]: <chempy.reactionsystem.ReactionSystem at 0x7f9c27e255c0>

In [3]: chempy.ReactionSystem.from_string("H2O -> H+ + 2 OH-")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
...
ValueError: Composition violation in H2O -> H+ + 2 OH-

whereas if you want to canonicalize equations with unknown coefficients we need to fix balance_stoichiometry. I'll see if I can look into it this weekend (no promises though).

I'm using it for a research project within my PhD :)
I have a bacterial model that I'm adding new chemical equations to, and I wish to balance those equations before adding them to the model. The numbers are therefore important: I am retrieving them via your code and will then use them as part of a bigger linear model.

Within my runs, I found that some of my equations can't be balanced, and I ignored those. But I'm interested in the underdetermined case because there you can actually balance the equation; the solution is just not unique.
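Until the balancing code is fixed, results with negative coefficients can at least be screened out before they reach the larger model. A minimal sketch, assuming each result is a pair of dicts shaped like the outputs pasted earlier in this thread; the helper name `all_positive` is hypothetical:

```python
def all_positive(reac, prod):
    """Return True only if every coefficient on both sides is strictly positive."""
    return all(v > 0 for v in list(reac.values()) + list(prod.values()))


# Example 1 from this issue: contains negative coefficients, so it is rejected.
bad = (
    {"C10H15N5O10P2": -57, "CH4NO5P": 70, "C3H6O3": 35, "H3PO4": -88},
    {"C3H7NO2": 5, "H2O": 5, "HCO3": 30, "C10H16N5O13P3": -44},
)
# A hand-balanced result for C + O2 -> CO + CO2: accepted.
good = ({"C": 3, "O2": 2}, {"CO": 2, "CO2": 1})

print(all_positive(*bad), all_positive(*good))  # False True
```

This only flags suspicious results; it does not repair them or say whether a positive solution exists.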

I see. Sounds good. Let me get back to you on how cvxpy pans out (or, if you want to give it a try yourself, a pull request is of course most welcome).