bjodah/chempy

Non-integer subscripts in formulas

spizwhiz opened this issue · 6 comments

Hi,

Is it possible in ChemPy to parse formulas for substances that have non-integer subscripts?

e.g.

'Ca2.832Fe0.6285Mg2.5395(CO3)6'

If so, how would you format the formula above to achieve the desired result?

Right now, I get the following:

cp.Substance.from_formula('Ca2.832Fe0.6285Mg2.5395(CO3)6')

Ca2⋅832Fe0⋅6285Mg2⋅5395(CO3)6

Formulas like this are a common need when working with natural minerals that behave as solid solutions. I thought about just scaling the subscripts up until they are all integers, but the molar mass would then be incorrect.

Thanks, I think I understand the issues.

Would requiring the user to use some kind of special notation for decimal subscripts potentially help with the hydrates issue?

E.g. putting the subscript in square brackets or curly braces?

'Ca{2.832}Fe{0.6285}Mg{2.5395}(CO3)6'
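
To make the proposed notation concrete, here is a minimal sketch of how braced decimal subscripts could be read. It is not ChemPy's parser: the function name `parse_braced_formula` and the helper `_read_count` are hypothetical, and the sketch assumes only element symbols, plain integer counts, braced decimal counts and parentheses (no charges, no crystal water).

```python
import re

_ELEMENT = re.compile(r"[A-Z][a-z]?")
# A count is either a decimal in curly braces ("{2.832}") or a plain integer ("6").
_COUNT = re.compile(r"\{(\d+(?:\.\d+)?)\}|(\d+)")


def _read_count(formula, pos):
    """Return (count, new_pos); the count defaults to 1 when none is written."""
    m = _COUNT.match(formula, pos)
    if m is None:
        return 1.0, pos
    return float(m.group(1) or m.group(2)), m.end()


def parse_braced_formula(formula):
    """Parse e.g. 'Ca{2.832}(CO3)6' into {'Ca': 2.832, 'C': 6.0, 'O': 18.0}."""
    stack = [{}]  # one composition dict per open parenthesis level
    pos = 0
    while pos < len(formula):
        ch = formula[pos]
        if ch == "(":
            stack.append({})
            pos += 1
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            group = stack.pop()
            mult, pos = _read_count(formula, pos + 1)
            # Fold the closed group into its parent, scaled by the group count.
            for elem, n in group.items():
                stack[-1][elem] = stack[-1].get(elem, 0.0) + n * mult
        else:
            m = _ELEMENT.match(formula, pos)
            if m is None:
                raise ValueError(f"unexpected character {ch!r} at position {pos}")
            n, pos = _read_count(formula, m.end())
            stack[-1][m.group()] = stack[-1].get(m.group(), 0.0) + n
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


composition = parse_braced_formula('Ca{2.832}Fe{0.6285}Mg{2.5395}(CO3)6')
print(composition)
```

Because a decimal point can only appear inside braces in this sketch, a bare "." would remain free to mean something else, which is the point of the suggestion.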

Awesome, thanks!

I wish I could help more, but I don't think I would be that much help to you at this point.

For anyone else needing to work with decimal subscripts, here is a function to quickly calculate the smallest integer subscripts, and the scaling factor.

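import decimal

import numpy as np

def integer_subs(frac_subs):
    """Print the smallest integer subscripts and the scaling factor."""
    # Exponent of each subscript, i.e. minus its number of decimal places
    dec = [decimal.Decimal(str(sub)).as_tuple().exponent for sub in frac_subs]
    decmin = min(dec)
    # Power of ten that turns every subscript into an integer
    mf = 10**-decmin
    subs = np.round(np.asarray(frac_subs) * mf).astype(int)

    # Divide out the greatest common divisor to get the smallest integers
    cd = np.gcd.reduce(subs)
    subsf = subs // cd
    print('Integer Subscripts: ', subsf)

    # Factor by which the original subscripts were multiplied
    sf = mf / cd
    print('Scaling Factor:', sf)

subs1 = np.array([2.832, 0.6285, 2.5395, 6])

integer_subs(subs1)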

Integer Subscripts:  [1888  419 1693 4000]
Scaling Factor: 666.6666666666666
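
As a quick check of this workaround, the sketch below computes the molar mass of the integer-subscript formula and divides it by the scaling factor, then compares the result with the mass computed directly from the decimal subscripts. It does not use ChemPy: the atomic weights are rounded standard values typed in by hand, and `molar_mass` is a hypothetical helper for this illustration.

```python
# Standard atomic weights in g/mol, rounded; entered by hand for this sketch.
ATOMIC_MASS = {"Ca": 40.078, "Fe": 55.845, "Mg": 24.305, "C": 12.011, "O": 15.999}


def molar_mass(composition):
    """Sum atomic masses weighted by the (possibly fractional) element counts."""
    return sum(ATOMIC_MASS[el] * n for el, n in composition.items())


# Original formula: Ca2.832 Fe0.6285 Mg2.5395 (CO3)6
fractional = {"Ca": 2.832, "Fe": 0.6285, "Mg": 2.5395, "C": 6, "O": 18}

# Integer version from the output above: Ca1888 Fe419 Mg1693 (CO3)4000
integer = {"Ca": 1888, "Fe": 419, "Mg": 1693, "C": 4000, "O": 12000}
scaling_factor = 10000 / 15  # the 666.67 printed above

mass_direct = molar_mass(fractional)
mass_rescaled = molar_mass(integer) / scaling_factor
print(mass_direct, mass_rescaled)
```

Both values come out at about 570.37 g/mol, so the integer formula gives the right molar mass as long as the result is divided by the scaling factor.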

Hi, and thank you both for looking into this.

I'm open to changing the syntax for crystal water to allow parsing non-integer stoichiometric coefficients.

What about "Na2SO4:10H2O"? Does that look "natural" enough in your eyes? (Just a spontaneous suggestion of mine.)
I personally won't be able to code up a prototype anytime soon, I'm afraid, but if you want to go ahead, I'll definitely make time for code review, publishing an updated release, etc.

Changing the syntax will be a breaking change, so we'll need to bump the version number. Ideally, the parser would accept both the old and the new syntax for one intermediate release and issue a warning instructing the user to migrate to the new syntax. If that is not possible, I'm open to skipping the deprecation cycle.
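
The intermediate release described above could look roughly like the following sketch. It is not ChemPy code: `split_crystal_water` is a hypothetical helper, ":" is the separator suggested in this thread, and "." is assumed to be the current crystal-water separator.

```python
import warnings

NEW_SEP = ":"  # separator proposed in this thread
OLD_SEP = "."  # assumed current crystal-water separator


def split_crystal_water(formula):
    """Return (main_part, hydrate_part_or_None), accepting both syntaxes."""
    if NEW_SEP in formula:
        main, hydrate = formula.split(NEW_SEP, 1)
        return main, hydrate
    if OLD_SEP in formula:
        # Old syntax still works for one release, but tells the user to migrate.
        warnings.warn(
            f"'{OLD_SEP}' as crystal-water separator is deprecated; "
            f"write {formula.replace(OLD_SEP, NEW_SEP, 1)!r} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        main, hydrate = formula.split(OLD_SEP, 1)
        return main, hydrate
    return formula, None


print(split_crystal_water("Na2SO4:10H2O"))
print(split_crystal_water("Na2SO4.10H2O"))  # also emits a DeprecationWarning
```

One consequence of this design is that, while the old syntax is still accepted, a bare "." without ":" is still read as crystal water, so decimal subscripts only become unambiguous once the old syntax is removed.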