jgm/texmath

MathML -> Tex conversion, mfenced element: ( |

felix-smashdocs opened this issue · 5 comments

Hello texmath-team,

this issue is about converting the <mfenced> element from MathML to TeX.
I have the following example:

[image: initial_formula, showing (A|B) followed by |C(D|E)|]

which is represented in MathML as:

<math>
    <mfenced separators="|">
        <mrow>
            <mi>A</mi>
        </mrow>
        <mrow>
            <mi>B</mi>
        </mrow>
    </mfenced>
    <mi></mi>
    <mfenced close="|" open="|">
        <mrow>
            <mi>C</mi>
            <mfenced separators="|">
                <mrow>
                    <mi>D</mi>
                </mrow>
                <mrow>
                    <mi>E</mi>
                </mrow>
            </mfenced>
        </mrow>
    </mfenced>
</math>

When I convert this MathML to TeX at https://johnmacfarlane.net/texmath.html, I get the following result:

\left( A \middle| B \right)\left| {C\left| D \middle| E \right|} \right|

which looks like

[image: result_formula, showing (A|B) followed by |C|D|E||]

The parentheses () around D and E have been turned into vertical lines ||. I expected the formula to look like the initial version.
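
For reference, assuming the inner mfenced falls back to the MathML defaults for open and close (the parentheses), the TeX I would expect is roughly the following. This is a hand-edited version of the output above, not something the converter produced:

```latex
\left( A \middle| B \right)\left| {C\left( D \middle| E \right)} \right|
```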

Is this a bug in the texmath lib?

Best greetings,

Felix

jgm commented

It took me a while to figure out why the parser was doing this!

% texmath -f mathml -t native
<math>
     <mfenced separators="|">
                <mrow>
                    <mi>D</mi>
                </mrow>
                <mrow>
                    <mi>E</mi>
                </mrow>
            </mfenced>

</math>
[EDelimited "(" ")" [Right (EIdentifier "D"),Left "|",Right (EIdentifier "E")]]

(here we get the correct parentheses for open and close), but when we embed this in the outer mfenced:

% texmath -f mathml -t native
<math>
    <mfenced separators="|">
        <mrow>
            <mi>A</mi>
        </mrow>
        <mrow>
            <mi>B</mi>
        </mrow>
    </mfenced>
    <mi></mi>
    <mfenced close="|" open="|">
        <mrow>
            <mi>C</mi>
            <mfenced separators="|">
                <mrow>
                    <mi>D</mi>
                </mrow>
                <mrow>
                    <mi>E</mi>
                </mrow>
            </mfenced>
        </mrow>
    </mfenced>
</math>
[EDelimited "(" ")" [Right (EIdentifier "A"),Left "|",Right (EIdentifier "B")],EIdentifier "",EDelimited "|" "|" [Right (EGrouped [EIdentifier "C",EDelimited "|" "|" [Right (EIdentifier "D"),Left "|",Right (EIdentifier "E")]])]]

Now we get | for open and close!

The reason, it seems, is that the MathML reader stores the attributes of outer elements in state and uses them when parsing children in some cases. The relevant code is here (Text.TeXMath.Readers.MathML, line 599):

findAttrQ :: String -> Element -> MML (Maybe T.Text)
findAttrQ s e = do
  inherit <- asks (lookupAttrQ s . attrs)
  return $ fmap T.pack $
    findAttr (QName s Nothing Nothing) e
      <|> inherit

So what's happening here is that the inner mfenced is inheriting the outer one's open and close attributes, since it doesn't explicitly specify them. Clearly, that's not what should be happening: rather, open and close should receive default values.
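
To make the effect concrete, here is a minimal, self-contained sketch of the two lookup strategies. It is not the texmath code: attributes are modelled as plain association lists, and the names `Attrs`, `findAttrInheriting`, `findAttrLocal`, `openDelim`, `outerAttrs` and `innerAttrs` are hypothetical, chosen only for illustration.

```haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)

-- Hypothetical stand-in for an element's attribute list (name, value).
type Attrs = [(String, String)]

-- Look on the element first, then fall back to attributes collected
-- from enclosing elements. This mirrors the inheriting behaviour
-- described above, in simplified form.
findAttrInheriting :: Attrs -> Attrs -> String -> Maybe String
findAttrInheriting inherited own name =
  lookup name own <|> lookup name inherited

-- Look on the element only; enclosing elements are ignored.
findAttrLocal :: Attrs -> String -> Maybe String
findAttrLocal own name = lookup name own

-- Resolve the opening delimiter, using the mfenced default "(".
openDelim :: Maybe String -> String
openDelim = fromMaybe "("

-- Attributes of the outer and inner mfenced from the example.
outerAttrs, innerAttrs :: Attrs
outerAttrs = [("open", "|"), ("close", "|")]
innerAttrs = [("separators", "|")]
```

Loaded in GHCi, `openDelim (findAttrInheriting outerAttrs innerAttrs "open")` evaluates to `"|"`, which is the unwanted result, while `openDelim (findAttrLocal innerAttrs "open")` evaluates to `"("`.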

This code was added long ago by @mpickering - I wonder if he can remember why we have this inherit?

jgm commented

When I change this inheritance so it doesn't accumulate attributes from parents, I see a number of test failures, e.g. in munder5.mml

    <munder accentunder="false">
      <mi>x</mi> 
      <mo> &#x02DC;</mo> 
    </munder> 

we get ESymbol Accent "\732" instead of ESymbol Ord "\732", apparently because the attribute accentunder="false" is not getting seen. On a quick glance, all the test failures are like this, so maybe this was the reason for the inheritance? Something more limited should then work?

jgm commented

Looking at the spec, it seems that many attributes inherit, but these don't.
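
A more limited lookup along these lines might be sketched as follows. This is only an illustration under stated assumptions, not the actual change in texmath: attributes are plain association lists, and the names `Attrs`, `nonInherited` and `findAttrSelective`, as well as the contents of the exclusion list, are hypothetical.

```haskell
import Control.Applicative ((<|>))

-- Hypothetical stand-in for an element's attribute list (name, value).
type Attrs = [(String, String)]

-- Assumed list of attributes that must never be taken from an
-- enclosing element. The exact membership is a guess for illustration.
nonInherited :: [String]
nonInherited = ["open", "close", "separators"]

-- Prefer the element's own attribute. Fall back to the enclosing
-- elements only for attributes that are not on the exclusion list.
findAttrSelective :: Attrs -> Attrs -> String -> Maybe String
findAttrSelective inherited own name =
  lookup name own <|> fromParents
  where
    fromParents
      | name `elem` nonInherited = Nothing
      | otherwise                = lookup name inherited
```

With `inherited = [("accentunder", "false"), ("open", "|")]` and no attributes on the element itself, looking up `"accentunder"` still yields `Just "false"`, so a case like munder5.mml would keep working, while looking up `"open"` yields `Nothing` and the caller can apply the default.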

Cool, thanks a lot @jgm for the answer and solution! Works for me.
Interesting that the inheritance behaviour is defined differently for different attributes.

jgm commented

Yeah, there may be other problems of this kind, but I couldn't find a handy table of which attributes inherit and which don't -- and I didn't have time to comb through the whole spec exhaustively.