Speech-Rule-Engine/speech-rule-engine

issues around `A \rightarrow {{\mathcal{b}}c}(d)`

pkra opened this issue · 8 comments

pkra commented

Both v3 (e.g., mathjax.org#demo) and v4 show some strange behavior for something like A \rightarrow {{\mathcal{b}}c}(d)

The (d) part is grouped separately from the initial part (with an invisible times before it) but the semantic structure keeps it in a group with b and c.

This causes two problems:

  • Highlighting: on mathjax.org#demo, for example, the highlight excludes the (d) part unless you navigate to it directly (i.e., the full tree is never highlighted).
  • Speech: when exploring down to (d), speech is missing.

Full example \iota :{\mathcal{C}}\rightarrow {{\mathcal{C}}one}({\mathcal{C}}) (more available).

pkra commented

Looking at the serialization, the (d) part (incl. invisible times) is in a separate row from the rest (even though data-semantic structure points to it).

pkra commented

Sorry, that was just duplicating what I wrote in the issue. What I had meant to add was that the toEnriched() output lacks data-semantic-speech attributes for the (d) part.
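
For anyone who wants to check this mechanically, here is a minimal sketch in Python that lists the semantic nodes lacking a speech attribute. The sample markup, ids and speech strings are made up for illustration and are not actual SRE output; only the attribute names data-semantic-id and data-semantic-speech are taken from the enriched format, and the helper name ids_without_speech is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample (NOT real SRE output): the first row
# carries speech, the fenced (d) part does not.
SAMPLE = """
<math>
  <mrow data-semantic-id="5" data-semantic-speech="b c">
    <mi data-semantic-id="0" data-semantic-speech="b">b</mi>
    <mi data-semantic-id="1" data-semantic-speech="c">c</mi>
  </mrow>
  <mrow data-semantic-id="4">
    <mo data-semantic-id="2">(</mo>
    <mi data-semantic-id="3">d</mi>
    <mo data-semantic-id="6">)</mo>
  </mrow>
</math>
"""


def ids_without_speech(xml_text):
    """Return the semantic ids (in document order) of elements that have
    a data-semantic-id but no data-semantic-speech attribute."""
    root = ET.fromstring(xml_text)
    return [
        el.get("data-semantic-id")
        for el in root.iter()
        if el.get("data-semantic-id") is not None
        and el.get("data-semantic-speech") is None
    ]


if __name__ == "__main__":
    print(ids_without_speech(SAMPLE))
```

Running the same helper over a real toEnriched() serialization should show whether the (d) subtree is the only part without speech.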

pkra commented

With the latest, the most critical part is fixed but I think the enriched output could be improved.

Taking the full real-world example:

\iota :{\mathcal{C}}\rightarrow {{\mathcal{C}}one}({\mathcal{C}})

SRE creates a structure where Cone(C) is grouped semantically (great), but (C) is placed outside the row containing \iota :{\mathcal{C}}\rightarrow {{\mathcal{C}}one} (instead of inside the row it creates for Cone).

This seems unfortunate.

I am aware of this. This is mainly due to the preference handling on punctuation, which should be more fine-grained.
But, in particular, SRE should also be recognising the entire element as a function application (that is, {{\mathcal{C}}one}({\mathcal{C}}) and {{\mathcal{B}}c}(d)).

I have started working on some improvements, but none will land in SRE before v4.1, and definitely not before MathJax 3.2.1 is out.

pkra commented

Thanks for the follow-up. It does seem like the semantic tree gets it right but the serialization makes it look weird (even though it seems like it could make it look more reasonable).

Anyway, thanks again for fixing the bug!

Indeed the semantic tree does get the punctuation right:
https://speech-rule-engine.github.io/semantic-tree-visualiser/?110000111100%5Ciota%20%3A%7B%5Cmathcal%7BC%7D%7D%5Crightarrow%20%7B%7B%5Cmathcal%7BC%7D%7Done%7D%28%7B%5Cmathcal%7BC%7D%7D%29

But Cone is split up by MathJax. There are workarounds:

\iota :{\mathcal{C}}\rightarrow {\mathit{\mathcal{C}one}}({\mathcal{C}})

would at least leave the "one" in one piece. Ideally one could use

\iota :{\mathcal{C}}\rightarrow {\mathcal{Cone}}({\mathcal{C}})

as this would look the same visually and keep Cone as one word. But it is rather unnatural to write, as it exploits the fact that calligraphic is only available for capitals.

pkra commented

My comment was not about Cone but where the following (mathcal C) ends up in the visual output:

From the visualizer: MathJax creates the expected flat row; in particular, Cone and (C) are siblings.

<math>
  <mstyle displaystyle="true" scriptlevel="0">
    <mi>&#x3B9;</mi>
    <mo>:</mo>
    <mrow data-mjx-texclass="ORD">
      <mrow data-mjx-texclass="ORD">
        <mi data-mjx-variant="-tex-calligraphic" mathvariant="script">C</mi>
      </mrow>
    </mrow>
    <mo stretchy="false">&#x2192;</mo>
    <mrow data-mjx-texclass="ORD">
      <mrow data-mjx-texclass="ORD">
        <mrow data-mjx-texclass="ORD">
          <mi data-mjx-variant="-tex-calligraphic" mathvariant="script">C</mi>
        </mrow>
      </mrow>
      <mi>o</mi>
      <mi>n</mi>
      <mi>e</mi>
    </mrow>
    <mo stretchy="false">(</mo>
    <mrow data-mjx-texclass="ORD">
      <mrow data-mjx-texclass="ORD">
        <mi data-mjx-variant="-tex-calligraphic" mathvariant="script">C</mi>
      </mrow>
    </mrow>
    <mo stretchy="false">)</mo>
  </mstyle>
</math>

However, enrMML has a row containing everything except (C) and the additional invisible times, which both end up outside it.

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mrow type="punctuated" role="sequence" id="19" children="0,1,18" content="1">
    <mi type="identifier" role="greekletter" font="italic" annotation="clearspeak:simple" id="0" parent="19">ι</mi>
    <mo type="punctuation" role="colon" id="1" parent="19" operator="punctuated">:</mo>
    <mrow type="relseq" role="arrow" id="18" children="2,17" content="3" parent="19">
      <mrow data-mjx-texclass="ORD">
        <mrow data-mjx-texclass="ORD">
          <mi data-mjx-variant="-tex-calligraphic" type="identifier" role="latinletter" font="script" annotation="clearspeak:simple" id="2" parent="18" mathvariant="script">C</mi>
        </mrow>
      </mrow>
      <mo stretchy="false" type="relation" role="arrow" id="3" parent="18" operator="relseq,→">→</mo>
      <mrow data-mjx-texclass="ORD">
        <mrow type="infixop" role="implicit" annotation="clearspeak:unit" id="17" children="4,5,6,7,15" content="8,9,10,16" parent="18">
          <mrow data-mjx-texclass="ORD">
            <mrow data-mjx-texclass="ORD">
              <mi data-mjx-variant="-tex-calligraphic" type="identifier" role="latinletter" font="script" annotation="clearspeak:simple" id="4" parent="17" mathvariant="script">C</mi>
            </mrow>
          </mrow>
          <mo type="operator" role="multiplication" id="8" parent="17" added="true" operator="infixop,⁢">⁢</mo>
          <mi type="identifier" role="latinletter" font="italic" annotation="clearspeak:simple" id="5" parent="17">o</mi>
          <mo type="operator" role="multiplication" id="9" parent="17" added="true" operator="infixop,⁢">⁢</mo>
          <mi type="identifier" role="latinletter" font="italic" annotation="clearspeak:simple" id="6" parent="17">n</mi>
          <mo type="operator" role="multiplication" id="10" parent="17" added="true" operator="infixop,⁢">⁢</mo>
          <mi type="identifier" role="latinletter" font="italic" annotation="clearspeak:simple" id="7" parent="17">e</mi>
        </mrow>
      </mrow>
    </mrow>
  </mrow>
  <mo type="operator" role="multiplication" id="16" parent="17" added="true" operator="infixop,⁢">⁢</mo>
  <mrow type="fenced" role="leftright" id="15" children="13" content="12,14" parent="17">
    <mo stretchy="false" type="fence" role="open" id="12" parent="15" operator="fenced">(</mo>
    <mrow data-mjx-texclass="ORD">
      <mrow data-mjx-texclass="ORD">
        <mi data-mjx-variant="-tex-calligraphic" type="identifier" role="latinletter" font="script" annotation="clearspeak:simple" id="13" parent="15" mathvariant="script">C</mi>
      </mrow>
    </mrow>
    <mo stretchy="false" type="fence" role="close" id="14" parent="15" operator="fenced">)</mo>
  </mrow>

</math>
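
The mismatch can also be found programmatically. Below is a minimal sketch in Python that walks the enriched markup and reports every node listed in a children or content attribute that is not a DOM descendant of the element listing it. The embedded sample is a hand-reduced copy of the output above (same ids, most attributes and wrapper rows dropped, plain ASCII text); it uses the unprefixed attribute names shown by the visualiser, so the names would need adjusting for data-semantic- prefixed output. The helper name misplaced_nodes is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-reduced copy of the enriched output above: same ids and the same
# nesting of the semantic rows, unrelated attributes and wrappers removed.
ENRICHED = """
<math>
  <mrow type="punctuated" id="19" children="0,1,18" content="1">
    <mi id="0" parent="19">iota</mi>
    <mo id="1" parent="19">:</mo>
    <mrow type="relseq" id="18" children="2,17" content="3" parent="19">
      <mi id="2" parent="18">C</mi>
      <mo id="3" parent="18">-&gt;</mo>
      <mrow type="infixop" id="17" children="4,5,6,7,15" content="8,9,10,16" parent="18">
        <mi id="4" parent="17">C</mi>
        <mo id="8" parent="17" added="true"/>
        <mi id="5" parent="17">o</mi>
        <mo id="9" parent="17" added="true"/>
        <mi id="6" parent="17">n</mi>
        <mo id="10" parent="17" added="true"/>
        <mi id="7" parent="17">e</mi>
      </mrow>
    </mrow>
  </mrow>
  <mo id="16" parent="17" added="true"/>
  <mrow type="fenced" id="15" children="13" content="12,14" parent="17">
    <mo id="12" parent="15">(</mo>
    <mi id="13" parent="15">C</mi>
    <mo id="14" parent="15">)</mo>
  </mrow>
</math>
"""


def misplaced_nodes(xml_text):
    """Return (semantic_parent_id, node_id) pairs where a node named in
    'children' or 'content' is not a DOM descendant of that parent."""
    root = ET.fromstring(xml_text)
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    problems = []
    for parent_id, el in by_id.items():
        # ids of all elements in the DOM subtree rooted at this element
        inside = {d.get("id") for d in el.iter() if d.get("id")}
        for attr in ("children", "content"):
            for ref in filter(None, (el.get(attr) or "").split(",")):
                if ref not in inside:
                    problems.append((parent_id, ref))
    return problems


if __name__ == "__main__":
    print(misplaced_nodes(ENRICHED))
```

On the reduced sample this reports that nodes 15 (the fenced (C)) and 16 (the invisible times) sit outside their semantic parent 17, which is exactly the oddity described above.
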
pkra commented

Since this has gone stale: the main issue has been solved, I'm guessing the remaining subtlety isn't likely to happen.