[skeleton] empheq causing no top-level speech
pkra opened this issue · 7 comments
Here's an attempt at a minimal example:
\begin{empheq} [left = \empheqlbrace \,]{align} b \tag{1}\end{empheq}
This generates a skeleton that starts with
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block" data-semantic-structure="(12 2 (11 3 (9 (8 (1 0) 6))))">
<mrow data-semantic-added="true">
<mo data-semantic-type="punctuation" data-semantic-role="dummy" data-semantic-id="10" data-semantic-parent="11" data-semantic-added="true" data-semantic-operator="punctuated" data-semantic-speech="comma"></mo>
<mrow data-semantic-added="true" data-semantic-type="punctuated" data-semantic-role="text" data-semantic-id="11" data-semantic-children="3,9" data-semantic-content="10" data-semantic-parent="12" data-semantic-owns="3 9" data-semantic-speech="StartLayout 1st Row with Label left parenthesis 1 right parenthesis EndLabel b EndLayout"/>
...
The root element is missing speech and children, making labeling and navigation impossible.
FWIW, the original example was
\begin{empheq} [left = \empheqlbrace \,]{align} &\dot{Z}(t) = J_{{2n}}g^{\operatorname *{DD}}_{\mathcal{H}}(U,Z,t), & \text{in }\mathcal{T}, \cssId{texmlid30}{\tag{4.15a}}\\ &\begin{aligned} \dot{U}(t) & = (I_{{2N}}-UU^\top )(J_{{2N}}G^{\,p^*}_{\mathcal{H}}(U,Z)Z^\top {-}\\ & \qquad G^{\,p^*}_{\mathcal{H}}(U,Z)Z^\top J_{{2n}}^\top ) S(Z)^{-1}, \end{aligned} & \text{in }\mathcal{T}, \cssId{texmlid31}{\tag{4.15b}}\\ &U(t_0)Z(t_0) = U^0 Z^0,& \tag{4.15c} \end{empheq}
Below is the result I get. Note that here the root node (id=12) is not the topmost node in the tree but an mrow element further down. The skeleton, on the other hand, will always be in the expression's root node. SRE has a getSemanticRoot method in the walker_util module.
<math xmlns="http://www.w3.org/1998/Math/MathML" data-latex="\begin{empheq} [left = \empheqlbrace \,]{align} b \tag{1}\end{empheq}" display="block" data-semantic-structure="(12 2 (11 3 (9 (8 (1 0) 6))))">
<mrow data-semantic-added="true">
<mo data-semantic-type="punctuation" data-semantic-role="dummy" data-semantic-id="10" data-semantic-parent="11" data-semantic-added="true" data-semantic-operator="punctuated" data-semantic-speech="comma"></mo>
<mrow data-semantic-added="true" data-semantic-type="punctuated" data-semantic-role="text" data-semantic-annotation="depth:2" data-semantic-id="11" data-semantic-children="3,9" data-semantic-content="10" data-semantic-parent="12" data-semantic-owns="3 9" data-semantic-speech="StartLayout 1st Row with Label left parenthesis 1 right parenthesis EndLabel b EndLayout"/>
<mrow data-semantic-added="true" data-semantic-type="punctuated" data-semantic-role="startpunct" data-semantic-annotation="Emph:left;Emph:top;depth:1" data-semantic-id="12" data-semantic-children="2,11" data-semantic-content="2" data-semantic-attributes="latex:\begin{empheq} [left = \empheqlbrace \,]{align} b \tag{1}\end{empheq}" data-semantic-owns="2 11" data-semantic-speech="left brace StartLayout 1st Row with Label left parenthesis 1 right parenthesis EndLabel b EndLayout"/>
<mtable displaystyle="true" columnalign="right right" columnspacing="0em " rowspacing="3pt" data-break-align="bottom" data-latex="\begin{align} b \tag{1}\end{empheq}" data-semantic-type="multiline" data-semantic-role="unknown" data-semantic-annotation="Emph:table;depth:3" data-semantic-id="9" data-semantic-children="8" data-semantic-parent="11" data-semantic-owns="8" data-semantic-speech="StartLayout 1st Row with Label left parenthesis 1 right parenthesis EndLabel b EndLayout">
<mlabeledtr data-semantic-type="line" data-semantic-role="multiline" data-semantic-annotation="depth:4" data-semantic-id="8" data-semantic-children="6" data-semantic-content="1" data-semantic-parent="9" data-semantic-owns="1 6" data-semantic-speech="with Label left parenthesis 1 right parenthesis EndLabel b" data-semantic-prefix="1st Row">
<mtd id="mjx-eqn:1" data-semantic-type="cell" data-semantic-role="label" data-semantic-id="1" data-semantic-children="0" data-semantic-parent="8" data-semantic-owns="0" data-semantic-speech="left parenthesis 1 right parenthesis" data-semantic-prefix="1st Column">
<mtext data-latex="\text{(1)}" data-semantic-type="text" data-semantic-role="annotation" data-semantic-font="normal" data-semantic-id="0" data-semantic-parent="1" data-semantic-attributes="latex:\text{(1)}" data-semantic-speech="left parenthesis 1 right parenthesis">(1)</mtext>
</mtd>
<mtd>
<mpadded height="0" depth="0" voffset="height">
<mpadded height="0" depth="0" voffset="-1height">
<mo data-latex="\empheqlbrace" data-semantic-type="punctuation" data-semantic-role="openfence" data-semantic-annotation="Emph:left;depth:2" data-semantic-id="2" data-semantic-parent="12" data-semantic-attributes="latex:\empheqlbrace" data-semantic-operator="punctuated" data-semantic-speech="left brace">{</mo>
<mtext data-semantic-type="text" data-semantic-role="space" data-semantic-annotation="Emph:left;clearspeak:unit;depth:3" data-semantic-id="3" data-semantic-parent="11" data-semantic-speech=""> </mtext>
<mphantom>
<mpadded width="0">
<mtable displaystyle="true" columnalign="right" columnspacing="" rowspacing="3pt" data-break-align="bottom">
<mlabeledtr>
<mtd>
<mtext data-latex="\text{(1)}">(1)</mtext>
</mtd>
<mtd>
<mi data-latex="\tag{1}">b</mi>
</mtd>
</mlabeledtr>
</mtable>
</mpadded>
</mphantom>
</mpadded>
<mphantom>
<mpadded width="0">
<mtable displaystyle="true" columnalign="right" columnspacing="" rowspacing="3pt" data-break-align="bottom" align="baseline 1">
<mlabeledtr>
<mtd>
<mtext data-latex="\text{(1)}">(1)</mtext>
</mtd>
<mtd>
<mi data-latex="\tag{1}">b</mi>
</mtd>
</mlabeledtr>
</mtable>
</mpadded>
</mphantom>
</mpadded>
</mtd>
<mtd>
<mi data-latex="\tag{1}" data-semantic-type="identifier" data-semantic-role="latinletter" data-semantic-font="italic" data-semantic-annotation="clearspeak:simple;depth:5" data-semantic-id="6" data-semantic-parent="8" data-semantic-attributes="latex:\tag{1}" data-semantic-speech="b">b</mi>
</mtd>
</mlabeledtr>
</mtable>
</mrow>
</math>
The reason why the semantic tree is rather unshapely is the \, space, which SRE interprets as semantically relevant. This is similar to a\,b vs a\quad b, where SRE would normally only deem the latter semantically relevant. Not sure why it does so in this case.
Compare the tree for the above expression to the one we get for
\begin{empheq} [left = \empheqlbrace]{align} b \tag{1}\end{empheq}
or even
\begin{empheq} [left = \empheqlbrace\;]{align} b \tag{1}\end{empheq}
where SRE interprets a case statement.
Thanks for looking into this! Now I realize that the top root also doesn't provide a data-semantic-owns attribute.
I've been naively querySelecting the first DOM node with data-semantic-speech and picking that as the root.
I'm guessing I should peek into the top-level data-semantic-structure to find the real root instead. Does that sound about right?
The way SRE does it is to look for the node in the element that has an id but no parent. I believe I use an XPath expression for that.
But since you know that you have the data-semantic-structure attribute, which will always be on the expression's root, you can just look up the id of the semantic root node and look for this with
querySelector(`[data-semantic-id="${id}"]`);
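Here is a minimal sketch of that lookup, not SRE's own implementation. The function names rootIdFromStructure and findSemanticRoot are made up for illustration. It assumes that the first number in the data-semantic-structure string is the id of the semantic root, which matches the example above (structure starts with 12, and the root node has id=12). As a fallback it uses the "has an id but no parent" rule, expressed as a CSS selector rather than XPath.

```javascript
// Hypothetical helpers for illustration; these are not part of SRE's API.

// Read the root id from a skeleton string such as "(12 2 (11 3 (9 (8 (1 0) 6))))".
// Assumption: the first number in the string is the semantic root's id.
function rootIdFromStructure(structure) {
  const match = /\d+/.exec(structure || '');
  return match ? match[0] : null;
}

// Locate the semantic root of an enriched expression.
// `mathEl` is the topmost element carrying data-semantic-structure; any object
// with getAttribute and querySelector (for example a DOM Element) works.
function findSemanticRoot(mathEl) {
  const id = rootIdFromStructure(mathEl.getAttribute('data-semantic-structure'));
  if (id !== null) {
    // The topmost element may itself be the semantic root.
    if (mathEl.getAttribute('data-semantic-id') === id) return mathEl;
    const byId = mathEl.querySelector(`[data-semantic-id="${id}"]`);
    if (byId) return byId;
  }
  // Fallback: a node that has a semantic id but no semantic parent.
  return mathEl.querySelector('[data-semantic-id]:not([data-semantic-parent])');
}
```

Unlike picking the first node with data-semantic-speech, this does not depend on where the root ends up in document order.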
Thanks for clarifying. I'll adjust downstream. Thanks again for looking into this!