jgm/texmath

"aligned" should be used instead of some "matrix" when converting to latex

ZhuangQu opened this issue · 7 comments

demo.docx
Run the command: pandoc demo.docx -o demo.tex
The formula in Word
f(x)={█(x&=1 & y&=2@xxx&=111 & yyy&=222)┤
is converted to

\[
	f(x) = \left\{
		\begin{matrix}
			x\& = 1\ \&\ y\& = 2 \\
			xxx\& = 111\ \&\ yyy\& = 222 \\
		\end{matrix} 
	\right.
\]

which is wrong. The correct conversion is

\[
	f(x) = \left\{
		\begin{aligned}
			x& = 1\ &\ y& = 2 \\
			xxx& = 111\ &\ yyy& = 222 \\
		\end{aligned} 
	\right.
\]

jgm commented

Here's the ooxml that is being converted from demo.docx:

<m:oMathPara>
  <m:oMath>
    <m:r>
      <w:rPr>
        <w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"
        w:cs="Times New Roman" />
      </w:rPr>
      <m:t>f</m:t>
    </m:r>
    <m:d>
      <m:dPr>
        <m:ctrlPr>
          <w:rPr>
            <w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"
            w:cs="Times New Roman" />
            <w:i />
          </w:rPr>
        </m:ctrlPr>
      </m:dPr>
      <m:e>
        <m:r>
          <w:rPr>
            <w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"
            w:cs="Times New Roman" />
          </w:rPr>
          <m:t>x</m:t>
        </m:r>
      </m:e>
    </m:d>
    <m:r>
      <w:rPr>
        <w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"
        w:cs="Times New Roman" />
      </w:rPr>
      <m:t>=</m:t>
    </m:r>
    <m:d>
      <m:dPr>
        <m:begChr m:val="{" />
        <m:endChr m:val="" />
        <m:ctrlPr>
          <w:rPr>
            <w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"
            w:cs="Times New Roman" />
            <w:i />
          </w:rPr>
        </m:ctrlPr>
      </m:dPr>
      <m:e>
        <m:eqArr>
          <m:eqArrPr>
            <m:ctrlPr>
              <w:rPr>
                <w:rFonts w:ascii="Cambria Math"
                w:hAnsi="Cambria Math" w:cs="Times New Roman" />
                <w:i />
              </w:rPr>
            </m:ctrlPr>
          </m:eqArrPr>
          <m:e>
            <m:r>
              <w:rPr>
                <w:rFonts w:ascii="Cambria Math"
                w:hAnsi="Cambria Math" w:cs="Times New Roman" />
              </w:rPr>
              <m:t>x&amp;=1 &amp; y&amp;=2</m:t>
            </m:r>
            <m:ctrlPr>
              <w:rPr>
                <w:rFonts w:ascii="Cambria Math"
                w:eastAsia="Cambria Math" w:hAnsi="Cambria Math"
                w:cs="Cambria Math" />
                <w:i />
              </w:rPr>
            </m:ctrlPr>
          </m:e>
          <m:e>
            <m:r>
              <w:rPr>
                <w:rFonts w:ascii="Cambria Math"
                w:hAnsi="Cambria Math" w:cs="Times New Roman" />
              </w:rPr>
              <m:t>xxx&amp;=111 &amp; yyy&amp;=222</m:t>
            </m:r>
          </m:e>
        </m:eqArr>
      </m:e>
    </m:d>
  </m:oMath>
</m:oMathPara>

The m:eqArr element in the OOXML should be converted to aligned in LaTeX, not matrix.
@jgm

This issue should probably be moved to jgm/texmath.

jgm commented

Changing this to produce aligned would require changes in the underlying types. We don't have a constructor that corresponds to the align environment, so we fall back on array, which should give similar results. What I can do is fix the bug that produces the literal & symbols, and ensure that they instead split the formula into columns.
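
To illustrate the idea of treating & as a column separator, here is a minimal Python sketch. It is not texmath's implementation (texmath is written in Haskell); the function names `eqarr_rows` and `to_latex` are hypothetical, and the sample is a trimmed version of the m:eqArr shown above. It also strips the spaces around each cell, whereas the real writer keeps them as `\ `.

```python
import xml.etree.ElementTree as ET

# Office Math namespace used by the m: prefix in OOXML.
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"

# Trimmed version of the m:eqArr from demo.docx (formatting properties removed).
SAMPLE = f"""<m:eqArr xmlns:m="{M_NS}">
  <m:e><m:r><m:t>x&amp;=1 &amp; y&amp;=2</m:t></m:r></m:e>
  <m:e><m:r><m:t>xxx&amp;=111 &amp; yyy&amp;=222</m:t></m:r></m:e>
</m:eqArr>"""


def eqarr_rows(xml_text):
    """Return one list of cell strings per <m:e> row, splitting the text on '&'."""
    root = ET.fromstring(xml_text)
    rows = []
    for e in root.findall(f"{{{M_NS}}}e"):
        # Concatenate all text runs in the row, then split on the alignment marker.
        text = "".join(t.text or "" for t in e.iter(f"{{{M_NS}}}t"))
        rows.append([cell.strip() for cell in text.split("&")])
    return rows


def to_latex(rows, env="aligned"):
    """Join cells with ' & ' and end each row with a LaTeX line break."""
    body = "\n".join(" & ".join(row) + r" \\" for row in rows)
    return f"\\begin{{{env}}}\n{body}\n\\end{{{env}}}"


if __name__ == "__main__":
    print(to_latex(eqarr_rows(SAMPLE)))
```

The point is only that each & in the run text ends a cell rather than being escaped as `\&`; the choice of environment name is left as a parameter.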

jgm commented

Output with my changes:

f(x) = \left\{ \begin{array}{rlll}
x & = 1\  & \ y & = 2 \\
xxx & = 111\  & \ yyy & = 222 \\
\end{array} \right.\ 

which is certainly much better.

jgm commented

Rendered using TeX:

[Screenshot of the rendered formula]

jgm commented

Ah wait, I forgot that the TeX writer already renders certain arrays as "aligned" by default.
So we can get

f(x) = \left\{ \begin{aligned}
x & = 1\  & \ y & = 2 \\
xxx & = 111\  & \ yyy & = 222 \\
\end{aligned} \right.\

after all.
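
For anyone who wants to check the result locally, a minimal document along these lines should compile. This is a sketch, not pandoc's generated preamble: the only assumption is that amsmath is loaded, since it provides the aligned environment, and the trailing `\ ` after `\right.` is omitted.

```latex
\documentclass{article}
\usepackage{amsmath} % provides the aligned environment
\begin{document}
\[
f(x) = \left\{ \begin{aligned}
x & = 1\  & \ y & = 2 \\
xxx & = 111\  & \ yyy & = 222 \\
\end{aligned} \right.
\]
\end{document}
```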