please use █ instead of ■ when converting {aligned} into docx

Question

ZhuangQu opened this issue 2 years ago · 6 comments

I use pandoc 3.1.1 on Windows 11. When converting

\begin{equation*}
    \begin{aligned}
        1= & 2 &  & 3 \\
        =  & 4 &  & 5 \\
    \end{aligned}
\end{equation*}

from LaTeX into docx, we get

■(1=&2&&3@=&4&&5)

in Word. We can see that you convert {aligned} to ■, which is wrong. The correct output is █.
In UnicodeMath, ■ (U+25A0) represents a matrix, while █ (U+2588) represents an aligned structure.
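
To make the requested difference concrete, here is a minimal Python sketch that builds the linear UnicodeMath string for the example above. The function name `to_unicodemath` and its `aligned` flag are hypothetical and exist only for illustration; this is not how pandoc or texmath produce docx output. Dropping all whitespace is also an assumption of the sketch.

```python
MATRIX = "\u25A0"   # ■, UnicodeMath matrix
ALIGNED = "\u2588"  # █, UnicodeMath aligned structure

def to_unicodemath(body: str, aligned: bool) -> str:
    """Turn the body of a LaTeX {aligned} or {matrix} environment into
    a linear UnicodeMath string (hypothetical helper).

    Rows are joined with @ and the & marks are kept as they are, matching
    the string quoted above. All whitespace is removed (an assumption).
    """
    rows = []
    for row in body.split("\\\\"):       # LaTeX row separator is \\
        compact = "".join(row.split())   # strip spaces and newlines
        if compact:                      # skip the empty piece after the final \\
            rows.append(compact)
    prefix = ALIGNED if aligned else MATRIX
    return f"{prefix}({'@'.join(rows)})"

body = r"""
    1= & 2 &  & 3 \\
    =  & 4 &  & 5 \\
"""
print(to_unicodemath(body, aligned=False))  # the string pandoc's output shows in Word
print(to_unicodemath(body, aligned=True))   # the string this issue asks for
```

The two results differ only in the leading character, which is the whole point of the request.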

Answer 1 · 2023-03-13T02:49:22.000Z

Transferring to jgm/texmath which does our math conversion.

Note: we don't use UnicodeMath; we use Word's XML representation of math.
The above aligned environment is translated as

<m:oMathPara>
  <m:oMathParaPr>
    <m:jc m:val="center" />
  </m:oMathParaPr>
  <m:oMath>
    <m:m>
      <m:mPr>
        <m:baseJc m:val="center" />
        <m:plcHide m:val="1" />
        <m:mcs>
          <m:mc>
            <m:mcPr>
              <m:mcJc m:val="right" />
              <m:count m:val="1" />
            </m:mcPr>
          </m:mc>
          <m:mc>
            <m:mcPr>
              <m:mcJc m:val="left" />
              <m:count m:val="1" />
            </m:mcPr>
          </m:mc>
          <m:mc>
            <m:mcPr>
              <m:mcJc m:val="right" />
              <m:count m:val="1" />
            </m:mcPr>
          </m:mc>
          <m:mc>
            <m:mcPr>
              <m:mcJc m:val="left" />
              <m:count m:val="1" />
            </m:mcPr>
          </m:mc>
        </m:mcs>
      </m:mPr>
      <m:mr>
        <m:e>
          <m:r>
            <m:t>1</m:t>
          </m:r>
          <m:r>
            <m:rPr>
              <m:sty m:val="p" />
            </m:rPr>
            <m:t>=</m:t>
          </m:r>
        </m:e>
        <m:e>
          <m:r>
            <m:t>2</m:t>
          </m:r>
        </m:e>
        <m:e />
        <m:e>
          <m:r>
            <m:t>3</m:t>
          </m:r>
        </m:e>
      </m:mr>
      <m:mr>
        <m:e>
          <m:r>
            <m:rPr>
              <m:sty m:val="p" />
            </m:rPr>
            <m:t>=</m:t>
          </m:r>
        </m:e>
        <m:e>
          <m:r>
            <m:t>4</m:t>
          </m:r>
        </m:e>
        <m:e />
        <m:e>
          <m:r>
            <m:t>5</m:t>
          </m:r>
        </m:e>
      </m:mr>
    </m:m>
  </m:oMath>
</m:oMathPara>

Please suggest more appropriate OMML.

Answer 2 · 2023-03-17T03:34:39.000Z

Sorry, I don't know what OMML is.
I only know that █ is correct and ■ is wrong.
Maybe you can convert UnicodeMath to OMML to find more appropriate OMML.

Answer 3 · 2023-03-17T06:47:24.000Z

Experimenting with Word: using U+25A0, I get the following XML:

      <m:oMathPara>
        <m:oMathParaPr>
          <m:jc m:val="center" />
        </m:oMathParaPr>
        <m:oMath>
          <m:m>
            <m:mPr>
              <m:plcHide m:val="1" />
              <m:mcs>
                <m:mc>
                  <m:mcPr>
                    <m:count m:val="1" />
                    <m:mcJc m:val="right" />
                  </m:mcPr>
                </m:mc>
                <m:mc>
                  <m:mcPr>
                    <m:count m:val="1" />
                    <m:mcJc m:val="left" />
                  </m:mcPr>
                </m:mc>
                <m:mc>
                  <m:mcPr>
                    <m:count m:val="1" />
                    <m:mcJc m:val="right" />
                  </m:mcPr>
                </m:mc>
                <m:mc>
                  <m:mcPr>
                    <m:count m:val="1" />
                    <m:mcJc m:val="left" />
                  </m:mcPr>
                </m:mc>
              </m:mcs>
              <m:ctrlPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
              </m:ctrlPr>
            </m:mPr>
            <m:mr>
              <m:e>
                <m:r>
                  <w:rPr>
                    <w:rFonts w:ascii="Cambria Math"
                    w:hAnsi="Cambria Math" />
                  </w:rPr>
                  <m:t>1</m:t>
                </m:r>
                <m:r>
                  <m:rPr>
                    <m:sty m:val="p" />
                  </m:rPr>
                  <w:rPr>
                    <w:rFonts w:ascii="Cambria Math"
                    w:hAnsi="Cambria Math" />
                  </w:rPr>
                  <m:t>=</m:t>
                </m:r>
              </m:e>
              <m:e>
                <m:r>
                  <w:rPr>
                    <w:rFonts w:ascii="Cambria Math"
                    w:hAnsi="Cambria Math" />
                  </w:rPr>
                  <m:t>2</m:t>
                </m:r>
              </m:e>
              <m:e />
              <m:e>
                <m:r>
                  <w:rPr>
                    <w:rFonts w:ascii="Cambria Math"
                    w:hAnsi="Cambria Math" />
                  </w:rPr>
                  <m:t>3</m:t>
                </m:r>
              </m:e>
            </m:mr>
            <m:mr>
              <m:e>
                <m:r>
                  <m:rPr>
                    <m:sty m:val="p" />
                  </m:rPr>
                  <w:rPr>
                    <w:rFonts w:ascii="Cambria Math"
                    w:hAnsi="Cambria Math" />
                  </w:rPr>
                  <m:t>=</m:t>
                </m:r>
              </m:e>
              <m:e>
                <m:r>
                  <w:rPr>
                    <w:rFonts w:ascii="Cambria Math"
                    w:hAnsi="Cambria Math" />
                  </w:rPr>
                  <m:t>4</m:t>
                </m:r>
              </m:e>
              <m:e />
              <m:e>
                <m:r>
                  <w:rPr>
                    <w:rFonts w:ascii="Cambria Math"
                    w:hAnsi="Cambria Math" />
                  </w:rPr>
                  <m:t>5</m:t>
                </m:r>
              </m:e>
            </m:mr>
          </m:m>
        </m:oMath>
      </m:oMathPara>

while with U+2588, I get the following XML:

      <m:oMathPara>
        <m:oMath>
          <m:eqArr>
            <m:eqArrPr>
              <m:ctrlPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" w:cs="Arial" />
                  <w:color w:val="24292F" />
                  <w:sz w:val="21" />
                  <w:szCs w:val="21" />
                  <w:shd w:val="clear" w:color="auto"
                  w:fill="FFFFFF" />
                </w:rPr>
              </m:ctrlPr>
            </m:eqArrPr>
            <m:e>
              <m:r>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>1</m:t>
              </m:r>
              <m:r>
                <m:rPr>
                  <m:sty m:val="p" />
                </m:rPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>=</m:t>
              </m:r>
              <m:r>
                <m:rPr>
                  <m:sty m:val="p" />
                </m:rPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>&amp;</m:t>
              </m:r>
              <m:r>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>2</m:t>
              </m:r>
              <m:r>
                <m:rPr>
                  <m:sty m:val="p" />
                </m:rPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>&amp;</m:t>
              </m:r>
              <m:r>
                <m:rPr>
                  <m:sty m:val="p" />
                </m:rPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>&amp;</m:t>
              </m:r>
              <m:r>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>3</m:t>
              </m:r>
              <m:ctrlPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
              </m:ctrlPr>
            </m:e>
            <m:e>
              <m:r>
                <m:rPr>
                  <m:sty m:val="p" />
                </m:rPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>=&amp;</m:t>
              </m:r>
              <m:r>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>4</m:t>
              </m:r>
              <m:r>
                <m:rPr>
                  <m:sty m:val="p" />
                </m:rPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>&amp;&amp;</m:t>
              </m:r>
              <m:r>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>5</m:t>
              </m:r>
              <m:ctrlPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
              </m:ctrlPr>
            </m:e>
          </m:eqArr>
        </m:oMath>
      </m:oMathPara>

The first (current behavior) is actually closer in appearance to what pdflatex gives us.

Answer 4 · 2023-03-17T07:02:06.000Z

No, the second is closer!
Your 2 and 3 are crowded together because there are no spaces added. Please try:

█(1=&2&  &3@=&4&  &5)

I advocate that ■ corresponds to {matrix} and █ corresponds to {aligned}, because of the meaning of &.
In both ■ in docx and {matrix} in LaTeX, & separates columns.
In both █ in docx and {aligned} in LaTeX, an odd-numbered & marks an alignment point and an even-numbered & marks a padding point.
Do you see that in your first case the space between 2 and 3 is too wide?
That is because the 2nd & is treated as the start of a new, empty column rather than as a padding point.
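
The odd/even rule described above can be sketched in a few lines of Python. The helper `classify_ampersands` is hypothetical and only models the convention stated in this comment; it is not part of Word, pandoc, or texmath, and it does not handle escaped `\&`.

```python
def classify_ampersands(row: str) -> list[str]:
    """Label each & in one row of an aligned structure.

    The 1st, 3rd, 5th, ... & are treated as alignment points; the
    2nd, 4th, ... & are treated as padding points between aligned groups.
    """
    labels = []
    count = 0
    for ch in row:
        if ch == "&":
            count += 1
            labels.append("align" if count % 2 == 1 else "pad")
    return labels

print(classify_ampersands("1=&2&&3"))  # ['align', 'pad', 'align']
```

Under a matrix reading, the same row would instead have three column separators and therefore four cells, one of them empty.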

Answer 5 · 2023-03-17T07:38:11.000Z

I understand that format conversion is not always perfect or exact. If the cost of the change is too high, please close this issue.

Answer 6 · 2023-03-17T16:20:43.000Z

I'll keep this open. It would not be a small change, because currently we don't have an AST element for aligned environments that is separate from that for matrices -- we use the same form for both. That's not ideal.
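
As a rough illustration of what separate AST elements could buy, here is a small Python sketch. It is purely hypothetical: texmath is written in Haskell, and the class names `Matrix` and `Aligned` and the function `omml_container` are invented for this example, not taken from its code. The only grounded part is the mapping itself, which follows the two Word outputs shown earlier in this thread (`m:m` for U+25A0 and `m:eqArr` for U+2588).

```python
from dataclasses import dataclass

@dataclass
class Matrix:
    """Hypothetical node: every & starts a new column."""
    rows: list[list[str]]

@dataclass
class Aligned:
    """Hypothetical node: same cell layout, but & alternates between
    alignment points and padding points."""
    rows: list[list[str]]

def omml_container(node) -> str:
    """Choose the OMML element a writer could emit for each node type."""
    if isinstance(node, Aligned):
        return "m:eqArr"   # what Word produced for U+2588
    if isinstance(node, Matrix):
        return "m:m"       # what Word produced for U+25A0
    raise TypeError(f"unsupported node: {type(node).__name__}")

cells = [["1=", "2", "", "3"], ["=", "4", "", "5"]]
print(omml_container(Matrix(cells)))
print(omml_container(Aligned(cells)))
```

With a single shared node type, a writer has no way to make this choice, which is why the change is not a small one.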