jgm/texmath

Wrong equation conversion from docx

Closed this issue · 4 comments

Explain the problem.

When converting from docx to markdown, some simple equations are converted incorrectly.

[image: the equation as it appears in the docx]

--> $x^{2}$

See the attached example (created with Google Docs and downloaded as docx; it opens fine in macOS Pages).

Prueba.docx

Pandoc version?

macOS 12.3
pandoc 3.1.8
Features: +server +lua
Scripting engine: Lua 5.4

jgm commented

The XML for the equation is:

  <m:oMath>
  <m:nary>
      <m:naryPr>
      <m:chr m:val=""/>
      <m:ctrlPr>
          <w:rPr/>
      </m:ctrlPr>
      </m:naryPr>
      <m:sub>
      <m:r>
          <w:rPr/>
          <m:t xml:space="preserve">i=1                                       
          </m:t>
      </m:r>
      </m:sub>
      <m:sup>
      <m:r>
          <w:rPr/>
          <m:t xml:space="preserve">n                                         
          </m:t>
      </m:r>
      </m:sup>
  </m:nary>
  <m:sSup>
      <m:sSupPr>
      <m:ctrlPr>
          <w:rPr/>
      </m:ctrlPr>
      </m:sSupPr>
      <m:e>
      <m:r>
          <w:rPr/>
          <m:t xml:space="preserve">x                                         
          </m:t>
      </m:r>
      </m:e>
      <m:sup>
      <m:r>
          <w:rPr/>
          <m:t xml:space="preserve">2                                         
          </m:t>
      </m:r>
      </m:sup>
  </m:sSup>
  </m:oMath>

jgm commented

Apparently m:naryPr can contain an m:chr element that specifies the operator character.
I don't think we handle that currently. Moving this to texmath.
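
As a rough illustration of what reading that operator character involves, here is a minimal sketch in Python using only the standard library (not texmath's Haskell code). The sample XML and the helper name `nary_operator` are hypothetical, and the `∑` value is an assumption made for the example, since `m:val` appears empty in the XML pasted above.

```python
import xml.etree.ElementTree as ET

# OMML (Office Math) namespace used by the m: prefix in docx files.
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"
NS = {"m": M_NS}

# Hypothetical sample; the operator character is assumed for illustration.
SAMPLE = f"""
<m:oMath xmlns:m="{M_NS}">
  <m:nary>
    <m:naryPr>
      <m:chr m:val="∑"/>
    </m:naryPr>
    <m:sub><m:r><m:t>i=1</m:t></m:r></m:sub>
    <m:sup><m:r><m:t>n</m:t></m:r></m:sup>
    <m:e><m:r><m:t>x</m:t></m:r></m:e>
  </m:nary>
</m:oMath>
"""


def nary_operator(nary):
    """Return the m:val of m:naryPr/m:chr, or None if there is no m:chr."""
    chr_el = nary.find("m:naryPr/m:chr", NS)
    if chr_el is None:
        return None
    # Namespaced attributes are keyed as "{namespace}localname".
    return chr_el.get(f"{{{M_NS}}}val")


root = ET.fromstring(SAMPLE)
nary = root.find("m:nary", NS)
print(nary_operator(nary))
```

The helper returns `None` rather than guessing a default operator, so the caller decides what to do when `m:chr` is absent.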

jgm commented

Actually, we do handle m:chr.
The problem is that we expect an m:nary element to contain one m:e element, and the one above has none (the m:sSup follows it as a sibling instead).
Indeed, it looks like that's required, but perhaps some software is more forgiving...
https://schemas.liquid-technologies.com/OfficeOpenXML/2006/?page=omath.html
(look under "min occurrences")
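
One way a more forgiving reader could cope with this is sketched below in Python with the standard library. This is not texmath's actual fix. It assumes that the element following an operand-less `m:nary` is its intended operand, which is a guess about what the producing software meant; the helper names `tag` and `adopt_missing_operands` and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET

# OMML (Office Math) namespace used by the m: prefix in docx files.
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"


def tag(name):
    """Build the ElementTree form of an m:-prefixed tag name."""
    return f"{{{M_NS}}}{name}"


# Hypothetical sample mirroring the shape reported above: m:nary has no m:e.
SAMPLE = f"""
<m:oMath xmlns:m="{M_NS}">
  <m:nary>
    <m:sub><m:r><m:t>i=1</m:t></m:r></m:sub>
    <m:sup><m:r><m:t>n</m:t></m:r></m:sup>
  </m:nary>
  <m:sSup>
    <m:e><m:r><m:t>x</m:t></m:r></m:e>
    <m:sup><m:r><m:t>2</m:t></m:r></m:sup>
  </m:sSup>
</m:oMath>
"""


def adopt_missing_operands(parent):
    """Give each m:nary child that lacks m:e an m:e wrapping its next sibling.

    Assumption: the next sibling is the intended operand. If there is no
    next sibling, the new m:e is left empty. Returns the number repaired.
    """
    repaired = 0
    children = list(parent)  # snapshot, since parent is modified below
    for i, child in enumerate(children):
        if child.tag != tag("nary") or child.find(tag("e")) is not None:
            continue
        operand = ET.SubElement(child, tag("e"))
        if i + 1 < len(children):
            sibling = children[i + 1]
            parent.remove(sibling)
            operand.append(sibling)
        repaired += 1
    return repaired


root = ET.fromstring(SAMPLE)
print(adopt_missing_operands(root))
print([child.tag.split("}")[1] for child in root])
```

After the repair, the tree has the shape the schema requires (one `m:e` inside `m:nary`), so a strict reader can process it unchanged. Whether adopting the sibling is the right interpretation depends on the producing software.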

Thanks!!!