DOCX allows multiple w:t per w:r
TinoDidriksen opened this issue · 1 comments
TinoDidriksen commented
Google Docs and presumably other editors may produce DOCX files with XML akin to <w:p><w:r><w:t>...</w:t><w:br/><w:br/><w:t>...</w:t><w:br/></w:r></w:p> - that is, each w:r can have multiple w:t and w:br intermingled.
MS Word itself will produce <w:p><w:r><w:t>...</w:t></w:r><w:r><w:br/></w:r><w:r><w:br/><w:t>...</w:t></w:r><w:r><w:br/></w:r></w:p> - that is, each w:r holds at most one w:t.
Unfortunately, the schema does allow multiple w:t per w:r, so we can't rely on Word's one-per-run layout: http://www.datypic.com/sc/ooxml/e-w_r-2.html
(cf. apertium/apertium#110)
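For illustration, here is a minimal Python sketch of reading a run by walking its children in document order rather than expecting a single w:t. This is not how apertium's DOCX handling works; the helper names (run_text, paragraph_text) and the sample strings are hypothetical, and treating every w:br as a newline is a simplifying assumption (it ignores w:br variants such as page breaks).

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, in ElementTree's "{uri}tag" form.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"


def run_text(run):
    """Join every w:t in a w:r in document order; w:br becomes a newline (assumption)."""
    parts = []
    for child in run:
        if child.tag == W + "t":
            parts.append(child.text or "")
        elif child.tag == W + "br":
            parts.append("\n")
    return "".join(parts)


def paragraph_text(xml):
    """Parse a single w:p fragment and concatenate the text of all its runs."""
    para = ET.fromstring(xml)
    return "".join(run_text(r) for r in para.iter(W + "r"))


# Hypothetical samples mirroring the two layouts described above.
GDOCS = ('<w:p xmlns:w="%s"><w:r><w:t>one</w:t><w:br/><w:br/>'
         '<w:t>two</w:t><w:br/></w:r></w:p>' % W_NS)
WORD = ('<w:p xmlns:w="%s"><w:r><w:t>one</w:t></w:r><w:r><w:br/></w:r>'
        '<w:r><w:br/><w:t>two</w:t></w:r><w:r><w:br/></w:r></w:p>' % W_NS)

if __name__ == "__main__":
    print(repr(paragraph_text(GDOCS)))
    print(repr(paragraph_text(WORD)))
```

Both layouts give the same text here, because nothing in the loop assumes one w:t per w:r. Code that instead looks up only the first w:t of each run would silently drop the second one in the Google Docs layout.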
TinoDidriksen commented
Probably also an issue for PPTX's a:t