jgm/texmath

Improper placement of subscript in math block (but not when placed inline)

Closed this issue · 3 comments

Hello, I am using Pandoc version 2.19.2 on a Mac. I've noticed a typesetting error when producing a docx file, whether straight from the command line or when converting from a TeX file using pandoc. The error arises when typesetting a specific mathematical expression within a math block, but it does not arise when typesetting the very same expression inline. The expression in LaTeX is \succcurlyeq_{i}, although the same issue arises with \succeq_{i}. (MS Word doesn't seem to distinguish between these two symbols, but I don't think that flaw in Word accounts for this typesetting bug.)

Here is an inline example produced using the command line:

echo 'This is an inline instance of $U_{i}(a) \geq U_{i}(b) \ \text{if\ and\ only\ if} \ a \succcurlyeq_{i} b$. Did it work?' | pandoc -o test.docx && open test.docx

[Screenshot: output of the inline example]

And here is a math block example using the same expression:

echo 'This is a standalone instance of $$U_{i}(a) \geq U_{i}(b) \ \text{if\ and\ only\ if} \ a \succcurlyeq_{i} b$$ Did it work?' | pandoc -o test.docx && open test.docx

[Screenshot: output of the math block example]

When the expression is part of a math block, the subscripted i is improperly set below the \succcurlyeq symbol.

Thanks for Pandoc!

jgm commented

Same thing happens with \geq.
Transferring to jgm/texmath.

jgm commented

OK, I have a diagnosis.
From TeXMath.Readers.TeX:

superOrSubscripted :: Maybe Bool -> Bool -> Exp -> TP Exp
superOrSubscripted limits convertible a = try $ do
  c <- oneOf "^_"
  spaces
  b <- expr
  case c of
       '^' -> return $ case limits of
                        Just True  -> EOver False a b
                        Nothing
                          | convertible || isConvertible a -> EOver True a b
                          | isUnderover a -> EOver False a b
                        _          -> ESuper a b
       '_' -> return $ case limits of
                        Just True  -> EUnder False a b
                        Nothing
                          | convertible || isConvertible a -> EUnder True a b  -- <----- HERE
                          | isUnderover a -> EUnder False a b
                        _          -> ESub a b
       _   -> mzero

In this case we are taking the path marked HERE, because isConvertible is true of a, which in this case is (ESymbol Rel "\8829").

Here's the definition of isConvertible:

isConvertible :: Exp -> Bool
isConvertible (EMathOperator x) = x `elem` convertibleOps
  where convertibleOps = [ "lim","liminf","limsup","inf","sup"
                         , "min","max","Pr","det","gcd"
                         ]
isConvertible (ESymbol Rel _) = True
isConvertible (ESymbol Bin _) = True
isConvertible (ESymbol Op x) = x `elem` convertibleSyms
  where convertibleSyms = ["\x2211","\x220F","\x22C2",
           "\x22C3","\x22C0","\x22C1","\x2A05","\x2A06",
           "\x2210","\x2A01","\x2A02","\x2A00","\x2A04"] -- ∑∏⋂⋃⋀⋁⨅⨆∐⨁⨂⨀⨄
isConvertible _ = False

So you can see that any symbol with class Rel (relation) is being considered convertible, which means that its behavior with _ will be like \lim (where the subscript moves to being below the lim operator in display contexts).
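The effect can be reproduced in isolation with a small, self-contained model. This is only a sketch: the `Exp` and `SymbolType` types below are cut-down stand-ins for the texmath types, `String` is used for symbol names, and `attachSub` is a hypothetical pure helper that mimics just the `'_'` branch for the case where `limits` is `Nothing` and `isUnderover` is false. It is not the parser code itself.

```haskell
-- Cut-down stand-ins for the texmath types (assumption: the real
-- types have many more constructors).
data SymbolType = Rel | Bin | Op | Ord
  deriving (Eq, Show)

data Exp
  = EMathOperator String
  | ESymbol SymbolType String
  | EIdentifier String
  | ESub Exp Exp
  | EUnder Bool Exp Exp
  deriving (Eq, Show)

-- isConvertible as quoted in the thread.
isConvertible :: Exp -> Bool
isConvertible (EMathOperator x) = x `elem` convertibleOps
  where convertibleOps = [ "lim","liminf","limsup","inf","sup"
                         , "min","max","Pr","det","gcd"
                         ]
isConvertible (ESymbol Rel _) = True
isConvertible (ESymbol Bin _) = True
isConvertible (ESymbol Op x) = x `elem` convertibleSyms
  where convertibleSyms = ["\x2211","\x220F","\x22C2",
           "\x22C3","\x22C0","\x22C1","\x2A05","\x2A06",
           "\x2210","\x2A01","\x2A02","\x2A00","\x2A04"]
isConvertible _ = False

-- Hypothetical helper: the '_' branch with limits == Nothing,
-- ignoring the isUnderover guard.
attachSub :: Exp -> Exp -> Exp
attachSub a b
  | isConvertible a = EUnder True a b   -- subscript may move below the base
  | otherwise       = ESub a b          -- ordinary subscript

main :: IO ()
main = do
  -- A relation symbol is treated the same way as "lim".
  print (attachSub (ESymbol Rel "\8829") (EIdentifier "i"))
  print (attachSub (EMathOperator "lim") (EIdentifier "n"))
  -- A plain identifier gets an ordinary subscript.
  print (attachSub (EIdentifier "U") (EIdentifier "i"))
```

In this model the relation symbol and `lim` both come back wrapped in `EUnder True`, while the identifier comes back as `ESub`, which matches the path marked HERE.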

This seems wrong, but I can't recall why it's there or where the notion of "convertible" comes from. The code goes back to 2010!

jgm commented

Did some experiments with TeX:
[Image: output of the TeX experiments]

Looks like we should keep the first clause of isConvertible, remove the second and third, and modify the fourth, keeping only ∑ (2211), ∏ (220F), ∐ (2210), \bigwedge (22C0), \bigvee (22C1), \bigcap (22C2), \bigcup (22C3), \bigsqcap (2A05), and \bigsqcup (2A06).
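A minimal sketch of what that change could look like, written against cut-down stand-in types so it compiles on its own. The `Exp` and `SymbolType` definitions here are assumptions for illustration (the real texmath types are larger), and this is one reading of the proposal above, not necessarily the change that was committed.

```haskell
-- Cut-down stand-ins for the texmath types (assumption).
data SymbolType = Rel | Bin | Op | Ord
  deriving (Eq, Show)

data Exp
  = EMathOperator String
  | ESymbol SymbolType String
  | EIdentifier String
  deriving (Eq, Show)

-- Proposed isConvertible: the Rel and Bin clauses are gone, and the
-- Op clause keeps only the nine symbols listed in the comment above.
isConvertible :: Exp -> Bool
isConvertible (EMathOperator x) = x `elem` convertibleOps
  where convertibleOps = [ "lim","liminf","limsup","inf","sup"
                         , "min","max","Pr","det","gcd"
                         ]
isConvertible (ESymbol Op x) = x `elem` convertibleSyms
  where convertibleSyms = [ "\x2211", "\x220F", "\x2210"  -- sum, product, coproduct
                          , "\x22C0", "\x22C1"            -- \bigwedge, \bigvee
                          , "\x22C2", "\x22C3"            -- \bigcap, \bigcup
                          , "\x2A05", "\x2A06"            -- \bigsqcap, \bigsqcup
                          ]
isConvertible _ = False

main :: IO ()
main = do
  print (isConvertible (ESymbol Rel "\8829"))   -- False: relations no longer convert
  print (isConvertible (ESymbol Bin "+"))       -- False: binary operators no longer convert
  print (isConvertible (EMathOperator "lim"))   -- True: first clause unchanged
  print (isConvertible (ESymbol Op "\x2211"))   -- True: the sum symbol is kept
  print (isConvertible (ESymbol Op "\x2A01"))   -- False: 2A01 is dropped from the list
```

With this version, the `\succcurlyeq_{i}` case from the report would no longer take the `EUnder True` path and would fall through to an ordinary subscript.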