jgm/texmath

Minus for subtraction and minus for negative numbers not differentiated in default HTML math output

frabjous opened this issue · 5 comments

If you use a LaTeX-based output, LaTeX is smart enough to distinguish between minus signs (usually hyphens in the input in math mode) used for subtraction (a binary operation) and minus signs used for negatives (a monadic operation), and spaces them differently. E.g., $5-3$ will give "5 − 3" (spaces), whereas just $-3$ will give "−3" (no space).

But with pandoc's default method for handling math in HTML-based output (and presumably other non-LaTeX formats), you always get a space before and after a minus sign unless it is inside parentheses, which looks pretty bad for something as simple as $-3$.

I realize it may be a stretch to expect it to differentiate between them, but it would certainly lead to better output.

As a related aside, I gather it was an intentional decision not to put spaces around binary and relational operators when inside parentheses. But it's an awful decision, and the results look both bad and inconsistent. I can't believe anyone would prefer "∀x(Fx→Gx)" to "∀x(Fx → Gx)", or "2 + (2+3)" to "2 + (2 + 3)". I would very very very much appreciate an option to always have spaces for binary and relational operators, or perhaps an option to wrap them in a span with a CSS class so the spacing can be tweaked via CSS.

(Pandoc 3.1.11.1 on ArchLinux x86_64)

jgm commented

I agree on all these points. Transferring this to texmath, which actually does the conversion.

jgm commented

One issue should perhaps be addressed in the tex reader:

% texmath -t native         
-1
[ ESymbol Bin "\8722" , ENumber "1" ]

Ideally this would not be parsed as a symbol in category Bin (binary operator). But this would require a bit more sophistication than we currently have in the parser.

For the other issue we have

% texmath -t native         
2 + (2 + 3)
[ ENumber "2"
, ESymbol Bin "+"
, EDelimited
    "("
    ")"
    [ Right (ENumber "2")
    , Right (ESymbol Bin "+")
    , Right (ENumber "3")
    ]
]

which is fine, but something is going on in the pandoc writer, which adds the spacing around the binary operator in one context but not the other:

% texmath -t pandoc 
2+(2+3)
[ Str "2"
, Str "\8197"
, Str "+"
, Str "\8197"
, Str "("
, Str "2"
, Str "+"
, Str "3"
, Str ")"
]

jgm commented

Pushed a change that fixes the second issue. Not yet the first.
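
For readers following along, the behavior wanted for the second issue (the same padding around a binary operator at every nesting depth) can be illustrated with a toy renderer. This is only a sketch and not the change that was pushed: `Expr`, `pad`, and `render` are hypothetical names, and the expression type is a simplified stand-in for texmath's real one.

```haskell
-- Toy expression type, loosely modeled on the native output shown above.
-- Illustrative assumption only; this is not texmath's real Exp type.
data Expr
  = Num String                      -- a number such as "2"
  | BinSym String                   -- a binary operator such as "+"
  | Delimited String String [Expr]  -- opening delimiter, closing delimiter, contents

-- U+2005 (four-per-em space), the padding character seen in the pandoc output above.
pad :: String
pad = "\8197"

-- Render a list of expressions, padding every binary operator the same way
-- whether it appears at top level or inside a delimited group.
render :: [Expr] -> String
render = concatMap go
  where
    go (Num n) = n
    go (BinSym s) = pad ++ s ++ pad
    go (Delimited open close inner) = open ++ render inner ++ close
```

Because `go` calls `render` on the contents of a delimited group, the inner `+` in `2+(2+3)` gets the same padding as the outer one, instead of being emitted bare.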

jgm commented

Looks like we already have code in the TeX reader to change some Bins to Ops. This code is too simple; we need to look at the atom before the symbol as well as the one after it.

jgm commented

https://tex.stackexchange.com/questions/392081/how-does-tex-figure-out-whether-a-should-be-typeset-as-unary-or-binary

The TeXbook (Appendix G) says:

  5. If the current item is a Bin atom, and if this was the first atom
     in the list, or if the most recent previous atom was Bin, Op, Rel, Open,
     or Punct, change the current Bin to Ord and continue with Rule 14.
     Otherwise continue with Rule 17.

  6. If the current item is a Rel or Close or Punct atom, and if the most
     recent previous atom was Bin, change that previous Bin to Ord. Continue with Rule 17.
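
The two rules quoted above amount to a single left-to-right pass that looks at the previously processed atom. A minimal sketch, assuming atoms are represented as (class, text) pairs: `AtomClass` and `fixBins` are hypothetical names chosen for illustration, not texmath's actual types or functions, and the pass covers only the two quoted rules.

```haskell
-- Hypothetical atom classes mirroring the categories named in the quoted rules.
data AtomClass = Ord | Op | Bin | Rel | Open | Close | Punct
  deriving (Show, Eq)

type Atom = (AtomClass, String)

-- Apply the two quoted rules in one left-to-right pass.
-- The accumulator holds the atoms processed so far, most recent first,
-- so "the most recent previous atom" is simply its head.
fixBins :: [Atom] -> [Atom]
fixBins = reverse . foldl step []
  where
    -- First rule: a Bin that starts the list, or follows Bin, Op, Rel,
    -- Open, or Punct, becomes Ord.
    step acc (Bin, s)
      | unaryContext acc = (Ord, s) : acc
    -- Second rule: a Rel, Close, or Punct turns an immediately preceding
    -- Bin into Ord.
    step ((Bin, p) : rest) (c, s)
      | c `elem` [Rel, Close, Punct] = (c, s) : (Ord, p) : rest
    -- Anything else is kept unchanged.
    step acc atom = atom : acc

    unaryContext [] = True
    unaryContext ((c, _) : _) = c `elem` [Bin, Op, Rel, Open, Punct]
```

With this pass, the minus in `-1` or `(-3)` ends up as Ord, while the minus in `5-3` stays Bin. Note that the check uses the class of the previous atom after it has been processed, so a Bin that was already demoted to Ord no longer triggers the first rule for the atom that follows it.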