remarkjs/remark-math

Prevent spaced inline math from being parsed

LikaKavkasidze opened this issue · 12 comments

Subject of the issue

Inline math parsing breaks a perfectly normal sentence that contains dollar signs, so we would like to prevent spaced inline math from being parsed.

Your environment

All OSes and versions are affected. The renderer needs to use at least remark-math and rehype-katex; I am not sure which of the two is causing the problem, although I would guess it is these regexes.

Actual behaviour

Right now, math is parsed even when it breaks the sentence; for instance:

Hey, I'd like to buy this computer, but $1,000 is really expensive, I'll buy the one at $600 instead.

Here, the content between the two dollar signs is parsed as math, although that is not what we would expect.

Expected behaviour

Inline math should only be parsed if no space is leading or trailing the content inside; in other words, the leading delimiter shall have no trailing space, and the trailing delimiter shall have no leading space.
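To make the proposed rule concrete, here is a minimal sketch in Python. It is only an illustration of the rule described above, not remark-math's tokenizer; the pattern and the helper name `find_inline_math` are made up for this example.

```python
import re

# Hypothetical pattern (an assumption for illustration, not remark-math's code):
# the opening "$" must be followed by a non-space character, and the
# closing "$" must be preceded by a non-space character.
INLINE_MATH = re.compile(r"\$([^\s$](?:[^$]*[^\s$])?)\$")


def find_inline_math(text: str) -> list[str]:
    """Return the raw content of every inline math span found in text."""
    return INLINE_MATH.findall(text)


sentence = (
    "Hey, I'd like to buy this computer, but $1,000 is really "
    "expensive, I'll buy the one at $600 instead."
)

# The second "$" is preceded by a space, so it cannot close a span.
print(find_inline_math(sentence))

# Both delimiters touch the content, so this is treated as math.
print(find_inline_math("The sum $a + b$ is positive."))
```

With this rule the sentence about prices yields no math span, while `$a + b$` still does.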

@TitiAlone Hi there, thanks for the detailed issue!

Could you expand on why you think math should be different than code?

Hey, I'd like to buy this computer, but `1,000 is really expensive, I'll buy the one at `600 instead.

Yields:

<p>Hey, I'd like to buy this computer, but <code>1,000 is really expensive, I'll buy the one at </code>600 instead.</p>

You can also escape dollars btw:

Hey, I'd like to buy this computer, but \$1,000 is really expensive, I'll buy the one at \$600 instead.
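For a forum that wants to apply this workaround automatically, the escaping could be done before the text reaches the Markdown processor. The following is a rough sketch in Python; the function name `escape_dollars` is hypothetical, and it deliberately escapes every dollar sign, so it only suits input where math is not wanted at all.

```python
import re


def escape_dollars(text: str) -> str:
    """Put a backslash in front of every "$" that is not already escaped.

    Simplification: a "$" preceded by an escaped backslash ("\\\\$")
    is treated as already escaped.
    """
    return re.sub(r"(?<!\\)\$", r"\\$", text)


print(escape_dollars("but $1,000 is really expensive, I'll buy the one at $600"))
```

Running the function twice gives the same result as running it once, because already-escaped dollars are skipped.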

Well, I was thinking about emphasis and strong emphasis that require no space between the opening * character and the content or between the closing * and the content.

I know this is not handled by remark right now, but GFM example 360 and CommonMark example 351 require it:

*This is not emphasis *

It seems unnatural to me to have to escape the $ sign with a backslash when it is used in its ordinary sense in a sentence; this is not a problem with backticks, which are seldom used in a normal sentence.

I know this is not made clear in any of the specifications, but we ran into the very specific case I included as an example on a forum, and it was quite unclear to the user why this happened.

Yeah, remark-parse currently doesn’t handle emphasis correctly according to CM, that’s a bug and should be fixed (in micromark).

Still, the question is whether math is like code or like emphasis.

Like emphasis/importance: One way to look at it is whether the marker ($, *, _, `) is used in “normal” text, and I think you’re right that the dollar ($) is used more frequently than the grave accent (`).
Btw, about normal text: you don’t have two dollars in a paragraph too often. One is fine and won’t turn into math.

Like code: Another way is that code and emphasis/importance are different: code has a raw value, emphasis/importance have children (they can include more emphasis, importance, links, etc).
And math is more similar to code in that sense: it only has raw content. Therefore I think it makes sense for math to operate more like code.
Emphasis only allows one marker, importance two. I think math is similar to code in that it could have one or two.
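The "raw value versus children" distinction can be pictured with simplified syntax-tree nodes. The sketch below uses plain Python dictionaries shaped like mdast nodes (`inlineCode`, `emphasis`, and the `inlineMath` node that remark-math produces); the helper `is_raw` is hypothetical and exists only for this illustration.

```python
# Code and math carry a single raw string.
inline_code = {"type": "inlineCode", "value": "a * b"}
inline_math = {"type": "inlineMath", "value": "a * b"}

# Emphasis carries child nodes, which may themselves be nested markup.
emphasis = {
    "type": "emphasis",
    "children": [
        {"type": "text", "value": "very "},
        {"type": "strong", "children": [{"type": "text", "value": "important"}]},
    ],
}


def is_raw(node: dict) -> bool:
    """True when the node holds raw text instead of parsed children."""
    return "value" in node and "children" not in node


for node in (inline_code, inline_math, emphasis):
    print(node["type"], is_raw(node))
```

Under this view math sits with code: its content is never parsed further as Markdown.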

Own rules: Finally, there’s Pandoc, whose rules markdown-it-katex follows (https://github.com/waylonflinn/markdown-it-katex#syntax), and which aligns with what you’re proposing.

What do we do, then? My opinion is to treat it like emphasis, but how do we decide?

Emphasis is not entirely the same as the third option (pandoc), did you see the digit handling?

I don't really see the difference; it is stated that

$20,000 and $30,000 won’t parse as math

which makes it the same?

Now I see the difference; I like this syntax; moreover, it's a "convention" in some way. No idea if it is simple to implement, but following this rule would be great.
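The digit handling is where the Pandoc-style rule goes beyond the plain "no spaces inside the delimiters" rule: as I understand Pandoc's documentation, the closing $ must also not be immediately followed by a digit. A minimal Python sketch of the two variants follows; both patterns are assumptions written for this comparison, not code from Pandoc, markdown-it-katex, or remark-math.

```python
import re

# Space rule only: no space right after the opening "$" and none right
# before the closing "$".
SPACE_RULE = re.compile(r"\$([^\s$](?:[^$]*[^\s$])?)\$")

# Pandoc-style rule (assumed from its documentation): same as above, and
# the closing "$" must not be followed by a digit.
PANDOC_RULE = re.compile(r"\$([^\s$](?:[^$]*[^\s$])?)\$(?!\d)")

price_range = "Tickets cost $5-$10 each."
formula = "Euler wrote $e^{i\\pi} + 1 = 0$ long ago."

print(SPACE_RULE.findall(price_range))   # the space rule alone still sees "5-" as math
print(PANDOC_RULE.findall(price_range))  # the digit check rejects it
print(PANDOC_RULE.findall(formula))      # real math is unaffected
```

The price range shows the practical difference: neither dollar sign has a space on its inner side, so only the digit check keeps it from being parsed as math.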

Yeah! I like going with the “standard” of the extension. And there’s tests.

Can you work on a PR implementing it?

I tried before opening the issue, but did not succeed. My main problem was detecting opening vs. closing $ with the current regex approach.

Alright, working on it. I’ve got the inline part done, but it seems there’s also two parts in block math where markdown-it-katex differs from remark-math, I’ll figure that out in the coming days!

Thanks a lot, I look forward to your changes.