commonmark/cmark

Parsing of "____a__!__!___"

xiaq commented

Sorry for the line noise. This input was discovered while I was fuzzing my Markdown parser, and I haven't been able to reduce it to a smaller case.

The following Markdown:

____a__!__!___

generates the following output:

__<strong>a</strong>!<strong>!</strong>_

However, none of the rules in https://spec.commonmark.org/0.30/#emphasis-and-strong-emphasis seem to prohibit the second _ and the last _ from being paired to form another emphasis, which would give:

_<em><strong>a</strong>!<strong>!</strong></em>

Interestingly, if I replace all _ with * I do get the desired output, and in this particular case the distinction shouldn't matter.
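
One way to check the claim that the _ / * distinction shouldn't matter here is to classify each delimiter run by the spec's left-flanking and right-flanking definitions and compare the resulting "can open" / "can close" flags for both variants. The following is only a rough sketch, not code from cmark or commonmark.js: `classify_runs` and its helpers are hypothetical names, it handles a single line, and it ignores backslash escapes, code spans, and other inline constructs.

```python
import re
import string
import unicodedata


def is_whitespace(ch):
    # The start and end of the line (passed in as "") count as whitespace.
    return ch == "" or ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def is_punctuation(ch):
    # ASCII punctuation, or any Unicode general category starting with "P".
    return ch != "" and (
        ch in string.punctuation or unicodedata.category(ch).startswith("P")
    )


def classify_runs(text):
    """Return (run, can_open, can_close) for each delimiter run in one line."""
    results = []
    for m in re.finditer(r"\*+|_+", text):
        before = text[m.start() - 1] if m.start() > 0 else ""
        after = text[m.end()] if m.end() < len(text) else ""
        left = not is_whitespace(after) and (
            not is_punctuation(after)
            or is_whitespace(before)
            or is_punctuation(before)
        )
        right = not is_whitespace(before) and (
            not is_punctuation(before)
            or is_whitespace(after)
            or is_punctuation(after)
        )
        if m.group()[0] == "*":
            # For *, flanking alone decides opening and closing.
            can_open, can_close = left, right
        else:
            # For _, a run that is both-flanking needs adjacent punctuation.
            can_open = left and (not right or is_punctuation(before))
            can_close = right and (not left or is_punctuation(after))
        results.append((m.group(), can_open, can_close))
    return results


for variant in ("____a__!__!___", "****a**!**!***"):
    print(variant, [(o, c) for _, o, c in classify_runs(variant)])
```

Under these assumptions both variants get the same flags for all four runs: the first run can only open, the second can only close, the third can do both, and the last can only close. So the extra punctuation conditions for _ do not change anything for this input.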

commonmark.js has the same behavior: https://spec.commonmark.org/dingus/?text=____a__!__!___. But GitHub's Markdown implementation has the behavior I expect: _a!!

jgm commented

Interestingly, my Haskell implementation (jgm/commonmark-hs) gets

<p>_<em><strong>a</strong>!<strong>!</strong></em></p>

jgm commented

As do older versions of commonmark.js, so this must have been introduced in a change after 0.28.1.