Fix splitting delimiter runs
bachbui commented
We have a utility in our CommonMark renderer that adjusts the boundaries of certain annotations when they would otherwise produce an invalid delimiter run. This logic assumed that the rules for valid delimiter runs were the same regardless of the specific delimiter character, but this is not the case.
Here are the rules for delimiters, from least to most restrictive:
If the delimiter is `^` or `~`:
- the inner boundary must not be a whitespace character
If the delimiter is `*`, `**`, or `~~`:
- the inner boundary for a delimiter run must not be a whitespace character
- the outer boundary for a delimiter run must be a whitespace or punctuation character if the inner boundary is a punctuation character
If the delimiter is `_` or `__`:
- the inner boundary for a delimiter run must not be a whitespace character
- the outer boundary for a delimiter run must be a whitespace or punctuation character
Here are some examples of the correct behavior. In these examples, square brackets represent the delimiter boundary, an underscore represents a whitespace character, and a dash represents a punctuation character:
| Original | Split for `^`, `~` | Split for `*`, `**`, `~~` | Split for `_`, `__` |
|---|---|---|---|
| `[_a_b]` | `_[a_b]` | `_[a_b]` | `_[a_b]` |
| `a[-b]` | `a[-b]` | `a-[b]` | `a-[b]` |
| `a[b_c]` | `a[b_c]` | `a[b_c]` | `ab_[c]` |
| `a[bc]` | `a[bc]` | `a[bc]` | `abc[]` |
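
The rules and examples above can be sketched roughly as follows. This is a minimal illustration in Python, not the renderer's actual code: the names `boundary_ok`, `shrink_span`, and `split_notation` are hypothetical, and it assumes that the start or end of the text counts as whitespace, that "punctuation" means ASCII punctuation plus Unicode `P*` categories, and that an invalid boundary is fixed by shrinking the span inward until it is valid or empty.

```python
import string
import unicodedata

# Delimiter groups, from least to most restrictive (taken from the rules above).
LOOSE = {"^", "~"}
MEDIUM = {"*", "**", "~~"}
STRICT = {"_", "__"}


def is_space(ch: str) -> bool:
    # Assumption: the start or end of the text (passed as "") counts as whitespace.
    return ch == "" or ch.isspace()


def is_punct(ch: str) -> bool:
    # Assumption: ASCII punctuation plus Unicode "P*" categories.
    return ch != "" and (
        ch in string.punctuation or unicodedata.category(ch).startswith("P")
    )


def boundary_ok(inner: str, outer: str, delimiter: str) -> bool:
    """Check one side of a span against the rules for the given delimiter."""
    if is_space(inner):
        return False  # applies to every delimiter
    if delimiter in LOOSE:
        return True
    outer_ok = is_space(outer) or is_punct(outer)
    if delimiter in MEDIUM:
        return outer_ok or not is_punct(inner)
    if delimiter in STRICT:
        return outer_ok
    raise ValueError(f"unknown delimiter: {delimiter!r}")


def char_at(text: str, i: int) -> str:
    # Return "" for positions outside the text.
    return text[i] if 0 <= i < len(text) else ""


def shrink_span(text: str, start: int, end: int, delimiter: str) -> tuple[int, int]:
    """Shrink the half-open span [start, end) until both sides are valid or it is empty."""
    while start < end and not boundary_ok(
        char_at(text, start), char_at(text, start - 1), delimiter
    ):
        start += 1
    while end > start and not boundary_ok(
        char_at(text, end - 1), char_at(text, end), delimiter
    ):
        end -= 1
    return start, end


def split_notation(notation: str, delimiter: str) -> str:
    """Apply shrink_span to the bracket notation used in the table above."""
    start = notation.index("[")
    end = notation.index("]") - 1
    # In the notation, "_" stands for a whitespace character.
    text = notation.replace("[", "").replace("]", "").replace("_", " ")
    start, end = shrink_span(text, start, end, delimiter)
    marked = text[:start] + "[" + text[start:end] + "]" + text[end:]
    return marked.replace(" ", "_")


print(split_notation("a[b_c]", "_"))  # ab_[c]
```

Under those assumptions, `split_notation` reproduces every cell of the table; for example, `split_notation("a[-b]", "*")` gives `a-[b]`, while `split_notation("a[-b]", "^")` leaves the span unchanged.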