readthedocs/commonmark.py

nested emphasis and strong tags when phrase ends in colon

swier opened this issue · 1 comment

swier commented

The following code results in a nested <strong> tag:

import commonmark
parser = commonmark.Parser()
renderer = commonmark.HtmlRenderer()

test = "normal**bold:**normal normal**bold bold**normal"
renderer.render(parser.parse(test))
# '<p>normal<strong>bold:<strong>normal normal</strong>bold bold</strong>normal</p>\n'

Using a single asterisk results in identical behavior but with nested <em> tags. Underscores aren't converted.

The following input behaves differently:

'normal**bold:**normal normal'

Which yields:

'<p>normal**bold:**normal normal</p>\n'

The combination of a trailing colon, no following space, and a later bold span seems to trigger this behavior, as the following inputs behave as expected:

'normal**bold**normal normal**bold bold**normal'
'normal**bold:** normal normal**bold bold**normal'

These yield:

'<p>normal<strong>bold</strong>normal normal<strong>bold bold</strong>normal</p>\n'
'<p>normal<strong>bold:</strong> normal normal<strong>bold bold</strong>normal</p>\n'

I'm using Python 3.7.2 with commonmark 0.8.1 (conda-forge distribution).

It seems like this is a problem with the CommonMark specification, and not with this Python library. I'm seeing the same problem here:

[screenshot: 2019-05-02-102102_1203x517_scrot]

I would raise this issue here so the spec can be updated: https://talk.commonmark.org/

Correct me if I'm wrong, though: if this really is a bug in commonmark.py, feel free to re-open this issue.
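
For anyone landing here later: the output reported above looks consistent with the left-flanking and right-flanking delimiter-run rules in the CommonMark spec. Below is a rough sketch of those rules for runs of `*`, using only the standard library. The helper name `classify_runs` is made up for illustration and is not part of commonmark.py. It only recognises ASCII punctuation, whereas the spec also counts Unicode punctuation.

```python
import re
import string


def classify_runs(text):
    """Return (offset, can_open, can_close) for each run of '*' in text.

    Simplified sketch: only ASCII punctuation is recognised here.
    """
    results = []
    for match in re.finditer(r"\*+", text):
        # The start and end of the line count as whitespace.
        before = text[match.start() - 1] if match.start() > 0 else " "
        after = text[match.end()] if match.end() < len(text) else " "
        before_ws, after_ws = before.isspace(), after.isspace()
        before_p = before in string.punctuation
        after_p = after in string.punctuation
        left_flanking = not after_ws and (not after_p or before_ws or before_p)
        right_flanking = not before_ws and (not before_p or after_ws or after_p)
        # For '*', a run can open emphasis iff it is left-flanking and
        # can close emphasis iff it is right-flanking.
        results.append((match.start(), left_flanking, right_flanking))
    return results


for run in classify_runs("normal**bold:**normal normal**bold bold**normal"):
    print(run)
```

Under these rules the second `**` (right after the colon) is preceded by punctuation and followed by a letter, so it can open emphasis but cannot close it. The third `**` then pairs with it, and the fourth pairs with the first, which would give the nested `<strong>` from the report. With a space after the colon, the same run becomes closer-only and the output is what you would expect.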