elliotgao2/tomd

<b> bold </b> only works inside <p> </p>

tomd.convert('<p><b> bold </b></p>')  # '\n** bold **\n', works
tomd.convert('<b> bold </b>')  # '', does not work

Maybe pyquery can be useful here, something like this:

from pyquery import PyQuery as pq
from tomd import MARKDOWN

html = "<b> bold </b>"
doc = pq(html)
for elm, val in MARKDOWN.items():
    for item in doc(elm).items():
        # sketch: replace item.html() with val[0] + item.text() + val[1]
        pass

@yucongo The <b> tag is an inline tag, which should sit inside a block tag. I wonder when you would use <b> outside a block tag like <p> or <div>.
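
If wrapping the input is acceptable, one workaround is to put a bare inline fragment inside <p> before converting. This is a minimal sketch, not part of tomd: `wrap_inline` and `BLOCK_TAGS` are hypothetical names, and the list of block tags is an assumption that may need adjusting.

```python
import re

# Hypothetical helper, not part of tomd. BLOCK_TAGS is an assumed,
# non-exhaustive list of tags treated as block-level.
BLOCK_TAGS = ("p", "div", "h1", "h2", "h3", "h4", "h5", "h6",
              "ul", "ol", "blockquote", "pre", "table")

# Matches optional whitespace followed by the opening of a block tag.
_BLOCK_START = re.compile(r"\s*<(%s)\b" % "|".join(BLOCK_TAGS), re.IGNORECASE)


def wrap_inline(html):
    """Return html unchanged if it starts with a block tag, else wrap it in <p>."""
    if _BLOCK_START.match(html):
        return html
    return "<p>" + html + "</p>"


print(wrap_inline("<b> bold </b>"))         # <p><b> bold </b></p>
print(wrap_inline("<p><b> bold </b></p>"))  # <p><b> bold </b></p>
```

Passing the result to tomd.convert should then follow the same path as the first call in the report, the one that already works.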

I worked it out using pyquery, at least well enough for my needs:

from pyquery import PyQuery as pq
from tomd import MARKDOWN

html = '''
<h1>h1</h1>
<h2>h2</h2><h3>h3</h3>
<h4>h4</h4>
<del>del</del>
<b>bold</b>
<i>italic</i>
<b><i>bold italic</i></b>'''

doc = pq(html)

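# each MARKDOWN value is used as a (prefix, suffix) pair of Markdown delimiters
for elm, val in MARKDOWN.items():
    for item in doc(elm).items():
        # swap the element for its inner HTML wrapped in the delimiters
        item.replace_with(val[0] + item.html() + val[1])
print(doc.text())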

Output

# h1


## h2

### h3


#### h4

~~del~~
**bold**
*italic*
***bold italic***
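
For anyone who would rather not add pyquery, the same prefix/suffix substitution can be sketched with the standard library's html.parser. This is only an illustration under assumptions: `DELIMITERS`, `TagToMarkdown` and `to_markdown` are hypothetical names, and the table is a small hand-written stand-in for tomd's MARKDOWN, whose actual values may differ.

```python
from html.parser import HTMLParser

# Hypothetical delimiter table: tag name -> (prefix, suffix).
# A stand-in for tomd.MARKDOWN; the real values in tomd may differ.
DELIMITERS = {
    "h1": ("# ", "\n"),
    "h2": ("## ", "\n"),
    "h3": ("### ", "\n"),
    "h4": ("#### ", "\n"),
    "b": ("**", "**"),
    "i": ("*", "*"),
    "del": ("~~", "~~"),
}


class TagToMarkdown(HTMLParser):
    """Collects text, emitting a prefix/suffix for each known tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Unknown tags contribute nothing; their text is still kept.
        self.parts.append(DELIMITERS.get(tag, ("", ""))[0])

    def handle_endtag(self, tag):
        self.parts.append(DELIMITERS.get(tag, ("", ""))[1])

    def handle_data(self, data):
        self.parts.append(data)


def to_markdown(html):
    parser = TagToMarkdown()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts)


print(to_markdown("<b>bold</b>"))                # **bold**
print(to_markdown("<b><i>bold italic</i></b>"))  # ***bold italic***
```

Because the parser handles each tag as it streams past, a top-level <b> needs no enclosing block tag. It does no whitespace cleanup or escaping, so it is a starting point rather than a replacement for tomd.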