TeamHG-Memex/html-text

Blank lines created by <br> cannot be parsed correctly

luyuhuang opened this issue · 3 comments

Hi all,

When I try to convert the following HTML to plain text:

<div>aaa</div>
<br>
<div>bbb</div>

the output is

aaa
bbb

but I think there should be a blank line between aaa and bbb. I read the code and found that the newline produced by <br> is dropped because context.prev is already _NEWLINE (set by the preceding <div>). Is there a way to solve this problem? Thank you very much.
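To make the suspected cause concrete, here is a toy model of the behaviour, written only with the standard library. It is a sketch, not html-text's actual code: the `ToyExtractor` class, the `to_text` helper, and the `br_always_breaks` flag are all hypothetical names invented for this illustration, and the real library tracks far more state. The sketch assumes that block boundaries collapse into a single newline, and it shows how a <br> that arrives right after such a newline is swallowed unless it is allowed to add its own.

```python
from html.parser import HTMLParser

# Tiny subset of block-level tags, for illustration only.
BLOCK_TAGS = {"div", "p"}


class ToyExtractor(HTMLParser):
    """Toy text extractor; NOT html-text's real implementation."""

    def __init__(self, br_always_breaks=False):
        super().__init__()
        # Hypothetical switch: when True, <br> adds a newline even if
        # the previous chunk is already a newline.
        self.br_always_breaks = br_always_breaks
        self.chunks = []

    def _prev_is_newline(self):
        return bool(self.chunks) and self.chunks[-1] == "\n"

    def _block_boundary(self):
        # Consecutive block boundaries collapse into a single newline.
        if self.chunks and not self._prev_is_newline():
            self.chunks.append("\n")

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._block_boundary()
        elif tag == "br":
            # The reported behaviour corresponds to skipping this newline
            # when the previous chunk is already a newline.
            if self.br_always_breaks or not self._prev_is_newline():
                self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._block_boundary()

    def handle_data(self, data):
        text = data.strip()
        if text:  # ignore whitespace-only text between tags
            self.chunks.append(text)


def to_text(html, br_always_breaks=False):
    parser = ToyExtractor(br_always_breaks)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


if __name__ == "__main__":
    html = "<div>aaa</div>\n<br>\n<div>bbb</div>"
    print(repr(to_text(html)))                         # 'aaa\nbbb'
    print(repr(to_text(html, br_always_breaks=True)))  # 'aaa\n\nbbb'
```

In this toy version the fix is simply to let <br> bypass the "previous chunk is a newline" check. Whether the same change is safe in html-text itself depends on how it handles other cases, such as several consecutive <br> tags.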

I agree that an extra newline makes sense here 👍

Is this issue still open? Can I have a go at it?

@printROSHN yes, that would be great!