Parsing error and missing content on theregister.com
lgrn opened this issue · 0 comments
Data
- Shiori version: 1.6.0 (build 595cb45)
- Database Engine: sqlite
- Operating system: Debian 12
- CLI/Web interface/Web Extension: None
Describe the bug / actual behavior
Shiori fails to parse quotes, so they are missing from the saved content.
Expected behavior
The quotes are part of the article and should be included, preferably with some kind of UI indication that they are quotes, but at the very least they should be present.
To Reproduce
Steps to reproduce the behavior:
- Save the article https://www.theregister.com/2024/03/18/truenas_abandons_freebsd/
- Inspect the saved content
- Note that the paragraph beginning with "The creator of PC-BSD(...)" has been saved
- Note that the following quote beginning with "Right now the plan(...)" is missing
Notes
This is an HTML excerpt of the problematic section; the <p> within the <div> is not included:
<p>The creator of PC-BSD(...)</p>
<div class="blockextract">
<p>Right now the plan(...)</p>
</div>
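To make the check above repeatable, here is a minimal sketch in Python (standard library only) that lists the paragraphs nested inside a div with class "blockextract" and reports which of them are absent from the saved text. It is not Shiori's parser and does not call Shiori; the names ParagraphCollector and missing_quotes are hypothetical, and the saved string is a stand-in for the stored content as described in this report.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect the text of each <p>, noting whether it sits inside div.blockextract."""

    def __init__(self):
        super().__init__()
        self.div_stack = []    # one bool per open <div>: is it a blockextract?
        self.in_p = False
        self.buffer = []
        self.paragraphs = []   # list of (text, inside_blockextract)

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            self.div_stack.append("blockextract" in classes)
        elif tag == "p":
            self.in_p = True
            self.buffer = []

    def handle_endtag(self, tag):
        if tag == "div" and self.div_stack:
            self.div_stack.pop()
        elif tag == "p" and self.in_p:
            text = "".join(self.buffer).strip()
            self.paragraphs.append((text, any(self.div_stack)))
            self.in_p = False

    def handle_data(self, data):
        if self.in_p:
            self.buffer.append(data)


def missing_quotes(source_html, saved_text):
    """Return blockextract paragraphs from source_html that are absent from saved_text."""
    collector = ParagraphCollector()
    collector.feed(source_html)
    collector.close()
    return [text for text, quoted in collector.paragraphs
            if quoted and text not in saved_text]


# The excerpt from this report, with its elisions kept as-is.
excerpt = """
<p>The creator of PC-BSD(...)</p>
<div class="blockextract">
<p>Right now the plan(...)</p>
</div>
"""

# Assumption: stand-in for what was stored, per the report (first paragraph only).
saved = "The creator of PC-BSD(...)"

print(missing_quotes(excerpt, saved))
```

With the full article HTML as source_html and the text of the saved bookmark as saved_text, an empty result would mean every quoted paragraph was preserved.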