Issues
lxml.html.clean is now a separate project
#31 opened by wRAR
Consider switching from lxml's clean_html for enhanced security (and possibly performance)
#30 opened by frenzymadness
.extract_text returning incorrect format.
#29 opened by hg0428
Preserve space inside <pre> tags
#28 opened by mitar
extract_text fails with misleading error message when given bytes instead of unicode [py3]
#26 opened by keturn
extract_text does not work on lxml XHTML element
#24 opened by keturn
guess_layout does not work on XHTML elements
#25 opened by keturn
Don't always insert spaces around inline tags?
#16 opened by lopuhin
improve newline handling
#5 opened by kmike
support unicode punctuation better
#10 opened by kmike
button values?
#3 opened by kmike
img alt handling
#4 opened by kmike
whitespace issues
#1 opened by codinguncut