Parse html to lxml etree
Convenience functions for parsing HTML documents into lxml ElementTrees.
lxml has limited capabilities for handling different encodings. This library is intended as a reusable utility for parsing HTML responses received as byte strings into ElementTrees using sane character decoding.
- Free software: BSD license
- Python versions: 2.7, 3.4+
- Parse html to lxml etree
- Handle character decoding
Parse HTML given as byte strings:
tree = parse_html_bytes(body=body_bytes, content_type=res.headers.get('content-type'))
Parse HTML given as an already-decoded unicode string:
tree = parse_html_unicode(uni_string=body_unicode)
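To illustrate the kind of decision that "sane character decoding" involves, here is a minimal standard-library sketch of picking an encoding from a ``Content-Type`` header and falling back when it is missing or wrong. It is not this package's implementation: the helper names ``charset_from_content_type`` and ``decode_html_bytes``, the UTF-8 fallback, and the final Latin-1 fallback are all assumptions made for this example, and it targets Python 3 only.

```python
from email.message import Message


def charset_from_content_type(content_type):
    """Return the lowercased charset declared in a Content-Type header, or None."""
    if not content_type:
        return None
    msg = Message()
    msg['Content-Type'] = content_type
    return msg.get_content_charset()


def decode_html_bytes(body, content_type=None, fallback='utf-8'):
    """Decode HTML bytes to text (hypothetical helper, not part of this package).

    Tries the charset declared in the header first, then the assumed
    fallback, and finally Latin-1, which can decode any byte sequence.
    """
    for encoding in (charset_from_content_type(content_type), fallback):
        if not encoding:
            continue
        try:
            return body.decode(encoding)
        except (LookupError, UnicodeDecodeError):
            # Unknown codec name or bytes that do not match it: try the next one.
            continue
    return body.decode('latin-1')
```

A string decoded this way could then be passed to ``parse_html_unicode``. A real decoder would typically also look at byte order marks and ``<meta>`` charset declarations, which this sketch leaves out.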
This package was created with Cookiecutter and the `fluquid/cookiecutter-pypackage`_ project template.