html to etree
Parse html to lxml etree
Convenience methods for parsing HTML documents into lxml etrees.
lxml has limited capabilities for handling different encodings. This library is intended as a reusable utility for parsing HTML responses received as byte strings into ElementTrees, using sane character decoding.
- Free software: BSD license
- Python versions: 2.7, 3.4+
Features
- Parse html to lxml etree
- Handle character decoding
Quickstart
Parse HTML given as a byte string, passing along the response's Content-Type header:
tree = parse_html_bytes(body=body_bytes, content_type=res.headers.get('content-type'))
Parse HTML given as an already decoded unicode string:
tree = parse_html_unicode(uni_string=body_unicode)
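To illustrate what "sane character decoding" can involve, here is a minimal, standard-library-only sketch of decoding a response body before parsing. It is not this library's implementation: the helper names ``charset_from_content_type`` and ``decode_html_bytes`` and the UTF-8 fallback are assumptions made for the example.

```python
from email.message import Message


def charset_from_content_type(content_type):
    """Return the charset parameter of a Content-Type header, or None.

    Hypothetical helper for illustration; not part of this library.
    """
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lower-cases the value and returns None if absent.
    return msg.get_content_charset()


def decode_html_bytes(body, content_type=None, fallback="utf-8"):
    """Decode bytes with the declared charset, then with a fallback.

    Hypothetical helper; the UTF-8 fallback is an assumption.
    """
    candidates = [charset_from_content_type(content_type), fallback]
    for encoding in candidates:
        if not encoding:
            continue
        try:
            return body.decode(encoding)
        except (LookupError, UnicodeDecodeError):
            # Unknown codec name or bytes that do not fit: try the next one.
            continue
    # Last resort: never raise, replace undecodable bytes instead.
    return body.decode(fallback, errors="replace")
```

A string decoded this way could then be handed to ``parse_html_unicode``. A real decoder would probably also look at byte order marks and ``<meta charset>`` declarations, which this sketch leaves out.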
Credits
This package was created with Cookiecutter and the `fluquid/cookiecutter-pypackage`_ project template.