hyphenator should specify a parser for BeautifulSoup
Closed this issue · 0 comments
jrief commented
django-softhyphen works perfectly if html5lib is not installed. However, with html5lib on your Python search path, the given example code
>>> from softhyphen.html import hyphenate
>>> hyphenate("<h1>I love hyphenation</h1>")
u'<html><body><h1>I love hy­phen­a­tion</h1></body></html>'
gives the result string wrapped in <html><body>..., which is not what we want.
This can be overridden by monkey-patching BeautifulSoup.DEFAULT_BUILDER_FEATURES = ['html.parser'], but that might cause other unwanted side effects. A better approach would be to add a configuration setting to django-softhyphen, so that html.py (line 54) invokes
soup = BeautifulSoup(html, features=BEAUTIFULSOUP_BUILDER_FEATURES)
where BEAUTIFULSOUP_BUILDER_FEATURES defaults to ['html.parser'].
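A minimal sketch of how the setting proposed above could be resolved with a default. The helper name get_builder_features is hypothetical, and a SimpleNamespace stands in for django.conf.settings so the snippet runs without Django or bs4 installed; BEAUTIFULSOUP_BUILDER_FEATURES is the name proposed in this issue, not an existing django-softhyphen option.

```python
from types import SimpleNamespace

# Default proposed in this issue: Python's built-in HTML parser.
DEFAULT_BUILDER_FEATURES = ["html.parser"]


def get_builder_features(settings):
    """Return the parser features to hand to BeautifulSoup.

    `settings` is a stand-in for django.conf.settings. The helper and
    the setting name are assumptions made for illustration only.
    """
    features = getattr(
        settings, "BEAUTIFULSOUP_BUILDER_FEATURES", DEFAULT_BUILDER_FEATURES
    )
    # Return a copy so callers cannot mutate the module-level default.
    return list(features)


# Stand-ins for a project without and with an explicit override.
unconfigured = SimpleNamespace()
configured = SimpleNamespace(BEAUTIFULSOUP_BUILDER_FEATURES=["html5lib"])

print(get_builder_features(unconfigured))
print(get_builder_features(configured))

# In html.py the call would then read (requires bs4, not imported here):
# soup = BeautifulSoup(html, features=get_builder_features(settings))
```

This keeps the parser choice local to django-softhyphen instead of changing a class-level default that other BeautifulSoup users in the same process would also see.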
If you accept this feature request, I'll send a PR.