Black recomends replacing lxml with defusedxml
CommanderStorm opened this issue · 4 comments
event though we kind of trust the stuwerk, it is better to be save than sorry
Issue: [B410:blacklist] Using html to parse untrusted XML data is known to be vulnerable to XML attacks. Replace html with the equivalent defusedxml package.
Severity: Low Confidence: High
Location: src/menu_parser.py:19
More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_imports.html#b410-import-lxml
18 import requests
19 from lxml import html
Is this still relevant?
I looked into it a bit, and noticed, that the lxml package of defusedxml (as recommended in the comment) is deprecated. https://pypi.org/project/defusedxml/#defusedxml-lxml
So as far as I see it, a replacement would require to rewrite the parser?
Or is there another option?
tiran/defusedxml#38 states, that the package is no longer nessesary, but will be availibele until a real fix is found, so probably still worth to keep subscribed on this issue.
the lxml-FAQ has some recomendations, on how to avert the size/compute-blowup vulnerabilities, but does not seem to specific on how to secure the html parsing ability we use from this library.
https://lxml.de/FAQ.html#how-do-i-use-lxml-safely-as-a-web-service-endpoint
I have not researched, if the html parsing of lxml is also vulnerable to these kinds of attacks.
May be fixed now? lxml is apparently safe
Good catch, thanks 👍