Question regarding forbid_dtd
mschwager opened this issue · 2 comments
Why do many of the parsing functions explicitly default forbid_dtd to False? It seems like the secure default would be to enable this option, and allow it to be manually disabled.
Without it, consumers appear to be vulnerable to schema poisoning attacks.
I would like to bump this. An updated link to schema poisoning attacks is here
Thanks for pointing me to this form of attack. I'll add it to the documentation, together with the comment handling issue described at https://duo.com/blog/duo-finds-saml-vulnerabilities-affecting-multiple-implementations
DTDs are allowed by default because they are common and I don't consider a DTD by itself a security risk. The actual problems are access to file resources, access to network resources, and entity expansion.
Neither Python's xml package nor the expat library performs any schema validation or evaluation. The expat library just validates the syntax and offers callbacks for schema elements. Here is an example of a document that violates its own DTD but still parses silently:
>>> from defusedxml.ElementTree import fromstring
>>> root = fromstring("""<!DOCTYPE root [
... <!ELEMENT root (a, b)>
... <!ELEMENT a EMPTY>
... <!ELEMENT b EMPTY>
... ]>
... <root><a>not empty</a><!-- b is missing --></root>
... """)
>>> root.tag
'root'
>>> list(root)
[<Element 'a' at 0x7f5cc314e728>]
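To illustrate the point about callbacks, here is a minimal sketch that uses only the standard library's xml.parsers.expat module. It shows that expat reports the DOCTYPE and the element declarations to handlers without enforcing them, and that a handler can reject a document simply by raising. The names DoctypeNotAllowed, collect_declarations, and parse_rejecting_doctype are hypothetical and exist only for this example; this is not defusedxml's implementation, just an approximation of the idea behind an option like forbid_dtd.

```python
from xml.parsers import expat

# Same document as in the example above: it violates its own DTD.
DOCUMENT = """<!DOCTYPE root [
<!ELEMENT root (a, b)>
<!ELEMENT a EMPTY>
<!ELEMENT b EMPTY>
]>
<root><a>not empty</a><!-- b is missing --></root>
"""


class DoctypeNotAllowed(ValueError):
    """Hypothetical exception for this sketch; not part of any library."""


def collect_declarations(text):
    """Return the doctype names and declared element names reported by expat."""
    doctypes = []
    elements = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        doctypes.append(name)

    def on_element_decl(name, model):
        # expat hands over the content model but does not enforce it.
        elements.append(name)

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(text, True)
    return doctypes, elements


def parse_rejecting_doctype(text):
    """Parse text, raising DoctypeNotAllowed if it contains a DOCTYPE."""

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # An exception raised in a handler propagates out of Parse().
        raise DoctypeNotAllowed(name)

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(text, True)
```

Calling collect_declarations(DOCUMENT) completes without an error even though the document does not match its declarations, while parse_rejecting_doctype(DOCUMENT) raises as soon as the DOCTYPE is seen. With defusedxml itself, the equivalent opt-in is to pass forbid_dtd=True to the parsing function.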