HTML Parsing is not robust enough
ericdbarry opened this issue · 0 comments
I am pointing to a login page with some oddities in its markup, which exposes assumptions the parser makes about the pages it processes.
Two examples:
- the escape function fails when None is passed in
- a name attribute is assumed on every input, yet login_form_html_node.findall(".//input") does not filter out inputs that lack one
Both cases are valid markup: the first is a boolean attribute (present but with no value), the second is an input tag without a name attribute.
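
To make the two cases concrete, here is a minimal sketch of how they could be handled. It is not the project's code: it assumes the standard library's `html.escape` and `html.parser` stand in for whatever escape function and parser the project uses, and the names `safe_escape`, `InputCollector`, and `named_form_fields` are hypothetical.

```python
from html import escape
from html.parser import HTMLParser


def safe_escape(value):
    """Escape an attribute value; None (a valueless attribute) becomes ''."""
    return escape(value) if value is not None else ""


class InputCollector(HTMLParser):
    """Collect the attributes of every <input> tag as a dict."""

    def __init__(self):
        super().__init__()
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            # html.parser reports a valueless attribute as (name, None)
            self.inputs.append(dict(attrs))


def named_form_fields(html_text):
    """Return {name: escaped value} for inputs that have a name attribute."""
    collector = InputCollector()
    collector.feed(html_text)
    collector.close()
    fields = {}
    for attrs in collector.inputs:
        name = attrs.get("name")
        if not name:
            # Valid markup, but there is no key to store the value under
            continue
        fields[name] = safe_escape(attrs.get("value"))
    return fields


# Hypothetical login form with both oddities from this report
SAMPLE = """
<form>
  <input type="text" name="user" value="a&amp;b" required>
  <input type="hidden" name="flag" value>
  <input type="submit" value="Log in">
</form>
"""

print(named_form_fields(SAMPLE))
```

With an ElementTree-style node such as the result of findall(".//input"), the same filter would amount to skipping any input for which `node.get("name")` returns None.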