buriy/python-readability

Failure if best_elem is root


I have some documents that raise the following exception:

Traceback (most recent call last):
...
  File ".../site-packages/readability/readability.py", line 168, in summary
    html_partial=html_partial)
  File ".../site-packages/readability/readability.py", line 214, in get_article
    for sibling in best_elem.getparent().getchildren():
Unparseable: 'NoneType' object has no attribute 'getchildren'

I assume there should be a fix along the lines of:

parent = best_elem.getparent()
if parent is None:
    # best_elem is the root, so it is its own only sibling
    siblings = [best_elem]
else:
    siblings = parent.getchildren()
for sibling in siblings:
    ...  # existing loop body, unchanged

Could you make a patch please?
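
To illustrate the guard in isolation, here is a minimal sketch. It does not use readability or lxml: `Node` is a hypothetical stand-in that only mimics the `getparent()` and `getchildren()` methods seen in the traceback, and `siblings_of` is a helper name made up for this example, not part of the library.

```python
class Node:
    """Hypothetical stand-in for an lxml-style element (not the real class)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self._parent = parent
        self._children = []
        if parent is not None:
            parent._children.append(self)

    def getparent(self):
        # Returns None for the root element, which is what triggers the failure.
        return self._parent

    def getchildren(self):
        return list(self._children)


def siblings_of(elem):
    """Return elem's siblings (including elem), or [elem] if elem is the root."""
    parent = elem.getparent()
    if parent is None:
        return [elem]
    return parent.getchildren()


root = Node("html")
body = Node("body", parent=root)
aside = Node("aside", parent=root)

print([n.tag for n in siblings_of(root)])  # ['html']
print([n.tag for n in siblings_of(body)])  # ['body', 'aside']
```

With a guard like this, the case where best_elem is the root goes through the same sibling loop as any other element instead of raising.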

@jnothman: Please attach at least one of those documents.