grangier/python-goose

title from opengraph can return None

  File "/usr/local/lib/python2.7/dist-packages/goose/extractors/title.py", line 42, in clean_title
    title = title.replace(site_name, '').strip()
AttributeError: 'NoneType' object has no attribute 'replace'
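
For illustration, here is a minimal sketch of a guard against the failure in the traceback above. `strip_site_name` is a hypothetical helper, not part of goose; it only assumes that the title (and possibly the site name) coming from the OpenGraph metadata may be None.

```python
def strip_site_name(title, site_name):
    """Hypothetical helper (not goose's actual code).

    Removes site_name from title while tolerating None for either value.
    """
    if not title:
        # A None or empty title would otherwise raise AttributeError on .replace()
        return u""
    if site_name:
        # str.replace() raises TypeError when passed None, so check site_name too
        title = title.replace(site_name, u"")
    return title.strip()
```

With this guard, a missing title yields an empty string instead of raising, and callers can decide how to handle the empty result.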
ubi15 commented

Also, the title may be identical to the website name, in which case removing the site name leaves an empty string (or a string of only spaces). This results in an error at line 55, because title_words is an empty list after the title.split() on line 52.

I wrapped everything from line 52 up to title = u" ".join(title_words).strip() in a check on title_words:

if title_words:
    <former code>
else:
    title = ""
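
As a self-contained sketch of the same guard: `join_title_words` is a hypothetical stand-in for the wrapped region, and the code that sits between the split() and the join() in title.py is not shown in this thread, so it is only marked by a comment here.

```python
def join_title_words(title):
    """Hypothetical stand-in for the wrapped region (not goose's actual code)."""
    title_words = title.split()  # empty list for "" or a whitespace-only string
    if title_words:
        # The former code that indexes into title_words would go here;
        # it is safe because the list has at least one element.
        title = u" ".join(title_words).strip()
    else:
        # Nothing left once the site name was removed
        title = u""
    return title
```

For an empty or whitespace-only title this returns an empty string, and otherwise it returns the words joined by single spaces.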

Hope this helps.