buriy/python-readability

Summary is fooled by a modal popup

rpdelaney opened this issue · 0 comments

>>> from readability import Document
>>> r = fetch_url('https://www.democracynow.org/2023/9/5/headlines/biden_administration_to_supply_ukraine_with_depleted_uranium_munitions')
>>> type(r)
<class 'requests_html.HTMLResponse'>
>>> doc = Document(r.content)
>>> doc.summary()
'<html><body><div><div class="daily_digest_modal_content"><a href="" id="continue_to_site"><img src="https://assets.democracynow.org/assets/icons/modal-close-5ec5cd072a7d7752e6073e5c8761274c67b9287aecead27a7e72cc2f30753d1f.png"></a><div><h1>Independent news has never been so important.</h1>\r\n<p><span class="plea">Get Democracy Now! delivered to your inbox every day!</span> Don\'t worry, we won\'t share or sell your information.</p></div></div></div></body></html>'
>>>

Am I doing something wrong, or should readability also be stripping this modal popup?
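
One possible workaround, sketched here as an assumption rather than as anything readability does itself, is to drop the modal's markup before the page is handed to `Document`. The example below uses only the standard library; `ClassStripper` and `strip_elements_by_class` are hypothetical names made up for illustration, and the only detail taken from the output above is the class name `daily_digest_modal_content`. It assumes the blocked element is well nested (every non-void tag inside it has a closing tag).

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ClassStripper(HTMLParser):
    """Hypothetical helper: re-emit HTML, dropping elements with a blocked class."""

    def __init__(self, blocked_classes):
        # Keep character references untouched so they can be re-emitted verbatim.
        super().__init__(convert_charrefs=False)
        self.blocked = set(blocked_classes)
        self.skip_depth = 0  # > 0 while inside a dropped element
        self.out = []

    def _is_blocked(self, attrs):
        classes = (dict(attrs).get("class") or "").split()
        return bool(self.blocked.intersection(classes))

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
        elif self._is_blocked(attrs):
            if tag not in VOID_TAGS:
                self.skip_depth = 1
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: emit as-is unless it is being dropped.
        if not self.skip_depth and not self._is_blocked(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")

    def handle_comment(self, data):
        if not self.skip_depth:
            self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_elements_by_class(html, blocked_classes):
    """Return `html` without any element whose class list hits `blocked_classes`."""
    parser = ClassStripper(blocked_classes)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

With the session above, the cleaned string could then be passed on, for example `Document(strip_elements_by_class(r.text, ["daily_digest_modal_content"]))`; note this uses `r.text` (a `str`) instead of `r.content` (bytes). Whether the summary then picks up the article body has not been verified here.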