onlyhavecans/ReadItLater-Calibre-Plugin

AttributeError: 'NoneType' object has no attribute 'findAll'

Closed this issue · 7 comments

I am seeing this when I run it on my protected RSS feed at Read It Later:

Traceback (most recent call last):
File "site.py", line 58, in main
File "site-packages/calibre/ebooks/conversion/cli.py", line 287, in main
File "site-packages/calibre/ebooks/conversion/plumber.py", line 963, in run
File "site-packages/calibre/customize/conversion.py", line 208, in call
File "site-packages/calibre/ebooks/conversion/plugins/recipe_input.py", line 105, in convert
File "site-packages/calibre/web/feeds/news.py", line 861, in download
File "site-packages/calibre/web/feeds/news.py", line 1005, in build_index
File "", line 76, in parse_index
AttributeError: 'NoneType' object has no attribute 'findAll'

It also marked my Read It Later items as read even though it failed.

As a comparison I tested the ReadItLater_Official.recipe included in this repo and it works perfectly.

The plugin marks items as read during its cleanup phase, after it has already pulled and generated the content. I'm not easily seeing where this error is thrown. It might have been a problem with something you had in RIL.
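As an illustration of that ordering, here is a minimal sketch of a run where marking can only happen after the content has been pulled without an exception. The names `run_then_mark`, `download`, and `mark_read` are hypothetical and are not taken from the plugin or from calibre.

```python
def run_then_mark(download, mark_read):
    """Call mark_read(articles) only if download() returns without raising.

    Both callables are hypothetical stand-ins, not part of the plugin.
    """
    articles = download()  # if this raises, mark_read is never reached
    mark_read(articles)    # cleanup step: runs only after a successful pull
    return articles
```

With this shape, an exception during the download propagates to the caller and leaves the items untouched, so a failed run cannot archive anything.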

I have a few questions

  • What OS and version are you using? (Just in case.)
  • At what point in the process does this error occur? The script is quite verbose in its status output.
    • You can watch what it is doing in the Jobs window.
  • If you run it a second time, does it still produce the error while pulling NEW content?
  • Can you UNMARK the 50 articles that it marked and try again with the official recipe?

After taking a second look, I know where it failed, but it's not in a section of code that is all that different from the official recipe.

It sounds like the browser didn't return anything, or returned something invalid, when the recipe went to parse the web page. Again, it sounds like a certain piece of content failed or went wonky, and I would love to know which.
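For illustration, the error message is what Python raises when `findAll` is called on `None`, which is what a lookup returns when the expected element is missing from the page. A minimal sketch of a guard follows; `find_all_or_empty` is a hypothetical helper, not part of the recipe or calibre, and `FakeNode` exists only so the sketch runs without a real parser.

```python
def find_all_or_empty(container, *args, **kwargs):
    """Return container.findAll(...) or an empty list when container is None."""
    if container is None:
        # The page did not contain the expected element; nothing to iterate.
        return []
    return container.findAll(*args, **kwargs)


class FakeNode:
    """Stand-in for a parsed HTML node (assumption, for demonstration only)."""

    def __init__(self, children):
        self.children = children

    def findAll(self, name):
        # Return every child whose tag name matches.
        return [child for child in self.children if child == name]


# A missing container yields no items instead of raising AttributeError.
items = find_all_or_empty(None, 'li')
```

A guard like this would let a run skip a page with unexpected markup rather than abort, at the cost of hiding the bad page unless it is also logged.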

It looks like you're right about it being a problem with the actual content.
This morning I added three new articles and tried again with V3, and it worked perfectly.
I have also un-archived the previous day's articles and they have now processed correctly. One of them was from the EFF's HTTPS site, which I thought could have caused the problem.
The only config change I made when encountering the problem was reducing minimum_articles to 1; it is now set to 2.
I am using Ubuntu 10.04 / Calibre 0.8.47 and running ebook-convert via a bash script from crontab.
Sorry to have alarmed you unnecessarily.

If the EFF link isn't private I'd love to see if I can possibly patch the original code and fix the problem in the main branch.
I can't make any promises but it never hurts to try!

The EFF feed I read is https://www.eff.org/rss/updates.xml, but articles from there certainly weren't the cause of the problem.