
Various RSS PDF download issues

xthursdayx opened this issue · 1 comments

Thanks for this useful script!

For some reason, I can only download the specific blog I'm after through the RSS feed (with option -p). However, every time I run the command, the scraping and downloading stop at a particular post.

I've tried using the -a, -s and -p flags to download a specific year (or month) after the post that seems to be causing the problem, but I get the following error:

title: Baru Samarinda
Download html as PDF, please be patient...18/71
file path: /home/xthursdayx/blogspot-downloader/blog /Baru Samarinda Terima Penganugerahan P....pdf
pdfkit IOError

I also tried the command python3 -lo, exporting the results to urls.list, and then ran the command python3 -p -1 <urls.list, which produced the following error:

URL: Create single pdf: /home/vidrir/blogspot-downloader/flores
IOError --one:  wkhtmltopdf reported an error:
Loading page (1/2)
Error: Failed to load, with network status code 302 and http status code 400 - Error transferring - server replied:
Printing pages (2/2)
Exit with code 1 due to network error: ProtocolInvalidOperationError

Any idea what my problem is? Thanks for the help!

**Blog name and post changed for the owner's benefit.

The blog page no longer exists. If you get an error, try not using -p: it relies on the third-party library pdfkit and the tool wkhtmltopdf, both of which are outside my control, unlike EPUB (pypub is bundled with this script). Also, -p does not support multiple links.

I also fixed pypub so that it can download some images and text. So -p is not preferred unless the EPUB version fails because of JavaScript.
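
If you still need PDFs, one workaround is to call pdfkit yourself, one URL at a time, and skip any post that makes wkhtmltopdf fail instead of stopping the whole run. The following is only a rough sketch and not how this script works internally: the function name convert_each and the post_NNN.pdf file naming are made up for illustration, and it assumes pdfkit and the wkhtmltopdf binary are installed. It will not fix a page that the server answers with HTTP 400; it only keeps the remaining posts downloading.

```python
import os


def convert_each(urls, out_dir, convert=None):
    """Convert every URL to its own PDF and return a list of (url, error) failures.

    `convert` is a callable taking (url, output_path). It defaults to
    pdfkit.from_url and is a parameter only so the loop can be tried
    without wkhtmltopdf installed.
    """
    if convert is None:
        import pdfkit  # third-party wrapper around the wkhtmltopdf binary

        # Real wkhtmltopdf options: keep going when a page resource or
        # media file fails to load instead of aborting the conversion.
        options = {
            "load-error-handling": "ignore",
            "load-media-error-handling": "ignore",
        }

        def convert(url, path):
            pdfkit.from_url(url, path, options=options)

    os.makedirs(out_dir, exist_ok=True)
    failed = []
    for index, url in enumerate(urls, 1):
        # Hypothetical naming scheme: post_001.pdf, post_002.pdf, ...
        path = os.path.join(out_dir, "post_%03d.pdf" % index)
        try:
            convert(url, path)
        except OSError as err:
            # pdfkit raises IOError (an alias of OSError in Python 3)
            # when wkhtmltopdf exits with an error; record it and move on.
            failed.append((url, str(err)))
    return failed
```

With a list such as the urls.list file mentioned above, you could pass its non-empty lines as urls and then inspect the returned list to see which posts need to be retried or fetched as EPUB instead.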