niqdev/packtpub-crawler

Newsletter store

niqdev opened this issue · 8 comments

@juzim The ebook downloaded from the newsletter can't be uploaded to Drive or stored on Firebase, and it breaks the script. I would suggest at least making the newsletter feature optional, i.e. adding --newsletter or -n as a parameter.
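A minimal sketch of what such an opt-in flag could look like with `argparse`. The parser, its description, and the helper `should_claim_newsletter` are hypothetical and not taken from the project's code; only the `-n` / `--newsletter` names come from the suggestion above, and `-n` may clash with an existing short option.

```python
import argparse


def build_parser():
    """Hypothetical parser: only the newsletter flag mirrors the suggestion."""
    parser = argparse.ArgumentParser(description="packtpub-crawler (sketch)")
    # store_true keeps the newsletter step disabled unless the flag is passed
    parser.add_argument(
        "-n", "--newsletter",
        action="store_true",
        help="also claim the free ebook from the newsletter",
    )
    return parser


def should_claim_newsletter(argv):
    """Return True only when the user explicitly asked for the newsletter."""
    args = build_parser().parse_args(argv)
    return args.newsletter
```

With this shape, running the script without the flag skips the newsletter step entirely, so a failure there cannot break the daily claim.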

[*] getting free ebook from newsletter
[*] fetching url... 200 | https://www.packtpub.com/packt/free-ebook/javascript-high-performance
[*] fetching url... 200 | https://www.packtpub.com/account/my-ebooks
[+] book successfully claimed
[-] downloading file from url: https://www.packtpub.com/ebook_download/20590/pdf
[################################] 12594/12594 - 00:00:00
[+] new download: XXX/packtpub-crawler/ebooks/Mastering_Javascript_High_Performance.pdf
[+] new file upload on Drive:
[+] uploading file...
-[+] updating file permissions...

-       [path] XXX/packtpub-crawler/ebooks/Mastering_Javascript_High_Performance.pdf
	[download_url] https://drive.google.com/uc?id=XXX&export=download
	[name] Mastering_Javascript_High_Performance.pdf
	[mime_type] application/pdf
	[id] XXX
[-] skip store info: missing upload info
[-] <type 'exceptions.TypeError'> can't multiply sequence by non-int of type 'str' | spider.py@119
Traceback (most recent call last):
  File "script/spider.py", line 119, in main
    handleClaim(packpub, args, config, dir_path)
  File "script/spider.py", line 55, in handleClaim
    Notify(config, packpub.info, upload_info, args.notify).run()
  File "XXX/github/packtpub-crawler/script/notify.py", line 30, in run
    self.service.send()
  File "XXX/packtpub-crawler/script/notification/gmail.py", line 98, in send
    message = self.__prepare_message()
  File "XXX/packtpub-crawler/script/notification/gmail.py", line 41, in __prepare_message
    html *= "</ul>"
TypeError: can't multiply sequence by non-int of type 'str'
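The traceback points at `html *= "</ul>"`: `*=` asks Python to repeat the string, which fails when the right-hand side is another string. The likely intent was concatenation with `+=`. A minimal sketch under that assumption follows; `prepare_message` and its arguments are hypothetical stand-ins, not the actual contents of `gmail.py`.

```python
def prepare_message(title, links):
    """Hypothetical stand-in for the HTML body built in gmail.py."""
    html = "<h2>{0}</h2>".format(title)
    html += "<ul>"
    for label, url in links:
        html += '<li><a href="{0}">{1}</a></li>'.format(url, label)
    # += concatenates; *= here would raise the TypeError shown in the log
    html += "</ul>"
    return html
```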
juzim commented

Yep, sorry, the problem is not with the upload but with Firebase.
I just checked and the ebook is on Drive.

By the way, the Firebase option depends on the Drive option.
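That dependency could be checked up front rather than discovered at the "skip store info: missing upload info" step. A small sketch, assuming the two options are available as booleans; `check_options` and its parameter names are hypothetical and not part of the project.

```python
def check_options(store_firebase, upload_drive):
    """Return an error message for an invalid combination, else None."""
    # Storing on Firebase needs the upload info produced by the Drive step
    if store_firebase and not upload_drive:
        return "the Firebase option requires the Drive option"
    return None
```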

Just a few questions about the newsletter:

  • Is the spreadsheet now updated every day, with the newsletter URL added only if there is a "new" newsletter? If so, can you update the code in the gist?
  • Is there a reason why lines 104 to 132 are inside main, while you externalized handleClaim?
  • Can you share the link to the old newsletter? I need to test the script with Firebase and I can't download the newsletter now.
  • Do you agree that the newsletter should be another option, like -w or --newsletter?
juzim commented

OK on all points, and thanks for the link. I probably didn't explain point 2 well: I just meant refactoring the code inside main into a method.
Anyway, if you agree, I will leave the issue open until we can test again with the Firebase option.

Already fixed. Obsolete