icy/google-group-crawler

Started to fail when using cookie with "error trying to read config from the 'curl-options.txt' file"

Zedseayou opened this issue · 4 comments

Hi, thank you for writing this script!

I noticed that downloading from private groups appears to have broken for some reason. The included test files fail on _test_public_2_with_cookie() as well. The error is:

Warning: error trying read config from the 'curl-options.txt' file
:: Unable to find any mail messages from ./viettug.org-google-group-crawler-public2//mbox/

It seems like this error is being thrown by curl itself? I'll try to look into it some more, but I'm not very familiar with these tools. I have curl 7.38.0 since I am on an old version of Debian, but it doesn't seem like either the user-agent or the header options have changed, and this was working a few weeks ago.

icy commented

@Zedseayou thanks for your feedback and for using my script.

Warning: error trying read config from the 'curl-options.txt' file

Do you have anything in curl-options.txt? It's the cookie file used by curl, and you are supposed to generate it yourself (https://github.com/icy/google-group-crawler#private-group-or-group-hosted-by-an-organization).
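
A quick sanity check on the file can rule out the simplest causes before digging further. The sketch below is not part of the crawler; `check_curl_config` is a hypothetical helper. It assumes the file uses curl's long option names (`user-agent`, `header`) and that the cookie is passed through a `header` line. It also reports the working directory, since a relative path to the file is resolved from wherever the script is run.

```python
import os


def check_curl_config(path):
    """Return a list of problems found with a curl config file (hypothetical helper)."""
    problems = []
    if not os.path.isfile(path):
        # A relative path is resolved against the current working directory.
        problems.append(f"{path}: not found (current directory is {os.getcwd()})")
        return problems
    if not os.access(path, os.R_OK):
        problems.append(f"{path}: exists but is not readable")
        return problems

    with open(path, encoding="utf-8", errors="replace") as fh:
        lines = [line.strip() for line in fh]

    # Ignore blank lines and comment lines.
    options = [line for line in lines if line and not line.startswith("#")]
    if not options:
        problems.append(f"{path}: contains no options")
        return problems

    # Assumption: the cookie is supplied as `header = "cookie: ..."`.
    has_cookie = any(
        line.lstrip("-").lower().startswith("header") and "cookie" in line.lower()
        for line in options
    )
    if not has_cookie:
        problems.append(f"{path}: no header option carrying a cookie")
    return problems


if __name__ == "__main__":
    for problem in check_curl_config("curl-options.txt"):
        print(problem)
```

An empty list means the file is present, readable, and has a cookie header. It does not tell you whether the cookie is still valid.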

icy commented

Please note that a cookie generated from your browser has quite a short TTL (time to live). You will likely have to regenerate that file after a few days (I don't know exactly how many).
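
As a rough reminder to refresh the cookie, the file's modification time can serve as a proxy for the cookie's age. This is only a sketch: `looks_stale` and `config_age_days` are hypothetical helpers, and the 3-day default is an arbitrary assumption, since the real TTL is not known.

```python
import os
import time


def config_age_days(path, now=None):
    """Days since the file was last modified."""
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) / 86400.0


def looks_stale(path, max_age_days=3.0, now=None):
    """True if the file is older than max_age_days (assumed threshold, not a known TTL)."""
    return config_age_days(path, now) > max_age_days


if __name__ == "__main__":
    path = "curl-options.txt"
    if os.path.isfile(path) and looks_stale(path):
        print(f"{path} is {config_age_days(path):.1f} days old; consider regenerating it")
```

The modification time only reflects when the file was last written, so a freshly saved file containing an old cookie would still pass this check.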

icy commented

_test_public_2_with_cookie

The test is designed to work with the Travis CI system (https://github.com/icy/google-group-crawler/blob/master/.travis.yml). I'm sure the tests are broken now because the cookie data is out of date. Every time I need to run the tests, I have to update that data. In the source you will find that file encrypted (https://github.com/icy/google-group-crawler/blob/master/tests/curl-options.txt.enc), and you can't decrypt it from your laptop ;)

Zedseayou commented

I did have updated values in curl-options.txt when I ran this, but I haven't tried again recently. Will follow up and let you know.