HTTP Error 413
want-to-export-group opened this issue · 8 comments
I am trying to archive messages from a large private group. The script seems to run fine until the "Fetching data" step. Here is the output (the group name has been changed to "group"):
:: Downloading all topics (thread) pages...
:: Creating './group//threads/t.0' with 'categories/group'
:: Fetching data from 'https://groups.google.com/forum/?_escaped_fragment_=categories/group'...
--2019-12-20 13:16:16-- https://groups.google.com/forum/?_escaped_fragment_=categories/group
Resolving groups.google.com (groups.google.com)... 2607:f8b0:400d:c0f::8a, 172.217.197.102, 172.217.197.113, ...
Connecting to groups.google.com (groups.google.com)|2607:f8b0:400d:c0f::8a|:443... connected.
HTTP request sent, awaiting response... 413 Request Entity Too Large
2019-12-20 13:16:16 ERROR 413: Request Entity Too Large.
As you can see, there is an Error 413. What is causing this, and how can it be fixed?
The test script works fine for "google-group-crawler-public" but fails for "google-group-crawler-public2" due to HTTP error 500. Could something be going wrong with the cookies?
@want-to-export-group Were you able to resolve the issue?
I haven't seen that issue. Maybe it's a temporary network issue; you can look at the wget command and retry it to see if that helps.
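To make the "retry" suggestion concrete, here is a minimal sketch of a retry wrapper in plain sh. The `retry` function, its argument order, and the attempt/delay values are assumptions for illustration and are not part of the crawler script; the example URL is the one from the log above, with the placeholder group name.

```shell
#!/bin/sh
# retry MAX DELAY CMD...
# Hypothetical helper (not part of the crawler): run CMD until it succeeds
# or MAX attempts have been used, sleeping DELAY seconds between attempts.
retry() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    # Run the command exactly as given; stop as soon as it succeeds.
    "$@" && return 0
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example (commented out so the sketch runs without network access):
# retry 3 5 wget -O t.0 \
#   'https://groups.google.com/forum/?_escaped_fragment_=categories/group'
```

If every attempt fails with the same status code, the problem is probably not a temporary network issue.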
The test script works fine for "google-group-crawler-public" but fails for "google-group-crawler-public2" due to HTTP error 500. Could something be going wrong with the cookies?
Yes I can confirm this issue. Google has changed something to prevent our script from working :(
:( It used to work. Now accessing the URL from a web browser also generates an error: https://groups.google.com/forum/?_escaped_fragment_=categories/google-group-crawler-public2
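The same check can be done from the command line. The sketch below prints only the HTTP status code for a URL, optionally sending cookies from a file. The `http_status` function and the cookie file argument are assumptions for illustration; this is not necessarily how the crawler itself issues requests.

```shell
#!/bin/sh
# http_status URL [COOKIE_FILE]
# Hypothetical helper: print the HTTP status code returned for URL.
# If COOKIE_FILE is given, curl reads cookies from it (-b with a file name).
http_status() {
  url=$1
  cookie_file=${2:-}
  if [ -n "$cookie_file" ]; then
    curl -sS -o /dev/null -w '%{http_code}' -b "$cookie_file" "$url"
  else
    curl -sS -o /dev/null -w '%{http_code}' "$url"
  fi
}

# Example (commented out so the sketch runs without network access):
# http_status 'https://groups.google.com/forum/?_escaped_fragment_=categories/google-group-crawler-public2'
```

Comparing the status with and without the cookie file is one way to see whether the cookies affect the response.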
google-group-crawler-public2 was set to private mode by mistake. Now it's fine. Btw, I have rewritten the script using curl; hopefully it can help resolve a few strange issues. Stay tuned.
The problem should be fixed in the latest version, 2.0.0 (using curl). Please check whether it works better for you. Thanks.