ArchiveTeam/grab-site

Getting 502 Bad Gateway Errors

syberphunk opened this issue · 2 comments

I have IP address overrides specified in my /etc/hosts for a domain.

If I run wpull directly on a URL for that domain, it resolves fine and I get the file I expect from the override IP address.

If I use grab-site and it finds one of these domains, I get a 502 Bad Gateway error. Setting --debug doesn't tell me anything further, only that for some reason I'm now pulling an HTML page instead of, say, the PNG image I'm expecting.

I've set DNS wait times, --no-http-keep-alive, and --no-skip-getaddrinfo, but I really can't nail down what's causing this problem when wpull is called via grab-site to access this domain.

The venv doesn't have its own proxy or DNS settings.

I'm really at a loss for how to narrow down what grab-site is doing, how to debug it, or how to make sure it's connecting to the domain properly.

We have access to the server side, and the request isn't even really hitting the site to grab the image. I'm bypassing any CDN we have.

It's weird.

Are you using --no-skip-getaddrinfo or --wpull-args=--no-skip-getaddrinfo? The former won't work.
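
It may also help to confirm what the system resolver returns for the domain, independent of grab-site and wpull. The following is only a rough sketch: `resolved_addresses` is a made-up helper name, and it shows what Python's `socket.getaddrinfo` returns on this machine (which normally honours /etc/hosts), not what wpull does internally.

```python
import socket


def resolved_addresses(hostname, port=80):
    """Return the sorted, de-duplicated IP addresses that the system
    resolver (getaddrinfo) reports for hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})
```

Calling `resolved_addresses` with the overridden domain from inside the same venv should list the IP from /etc/hosts. If it does, the hosts override itself is fine and the difference is more likely in how the crawl resolves names.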

The latter.