foomo/pagespeed_exporter

Error on page scrape

Closed this issue · 4 comments

The pagespeed exporter is throwing an error when I try to run Lighthouse tests on an internal article page of one of my websites. Reports for the homepage work fine, and I'm also testing the homepage and the same article page on this site's staging server with no errors there.

I've tested the same URL on the actual PageSpeed Insights website with no problems, and I don't see any events for this URI in my WAF event log, so I'm pretty confident the requests are not being blocked. But I'm having trouble tracking this down. Any ideas where this error might be coming from?

time="2019-02-08T05:18:03Z" level=warning msg="target scraping returned an error" error="googleapi: Error 500: Lighthouse returned error: FAILED_DOCUMENT_REQUEST. Lighthouse was unable to reliably load the page you requested. Make sure you are testing the correct URL and that the server is properly responding to all requests. (Details: net::ERR_BLOCKED_BY_CLIENT), " strategy=desktop target="https://url.com/goes/here/"

@jasondewitt

The pagespeed exporter uses the external Lighthouse API, which requires the website to be reachable from the public internet!

I haven't checked whether there is a local version of the Lighthouse API that you can run, but other than that, the site needs to be accessible from the internet (perhaps secure it with basic auth or something).

Yeah, this site is publicly available, so that isn't the problem (even though that's what the error message was telling me too). I believe this problem stems from some rate limits on the PageSpeed API side. Everything was fine for months running this with 5-minute checks on a single page, but once I started adding more pages to be checked (I added up to 4 total), I started getting this error. I have since backed off to checking only a single page again, and I am no longer having any problems.

I haven't had a lot of time to investigate this just yet; I was hoping you would recognize the error. I will have some time to work on tooling around pagespeed testing in the near future and may have further feedback on this at that point.

If you'd like, perhaps I could set up a way to disable the parallelisation option, which would in turn distribute the scrapes evenly. Scrapes would take longer, but it would lower the request rate against the API.

Hey! I wanted to note that a new version is up that does probing instead of scraping:

https://github.com/foomo/pagespeed_exporter/blob/master/example/prometheus/config.yml

This means you can probe multiple endpoints and keep the metrics separated per target.
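For anyone landing here later, a probe-style Prometheus scrape config modeled on the linked example might look like the sketch below. The target URLs and the exporter address `pagespeed-exporter:9271` are placeholders for illustration; substitute your own host and port.

```yaml
scrape_configs:
  - job_name: pagespeed
    metrics_path: /probe
    static_configs:
      - targets:
          - https://example.com/            # placeholder target
          - https://example.com/article-1/  # placeholder target
    relabel_configs:
      # Pass the target URL to the exporter as the ?target= parameter
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed URL as the instance label so metrics stay separated
      - source_labels: [__param_target]
        target_label: instance
      # Point the scrape at the exporter itself (address is an assumption)
      - target_label: __address__
        replacement: pagespeed-exporter:9271
```

This follows the usual blackbox-exporter-style relabeling pattern: Prometheus rewrites each listed URL into a query parameter and scrapes the exporter once per target.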

There is now also a "parallel" flag that disables parallelization, just in case!
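Assuming Go's standard flag syntax (the exact invocation may differ, so check `pagespeed_exporter --help`), turning parallel scraping off might look like:

```shell
# Hypothetical invocation; only the "parallel" flag name comes from the
# comment above, everything else is an assumption.
pagespeed_exporter -parallel=false
```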