mozilla/http-observatory-website

Sites behind CloudFlare Bot protection can't be scanned

ybh10 opened this issue · 3 comments

ybh10 commented

Observatory reports:

"This site returned an HTTP status code other than 200 (OK), which may cause its results to be inaccurate."

Like many websites, I use Cloudflare's Bot Fight Mode, which blocks bad bots.
If Bot Fight Mode is disabled, the issue does not occur.
The best fix would be to add all the scanners/bots to the allow list:
https://support.cloudflare.com/hc/en-us/articles/360035387431-Cloudflare-bot-products-FAQs#h_5itGQRBabQ51RwT5cNJX8u
Another solution is to publish the IP addresses, user agents, and countries of the Observatory and the other scanners so I can allow them in my firewall, but that might not work and might require upgrading to an expensive CDN plan.

@ybh10 Can you share the domain name of your site that triggers this error?

@gene1wood this happens for example with https://you.com

Thank you.

So it looks like we don't currently meet the requirements that Cloudflare places on bots like the Observatory. For Cloudflare to stop blocking the Observatory, the tasks below would be required. If there's an appetite for this, feel free to open feature requests.

  • I'm unsure whether the Observatory meets the minimum traffic requirement. It's possible.
  • We'd need to honor robots.txt, which the Observatory does not currently do. This would be a good candidate for a feature request issue.
  • We'd need to enact rate limiting, which we don't currently do. This would be a good candidate for a feature request issue.
  • We'd need to publish the IPs used by the Observatory scanners in a machine-readable format. This is doable, as we have fixed IPs. This would be a good candidate for a feature request issue.
  • We'd need to set reverse DNS for both our IPv4 addresses (e.g. 63.245.208.130) and our IPv6 addresses (e.g. 2620:101:8030:74::1:14), as it is not currently set. This would need to be done by Mozilla's SREs. If the previous code-based tasks are completed, I can work with the SREs to get this step done.
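
For anyone picking up the code-based tasks above, here is a rough Python sketch of what the robots.txt check, the rate limiting, and the machine-readable IP list could look like. This is not the Observatory's actual code. The user agent string, the `HostRateLimiter` class, the function names, and the JSON field names are all hypothetical, and the exact format Cloudflare expects for the IP list is not specified in this thread.

```python
import json
import time
from urllib.robotparser import RobotFileParser

# Hypothetical user agent; the scanner's real value may differ.
USER_AGENT = "observatory-scanner-example"


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = USER_AGENT) -> bool:
    """Return True if the given robots.txt body permits fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class HostRateLimiter:
    """Allow at most one scan per host every `min_interval_seconds`."""

    def __init__(self, min_interval_seconds: float, clock=time.monotonic):
        self.min_interval = min_interval_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.last_seen = {}

    def try_acquire(self, host: str) -> bool:
        now = self.clock()
        last = self.last_seen.get(host)
        if last is not None and now - last < self.min_interval:
            return False  # too soon since the last scan of this host
        self.last_seen[host] = now
        return True


def scanner_ip_list(ipv4, ipv6) -> str:
    """Serialize scanner IPs as JSON (field names are an assumption)."""
    return json.dumps({"ipv4": sorted(ipv4), "ipv6": sorted(ipv6)}, indent=2)
```

A scanner would call `allowed_by_robots` and `try_acquire` before each request and skip the scan if either returns `False`. The reverse DNS item is an infrastructure change, so it is not covered here.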