Automatically updated lists of known IPv4 and IPv6 address ranges. Currently offers lists of known datacentre and crawler IP ranges.
Detecting whether traffic is coming from a known IP address is useful in many contexts, but the data is difficult to come by in a usable form. This auto-updating repository provides a reliable output that other applications can ingest.
Currently the following datasets are available:
- Datacentre IP Ranges (Available as JSON, CSV, and Text)
- Crawler IP Ranges (Available as JSON, CSV, and Text)
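
As a rough sketch of how a consuming application might use one of these lists, the Python below checks whether an address falls inside any listed range. It assumes the Text output contains one CIDR range per line, which is an assumption about the file layout rather than something confirmed here. The function names `load_networks` and `is_listed` are hypothetical, and the sample ranges are reserved documentation ranges, not real data.

```python
import ipaddress


def load_networks(lines):
    """Parse an iterable of CIDR strings, skipping blanks and '#' comments."""
    networks = []
    for line in lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        # strict=False tolerates entries with host bits set, e.g. 10.0.0.1/8.
        networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_listed(address, networks):
    """Return True if the address falls inside any of the networks."""
    ip = ipaddress.ip_address(address)
    # Only compare like with like: IPv4 against IPv4, IPv6 against IPv6.
    return any(ip.version == net.version and ip in net for net in networks)


# Assumed layout: one CIDR per line. These are documentation-only ranges.
sample_lines = ["192.0.2.0/24", "", "# comment", "2001:db8::/32"]
networks = load_networks(sample_lines)

print(is_listed("192.0.2.55", networks))
print(is_listed("198.51.100.7", networks))
```

A linear scan like this is fine for occasional lookups. For high request volumes, a prefix tree or sorted interval lookup would likely be a better fit.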
You can integrate this project by downloading the desired datasource with the following links:
This script uses a number of sources freely available on GitHub and the Web to collect the IP ranges. They are:
- Amazon AWS published IP ranges
- Google Cloud Platform public IP ranges
- Microsoft Azure public IP ranges
- Cloudflare public IP ranges (IPv6)
- ASN-IP Project to retrieve the IP ranges for each ASN
- IP Index Project
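
To illustrate what collecting ranges from one of these sources can look like, here is a minimal Python sketch that extracts CIDR ranges from a document in the format AWS publishes (`prefixes` entries with `ip_prefix`, and `ipv6_prefixes` entries with `ipv6_prefix`). This is not taken from the project's own script. The function name `extract_aws_cidrs` is hypothetical, and the sample document uses reserved documentation ranges in place of real data.

```python
import ipaddress


def extract_aws_cidrs(document):
    """Collect unique IPv4 and IPv6 CIDRs from a parsed AWS-format document."""
    cidrs = set()
    for entry in document.get("prefixes", []):
        cidrs.add(entry["ip_prefix"])
    for entry in document.get("ipv6_prefixes", []):
        cidrs.add(entry["ipv6_prefix"])
    # Parsing validates each entry; sorting by (version, network) puts
    # IPv4 ranges first and orders each family numerically.
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    networks.sort(key=lambda net: (net.version, net))
    return [str(net) for net in networks]


# Hand-written sample in the published layout, not real AWS data.
sample_document = {
    "prefixes": [
        {"ip_prefix": "198.51.100.0/24", "region": "example", "service": "EC2"},
        {"ip_prefix": "192.0.2.0/24", "region": "example", "service": "EC2"},
        {"ip_prefix": "192.0.2.0/24", "region": "example", "service": "AMAZON"},
    ],
    "ipv6_prefixes": [
        {"ipv6_prefix": "2001:db8::/32", "region": "example", "service": "EC2"},
    ],
}

aws_cidrs = extract_aws_cidrs(sample_document)
print(aws_cidrs)
```

The same prefix can appear under several services, which is why the sketch de-duplicates before sorting.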
USE AT YOUR OWN RISK! The IPs in the list are based only on ASN allocation. There is no foolproof way of knowing whether users or robots are attached to the IP addresses.
The script reads and scrapes pages of published IP addresses for the following crawlers:
- GoogleBot
- BingBot
- AhrefsBot
- AppleBot
- FacebookBot
- OpenAI Bots
- DuckDuckBot
- OnCrawl Bot
- SiteImprove Bot
- YandexBot
- UptimeRobot
- PingdomBot
- AmazonBot
The lists are updated daily, provided that any of the IP ranges have changed or new ASNs have been captured.
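
One way an "only when something changed" check could work is to compare a digest of the freshly collected list against the file already on disk. The Python sketch below shows that idea only; how this repository actually detects changes is not stated here. The function name `write_if_changed` and the file name `ranges.txt` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def write_if_changed(path, cidrs):
    """Write the sorted, de-duplicated list only when its content differs.

    Returns True if the file was written, False if it was already current.
    """
    content = ("\n".join(sorted(set(cidrs))) + "\n").encode("utf-8")
    new_digest = hashlib.sha256(content).hexdigest()
    if path.exists():
        old_digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if old_digest == new_digest:
            return False
    # Bytes are written directly so line endings stay identical across platforms.
    path.write_bytes(content)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "ranges.txt"
    first = write_if_changed(target, ["192.0.2.0/24", "198.51.100.0/24"])
    # Same ranges in a different order: normalisation makes this a no-op.
    second = write_if_changed(target, ["198.51.100.0/24", "192.0.2.0/24"])
    # A newly captured range changes the content, so the file is rewritten.
    third = write_if_changed(
        target, ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"]
    )

print(first, second, third)  # True False True
```

Normalising the list (de-duplicating and sorting) before hashing means a source that merely reorders its entries does not trigger an update.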
I am happy to accept suggestions of ASNs that should be captured for the datacentre IP ranges list.
If you would like the tool to capture the IP addresses of another crawler, please submit an issue with a link to its published list.