
Automatically updated list of known datacentre IP addresses in IPv4 and IPv6 ranges

Primary language: PHP. License: MIT.

IP Ranges

Automatically updated lists of known IP addresses in IPv4 and IPv6 ranges. Currently offering a list of known datacentre and crawler IP ranges.

Why?

Detecting whether traffic comes from a known datacentre or crawler IP address is useful in many contexts, but the data is difficult to obtain in a usable form. This auto-updating repository provides consistent output that other applications can ingest.

What Output is Produced?

Currently the following datasets are available:

  • Datacentre IP Ranges (Available as JSON, CSV, and Text)
  • Crawler IP Ranges (Available as JSON, CSV, and Text)

How can I integrate the data?

You can integrate this project by downloading the desired dataset from the following links:
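Once a dataset has been downloaded, a lookup against it could look like the minimal sketch below. It assumes the Text output contains one CIDR range per line, which this README does not confirm; the `parse_ranges` and `is_listed` helpers and the sample ranges (reserved documentation addresses) are hypothetical and only illustrate the idea.

```python
import ipaddress


def parse_ranges(lines):
    """Parse CIDR strings, skipping blank lines and '#' comment lines.

    Assumption: one range per line. Adjust if the real file differs.
    """
    networks = []
    for line in lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        # strict=False tolerates entries whose host bits are set.
        networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_listed(address, networks):
    """Return True if the address falls inside any of the networks."""
    ip = ipaddress.ip_address(address)
    return any(ip.version == net.version and ip in net for net in networks)


# Hypothetical sample standing in for the lines of a downloaded file.
sample = ["192.0.2.0/24", "", "2001:db8::/32"]
networks = parse_ranges(sample)

print(is_listed("192.0.2.17", networks))
print(is_listed("198.51.100.1", networks))
```

A linear scan like this is fine for occasional checks. For high request volumes, a prefix tree or sorted interval lookup would likely be a better fit.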

How are the IPs collected?

This script uses a number of sources freely available on GitHub and the Web to collect the IP ranges. They are:

Datacentres

USE AT YOUR OWN RISK! The IPs in the list are based only on ASN allocation. There is no foolproof way of knowing whether users or robots are attached to the IP addresses.

Crawlers

Reads and scrapes pages of published IP addresses for the following crawlers:

  • GoogleBot
  • BingBot
  • AhrefsBot
  • AppleBot
  • FacebookBot
  • OpenAI Bots
  • DuckDuckBot
  • OnCrawl Bot
  • SiteImprove Bot
  • YandexBot
  • UptimeRobot
  • PingdomBot
  • AmazonBot
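As a rough illustration of the scraping step, the sketch below pulls anything that parses as an IPv4 or IPv6 address or CIDR range out of a block of text. It is not the repository's actual implementation (the project is written in PHP, and each crawler publishes its list in its own format); the `extract_ranges` helper, the regular expression, and the sample text are all assumptions made for the example.

```python
import ipaddress
import re

# Candidate tokens: runs of hex digits, dots and colons, with an optional
# "/prefix" suffix. Validation is left to the ipaddress module.
TOKEN = re.compile(r"[0-9A-Fa-f:.]+(?:/\d{1,3})?")


def extract_ranges(text):
    """Return the unique, valid networks found in text, in order of appearance."""
    found = []
    for token in TOKEN.findall(text):
        if "." not in token and ":" not in token:
            continue  # plain numbers or hex-looking word fragments
        try:
            # Strip sentence punctuation; bare addresses become /32 or /128.
            net = ipaddress.ip_network(token.strip("."), strict=False)
        except ValueError:
            continue
        if str(net) not in found:
            found.append(str(net))
    return found


# Hypothetical page text using reserved documentation ranges.
page = "Crawler ranges: 192.0.2.0/24, 198.51.100.7 and 2001:db8::/32. Updated 12:30."
print(extract_ranges(page))
```

This heuristic will also pick up dotted version numbers that happen to look like IPv4 addresses, so a real scraper would more likely target the specific structure of each published page.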

How often is the list updated?

Daily, if any of the IP ranges have changed or new ASNs have been captured.
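One way to decide whether a daily run has anything new to publish is to compare the previous and current sets of ranges. The sketch below shows that idea only; it is an assumption about how such a check could work, not a description of this repository's update job, and the `normalise` and `diff_ranges` helpers are hypothetical.

```python
import ipaddress


def normalise(ranges):
    """Canonicalise CIDR strings so equivalent spellings compare equal."""
    nets = [
        ipaddress.ip_network(r.strip(), strict=False)
        for r in ranges
        if r.strip()
    ]
    return {str(n) for n in nets}


def diff_ranges(previous, current):
    """Return (added, removed) ranges between two runs, each sorted."""
    old, new = normalise(previous), normalise(current)
    return sorted(new - old), sorted(old - new)


# Hypothetical data using reserved documentation ranges.
previous = ["192.0.2.0/24", "198.51.100.0/24"]
current = ["192.0.2.0/24", "203.0.113.0/24"]

added, removed = diff_ranges(previous, current)
needs_update = bool(added or removed)
print(added, removed, needs_update)
```

Publishing only when `needs_update` is true keeps the output files unchanged on days when nothing upstream has moved.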

ASNXXXX or crawler XXXX should be on the list

I am happy to accept suggestions of ASNs that should be captured for the datacentre IP ranges list.

If you would like the tool to capture an additional crawler's IP addresses, please submit an issue with a link to the crawler's published list.