Ruby gem to check that an IP belongs to a bot, typically a search engine. This helps protect a website from fake search engines, that is, clients that only claim to be a crawler in their User-Agent.
Suppose you have a web request and you'd like to make sure it's not from a fake search engine:
bot = Legitbot.bot(userAgent, ip)
bot will be nil if no bot signature was found in the User-Agent. Otherwise, it will be an object with methods:
bot.detected_as # => :google
bot.valid? # => true
bot.fake? # => false
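The three outcomes described above (nil, a valid bot, a fake bot) can be folded into a single value that is easy to log or branch on. The following is a minimal sketch, not part of the gem: `classify_bot` is a hypothetical helper name, and it assumes only the `detected_as` and `fake?` methods shown above.

```ruby
# Hypothetical helper, not part of Legitbot: maps the result of
# Legitbot.bot(user_agent, ip) to a single symbol.
def classify_bot(bot)
  return :not_a_bot if bot.nil? # no bot signature in the User-Agent
  return :fake if bot.fake?     # bot signature present, but the IP check failed

  bot.detected_as               # e.g. :google for a verified crawler
end

# Intended use (needs the legitbot gem installed):
#   classify_bot(Legitbot.bot(user_agent, ip))
```

Returning a symbol keeps the nil check in one place, so callers do not need the safe-navigation operator everywhere.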
Sometimes you already know what search engine to expect. For example, you might be using rack-attack:
Rack::Attack.blocklist("fake Googlebot") do |req|
  req.user_agent =~ %r(Googlebot) && Legitbot::Google.fake?(req.ip)
end
Or, if you want to keep out every crawler that impersonates a search engine, whether to scrape your content or to scout your site for spammers, block them all:
Rack::Attack.blocklist 'fake search engines' do |request|
  Legitbot.bot(request.user_agent, request.ip)&.fake?
end
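If rack-attack is not in your stack, the same check can be written as a small Rack middleware. This is a sketch under stated assumptions rather than anything shipped with the gem: `RejectFakeBots` and its `detector:` argument are hypothetical names, and the detector is injected so the class can be exercised without network access. It also assumes `REMOTE_ADDR` holds the real client IP, which may not be true behind a proxy.

```ruby
# Hypothetical middleware, not part of Legitbot or Rack.
class RejectFakeBots
  # detector: any callable taking (user_agent, ip) and returning nil or an
  # object that responds to fake?, for example Legitbot.method(:bot).
  def initialize(app, detector:)
    @app = app
    @detector = detector
  end

  def call(env)
    bot = @detector.call(env['HTTP_USER_AGENT'], env['REMOTE_ADDR'])
    # Reject only when a bot signature was found and the IP check failed.
    return [403, { 'content-type' => 'text/plain' }, ["Forbidden\n"]] if bot&.fake?

    @app.call(env)
  end
end
```

Requests without a bot signature and requests from verified bots are passed to the wrapped application unchanged.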
- Ahrefs
- Applebot
- Baidu spider
- Bingbot
- DuckDuckGo bot
- Facebook crawler
- Google crawlers
- Twitterbot (the list of IPs is on the Troubleshooting page)
- Yandex robots
Apache 2.0
- Play Framework variant in Scala: play-legitbot
- Article "When (Fake) Googlebots Attack Your Rails App"
- Voight-Kampff is a Ruby gem that detects bots by User-Agent
- crawler_detect is a Ruby gem and Rack middleware to detect crawlers by a few different request headers, including User-Agent