JayBizzle/Crawler-Detect

Some Google bots are not identified

Closed this issue · 5 comments

Hi, it seems some Google bots coming from Google Cloud are not identified as bots.
Example:
IP = 104.199.13.48 | Referer = | Lang = | Host = 48.13.199.104.bc.googleusercontent.com | Nav = Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.85 Safari/537.36 | Translate = 0 | Bot = 0

Because I get a lot of bot traffic, I first check whether $_SERVER['HTTP_ACCEPT_LANGUAGE'] is empty, and then I call gethostbyaddr($_SERVER['REMOTE_ADDR']). If "google" appears in the resolved host name, I treat the request as a Google Cloud bot.

Is it possible to detect this?
Jef (sorry for my poor English...)
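
For anyone who wants to try the two-step check described above, here is a minimal sketch. It is not part of Crawler-Detect: the function name `looksLikeGoogleCloudClient` and the injectable `$resolver` parameter are hypothetical, added only so the logic can be exercised without a real DNS lookup. It follows the heuristic as stated (empty Accept-Language, then "google" in the reverse-DNS host name), so it will also flag anything else whose host name happens to contain "google".

```php
<?php

/**
 * Hypothetical helper, not part of Crawler-Detect.
 *
 * @param array         $server   Typically $_SERVER.
 * @param callable|null $resolver Reverse-DNS function; defaults to gethostbyaddr().
 */
function looksLikeGoogleCloudClient(array $server, ?callable $resolver = null): bool
{
    $resolver = $resolver ?? 'gethostbyaddr';

    // Step 1: only continue when no Accept-Language header was sent.
    if (!empty($server['HTTP_ACCEPT_LANGUAGE'])) {
        return false;
    }

    $ip = $server['REMOTE_ADDR'] ?? '';
    if ($ip === '') {
        return false;
    }

    // Step 2: reverse-resolve the client IP.
    // gethostbyaddr() returns the IP unchanged when the lookup fails
    // and false when the input is malformed.
    $host = $resolver($ip);
    if (!is_string($host) || $host === $ip) {
        return false;
    }

    // The heuristic from the comment above: "google" anywhere in the host name.
    return stripos($host, 'google') !== false;
}
```

Calling `looksLikeGoogleCloudClient($_SERVER)` uses the real `gethostbyaddr()`, which does a blocking DNS lookup on every request, so caching the result per IP is probably worth considering.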

I guess you could add a rule matching googleusercontent. I haven't seen this bot yet.

But recently I've been seeing requests from GoogleOther:

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GoogleOther) Chrome/117.0.5938.132 Safari/537.36
Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.5938.132 Mobile Safari/537.36 (compatible; GoogleOther)
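
Until a rule lands in the package, a user-agent check along these lines might serve as a stopgap. This is only a sketch: the constant `GOOGLE_OTHER_PATTERN` and the function `isGoogleOther` are hypothetical names, and nothing in this thread confirms how (or whether) Crawler-Detect itself lists this token.

```php
<?php

// Hypothetical pattern: matches the "GoogleOther" token seen in the
// user agents quoted above, case-insensitively.
const GOOGLE_OTHER_PATTERN = '/\bGoogleOther\b/i';

/**
 * Hypothetical helper, not part of Crawler-Detect.
 * Returns true when the user-agent string contains the GoogleOther token.
 */
function isGoogleOther(string $userAgent): bool
{
    return preg_match(GOOGLE_OTHER_PATTERN, $userAgent) === 1;
}
```

Both user agents quoted above contain the token, so they would match, while a plain Chrome user agent with no such token would not.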

Same here:

Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.6261.94 Mobile Safari/537.36 (compatible; GoogleOther)

Is this package still maintained? Do you know of any similar alternatives?

Yes, still maintained

PRs welcome 🙏🏻

Amazing, thanks a lot, @JayBizzle !