Reduce false positives by detecting two variables for some bots
summercms opened this issue · 0 comments
Some bots that are matched with the normal regex patterns could create a lot of false positives, because those patterns are built from common words. Checking for two variables (two separate patterns) in the user agent would reduce the number of false positive results. In this example, we take the common pattern `compatible;` and combine it with a custom regex for Blogger Bot. More bots could be added inside the `compatible;` check if their common-word patterns are found to cause a high number of false positives.
/* Reduce false positives by detecting two variables:
 * the bots are matched by finding `compatible;` and
 * then their unique regex, to improve results
 */
if (preg_match('/compatible;/u', $ua, $match)) {
    /* Detect Blogger Bot */
    if (preg_match('/blogger\.com/u', $ua, $match)) {
        $this->data->browser->name = 'Blogger Bot';
        $this->data->device->type = Constants\DeviceType::BOT;
    }
    /* .. further bot checks go here .. */
}
The above code example would detect the user agent:
Mozilla/5.0 (compatible; blogger.com)
It would first find `compatible;` and then find `blogger.com`.
This would replace the line in the PR:
Parser-PHP/data/applications-bots.php
Line 46 in 76be691
That line could create a false positive result if someone sent the user agent:
https://example.blogger.com bad bot
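The difference between the two approaches can be checked in isolation. This is a minimal sketch, not the library's code: `detectSinglePattern` and `detectTwoStep` are hypothetical helper names, and the single-pattern variant assumes the PR line matches on `blogger\.com` alone, which the issue implies but does not quote.

```php
<?php

/* Hypothetical helper: match on the bot-specific pattern alone
 * (assumed to mirror the single-pattern approach in the PR). */
function detectSinglePattern(string $ua): bool
{
    return preg_match('/blogger\.com/u', $ua) === 1;
}

/* Hypothetical helper: require `compatible;` first,
 * then the bot-specific pattern. */
function detectTwoStep(string $ua): bool
{
    if (preg_match('/compatible;/u', $ua) !== 1) {
        return false;
    }
    return preg_match('/blogger\.com/u', $ua) === 1;
}

$genuine = 'Mozilla/5.0 (compatible; blogger.com)';
$other   = 'https://example.blogger.com bad bot';

var_dump(detectSinglePattern($genuine)); // bool(true)
var_dump(detectSinglePattern($other));   // bool(true), the false positive
var_dump(detectTwoStep($genuine));       // bool(true)
var_dump(detectTwoStep($other));         // bool(false)
```

The second user agent is rejected by the two-step check only because it lacks `compatible;`, so a user agent that contains both strings would still be treated as Blogger Bot.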
Some other bot user agents could also be put inside this `if` statement to avoid other false positive results.
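One way to hold several bots under the same `compatible;` check is a small lookup table. The sketch below is only an illustration under stated assumptions: `COMPATIBLE_BOTS` and `detectCompatibleBot` are hypothetical names, only the Blogger Bot entry comes from this issue, and `Example Bot` is a placeholder rather than a real bot.

```php
<?php

/* Hypothetical table: bot name => pattern that is only tried
 * after `compatible;` has matched. 'Example Bot' is a placeholder. */
const COMPATIBLE_BOTS = [
    'Blogger Bot' => '/blogger\.com/u',
    'Example Bot' => '/examplebot/u',
];

/* Returns the bot name, or null when the user agent does not
 * contain `compatible;` or matches none of the listed patterns. */
function detectCompatibleBot(string $ua): ?string
{
    if (preg_match('/compatible;/u', $ua) !== 1) {
        return null;
    }
    foreach (COMPATIBLE_BOTS as $name => $pattern) {
        if (preg_match($pattern, $ua) === 1) {
            return $name;
        }
    }
    return null;
}

var_dump(detectCompatibleBot('Mozilla/5.0 (compatible; blogger.com)')); // string(11) "Blogger Bot"
var_dump(detectCompatibleBot('https://example.blogger.com bad bot'));   // NULL
```

In the parser itself, the returned name would presumably be assigned to `$this->data->browser->name` as in the snippet earlier in this issue, but that wiring is left out here.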