oalders/http-browserdetect

Undefined user agent triggers warnings

mauro76 opened this issue · 3 comments

I've created a fork, but then looking at the unit test I'm not sure how I can unit test it.
I have examples of the user agent in input is undef. This is causing method version() to trigger a warning on:

my $version = $self->{major} + $self->{minor};

I could workaround this by calling using HTTP::BrowserDetect only on defined user agent, but I think the module itself should cope with it. I would suggest to do a:

return undef unless defined $self->{user_agent}

or even check for defined minor/minor.

Any comment?

$ perl -e 'use strict; use warnings; use HTTP::BrowserDetect; my $ua; my $browser = HTTP::BrowserDetect->new($ua); print $browser->version();'
Use of uninitialized value in addition (+) at /usr/local/share/perl5/HTTP/BrowserDetect.pm line 1173.
Use of uninitialized value in addition (+) at /usr/local/share/perl5/HTTP/BrowserDetect.pm line 1173.
0

If the module just silently accepted an undef, that would be behaviour which I would find surprising. ie if the calling code at some point passes an undef variable consistently, you'd never find out about it via warnings. Also, in the wild, if you're not getting a UserAgent string then either A) something is very wrong or B) you're dealing with a badly behaved bot. Those are two cases which I'd want to get a warning about.

Also, keep in mind that there are many methods in this module. Like a lot. If any of them are relying on variables being concatenated then all of those methods will likewise need to account for undef.

Having said all that, version() is probably not what you want. Have a look at public_version() and engine_version(). I just documented this but have yet to release it.

What's the actual scenario here which is causing the problem? Is it not something that can easily be dealt with by the calling code?

Looking at one of the app servers I can see, in about 20 minutes, about 600 requests with undefined user agent from 91 different IPs. Multiply this for many servers we have and you can understand it's not an isolated issue. Warning as quite continuous.
The IPs are spread all over the world, many from cloud providers, and we have no intention to stop the crawling.
We run a rater important website and we have many crawlers fetching our pages.

It's quite easy for me to workaround this, by checking the user agent in input and not call HTTP::BrowserDetect when undefined.
At the same time I expect websites of medium size to be crawled by any sort of bot, including bad ones with empty user agent.
We don't want to block them, we use the version only for the usual browser edge cases, e.g. IE.
I do appreciate this module has many methods and I can see how hard it would be to change all of them.

I'm happy to switch to public_version(), I can see it works well for our case.

Thanks for sharing your use case. I'm certainly not suggesting that it's OK to clutter your logs with warnings about undef UserAgent strings. I think for this case, it's good to handle the logic in your code as you're proposing.

BTW, warnings were introduced in 7b3317f (Dec 2012), so at the very least it's not a new change. We've been able to resolve most of the issues around the warnings and they've been quite helpful in find bugs we didn't know about.