Recommend User-Agent variation?
I was chasing a bug and looking through my httpd logs and there are a whole lot of lines in there where the User-Agent is “https://github.com/blakeembrey/popsicle” which is unhelpful and probably bad practice by whoever’s using popsicle. I’m not sure there’s an automated solution (love to be wrong) but perhaps the README could gently suggest that people override the agent string?
@timbray The User-Agent nowadays is "Popsicle (https://github.com/serviejs/popsicle)" but, yes, if you wanted something customized you should override it. Is there a reason this is unhelpful though? What would you expect for the User-Agent?
For reference, I think request doesn't add a header, got uses a similar scheme, and requests (Python) also appears to match.
User-agents that are browsers normally name the browser. Those that are bots or apps usually name the bot or app. Then there are those that just unhelpfully name a library, like these people using Popsicle. In most cases it's not a big deal, unless they're hitting a site hard (which some of the Popsicle users are), in which case if they don't have a real User-Agent you can't use robots.txt (and they probably don't support it, which they should).
Of the agents hitting my blog hard and identifying themselves only as libraries, Popsicle is currently in the lead. If it gets worse I can of course configure my httpd to simply reject all requests with a User-Agent string including "Popsicle", but I’d rather not if I don’t have to, and it would be unfair to users of Popsicle who aren't writing stupid bots.
Things you might do to make your users more polite by default: 1. Just put a line in your README.md saying “if you are doing anything nontrivial, please override the User-Agent and respect robots.txt”. 2. Make the default user-agent string include "PLEASE REPLACE ME".
BTW I fully understand that you’re not responsible for what people do with your open-source software.
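For anyone landing here who wants to follow the first suggestion, here is a minimal sketch of building a descriptive User-Agent value. It is not Popsicle's API: `AgentInfo`, `buildUserAgent`, `requestHeaders`, the app name, and the contact URL are all hypothetical names made up for illustration. The `name/version (+contact)` shape is a common convention among bots, not a requirement.

```typescript
// Hypothetical helper; none of these names come from Popsicle itself.
interface AgentInfo {
  name: string;     // app or bot name, so server operators can tell who is calling
  version: string;  // version of the app, not of the HTTP library
  contact: string;  // URL or email where the operator can be reached
  library?: string; // optional: the underlying HTTP library, kept as a trailing token
}

function buildUserAgent(info: AgentInfo): string {
  // App token goes first so logs and robots.txt rules can target the app by name.
  const base = `${info.name}/${info.version} (+${info.contact})`;
  return info.library ? `${base} ${info.library}` : base;
}

function requestHeaders(info: AgentInfo): Record<string, string> {
  // A plain headers object; most HTTP clients accept something of this shape.
  return { "User-Agent": buildUserAgent(info) };
}

// Example values are assumptions for illustration only.
const headers = requestHeaders({
  name: "example-feed-fetcher",
  version: "1.2.0",
  contact: "https://example.com/bot-info",
  library: "Popsicle",
});
```

The resulting object can then be passed as the request headers to whichever client you use; check that client's documentation for the exact option name.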
Closed with 6b479ea.