taxon: provide better hinting for "best match" for common name
synrg opened this issue · 2 comments
Problem: the user meant to match a particular common name and the bot selected a different one
Two examples brought up on Discord:
`,t kagu` matches an Estonian common name for a bird with more observations than the desired match:
The correct result can be brought up with the `,t lang` subcommand, but almost nobody knows about this:
In fact, there's a second problem here, which is that "kagu" is not the preferred common name for the genus. "kagus" is. This, too, can be provided for using a `rank sp` filter, but again, most users don't remember that this can be done.
`,t redshank` matches a non-preferred common name:
If the bird was meant, the user needs to know that `in birds` must be specified to filter:
Subproblems:
There is a whole grab-bag full of different issues here:
- non-preferred name is always considered "best" by iNat (i.e. try the same searches on the iNat website and you'll get the same "top" result for "kagu", as well as the same "top" result for "redshank")
- non-English common names are not filtered out, and Dronefly provides no hint that you can even do that with the `,t lang` subcommand
- Dronefly doesn't give you a chance to select the second-best name if the "best" was not the one you meant
Possible approaches:
- a user setting and/or server setting for language, e.g. `,user set lang en` and/or `,inat set lang en`, that makes it ignore any non-English names (i.e. changes `,t` behaviour to be like `,t lang en`)
  - to override for one command when they have a default set, the user would need to type, e.g. `,t lang any kagu`
- always prioritize preferred common name matches over other matches
  - though this may not always be the "best" choice for the user. It seems OK for these cases, but what if the user really did want spotted lady's thumb? They'd be confused, and even more confused by the fact that the iNat website gives them the "correct" result, from their perspective, whereas Dronefly gives them the "incorrect" result
  - therefore, this is not an approach that I particularly like
- give better feedback when there are other possible matches
  - if a common name has other possible matches, summarize what they are
    - at least count them
    - possibly also group the alternatives by the iconic taxa or else lowest rank at which the choices diverge, e.g.
      There are 4 other matches for `redshank`, 2 in Aves (Birds), and 2 in Plantae (Plants).
      Use `,s taxa redshank` to search for other matching names.
      Try `in` to match `redshank` in another taxon, e.g. `,t redshank in aves`
- provide a command to restrict matches to the preferred common name for the user's home place, e.g. `,t preferred redshank`
  - and if the top non-preferred name is matched by `,t redshank`, then show a tip about using `,t preferred redshank` to only match preferred names
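The "give better feedback" approach could be sketched roughly as follows. This is only an illustration, not Dronefly's actual code: the function name `summarize_other_matches` and the input shape (a list of dicts carrying an `iconic_taxon_name` key) are assumptions made for the example.

```python
from collections import Counter


def summarize_other_matches(query, other_matches):
    """Build a one-line hint describing matches other than the top one.

    `other_matches` is a list of dicts, each with an "iconic_taxon_name"
    key (hypothetical shape, loosely modelled on iNat taxon records).
    """
    if not other_matches:
        return ""
    counts = Counter(
        m.get("iconic_taxon_name") or "Unknown" for m in other_matches
    )
    # Largest group first; ties are broken alphabetically for stable output.
    groups = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    parts = [f"{n} in {name}" for name, n in groups]
    if len(parts) > 1:
        grouped = ", ".join(parts[:-1]) + " and " + parts[-1]
    else:
        grouped = parts[0]
    total = len(other_matches)
    verb = "is" if total == 1 else "are"
    noun = "match" if total == 1 else "matches"
    return (
        f"There {verb} {total} other {noun} for `{query}`, {grouped}. "
        f"Use `,s taxa {query}` to search for other matching names."
    )
```

Grouping by the lowest rank at which the choices diverge would need ancestry data for each match, so the sketch only covers the simpler iconic-taxon grouping.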
Each one of these possible solutions should have its own issue. None of them are mutually exclusive, and each covers a different aspect of the overall problem.
Note that #164 partially implements `,user set lang en`, but it does not prevent non-English common names from matching.
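For illustration, the missing piece might look something like the sketch below. Everything here is hypothetical: the `matched_locale` key, both function names, and the precedence of user setting over server setting are assumptions, not what #164 or Dronefly actually does. The only behaviour taken from this issue is that a per-command `lang` option (including `any`) overrides a stored default.

```python
def effective_language(command_lang=None, user_lang=None, server_lang=None):
    """Pick the language to filter by.

    Assumed precedence: per-command option, then the user setting,
    then the server setting. Returns None when nothing is set.
    """
    for choice in (command_lang, user_lang, server_lang):
        if choice is not None:
            return choice
    return None


def filter_by_language(candidates, lang):
    """Keep only candidates whose matched name is in `lang`.

    A `lang` of None or "any" disables filtering. Each candidate is a
    dict with a hypothetical "matched_locale" key; a locale of None
    (e.g. a scientific-name match) is always kept.
    """
    if lang is None or lang == "any":
        return list(candidates)
    return [c for c in candidates if c.get("matched_locale") in (None, lang)]
```

With a stored default of `en`, `,t kagu` would then filter on `en`, while `,t lang any kagu` would pass `any` as the per-command option and skip filtering.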
Of all of the possible approaches I listed above, I prefer keeping the matched name as close as possible to how it works on the web, while also providing an easy way to select an alternative if the top match isn't what was desired. That, in effect, is what the autocomplete on the website accomplishes: the top match is usually, but not always, what the user wanted. Therefore, the rest of the matches are shown in case they wanted a different match.
Here's another case, but it's not quite the same as the cases above. If you search for `rhododendrons`, the `matched_term` is `rhododendronsläktet`, which very likely is not what you'd like to see:
See https://api.inaturalist.org/v1/taxa/autocomplete?q=rhododendrons
If you look at how it works on the web, it's not showing matched term here, but instead shows the preferred common name:
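A minimal sketch of that display rule, for illustration only. It assumes each autocomplete result is a dict with `matched_term`, `preferred_common_name` and `name` (scientific name) keys; only `matched_term` is mentioned in this issue, and the function names are hypothetical.

```python
def display_name(taxon):
    """Pick the name to show for a matched taxon.

    Prefers "preferred_common_name" and falls back to the scientific
    "name" when no preferred common name is present.
    """
    return taxon.get("preferred_common_name") or taxon["name"]


def matched_term_note(taxon):
    """Return a short note when the match came via some other name.

    Returns "" when there is no matched term, or when the matched term
    is the same (ignoring case) as the displayed or scientific name.
    """
    matched = taxon.get("matched_term")
    if not matched:
        return ""
    shown = {display_name(taxon).casefold(), taxon["name"].casefold()}
    if matched.casefold() in shown:
        return ""
    return f"(matched: {matched})"
```

Showing the note alongside the preferred name would keep the result recognisable while still explaining why an unexpected term such as `rhododendronsläktet` produced the match.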