taxon: handle "of" as non-keyword in middle of phrase
synrg opened this issue · 3 comments
The query `,t chicken of the woods` unexpectedly returns Verbascum thapsus (wooly mullein) (Charmin of the woods). Apparently, `of` is treated as a keyword. However, in this context, where the words preceding "of" aren't part of another option, it makes no sense to treat it as a keyword. The natural language query parser should therefore treat it as an ordinary word so that the expected result, Laetiporus sulphureus (chicken of the woods), is found.
This could be achieved in two passes:
- pass 1: substitute all keywords except "of" with the unix-like option instead (--by, --from, etc.)
- pass 2: if, after pass 1, there are words preceding the first option, then:
  - add the implicit --of before them, and make no of -> --of substitutions
  - otherwise, scan the whole query for any "of". If there is a match, substitute only the first one
Expected outputs for example queries following the above steps:
,t chicken
-> ,t chicken
,obs of chicken
-> ,obs --of chicken
,t chicken of the woods
-> ,t --of chicken of the woods
,obs by me of chicken
-> ,obs --by me --of chicken
,obs by me of chicken of the woods
-> ,obs --by me --of chicken of the woods
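The two passes can be sketched roughly as follows. This is only an illustration, not the bot's actual parser: `expand_two_pass` and the `KEYWORDS` set are hypothetical names, the keyword set holds just the two keywords named above, and the function takes the argument text only (without the `,t` or `,obs` command). To reproduce the expected outputs listed above, the sketch also assumes that a literal "of" counts as a candidate option when checking for leading words, and that a query with no keywords at all is left unchanged.

```python
# Hypothetical keyword set: the issue names only --by and --from ("etc.").
KEYWORDS = {"by", "from"}


def expand_two_pass(query: str) -> str:
    """Expand natural-language keywords in the argument text of a query."""
    # Pass 1: substitute every keyword except "of" with its option form.
    tokens = [f"--{t.lower()}" if t.lower() in KEYWORDS else t for t in query.split()]

    # Pass 2: locate the first option. Assumption: a literal "of" also counts
    # here, so that "chicken of the woods" is seen as having leading words.
    first = next(
        (i for i, t in enumerate(tokens) if t.startswith("--") or t.lower() == "of"),
        None,
    )
    if first is None:
        # No keywords at all: leave the query unchanged.
        return " ".join(tokens)
    if first > 0:
        # Words precede the first option: add the implicit --of before them
        # and make no of -> --of substitutions.
        return " ".join(["--of", *tokens])
    # Otherwise substitute only the first "of", if there is one.
    for i, t in enumerate(tokens):
        if t.lower() == "of":
            tokens[i] = "--of"
            break
    return " ".join(tokens)
```

For example, `expand_two_pass("by me of chicken of the woods")` returns `--by me --of chicken of the woods`, which matches the last expected output above.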
Thanks, @Riviera, for drawing this to my attention on iNat Discord.
It would be cleaner to handle this left-to-right in one pass, i.e.
- scan and expand (or collect, and expand at the end) all tokens left to right
- if a non-option, non-macro keyword is encountered, `--of` is immediately inserted into the expanded token list
- after an `--of` is either inserted by this method or is explicitly encountered later in the argument list, no further occurrences of the token `of` will be transformed into `--of`, i.e. it will just be treated as the ordinary word `of`
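A minimal sketch of the single-pass idea, under some assumptions: `expand_one_pass` and `KEYWORDS` are hypothetical names, macro expansion is left out entirely, and the implicit `--of` is inserted only when a plain word appears before any option has been seen (otherwise the word is taken to be an argument of the preceding option). Under this reading a bare query such as `chicken` would become `--of chicken`.

```python
# Hypothetical keyword set: the issue names only --by and --from ("etc.").
KEYWORDS = {"by", "from"}


def expand_one_pass(query: str) -> str:
    """Expand keywords left to right, collecting tokens as we go."""
    expanded = []
    seen_option = False  # has any option been emitted yet?
    seen_of = False      # has --of been inserted or explicitly encountered?
    for token in query.split():
        word = token.lower()
        if token.startswith("--") or word in KEYWORDS:
            # Explicit option, or a keyword other than "of".
            option = token if token.startswith("--") else f"--{word}"
            expanded.append(option)
            seen_option = True
            seen_of = seen_of or option == "--of"
        elif word == "of" and not seen_of:
            # Only the first "of" becomes --of, and only if no --of exists yet.
            expanded.append("--of")
            seen_option = seen_of = True
        elif not seen_option:
            # Assumption: a plain word with no option before it starts the
            # taxon phrase, so the implicit --of is inserted immediately.
            expanded.extend(["--of", token])
            seen_option = seen_of = True
        else:
            # Ordinary word, including any later "of".
            expanded.append(token)
    return " ".join(expanded)
```

With this sketch, `expand_one_pass("chicken of the woods")` yields `--of chicken of the woods`, since the `--of` inserted before `chicken` stops the later `of` from being expanded.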