Cannot search headers containing colon (`:`)
Closed this issue · 6 comments
Context
No response
Bug description
When a symbol is defined as a separator, we expect that symbol to be searchable, that is, the search string should be tokenized exactly the same way as the docs' contents. Based on observations, this does not appear to be the case with the built-in search plugin.
Related links
- plugins.search.separator docs
- Original observation of this issue from @viceice and @HonkingGoose in Renovate's docs and our investigation history.
Reproduction
Created by following
- https://squidfunk.github.io/mkdocs-material/bug-report/reproduction/
- https://squidfunk.github.io/mkdocs-material/setup/setting-up-site-search/#built-in-search-plugin
- Configure separator to include spaces, dash and colon
example.zip (browse same sources on GitHub)
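To make the expectation concrete, here is a minimal Python sketch of separator-based tokenization. It is not the plugin's actual code: the `SEPARATOR` pattern is an assumed stand-in for the value configured in the reproduction (whitespace, dash and colon), and `tokenize` is a hypothetical helper.

```python
import re

# Assumed separator: whitespace, dash and colon. This pattern is illustrative
# and not copied from the reproduction's mkdocs.yml.
SEPARATOR = re.compile(r"[\s\-:]+")


def tokenize(text: str) -> list[str]:
    """Split text on the separator and lowercase each token (hypothetical helper)."""
    return [token.lower() for token in SEPARATOR.split(text) if token]


# The expectation in this report: a heading and a query with the same text
# should produce the same tokens.
heading_tokens = tokenize("colon:separated")
query_tokens = tokenize("colon:separated")
print(heading_tokens == query_tokens)
```

Under this assumption, `dash-separated` and `colon:separated` would both split into two tokens, so both searches would be expected to match.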
Steps to reproduce
- Remove `- info` from `plugins:`
- Run `mkdocs serve`
- Search for `dash-separated` and observe a match.
- Search for `colon:separated` and be baffled.
- For control, search for `separated` and observe that tokenization works.
- Try `"colon:separated"`. It works, because quoting is the usual syntax for an "exact match".
Browser
No response
Before submitting
- I have read and followed the bug reporting guidelines.
- I have attached links to the documentation, and possibly related issues or discussions.
- I assure that I have removed all customizations before submitting this bug report.
- I have attached a .zip file with a minimal reproduction.
Thanks for reporting. This is a feature and not a bug: `:` is considered a control character that lets you specify which field to search. For example, `title:foo` will only search the title of documents for `foo`. What we might be able to do is allow `:` as long as the word before the colon is not the name of a field, but I'm not sure this is a good idea, because then the behaviour for `title`, `text` and `tags` differs from all other cases.
Normally, this isn't a problem because people don't search for `:`. I'll check if we can somehow limit this to active fields, but if that doesn't prove to be viable, you might still fork the theme and adjust the query lexing stage.
Fixed in a6436bd. When the query lexer returns an unknown field, we transform the query to replace the `:` with whitespace, so the search pipeline doesn't consider this as a field name. Note that the query lexing approach is very new and needs to be better incorporated into the search worker. I'll tackle this when I work on search the next time, which will be very soon given how many open change requests concerning search there currently are.
For the time being, this fix should make the search in your docs functional again.
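The behaviour described above can be sketched roughly as follows. This is a Python illustration only and not the code from a6436bd: `normalize_query` is a hypothetical name, and treating `title`, `text` and `tags` as the complete set of known fields is an assumption based on the fields mentioned in this thread.

```python
import re

# Field names mentioned in this thread. Treating them as the complete set of
# searchable fields is an assumption made for this sketch.
KNOWN_FIELDS = {"title", "text", "tags"}

# A word directly followed by a colon and a non-space character, e.g. "title:foo".
FIELD_PREFIX = re.compile(r"(\w+):(?=\S)")


def normalize_query(query: str) -> str:
    """Replace ':' with a space unless the word before it is a known field."""

    def replace(match: re.Match) -> str:
        field = match.group(1)
        if field.lower() in KNOWN_FIELDS:
            return match.group(0)  # keep field-scoped syntax such as "title:foo"
        return f"{field} "  # unknown field: treat ':' like whitespace

    return FIELD_PREFIX.sub(replace, query)


print(normalize_query("colon:separated"))
print(normalize_query("title:foo"))
```

With this approach, `colon:separated` is handed to the search pipeline as two plain terms, while `title:foo` remains a field-scoped query. The sketch deliberately ignores quoting and other query syntax.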
I closed this too early. We need to keep this open until the fix is released.
Thanks for working on this. ❤️
I can confirm that 9.0.7 fixed this problem for us in production as well.
Thank you very much for working on this, and getting it fixed so quickly! ❤️