BambusControl/obsidian-unicode-search

Missing Unicode Characters

Closed this issue · 4 comments

Unicode combining characters (e.g., U+0304 Combining Macron) do not appear in the results when their codepoints are searched for with Unicode Search.

You are correct, not all characters are present in the plugin.

Your character belongs to the Nonspacing Mark (Mn) category, and that is why you cannot find it.
The plugin includes only characters that meet the following criteria:

  • Only characters from the Basic Multilingual Plane (codepoints U+0000 to U+FFFF) are included.
  • General categories starting with C (Other, which includes control characters) and M (Mark) are excluded.
  • Codepoints whose names begin and end with angle brackets <> (such as <control>) are excluded.
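
For illustration, the three criteria can be sketched as a small filter. This is not the plugin's actual code (the plugin itself is not written in Python); the function name `is_included` and the `SAMPLE_RECORDS` list are made up for this example, and the records simply mirror the codepoint, name, and general category fields found in the Unicode Character Database.

```python
BMP_MAX = 0xFFFF  # last codepoint of the Basic Multilingual Plane


def is_included(codepoint: int, name: str, category: str) -> bool:
    """Hypothetical filter mirroring the three criteria listed above."""
    # 1. Only the Basic Multilingual Plane.
    if codepoint > BMP_MAX:
        return False
    # 2. Exclude general categories starting with C (Other) or M (Mark).
    if category[:1] in ("C", "M"):
        return False
    # 3. Exclude placeholder names wrapped in angle brackets, e.g. <control>.
    if name.startswith("<") and name.endswith(">"):
        return False
    return True


# Illustrative records: (codepoint, name, general category).
SAMPLE_RECORDS = [
    (0x0000, "<control>", "Cc"),
    (0x0041, "LATIN CAPITAL LETTER A", "Lu"),
    (0x0304, "COMBINING MACRON", "Mn"),
    (0x4E00, "<CJK Ideograph, First>", "Lo"),
    (0x1F600, "GRINNING FACE", "So"),
]

included = [
    f"U+{cp:04X}"
    for cp, name, cat in SAMPLE_RECORDS
    if is_included(cp, name, cat)
]
print(included)  # ['U+0041']
```

In this sketch U+0304 is dropped by the second rule alone, which matches the behavior reported in this issue.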

This was a compromise made for the usability of the plugin, mainly for search performance (the search would freeze periodically).
The former plugin version (especially on mobile devices) wasn't great at filtering (neither quickly nor well), so I removed as much as I could at the time.

The overall expansion of UCD characters is one of the next steps in the project, although it requires some thought.

Thank you for this fabulous and slick plugin! This fills such an important and critical need within Obsidian.
I'd also really like to be able to use it to add Unicode characters from the Control and Mark categories.
Would it be possible, please, to add an option to support the full Unicode set or, if that is too heavy, an option for users to specify additional ranges of Unicode characters to support?

Thank you for your kind words!
I'm glad that you find the plugin useful.
I was considering implementing the ability for users to choose code planes/blocks they would like to have accessible to them.

Implemented in Release 0.6.0.

I've added a plugin settings tab, where you can select which blocks and categories you want to include in the search results.
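
As a rough sketch of what selecting blocks and categories can mean for a search filter: the `SearchSettings` shape and the `is_searchable` function below are assumptions made for this example, not the plugin's real settings format. A codepoint is kept only if it falls inside an enabled block and has an enabled general category.

```python
import unicodedata
from dataclasses import dataclass, field


@dataclass
class SearchSettings:
    """Hypothetical settings: enabled block ranges and general categories."""
    blocks: list[tuple[int, int]] = field(default_factory=list)
    categories: set[str] = field(default_factory=set)


def is_searchable(codepoint: int, settings: SearchSettings) -> bool:
    # The codepoint must lie inside at least one enabled block (inclusive range).
    in_block = any(start <= codepoint <= end for start, end in settings.blocks)
    # Its general category (e.g. "Lu", "Mn") must also be enabled.
    category = unicodedata.category(chr(codepoint))
    return in_block and category in settings.categories


settings = SearchSettings(
    blocks=[
        (0x0000, 0x007F),  # Basic Latin
        (0x0300, 0x036F),  # Combining Diacritical Marks
    ],
    categories={"Lu", "Ll", "Mn"},
)

print(is_searchable(0x0304, settings))  # True: Combining Macron, category Mn
print(is_searchable(0x0031, settings))  # False: DIGIT ONE, category Nd is not enabled
```

With settings like these, the Combining Macron from the original report becomes searchable once its block and the Mn category are both enabled.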