universal-ctags/issues-we-will-not-fix-in-soon

[Feature Request] Avoid comments in tag search pattern

Sometimes comments contain multi-byte characters that are not UTF-8 encoded. In that case, the comment bytes in the tags file may not match those in the source code, because the encodings differ.

Could an option be provided to leave comments out of the tag search pattern, to avoid this problem?
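
To make the mismatch concrete, here is a minimal sketch. It assumes the second encoding is Shift_JIS, which is only an example and not something stated in this report; `same_bytes` is a hypothetical helper. The same character is stored as different bytes in the two encodings, so a byte-for-byte comparison fails.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The same character (HIRAGANA LETTER A, U+3042) in two encodings.
 * Shift_JIS is an assumed example of a non-UTF-8 encoding. */
const unsigned char a_utf8[] = { 0xE3, 0x81, 0x82 };
const unsigned char a_sjis[] = { 0x82, 0xA0 };

/* Hypothetical helper: byte-wise comparison, which is roughly what a
 * literal pattern match amounts to. Returns 1 on match, 0 otherwise. */
int same_bytes(const unsigned char *a, size_t alen,
               const unsigned char *b, size_t blen)
{
    assert(a != NULL && b != NULL);
    return alen == blen && memcmp(a, b, alen) == 0;
}
```

Calling `same_bytes(a_utf8, sizeof a_utf8, a_sjis, sizeof a_sjis)` returns 0 even though both arrays encode the same character.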

Thanks!

This is difficult to solve.

The pattern fields are not made by a parser.
The common part of ctags builds the pattern from the line number reported by a parser.
This means we cannot use knowledge of the target programming language when making the pattern fields.
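
As a rough illustration of that division of labour, the sketch below builds a "/^...$/" pattern from the raw text of one line. `make_pattern` is a hypothetical helper written for this comment, not the actual ctags function, and the escaping rules (backslash and the '/' delimiter only) are a simplifying assumption. The point is that it sees only bytes, never language syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the real ctags code: build a "/^...$/"
 * search pattern from the text of one source line. It escapes only
 * the backslash and the '/' delimiter and stops at a newline.
 * Returns 0 on success, -1 if the output buffer is too small. */
int make_pattern(const char *line, char *out, size_t outsz)
{
    size_t n = 0;

    assert(line != NULL && out != NULL);
    if (outsz < 5)                /* room for "/^$/" plus NUL */
        return -1;
    out[n++] = '/';
    out[n++] = '^';
    for (const char *p = line; *p != '\0' && *p != '\n'; p++) {
        /* Worst case: 2 bytes for this character, then "$/" and NUL. */
        if (n + 5 > outsz)
            return -1;
        if (*p == '\\' || *p == '/')
            out[n++] = '\\';
        out[n++] = *p;
    }
    out[n++] = '$';
    out[n++] = '/';
    out[n] = '\0';
    return 0;
}
```

Because the function copies the line as-is, any comment text on that line ends up in the pattern along with the code.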

One idea I found is to introduce a hook that lets the parser clean up the pattern:

  1. The common part makes a pattern: "/* multi bytes string */".
  2. The pattern is passed to the C parser for clean-up.
  3. The C parser returns the cleaned-up pattern: "/* .* */".
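
The three steps above could look roughly like the sketch below. `clean_c_comments` is a hypothetical hook, not an existing ctags API. It handles only block comments that open and close on the same line, and it ignores string literals and "//" comments. Whether a reader of the tags file would treat ".*" as a wildcard is a separate question that this sketch does not address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-line clean-up hook: replace every block comment that
 * is complete on this line with a fixed wildcard comment.
 * Returns the number of comments rewritten, or -1 if out is too small. */
int clean_c_comments(const char *in, char *out, size_t outsz)
{
    static const char wildcard[] = "/* .* */";
    const size_t wlen = sizeof wildcard - 1;
    size_t n = 0;
    int replaced = 0;

    assert(in != NULL && out != NULL);
    if (outsz == 0)
        return -1;
    while (*in != '\0') {
        const char *open = strstr(in, "/*");
        const char *close = open ? strstr(open + 2, "*/") : NULL;
        /* Copy the text before the comment, or all remaining text
         * when no complete comment is left on the line. */
        size_t plain = close ? (size_t)(open - in) : strlen(in);

        if (n + plain + (close ? wlen : 0) + 1 > outsz)
            return -1;
        memcpy(out + n, in, plain);
        n += plain;
        if (close == NULL)
            break;
        memcpy(out + n, wildcard, wlen);
        n += wlen;
        in = close + 2;           /* continue after the closing marker */
        replaced++;
    }
    out[n] = '\0';
    return replaced;
}
```

A line with no complete comment is copied through unchanged and the function returns 0.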

It looks like a good idea at first glance.
However, this approach has a big limitation with multi-line comments.

/* multi
   bytes
   string */

If the parser receives only the " string */" line, the C parser cannot recognize it as part of a comment block.
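
The sketch below illustrates the limitation. `count_comment_chars` is a hypothetical scanner (string literals and "//" comments are ignored for brevity) that counts how many characters of a line belong to a block comment. The answer depends on an `in_comment` flag carried over from the previous lines, and that flag is exactly what a hook that sees a single line does not have.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scanner: count the characters of one line that belong
 * to a block comment. *in_comment says whether the previous line ended
 * inside a comment and is updated for the next line. */
size_t count_comment_chars(const char *line, int *in_comment)
{
    size_t count = 0;

    assert(line != NULL && in_comment != NULL);
    for (const char *p = line; *p != '\0'; p++) {
        if (!*in_comment && p[0] == '/' && p[1] == '*') {
            *in_comment = 1;      /* comment opens here */
            count += 2;
            p++;
        } else if (*in_comment && p[0] == '*' && p[1] == '/') {
            *in_comment = 0;      /* comment closes here */
            count += 2;
            p++;
        } else if (*in_comment) {
            count++;
        }
    }
    return count;
}
```

Fed the three lines of the comment above in order, the scanner classifies the whole last line as comment text. Fed the last line alone with `in_comment` set to 0, it finds no comment characters at all.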

A larger-scale mechanism may be needed.