bvaughn/react-highlight-words

Highlight only part of the match

keul opened this issue · 6 comments

keul commented

I have a regex like the following:

`(\\b|_|-)${filter}`

I want to find words inside a string, but if the word is "_foo" I'd like to match it too. A plain word boundary (\b) does not match there, because "_" itself counts as a word character.

The limitation I've run into is that I don't want to highlight the _ character.

I think I can achieve this by using the findChunks function, but that seems complex for a common task like this. So I'm wondering whether adding a prop like "highlightOnlyGroup" would be useful.

The idea is to be able to do something like:

`(?:\\b|_|-)(${filter})`

That is, a non-capturing group, which is ignored, followed by a capturing group; only the text inside the capturing group is highlighted.

A future alternative could be the named groups of ES2018, but we are probably not ready for that yet.
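To make the proposal concrete, here is a minimal sketch of what the pattern matches versus what it captures. The `filter` value and the sample string are hypothetical, chosen only for illustration:

```javascript
// Hypothetical filter value, for illustration only.
const filter = "foo";

// Non-capturing prefix (word boundary, "_" or "-") followed by the captured filter text.
const pattern = new RegExp(`(?:\\b|_|-)(${filter})`);

const match = pattern.exec("my _foo value");

const wholeMatch = match[0]; // "_foo": the whole match still includes the prefix
const groupText = match[1];  // "foo": the only part that should be highlighted

console.log(wholeMatch, groupText, match.index);
```

Note that `match.index` refers to the start of the whole match (the `_`), not to the start of the captured group.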

bvaughn commented

> I think I can achieve this by using the findChunks function, but it seems complex for a common task like this

This does not sound like a common task to me. 😄 It's the first time I've heard this feature mentioned in the two years since I released this little library.

If you'd like to put together a PR (including unit tests and documentation), I'll be happy to review it. No promises it gets merged, though. That would depend on the complexity.

keul commented

> This does not sound like a common task to me. 😄 It's the first time I've heard this feature mentioned in the two years since I released this little library.

😆

OK, let me take a stab at this. If the implementation turns out to be too complex, no problem; I can still use my own branch.

keul commented

I'm going to work on this (it's not a very high priority). I found that the missing feature probably belongs in the highlight-words-core package rather than this one.

bvaughn commented

Seems likely that changes would need to be proposed for both, yes.

keul commented

@bvaughn I looked into this last night, and I fear it is not easily doable due to limitations of JS regexes.

The main idea was to replace only the groups when groups are available, but there's no (easy?) way to know where a group starts or ends.

I also tried the newer regex spec with named groups (via a polyfill, or in Chrome, where it seems to be implemented) and a library like XRegExp, with no luck.

For example, in Python I can do something like:

import re

pattern = r'(?:_|-|\b)(ipsum)'
prog = re.compile(pattern)

match = prog.search('lorem-ipsum dolor')
print(match.group(1))
print(match.start(1), match.end(1))

This prints:

ipsum
6 11

In JS I can only grab the captured group's text from the match array, but not its start/end offsets.
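For comparison with the Python snippet, newer JavaScript engines (ES2022 and later) accept a `d` flag on regular expressions that adds an `indices` array to the match result. This postdates the discussion here, so treat the following as a sketch that assumes a runtime with support for that flag:

```javascript
// The "d" flag (hasIndices) asks the engine to record start/end offsets per group.
const pattern = /(?:_|-|\b)(ipsum)/d;

const match = pattern.exec("lorem-ipsum dolor");

// indices[0] covers the whole match; indices[1] covers the first capturing group.
const [start, end] = match.indices[1];

console.log(match[1]);   // ipsum
console.log(start, end); // 6 11
```

On engines without the `d` flag, constructing this regex throws a SyntaxError, so it is not a drop-in answer for older browsers.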

If you have any ideas I can look into this more, but from what I can see now this will be too complex, so I will probably switch to using findChunks (and you can close the issue).

bvaughn commented

Thanks for the update! Closing this for now then.
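A rough sketch of what such a findChunks function might look like for this particular pattern follows. It assumes the function receives an object containing `searchWords` (plain strings here) and `textToHighlight`, and returns an array of `{ start, end }` chunks. The name `findChunksSkippingPrefix` is hypothetical. Because the captured group is always the tail of the match, its offsets can be derived by arithmetic without per-group indices:

```javascript
// Hypothetical custom findChunks: highlights the word but not a leading "_" or "-".
// Assumes searchWords contains plain strings, not RegExp objects.
function findChunksSkippingPrefix({ searchWords, textToHighlight }) {
  const chunks = [];
  for (const word of searchWords) {
    if (!word) continue; // skip empty search words
    // Escape regex metacharacters so the word is matched literally.
    const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    const regex = new RegExp(`(?:\\b|_|-)(${escaped})`, "gi");
    let match;
    while ((match = regex.exec(textToHighlight)) !== null) {
      // The group is the last thing in the match, so it ends where the
      // whole match ends and starts one group-length before that.
      const end = match.index + match[0].length;
      const start = end - match[1].length;
      chunks.push({ start, end });
    }
  }
  return chunks;
}

const chunks = findChunksSkippingPrefix({
  searchWords: ["foo"],
  textToHighlight: "foo my_foo bar-foo",
});
console.log(chunks);
```

This trick only works because the prefix is the sole part of the match outside the group; a pattern with text after the group, or with several groups, would need a different approach.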