Are matches supposed to happen on partial tokens?
charles-gray opened this issue · 4 comments
Describe the bug
Comby matches "token(:[foo])" against things like "longer_token(...)", i.e. the pattern matches a suffix of the token in the input string rather than only a full token.
Reproducing
bit.ly/3AhdQFn
Expected behavior
I would expect "foo(:[hole])" not to match on "prefix_foo(bar)"; otherwise, a whole bunch of the examples on the website break.
Additional context
You can break examples on the website with tokens like "foo_if" that match on plain old "if". You can try and work around this by putting a space before your token in the match string, but then you can't match things like "bar(foo(:[hole]))" in your source because there's no space.
I assume there's some convoluted regex I can put before my token to force separation, but given that it would have to include special characters that affect balancing, I can't find one that works just yet...
Currently this is intended behavior. Two options:
- Try adding -disable-substring-matching on the command line, does that work?
echo "serif(font); if(font); ser.if(font)" | comby -stdin 'if(font)' 'foo(font)' -disable-substring-matching
- Alternatively, there is a simple regex that may work well enough using \b for word boundaries, like :[~\bif](foo)
https://bit.ly/3wrshWd
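For readers unsure what \b buys here, a minimal sketch of the word-boundary behaviour follows. It uses Python's standard re module as a stand-in, not comby itself, so it only illustrates the regex semantics and assumes comby's regex holes treat \b the same way. The names BOUNDED, SPACED and match_starts are made up for this illustration.

```python
import re

# Plain-regex stand-in for the :[~\bif] hole. This is Python's re module,
# not comby, so only the word-boundary behaviour carries over.
BOUNDED = re.compile(r"\bif\(font\)")
# The leading-space workaround mentioned earlier in the thread.
SPACED = re.compile(r" if\(font\)")


def match_starts(pattern, source):
    """Return the start offset of every non-overlapping match."""
    return [m.start() for m in pattern.finditer(source)]


flat = "serif(font); if(font); ser.if(font)"
nested = "bar(if(font))"

print(match_starts(BOUNDED, flat))    # offsets of the standalone and dotted "if"
print(match_starts(BOUNDED, nested))  # \b also holds right after "("
print(match_starts(SPACED, nested))   # no space before "if", so nothing matches
```

The "if" inside "serif" is skipped because "r" and "i" are both word characters, so no boundary sits between them. The leading-space pattern finds nothing in the nested call, which is the limitation described in the original report.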
Ah! \b looks like exactly the regex magic I was looking for, but didn't know where to start. I've done a couple of spot checks and it seems to give the results I'm expecting. I'll confirm with the entire source base soon.
I'm using the Python API so the command-line argument doesn't seem to help me here. It seems to be a missing option in the API bindings, but I guess I'd go file that on that project.
Either way, assuming I confirm the regex tweak works, that fits perfectly into what I'm doing. Thanks for the prompt reply!
OK, so after a much deeper dive, '\b' seems to be the ticket. The -disable-substring-matching flag doesn't work for the general case because it seems to still miss the case of an "_" prefix, e.g. the last addition here:
echo "serif(font); if(font); ser.if(font); ser_if(font)" | comby -stdin 'if(font)' 'foo(font)' -disable-substring-matching
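As a rough check of why \b copes with the "_" prefix, here is the same four-case input run through Python's standard re module. This is a sketch under the assumption that comby's regex holes share this word-boundary rule; it does not call comby and says nothing about how the command-line flag is implemented.

```python
import re

# Assumption: Python's re stands in for comby's regex hole here.
source = "serif(font); if(font); ser.if(font); ser_if(font)"

# "_" counts as a word character, so there is no boundary between "_" and "if".
rewritten = re.sub(r"\bif\(font\)", "foo(font)", source)
print(rewritten)
```

Note that "ser.if(font)" is still rewritten, since "." is not a word character; whether that is wanted depends on the code base.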
Thanks so much for the help again. I'll close this issue out. I'd love to see something in the docs / FAQ about this, but no longer necessary for me. Thanks!!
cool cool. I'm (slowly) revamping some of the docs page, and will try to compile a "common questions and answers" section that this might fit into. cheers!