mudge/re2

Add interface for incremental replacement

Closed this issue · 6 comments

mudge commented

As raised by @ljharb in #23, re2 doesn't currently have a way to incrementally find and replace à la Ruby's String#gsub when passed a block.

👍 awesome, looking forward to it!

I know this is a really old issue, but this would make re2 much more useful. I know there is replace_all, but having something block-based would be incredibly useful.

mudge commented

GitLab implemented a version of RE2’s GlobalReplace that takes a block in their UntrustedRegexp. It works by incrementally building up a new string by repeatedly matching and partitioning the input text, yielding the match until either the input is exhausted or there are no more matches.
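For anyone following along, here is a minimal sketch of that match-and-partition loop. It is not GitLab's code or part of the re2 gem: `replace_with_block` is a hypothetical helper, and it uses Ruby's built-in Regexp and MatchData (`pre_match`, `post_match`) as a stand-in so it runs without the gem. An RE2-based version would need an equivalent way to split the input around each match.

```ruby
# Hypothetical helper illustrating block-based replacement by
# repeatedly matching and splitting the remaining input.
# Uses Ruby's stdlib Regexp as a stand-in for an RE2 pattern.
def replace_with_block(pattern, text)
  result = +""
  remainder = text

  while (matched = pattern.match(remainder))
    # Stop on an empty match, otherwise the remainder never shrinks
    # and the loop would run forever.
    break if matched[0].empty?

    # Keep the text before the match, then append the block's result.
    result << matched.pre_match << yield(matched).to_s

    # Continue searching in whatever follows the match.
    remainder = matched.post_match
  end

  # Append the unmatched tail (or the whole input if nothing matched).
  result << remainder
end

doubled = replace_with_block(/\d+/, "a1b22c333") { |m| (m[0].to_i * 2).to_s }
puts doubled # prints "a2b44c666"
```

Because each search restarts on the remainder, anchors such as `^` are evaluated against the remainder rather than the original input, and a new remainder string is allocated on every iteration.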

I'm actually the one that wrote that 😄. You're welcome to incorporate it, or are you interested in a PR adding it?

mudge commented

Ha, well I’m glad you found a solution in the meantime.

I’m torn about whether to incorporate it in the gem as the underlying RE2’s Replace and GlobalReplace functions are quite limited so it’s really a function of Match (as you’ve done) and perhaps best left to the client.

If FindAndConsumeN (which powers the gem’s RE2::Scanner) returned the entire match as well as its capturing groups it’d be ideal: we could use it to incrementally extract and yield matches while benefiting from it using a string view internally over the original input (so there’d be no need to build new remainder strings as we go). Unfortunately it only returns the capturing groups and it feels risky to modify the client’s original pattern to add an extra, all-encompassing capturing group.
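To illustrate the concern about adding an all-encompassing group, here is a small sketch using Ruby's built-in String#scan, which (like the scanner behaviour described above) returns only the capturing groups when the pattern has any. It is an analogy with stdlib Regexp, not the re2 gem's API, and `wrap_in_group` is a hypothetical helper.

```ruby
# Hypothetical helper: wrap a pattern in one extra outer capturing group
# so that the whole match is returned alongside the original groups.
def wrap_in_group(pattern)
  Regexp.new("(#{pattern.source})", pattern.options)
end

original = /([a-z])(\d)/
wrapped  = wrap_in_group(original)

# With the original pattern, only the two groups come back.
p "a1b2".scan(original) # prints [["a", "1"], ["b", "2"]]

# With the wrapped pattern, the whole match comes first and every
# original group is shifted along by one position.
p "a1b2".scan(wrapped)  # prints [["a1", "a", "1"], ["b2", "b", "2"]]
```

The shift in group numbering is the risky part: any client code that refers to groups by index would have to be adjusted to account for the extra group.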

Makes sense. I wish mine didn't need to jump through the hoops it does.

btw, thanks for a great library.