Feature request: supporting lookarounds (PCRE2 or fancy-regex)
LeoniePhiline opened this issue · 2 comments
Feature request
First of all, thank you very much for open-sourcing fastmod! I find it quite joyful to use!
At the moment, it does not have support for positive and negative lookaheads and lookbehinds.
These are often essential for complex searches.
Security
Note that fastmod is usually run over trusted code, not over untrusted user input, so an opt-in --pcre2 or --fancy flag should not pose any unacceptable security risk.
Prior art
Note that ripgrep (which does search, but not replace) has optional support for switching its regex engine to use PCRE2.
Among other things, this makes it possible to use look-around and backreferences in your patterns, which are not supported in ripgrep's default regex engine. PCRE2 support can be enabled with -P/--pcre2 (use PCRE2 always) or --auto-hybrid-regex (use PCRE2 only if needed). An alternative syntax is provided via the --engine (default|pcre2|auto-hybrid) option.
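To illustrate the "only if needed" idea, here is a minimal sketch of how a tool could choose an engine from the pattern text alone. This is a simple heuristic scan and is not claimed to be how ripgrep implements --auto-hybrid-regex. The names Engine and pick_engine are hypothetical, and character classes such as [(?=] are not special-cased, so the scan may pick the lookaround-capable engine more often than strictly necessary.

```rust
/// Which regex engine a run should use (hypothetical names for this sketch).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Engine {
    /// The default engine, without lookaround or backreferences.
    Default,
    /// A lookaround-capable engine (PCRE2 or fancy-regex).
    Fancy,
}

/// Heuristically decide whether `pattern` needs lookaround or a numeric
/// backreference. Assumption: a plain byte scan is good enough here.
pub fn pick_engine(pattern: &str) -> Engine {
    let bytes = pattern.as_bytes();
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            b'\\' => {
                // `\1`..`\9` is treated as a backreference; any other
                // escaped byte (including `\(`) is skipped.
                if let Some(next) = bytes.get(i + 1) {
                    if (b'1'..=b'9').contains(next) {
                        return Engine::Fancy;
                    }
                }
                i += 2;
            }
            b'(' => {
                // Lookahead `(?=`, `(?!` and lookbehind `(?<=`, `(?<!`.
                let rest = &bytes[i + 1..];
                if rest.starts_with(b"?=")
                    || rest.starts_with(b"?!")
                    || rest.starts_with(b"?<=")
                    || rest.starts_with(b"?<!")
                {
                    return Engine::Fancy;
                }
                i += 1;
            }
            _ => i += 1,
        }
    }
    Engine::Default
}
```

With this sketch, pick_engine(r"foo(?=bar)") selects the lookaround-capable engine, while a named group such as (?P<name>\d+) or an escaped \(?= stays on the default engine.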
Pivot from rust-pcre2 to fancy-regex
See #49 (comment). grep::pcre2 is unlikely to expose pcre2_substitute any time soon, due to obvious maintenance overload.
The fancy-regex crate is the go-to escape from Rust's regex crate wherever lookarounds cannot be avoided.
Implementation
https://docs.rs/pcre2/ should be usable as a base for this feature.
This crate is recommended by, and instead of, https://github.com/BurntSushi/ripgrep/blob/master/crates/pcre2/README.md
However, since fastmod already uses grep (rg as a library), it might just as well enable the pcre2 cargo dependency flag and use grep::pcre2 just as ripgrep does: https://github.com/BurntSushi/ripgrep/blob/327d74f1616e135a6eb09a0c3016f8f45cfc0cfc/crates/core/search.rs#L199
Enum-dispatched regex matcher and replacer based on regex and fancy-regex.
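The shape of such an enum-dispatched replacer could look roughly like the sketch below. To keep it self-contained, PlainStub and FancyStub are placeholders that only match a literal needle; in a real implementation they would be regex::Regex and fancy_regex::Regex. All names here (PlainStub, FancyStub, AnyRegex) are hypothetical. The fallible fancy variant reflects my assumption that a backtracking engine can fail at match time, so the wrapper returns a Result for both variants.

```rust
/// Placeholder for `regex::Regex`: literal matching only, never fails.
pub struct PlainStub(pub String);

/// Placeholder for `fancy_regex::Regex`: literal matching only, but
/// fallible, to mimic an engine that can error while matching.
pub struct FancyStub(pub String);

impl PlainStub {
    fn replace_all(&self, text: &str, rep: &str) -> String {
        text.replace(self.0.as_str(), rep)
    }
}

impl FancyStub {
    fn try_replace_all(&self, text: &str, rep: &str) -> Result<String, String> {
        // Stand-in failure mode for this sketch only.
        if self.0.is_empty() {
            return Err("empty pattern".to_string());
        }
        Ok(text.replace(self.0.as_str(), rep))
    }
}

/// One type for callers, two engines behind it.
pub enum AnyRegex {
    Plain(PlainStub),
    Fancy(FancyStub),
}

impl AnyRegex {
    /// `fancy` would come from an opt-in command line flag.
    pub fn new(pattern: &str, fancy: bool) -> Self {
        if fancy {
            AnyRegex::Fancy(FancyStub(pattern.to_string()))
        } else {
            AnyRegex::Plain(PlainStub(pattern.to_string()))
        }
    }

    /// Dispatch on the variant and unify both results into one `Result`.
    pub fn replace_all(&self, text: &str, rep: &str) -> Result<String, String> {
        match self {
            AnyRegex::Plain(re) => Ok(re.replace_all(text, rep)),
            AnyRegex::Fancy(re) => re.try_replace_all(text, rep),
        }
    }
}
```

The rest of the tool would only ever call AnyRegex::replace_all, so the choice of engine stays in one place and the default path keeps its current behaviour.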
Update:
The fastmod crate is updated in #50 (can be merged!).
Looking at BurntSushi/rust-pcre2#26 and BurntSushi/rust-pcre2#27 for the implementation of PCRE2 support, this looks like a dead end.
Andrew appears to not have the bandwidth for maintenance, as even quality PRs from years ago are unreviewed.
I will drop the attempts to implement PCRE2 support and pivot to fancy-regex, which does not require unsafe code (implemented in pure Rust).
It supports lookarounds, with the same risk of catastrophic backtracking, which is not relevant to fastmod.
Another update: Andrew Gallant expressed openness towards accepting a PR to the pcre2 crate to expose substitution.
BurntSushi/ripgrep#2763 (reply in thread)
I am interested in trying to create a quality patch and get it merged. If that succeeds, then fastmod could get its PCRE2 mode after all.
The stale PRs in https://github.com/BurntSushi/rust-pcre2 do not give me much hope for a fast solution, but there are lots of reasons for hope of resolving this entanglement:
Fish shell appears to have more bandwidth (or more need) to maintain a PCRE2 substitution patch in their fork, which is already implemented.
Their fork remains maintained, since UTF-32 matching was not upstreamed after all.
This means we have a quite fragmented situation:
- BurntSushi's grep-pcre2 (which is exposed as grep::pcre2) uses his pcre2-sys C bindings crate, which does not have support for substitution and probably never will.
- Fish shell's maintained pcre2-sys fork does support substitution. But in order to use it, grep-pcre2 would need to be forked.
All this sounds quite doable for a private project, but getting the changes merged back into https://github.com/facebookincubator/ sounds ... somewhat unlikely?
➡️ Ideally, the substitution feature (which BurntSushi has indicated might well be accepted into his upstream pcre2 & pcre2-sys crates) is to be backported from Fish shell's fork into BurntSushi's upstream.
A possible route to success:
- Backport the Fish shell pcre2-sys & pcre2 fork implementation of PCRE2 substitution to BurntSushi's pcre2-sys & pcre2, and try to have it merged. Fish might want to resync their fork and keep maintaining only the UTF-32 patch on top of upstream.
- Change fastmod (possibly as an optional feature), enabling grep::pcre2 in order to use the PCRE2 Searcher/Matcher API of BurntSushi's higher-level grep-pcre2 library.
- In fastmod, depend on BurntSushi's pcre2 to apply replacements found and matched via the PCRE2 Searcher/Matcher API of BurntSushi's higher-level grep-pcre2 library (exposed as grep::pcre2).
- Expose to users as fastmod -P / fastmod --pcre2.
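As a rough sketch of the last step, the opt-in flag could map onto an engine choice like this. The parser is hand-rolled for illustration and does not reflect fastmod's actual argument handling; EngineFlag and engine_from_args are hypothetical names.

```rust
/// Engine requested on the command line (hypothetical names).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum EngineFlag {
    Default,
    Pcre2,
}

/// Return `Pcre2` if `-P` or `--pcre2` appears before a literal `--`,
/// otherwise `Default`. Everything after `--` is treated as positional,
/// so a pattern that happens to be "-P" is not mistaken for the flag.
pub fn engine_from_args<I, S>(args: I) -> EngineFlag
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    for arg in args {
        match arg.as_ref() {
            "--" => break,
            "-P" | "--pcre2" => return EngineFlag::Pcre2,
            _ => {}
        }
    }
    EngineFlag::Default
}
```

In a binary, this could be called as engine_from_args(std::env::args()), with the default engine remaining in effect whenever the flag is absent.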