Inconsistent behavior with zero-width matches on empty strings
rootCircle opened this issue · 0 comments
What version of regex are you using?
v1.10.3
Describe the bug at a high level.
replace_all in the regex crate handles zero-width (empty) matches differently from re.sub in Python's standard library. (The Rust regex crate does not seem to treat the empty string immediately after a non-empty match as a valid match.)
What are the steps to reproduce the behavior?
- Create a Regex object with the pattern r"x*" (matches zero or more "x"s).
- Apply replace_all to the string "abxd" with a hyphen as the replacement string.
- Observed output (Rust): "-a-b-d-"
- Expected output (Python): "-a-b--d-"
Rust Code
use regex::Regex;

fn main() {
    let re = Regex::new(r"x*").unwrap();
    let hay = "abxd";
    println!("{:?}", re.replace_all(hay, "-"));
}
Equivalent Python Code:
import re

regex = r"x*"
test_str = "abxd"
subst = "-"

# count=0 replaces every match; count and flags are passed by keyword
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)
if result:
    print(result)
What is the actual behavior?
replace_all in Rust replaces the empty matches before "a" and "b", the "x" itself, and the empty match at the end of the string, but not the empty string between the matched "x" and "d".
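To make it clearer which match is involved, here is a small sketch that lists every match Python's re.finditer reports for the same pattern and input. The helper name match_spans is mine and only for illustration; it is not part of either library.

```python
import re

def match_spans(pattern, hay):
    """Return (start, end, text) for every match that re.finditer reports."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, hay)]

for start, end, text in match_spans(r"x*", "abxd"):
    print(start, end, repr(text))
# 0 0 ''
# 1 1 ''
# 2 3 'x'
# 3 3 ''
# 4 4 ''
```

The empty match at offset 3 (right after the matched "x") is the one Python replaces and Rust appears to skip.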
What is the expected behavior?
Both empty strings should be replaced, resulting in "-a-b--d-".
By the way, I am not sure whether this is an intentional difference or a potential bug.
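One guess at where the difference comes from: the Rust output is what you get if an empty match is dropped whenever it starts exactly where the previous match ended. The sketch below emulates that rule on top of Python's re module. The rule itself is my assumption about the crate's iteration, not something I have confirmed, and the function name replace_all_skipping_adjacent_empty is hypothetical. The replacement is treated as a literal string.

```python
import re

def replace_all_skipping_adjacent_empty(pattern, repl, hay):
    """Replace matches, dropping empty matches adjacent to the previous match.

    Assumed rule (not taken from the regex crate's source): an empty match
    that begins where the previous accepted match ended is ignored.
    """
    rx = re.compile(pattern)
    out = []
    pos = 0          # offset where the next search starts
    copied = 0       # end of the haystack text already copied into out
    last_end = None  # end offset of the last accepted match
    while pos <= len(hay):
        m = rx.search(hay, pos)
        if m is None:
            break
        start, end = m.span()
        if start == end:
            pos = end + 1          # step past the empty match to make progress
            if end == last_end:
                continue           # adjacent to the previous match: skip it
        else:
            pos = end
        out.append(hay[copied:start])
        out.append(repl)
        copied = end
        last_end = end
    out.append(hay[copied:])
    return "".join(out)

print(replace_all_skipping_adjacent_empty(r"x*", "-", "abxd"))  # -a-b-d-
print(re.sub(r"x*", "-", "abxd"))                               # -a-b--d-
```

With this rule the emulation reproduces the Rust result for this input, while re.sub keeps the extra empty match before "d".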