rust-lang/regex

Valid prefix search (with ^) goes into dead state

acarl005 opened this issue · 3 comments

What version of regex are you using?

regex-automata = "0.4.5"

Describe the bug at a high level.

I'm trying to do a prefix search by adding a ^ at the beginning of my pattern. I'm searching right-to-left, but the dfa is failing to find a match and entering the "dead" state.

What are the steps to reproduce the behavior?

use regex_automata::{hybrid::dfa::DFA, nfa::thompson, Input};
use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
    let pattern = r"^Qu";
    let dfa = DFA::builder()
        .thompson(thompson::Config::new().reverse(true))
        .build_many(&[pattern])?;

    let mut cache = dfa.create_cache();
    let hay = "Quartz";

    let mut state = dfa.start_state_reverse(&mut cache, &Input::new(hay))?;
    for &b in hay.as_bytes().iter().rev() {
        state = dfa.next_state(&mut cache, state, b)?;
        if state.is_dead() {
            panic!("DEAD");
        }
        if state.is_match() {
            println!("BREAK");
            break;
        }
    }

    state = dfa.next_eoi_state(&mut cache, state)?;
    assert!(state.is_match());
    Ok(())
}

What is the actual behavior?

❯ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `target/debug/reg`
thread 'main' panicked at src/main.rs:17:13:
DEAD
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

What is the expected behavior?

I expect the program to run through to the end, finding a match and passing the assertion, as the pattern ^Qu should match the string "Quartz".

If I remove the ^ from the pattern, this program behaves as expected.

This is fixed in regex-automata 0.4.6 on crates.io.

Thank you for the nice repro. :)

Thanks for the quick fix!