How to extract capture group in token callback?
Hi, thanks for making this useful crate!
If I have a capture group inside a regex like this:
#[derive(Logos, Debug, PartialEq, Clone)]
#[logos(error = LexingError)]
pub enum Token {
    #[regex(r#"arbitrary regex with capture group"#, |lex| /* extract capture group */)]
    Foo(String),
}
How can I get access to the regex's capture groups (parentheses) inside the lexer callback? :)
Hello, you cannot get capture groups :/
Capture groups are (currently) ignored and flattened as simple regexes.
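Since the callback only sees the whole match, one possible workaround is to split the matched text by hand. The sketch below is an illustration, not part of Logos: `value_after_eq` is a hypothetical helper, and it assumes a token shaped like `key=value` where the part after `=` is what a capture group would have isolated. In a Logos callback, the input would be the text returned by `lex.slice()`.

```rust
/// Hypothetical helper (not a Logos API): given the full text matched by a
/// pattern such as `[a-z]+=[0-9]+`, return the part after the first `=`,
/// i.e. the piece a capture group would have isolated.
fn value_after_eq(matched: &str) -> Option<String> {
    // `split_once` returns None when there is no `=` in the input.
    let (_key, value) = matched.split_once('=')?;
    Some(value.to_string())
}

fn main() {
    // In a Logos callback this string would come from `lex.slice()`.
    let matched = "width=42";
    println!("{:?}", value_after_eq(matched));
}
```

Assuming a callback that returns `Option<String>`, it could then be written along the lines of `|lex| value_after_eq(lex.slice())`. For more complex patterns, the same idea applies: match the whole token with Logos, then parse the slice in a second step.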
Expanding the docs to clarify what sorts of regex behavior are allowed would be very useful.
It's quite understandable that if, for example, the regex is compiled into an automaton that makes a single pass over the input, it would not want to deal with the kind of lookback that capture groups involve in a general regex engine. But none of this is clear from the docs.
I'm not familiar enough with the crate internals to comment. Is there a good way to gather what is and isn't possible aside from detailed analysis of source code?
Hello @ethanmsl, did you take a look at the book?
This page in particular: https://logos.maciej.codes/common-regex.html.
The word "capture" isn't even on that page. It mentions that it avoids backtracking, but I think that's a bit of theory (raw automata behavior diverges from what now passes as standard regex, which requires backtracking because of how repeating patterns can nest) that most readers will not be able to connect to concrete behavior.
For my part, I did take a look at the book but missed that page. I imagine I keyword-searched for regex behavior terms and came up empty-handed.
I think more direct, example-rich documentation might help, but I appreciate your pointing out the page. Perhaps more methodical readers will find less need than I did! :)
Actually, I did document this in the past, but I don't know why it does not appear in the online version of the book, see 671be13 :/
logos/book/src/common-regex.md
Lines 36 to 38 in 671be13
I will see if I can retrigger a build :-)
Done :-) This is now online