Unreachable branches in LUTs are still linked
RustyYato opened this issue · 0 comments
I have some code like this:
#[derive(Logos)]
#[logos(source = [u8])]
enum Token {
    // NOTE: This is needed because logos has dot_matches_newline(false) set for regex_syntax (which is the default)
    #[token("\n")]
    Newline,
    #[regex(b".", priority = 0)]
    UnknownByte,
}
This lexer should never be able to produce an error, so I use the uninhabited error type `enum LexerError {}`, with a `Default` impl that causes a linker error in release mode if it is ever reachable:
impl Default for LexerError {
    #[cfg(not(debug_assertions))]
    fn default() -> Self {
        extern "C" {
            fn __lexer_error_unreachable_default() -> !;
        }
        // force a linker error
        unsafe { __lexer_error_unreachable_default() }
    }

    #[cfg(debug_assertions)]
    fn default() -> Self {
        panic!("It is impossible for the lexer to error")
    }
}
This would work if the LUT didn't generate the error branch, but for some reason LLVM is unable to optimize the branch out. I suspect this is because the LUT is stored in a `static`, which tends to be an optimization barrier.
To fix this, the error branch simply shouldn't be generated if it is unreachable.
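
To illustrate what I mean, here is a rough hand-written sketch of a lexer for the same two tokens. It is not what Logos generates, and the names `classify` and `lex` are made up for this example. Because the match over `u8` is total, there is no error arm at all, so nothing could reference the error type's `Default` impl:

```rust
/// Hand-written stand-in for the two tokens above (hypothetical, not Logos output).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Token {
    Newline,
    UnknownByte,
}

/// Classifies a single byte. The match covers every possible `u8` value,
/// so there is no fall-through error arm to generate or link against.
fn classify(byte: u8) -> Token {
    match byte {
        b'\n' => Token::Newline,
        _ => Token::UnknownByte,
    }
}

/// Lexes a byte slice one byte at a time; every input byte yields exactly one token.
fn lex(input: &[u8]) -> Vec<Token> {
    input.iter().copied().map(classify).collect()
}
```

Calling `lex(b"a\n\xff")` yields one token per byte, including for the invalid UTF-8 byte `0xff`, without ever needing an error value.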