assertion failed: bpos.to_u32() >= mbc.pos.to_u32() + mbc.bytes as u32
dwrensha opened this issue · 14 comments
rustc crashes on the following input, found by fuzz_rustc:
fn i(){println!("🦀%%%";r
error: this file contains an unclosed delimiter
--> bug.rs:1:25
|
1 | fn i(){println!("🦀%%%";r
| - - ^
| | |
| | unclosed delimiter
| unclosed delimiter
error: expected `,`, found `;`
--> bug.rs:1:23
|
1 | fn i(){println!("🦀%%%";r
| ^ expected `,`
error: argument never used
--> bug.rs:1:24
|
1 | fn i(){println!("🦀%%%";r
| ^ argument never used
|
thread 'rustc' panicked at 'assertion failed: bpos.to_u32() >= mbc.pos.to_u32() + mbc.bytes as u32', compiler/rustc_span/src/lib.rs:1710:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
error: internal compiler error: unexpected panic
note: the compiler unexpectedly panicked. this is a bug.
note: we would appreciate a bug report: https://github.com/rust-lang/rust/issues/new?labels=C-bug%2C+I-ICE%2C+T-compiler&template=ice.md
note: rustc 1.59.0-nightly (c09a9529c 2021-12-23) running on x86_64-unknown-linux-gnu
searched nightlies: from nightly-2021-01-01 to nightly-2021-12-24
regressed nightly: nightly-2021-10-02
searched commits: from aa7aca3 to c02371c
regressed commit: b6057bf
bisected with cargo-bisect-rustc v0.6.0
Host triple: x86_64-unknown-linux-gnu
Reproduce with:
cargo bisect-rustc --start=2021-1-1 --end=2021-12-24 --regress ice
The regression happened in #89340. cc @FabianWolff.
Assigning priority as discussed in the Zulip thread of the Prioritization Working Group.
@rustbot label -I-prioritize +P-low
I've been taking a look at this. It's pretty interesting. I'm not entirely sure I'll be able to work it out, but I figure I may as well assign myself for the time being while I attempt to solve it.
Another test case:
fn f(){(print!(á
I'm somewhat confused. Are you getting the same error from this? I'm having trouble reproducing.
The problem seems to reside within the crate rustc_builtin_macros, in the files format.rs and format_foreign.rs. The function expand_preparsed_format_args in format.rs has a macro, check_foreign!, which is in charge of looking for foreign substitutions - that is, someone using printf or shell style formatting rather than Rust's style. It's a macro so it can be generic over the two types of substitution.
The first time this macro is called, for printf substitutions, it finds two (I think "%%" and "%", though I'm not sure it's exactly that). The code that actually detects the substitutions is in format_foreign.rs. When parsing the substitution, in the function printf::parse_next_substitution, it is somehow finding that the second substitution has a boundary in the middle of the 🦀 character. I'm not sure exactly how yet. This gets turned into a span in expand_preparsed_format_args. The span is malformed, as it splits a character in two, so the second the code tries to use it for anything it trips an assertion and causes this ICE.
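To make the "splits a character in two" point concrete, here is a minimal sketch using only the standard library. The helper names is_valid_byte_span and offsets_inside_chars are made up for illustration and are not part of rustc. 🦀 occupies 4 bytes in UTF-8, so byte offsets 1, 2 and 3 of the test string are not character boundaries, and any span starting or ending there is malformed in the sense described above.

```rust
/// Hypothetical helper (not rustc code): returns true if `start..end` is a
/// byte range that can be sliced out of `s` without cutting a UTF-8
/// character in half.
fn is_valid_byte_span(s: &str, start: usize, end: usize) -> bool {
    start <= end
        && end <= s.len()
        && s.is_char_boundary(start)
        && s.is_char_boundary(end)
}

/// Hypothetical helper: lists every byte offset in `s` that falls strictly
/// inside a multi-byte character.
fn offsets_inside_chars(s: &str) -> Vec<usize> {
    (0..=s.len()).filter(|&i| !s.is_char_boundary(i)).collect()
}
```

For the string "🦀%%%", offsets_inside_chars returns [1, 2, 3]; the range 4..6 (the "%%") passes is_valid_byte_span, while 2..4 does not.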
The remaining problem is figuring out what's going wrong in printf::parse_next_substitution. This sort of string manipulation is not my forte. I'll probably take another few cracks at it though.
Something is broken about printf::Substitutions::pos.
This change makes the ICE go away (but probably breaks other things):
diff --git a/compiler/rustc_builtin_macros/src/format_foreign.rs b/compiler/rustc_builtin_macros/src/format_foreign.rs
index bfddd7073ff..3b9e9f76f45 100644
--- a/compiler/rustc_builtin_macros/src/format_foreign.rs
+++ b/compiler/rustc_builtin_macros/src/format_foreign.rs
@@ -289,8 +289,8 @@ fn translate(&self, s: &mut String) -> std::fmt::Result {
}
/// Returns an iterator over all substitutions in a given string.
- pub fn iter_subs(s: &str, start_pos: usize) -> Substitutions<'_> {
- Substitutions { s, pos: start_pos }
+ pub fn iter_subs(s: &str, _start_pos: usize) -> Substitutions<'_> {
+ Substitutions { s, pos: 0 }
}
/// Iterator over substitutions in a string.
@@ -303,15 +303,16 @@ impl<'a> Iterator for Substitutions<'a> {
type Item = Substitution<'a>;
fn next(&mut self) -> Option<Self::Item> {
let (mut sub, tail) = parse_next_substitution(self.s)?;
+ let pos_diff = self.s.len() - tail.len();
self.s = tail;
match sub {
Substitution::Format(_) => {
if let Some(inner_span) = sub.position() {
sub.set_position(inner_span.start + self.pos, inner_span.end + self.pos);
- self.pos += inner_span.end;
+ self.pos += pos_diff;
}
}
- Substitution::Escape => self.pos += 2,
+ Substitution::Escape => self.pos += pos_diff,
}
Some(sub)
}
When parse_next_substitution() returns here:
it seems like we should increment self.pos by start + 2, but we actually only increment it by 2:
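A toy model of that discrepancy, not the actual rustc code: next_escape below is a hypothetical stand-in for parse_next_substitution that only recognizes "%%", and the two tracking functions are invented for comparison. One advances the running position by a fixed 2 per escape, the other by the number of bytes actually consumed (start + 2). The test string uses four percent signs so that the toy parser finds two escapes.

```rust
/// Toy stand-in for `parse_next_substitution` (an assumption for this
/// sketch): finds the next "%%" escape and returns its start offset within
/// `s` together with the remaining tail after the escape.
fn next_escape(s: &str) -> Option<(usize, &str)> {
    let start = s.find("%%")?;
    Some((start, &s[start + 2..]))
}

/// Absolute start offsets of each escape, advancing `pos` by a fixed 2.
fn escape_starts_fixed_step(mut s: &str) -> Vec<usize> {
    let mut pos = 0;
    let mut out = Vec::new();
    while let Some((start, tail)) = next_escape(s) {
        out.push(pos + start);
        pos += 2; // ignores the `start` bytes skipped before the escape
        s = tail;
    }
    out
}

/// Absolute start offsets of each escape, advancing `pos` by the number of
/// bytes consumed from the input.
fn escape_starts_consumed(mut s: &str) -> Vec<usize> {
    let mut pos = 0;
    let mut out = Vec::new();
    while let Some((start, tail)) = next_escape(s) {
        out.push(pos + start);
        pos += s.len() - tail.len(); // equals start + 2 for an escape
        s = tail;
    }
    out
}
```

On "🦀%%%%" the fixed-step version reports the second escape at byte 2, which lies inside the 4-byte 🦀, while the consumed-bytes version reports it at byte 6.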
@inquisitivecrystal yes, I get the exact same error. Playground. The error disappears if you add a new line after the á, so maybe your text editor did that automatically.
That was it, thanks.
I'm going to unassign myself from this. I do hope my exploration ends up being helpful to whoever fixes it though. @dwrensha: if you want to fix this yourself, you certainly seem further along than I was, though you shouldn't feel any pressure to do so if you don't want to.