Panics when searching for multi-byte characters
Closed this issue · 4 comments
kotatsuyaki commented
When searching for strings containing double width characters, the program panics at this call to String::drain
, since the endpoint is not on a char boundary. I haven't figured out how to fix it.
Steps to reproduce
Search 甲州
in the pager with the following code.
Dummy text in Japanese: text.txt
main.rs
use std::fmt::Write;
const STR: &str = include_str!("./text.txt");
fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut pager = minus::Pager::new();
write!(pager.lines, "{}", STR)?;
minus::page_all(pager)?;
Ok(())
}
Cargo.toml
# ...
minus = { version = "3.3", features = ["search", "static_output"] }
Panic backtrace
thread 'main' panicked at 'assertion failed: self.is_char_boundary(end)', /home/akitaki/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/string.rs:1572:9
stack backtrace:
0: rust_begin_unwind
at /rustc/9bc8c42bb2f19e745a63f3445f1ac248fb015e53/library/std/src/panicking.rs:493:5
1: core::panicking::panic_fmt
at /rustc/9bc8c42bb2f19e745a63f3445f1ac248fb015e53/library/core/src/panicking.rs:92:14
2: core::panicking::panic
at /rustc/9bc8c42bb2f19e745a63f3445f1ac248fb015e53/library/core/src/panicking.rs:50:5
3: alloc::string::String::drain
at /home/akitaki/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/string.rs:1572:9
4: minus::search::find
at /home/akitaki/.cargo/registry/src/github.com-1ecc6299db9ec823/minus-3.3.3/src/search.rs:194:9
5: minus::search::highlight_search
at /home/akitaki/.cargo/registry/src/github.com-1ecc6299db9ec823/minus-3.3.3/src/search.rs:166:13
6: minus::init::static_paging
at /home/akitaki/.cargo/registry/src/github.com-1ecc6299db9ec823/minus-3.3.3/src/init.rs:78:32
7: minus::static_pager::page_all
at /home/akitaki/.cargo/registry/src/github.com-1ecc6299db9ec823/minus-3.3.3/src/static_pager.rs:86:13
8: indicatif_test::main::{{closure}}
at ./src/main.rs:10:5
9: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /home/akitaki/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/mod.rs:80:19
10: tokio::park::thread::CachedParkThread::block_on::{{closure}}
at /home/akitaki/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.0/src/park/thread.rs:263:54
11: tokio::coop::with_budget::{{closure}}
at /home/akitaki/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.0/src/coop.rs:106:9
12: std::thread::local::LocalKey<T>::try_with
at /home/akitaki/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:272:16
13: std::thread::local::LocalKey<T>::with
at /home/akitaki/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:248:9
14: tokio::coop::with_budget
at /home/akitaki/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.0/src/coop.rs:99:5
15: tokio::coop::budget
at /home/akitaki/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.0/src/coop.rs:76:5
16: tokio::park::thread::CachedParkThread::block_on
at /home/akitaki/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.0/src/park/thread.rs:263:31
17: tokio::runtime::enter::Enter::block_on
at /home/akitaki/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.0/src/runtime/enter.rs:151:13
18: tokio::runtime::thread_pool::ThreadPool::block_on
at /home/akitaki/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.0/src/runtime/thread_pool/mod.rs:71:9
19: tokio::runtime::Runtime::block_on
at /home/akitaki/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.0/src/runtime/mod.rs:452:43
20: indicatif_test::main
at ./src/main.rs:11:5
21: core::ops::function::FnOnce::call_once
at /home/akitaki/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
AMythicDev commented
Thanks for putting up the issue. A new version of minus is in the works, if you want to fix it, you should probably do it in the 4.0
branch, which will be the next major release of minus
kotatsuyaki commented
After some investigation,
- It seems like the panic is caused by the mixed usage of byte indices and unicode character indices, which are not always the same. Any multibyte character (for example
à
) can trigger the bug, so it also happens on languages with accented letters like Spanish. - On the
master
branch,- The panic can be solved by replacing
let truncate = x + query.char_indices().count();
withlet truncate = x + query.len();
, becauseString::drain
requires a byte range, not a character index range. - After fixing the panic, the cursor positioning is wrong. I'm not familiar enough (yet) with the codebase enough to fix this.
- The panic can be solved by replacing
- On the
4.0
branch, the panic occurs before the pager is shown, due to a call tosplit_at()
with confused index. I've used thetextwrap
library to resolve it, but I'm not sure if we should include that as a dependency. It seems better to just delegate line wrapping to mature libraries that can deal with arbitrary Unicode strings, instead of implementing as part of the minus library itself.pub(crate) fn break_line(mut line: &str, cols: usize) -> Vec<String> { textwrap::wrap(line, cols) .into_iter() .map(|line| line.to_string()) .collect() }
Panic backtrace on branch 4.0
thread 'main' panicked at 'byte index 86 is not a char boundary; it is inside '張' (bytes 84..87) of `織田信長は、織田弾正忠家の当主・織田信秀の子に生まれ、尾張(愛`', /home/akitaki/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/str/mod.rs:565:13
stack backtrace:
0: rust_begin_unwind
at /rustc/9bc8c42bb2f19e745a63f3445f1ac248fb015e53/library/std/src/panicking.rs:493:5
1: core::panicking::panic_fmt
at /rustc/9bc8c42bb2f19e745a63f3445f1ac248fb015e53/library/core/src/panicking.rs:92:14
2: core::str::slice_error_fail
3: core::str::<impl str>::split_at
at /home/akitaki/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/str/mod.rs:565:13
4: minus::line::Line::break_line
at /home/akitaki/code/minus/src/line.rs:129:36
5: minus::line::Line::new
at /home/akitaki/code/minus/src/line.rs:112:26
6: minus::line::Line::from_str
at /home/akitaki/code/minus/src/line.rs:148:24
7: minus::Pager::prepare
at /home/akitaki/code/minus/src/lib.rs:322:45
8: minus::static_pager::page_all
at /home/akitaki/code/minus/src/static_pager.rs:62:5
9: indicatif_test::main
at ./src/main.rs:6:5
10: core::ops::function::FnOnce::call_once
at /home/akitaki/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
AMythicDev commented
I am doing yet another rewrite using the textwrap
crate, hope it will soon fix the panic
AMythicDev commented
I have fixed this in 4.0 branch. Closing this issue.