Unsound issue while converting bytes to utf8 str
Closed this issue · 5 comments
Hi, thanks for your time to read this issue. Our static analysis tool found there might be an unsound issue in your unsafe_as_str
converting the bytes to utf8 str:
Lines 202 to 208 in ec1a20b
As mentioned in the comments, this may introduce invalid utf8 conversion and producing an invalid value, which is considered as undefined behaviors in Rust. We expect either to mark the whole function as unsafe
or leverage the safe verison to convert because this library can take raw bytes from user and missed the validation of utf8. As a reference, the safe version of the utf8 conversion in std is:
Could you please help us double check the potential probelm? Thanks again for your time.
I suspect this is an AI bot?
This function is intended to be used by the crate's macros only. Users of the crate should never use this function, hence why it is hidden from the docs. It is required to be public due to how macros work.
Hi,
Thanks for your quick response. I am not an AI bot 😄 but I do use a template to report bugs.
For this unsound issue, it's because user has the access to control the input, and in the debug mode, an utf8 validation is done but in release mode, this validation is skipped. So following code will have different behaviors in debug mode and release mode, and in release mode it's considered as an undefined behavior(invalid value) in Rust:
#[forbid(unsafe_code)]
use obfstr::unsafe_as_str;
struct A{}
impl A {
pub const fn as_bytes(&self) -> &[u8] {
// an invalid utf-8 encoding
[0xC0, 0x80].as_slice()
}
}
fn main() {
println!("{:?}", obfstr::obfstr!(A{}))
}
Thanks again for your patience.
Oh I see, I overlooked that you could use a custom type.
It is important that no string validation is done at runtime for performance, the check being done when debug_assertions is to catch any accidents just in case. And it will catch your example, you have to really go out of your way to trigger this UB 😅
The bytes version asserts that the type is &[u8]
with this line: _OBFBYTES_STRING: &[u8] = $s;
.
Would it be enough that the input $s
string passed to the macro is asserted to be a &str
? You could still call the hidden exported unsafe_as_str
(it has unsafe in the name after all...) but it will be harder to misuse.
Thanks for your help, the patch looks great.
Published, thanks for the report!