Stack overflow when decoding large boxed arrays
Example:
#![feature(new_uninit)]

use parity_scale_codec::{Decode, Encode};

#[derive(Encode, Decode)]
struct S(Box<[u8; 100 * 1024 * 1024]>);

impl Default for S {
    fn default() -> Self {
        Self(unsafe { Box::new_zeroed().assume_init() })
    }
}

fn main() {
    let s = S::default();
    let encoded = s.encode();
    println!("Encoded successfully");
    S::decode(&mut encoded.as_slice()).unwrap();
}
Output:
Encoded successfully
thread 'main' has overflowed its stack
fatal runtime error: stack overflow
fish: Job 1, 'cargo run --example overflow' terminated by signal SIGABRT (Abort)
Likely due to rust-lang/rust#53827
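In the meantime, a stopgap that sidesteps the crash (sketch only; it hides the symptom by running the decode on a thread with a large enough stack, it does not remove the 100 MiB temporary):

use parity_scale_codec::Decode;

fn main() {
    // Stand-in payload: SCALE encodes a [u8; N] as exactly its N raw bytes.
    let encoded = vec![0u8; 100 * 1024 * 1024];
    // Decode on a thread whose stack can hold the temporary array (the size
    // here is a guess; tune it to comfortably exceed your type's temporaries).
    std::thread::Builder::new()
        .stack_size(512 * 1024 * 1024)
        .spawn(move || {
            Box::<[u8; 100 * 1024 * 1024]>::decode(&mut encoded.as_slice()).unwrap();
        })
        .expect("failed to spawn decoder thread")
        .join()
        .unwrap();
}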
This manual implementation works as a workaround, but it is slow:
#![feature(new_uninit)]

use parity_scale_codec::{Decode, Encode, Input};
use std::mem::ManuallyDrop;

#[derive(Encode)]
struct S(Box<[u8; 100 * 1024 * 1024]>);

impl Decode for S {
    fn decode<I: Input>(input: &mut I) -> Result<Self, parity_scale_codec::Error> {
        let piece = parity_scale_codec::decode_vec_with_len::<u8, _>(input, 100 * 1024 * 1024)
            .map_err(|error| error.chain("Could not decode `S.0`"))?;
        let mut piece = ManuallyDrop::new(piece);
        // SAFETY: Original memory is not dropped and guaranteed to be allocated
        let piece = unsafe { Box::from_raw(piece.as_mut_ptr() as *mut [u8; 100 * 1024 * 1024]) };
        Ok(S(piece))
    }
}

impl Default for S {
    fn default() -> Self {
        Self(unsafe { Box::new_zeroed().assume_init() })
    }
}

fn main() {
    let s = S::default();
    let encoded = s.encode();
    println!("Encoded successfully");
    S::decode(&mut encoded.as_slice()).unwrap();
}
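A variant that might avoid the intermediate Vec entirely (untested sketch, using the same new_uninit nightly feature; it relies on the fact that SCALE encodes a [u8; N] as exactly its N raw bytes and that Input::read either fills the whole buffer or errors):

use parity_scale_codec::{Decode, Input};

const N: usize = 100 * 1024 * 1024;

struct S(Box<[u8; N]>);

impl Decode for S {
    fn decode<I: Input>(input: &mut I) -> Result<Self, parity_scale_codec::Error> {
        // Allocate the 100 MiB directly on the heap, uninitialized.
        let mut boxed = Box::<[u8; N]>::new_uninit();
        // SAFETY: `u8` has no invalid bit patterns, and `read` fills the
        // entire buffer on success, so `assume_init` is sound afterwards.
        unsafe {
            let bytes = core::slice::from_raw_parts_mut(boxed.as_mut_ptr() as *mut u8, N);
            input.read(bytes)?;
            Ok(Self(boxed.assume_init()))
        }
    }
}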
Minimized example on stable Rust (in case anyone suspects the unsafe / nightly code):
use parity_scale_codec::Decode;

fn main() {
    let data = &[];
    let _ = Box::<[u8; 100 * 1024 * 1024]>::decode(&mut data.as_slice());
}
Hmm.... this seems a little tricky to fix as-is; this is the problem code:
impl<T, X> Decode for X
where
    T: Decode + Into<X>,
    X: WrapperTypeDecode<Wrapped = T>,
{
    fn decode<I: Input>(input: &mut I) -> Result<Self, Error> {
        input.descend_ref()?;
        let result = Ok(T::decode(input)?.into());
        input.ascend_ref();
        result
    }
}

impl<T> WrapperTypeDecode for Box<T> {
    type Wrapped = T;
}
So it calls decode() on the array type and then into()s it into the Box, which goes through the stack, which explodes if the array's too big.
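In other words, for Box<[u8; N]> the blanket impl boils down to roughly this (simplified illustration, not the literal code):

use parity_scale_codec::{Decode, Error, Input};

fn decode_boxed<I: Input>(input: &mut I) -> Result<Box<[u8; 100 * 1024 * 1024]>, Error> {
    // The whole array is materialized on the stack first...
    let array: [u8; 100 * 1024 * 1024] = Decode::decode(input)?;
    // ...and only then moved into a fresh heap allocation, so the stack must
    // briefly hold the full 100 MiB temporary.
    Ok(array.into())
}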
We'll probably have to refactor parity-scale-codec a little to allow decoding into &mut MaybeUninit<T> so that the Box with the underlying memory can be preallocated beforehand, or something along these lines.
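Something in this direction, perhaps (very rough sketch with a hypothetical trait; the real API may end up looking quite different, and Box::new_uninit still needs the new_uninit nightly feature):

use core::mem::MaybeUninit;
use parity_scale_codec::{Error, Input};

// Hypothetical: types decode themselves directly into caller-provided memory.
trait DecodeInto: Sized {
    fn decode_into<I: Input>(input: &mut I, dst: &mut MaybeUninit<Self>) -> Result<(), Error>;
}

fn decode_boxed<T: DecodeInto, I: Input>(input: &mut I) -> Result<Box<T>, Error> {
    // Pre-allocate on the heap first...
    let mut boxed: Box<MaybeUninit<T>> = Box::new_uninit();
    // ...then decode in place, so no large temporary ever touches the stack.
    T::decode_into(input, &mut boxed)?;
    // SAFETY: `decode_into` fully initialized the memory on success.
    Ok(unsafe { boxed.assume_init() })
}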
I don't think the issue is fully fixed. I have upgraded to 3.6.1 and my app still crashes with a stack overflow.
Here is an example that is a tiny bit more advanced than before (this is what I actually use in the app); it uses a newtype:
use parity_scale_codec::Decode;

#[derive(Decode)]
#[repr(transparent)]
struct NewType([u8; 100 * 1024 * 1024]);

#[derive(Decode)]
struct S(Box<NewType>);

fn main() {
    let data = &[];
    S::decode(&mut data.as_slice()).unwrap();
}
The biggest problem is that it compiles but crashes at runtime. If it didn't compile, it would have been easier to work with.
Hm, not sure what could be done about this. We could add some MaxStackSize(u32) attribute to SCALE, so that it can estimate its own memory usage and cause a compile error if that limit would easily be reached. But that does not give any actual guarantees (unless we pair it with a MaxNested(u32) attribute...). Or we swap to lazy allocations when the estimated size is above some customizable limit? All just ideas; no idea what is best or achievable.
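Purely to illustrate the idea, such an attribute could hypothetically look like this (invented syntax, nothing like it exists today):

// Hypothetical: the derive would estimate the type's stack footprint and
// fail compilation if it exceeds the declared budget.
#[derive(Decode)]
#[codec(max_stack_size = 16 * 1024)]
struct S(Box<[u8; 100 * 1024 * 1024]>);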
I think we can fix it. If we have a newtype-like wrapper (one unnamed/named field) and it has repr(transparent), we can delegate decode_into to the wrapped type. This should fix the issue here.
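For the example above, the generated passthrough could look roughly like this (hand-written sketch; the actual derive output will differ):

use core::mem::MaybeUninit;
use parity_scale_codec::{Decode, DecodeFinished, Error, Input};

#[repr(transparent)]
struct NewType([u8; 100 * 1024 * 1024]);

impl Decode for NewType {
    fn decode<I: Input>(input: &mut I) -> Result<Self, Error> {
        Ok(NewType(Decode::decode(input)?))
    }

    fn decode_into<I: Input>(
        input: &mut I,
        dst: &mut MaybeUninit<Self>,
    ) -> Result<DecodeFinished, Error> {
        // SAFETY: `NewType` is `#[repr(transparent)]` over the array, so both
        // types have identical layout and the pointer cast is valid.
        let dst = unsafe {
            &mut *(dst as *mut MaybeUninit<Self> as *mut MaybeUninit<[u8; 100 * 1024 * 1024]>)
        };
        <[u8; 100 * 1024 * 1024] as Decode>::decode_into(input, dst)
    }
}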
And BTW, it doesn't fail with a stack overflow for me. I get a segmentation fault, because the inputs to decode_into seem to be null/wrong 🤔
It seems to on Linux:
thread 'main' has overflowed its stack
fatal runtime error: stack overflow
fish: Job 1, 'cargo run --bin codec' terminated by signal SIGABRT (Abort)
I'm also on Linux, but I get:
(signal: 11, SIGSEGV: invalid memory reference)
Looking into it with gdb, I get the following:
[Switching to Thread 0xfffff7dbf0e0 (LWP 540650)]
0x0000aaaaaaae8f98 in parity_scale_codec::codec::Decode::decode_into<mod::NewType, &[u8]> (input=0x0,
dst=<error reading variable: Cannot access memory at address 0xffffcc1bd5f0>) at src/codec.rs:308
That input is a null pointer, which doesn't make any sense...
Yeah, this is kinda expected, as the Derive macro generates the following code here:
#[repr(transparent)]
struct NewType([u8; 100 * 1024 * 1024]);

struct S(Box<NewType>);

#[allow(deprecated)]
const _: () = {
    #[automatically_derived]
    impl ::parity_scale_codec::Decode for NewType {
        fn decode<__CodecInputEdqy: ::parity_scale_codec::Input>(
            __codec_input_edqy: &mut __CodecInputEdqy,
        ) -> ::core::result::Result<Self, ::parity_scale_codec::Error> {
            ::core::result::Result::Ok(
                NewType({
                    let __codec_res_edqy = <[u8; 100 * 1024 * 1024] as ::parity_scale_codec::Decode>::decode(
                        __codec_input_edqy,
                    );
                    match __codec_res_edqy {
                        ::core::result::Result::Err(e) => {
                            return ::core::result::Result::Err(
                                e.chain("Could not decode `NewType.0`"),
                            );
                        }
                        ::core::result::Result::Ok(__codec_res_edqy) => __codec_res_edqy,
                    }
                }),
            )
        }
    }
};

#[allow(deprecated)]
const _: () = {
    #[automatically_derived]
    impl ::parity_scale_codec::Decode for S {
        fn decode<__CodecInputEdqy: ::parity_scale_codec::Input>(
            __codec_input_edqy: &mut __CodecInputEdqy,
        ) -> ::core::result::Result<Self, ::parity_scale_codec::Error> {
            ::core::result::Result::Ok(
                S({
                    let __codec_res_edqy = <Box<NewType> as ::parity_scale_codec::Decode>::decode(
                        __codec_input_edqy,
                    );
                    match __codec_res_edqy {
                        ::core::result::Result::Err(e) => {
                            return ::core::result::Result::Err(
                                e.chain("Could not decode `S.0`"),
                            );
                        }
                        ::core::result::Result::Ok(__codec_res_edqy) => __codec_res_edqy,
                    }
                }),
            )
        }
    }
};
As you can see, the decode tries to decode the array on the stack here. For this not to crash we need to also generate a decode_into passthrough. I'll fix this.
And BTW, it doesn't fail with a stack overflow for me. I get a segmentation fault, because the inputs to decode_into seem to be null/wrong 🤔

That's probably due to a corrupted stack. (See the other argument, where you're getting error reading variable: Cannot access memory at address 0xffffcc1bd5f0.) You have to remember that GDB uses whatever is on the stack, and if your stack is busted then it might just give you total garbage.