WASM build can crash with "RuntimeError Index out of bounds" when converting UTF-16LE text to UTF8
ajrcarey opened this issue · 7 comments
Follow-on from #10. The examples/objects sample, when compiled to WASM and run in the browser, will crash with a "RuntimeError Index out of bounds" during text conversion. Strangely, this appears to be happening when attempting to return a value from utils::get_string_from_pdfium_utf16le_bytes(), which may suggest an out-of-memory problem with the stack frame sizing in the default WASM compile settings.
Possibly related to rustwasm/wasm-pack#479
Set examples/.cargo/config.toml to create an 8 Mb stack rather than the default 1 Mb, as per rustwasm/wasm-pack#479. Confirmed 8 Mb stack size applied correctly using wasm2wat tool. RuntimeError continues. Used debug profile to confirm problem occurs in the following WASM instruction:
call $core::ptr::drop_in_place<alloc::vec::Vec<u8>>::h52e9c51caa5dfb98
inside the compiled utils::get_string_from_pdfium_utf16le_bytes() function. This appears to happen right on the exit from the function. Since the byte buffer containing the UTF-16LE data is taken by ownership into utils::get_string_from_pdfium_utf16le_bytes(), I suspect this is the freeing of that byte buffer.
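To make the ownership point concrete, here is a minimal sketch of a UTF-16LE conversion function that takes its byte buffer by value. The name utf16le_bytes_to_string is a hypothetical stand-in, not the actual pdfium-render implementation, and the stripping of a trailing NUL is an assumption about how the buffer is terminated. The buffer is dropped when the function returns, which is the point at which a drop_in_place call for Vec<u8> would run.

```rust
// Hypothetical stand-in for utils::get_string_from_pdfium_utf16le_bytes();
// the real function in pdfium-render may be written differently.
fn utf16le_bytes_to_string(bytes: Vec<u8>) -> Option<String> {
    // UTF-16 code units are two bytes each, so an odd length is malformed.
    if bytes.len() % 2 != 0 {
        return None;
    }

    // Reassemble little-endian byte pairs into 16-bit code units.
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect();

    // Fails (returns None) on unpaired surrogates.
    let text = String::from_utf16(&units).ok()?;

    // Assumption: the buffer may end with a NUL terminator; strip it if so.
    Some(text.trim_end_matches('\0').to_owned())

    // `bytes` was moved into this function, so it is freed here on return.
}
```

If the allocator's bookkeeping has already been corrupted elsewhere, a free at this point is a plausible place for the damage to surface, even though this function itself is not at fault.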
drop_in_place() calls a stack of functions that ultimately end up inside the memory allocator. The actual error occurs in call $dlmalloc::dlmalloc::Dlmalloc<A>::malloc::h1538d4b11d1da1be, i.e. inside dlmalloc, the standard Rust allocator used when targeting the wasm32-unknown-unknown architecture. Could consider swapping in a different allocator when compiling to WASM?
Changing the allocator to wee-alloc instead of dlmalloc changes the pattern at which failure occurs (it carries on a bit longer), but a RuntimeError is still thrown. On Edge and Chrome, the error is reported as "memory access out of bounds", which is a bit more descriptive at least. A big hint, however, is that all three browsers show the heap size as 35.6 Mb, with pdfium consuming roughly 10 Mb for its WASM heap and pdfium-render consuming a little over 20 Mb. The heap sizes are the same across browsers, which strongly suggests to me a set size limit of about 32 Mb for the entire runtime of the browser tab.
Growing the module heap using instance.memory.grow() does correctly raise the heap limit - in my testing I grew the heaps assigned to both pdfium and pdfium-render by 100 Mb each, but the RuntimeError occurs at the same place :/
Forcing Rust to avoid freeing the byte buffer via std::mem::forget() shifts the allocation failure to call $log::__private_api_log::h8c2be2e67ed23b4a, which itself fails on call $alloc::fmt::format::h6ab9c6dede04b06a. This suggests that the failure is now occurring, somewhat ironically, in a log::info!() debugging statement. That ultimately does not matter; the point is that, whether the error occurs in an allocation or a deallocation, it is nevertheless occurring.
I need better disassembly in order to diagnose exactly what values are triggering the out of bounds error.
Well, the stack frame sizing turned out to be a red herring after all. It was the error during the drop_in_place() deallocation that should have given the big hint: a buffer was being allocated with the wrong buffer length. It turned out that the call to FPDFTextObj_GetText() was the culprit; the return result specifies the number of bytes copied into the buffer in Pdfium's WASM heap, but when we copy that buffer back to pdfium-render's WASM heap we were using core::ptr::mut_ptr::copy_from() with the result specifying the count of FPDF_WCHARs to copy back, which is twice as many bytes. This quite predictably ended up corrupting the memory heap. What an annoying wild goose chase.
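The unit mismatch can be sketched as follows. This is an illustration only, not the actual WasmPdfiumBindings code: copy_back and bytes_touched are hypothetical helpers, and bytes_written stands in for a return value measured in bytes. The count argument of copy_from() is a number of elements of the pointer's type, so passing a byte count through a pointer to 16-bit values touches twice as many bytes as intended.

```rust
use std::mem::size_of;

/// Bytes read and written by `dst.copy_from(src, count)` when the pointers
/// have element type `T`: the count is in elements, not bytes.
fn bytes_touched<T>(count: usize) -> usize {
    count * size_of::<T>()
}

/// Hypothetical helper: copies exactly `bytes_written` bytes out of `src`.
/// Both pointers are byte pointers, so the count really is a byte count.
fn copy_back(src: &[u8], bytes_written: usize) -> Vec<u8> {
    // Refuse to read past the end of the source buffer.
    assert!(bytes_written <= src.len());

    let mut dst = vec![0u8; bytes_written];
    unsafe {
        // Safe here because dst holds exactly bytes_written bytes, src holds
        // at least that many, and the two buffers do not overlap.
        dst.as_mut_ptr().copy_from(src.as_ptr(), bytes_written);
    }
    dst
}
```

With a 6-byte result, bytes_touched::<u8>(6) is 6 while bytes_touched::<u16>(6) is 12, so a copy through a 16-bit pointer would run 6 bytes past the end of a 6-byte destination. Dividing the byte count by the element size, or copying through byte pointers as above, keeps the two units consistent.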
Removed resizing of stack frame in examples/.cargo/config.toml, as it was never the problem. Reset default allocator from wee-alloc to dlmalloc. Removed explicit heap memory growth in JavaScript. Corrected error in WasmPdfiumBindings::FPDFTextObj_GetText(). Removed debugging statements. Bumped crate version to 0.5.5. Pushed bug fix release to crates.io.