spacejam/sled

panic: tried to serialize Uninitialized

repi opened this issue · 2 comments

repi commented

We got a panic from one of our Linux users when opening a sled database (that it had created). We haven't seen this before and have only gotten it twice ever (both times today). Opening this issue as there don't seem to be any other reports of this specific panic.

This could potentially be some local corruption on that specific machine. If that is the case, would it be possible to fail opening the db with an error instead of panicking? Though we understand it could be hard to convert everything to errors rather than panics.
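Until the library returns an error here, one application-side stopgap is to catch the panic around the open call. The following is only a sketch under stated assumptions: `open_db` and `try_open` are hypothetical names, and `open_db` is a stand-in that panics with the same message so the example runs without the sled crate. It does not show sled's own API.

```rust
use std::panic::{self, AssertUnwindSafe};

// Hypothetical stand-in for the real open call. It panics with the message
// from the report when the path looks "corrupt" (an assumption for the demo).
fn open_db(path: &str) -> Result<String, String> {
    if path.contains("corrupt") {
        panic!("tried to serialize Uninitialized");
    }
    Ok(format!("opened {}", path))
}

// Convert a panic raised during open into an ordinary error value.
fn try_open(path: &str) -> Result<String, String> {
    match panic::catch_unwind(AssertUnwindSafe(|| open_db(path))) {
        Ok(result) => result,
        Err(payload) => {
            // Panic payloads are usually a `&str` or a `String`.
            let msg = payload
                .downcast_ref::<&str>()
                .map(|s| s.to_string())
                .or_else(|| payload.downcast_ref::<String>().cloned())
                .unwrap_or_else(|| "unknown panic".to_string());
            Err(format!("open panicked: {}", msg))
        }
    }
}
```

This only works when the binary is built with unwinding panics (not `panic = "abort"`), and whatever state was being built when the panic fired should be treated as unusable afterwards.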

Context:

  • sled version: 0.34.7
  • rustc version: 1.56.1
  • operating system: Linux 5.8.0-48-generic (Ubuntu)

logs, panic messages, stack traces:

tried to serialize Uninitialized, file: /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/sled-0.34.7/src/serialization.rs:662:18, thread: main

6   std::panicking::rust_panic_with_hook (panicking.rs:628)
7   std::panicking::begin_panic_handler::{{closure}} (panicking.rs:521)
8   std::sys_common::backtrace::__rust_end_short_backtrace (backtrace.rs:141)
9   rust_begin_unwind (panicking.rs:517)
10  std::panicking::begin_panic_fmt (panicking.rs:460)
11  sled::pagecache::snapshot::PageState::serialized_size (serialization.rs:662)
12  [inlined] sled::pagecache::snapshot::Snapshot::serialized_size::{{closure}} (serialization.rs:507)
13  [inlined] core::iter::adapters::map::map_fold::{{closure}} (map.rs:84)
14  [inlined] core::iter::traits::iterator::Iterator::fold (iterator.rs:2170)
15  [inlined] core::iter::adapters::map::Map<T>::fold (map.rs:124)
16  [inlined] u64::sum (accum.rs:42)
17  [inlined] core::iter::traits::iterator::Iterator::sum (iterator.rs:2985)
18  [inlined] sled::pagecache::snapshot::Snapshot::serialized_size (serialization.rs:504)
19  [inlined] sled::serialization::Serialize::serialize (serialization.rs:35)
20  [inlined] sled::pagecache::snapshot::write_snapshot (snapshot.rs:515)
21  [inlined] sled::pagecache::snapshot::advance_snapshot (snapshot.rs:365)
22  sled::pagecache::snapshot::read_snapshot_or_default (snapshot.rs:450)
23  sled::pagecache::PageCache::start (mod.rs:588)
24  [inlined] sled::context::Context::start (context.rs:44)
25  sled::db::Db::start_inner (db.rs:70)
26  sled::config::Config::open (config.rs:354)

uuhan commented

When snapshot.apply() is called,

snapshot.apply(log_kind, pid, lsn, ptr)?;

the page table vector may be enlarged and padded with PageState::Uninitialized elements.

self.pt.resize(

The panic then happens when the snapshot is written to disk and sled tries to serialize it:

write_snapshot(config, &snapshot)?;

let raw_bytes = snapshot.serialize();

I think this PR may help avoid the corruption and let the database open, but there must be some edge case occurring in your setup.

#1389
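The mechanism described above can be modeled in a few lines. This is a simplified sketch, not sled's actual code: the `PageState` variants carry only a placeholder LSN, the 8-byte size per entry is an arbitrary assumption, and `serialized_size` returns an error where the real function panics.

```rust
#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum PageState {
    Present(u64), // simplified: holds only an LSN
    Free(u64),
    Uninitialized,
}

struct Snapshot {
    pt: Vec<PageState>,
}

impl Snapshot {
    // Simplified stand-in for `apply`: growing the table to reach `pid`
    // fills every skipped slot with `Uninitialized`.
    fn apply(&mut self, pid: usize, lsn: u64) {
        if pid >= self.pt.len() {
            self.pt.resize(pid + 1, PageState::Uninitialized);
        }
        self.pt[pid] = PageState::Present(lsn);
    }

    // Sums per-page sizes, reporting the first unfilled slot as an error
    // instead of panicking.
    fn serialized_size(&self) -> Result<u64, String> {
        self.pt
            .iter()
            .enumerate()
            .map(|(pid, state)| match state {
                // 8 bytes per entry is an arbitrary placeholder size.
                PageState::Present(_) | PageState::Free(_) => Ok(8u64),
                PageState::Uninitialized => {
                    Err(format!("page {} is Uninitialized", pid))
                }
            })
            .sum()
    }
}
```

In this model, applying an update for page 3 to a table that holds only page 0 leaves pages 1 and 2 as `Uninitialized`, and any later attempt to compute the serialized size fails on page 1.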

repi commented

FWIW, we went 9 months without seeing this, but got another panic report for it from a user today.

We are still on sled 0.34.7, as there have been no more releases (related: #1417 (comment))