bitcoindevkit/bdk

Appending changesets to `Store` after opening causes overwriting

Describe the bug

  • Let's say a Store instance initially contains changeset1 stored in its file.

    pub struct Store<C>

  • Then, after opening the store instance using Store::open and appending a new changeset (changeset2) using append_changeset, the store will only contain changeset2, not changeset1 + changeset2.

pub fn open<P>(magic: &[u8], file_path: P) -> Result<Self, FileError>

pub fn append_changeset(&mut self, changeset: &C) -> Result<(), io::Error>

To Reproduce

   #[test]
   fn test() {
       let temp_dir = tempfile::tempdir().unwrap();
       let file_path = temp_dir.path().join("db_file");
       let changeset1 = BTreeSet::from(["hello".to_string(), "world".to_string()]);
       let changeset2 = BTreeSet::from(["got the error".to_string()]);

       {
            // create a new db
           let mut db = Store::<TestChangeSet>::create_new(&TEST_MAGIC_BYTES, &file_path)
               .expect("must create");

           // append first changeset to db
           db.append_changeset(&changeset1).expect("must succeed");
       }

       {
           // open db
           let mut db = Store::<TestChangeSet>::open_or_create_new(&TEST_MAGIC_BYTES, &file_path)
               .expect("failed to load db");

           // now append the second changeset
           db.append_changeset(&changeset2).expect("must succeed");

           // Retrieve stored changesets from the database
           let stored_changesets = db
               .aggregate_changesets()
               .expect("must succeed")
               .expect("must succeed");

            // expected changeset must be changeset2 + changeset1
           let mut expected_changeset = changeset2.clone();
           expected_changeset.extend(changeset1);

           // Assert that stored_changesets matches changeset2 but not expected_changeset
           assert_eq!(stored_changesets, changeset2);
           assert_ne!(stored_changesets, expected_changeset);
       }

       // Reason:
       //
       // On loading the pre-existing store instance, the file contains all the
       // previously appended changesets, but the file pointer is positioned back
       // at offset 12, i.e. just past the TEST_MAGIC_BYTES prefix.
       // Appending any new changeset to the db therefore overwrites those past
       // changesets, so the store ends up containing only the latest changeset.

       // Open the store again to inspect the file pointer position just after loading
       let mut db =
           Store::<TestChangeSet>::open(&TEST_MAGIC_BYTES, &file_path).expect("failed to load db");

       // get the current position of file pointer just after loading store
       let current_pointer = db.db_file.stream_position().expect("must succeed");

       assert_eq!(current_pointer, 12);

       // end pointer for the loaded db
       let expected_pointer = db.db_file.seek(SeekFrom::End(0)).expect("must succeed");

       // the two positions don't match
       assert_ne!(current_pointer, expected_pointer);
   }

Expected behavior

  • On opening the store, its file pointer must be set to the end of the file.
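
A minimal sketch of that expectation (a hypothetical helper; the real fix would live inside Store::open, right after the magic-byte check):

use std::fs::File;
use std::io::{self, Seek, SeekFrom};

/// Hypothetical helper: move the cursor of an already-validated store file
/// to EOF so the next append_changeset cannot overwrite existing entries.
fn seek_to_end(db_file: &mut File) -> io::Result<u64> {
    db_file.seek(SeekFrom::End(0))
}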

Build environment

  • BDK tag/commit: d99b3ef
  • OS+version: Fedora 37
  • Rust/Cargo version: rustc 1.81.0-nightly (f21554f7f 2024-06-08)
  • Rust/Cargo target: nightly-2024-06-09-x86_64-unknown-linux-gnu

@evanlinjin could this happen with the changes in #1514?

Thanks for writing up this ticket.

The intended use case of bdk_file_store is with BDK structures. With BDK, we never read after writing.

Maybe we can structure things to be more fool-proof. Need to think more on this.

Thanks @KnowWhoami. Maybe Store::open should always seek the file cursor to EOF? If this is truly a bug then I'm surprised all the tests and examples work as we expect using the file store

If this is truly a bug then I'm surprised all the tests and examples work as we expect using the file store

  • That's because the tests that open a pre-existing Store usually run the aggregate_changesets API, which internally calls iter_changesets to collect all the stored changesets. Iterating, however, also moves the Store's file pointer to the end of the file.

/// **WARNING**: This method changes the write position of the underlying file. The next

fn open_or_create_new() {

  • Thus, it works fine if we append any new changesets after that, as sketched below.
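
For instance (a sketch reusing the setup from the test above; error handling via expect for brevity):

let mut db = Store::<TestChangeSet>::open(&TEST_MAGIC_BYTES, &file_path).expect("must open");

// Side effect: iterating every stored entry moves the file cursor to the
// end of the valid data...
let _ = db.aggregate_changesets().expect("must succeed");

// ...so this append lands after changeset1 instead of overwriting it.
db.append_changeset(&changeset2).expect("must succeed");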

The intended use case of bdk_file_store is with BDK structures. With BDK, we never read after writing.
Maybe we can structure things to be more fool-proof. Need to think more on this.

  • Yeah, that's true: BDK wallet APIs don't care much about the internal workings of the database; they just read the db to get the required changesets and append new ones if there are any. That's it.

  • But bdk_file_store is mainly developed for downstream users, so that they can use it directly in their BDK-based wallets.
    So this bug should be taken into consideration.

I think it would help if we combine opening and loading into one read method that opens the file and returns the aggregate changeset, thereby ensuring the next write goes at the end of the file
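
A rough sketch of such a method, built on the open and aggregate_changesets calls shown earlier in this thread (the name read, the tuple return type, and the StoreReadError wrapper are all illustrative, not existing API):

// StoreReadError is a hypothetical wrapper over the open and aggregate errors.
/// Open the store and aggregate every stored changeset in one step, so the
/// file cursor is guaranteed to sit at the end of the valid data before the
/// caller gets a chance to append.
pub fn read<P>(magic: &[u8], file_path: P) -> Result<(Self, Option<C>), StoreReadError<C>>
where
    P: AsRef<Path>,
{
    let mut store = Self::open(magic, file_path).map_err(StoreReadError::File)?;
    let aggregated = store
        .aggregate_changesets()
        .map_err(StoreReadError::Aggregate)?;
    Ok((store, aggregated))
}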

I would like to work on this, but I've got some questions:

  • Can't there be a situation where a user doesn't want to aggregate changesets?
  • Is there any reason it's not desirable to store multiple changesets without aggregating them, as in the following succession?
1st changeset: open-append-close
2nd changeset: open-append-close
3rd changeset: open-append-close
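
That is, something like this (a sketch reusing the test setup above; changeset3 is hypothetical):

for changeset in [&changeset1, &changeset2, &changeset3] {
    // changeset3 is hypothetical, following changeset1/changeset2 above.
    // open
    let mut db = Store::<TestChangeSet>::open(&TEST_MAGIC_BYTES, &file_path)
        .expect("must open");
    // append
    db.append_changeset(changeset).expect("must succeed");
    // close: db is dropped at the end of each iteration
}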

I'm thinking of implementing something like the read method @ValuedMammal mentions above, but just using iter_changesets to verify the correctness of the stored changesets, without modifying them.
That would place the file pointer at the real end of the file and check the existing changesets for write errors, leaving it to the user to decide whether to aggregate the changesets or keep appending them until further decision.

Perhaps an easy fix is to change the file mode to append here instead of only write

pub fn open<P>(magic: &[u8], file_path: P) -> Result<Self, FileError>
where
P: AsRef<Path>,
{
let mut f = OpenOptions::new().read(true).write(true).open(file_path)?;
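
That is, roughly (a sketch; .append(true) implies write access, making .write(true) redundant):

// Open in append mode so every write lands at EOF regardless of the
// current cursor position.
let mut f = OpenOptions::new().read(true).append(true).open(file_path)?;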

After quick testing this appears to fix the issue in the original post; however, it caused other test failures, in particular last_write_is_short and write_after_short_read.

In particular, write_after_short_read clearly proves what @evanlinjin stated above: never read after write. It asserts explicitly that the remaining changesets after the "short read" are overwritten.

This is the idea. I have tested it against the current file store tests and against a modification of the test above that checks the expected behavior, and it works.

Pros:

  • Doesn't modify current API

Cons:

  • Extra method
  • Extra enum variant to wrap IterError inside FileError
diff --git a/crates/file_store/src/lib.rs b/crates/file_store/src/lib.rs
index 7c943ca2..236995e9 100644
--- a/crates/file_store/src/lib.rs
+++ b/crates/file_store/src/lib.rs
@@ -18,6 +18,8 @@ pub enum FileError {
     Io(io::Error),
     /// Magic bytes do not match what is expected.
     InvalidMagicBytes { got: Vec<u8>, expected: Vec<u8> },
+    /// Something went wrong while decoding file content
+    WrongEncoding(entry_iter::IterError)
 }
 
 impl core::fmt::Display for FileError {
@@ -29,6 +31,7 @@ impl core::fmt::Display for FileError {
                 "file has invalid magic bytes: expected={:?} got={:?}",
                 expected, got,
             ),
+            Self::WrongEncoding(e) => e.fmt(f)
         }
     }
 }
diff --git a/crates/file_store/src/store.rs b/crates/file_store/src/store.rs
index 62c3d91b..fdd7523c 100644
--- a/crates/file_store/src/store.rs
+++ b/crates/file_store/src/store.rs
@@ -92,6 +92,31 @@ where
         })
     }
 
+    /// Open [`Store`] and recover state after last write.
+    ///
+    /// Use [`open`] to open `Store` and manually handle the previous state if any.
+    /// Use [`create_new`] to create a new `Store`.
+    ///
+    /// # Errors
+    ///
+    /// If the prefixed bytes of the opened file does not match the provided `magic`, the
+    /// [`FileError::InvalidMagicBytes`] error variant will be returned.
+    ///
+    /// [`create_new`]: Store::create_new
+    pub fn rewind<P>(magic: &[u8], file_path: P) -> Result<Self, FileError>
+    where
+        P: AsRef<Path>,
+    {
+        let mut store = Self::open(magic, file_path)?;
+        for next_changeset in store.iter_changesets() {
+            if let Err(iter_error) = next_changeset {
+                return Err(FileError::WrongEncoding(iter_error));
+            }
+        }
+
+        Ok(store)
+    }
+
     /// Attempt to open existing [`Store`] file; create it if the file is non-existent.
     ///
     /// Internally, this calls either [`open`] or [`create_new`].

@nymius I do think it's reasonable to expect that the "open-append-close" flow should work without overwriting as you mentioned in #1517 (comment).

With BDK, we never read after writing.

^ To be honest, I'm a little confused about what is meant by this. Edit: I may have been lacking the context in which this is actually true; it makes sense over the lifetime of one Store instance.

If we assume the tests are correct, i.e. that append_changeset should write directly after the current file position (even if that means overwriting existing data), then I think the read approach makes sense (similar to your rewind idea). The issue I have with just adding this method (rewind) though is that it doesn't fix the problem with Store::open as pointed out by @KnowWhoami, and in the event the caller does want to load the contents of the database, the result is that we would effectively be doing iter_changesets twice?

On the other hand if we change the file mode to append, it would ensure that append_changeset always goes at the EOF regardless of the file position (if I understand), but we'd then have to redefine the test expectations with regard to truncating, etc

@nymius Revisiting the open-append-close idea:

Isn't it the case that we will always want to load the backend upon opening the file? For example in the context of a bdk wallet we have to load everything into memory in order to initialize the wallet. Without this you would have nothing to "append".

Isn't it the case that we will always want to load the backend upon opening the file? For example in the context of a bdk wallet we have to load everything into memory in order to initialize the wallet. Without this you would have nothing to "append".

I agree. Unless there was a precondition forcing a rescan on each open (which wasn't documented), I think it's the most logical thing to do.

I did a summary of possible solutions I thought about:

  1. Another method like rewind or read implementing the safe behavior of
    reading content before proceeding with any writes.

    I think it works "on paper" but doesn't solve the underlying problem with
    open. It's just a workaround for those expecting the functionality
    explained in the issue.

  2. Iterating changesets on Store opening.

    This also works only partially, as any encoding error makes the method
    fail without recovering the already-parsed changesets the way the
    AggregateChangesetsError allows.

  3. Changing the opening mode of the file to append:

    Causes store::test::append_changeset_truncates_invalid_bytes to fail
    because it won't check the sanity of the previous data before allowing
    append_changeset to write after it, letting a corrupted file store keep
    breaking silently.

  4. Iterating changesets on the Store::open call but silently truncating the
    file to the last successfully read changeset if any error occurs:

    Works; the only failing test is store::test::write_after_short_read, but the
    problem is in the test itself, as it expects the file pointer to be
    reset each time the file is opened.
    I don't like the UX of this. The user would lose information without
    noticing.

  5. Iterating changesets on the Store::open call, silently truncating the file
    to the last successfully read changeset if any error occurs, but returning
    the index of the changeset causing the failure (see the sketch after this
    list):

    Same as above, but with better UX: now the user knows there is a
    problem with their changeset store and can decide to do a rescan. Also, we
    could implement a complementary method to merge changesets up to some
    changeset index, allowing users to work in the meantime with the
    uncorrupted data. Not sure how useful this is.

  6. Aggregating changesets on each Store open by default:

    Should work the same as above but return a partially parsed
    changeset. It would also aggregate every changeset in the file, which may
    not be desired by some users, or at least we should check that this is not
    a current use case for someone. The difficulty with this change is that it
    requires adding a new variant to the FileError enum, and as this variant
    wraps the AggregateChangesetsError, it would require a generic for the
    partially aggregated changeset, creating a domino effect of complexity
    and making it a difficult change.

  7. Aggregating changesets on each Store open by default but just returning
    the error of the AggregateChangesetsError and ignoring the partially
    aggregated changeset:

    This is an evolution of the solution above, avoiding the complexity of
    the generic implied by the partial aggregation.
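
A rough sketch of the truncation idea in solution 5, written against a plain File to sidestep Store internals (read_entry is a hypothetical stand-in for decoding one changeset):

use std::fs::File;
use std::io::{self, Seek, SeekFrom};

/// Read entries until EOF or a decode error; on error, truncate the file
/// after the last valid entry and report the index of the failing one.
fn recover_to_last_valid(
    file: &mut File,
    mut read_entry: impl FnMut(&mut File) -> io::Result<bool>, // Ok(false) = clean EOF
) -> io::Result<Option<usize>> {
    let mut last_valid = file.stream_position()?;
    let mut index = 0usize;
    let mut failed_index = None;
    loop {
        match read_entry(file) {
            Ok(true) => {
                last_valid = file.stream_position()?;
                index += 1;
            }
            Ok(false) => break, // clean end of file, nothing to truncate
            Err(_) => {
                failed_index = Some(index);
                break;
            }
        }
    }
    // Drop any trailing corruption and leave the cursor at the new EOF so
    // the next append cannot overwrite valid data.
    file.set_len(last_valid)?;
    file.seek(SeekFrom::Start(last_valid))?;
    Ok(failed_index)
}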


I've tried all of the above. The majority of them make the following tests fail,
except for 1 and 4:

  1. store::test::append_changeset_truncates_invalid_bytes: a premature
    iter_changesets in the open method realizes the content is wrongly formatted
    but doesn't return the partially read changesets, as is expected.

  2. store::test::last_write_is_short: a premature iter_changesets will fail if
    the file has been truncated in the middle of a changeset. The expected
    behavior is to open it normally but fail to aggregate, returning the
    partially aggregated changeset inside the error.

  3. store::test::write_after_short_read: fails because the expected behavior
    is to not update the file pointer on open and just overwrite whatever was
    there before. Instead, the open reads the stored changesets, updating the
    file pointer, and then proceeds to append any new changeset, effectively
    avoiding the loss of the previous changesets.

The solution I prefer is 5, and my second choice is 7.

Why 5 over 7?
Because 5 gives additional information about the failing changeset, allows a
partial recovery, and also reports the error, while 7 just reports the error
without offering the previous alternatives.

It should be relatively painless if we keep all the same semantics of Store and build a safer API around it, so I want to go back to your first suggestion and add a method (call it reopen) that internally does iter_changesets to check the integrity of the db before returning a new Store instance. We can deprecate the current open; in the future we'll still use it internally, it just won't be marked pub. These are suggestions, so feel free to adapt them.

/// Reopen an existing [`Store`]. Errors if...
pub fn reopen<P>(magic: &[u8], file_path: P) -> Result<Self, OpenError> where P: AsRef<Path> { ... }

We can have a new error OpenError that wraps the IterError. If we want to add a new and improved open_or_new for convenience, then I guess OpenError will have to be an enum with a variant for FileError as well.

/// An error while opening or creating the file store
#[derive(Debug)]
pub enum OpenError {
    /// entry iter error
    EntryIter {
        /// index that caused the error
        index: usize,
        /// iter error
        iter: IterError,
    },
    /// file error
    File(FileError),
}
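
Putting the two together, reopen could look roughly like this (a sketch; it assumes iter_changesets yields Result items, as in the rewind diff above):

/// Sketch of the proposed reopen: verify the integrity of every stored
/// entry so the file cursor ends up past the last valid changeset.
pub fn reopen<P>(magic: &[u8], file_path: P) -> Result<Self, OpenError>
where
    P: AsRef<Path>,
{
    let mut store = Self::open(magic, file_path).map_err(OpenError::File)?;
    for (index, entry) in store.iter_changesets().enumerate() {
        // Surface the index of the first entry that fails to decode.
        if let Err(iter) = entry {
            return Err(OpenError::EntryIter { index, iter });
        }
    }
    Ok(store)
}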

We should be able to keep the existing tests and add coverage for the new functionality fixing the issue. If there's no interest in going the #[deprecated] route, then the suggestion is to make the current open a private method now and replace it with what I'm calling reopen above.