holepunchto/hyperbee

rw lock for batches

Opened this issue · 13 comments

currently concurrent read and write ops on a single batch is not safe, it should be

Currently I am seeing some stuff where I have two batches running concurrently, they both do a read and then a write. I see some atomicity issues when running that. Is that the issue you're describing here? Thanks.

LuKks commented

@makew0rld Can you try provide a minimal test case? Otherwise let me know what is the exact atomicity issue so I can try

I'm struggling to reproduce the issue with a test case. I'll report back if I ever get it working. The issue I encountered had to do with reading an array or object, and then adding to it and writing it back. Two functions would do that at the same time, and I wanted to rely on batches to apply each of their reads and writes sequentially, one function after another. In my experience this didn't happen, but I can't reproduce it with a test case right now.

LuKks commented

@makew0rld I would suggest reading how current batch/lock works, sadly it didn't got merged because API might change so don't use the .lock method or don't necessarily depend on this but read more here:
https://github.com/holepunchto/hyperbee/blob/8b9e688ee370a277821f752294fbf61a6bc9fae8/README.md#const-batch--dbbatch
https://github.com/holepunchto/hyperbee/blob/8b9e688ee370a277821f752294fbf61a6bc9fae8/README.md#await-batchlock

Edit: This was the PR #126

LuKks commented

I like your use-case, would be awesome if we can make it work, can you share any code that you currently have?

Here's some test code I wrote, and it is working fine. It's possible my issue is elsewhere. But I'm curious if @mafintosh can comment on what this issue is about? Currently what exactly is "not safe"? Even if it's not related to the issue I'm experiencing, knowing what's unsafe will help guide how I write my hyperbee programs in the future.

import Hypercore from "hypercore";
import Hyperbee from "hyperbee";

const core = new Hypercore("test.hypercore");
await core.ready();
const db = new Hyperbee(core, {
  keyEncoding: "utf-8",
  valueEncoding: "utf-8",
});

async function thread(name, delay) {
  const batch = db.batch();
  await batch.lock();

  const result = await batch.get("test");
  if (result === null) {
    await new Promise((resolve) => setTimeout(resolve, delay));
    await batch.put("test", JSON.stringify([0]));
    await batch.flush();
    return;
  }

  const arr = JSON.parse(result.value);
  arr.push(arr.at(-1) + 1);

  await new Promise((resolve) => setTimeout(resolve, delay));

  await batch.put("test", JSON.stringify(arr));
  await batch.flush();
}

await Promise.all([thread("thread1", 10), thread("thread2", 10)]);

console.log((await db.get("test")).value);

The actual code that caused the issue is open source, but not very accessible to try out: https://github.com/starlinglab/authenticated-attributes/blob/102b629f6ac99b3d680f3b14226145e7ce0d3e9b/hyperbee/import.js#L174-L192

LuKks commented

You're using .lock() there, that's why it's working properly this time but surely you weren't using it before, right?

Does this test represents your case? #131

LuKks commented

@makew0rld For example, this is currently unsafe: #110

surely you weren't using it before, right?

I believe I was using it before, but possibly not. As I said the test case works, so maybe my issue in production was unrelated to this.

What makes #110 unsafe? It looks like everything is synchronous because await is used.

LuKks commented

Unsafe in that specific case means it throws an error but it shouldn't i.e. it's a bug, we will fix soon

The test does 5000 iterations, but if you do let's say 1000 then bug doesn't happen, thus it's generally unsafe because it works until it doesn't haha

LuKks commented

Currently what exactly is "not safe"?

The question is more like what things to consider when using the batch API, besides the previous bug, the only undocumented thing is:
Lock is auto-acquired when you do the first put or del in a batch

And you already know how to use batch.lock(), otherwise: 3ca9792

Ok, thanks. I am manually using locking, so it should be fine. I'll report back here if I am able to reproduce any issues.

Oh also it seems undocumented that batches that also do createReadStream.