matthieu-m/jagged

May the Writer be Send instead of the Data itself?

Closed this issue · 2 comments

Currently, I'm stuck in the writing thread because the data can only live here, and only from here can I obtain a reader/snapshot of it, so the writing thread becomes my main thread, if I understood Rust lifetimes correctly.

It worked, though. But I expect a more natural runtime model. The Data is Sync, and it would be even better if it's also Send, so I can place it within an Arc. It has only a limited set of methods for obtaining its writer, reader, and snapshot, all of which are Send. Among these, reader and snapshot are also Sync, while writer can only be acquired once until it's released.

It is just a rough concept and has not reached the implementation stage. It is also uncertain whether it is truly feasible.

I am afraid I do not understand the problem you find yourself facing, at all.

Currently, I'm stuck in the writing thread because the data can only live here,

The data itself does not really live on any thread, it lives on a shared heap.

and only from here can I obtain a reader/snapshot of it,

The first Reader must be obtained from the main data-structure indeed, but after that:

  • The Reader is Copy, so can be copied to other threads at any time.
  • The Reader can create Snapshots.

Having an independent Data would change nothing about that; you'd still need to create the Reader from the Data.

so the writing thread becomes my main thread, if I understood Rust lifetimes correctly.

"main thread" typically means the thread where main runs, so no.

On the other hand, jagged is very much designed for fork-join parallelism, so you'll need to use the scoped threads API to guarantee that threads which use a reference to the main data-structure do not outlive said data-structure.
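To make the fork-join shape concrete, here is a minimal sketch using `std::thread::scope` from the standard library. It does not use jagged's API: `Shared`, `Reader`, and `fork_join` are hypothetical stand-ins, with an atomic counter in place of the real data structure. It only illustrates how a `Copy` handle that borrows the data can be handed to several scoped threads while the spawning thread keeps writing.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical stand-in for the shared data structure (NOT jagged's API).
struct Shared {
    total: AtomicUsize,
}

/// Hypothetical `Copy` reader handle that borrows the shared structure.
#[derive(Clone, Copy)]
struct Reader<'a> {
    shared: &'a Shared,
}

impl<'a> Reader<'a> {
    fn total(self) -> usize {
        self.shared.total.load(Ordering::Acquire)
    }
}

fn fork_join(n: usize) -> usize {
    let shared = Shared { total: AtomicUsize::new(0) };
    let reader = Reader { shared: &shared };

    thread::scope(|scope| {
        // Consumers: `reader` is Copy, so each `move` closure gets its own copy.
        for _ in 0..2 {
            scope.spawn(move || {
                let _seen = reader.total();
            });
        }

        // Producer: runs on the spawning thread while the consumers read.
        for i in 1..=n {
            shared.total.fetch_add(i, Ordering::Release);
        }
    }); // Every scoped thread is joined here, before `shared` can be dropped.

    reader.total()
}

fn main() {
    println!("{}", fork_join(10));
}
```

The scope is what lets the spawned threads borrow `shared` without `'static` bounds or an `Arc`: the compiler knows the threads cannot outlive the data.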

The Data is Sync, and it would be even better if it's also Send, so I can place it within an Arc. [...] It is also uncertain whether it is truly feasible.

It definitely seems feasible. It also seems very different, with different trade-offs.

It would add levels of indirection, fallibility to a number of currently infallible methods, etc...

This does not mean it's a bad idea, but it's different enough from jagged that it seems irreconcilable. I'm no oracle, though, maybe there's a way to get our cake and eat it too... I won't be the one doing the investigation, though.

On the other hand, jagged is very much designed for fork-join parallelism, so you'll need to use the scoped threads API to guarantee that threads which use a reference to the main data-structure do not outlive said data-structure.

This is exactly what I find inconvenient: the programming model is fixed as follows:

    // Although `vec` is Send, it cannot actually be moved, since that would make the reader unusable.
    let vec: Vector<_> = Vector::new();

    thread::scope(|scope| {
        //  Consumer
        scope.spawn(move |_| {
            // read...
        });

        //  Producer
        for i in 0..NUMBER_ELEMENTS {
            // vec.push(value);
        }
    }).unwrap();

What I would expect is something like this:

    // vec is Send and Sync
    let vec = Arc::new(Vector::new());

    let vec_clone = vec.clone();
    let tw = thread::spawn(move || {
        // Ideally only one writer can exist at a time, enforced statically or dynamically.
        let writer = vec_clone.writer();
        // write into
    });

    let vec_clone = vec.clone();
    let tr = thread::spawn(move || {
        let snapshot = vec_clone.snapshot();
        // read out
    });

The basic idea is that I can hold this data in any thread and read from it, but only one thread at a time can obtain its writer.
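As a rough illustration of that idea, and of the trade-offs mentioned earlier in the thread (extra indirection, a fallible `writer()`), here is a minimal sketch built only from the standard library. Everything in it is an assumption: `SharedVec`, `Writer`, `writer()`, `snapshot()`, and `demo` are hypothetical names, and a plain `RwLock<Vec<T>>` stands in for the storage, so it says nothing about how jagged itself could implement this.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical shared container; a locked Vec stands in for the real storage.
struct SharedVec<T> {
    items: RwLock<Vec<T>>,
    writer_taken: AtomicBool,
}

/// Exclusive write handle; dropping it releases the writer slot.
struct Writer<T> {
    owner: Arc<SharedVec<T>>,
}

impl<T: Clone> SharedVec<T> {
    fn new() -> Arc<Self> {
        Arc::new(Self {
            items: RwLock::new(Vec::new()),
            writer_taken: AtomicBool::new(false),
        })
    }

    /// Fallible: returns `None` while another `Writer` is alive.
    fn writer(self: &Arc<Self>) -> Option<Writer<T>> {
        self.writer_taken
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .ok()
            .map(|_| Writer { owner: Arc::clone(self) })
    }

    /// Copies the current contents; any thread holding the Arc may call this.
    fn snapshot(&self) -> Vec<T> {
        self.items.read().unwrap().clone()
    }
}

impl<T> Writer<T> {
    fn push(&self, value: T) {
        self.owner.items.write().unwrap().push(value);
    }
}

impl<T> Drop for Writer<T> {
    fn drop(&mut self) {
        self.owner.writer_taken.store(false, Ordering::Release);
    }
}

fn demo() -> (bool, Vec<u32>) {
    let vec: Arc<SharedVec<u32>> = SharedVec::new();
    let writer = vec.writer().expect("first writer is available");
    // A second writer is refused while the first one is alive.
    let second_refused = vec.writer().is_none();

    let tw = thread::spawn(move || {
        for i in 0..4 {
            writer.push(i);
        }
    });
    tw.join().unwrap();

    let vec_clone = Arc::clone(&vec);
    let tr = thread::spawn(move || vec_clone.snapshot());
    (second_refused, tr.join().unwrap())
}

fn main() {
    println!("{:?}", demo());
}
```

Note what this costs compared with the scoped version: the data sits behind an `Arc`, the writer is checked at run time, and `writer()` returns an `Option` that callers must handle.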