Pauan/rust-signals

Documentation regarding memory management

Opened this issue · 3 comments

This looks really cool, and handles stuff like collections which others (rxRust) totally miss.

It says it's fast, and the examples don't show any lifetimes, but it's not clear what the ownership model is, and Rust hides a lot of lifetime stuff. What's the ownership model here? My understanding is that this basically defines a graph, and graphs are Rust's weak point. Optimizations sometimes use things like slab allocators, which impose various weird restrictions on usage.

I was using Sycamore and needed to store a signal in a struct and immediately lost two days to lifetime issues.

This is the 2nd thing I looked for and it would be really helpful to have in the documentation for people like me.

Pauan commented

> It says it's fast, and the examples don't show any lifetimes, but it's not clear what the ownership model is, and Rust hides a lot of lifetime stuff. What's the ownership model here?

The ownership model is straightforward: values are always owned, there are no references or lifetimes.

The ownership of the value starts at the root node, and the ownership is then passed down to the children until it reaches the leaf node.

So there is a simple and predictable ownership path that always goes from root to leaf.

This is possible because Signal methods always consume the parent Signal, which means Signals are always owned. Because Signals are always owned, that means the value of the Signal is also always owned.
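As a rough illustration of "methods consume the parent", here is a minimal std-only sketch. `ToySignal`, `Constant`, and `Map` are hypothetical names made up for this example and are not the futures-signals types; the only point is that `map` takes `self` by value, so the parent is moved into the child and no references or lifetimes appear.

```rust
// Toy model of an owned combinator chain (not the futures-signals API).
trait ToySignal: Sized {
    type Item;

    // Produce the current value, handing ownership to the caller.
    fn get(&mut self) -> Self::Item;

    // Takes `self` by value: the parent is moved into the child.
    fn map<B, F>(self, f: F) -> Map<Self, F>
    where
        F: FnMut(Self::Item) -> B,
    {
        Map { parent: self, f }
    }
}

// Root node: owns its value outright.
struct Constant(String);

impl ToySignal for Constant {
    type Item = String;
    fn get(&mut self) -> String {
        self.0.clone()
    }
}

// Child node: owns its parent and the mapping closure.
struct Map<S, F> {
    parent: S,
    f: F,
}

impl<S, F, B> ToySignal for Map<S, F>
where
    S: ToySignal,
    F: FnMut(S::Item) -> B,
{
    type Item = B;
    fn get(&mut self) -> B {
        (self.f)(self.parent.get())
    }
}

fn build() -> impl ToySignal<Item = usize> {
    let root = Constant(String::from("hello"));
    let upper = root.map(|s| s.to_uppercase());
    // `root` has been moved into `upper`; using it here would not compile.
    upper.map(|s| s.len())
}

fn main() {
    let mut leaf = build();
    println!("{}", leaf.get());
}
```

The value travels from `Constant` through each `Map` to the caller of `get`, which mirrors the root-to-leaf ownership path described above.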

This is the same ownership model that Futures and Streams use, which isn't a coincidence, since futures-signals is intentionally built on the same infrastructure as Futures and Streams. That also means that Signals are zero-cost, because Futures/Streams are also zero-cost.

Because values are always owned, sometimes the value needs to be cloned. This is indicated with a _cloned suffix in the method name. You can avoid the cost of cloning by wrapping the value in Rc or Arc.
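To make the Rc suggestion concrete, here is a small std-only sketch (it does not use futures-signals at all; `share` is a hypothetical helper). It shows that cloning an `Rc` only bumps a reference count, so a `_cloned` method handing out `Rc<T>` values would not copy the underlying data.

```rust
use std::rc::Rc;

// Returns two handles to the same heap allocation.
fn share(big: Vec<u8>) -> (Rc<Vec<u8>>, Rc<Vec<u8>>) {
    let first = Rc::new(big);
    // Cloning an Rc bumps a reference count; the Vec itself is not copied.
    let second = Rc::clone(&first);
    (first, second)
}

fn main() {
    let (a, b) = share(vec![0u8; 1_000_000]);
    println!("same allocation: {}", Rc::ptr_eq(&a, &b));
    println!("owners: {}", Rc::strong_count(&a));
}
```

`Arc` behaves the same way but uses atomic counting, which matters if the value has to cross threads.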

> My understanding is that this basically defines a graph, and graphs are Rust's weak point.

It is a graph, but it's a directed acyclic graph, so it's very straightforward to implement in Rust.

Futures and Streams also create directed acyclic graphs, and Rust handles them perfectly fine; they fit very naturally into Rust.

> I was using Sycamore and needed to store a signal in a struct and immediately lost two days to lifetime issues.

I strongly recommend not storing Signals in structs.

It is possible to do so (by using boxed_local and Broadcaster), but it requires boxing and dynamic dispatch, so it's much less efficient.

Trying to store a Signal in a struct is like trying to store a Future or Stream in a struct: it's possible but not a good idea.
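The Future side of that comparison can be sketched with std alone. This is only an analogy under stated assumptions: `Holder`, `GenericHolder`, `NoopWake`, and `poll_once` are hypothetical names, and nothing here is futures-signals code. It shows the two usual options for a struct field, boxing with dynamic dispatch or a generic parameter.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The concrete type of an `async` block cannot be named, so a plain struct
// field has to erase it behind a heap-allocated trait object.
struct Holder {
    fut: Pin<Box<dyn Future<Output = u32>>>,
}

// Alternative: a generic parameter avoids the Box, but the unnameable type
// then shows up in every signature that mentions `GenericHolder`.
struct GenericHolder<F: Future<Output = u32>> {
    fut: F,
}

// Waker that does nothing, standing in for a real executor.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Poll the stored future once through dynamic dispatch.
fn poll_once(holder: &mut Holder) -> Poll<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    holder.fut.as_mut().poll(&mut cx)
}

fn main() {
    let mut boxed = Holder { fut: Box::pin(async { 21u32 * 2 }) };
    println!("{:?}", poll_once(&mut boxed));

    let generic = GenericHolder { fut: async { 1u32 } };
    println!("unboxed size: {} bytes", std::mem::size_of_val(&generic.fut));
}
```

The boxed version pays for an allocation and a vtable call on every poll, which is the kind of cost the comment above is warning about.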

In practice, if you follow the idiomatic techniques, then you should basically never need to store Signals in structs, because there are better alternatives.

Ah I see, so the signals don't manage any references to subscribers, but you literally need to spawn each one in a new runtime task, then the runtime manages the graph using wakers. So it's basically using the graph already built into the runtime IIUC, that's pretty neat.

Awaiting individual signals or putting them in a select_all or something doesn't make any sense.

And I guess that means in a single threaded runtime none of the downstream callbacks fire until the current async task parks.

Pauan commented

> Ah I see, so the signals don't manage any references to subscribers, but you literally need to spawn each one in a new runtime task, then the runtime manages the graph using wakers.

Yes, the Future Executor handles spawning Tasks and polling the Task when the Signal updates.

This means futures-signals works with every Executor (tokio, async-std, wasm-bindgen-futures, etc.).

This spawning is quite straightforward:

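use futures_signals::signal::SignalExt; // provides for_each

tokio::spawn(some_signal.for_each(|value| {
    // Do something with the value of some_signal
    async {} // for_each expects the closure to return a Future
}));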

Each Executor has a different way to spawn Futures, but they're all very similar (usually using a spawn function).
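For readers unfamiliar with wakers, the "Executor polls the Task when the Signal updates" handshake can be sketched with std alone. This is a toy model, not how futures-signals or any real Executor is implemented: `Slot`, `NextValue`, `set`, and `CountingWaker` are hypothetical names, and the counting waker stands in for an Executor's run queue.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Shared slot: the latest value plus the waker of the task waiting on it.
#[derive(Default)]
struct Slot {
    value: Option<u32>,
    waker: Option<Waker>,
}

// Future that resolves once a value has been written to the slot.
struct NextValue(Rc<RefCell<Slot>>);

impl Future for NextValue {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let mut slot = self.0.borrow_mut();
        match slot.value.take() {
            Some(v) => Poll::Ready(v),
            None => {
                // Remember who to notify when a value arrives.
                slot.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

// Writer side: store the value, then ask the executor to poll the task again.
fn set(slot: &Rc<RefCell<Slot>>, v: u32) {
    let waker = {
        let mut s = slot.borrow_mut();
        s.value = Some(v);
        s.waker.take()
    };
    if let Some(w) = waker {
        w.wake();
    }
}

// Counts wake-ups in place of a real executor's run queue.
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

fn run() -> (bool, usize, Poll<u32>) {
    let slot = Rc::new(RefCell::new(Slot::default()));
    let counter = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);
    let mut fut = NextValue(slot.clone());

    // First poll: nothing to read yet, so the waker is registered.
    let first_pending = Pin::new(&mut fut).poll(&mut cx).is_pending();
    // Writing a value calls wake(), which a real executor would turn into a re-poll.
    set(&slot, 7);
    let wakes = counter.0.load(Ordering::SeqCst);
    let second = Pin::new(&mut fut).poll(&mut cx);
    (first_pending, wakes, second)
}

fn main() {
    println!("{:?}", run());
}
```

The writer never calls the consumer directly; it only signals that a re-poll is needed, which is why any Executor can drive the task.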

> Awaiting individual signals or putting them in a select_all or something doesn't make any sense.

You can use map_ref to combine multiple parent Signals together:

let output_signal = map_ref! {
    let a = input_signal1,
    let b = input_signal2,
    let c = input_signal3 => {
        *a + *b + *c
    }
};

Whenever input_signal1, input_signal2, or input_signal3 changes, it will then re-run the *a + *b + *c code, and will put the result into output_signal.


So map_ref allows a Signal to have multiple parents, but what about a Signal having multiple children?

In order to fit with Rust's ownership model, by design Signals can never have multiple children. A Signal can have multiple parents, but it will always have exactly 1 child.

When the Signal changes, it passes ownership of its value to that 1 child, which works perfectly with Rust because Rust requires every value to have exactly 1 owner.

What if you really want to have multiple children though? There are two solutions:

  1. You can call mutable.signal() multiple times, and each time you will get a fresh Signal. So this works just fine:

    let mutable = Mutable::new(5);
    
    let signal1 = mutable.signal().map(...);
    
    let signal2 = mutable.signal().map(...);

    So in this case mutable has 2 children: signal1 and signal2. Whenever mutable changes it will notify all of its children.

  2. You can use Broadcaster, which takes in any Signal and makes it "broadcastable":

    let broadcaster = Broadcaster::new(signal1);
    
    let signal2 = broadcaster.signal().map(...);
    
    let signal3 = broadcaster.signal().map(...);

    Just like with Mutable, you can call the signal() method multiple times, and whenever signal1 changes it will send the value to all of its children.

Why did I design Signals this way? Performance. If a Signal only has a single child, then the entire FRP system can be absurdly fast, essentially having zero cost (just like Futures and Streams).

Using Mutable or Broadcaster has a significant performance cost, so it's wasteful to make every Signal broadcastable even when you don't need it. Instead you can manually use Broadcaster in the rare cases that you need multiple children.

In practice it's quite rare to use Broadcaster, because Mutable supports multiple children, and that's good enough for almost every use case.

Mutable is faster and has more functionality than Broadcaster, so you should prefer to use Mutable instead of Broadcaster.


You can also poll individual Signals, for example by converting the Signal into a Stream:

use futures::StreamExt; // provides next()

let mut my_stream = my_signal.to_stream();

// Retrieve the current value of my_signal (next() yields an Option)
let value = my_stream.next().await;

// Wait for the value of my_signal to change
let value = my_stream.next().await;

This is quite unusual though. Normally you just use for_each, which runs a closure every time the Signal changes.

> So it's basically using the graph already built into the runtime IIUC, that's pretty neat.

Sort of. The graph is just a bunch of structs containing other structs. It's a natural part of the Rust language, nothing special.

It relies upon the fact that Rust inlines structs, so when you call various methods like map, filter, etc. it's just building up a single giant struct which contains all of the relevant data.

And then you just need to convert that single struct into a Task and spawn it, which is super cheap.
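The "single giant struct" layout can be observed with std iterator adapters, which nest in the same way. This is an analogy rather than futures-signals code, and `pipeline` is a hypothetical helper. Each adapter stores its parent inline, and closures that capture nothing take up no space, so the whole chain is the same size as its root.

```rust
use std::mem::{size_of, size_of_val};
use std::ops::Range;

// Build a three-stage pipeline out of nested adapter structs.
fn pipeline() -> impl Iterator<Item = u32> {
    (0u32..10).map(|x| x * 2).filter(|x| x % 3 == 0)
}

fn main() {
    let it = pipeline();
    // The concrete type is Filter<Map<Range<u32>, _>, _>, held by value:
    // no Box, no pointers between stages.
    println!("pipeline:    {} bytes", size_of_val(&it));
    println!("range alone: {} bytes", size_of::<Range<u32>>());
}
```

Because everything lives in one value, the compiler can inline across stages, and handing the chain to a spawn function moves a single struct.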

Here are some old and slightly outdated (but still useful) articles that explain the design of Futures:

http://aturon.github.io/blog/2016/08/11/futures/

http://aturon.github.io/blog/2016/09/07/futures-design/

They explain why using callbacks doesn't work with Rust, and why Rust Futures are so fast. The same design principles apply to Signals as well.

> And I guess that means in a single threaded runtime none of the downstream callbacks fire until the current async task parks.

That depends on the implementation of the Executor. Wasm doesn't have threads or parking, so wasm-bindgen-futures does some shenanigans with Promises in order to make it work.

But yes, in general you must wait for the Executor to poll the Task in order to retrieve the value of the Signal.

Which means when you change a Signal, there can be a delay before it is polled. This delay is quite small and it's not usually a problem. It doesn't cause any semantic issues, because Signals are lossy.
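"Lossy" here means that a consumer is only guaranteed to see the latest value, not every intermediate one. A minimal std-only sketch of that behavior follows, assuming a toy `LossySlot` type that is hypothetical and not how Mutable is actually implemented; `poll_change` stands in for the Executor polling the Task.

```rust
use std::cell::Cell;

// Toy model of a lossy value slot (hypothetical, for illustration only).
struct LossySlot {
    value: Cell<u32>,
    changed: Cell<bool>,
}

impl LossySlot {
    fn new(initial: u32) -> Self {
        // A fresh slot counts as "changed" so the first poll reports the initial value.
        LossySlot { value: Cell::new(initial), changed: Cell::new(true) }
    }

    // Overwrites the previous value; nothing is queued.
    fn set(&self, v: u32) {
        self.value.set(v);
        self.changed.set(true);
    }

    // Returns the latest value if anything changed since the last poll.
    fn poll_change(&self) -> Option<u32> {
        if self.changed.replace(false) {
            Some(self.value.get())
        } else {
            None
        }
    }
}

fn observe() -> Vec<u32> {
    let slot = LossySlot::new(0);
    let mut seen = Vec::new();
    seen.extend(slot.poll_change()); // initial value
    slot.set(1);
    slot.set(2);
    slot.set(3); // three updates arrive before the next poll
    seen.extend(slot.poll_change()); // only the latest one is observed
    seen.extend(slot.poll_change()); // nothing new, so nothing is recorded
    seen
}

fn main() {
    println!("{:?}", observe());
}
```

Since the consumer always ends up with the current value, a delay between the change and the poll skips intermediate states but never leaves the consumer stale.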