raquo/Airstream

Settle on a Signal `==` performance solution

Closed this issue ยท 11 comments

raquo commented

Signals only emit values that are different from their current state. "different" means Signal does an == check. This is very convenient, and in general improves performance by eliminating redundant DOM calls, but can potentially be expensive in certain situations, e.g. when comparing large, almost identical Map-s.

Currently we have com.raquo.airstream.util.Ref to deal with this, but the wrapping is annoying to deal with. Perhaps I should provide some helpers to deal with wrapped values easier, but I'm not convinced that this is the best solution.

I will look more into this when I get the time to work on memoization in Laminar (raquo/Laminar#38), I think there might be some common ground between these issues.

Comparing with == might be expensive, but has no surprises. so it should be the default.

I can think of several options for giving users more control over performance:

  • write own equals method
  • disable comparison for specific types
  • disable comparison for specific nodes in the dataflow-graph
raquo commented

Yes, == definitely stays as default.

raquo commented

One possible solution that I want to get off my chest.

Adding something like the following methods:

trait EventStream[A] {

  def toSignal(initial: A, comparator: (A, A) => Boolean) = new SignalFromStream(this, initial, comparator)

  def toSignalWithTry(initial: Try[A], comparator: (A, A) => Boolean) = new SignalFromStream(this, initial, comparator)
}

trait Signal[A] {

  protected[this] val _comparator: (A, A) => Boolean = (_ == _) // signal would look at this instead of having `==` hardcoded

  def composeChanges[B](operator: EventStream[A] => EventStream[B], comparator: (B, B) => Boolean): Signal[B] = {
    inputSignal.changes.compose(operator).toSignalWithTry(inputSignal.tryNow(), comparator)
  }
}

Usage as follows, when you don't want expensive == comparisons:

def makeFoo(model: Model): Foo = ???
val modelsStream: EventStream[Map[String, Model]] = ???

val modelsSignal: Signal[Map[String, Model]] = modelsStream.toSignal(Map.empty, comparator = _ eq _)
val derivedSignal: Signal[Map[String, Foo]] = modelsSignal.composeChanges(_.filter(_.nonEmpty).map(makeFoo), comparator = _ eq _)

I don't have time to write the usual thesis right now, but I hope this is more or less self explanatory โ€“ we're composing event streams instead of signals when we call composeChanges, so all that operator chain will not do any == comparisons. And because we have specified an eq-based comparator in the beginning (toSignal) and end (composeChanges) of the chain, none of the signals involved here will do a == comparison.

I'm not sure about the comparator thing but I've been wanting composeChanges for a while to improve expressiveness, and will most certainly implement that one even without the comparator param.

raquo commented

I implemented composeChanges pretty much as described here. So far I like using it. I'll marinade on this for a while, and will likely follow the general idea from my last comment.

raquo commented

I am wondering if simply dropping the == check in Signals entirely might be a better option. All things considered it might be the simplest solution.

It's always possible to add an == check to a Signal that doesn't have it โ€“ just call .distinct on it or something. That's what Laminar could do under the hood where signals touch the DOM, and what you could do when you actually need the current signal behaviour.

But it's quite complicated to remove a == check that's there by default. Special, weird, APIs are needed for that. And I'm not 100% sold on the supposed performance advantage either: when every signal has a == check we end up performing a lot of redundant checks, some of which could be expensive. Users shouldn't need to use streams just because they don't want signals to diff huge maps or lists on every event.

I don't believe that performance is the hottest topic around "calming" / "distincting" signals. Although performance is obviously important, I believe that correctness means even more. Calming / not calming (distincting / not distincting) is a choice that's visible in signal's changes, which means it changes the semantics.

(this is more of a general FRP thought, I've just learned about Airstream existence)

raquo commented

@cubuspl42 Are you saying built-in distincting of Signals like we have now is more correct? Or the opposite? I definitely understand the effect on .changes, just not really sure what other FRP libraries are doing.

For Airstream performance here actually matters quite a bit, because signals are often used to render dynamic children in Laminar. Those can be some very big Lists or Maps, and spending cpu cycles to diff them only to find that one element has been updated can get very expensive. And that needs to be done on every signal, so if you e.g. split a signal of a big map into N signals of small maps, you now need to make another equality check on every one of those N maps every time the parent map updates. And there's really no good way for users to avoid that recurring penalty right now.

(In this reply I'm using the term Cell, from Sodium nomenclature, but as far as I understand it's semantically more or less equivalent to Airstream Signal)

@raquo I'm not saying that either way is more "correct", I'm just reminding that it's definitely not performance-only topic (but I'm aware that in some contexts performance might be crucial).

I personally dislike the idea of "distincting by default", because I don't believe all types are semantically distinctable, i.e. not all types are semantically comparable.

The prominent example of semantically non-comparable type is actually Cell (Airstream Signal) itself. In purest theory, possibly never formulated out loud, from time-less perspective, Cell[A] could be compared with another Cell[A] if A is comparable (using purely functional / Haskell terminology, belonging to Eq typeclass). It could be done so by comparing each change in the whole timeline (and only assuming it's finite) of both cells. It's probably obvious how unrealistic it is. So it's easier to assume that Cell[A] is, semantically, just not in Eq (is not comparable) at all.

So taking into consideration that Cell is not comparable, it means that if A in Cell[A] is under constraint of being comparable (so we can turn on distinction-by-default), it means that we can't have cells of cells, i.e. Cell[Cell[A]]. And that means we can't have higher-order FRP, which is unacceptable.

Of course, many languages (on the technical level, forgetting the whole pureness and semantics for a moment) allow you to compare any type. If your type has no better way of being compared, it's address (or identity) will be compared, out of no better option. This implies that every object has an address and that it's given on object creation. It means that every object needs to be created in a context that allows you to allocate a new address.

If we assume that cells can be "compared by address", it means that they have to be created in stateful context that can assign them an address. Which contradicts the "F" in "FRP', end of proof.

So, from my perspective (however impractical that might sound), distincting-by-default is meaningless (or having semantics such ugly that I don't want to accept it).

The only solution to this problem I know is moving the calming (distinction) guarantees from the Cell type itself to specific operators. That would mean that we'd need to have two version of each mapping operator: the uncalm one (that doesn't put the comparability constraint on the A type <A as in Cell[A] in the operator's return type>) and the calm one (which does, and is typically defined as calm(anotherOperator(...))).

That would affect all operators returning cells. Possibly, the calm ones could be the "recommended" ones (for example Cell.map could be calm, while Cell.map' or Cell.mapUncalm could be the more generic [less constrained] variant, but designed for non-comparable A).

All of this is my personal opinion, and to be honest I'm not aware of any library that does exactly that. The beauty of semantics is sometimes difficult to be reflected in practical implementations, for variety of reasons.

Focusing on your words about performance of signal-of-maps: if you provided me with a minimal example of an actual performance-bottlenecked program involving maps changing in time, I could possibly help. I have some experience with dynamic collections in FRP.

raquo commented

minimal example of an actual performance-bottlenecked program

I'm not there yet, which is precisely why this issue is still open :) But in my app I load "the universe" โ€“ lots of data in big Maps โ€“ into a Signal, and then do a bunch of signal transformations on it. Essentially all of the dynamic stuff in the DOM comes from there, but not before being channeled through a bunch of intermediate signals. And at every step these equality checks are run on all those signals, when all I really care about is to run them before updating the DOM and maybe in another one or two cases. I didn't get to optimizing this yet so no numbers or anything at this point. It might turn out that distincting at the edges is actually less efficient in my case, but I doubt it. In the general case though, the programmer should be able to control where the distincting happens both for business logic and for performance tuning.

Re: everything else โ€“ good insights, thanks.

Seems like we favor the same outcome (not distinct-ing by default), although for different reasons. I would only disagree with having calmed versions of operators โ€“ I don't like exploding the number of methods that way, I prefer a single distinct operator for that even if that requires a separate invocation and signal instance. Although, in Scala there's the option to use implicit params to provide defaults. Though, not having thought about this too much yet, not a fan of adding those to every method. Will probably start with something simpler.

Regarding what you said about equality checks not being defined for some types in some languages โ€“ I see your point in the context of libraries native to those languages, or cross-platform libraries like ReactiveX or Sodium, but Airstream isn't that. It leans heavily on Scala conventions and language features, and is designed specifically for Scala.js frontend work. Both in terms of the design of operators and available functionality (e.g. no parallelism in JS and thus no facility for that in Airstream either, other platforms be damned).

about equality checks not being defined for some types in some languages

Actually that's not what I meant. I meant that, semantically, some types just can't be compared in a pure context, i.e. one needs to sacrifice purity in order to make "everything" comparable.

The keyword is "semantically", which means that I'm talking from FRP semantics point of view. That's probably the point where we don't share the same assumptions and goals, and that's fine of course. Airstream is not declared to be an FRP implementation.