Zooming into a Var should not require an Owner?
Opened this issue · 4 comments
Background
Currently, when you .zoom(...)
into a Var
to create a derived Var
, Airstream internally maps over the parent var's signal, and adds an observer to it to make sure that it's strictly evaluated. To clean up the resulting subscription when this derived var is no longer needed, Airstream currently requires an Owner
to call the zoom
method.
Requiring an owner here is annoying, as explained in #112, raquo/Laminar#130 and raquo/Laminar#148
Owner is not required
@kitlangton suggested on Discord that we don't actually need an Owner in this scenario, and it turns out that he was right!
Suppose we have:
def zoomInFn(a: A): B = ???
def zoomOutFn(a: A, b: B): B = ???
val parentVar: Var[A] = ???
val derivedVar: Var[B] = parentVar.zoom(zoomInFn)(zoomOutFn)
We want derivedVar to be strictly evaluated, i.e. we should be able to call derivedVar.now()
any time and get the value of zoomInFn(parentVar.now())
. On its own, such "pulling" of a signal's current value does not require an Owner
. The Owner
is needed to "push" new values from parentVar
to derivedVar
whenever parentVar
updates[1]. That's how zooming works today – derivedVar
is auto-updated whenever parentVar
is updated, and calling derivedVar.now()
does not trigger any computation, it simply retrieves the already-computed value.
In contrast, without such auto-updates, updating parentVar
would not evaluate zoomInFn(parentVar.now())
immediately – it would only be evaluated if and when we call derivedVar.now()
. This way we don't need an Owner
, but we need to deal with other issues.
Purity will be required
As you've likely noticed, using the Owner-less "pull" approach requires zoomInFn
to be a pure function, since the number of times it's called is up to the user, as is the timing of those calls, e.g. for some updates of parentVar
, zoomInFn
might not be called at all, if the user does not call derivedVar.now()
before the next update to parentVar
.
Normally, Airstream's public API does not require user-provided callbacks to be pure, and this is by design. I think the batch version of the Var update method – Var.update(var1 -> updateFn1, var2 -> updateFn2)
– is currently the only method that advises the user to provide pure callbacks – in that case it's because some of these callbacks might fail, and the whole batch update is aborted, so you wouldn't want any user-specified side effects to still happen in that case.
In practice, zoomInFn
is typically used to select a val
member containing an immutable value from an immutable class, so it's very likely to already be pure. So, maybe requiring purity here is a good tradeoff. Not needing an Owner
is a huge advantage, after all. Or, perhaps we should have a separate method to create such "lazy" derived vars, e.g. pureZoom(...)
instead of zoom(...)
. Not sure yet – feedback welcome.
Nesting and types
Currently we can safely assume that the type Var
is strictly evaluated, so if we zoom
into parentVar
, zoomInFn
will be evaluated whenever parentVar
is updated. However, if both zoom
and pureZoom
return subtypes of Var
, we need to take care to maintain this assumption, i.e. if we call derivedVar.zoom
, derivedVar must become strictly evaluated even if it was "lazy" before (created with pureZoom
). I think this will happen naturally, as derivedVar.zoom
will start derivedVar.signal
using the provided owner, but this is one more area to watch.
Similar concern applies to StrictSignal's proposed pureMap
method (see below), make sure to also think that through.
Sync with derived var's signal
Any Var's .now()
value must always be in sync with this Var's signal.now()
. In practice, this should be easy to achieve – simply put the pull-based implementation of .now() in the Var's signal, rather than in the Var itself. This would also let us query the parent signal's _lastUpdatedId
to check whether parentVar
has updated since the last time we called derivedVar.now()
, eliminating redundant evaluations of zoomInFn
(but not solving the problem of potentially missing evaluations if .now()
is not called, so purity is still required).
As Kit suggested, the implementation of .now()
would basically fall back to current behavior if derivedVar.signal
is started (i.e. is being observed by other listeners), the pulling logic would only be enabled if the signal is stopped.
Signals already re-sync their value with the parent signal when they're started after being stopped, so I think the sync between derivedVar.now()
and current value seen by derivedVar.signal
's observers would be already built-in, although extra care needs to be taken to review the starting process of derivedVar.signal
– I need to figure out the right time in the process to switch from standard logic to pull-based logic.
Implications for other parts of the API (StrictSignal)
We restrict the Airstream public API to only expose memory-safe operations, to prevent users from accidentally creating memory leaks, or reading stale values. We can't rule out such things in principle (e.g. users can always pass the wrong owner), but we can make them exceptionally unlikely (e.g. by not requiring users to provide owners explicitly, having Laminar take care of it behind the scenes).
As mentioned before, one of the restrictions that I put on Airstream API is that it shouldn't require user-provided callbacks to be pure – but I put such restrictions for pragmatic reasons – to make Airstream more forgiving – not ideological reasons, and so such restrictions are subject to revision if the right tradeoff opportunity presents itself.
In this case, remember that Var is essentially a bundle of StrictSignal (the read channel including .now()
) and Observer (the write channel). So, if we can zoom
into a Var
without an Owner
and get another Var
, we should be able to map
a StrictSignal
without an Owner
to get another StrictSignal
, no? Currently map
-ing a StrictSignal
does not require an Owner, but it results in a regular Signal
that is lazy and does not have a public .now()
(see raquo/Laminar#130). We would then need to call .observe(owner)
on that mapped signal to convert it back to a StrictSignal
.
So, we could override def map
on StrictSignal
to return another StrictSignal without an Owner, with similar pull-based semantics as we discussed for lazy derived vars above. That way you could do e.g.:
val strictSignal: StrictSignal[A] = parentVar.signal
val mappedSignal: StrictSignal[B] = strictSignal.map(makeBfromA)
val nowB = mappedSignal.now() // no need for owners or observers
As you see, it's very much like a derived var, but without the write channel.
However, unlike derived var's zoomInFn
, the project
callback of signal.map
has more varied use cases, and requiring it to be pure gives me more of a pause. On the other hand, the regular signal.map
is not affected by it, it's only the signals created via strictsignal.map
that are affected, and only when the resulting signal has no observers – a case when currently, the project callback would not evaluate at all (and calling .now() on it is not possible).
It is also not clear which of StrictSignal's methods would be "upgraded" to such purity-requiring versions that return StrictSignal instead of Signal. map
– yes, but any others? recover
, debugWith
, it's not obvious which other ones would support this.
Perhaps we could ease into this new pattern by adding a new pureMap(pureProject)
method on StrictSignal
, and if we're doing that, we might as well implement the new lazy derived var functionality under pureZoom
, keeping the old zoom
alive.
Kit's implementation
This assumes that zoomIn
is pure, and should give you a good idea of what using this kind of API would be like. You can already use this in your code, it uses extension methods and does not require any changes to Airstream internals.
class ZoomedVar[I, O](
parent: Var[I],
zoomIn: I => O,
zoomOut: (I, O) => I,
) extends Var[O]:
def setCurrentValue(
value: Try[O],
transaction: Transaction
): Unit =
val tryA = value.map(zoomOut(parent.now(), _))
parent.setCurrentValue(tryA, transaction)
val signal = new ZoomedVarSignal[I, O](parent, zoomIn)
def getCurrentValue = signal.tryNow()
def underlyingVar = parent.underlyingVar
object ZoomedVar:
extension [I](parent: Var[I])
def zoomed[O](zoomIn: I => O)(zoomOut: (I, O) => I): ZoomedVar[I, O] =
new ZoomedVar(parent, zoomIn, zoomOut)
class ZoomedVarSignal[I, O](
parent: Var[I],
zoomIn: I => O,
) extends MapSignal[I, O](parent.signal, zoomIn, None)
with StrictSignal[O]:
override def tryNow(): Try[O] =
if isStarted then super.tryNow() else parent.tryNow().map(zoomIn)
(I took the liberty of replacing override def now
with override def tryNow
to make tryNow() work too)
Summary
This approach looks very promising. At the very least, it works as "lazy derived vars" mentioned in #112, while retaining the full API of real vars (except for requiring purity). I will need to think some more how this finding affects all the other linked issues, perhaps more opportunities for improvements will be revealed.
Footnotes:
[1] because for that to happen, parentVar
needs to keep a reference to derivedVar
, and that creates circular references which long story short, garbage collection can easily miss in realistic use cases, so we need an Owner to break those circular references when they're no longer needed
Sorry, I've accidentally pressed Cmd+Enter and posted the half-written issue too early. I've now updated it with the full text.
🥳 Very exciting!
I do have some general thoughts/questions about the purity requirements.
Do users currently depend on zoomIn
being called precisely as many times as the parent variable is updated? I can't easily imagine a case, even if my Var
contained mutable state, in which I'd be relying on that invariant—particularly if zoomIn
might be called fewer times than the parent is updated, rather than more.
Perhaps the user could perform some logging/tracking in here? Yet, if one wished to execute some logic each time the parent Var
changed, wouldn't using the Var
's Signal
be the more natural approach?
And because the zoomIn
function may be called even fewer times than before, is this so much a purity issue? I think it could be argued that it's an implementation detail.
Of course, I completely understand that it's impossible to guess how users might be relying on some particular behavior. In fact, it's nearly certain that for any library of modest user-base, someone somewhere is pinning the hopes of their entire application on each incidental bit of functionality, conceived of and maintained by the author or not.
And then, I'm sure the maintenance of such invariants is useful to you, yourself, as you develop a library with such intrinsic complication. You're clearly well versed in the multifarious gotchas of this domain, as well as the guarantees you've thus far painstakingly preserved.
So, I guess what I'm devil's advocating for here is the possibility of the wholesale replacement of zoom
with this new Ownerless variant, and casting the current behavior to the wind.
The benefits would be:
- I'd guess that for 99.9% of cases, the Ownerless implementation will be more pleasant to use. In fact, I'd imagine that some users are currently using
zoom
withunsafeWindowOwner
or some other custom leakyOwner
. I assume this because I know I've done it myself, before I fully understood the consequences, and really wanted tozoom
into myVar
. - Less code > more code. Both for users and maintainers. Users won't have to reckon with the decision between these two overrides, and maintainers won't have to reckon with question of which of the many subtypes of Var they might be dealing with.
- I guess those are the only two points, but that wouldn't be aesthetically pleasing. So I'll reiterate that I'm a big fan of smaller, more opinionated APIs, especially when the alternatives would be so rarely (if my guessing is at all accurate) useful.
And the downside would be the relaxing of some previous behavior. But of course, the recommended alternative would be to simply observe the parent var's signal if you wanted to be notified exactly-once (and no less) per emission.
I do see the argument for StrictSignal
not behaving this way. Perhaps mapStrict
, with a docstring of the caveats, would make sense? Though perhaps it's still worth thinking through the worst negative consequences of just doing the Ownerless version.
My lesser, backup (imp's advocacy) argument would be that perhaps the current behavior could get demoted to zoomImpure
, and the Ownerless variant could get top-billing. Only because I think it's more often what people would want (and to avoid folks jury-rigging zoom
with an unsafeWindowOwner
).
A final, orthogonal thought on API-design is that I do enjoy a post-fix pattern for method alternatives, if only because it more naturally works with autocomplete. The user would type zoom
and see all of the variants listed out at the end zoomX
, zoomY
, zoomZ
. Though, it's more important to be consistent with the library's existing patterns.
Okay! All done! I'm super happy either way, just thought I'd put those arguments out there in case they're helpful.
Those are all good points, thanks for elaborating! At the moment I can't give a specific, compelling, example, for what impure things the users could legitimately be doing in the zoomInFn
callback. If nobody will be able to come up with that by the time I get to implementing it, I say perhaps it's a good indication that the problem is not worth worrying about in practice, and the lazy owner-less behaviour is better off being the default.
I may be wrong in calling the requirement on zoomInFn
"purity" – FP nomenclature isn't my strong suit. At a high level, if zoomInFn
has side effects, lazy evaluation will likely be undesirable, as the execution of said side effects will be unpredictable at definition site. You're right that because we can at least get rid of redundant evaluations of zoomInFn
, the range of "side effects" that could cause problems is reduced. For example, if the zoomInFn
callback is merely expensive to compute, that won't be a problem. If it does some logging that doesn't affect program execution, it's probably fine too. If it creates new instances that don't have structural equality like case classes, proooobably still fine, at least if nothing stateful is involved.
Naming-wise, we already use strict
to indicate strict evaluation, e.g. mapToStrict
vs mapTo
(the latter is accepts the argument by-name, so the argument is evaluated lazily). Perhaps the existing zoom
method could be renamed to zoomStrict
to make room for the new lazy zoom
method, but StrictSignal's map will need some other name, mapStrict
would be inconsistent with other strict
names, and I'm not ready to change the contract of map
.
A final, orthogonal thought on API-design is that I do enjoy a post-fix pattern for method alternatives, if only because it more naturally works with autocomplete.
I must say that IntelliJ gives me all the relevant autocompletions regardless of whether the method starts with what I typed, or just contains it somewhere in the middle, so the main advantage of that strategy is kind of diluted IMO. Still, I try to follow this when the resulting method name sounds fine (e.g. zoomStrict
is fine, but other times adding a suffix just doesn't read well).
Brilliant! Sounds like a plan 😎 I'll put my lil' demo code to work and report back if any weird side-effects crop up.
🥳🥳🥳