Collection of FinalizationGroup objects
mhofman opened this issue ยท 16 comments
With agents keeping a list of all FinalizationGroup
instances on an internal slot, do these get collected when all user references are gone?
Besides the obvious leak, even when [[Cells]]
is empty, it might be surprising for users to get cleanup callbacks after they stopped using their finalization group.
The situation is akin to registering an event listener on an object. If the object is not reachable anymore, I'm not expecting to get my event handler called.
Developers may not go through the trouble of bookkeeping and manually unregistering.
At the very least, I believe a FinalizationGroup instance should only be in agent.[[FinalizationGroups]]
when its cells are not empty.
With agents keeping a list of all FinalizationGroup instances on an internal slot, do these get collected when all user references are gone?
V8 doesn't hold on to the FinalizationGroup
using an internal slot, instead the relationship between the [[Cells]]
and the [[FinalizationGroup]]
is inverted. A WeakCell
holds on to it's corresponding FinalizationGroup
. It might be ideal to update the spec to just reflect this.
Besides the obvious leak, even when
[[Cells]]
is empty, it might be surprising for users to get cleanup callbacks after they stopped using their finalization group.
What does it mean to stop using their finalization group? If you call FinalizationGroup.prototype.unregister
, then no cleanup callbacks will be called. If you remove all references to the finalization group and expect no cleanup callbacks, then that's not really supported.
This is a very good point. I don't have a strong opinion on the editorial question of how we phrase this, but I think we should make it clear that FinalizationGroups can be collected somehow or other.
#86 removed the [[FinalizationGroups]]
slot, leaving to the host the liberty of collecting unreferenced FinalizationGroup
objects.
I think we should clarify what the expected behavior is regarding the invocation of cleanupCallback
after the corresponding FinalizationGroup
is no longer reachable.
Take the following example:
function whenFinalized(target?: object): Promise<true> {
let resolve: (collected: true) => void;
let collected = new Promise<true>(r => void (resolve = r));
let finalizationGroup = new FinalizationGroup<any>(
items => void resolve(true)
);
finalizationGroup.register(target || {}, 0);
target = undefined;
return collected;
}
let object = {};
const finalized = whenFinalized(object);
object = undefined!;
await Promise.resolve();
gc();
await finalized;
There are arguably more efficient implementations of the concept (e.g. by keeping a global finalizationGroup and using the resolve as holding). The question I have is, what is the expected behavior:
- The
finalizationGroup
object can be collected at the same time (or before) its cells are empty, which means thecleanupCallback
may never be called when the targets are collected. In this exampleawait finalized
never returns. - The
finalizationGroup
object should be held until all its cells are empty and that theFinalizationGroupCleanup
job is performed once after the last cell becomes empty. This would preventfinalizationGroup
from being collected before its registered targets, and ensure thatcleanupCallback
will be called (if the engine is capable of performing cleanup jobs).
As I originally mentioned, until #86, the specced behavior seemed to be:
3. Never collect any FinalizationGroup
objects. It implied cleanup callbacks should always be invoked.
My current reading of the spec is that since nothing holds the finalizationGroup
object beside user code, if it becomes unreachable in JavaScript, then it can get collected. That's behavior 1. above.
It also seems to be the current V8 implementation.
If that's the intended behavior, should this be added as a note in the spec?
How important do you think this behavior is for a polyfill? Currently I'm holding a FinalizationGroup
object strongly until its targets are collected. Since collection is never guaranteed, I guess that's compliant, but it might deviate from engine implementations, causing for unexpected issues later.
I have the same question though.
I am currently making some experiment of weakref + FinalizationGroup
Which relay the object operations to another worker, making a transparent RPC
In the code,
I write something like (simplified)
// Does not work
function createCleaner() {
var gcGroup = new FinalizationGroup((tokens) => {
for (let token of tokens) {
cleanup(token)
}
})
return gcGroup
}
createCleaner().register(myProxy, cleanupToken)
However, this surprisely doesn't work in current chrominum's implementation (I use edge 76)
I need to use some dummy value holder to keep the FinalizationGroup
from being garbage collected
// Works but... why should I use a useless value holder here?
globalThis.gcGroups = new Set()
function createCleaner() {
var gcGroup = new FinalizationGroup((token) => {
gcGroups.delete(gcGroup)
for (let refId of refIds) {
cleanup(token)
}
})
gcGroups.add(gcGroup)
return gcGroup
}
createCleaner().register(myProxy, cleanupToken)
This behavior seems to be error prone if anyone forgot to keep the reference somewhere on globalThis
You don't need to create new FinalizationGroup
objects, and can use a singleton for your cleaner.
That's what I meant when I said:
There are arguably more efficient implementations of the concept (e.g. by keeping a global finalizationGroup and using the resolve as holding).
globalThis.cleaner = new FinalizationGroup((refIds) => {
for (let token of refIds) {
cleanup(token)
}
})
cleaner.register(myProxy, cleanupToken)
You running into this issue however does re-inforce the fact we should clarify what the behavior should be.
Does it actually fires everytime some object is collected?
I am not sure how it works here. Wait for everything to collected and fire the callback destroy the purpose here totally, which is a no go.
And some other though is, it is really weird in js that you need to keep something on globalThis in order to make it to work.
All api in js works no matter you attach it to globalThis or not.
Ajax will fire callback as is even you are not attaching the XmlHTTPRequest object to dom.
setTimeout also does not require you to attach the value to somewhere on global
Making it require you to do this to make it work is likely to trap everyone as actually no api in current web platform require you to do this
Does it actually fires everytime some object is collected?
I am not sure how it works here. Wait for everything to collected and fire the callback destroy the purpose here totally, which is a no go.
The callback is invoked every time some object registered with the group has been collected. It does not wait for all registered objects to be collected.
Maybe this is something that could be clarified in the explainer.
And some other though is, it is really weird in js that you need to keep something on globalThis in order to make it to work.
All api in js works no matter you attach it to globalThis or not.
Ajax will fire callback as is even you are not attaching the XmlHTTPRequest object to dom.
setTimeout also does not require you to attach the value to somewhere on global
Making it require you to do this to make it work is likely to trap everyone as actually no api in current web platform require you to do this
Well that's not entirely true.
The closest you can compare this API is with DOM events: a callback handler is attached to an object, and invoked when some condition happens.
setTimeout/setInterval just invokes the function you passed, there is no intermediary holding object.
With DOM events, if your object gets collected, the event handlers are no longer invoked. But for your object to get collected you have to both remove it from the DOM tree, and null-out JS references.
To keep an object alive, you just need to keep it reachable somehow. Adding to the global object is one way, but you could use an export or a closure, e.g. export const cleaner = new FinalizationGroup(...)
XHR never gets added to the DOM tree, which makes it prone to collection if the developer doesn't hold a reference to it. That's why it's specified to not be collected when still in active state: https://xhr.spec.whatwg.org/#garbage-collection
I'm guessing developers would expect a similar behavior: an active registration in a FinalizationGroup should hold the FinalizationGroup alive, so that the cleanup callback gets invoked for those.
To sum up: XHR & co have special garbage collection behavior. FinalizationGroup should have too.
Well... even the dom thing you clarify isn't entirely true. Some elements like Image, Audio can load without attaching to dom.
And reason the handler on detached dom won't fire isn't because it is treated differently by gc somehow either, but actually because there is no path you can walk onto the detached element and dispatch event from the dom root, if you manually dispatch event on detached element, it would work as is.
At the same time. the FinalizationGroup won't work if you did set it to globalThis is totally like a magic.
You did not call sometime like appendChild
on the window, just set it, and it suddenly works, you forgot to set it, and it suddenly not work magically.
Such behavior isn't suggested by the usage at all and is quite confusing,
If such registration is actually required, it would be better to make it explicit by force you to call some method to register (like dom), instead of relying on some implicit garbage collection things.
it very much seems like an otherwise collectible FinalizationGroup that is targeting a live object should not be collectible.
it very much seems like an otherwise collectible FinalizationGroup that is targeting a live object should not be collectible.
How would that be observable? There's no guarantee a GC collects that live object anyway?
It kinda seems like the developer should keep the finalization group alive if they really want the callback to be called and clear it if the don't care anymore. I'm not sure how we could specify anything else?
It's observable by the callback not being called when an equivalent live FinalizationGroup's callback would be called. Making sure both are called is how #147 attempts to specify it.
As mentioned above, other Web APIs such as XHR do not require the developer to keep the object alive to get notified of changes.
Ah, fair enough I didn't think about that. I withdraw my comment.
I'd like to better understand the "independent lifetime" problem.
In which case do we expect a backing store (wasm memory?) and a FinalizationGroup to die before the referencing objects? Wouldn't that put those objects into a corrupted state if they're still alive themselves? If there is a step to avoid this corruption, couldn't such step release the FinalizationGroup?
I do agree that it's possible to build dependent lifetime on top of independent lifetime, I'm just worried that the independent behavior might be surprising to developers.
@mhofman I continue to be confused about why a Wasm object referencing part of a Wasm memory wouldn't have a strong reference to that memory. That would straightforwardly guarantee that the memory reference wouldn't become undefined. In my example of this usage, there is a strong reference to the Wasm memory.
If you think there'd be benefit to using WeakRefs here, it'd be helpful to understand why in some more detail. WeakRefs are designed to go undefined sometimes. They are only useful if you do expect them to go undefined sometimes. So I'd be surprised if this were seen as unexpected "corruption". If you have some draft code using WeakRefs for memory allocation, and you're running into issues here, I'd be interested to see it.
When we discussed this topic in the February 2020 TC39 meeting, we came to consensus on "independent lifetimes" semantics, as long as there is a change the name of FinalizationGroup that implies less certainty (see #180).
Besides the createCleaner
example given by @mmis1000 where a FinalizationRegistry would be accidentally useless because it would be collectible immediately, it seems like the first two code examples in the readme right now of FinalizationRegistry being used are both similarly vulnerable to this issue too. I think this is a sign that many users of FinalizationRegistry are likely to accidentally misuse it with the current design.
FileStream example: if this example code was placed anywhere besides the top-level of a non-module script (where top-level variables would automatically become properties of globalThis), then the FileStream
class itself and its FinalizationRegistry becomes collectible at the same time as the fs
target variable.
makeAllocator, allocate example: This example seems equivalent to the createCleaner
example above. When allocate is called, it returns a reference to a target of the FinalizationRegistry but not the FinalizationRegistry itself, so the FinalizationRegistry instance is collectible immediately after the allocate function returns.
It's possible to change the code examples to not have that issue, but some of the obvious solutions don't work out. (One might think that the issue is the FileStream class and makeAllocator both instantiate FinalizationRegistry, and change the code to instantiate a single FinalizationRegistry inside a top-level variable, but that doesn't solve the problem for the FileStream example. The FileStream example is fixed if you add a globalThis property or an export to the module that references the FileStream class or its FinalizationRegistry instance directly. Though fixing it by adding an export wouldn't be enough if the module was part of a webpack bundle using hot module replacement which lets modules be replaced with new versions at runtime.)
From the developer's perspective, it seems like the current design is easily misused, and for nearly all of the use-cases I can imagine, the current design of losing references FinalizationRegistry causing callbacks to be prevented is never actively helpful, and flipping the design lets the developers fall into the pit of success. If the current capability for letting the finalizer callbacks never run is useful to developers for some case, then FinalizationRegistry could have a .disconnect()
method added that works like MutationObserver's disconnect method to throw out all finalizer callbacks.
Speaking of MutationObservers: they have a roughly similar API to FinalizationRegistry (you instantiate it with a callback, and then call a method on it to make it watch some objects and call a callback when something happens with them), but they still call their callbacks after you drop all your references to the MutationObserver. I've written a lot of code making deep use of MutationObservers, and if I imagined them working this way too, it seems painful. To me, it doesn't seem any more correct for FinalizationGroup to work this way: just because it has to do with reacting to other objects being GC'd doesn't seem like a good reason for it itself to have side effects on being GC'd.
Overall, TC39 has concluded that objects registered against a FinalizationRegistry does not hold the registry alive--and if the registry dies, then it's allowed to cancel the un-called finalization callbacks with it. You can avoid this case by keeping the FinalizationRegistry alive, e.g., by setting a global variable to the FinalizationRegistry.