context: add AfterFunc
neild opened this issue · 41 comments
Edit: The latest version of this proposal is #57928 (comment).
This proposal originates in discussion on #36503.
Contexts carry a cancellation signal. (For simplicity, let us consider a context past its deadline to be cancelled.)
Using a context's cancellation signal to terminate a blocking call to an interruptible but context-unaware function is tricky and inefficient. For example, it is possible to interrupt a read or write on a net.Conn or a wait on a sync.Cond when a context is cancelled, but only by starting a goroutine to watch for cancellation and interrupt the blocking operation. While goroutines are reasonably efficient, starting one for every operation can be inefficient when operations are cheap.
I propose that we add the ability to register a function which is called when a context is cancelled.
package context
// OnDone arranges for f to be called in a new goroutine after ctx is cancelled.
// If ctx is already cancelled, f is called immediately.
// f is called at most once.
//
// Calling the returned CancelFunc waits until any in-progress call to f completes,
// and stops any future calls to f.
// After the CancelFunc returns, f has either been called once or will not be called.
//
// If ctx has a method OnDone(func()) CancelFunc, OnDone will call it.
func OnDone(ctx context.Context, f func()) CancelFunc
OnDone permits a user to efficiently take some action when a context is cancelled, without the need to start a new goroutine in the common case when operations complete without being cancelled.
OnDone makes it simple to implement the merged-cancel behavior proposed in #36503:
func WithFirstCancel(ctx1, ctx2 context.Context) (context.Context, context.CancelFunc) {
ctx, cancel := context.WithCancel(ctx1)
stopf := context.OnDone(ctx2, func() {
cancel()
})
return ctx, func() {
cancel()
stopf()
}
}
Or to stop waiting on a sync.Cond when a context is cancelled:
func Wait(ctx context.Context, cond *sync.Cond) error {
stopf := context.OnDone(ctx, cond.Broadcast)
defer stopf()
cond.Wait()
return ctx.Err()
}
The OnDone func is executed in a new goroutine rather than synchronously in the call to CancelFunc that cancels the context because context cancellation is not expected to be a blocking operation. This does require the creation of a goroutine, but only in the case where an operation is cancelled and only for a limited time.
The CancelFunc returned by OnDone provides both a mechanism for cleaning up resources consumed by OnDone and a synchronization mechanism. (See the ContextReadOnDone example below.)
Third-party context implementations can provide an OnDone method to efficiently schedule OnDone funcs. This mechanism could be used by the context package itself to improve the efficiency of third-party contexts: Currently, context.WithCancel and context.WithDeadline start a new goroutine when passed a third-party context.
Two more examples; first, a context-cancelled call to net.Conn.Read using the APIs available today:
// ContextRead demonstrates bounding a read on a net.Conn with a context
// using the existing Done channel.
func ContextRead(ctx context.Context, conn net.Conn, b []byte) (n int, err error) {
errc := make(chan error)
donec := make(chan struct{})
// This goroutine is created on every call to ContextRead, and runs for as long as the conn.Read call.
go func() {
select {
case <-ctx.Done():
conn.SetReadDeadline(time.Now())
errc <- ctx.Err()
case <-donec:
close(errc)
}
}()
n, err = conn.Read(b)
close(donec)
if ctxErr := <-errc; ctxErr != nil {
conn.SetReadDeadline(time.Time{})
err = ctxErr
}
return n, err
}
And with context.OnDone:
func ContextReadOnDone(ctx context.Context, conn net.Conn, b []byte) (n int, err error) {
var ctxErr error
// The OnDone func runs in a new goroutine, but only when the context expires while the conn.Read is in progress.
stopf := context.OnDone(ctx, func() {
conn.SetReadDeadline(time.Now())
ctxErr = ctx.Err()
})
n, err = conn.Read(b)
stopf()
// The call to stopf() ensures the OnDone func is finished modifying ctxErr.
if ctxErr != nil {
conn.SetReadDeadline(time.Time{})
err = ctxErr
}
return n, err
}
Change https://go.dev/cl/462855 mentions this issue: context: add OnDone
It is worth noting that OnDone is a performance optimization. We can write it today in terms of the existing context package. The benefit of adding it to the package is that it permits us to take action on context cancellation without leaving a goroutine sitting around waiting for the cancel to happen.
func OnDone(ctx context.Context, f func()) context.CancelFunc {
stopc := make(chan struct{}, 1)
donec := make(chan struct{})
go func() {
select {
case <-ctx.Done():
f()
case <-stopc:
}
close(donec)
}()
return func() {
select {
case stopc <- struct{}{}:
default:
}
<-donec
}
}
The ContextReadOnDone example is still a bit awkward, in the sense that there are a few things that have to be done exactly right to avoid any problems. Perhaps we can tighten up the idea, at the cost of another closure.
// Try executes fn while watching ctx. If ctx is cancelled, Try calls cancel and returns ctx.Err().
// Otherwise, Try returns the result of fn.
// This is implemented using an internal implementation of OnDone as described above.
func Try(ctx context.Context, fn func() error, cancel func()) error
// Example of using Try.
func ContextReadOnDone(ctx context.Context, conn net.Conn, b []byte) (n int, err error) {
	err = context.Try(ctx, func() error {
		var rerr error
		n, rerr = conn.Read(b)
		return rerr
	}, func() {
		conn.SetReadDeadline(time.Now())
	})
	return n, err
}
Try does look quite elegant to use. I don't think it would work for the WithFirstCancel case, and in general Try can be implemented in terms of OnDone but not vice-versa, so if we were to have only one I'd vote for OnDone.
func Try(ctx context.Context, fn func() error, cancel func()) error {
stopf := context.OnDone(ctx, cancel)
err := fn()
stopf()
if ctx.Err() != nil {
return ctx.Err()
}
return err
}
The Try function looks a lot like the os/exec functionality added for #50436.
- Try is analogous to (*exec.Cmd).Wait.
- The fn argument is analogous to the subprocess itself.
- The cancel argument is analogous to the exec.Cmd.Cancel field.
Based on that analogy, I would suggest a slightly different implementation of Try in terms of OnDone:
func Try(ctx context.Context, fn func() error, cancel func()) error {
canceled := false
stop := context.OnDone(ctx, func() {
cancel()
canceled = true
})
err := fn()
stop()
if canceled && err == nil {
return ctx.Err()
}
return err
}
Notably: in case of cancellation I would prefer to still return the result of fn if it is non-nil, since it may contain more detail about the result, but I would return ctx.Err() if fn returns nil in case the result of fn is spurious.
This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
- rsc for the proposal review group
This seems similar to having time.AfterFunc to avoid making a goroutine that calls time.Sleep and then the function. Here we avoid making a goroutine that receives from ctx.Done and then calls the function. Perhaps it should be context.After or AfterFunc?
context.After(ctx, func() { println("ctx is done!") })
reads nicely to me. It's nice that this would let people implement context.Merge themselves at no efficiency cost compared to the standard library.
We should specify that f is always run in a goroutine by itself, at least semantically, even in the case covered by this line of the doc comment:
// If ctx is already cancelled, f is called immediately.
f shouldn't be called directly by context.After in that case either.
I like context.After
After is nice and short, but I think I prefer context.AfterFunc for consistency with time.AfterFunc.
This may be implied, but to clarify since I don't see it in the example... After/AfterFunc should return a CancelFunc like the initial OnDone proposal.
OK, so it sounds like the signature is
package context
func AfterFunc(ctx Context, f func()) (stop func() bool)
AfterFunc arranges to call f after ctx is done (cancelled or timed out), and it calls f in a goroutine by itself. Even if ctx is already done, calling AfterFunc does not wait for f to return.
Multiple calls to AfterFunc on a given ctx are valid and operate independently; one does not replace another.
Calling stop stops the association of ctx with f. It reports whether the call stopped f from being run.
If stop returns false, then the context is already done and the function f has been started in its own goroutine; stop does not wait for f to complete before returning. If the caller needs to know whether f is completed, it must coordinate with f explicitly.
(This last paragraph is adapted from time.Timer.Stop.)
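A small sketch of the two outcomes described above may help, assuming Go 1.21 or later, where context.AfterFunc is available with this signature. The helper names stopBeforeDone and stopAfterDone are hypothetical and exist only for illustration.

```go
package main

import (
	"context"
	"fmt"
)

// stopBeforeDone registers f on a live context and stops it before the
// context is cancelled. It reports what stop returned.
func stopBeforeDone() bool {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	stop := context.AfterFunc(ctx, func() { fmt.Println("never printed") })
	return stop() // true: the association was removed, f will not run
}

// stopAfterDone registers f on a context that is already cancelled.
// AfterFunc starts f in its own goroutine and returns without waiting,
// so this function coordinates with f explicitly through a channel.
func stopAfterDone() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	finished := make(chan struct{})
	stop := context.AfterFunc(ctx, func() { close(finished) })
	stopped := stop() // false: f has already been started
	<-finished        // explicit coordination with f
	return stopped
}

func main() {
	fmt.Println(stopBeforeDone()) // true
	fmt.Println(stopAfterDone())  // false
}
```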
Do I have that right? Anything wrong there?
I'd prefer to have calling stop wait until f is completed if it has already started, since that makes using AfterFunc in a race-free manner simpler; you're guaranteed that after a call to stop, f either has run or will not run. But it doesn't make a difference in most of the motivating examples so perhaps consistency with time.AfterFunc is preferable.
Even if ctx is already done, calling AfterFunc does not wait for f to return.
I find this sentence a bit off when I read it. If I understand the proposal, calling AfterFunc never waits for f to return. The above sentence seems to emphasize the corner case of ctx already being done, though. That emphasis makes it seem like there is a special case involved, but there really isn't. Consider rewording to:
Calling AfterFunc does not wait for f to return, even if ctx is already done.
I think this better emphasizes that the behavior is always the same while still pointing out the less than obvious corner case.
While naming the returned func stop looks reasonable to avoid confusion with the already existing cancel func, I don't think stop is a good name for this use case.
Moreover, as stop is related to f, stop() == false may be misunderstood as "f is running now".
To me cancel still sounds much more suitable here, and since there is some confusion in both cases, maybe it's too early to reject cancel as a candidate name.
As for other names... maybe detach, unhook, revoke, prevent?
If f is long-running, then you'd need to make arrangements to interrupt it.
If you do want a long-running f, and a stop that doesn't wait for it, you can start a goroutine for the long-running operation explicitly:
started := false
stop := context.AfterFunc(ctx, func() {
started = true
go longRunningOperation()
})
stop()
if started {
// longRunningOperation is in progress
}
This is less convenient than a non-blocking stop (and a bit less efficient), but I think it's also the less common case. Every motivating example I've come up with for AfterFunc is fast: signaling a sync.Cond, setting a timeout on a net.Conn, etc.
A non-blocking stop makes it easier to inadvertently leak a long-running operation. In general, functions should clean up any goroutines they start before returning. A blocking stop ensures that the AfterFunc isn't left running unless the programmer takes specific steps to create a goroutine for it, as above.
A non-blocking stop can also make the common case quite a bit more subtle, and possibly less efficient. Taking the case of reading from a net.Conn, we need to create a channel to synchronize with the AfterFunc goroutine even in the common case where the AfterFunc is not called:
func ContextReadOnDone(ctx context.Context, conn net.Conn, b []byte) (n int, err error) {
stopped := make(chan struct{})
stopf := context.AfterFunc(ctx, func() {
conn.SetReadDeadline(time.Now())
close(stopped)
})
n, err = conn.Read(b)
if !stopf() {
// The AfterFunc func may still be running, so we need to wait for it to finish before resetting the conn deadline.
//
// Failing to wait here means that we might return with a still-running goroutine which will set the
// conn deadline at some point in the future (if we have a race between Read returning successfully
// and the context expiring).
<-stopped
conn.SetReadDeadline(time.Time{})
err = ctx.Err()
}
return n, err
}
@neild your second example is great, but the first one looks very race-prone:
started := false
stop := context.AfterFunc(ctx, func() {
started = true
go longRunningOperation()
})
stop()
if started {
// longRunningOperation is in progress
}
TBH I don't remember the outcome of the last change to https://go.dev/ref/mem (i.e. whether it is safe to read/write int-sized vars like bool), but even if it is safe, the callback can have been started by context.AfterFunc without yet having executed its first operation, started = true, so there is a race here anyway. Probably worth rewriting, because, you know, people will copy-paste it from here too. :)
@powerman The first example assumes a stop function which blocks until any in-progress call to f has completed. The call to stop synchronizes access to the started var; after stop returns, either the func has run and set started = true or it will never run and started remains false.
But doesn't a blocking stop still return a bool, which could be used instead of the extra started var?
A blocking stop could return a bool indicating whether f ran or not, but there's less need for it. A non-blocking stop must provide a way to tell whether f has been started.
A blocking stop can easily lead to deadlocks, especially if these functions are trying to send on channels to notify other goroutines that the context is cancelled. A non-blocking stop won't, and it matches time.Timer.Stop. I'm not entirely sure how to decide between those benefits and the ones @neild has pointed out.
Perhaps a non-blocking stop to match time.Timer.Stop, plus the Try function proposed by @ianlancetaylor above to handle the cases where you want to synchronize on the AfterFunc?
// Try executes fn while watching ctx. If ctx is cancelled, Try calls cancel and returns ctx.Err().
// Otherwise, Try returns the result of fn.
func Try(fn func() error, cancel func()) error
Alternatively, AfterFunc could return a type with Stop and Wait methods. Or Stop could return something that can be waited on. (func AfterFunc(f func()) func() func()?)
It seems unusual to me to have to wait for the stop function. And it is after all not too hard to do if you need to. So maybe we could start with a non-blocking stop function.
Sounds like consensus is in favor of a non-blocking stop, which does have the advantage of matching time.AfterFunc. And you can build a blocking stop atop the non-blocking one easily enough if necessary.
Non-blocking stop it is. Have all remaining concerns about this proposal been addressed?
Based on the discussion above, this proposal seems like a likely accept.
- rsc for the proposal review group
It'd be helpful to update the initial post with the naming and semantics decided on in the thread.
Current version of the proposal (will link to this from the initial post):
package context
// AfterFunc arranges to call f in a new goroutine after ctx is done (cancelled or timed out).
// If ctx is already done, f is called immediately in a new goroutine.
//
// Multiple calls to AfterFunc on a context operate independently; one does not replace another.
//
// Calling the returned stop function stops the association of ctx with f.
// It returns true if the call stopped f from being run.
// If stop returns false, either the context is done and f has been started in its own goroutine; or f was already stopped.
// The stop function does not wait for f to complete before returning.
// If the caller needs to know whether f is completed, it must coordinate with f explicitly.
//
// If ctx has a method AfterFunc(func()) func() bool, AfterFunc will call it.
func AfterFunc(ctx Context, f func()) (stop func() bool)
(Calling stop multiple times always returns false after the first call; this matches time.Timer.Stop.)
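The repeated-stop behavior can be checked with a short sketch, assuming Go 1.21 or later; stopTwice is a hypothetical name used only for this illustration.

```go
package main

import (
	"context"
	"fmt"
)

// stopTwice registers a no-op func on a context that is never cancelled
// before stop is called, then calls stop twice and reports both results.
func stopTwice() (first, second bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	stop := context.AfterFunc(ctx, func() {})
	first = stop()  // this call removes the association
	second = stop() // nothing left to stop
	return first, second
}

func main() {
	first, second := stopTwice()
	fmt.Println(first, second) // true false
}
```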
If the context has a method AfterFunc(func()) func() bool, will AfterFunc only call that method and not do anything else?
Yes.
How about:
// If ctx has a "AfterFunc(func()) func() bool" method, AfterFunc will use it to schedule the call.
Change https://go.dev/cl/482695 mentions this issue: context: add AfterFunc
No change in consensus, so accepted.
This issue now tracks the work of implementing the proposal.
- rsc for the proposal review group
One thing that wasn't discussed yet is how this proposal interacts with #40221 (context.WithoutCancel).
IIUC, the function supplied in AfterFunc is still called, which means that using AfterFunc to do anything with context values (like closing a trace span) is going to be problematic if context.WithoutCancel is ever used. I know that it's possible to do everything that context.AfterFunc is doing, but I wonder if it's going to be more attractive to assume that contexts have a life-cycle that can be hooked into, while at the same time a method exists to escape a context from that life-cycle.
Internally at Google, we have a solution that adds a different life-cycle hook to do work during context detaching (e.g. to open a new trace span for a background goroutine), assuming the right context detaching method is used.
It's probably worth a separate proposal, but I do wonder if the implementation of this proposal is going to make it more complicated to add detaching functionality later. To be clear, I don't know the answer to that question.
AfterFunc(ctx, f) calls f after ctx is done. If ctx is the result of WithoutCancel, it never becomes done and f will never be called.
WithoutCancel(parent) creates a new context. It does not affect the cancelation of the parent context, and will not interfere with functions registered with AfterFunc on the parent context.
parent, cancel := context.WithCancel(context.Background())
context.AfterFunc(parent, func() {
fmt.Println("parent canceled")
})
child := context.WithoutCancel(parent)
context.AfterFunc(child, func() {
fmt.Println("child canceled") // this will never be called
})
cancel()
// Prints "parent canceled" and nothing else.
Got it. Do you think it might make sense to be explicit about the fact that the function might never be called? That might dissuade its use in cases like this:
func dispatchRequest(ctx context.Context) {
ctx, cancel := context.WithCancel(ctx)
defer cancel()
handleRequest(ctx)
}
func handleRequest(ctx context.Context) {
var span trace.Span
ctx, span = tracer.Start(ctx, "operation")
context.AfterFunc(ctx, span.End)
backgroundAction(ctx)
}
func backgroundAction(ctx context.Context) {
ctx = context.WithoutCancel(ctx)
go someLibraryFunc(ctx)
}
func someLibraryFunc(ctx context.Context) {
span = trace.SpanFromContext(ctx) // <--- span likely ended already, updates are not allowed
}
Arguably, this issue already exists today. I am wondering if it will be worse, because it's too easy to assume that context.AfterFunc is going to do something different.
Change https://go.dev/cl/486535 mentions this issue: doc: add release note for context.AfterFunc
I only just saw this issue; sorry for the late comment.
I'm not entirely sure whether the semantics of stop are correct here (and I'm not convinced that Timer.Stop is good precedent, because that API is notoriously error-prone and hard to use).
An alternative semantic could be:
// The stop function reports whether the function has been successfully stopped; that is, it returns false
// if and only if the function has been invoked already.
That makes it feasible to call stop multiple times concurrently and have consistent results between them, and it seems to me like a simpler invariant to explain.
I'm wondering what the use case is for knowing whether this is the stop call that prevented the function running.
@rogpeppe Thanks, I suggest that you open that in a new issue, as these semantics have already been implemented. We can make the new issue a release blocker for 1.21. Thanks.
@ianlancetaylor Thanks. Done.