Batched heartbeat got canceled due to caller context cancellation
Opened this issue · 1 comments
codemonkeycxy commented
Expected Behavior
Say I have the following activity code
func (a *Activities) MyActivity(ctx) (out, error)(
g, gctx := errgroup.WithContext(ctx)
g.SetLimit(4)
for _, batch := range batches {
batch := batch
g.Go(func() error {
activity.RecordHeartbeat(gctx)
// Goroutine logic
})
}
if err := g.Wait(); err != nil {
return nil, err
}
for _, item := range array {
activity.RecordHeartbeat(ctx)
// Additional logic
}
}
I expect heartbeat to get sent out to the server and my activity to NOT time out
Actual Behavior
Activity timed out with activity Heartbeat timeout
This is because g.Wait()
cancels gctx
but gctx
is still referenced in SDK Heartbeat
function's goroutine https://github.com/temporalio/sdk-go/blob/master/internal/internal_task_handlers.go#L2031. Due to the batch heartbeat logic, gctx
was used after g.Wait()
had finished in my code and caused a context canceled
error when making the call to the server
I could change activity.RecordHeartbeat(gctx)
to activity.RecordHeartbeat(ctx)
but wonder if SDK can avoid this pitfall by using context.Background()
Steps to Reproduce the Problem
Specifications
- Version: go.temporal.io/sdk v1.27.0
- Platform: Linux
Quinn-With-Two-Ns commented
Talked offline discussed we should:
- Clarify the documentation around how
RecordHeartbeat
will behave if the calling context is cancelled - If
RecordHeartbeat
is called with multiple contexts we should use the last non cancelled context.