Data race in azblob client
muroj opened this issue · 5 comments
Bug Report
Environment Info
- import path of package in question:
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/container
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob v1.4.1
go version go1.23.2 darwin/arm64
The Go race detector reports a data race in my program while it enumerates containers and blobs in parallel. Here is a minimal snippet demonstrating my approach.
client, err := azblob.NewClient(url, credential, nil)
handleError(err)
ctrPager := client.NewListContainersPager(nil)
for ctrPager.More() {
	resp, err := ctrPager.NextPage(context.TODO())
	handleError(err)
	for _, c := range resp.ContainerItems {
		// limit blob enumeration, some containers have millions of blobs
		ctx, cancel := context.WithTimeout(context.Background(), time.Second*60)
		defer cancel()
		go func(c *service.ContainerItem) {
			blobPgr := client.NewListBlobsFlatPager(*c.Name, nil)
			for blobPgr.More() {
				var blobs container.ListBlobsFlatResponse
				result := make(chan struct{})
				go func() {
					blobs, err = blobPgr.NextPage(ctx)
					handleError(err)
					result <- struct{}{}
				}()
				select {
				case <-ctx.Done():
					fmt.Printf("context cancelled while processing blobs for container: %s, %v\n", *c.Name, ctx.Err())
				case <-result:
					doStuff(blobs)
				}
			}
		}(c)
	}
}
Running this program under the race detector produces the following report, which points at the blob pager.
WARNING: DATA RACE
Write at 0x00c0001f4358 by goroutine 106:
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/container.(*Client).NewListBlobsFlatPager.func2()
/Users/jmuro/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/storage/azblob@v1.4.1/container/client.go:283 +0x8c
github.com/Azure/azure-sdk-for-go/sdk/azcore/runtime.(*Pager[go.shape.struct { github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/internal/generated.ListBlobsFlatSegmentResponse; ClientRequestID *string; ContentType *string; Date *time.Time; RequestID *string; Version *string }]).NextPage()
/Users/jmuro/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/azcore@v1.14.0/runtime/pager.go:79 +0x5b4
main.getContainerStats.func1()
/Users/jmuro/acme/ops/util/azure/storage/utils/azure_storage_stats.go:168 +0x84
Previous write at 0x00c0001f4358 by goroutine 104:
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/container.(*Client).NewListBlobsFlatPager.func2()
/Users/jmuro/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/storage/azblob@v1.4.1/container/client.go:283 +0x8c
github.com/Azure/azure-sdk-for-go/sdk/azcore/runtime.(*Pager[go.shape.struct { github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/internal/generated.ListBlobsFlatSegmentResponse; ClientRequestID *string; ContentType *string; Date *time.Time; RequestID *string; Version *string }]).NextPage()
/Users/jmuro/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/azcore@v1.14.0/runtime/pager.go:79 +0x5b4
main.getContainerStats.func1()
/Users/jmuro/acme/ops/util/azure/storage/utils/azure_storage_stats.go:168 +0x84
I am new to goroutines, contexts, and channels, so I wouldn't be surprised if my code is buggy. Regardless, I'd like confirmation on whether the Azure Go SDK is thread-safe. In particular, is it safe to share clients between goroutines? This Go SDK design doc implies that clients should be thread-safe, but is that the case? The .NET SDK documentation explicitly mentions thread safety, but the Go SDK documentation does not.
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage.
Hello!
You are correct that SDK clients are goroutine-safe. However, the Pager[T] and Poller[T] types returned from SDK methods are not. In your example, the following is not goroutine-safe:
go func() {
	blobs, err = blobPgr.NextPage(ctx)
	handleError(err)
	result <- struct{}{}
}()
Here, it's possible for multiple goroutines to call NextPage() on the same blobPgr instance, which is not goroutine-safe: when the select returns on ctx.Done(), the enclosing for loop continues and spawns a new goroutine that calls NextPage() while the previous call may still be in flight.
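For illustration, here is a minimal sketch of one way to restructure the blob enumeration so that each pager is created and consumed by a single goroutine. It reuses the client, resp, handleError, and doStuff names from your snippet as placeholders; exact error handling is up to you. Since NextPage already accepts a context, the 60-second timeout can be enforced without handing the pager to a second goroutine:

for _, c := range resp.ContainerItems {
	c := c // capture the loop variable (not needed as of Go 1.22)
	go func() {
		// limit blob enumeration, some containers have millions of blobs
		ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
		defer cancel()

		// the pager never leaves this goroutine, so NextPage is never called concurrently
		blobPgr := client.NewListBlobsFlatPager(*c.Name, nil)
		for blobPgr.More() {
			blobs, err := blobPgr.NextPage(ctx) // the context enforces the timeout
			if err != nil {
				handleError(err)
				return
			}
			doStuff(blobs)
		}
	}()
}

The shared azblob.Client is still used from every goroutine, which is fine; only the per-container pager needs to stay confined to a single goroutine.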
Please ping back if you have further questions.
Hi @muroj. Thank you for opening this issue and giving us the opportunity to assist. We believe that this has been addressed. If you feel that further discussion is needed, please add a comment with the text "/unresolve" to remove the "issue-addressed" label and continue the conversation.
It appears this wasn't documented. I've added doc comments about this in #23679
Hi @muroj, since you haven't asked that we /unresolve the issue, we'll close this out. If you believe further discussion is needed, please add a comment with the text "/unresolve" to reopen the issue.