bufbuild/protocompile

Data race in `v0.3.0`

nhatthm opened this issue · 3 comments

Description

After upgrading to github.com/jhump/protoreflect to v1.15.0, my unit tests fail with -race, it's okay when I remove all t.Parallel() calls. After checking the dependencies, I downgrade github.com/bufbuild/protocompile from v0.3.0 to v0.2.1-0.20230123224550-da57cd758c2f, that's the version that github.com/jhump/protoreflect uses in the go.mod file. Then all the tests passed again.

The tests are simple, just parsing a simple proto file and constructing a new proto message. However, there are 2 tests that parse the same proto file

Env

  • go1.20
  • Mac M1

Expected

The tests pass without a race condition.

Actual

WARNING: DATA RACE
Read at 0x00c00023e4c8 by goroutine 30:
  google.golang.org/protobuf/types/descriptorpb.(*FieldDescriptorProto).GetTypeName()
      /runner/_work/my-project/vendor/google.golang.org/protobuf/types/descriptorpb/descriptor.pb.go:958 +0xe04
  github.com/bufbuild/protocompile/linker.resolveFieldTypes()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/linker/resolve.go:365 +0xe20
  github.com/bufbuild/protocompile/linker.(*result).resolveReferences.func1()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/linker/resolve.go:246 +0xad4
  github.com/bufbuild/protocompile/walk.messageDescriptor()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/walk/walk.go:122 +0x159
  github.com/bufbuild/protocompile/walk.DescriptorsEnterAndExit()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/walk/walk.go:70 +0xc5
  github.com/bufbuild/protocompile/linker.(*result).resolveReferences()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/linker/resolve.go:200 +0x9a4
  github.com/bufbuild/protocompile/linker.Link()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/linker/linker.go:84 +0x5d1
  github.com/bufbuild/protocompile.(*task).link()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:565 +0x105
  github.com/bufbuild/protocompile.(*task).asFile()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:505 +0x11b0
  github.com/bufbuild/protocompile.(*executor).doCompile()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:358 +0x53e
  github.com/bufbuild/protocompile.(*executor).compileLocked.func1()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:277 +0x10c

Previous write at 0x00c00023e4c8 by goroutine 32:
  github.com/bufbuild/protocompile/linker.resolveFieldTypes()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/linker/resolve.go:406 +0x1dd9
  github.com/bufbuild/protocompile/linker.(*result).resolveReferences.func1()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/linker/resolve.go:246 +0xad4
  github.com/bufbuild/protocompile/walk.messageDescriptor()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/walk/walk.go:122 +0x159
  github.com/bufbuild/protocompile/walk.DescriptorsEnterAndExit()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/walk/walk.go:70 +0xc5
  github.com/bufbuild/protocompile/linker.(*result).resolveReferences()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/linker/resolve.go:200 +0x9a4
  github.com/bufbuild/protocompile/linker.Link()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/linker/linker.go:84 +0x5d1
  github.com/bufbuild/protocompile.(*task).link()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:565 +0x105
  github.com/bufbuild/protocompile.(*task).asFile()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:505 +0x11b0
  github.com/bufbuild/protocompile.(*executor).doCompile()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:358 +0x53e
  github.com/bufbuild/protocompile.(*executor).compileLocked.func1()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:277 +0x10c

Goroutine 30 (running) created at:
  github.com/bufbuild/protocompile.(*executor).compileLocked()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:259 +0x369
  github.com/bufbuild/protocompile.(*executor).compile()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:244 +0xf8
  github.com/bufbuild/protocompile.(*task).asFile()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:458 +0x96a
  github.com/bufbuild/protocompile.(*executor).doCompile()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:358 +0x53e
  github.com/bufbuild/protocompile.(*executor).compileLocked.func1()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:277 +0x10c

Goroutine 32 (running) created at:
  github.com/bufbuild/protocompile.(*executor).compileLocked()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:259 +0x369
  github.com/bufbuild/protocompile.(*executor).compile()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:244 +0xf8
  github.com/bufbuild/protocompile.(*task).asFile()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:458 +0x96a
  github.com/bufbuild/protocompile.(*executor).doCompile()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:358 +0x53e
  github.com/bufbuild/protocompile.(*executor).compileLocked.func1()
      /runner/_work/my-project/vendor/github.com/bufbuild/protocompile/compiler.go:277 +0x10c
==================
jhump commented

@nhatthm, do your tests use a source resolver that has an override version of google/protobuf/descriptor.proto? If so, does the search result provide the Proto field and not the Desc field?

I do see a potential concurrency problem and will create a fix right now. But it turns out that the underlying problem has existed since before v0.3.0. It happens if the resolver you provide to the compiler returns descriptor protos that might be shared across multiple compile operations.

Before v0.3.0, it could have only been triggered if multiple concurrent compile operations imported the same descriptor, and the resolver returned the same proto to both operations. But as of v0.3.0, even if a file does not directly import google/protobuf/descriptor.proto, it can still get used (and possibly mutated) from multiple concurrent operations.

@nhatthm, do your tests use a source resolver that has an override version of google/protobuf/descriptor.proto? If so, does the search result provide the Proto field and not the Desc field?

I don't quite understand what you mean but thanks for the hint. Here is the logic

func newParser(importPaths ...string) *protoparse.Parser {
	fds := loadFileDescriptors() // <==== It DOES load the standard imports, including `google/protobuf/descriptor.proto`

	p := &protoparse.Parser{
		ImportPaths: importPaths,
		LookupImportProto: func(path string) (*dpb.FileDescriptorProto, error) {
			if fd, ok := fds[path]; ok {
				return fd, nil
			}

			return nil, fmt.Errorf("%w: %s", errProtoNotFound, path)
		},
	}

	return p
}

This is new since github.com/jhump/protoreflect@v1.15.0-rc1, I found out that it does the same job

https://github.com/jhump/protoreflect/blob/bc601aed79cb23f3a46dd0a2817d139551f94c28/desc/protoparse/parser.go#L456

After removing all the standard imports on my side, it works again.

So I guess we can close this issue, thanks for your support!

jhump commented

@nhatthm, I'm glad you have a work-around that works. #103 will also fix the issue in a more bullet-proof way.