rafaeljusto/redigomock

Several methods are not safe for concurrent use

Closed this issue · 8 comments

Do is mostly safe (although see #59 and #61), but other commands are not.

In general it's not possible to fully solve this problem -- Do and Errors, for example, cannot be made safe from each other without breaking compatibility, since Errors is a public field. So in #59 I just documented it as a problem. But there may be some cases that are possible to improve. Here's a race detector trace which shows one possible problem:

WARNING: DATA RACE
Write at 0x00c0002ff7d8 by goroutine 91: 
  github.com/rafaeljusto/redigomock.(*Conn).removeRelatedCommands()
      /home/benkraft/.go/pkg/mod/github.com/rafaeljusto/redigomock@v2.3.0+incompatible/redigomock.go:136 +0x224
  github.com/rafaeljusto/redigomock.(*Conn).Command()
      /home/benkraft/.go/pkg/mod/github.com/rafaeljusto/redigomock@v2.3.0+incompatible/redigomock.go:78 +0x13d
[continues into our code calling Command()]

Previous read at 0x00c0002ff7d8 by goroutine 95:
  github.com/rafaeljusto/redigomock.(*Conn).find()
      /home/benkraft/.go/pkg/mod/github.com/rafaeljusto/redigomock@v2.3.0+incompatible/redigomock.go:115 +0x42
  github.com/rafaeljusto/redigomock.(*Conn).do()
      /home/benkraft/.go/pkg/mod/github.com/rafaeljusto/redigomock@v2.3.0+incompatible/redigomock.go:192 +0x94
  github.com/rafaeljusto/redigomock.(*Conn).Do()
      /home/benkraft/.go/pkg/mod/github.com/rafaeljusto/redigomock@v2.3.0+incompatible/redigomock.go:188 +0x4a3
  github.com/gomodule/redigo/redis.(*activeConn).Do()
      /home/benkraft/.go/pkg/mod/github.com/gomodule/redigo@v2.0.0+incompatible/redis/pool.go:447 +0x1d6
[continues into our code calling Do()]

You can see the two are racing on c.commands, which makes sense.

Unfortunately I can't share the test at issue, but basically what it does is

// prod code
func DeleteInBackground() {
  go conn.Do("DELETE", ...)
}

// test
conn := ...
conn.Command(...).Expect(...)

DeleteInBackground()

conn.Command(...).Expect(...)

It's not clear what this test was doing was really right, but ideally it would at least be race-detector-clean.

After looking more closely, it looks like Flush, Send, and Receive is also not safe for concurrent use, due to unprotected accesses to c.queue and c.replies. If someone is interested in fixing those, it should be easy to write some race detector tests similar to #61 and reproduce the issue, and since those fields are private, it is presumably fixable without API changes.

Thanks for reporting! I'm now thinking of refactoring the library and breaking backward compatibility with a new major version to solve all those concurrencies issues. 🤔

Yeah, that makes sense to me -- I didn't want to be the one say it's time to break compat (and I think it will not be a trivial break, sadly), especially before I realized how many different places these problems could manifest, but if you think it makes sense it certainly works for me.

To really solve this everywhere (or at least make an API conservative enough to do so), here's what I would think would be needed:

  • Make Cmd.Responses, Cmd.Called, Conn.SubResponses, and Conn.Errors private (replace them with reader/writer methods as necessary; some already exist or aren't necessary)
  • Document that Response, Cmd.Name, Cmd.Args, and all the other fields of Conn (ReceiveWait, ReceiveNow, *Mock) should not be modified after construction (or make them private and provide appropriate constructors, but that doesn't seem super necessary to me)
    It's not a small compat break, sadly, although people who are using those variables are probably the most likely ones to hit the problems.

Then it's just putting locks everywhere -- honestly one big lock on the whole Conn and one for each Cmd would probably be easiest, but one could also do a lock/once for each variable. I think writing some race detector tests will probably catch all of the most common cases easily.

If that all makes sense to you, I'll probably have time this afternoon or tomorrow to take a stab at it, unless you'd prefer to do it yourself.

Yeah, if you have some time to start the refactoring it would be great. I can jump in to help you out once we have an initial version. But if you're also busy, I can start working on it this weekend or at the beginning of next week.

Alright, see #62. There are a few places where it needs more tests, at least, and if you're excited there's room to be a bit more precise about the locking too. (See the last commit message for the details.) I probably won't have time to finish that up until next week, so feel free to jump in and do that, or I can get to it then!

Thank you for all your work! I will jump in this weekend on it.

there's room to be a bit more precise about the locking too.

Yeah, I don't think we need to be precise as this is just a testing library. If people start complaining of testing performance we can review it.

@benjaminjkraft All merged now, new tag v3.0.0 created, also changed modules to be v3 to follow the tag:

go get -u github.com/rafaeljusto/redigomock/v3

Thanks again for your contribution! 🙂