golang/go

sync: enhancement: have the new sync.Map use sharding

orcaman opened this issue · 6 comments

Hi guys,

Was glad to see the new sync.Map code merged into master recently. Did you guys consider using a sharding technique, such as the one used here, to reduce the time spent waiting on locks?

Cheers,
Or

Related #18802

CC @bcmills

Note that we don't use the issue tracker for questions. This may be better discussed on the golang-dev mailing list. See https://golang.org/wiki/Questions.

The short answer is, "Yes, we considered it."

The longer answer is that sync.Map addresses the specific problem of cache contention for maps used in the Go standard library. Those maps tend to be append-only (no overwriting existing keys) and mostly-read (lots of reads relative to the number of writes). For reads on existing keys, the algorithm in the 1.9 sync.Map produces no steady-state cache contention†, regardless of the number of concurrent readers per key. Sharding wouldn't help much for that use case, because concurrent readers of the same shard would end up contending on the cache line containing the lock for that shard.

(Also note that sharded data structures can sometimes fail dramatically in the case of unlucky shard assignments: for example, see #17953.)
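For concreteness, here is a minimal sketch of the sharding technique the question asks about (hypothetical code, not taken from any linked project): keys are hashed to one of a fixed number of shards, each guarded by its own RWMutex, so writers to different shards don't block each other. Note that readers of the same shard still touch the cache line holding that shard's lock, which is exactly the contention described above.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

const numShards = 16

// shard pairs a lock with the map it guards. Readers of the same shard
// still share the cache line holding mu: the contention point noted above.
type shard struct {
	mu sync.RWMutex
	m  map[string]interface{}
}

// ShardedMap spreads keys across shards so that writers to different
// shards do not block one another.
type ShardedMap struct {
	shards [numShards]shard
}

func NewShardedMap() *ShardedMap {
	sm := &ShardedMap{}
	for i := range sm.shards {
		sm.shards[i].m = make(map[string]interface{})
	}
	return sm
}

// shardFor hashes the key to choose a shard. An unlucky distribution of
// hot keys across shards leads to the failure mode mentioned above.
func (sm *ShardedMap) shardFor(key string) *shard {
	h := fnv.New32a()
	h.Write([]byte(key))
	return &sm.shards[h.Sum32()%numShards]
}

func (sm *ShardedMap) Load(key string) (interface{}, bool) {
	s := sm.shardFor(key)
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.m[key]
	return v, ok
}

func (sm *ShardedMap) Store(key string, value interface{}) {
	s := sm.shardFor(key)
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[key] = value
}

func main() {
	sm := NewShardedMap()
	sm.Store("answer", 42)
	if v, ok := sm.Load("answer"); ok {
		fmt.Println(v) // 42
	}
}
```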

That said, the 1.9 implementation almost certainly has room for improvement. The "new key" path in particular would likely benefit from some kind of sharding: at the moment, all newly added keys are guarded by a single Mutex, which is fine if you have no new keys in the steady state but not good for maps with a lot of churn. Since I was prototyping in x/sync (outside the standard library), computing a shard for an arbitrary key would have added a lot of complexity, but for sync.Map itself you could probably tie in to the runtime's built-in hash functions.
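As a rough illustration of the structure described above, here is a greatly simplified, hypothetical sketch of the two-map shape: reads of existing keys go through an atomically published read-only map with no locking, while every new key funnels through one Mutex. The real 1.9 sync.Map additionally promotes the dirty map automatically based on miss counts and updates existing entries through atomic pointers, both of which this sketch omits.

```go
package sketch

import (
	"sync"
	"sync/atomic"
)

// readMostlyMap illustrates the two-map shape: an atomically published
// read-only map for lock-free reads of existing keys, and a mutex-guarded
// dirty map that absorbs new keys.
type readMostlyMap struct {
	read  atomic.Value // holds a map[string]interface{}; never mutated in place
	mu    sync.Mutex   // the single Mutex that all new keys funnel through
	dirty map[string]interface{}
}

// Load takes the lock-free fast path for keys already published in the
// read map; only misses fall through to the mutex.
func (m *readMostlyMap) Load(key string) (interface{}, bool) {
	if read, _ := m.read.Load().(map[string]interface{}); read != nil {
		if v, ok := read[key]; ok {
			return v, true
		}
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	v, ok := m.dirty[key]
	return v, ok
}

// Store sends every write through the single mutex -- the "new key" path
// that could benefit from sharding under heavy churn.
func (m *readMostlyMap) Store(key string, value interface{}) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.dirty == nil {
		m.dirty = make(map[string]interface{})
	}
	m.dirty[key] = value
}

// promote publishes a snapshot of the dirty map as the new read map.
// The real implementation triggers this automatically based on miss
// counts; here it is manual for brevity.
func (m *readMostlyMap) promote() {
	m.mu.Lock()
	defer m.mu.Unlock()
	snapshot := make(map[string]interface{}, len(m.dirty))
	for k, v := range m.dirty {
		snapshot[k] = v
	}
	m.read.Store(snapshot)
}
```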

If you have a specific workload in mind, I would encourage you to write some benchmarks and see what you can come up with when the 1.10 window opens.
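A possible starting point for such benchmarks (hypothetical names and package; save as a `_test.go` file and run with `go test -bench .`) might contrast the single-Mutex new-key path against the lock-free read path:

```go
package mapbench // hypothetical package name

import (
	"strconv"
	"sync"
	"sync/atomic"
	"testing"
)

// BenchmarkSyncMapNewKeys stresses the single-Mutex "new key" path:
// every iteration stores a key that has never been seen before.
func BenchmarkSyncMapNewKeys(b *testing.B) {
	var m sync.Map
	var ctr int64
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			k := strconv.FormatInt(atomic.AddInt64(&ctr, 1), 10)
			m.Store(k, k)
		}
	})
}

// BenchmarkSyncMapReadExisting exercises the lock-free fast path:
// concurrent reads of a key that already exists in the map.
func BenchmarkSyncMapReadExisting(b *testing.B) {
	var m sync.Map
	m.Store("hot", 1)
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			m.Load("hot")
		}
	})
}
```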

† Except to the extent that the runtime's memory allocator introduces false-sharing by allocating read-only values on the same cache lines as non-read-only values.

Okay, I guess there's nothing left to do here. Question was answered. Thanks, @bcmills.

(But @orcaman, feel free to file other bugs if you have follow-up info per @bcmills's feedback.)

@bcmills thanks for the elaborate answer.

@bradfitz thanks. If I get around to benchmarking as @bcmills suggested, I will follow up here.