golang/go

runtime: use SwissTable

zhangyunhao116 opened this issue · 82 comments

Abstract

From ByteDance Programming Language Team

We suggest using SwissTable in the runtime to replace the original implementation of the hashmap.

SwissTable is now the default hash table in abseil and Rust. We borrowed ideas from both implementations and combined them with some optimizations from Go's original hashmap to build the current implementation.

See the following comment for the performance comparison (the GitHub issue has a character limit).

Motivations

  • We need to improve the performance of the hashmap as much as possible. At ByteDance, our Go services spend about 4% of their CPU time on hashmap operations. On many services, the CPU consumption of mapassign and mapaccess is almost 1:1, so improving insertion performance is equally important.

  • Support dynamic adjustment of the capacity of the hashmap. In some Go services, burst traffic makes a map grow very large, which puts a lot of pressure on memory, and the map does not shrink after elements are removed. There are also many related discussions in the community.

Implementations

Tasks

  • Basic version: without SIMD, follow the original memory layout.
  • Add SIMD support for x86: WIP, in the following CL.
  • Aggregate all tophash into an array: Planned; requires more performance comparison data.
  • Support hashmap resizing: More discussion is needed.
  • Set bucketCnt to 4 on 32-bit platforms: More discussion is needed.

Advantages compared to the original implementation

  • The overall performance is better, and we can also use SIMD to improve performance.

  • SwissTable's LoadFactor can be set higher to save some memory.

  • With open addressing, dynamic adjustment of the capacity is easier to implement.

  • After allocating a fixed-size hashmap, no additional memory is required as long as the number of inserted elements does not exceed the capacity, and performance improves significantly. This is a particular advantage when a fixed-size hashmap is reused. (The original hashmap may still allocate additional memory even when the limit is not exceeded.)

Disadvantages compared to the original implementation

  • The rehash is done at once, so

    • Some services that are sensitive to latency may experience performance degradation. (We have not observed this so far.)
    • In some cases, when the hashmap needs to grow, it may consume more CPU than the original implementation. (The previous implementation uses incremental rehash)
  • Since SwissTable needs to ensure that each bucket has at least one empty slot, it will cause performance degradation in some cases. For example, if we insert an element when there are seven elements in the hashmap, the CPU consumption will significantly increase because the hashmap needs to grow.
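The seven-element example above can be sketched as a growth check. This is only an illustration: `maxElems` and `needsGrow` are hypothetical helpers, and the rule of reserving one slot per 8-slot bucket across the whole table is an assumption, not necessarily the threshold used in the CL.

```go
package main

import "fmt"

// bucketCnt is the number of slots per bucket (8 in Go's original map).
const bucketCnt = 8

// maxElems is the largest element count that still leaves room for one
// empty slot per bucket. Applying this bound table-wide is an assumption
// made for this sketch.
func maxElems(nbuckets int) int {
	return nbuckets * (bucketCnt - 1)
}

// needsGrow reports whether inserting one more element would exceed maxElems.
func needsGrow(count, nbuckets int) bool {
	return count+1 > maxElems(nbuckets)
}

func main() {
	fmt.Println(needsGrow(6, 1)) // false: the 7th element still fits
	fmt.Println(needsGrow(7, 1)) // true: the 8th element forces a grow
}
```

With a single bucket, the eighth insertion triggers a full rehash even though one slot is still physically free, which is the extra CPU cost described above.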

Implementation details

Here is an overview of the basic version. For more detailed implementation, you can read the code. Before reading the code, it is recommended to check the video or the article in the references to get a general idea of how it works.

Differences with SwissTable

  • Tophash is stored inside the bucket, not as a separate array.

The classic SwissTable aggregates all tophash into an array (also called the control bytes array; here we follow the existing convention and call it tophash), while the original hashmap of Go puts tophash and items in the same bucket.

Our experiments outside the runtime found no performance wins from storing tophash as a separate array. This is probably because we need two memory allocations when initializing the hashmap.

So we use the bucket memory layout almost the same as the original in the current version. In the future, we may consider aggregating all tophash into an array, using a data structure similar to sync.Pool, so that all tophash of the hashmap can be reused to improve performance.

Due to this change, when we delete a key/value, we only need to determine whether the current bucket still has empty slots. This works because a bucket is always accessed at the same fixed location, unlike the classic implementation, where a group of control bytes can be read starting from different addresses.
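One way to read the deletion rule is sketched below. It assumes a tombstone marker in the style of abseil's SwissTable; the `deletedSlot` value (0xfe) and the helper names are hypothetical and do not come from the CL. Only the 0xff empty marker is stated in this proposal.

```go
package main

import "fmt"

const (
	emptySlot   uint8 = 0xff // stated in the proposal: 0xff marks an empty slot
	deletedSlot uint8 = 0xfe // hypothetical tombstone value, assumed for this sketch
)

// hasEmpty reports whether the bucket still has at least one empty slot.
func hasEmpty(tophash *[8]uint8) bool {
	for _, th := range tophash {
		if th == emptySlot {
			return true
		}
	}
	return false
}

// deleteSlot clears slot i and returns the marker that was written.
// If the bucket already has an empty slot, every probe stops at this
// bucket anyway, so the slot can simply become empty again. If the bucket
// was full, probes may have continued past it, so a tombstone is left to
// keep later buckets reachable.
func deleteSlot(tophash *[8]uint8, i int) uint8 {
	if hasEmpty(tophash) {
		tophash[i] = emptySlot
	} else {
		tophash[i] = deletedSlot
	}
	return tophash[i]
}

func main() {
	partial := [8]uint8{1, 2, 3, emptySlot, emptySlot, emptySlot, emptySlot, emptySlot}
	full := [8]uint8{1, 2, 3, 4, 5, 6, 7, 8}
	fmt.Printf("%#x %#x\n", deleteSlot(&partial, 0), deleteSlot(&full, 0))
}
```

Because the check only looks at one bucket's own tophash bytes, it does not need to inspect neighbouring groups.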

Differences with the original implementation

For hashmap header (hmap) and hiter

  • Removed fields about overflow and incremental rehash.

For bucket (bmap)

  • The tophash is still 8 bits (uint8), but 7 bits hold hash information and 1 bit is used as a flag.

  • Removed overflow pointer.

For the way of querying

  • The query process changes from querying the overflow pointer to finding the next bucket (using triangular probing) until the first empty slot is found. SwissTable theoretically guarantees that there is at least one empty slot.
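Triangular probing can be sketched as follows, assuming the number of buckets is a power of two. `probeSeq` is a hypothetical helper written for illustration, not a function from the CL. The stride grows by one on every step (1, 2, 3, ...), so the offsets from the start are the triangular numbers, and every bucket is visited exactly once before the sequence repeats.

```go
package main

import "fmt"

// probeSeq returns the order in which buckets are visited when a lookup
// starts from hash and uses triangular probing over nbuckets buckets.
// nbuckets must be a power of two.
func probeSeq(hash, nbuckets uint64) []uint64 {
	mask := nbuckets - 1
	seq := make([]uint64, 0, nbuckets)
	idx := hash & mask
	for step := uint64(1); step <= nbuckets; step++ {
		seq = append(seq, idx)
		idx = (idx + step) & mask // strides grow 1, 2, 3, ...
	}
	return seq
}

func main() {
	fmt.Println(probeSeq(5, 8)) // [5 6 0 3 7 4 2 1]
}
```

A real lookup would stop walking this sequence as soon as it reaches a bucket that contains an empty slot.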

For initializing bucket arrays

  • The initial state of the tophash is 0xff instead of 0; 0xff represents the empty slot now.

References

Performance comparison

The SwissTable measured here is just a basic version. To keep a single CL from bringing too many changes, it does not use SIMD and stays compatible with the previous hashmap memory layout (which means many optimizations are not included in this version). Its performance still has room for improvement; we will add these optimizations in subsequent CLs.

In general, the performance changes for the basic version of SwissTable are as follows.

  • There is a significant improvement (20% ~ 50%) when querying a large hashmap or querying an element that does not exist in the hashmap. When querying a hashmap with fewer elements, performance degradation is up to 20%.

  • Insertions and deletions have been significantly improved in almost all cases (20% ~ 50%).

  • Iteration performance is improved by about 10%.

  • Memory usage is reduced by 0% ~ 25% in most cases. Reusing a fixed-size hashmap no longer consumes any extra memory.

  • Since the growth is done at once, the CPU usage is significantly reduced in the case of continuous growth, but in some cases, the CPU consumption may also increase significantly.

Platform

  • GO: linux/amd64

  • CPU: AMD 3700x(8C16T)

  • OS: Ubuntu 22.04.1 LTS (GNU/Linux 5.15.0-43-generic x86_64)

  • MEMORY: 16G x 2 (DDR4 3200MHz)

Benchmark-1

Located in https://github.com/zhangyunhao116/gomapbench

This benchmark contains performance comparisons of common operations of common types in different situations.

INFO
name                                   old time/op    new time/op    delta
MapIter/Int/6-16                         60.2ns ± 1%    55.7ns ± 0%    -7.35%  (p=0.000 n=9+7)
MapIter/Int/12-16                         121ns ±10%     102ns ± 3%   -15.51%  (p=0.000 n=10+10)
MapIter/Int/18-16                         173ns ± 2%     165ns ± 1%    -4.58%  (p=0.000 n=10+8)
MapIter/Int/24-16                         214ns ± 4%     194ns ± 3%    -9.46%  (p=0.000 n=10+10)
MapIter/Int/30-16                         287ns ± 3%     272ns ± 2%    -5.22%  (p=0.000 n=10+9)
MapIter/Int/64-16                         609ns ± 0%     572ns ± 3%    -6.10%  (p=0.000 n=7+10)
MapIter/Int/128-16                       1.21µs ± 4%    1.17µs ± 2%    -3.63%  (p=0.004 n=10+10)
MapIter/Int/256-16                       2.46µs ± 2%    2.23µs ± 2%    -9.31%  (p=0.000 n=10+10)
MapIter/Int/512-16                       4.98µs ± 4%    4.38µs ± 2%   -12.22%  (p=0.000 n=10+10)
MapIter/Int/1024-16                      9.94µs ± 3%    8.84µs ± 2%   -11.11%  (p=0.000 n=10+10)
MapIter/Int/2048-16                      20.1µs ± 3%    17.8µs ± 1%   -11.33%  (p=0.000 n=10+10)
MapIter/Int/4096-16                      39.7µs ± 1%    35.5µs ± 2%   -10.49%  (p=0.000 n=10+10)
MapIter/Int/8192-16                      79.3µs ± 2%    71.8µs ± 1%    -9.42%  (p=0.000 n=10+10)
MapIter/Int/65536-16                      631µs ± 2%     571µs ± 1%    -9.52%  (p=0.000 n=10+9)
MapAccessHit/Int64/6-16                  4.24ns ± 1%    3.71ns ± 1%   -12.37%  (p=0.000 n=8+9)
MapAccessHit/Int64/12-16                 6.61ns ± 6%    6.63ns ±11%      ~     (p=0.796 n=9+9)
MapAccessHit/Int64/18-16                 6.74ns ± 4%    7.01ns ± 1%    +4.02%  (p=0.002 n=9+8)
MapAccessHit/Int64/24-16                 7.81ns ± 8%    8.17ns ± 5%    +4.62%  (p=0.034 n=10+8)
MapAccessHit/Int64/30-16                 6.23ns ± 6%    6.99ns ± 1%   +12.34%  (p=0.000 n=10+7)
MapAccessHit/Int64/64-16                 6.68ns ± 6%    7.22ns ± 2%    +8.08%  (p=0.000 n=10+9)
MapAccessHit/Int64/128-16                6.98ns ± 6%    7.32ns ± 6%    +4.93%  (p=0.004 n=10+10)
MapAccessHit/Int64/256-16                7.12ns ± 2%    7.28ns ± 2%    +2.21%  (p=0.002 n=9+10)
MapAccessHit/Int64/512-16                7.68ns ± 4%    7.40ns ± 2%    -3.65%  (p=0.000 n=9+10)
MapAccessHit/Int64/1024-16               8.85ns ± 2%    7.35ns ± 2%   -16.92%  (p=0.000 n=10+9)
MapAccessHit/Int64/2048-16               11.2ns ± 3%     7.7ns ± 1%   -31.23%  (p=0.000 n=10+8)
MapAccessHit/Int64/4096-16               14.1ns ± 1%     7.9ns ± 1%   -43.96%  (p=0.000 n=7+8)
MapAccessHit/Int64/8192-16               16.2ns ± 2%     8.2ns ± 2%   -49.14%  (p=0.000 n=10+10)
MapAccessHit/Int64/65536-16              19.6ns ± 2%    10.5ns ± 1%   -46.18%  (p=0.000 n=10+9)
MapAccessHit/Int32/6-16                  3.78ns ± 1%    3.86ns ± 1%    +2.12%  (p=0.000 n=10+8)
MapAccessHit/Int32/12-16                 6.23ns ± 6%    6.16ns ± 4%      ~     (p=0.483 n=10+9)
MapAccessHit/Int32/18-16                 6.42ns ± 5%    6.99ns ± 3%    +8.91%  (p=0.000 n=10+9)
MapAccessHit/Int32/24-16                 7.46ns ± 5%    8.60ns ± 8%   +15.32%  (p=0.000 n=9+10)
MapAccessHit/Int32/30-16                 6.07ns ± 5%    6.97ns ± 1%   +14.88%  (p=0.000 n=10+8)
MapAccessHit/Int32/64-16                 6.50ns ± 4%    7.18ns ± 5%   +10.34%  (p=0.000 n=9+10)
MapAccessHit/Int32/128-16                6.68ns ± 6%    7.15ns ± 4%    +7.12%  (p=0.000 n=10+10)
MapAccessHit/Int32/256-16                6.81ns ± 4%    7.32ns ± 4%    +7.45%  (p=0.000 n=10+10)
MapAccessHit/Int32/512-16                7.40ns ± 3%    7.30ns ± 3%      ~     (p=0.156 n=9+10)
MapAccessHit/Int32/1024-16               8.46ns ± 2%    7.38ns ± 1%   -12.79%  (p=0.000 n=10+9)
MapAccessHit/Int32/2048-16               11.0ns ± 3%     7.6ns ± 2%   -31.46%  (p=0.000 n=10+10)
MapAccessHit/Int32/4096-16               13.9ns ± 2%     7.8ns ± 2%   -44.12%  (p=0.000 n=10+10)
MapAccessHit/Int32/8192-16               15.8ns ± 2%     8.1ns ± 1%   -48.99%  (p=0.000 n=10+10)
MapAccessHit/Int32/65536-16              19.0ns ± 2%    10.3ns ± 2%   -45.85%  (p=0.000 n=10+10)
MapAccessHit/Str/6-16                    13.5ns ± 1%    12.7ns ± 2%    -5.88%  (p=0.000 n=9+10)
MapAccessHit/Str/12-16                   9.77ns ±11%    9.40ns ± 2%      ~     (p=0.460 n=10+8)
MapAccessHit/Str/18-16                   9.05ns ± 7%    9.47ns ± 1%    +4.66%  (p=0.001 n=9+9)
MapAccessHit/Str/24-16                   9.92ns ± 7%   10.07ns ± 9%      ~     (p=0.604 n=9+10)
MapAccessHit/Str/30-16                   8.46ns ± 7%    9.43ns ± 1%   +11.50%  (p=0.000 n=10+8)
MapAccessHit/Str/64-16                   8.91ns ± 5%    9.57ns ± 2%    +7.39%  (p=0.000 n=10+8)
MapAccessHit/Str/128-16                  9.82ns ± 4%   10.58ns ± 5%    +7.75%  (p=0.000 n=10+10)
MapAccessHit/Str/256-16                  11.8ns ± 4%    11.4ns ± 1%    -3.01%  (p=0.008 n=10+8)
MapAccessHit/Str/512-16                  14.6ns ± 2%    11.9ns ± 2%   -18.89%  (p=0.000 n=8+9)
MapAccessHit/Str/1024-16                 17.9ns ± 2%    12.7ns ± 1%   -29.02%  (p=0.000 n=10+9)
MapAccessHit/Str/2048-16                 21.5ns ± 3%    13.0ns ± 2%   -39.43%  (p=0.000 n=10+9)
MapAccessHit/Str/4096-16                 25.1ns ± 1%    13.2ns ± 2%   -47.63%  (p=0.000 n=10+10)
MapAccessHit/Str/8192-16                 26.6ns ± 1%    14.2ns ± 1%   -46.62%  (p=0.000 n=9+9)
MapAccessHit/Str/65536-16                31.8ns ± 1%    17.3ns ± 2%   -45.64%  (p=0.000 n=10+10)
MapAccessMiss/Int64/6-16                 8.03ns ± 1%    7.35ns ± 1%    -8.39%  (p=0.000 n=10+10)
MapAccessMiss/Int64/12-16                10.2ns ± 2%    18.3ns ± 3%   +79.48%  (p=0.000 n=8+8)
MapAccessMiss/Int64/18-16                10.3ns ± 1%     7.2ns ± 1%   -29.85%  (p=0.000 n=9+8)
MapAccessMiss/Int64/24-16                13.9ns ± 2%    10.8ns ± 2%   -22.43%  (p=0.000 n=7+8)
MapAccessMiss/Int64/30-16                10.1ns ± 3%     7.1ns ± 3%   -29.69%  (p=0.000 n=8+8)
MapAccessMiss/Int64/64-16                10.3ns ± 1%     7.9ns ±12%   -23.29%  (p=0.000 n=7+10)
MapAccessMiss/Int64/128-16               10.2ns ± 2%     7.6ns ± 8%   -25.64%  (p=0.000 n=10+9)
MapAccessMiss/Int64/256-16               10.4ns ± 2%     7.9ns ± 7%   -23.40%  (p=0.000 n=9+9)
MapAccessMiss/Int64/512-16               10.3ns ± 3%     7.9ns ± 8%   -23.37%  (p=0.000 n=9+10)
MapAccessMiss/Int64/1024-16              10.4ns ± 2%     8.0ns ± 4%   -22.40%  (p=0.000 n=9+10)
MapAccessMiss/Int64/2048-16              10.4ns ± 2%     7.9ns ± 3%   -23.97%  (p=0.000 n=10+10)
MapAccessMiss/Int64/4096-16              10.4ns ± 2%     8.2ns ± 3%   -21.67%  (p=0.000 n=10+10)
MapAccessMiss/Int64/8192-16              10.3ns ± 2%     8.6ns ± 3%   -16.95%  (p=0.000 n=10+10)
MapAccessMiss/Int64/65536-16             13.5ns ± 2%    10.9ns ± 3%   -19.23%  (p=0.000 n=9+8)
MapAccessMiss/Int32/6-16                 6.17ns ± 2%    7.45ns ± 2%   +20.86%  (p=0.000 n=10+10)
MapAccessMiss/Int32/12-16                8.50ns ± 1%   15.94ns ± 3%   +87.44%  (p=0.000 n=8+8)
MapAccessMiss/Int32/18-16                8.49ns ± 2%    8.21ns ±30%      ~     (p=0.156 n=9+10)
MapAccessMiss/Int32/24-16                11.9ns ± 2%    10.8ns ± 2%    -9.03%  (p=0.000 n=8+7)
MapAccessMiss/Int32/30-16                8.48ns ± 1%    7.53ns ±15%      ~     (p=0.138 n=10+10)
MapAccessMiss/Int32/64-16                8.49ns ± 1%    8.11ns ±17%    -4.50%  (p=0.034 n=8+10)
MapAccessMiss/Int32/128-16               8.89ns ± 2%    7.86ns ±10%   -11.57%  (p=0.000 n=7+10)
MapAccessMiss/Int32/256-16               8.72ns ± 3%    7.83ns ± 3%   -10.23%  (p=0.000 n=10+9)
MapAccessMiss/Int32/512-16               8.79ns ± 4%    7.70ns ± 4%   -12.40%  (p=0.000 n=10+9)
MapAccessMiss/Int32/1024-16              8.81ns ± 1%    7.74ns ± 5%   -12.10%  (p=0.000 n=9+10)
MapAccessMiss/Int32/2048-16              9.16ns ± 1%    7.91ns ± 2%   -13.64%  (p=0.000 n=7+9)
MapAccessMiss/Int32/4096-16              9.40ns ± 2%    8.15ns ± 3%   -13.29%  (p=0.000 n=9+9)
MapAccessMiss/Int32/8192-16              9.47ns ± 2%    8.29ns ± 2%   -12.45%  (p=0.000 n=9+9)
MapAccessMiss/Int32/65536-16             12.9ns ± 2%    10.6ns ± 2%   -18.32%  (p=0.000 n=8+10)
MapAccessMiss/Str/6-16                   5.69ns ± 1%    6.89ns ± 1%   +21.05%  (p=0.000 n=10+9)
MapAccessMiss/Str/12-16                  11.7ns ± 5%    10.6ns ±11%    -9.74%  (p=0.001 n=10+10)
MapAccessMiss/Str/18-16                  10.2ns ± 1%     9.6ns ± 1%    -5.81%  (p=0.000 n=8+8)
MapAccessMiss/Str/24-16                  12.5ns ±11%    11.4ns ± 9%    -8.83%  (p=0.011 n=10+10)
MapAccessMiss/Str/30-16                  10.8ns ± 4%     8.9ns ± 1%   -17.66%  (p=0.000 n=10+8)
MapAccessMiss/Str/64-16                  10.8ns ± 4%     9.3ns ± 8%   -13.63%  (p=0.000 n=10+10)
MapAccessMiss/Str/128-16                 12.0ns ± 7%     9.9ns ±10%   -17.57%  (p=0.000 n=10+10)
MapAccessMiss/Str/256-16                 11.2ns ± 3%    10.0ns ± 5%   -11.09%  (p=0.000 n=10+10)
MapAccessMiss/Str/512-16                 11.1ns ± 3%     9.6ns ± 6%   -13.51%  (p=0.000 n=10+10)
MapAccessMiss/Str/1024-16                12.3ns ± 2%     9.7ns ± 3%   -20.85%  (p=0.000 n=10+10)
MapAccessMiss/Str/2048-16                15.7ns ± 2%    10.1ns ± 4%   -35.32%  (p=0.000 n=10+10)
MapAccessMiss/Str/4096-16                14.0ns ± 3%    10.2ns ± 3%   -27.20%  (p=0.000 n=10+10)
MapAccessMiss/Str/8192-16                14.1ns ± 2%    10.9ns ± 4%   -22.70%  (p=0.000 n=10+10)
MapAccessMiss/Str/65536-16               19.2ns ± 2%    13.6ns ± 4%   -29.47%  (p=0.000 n=10+10)
MapAssignGrow/Int64/6-16                 69.1ns ± 1%    69.5ns ± 1%      ~     (p=0.133 n=9+10)
MapAssignGrow/Int64/12-16                 759ns ± 0%     539ns ± 0%   -28.91%  (p=0.000 n=7+10)
MapAssignGrow/Int64/18-16                1.67µs ± 0%    1.19µs ± 1%   -28.56%  (p=0.000 n=10+10)
MapAssignGrow/Int64/24-16                2.02µs ± 0%    1.36µs ± 0%   -32.89%  (p=0.000 n=10+10)
MapAssignGrow/Int64/30-16                3.53µs ± 0%    2.53µs ± 0%   -28.19%  (p=0.000 n=8+10)
MapAssignGrow/Int64/64-16                7.84µs ± 1%    5.38µs ± 1%   -31.36%  (p=0.000 n=10+10)
MapAssignGrow/Int64/128-16               15.5µs ± 0%    10.9µs ± 0%   -29.52%  (p=0.000 n=9+10)
MapAssignGrow/Int64/256-16               30.2µs ± 0%    21.9µs ± 0%   -27.59%  (p=0.000 n=10+10)
MapAssignGrow/Int64/512-16               58.8µs ± 0%    43.7µs ± 0%   -25.58%  (p=0.000 n=10+9)
MapAssignGrow/Int64/1024-16               116µs ± 0%      88µs ± 0%   -24.24%  (p=0.000 n=10+9)
MapAssignGrow/Int64/2048-16               231µs ± 0%     175µs ± 0%   -24.04%  (p=0.000 n=10+10)
MapAssignGrow/Int64/4096-16               458µs ± 0%     351µs ± 0%   -23.49%  (p=0.000 n=10+10)
MapAssignGrow/Int64/8192-16               919µs ± 0%     712µs ± 0%   -22.51%  (p=0.000 n=10+10)
MapAssignGrow/Int64/65536-16             8.69ms ± 1%    6.29ms ± 0%   -27.66%  (p=0.000 n=10+7)
MapAssignGrow/Int32/6-16                 71.7ns ± 1%    67.6ns ± 1%    -5.72%  (p=0.000 n=9+10)
MapAssignGrow/Int32/12-16                 734ns ± 0%     521ns ± 0%   -29.00%  (p=0.000 n=9+10)
MapAssignGrow/Int32/18-16                1.61µs ± 1%    1.15µs ± 0%   -28.59%  (p=0.000 n=10+7)
MapAssignGrow/Int32/24-16                1.97µs ± 0%    1.32µs ± 0%   -33.10%  (p=0.000 n=10+9)
MapAssignGrow/Int32/30-16                3.41µs ± 0%    2.43µs ± 1%   -28.67%  (p=0.000 n=10+10)
MapAssignGrow/Int32/64-16                7.45µs ± 0%    5.13µs ± 0%   -31.09%  (p=0.000 n=10+10)
MapAssignGrow/Int32/128-16               15.0µs ± 0%    10.7µs ± 0%   -28.89%  (p=0.000 n=10+9)
MapAssignGrow/Int32/256-16               29.8µs ± 0%    22.0µs ± 1%   -26.05%  (p=0.000 n=9+10)
MapAssignGrow/Int32/512-16               58.1µs ± 0%    42.9µs ± 0%   -26.07%  (p=0.000 n=10+8)
MapAssignGrow/Int32/1024-16               115µs ± 0%      86µs ± 0%   -24.96%  (p=0.000 n=10+10)
MapAssignGrow/Int32/2048-16               226µs ± 1%     171µs ± 1%   -24.44%  (p=0.000 n=10+10)
MapAssignGrow/Int32/4096-16               445µs ± 1%     341µs ± 0%   -23.31%  (p=0.000 n=10+10)
MapAssignGrow/Int32/8192-16               900µs ± 0%     686µs ± 0%   -23.72%  (p=0.000 n=10+10)
MapAssignGrow/Int32/65536-16             7.96ms ± 0%    6.11ms ± 1%   -23.18%  (p=0.000 n=7+10)
MapAssignGrow/Str/6-16                   86.5ns ± 2%    91.6ns ± 2%    +5.88%  (p=0.000 n=10+10)
MapAssignGrow/Str/12-16                   908ns ± 0%     719ns ± 0%   -20.87%  (p=0.000 n=10+10)
MapAssignGrow/Str/18-16                  2.01µs ± 0%    1.59µs ± 0%   -20.98%  (p=0.000 n=7+8)
MapAssignGrow/Str/24-16                  2.36µs ± 0%    1.79µs ± 1%   -23.93%  (p=0.000 n=10+10)
MapAssignGrow/Str/30-16                  4.17µs ± 0%    3.22µs ± 0%   -22.72%  (p=0.000 n=10+10)
MapAssignGrow/Str/64-16                  9.21µs ± 0%    6.73µs ± 0%   -26.96%  (p=0.000 n=7+9)
MapAssignGrow/Str/128-16                 18.6µs ± 0%    13.4µs ± 1%   -27.60%  (p=0.000 n=8+10)
MapAssignGrow/Str/256-16                 35.5µs ± 0%    26.8µs ± 0%   -24.53%  (p=0.000 n=10+10)
MapAssignGrow/Str/512-16                 70.6µs ± 0%    53.3µs ± 0%   -24.58%  (p=0.000 n=10+10)
MapAssignGrow/Str/1024-16                 142µs ± 1%     109µs ± 0%   -23.46%  (p=0.000 n=9+10)
MapAssignGrow/Str/2048-16                 289µs ± 1%     221µs ± 0%   -23.54%  (p=0.000 n=10+9)
MapAssignGrow/Str/4096-16                 588µs ± 0%     448µs ± 0%   -23.83%  (p=0.000 n=9+9)
MapAssignGrow/Str/8192-16                1.27ms ± 1%    0.91ms ± 0%   -28.82%  (p=0.000 n=10+10)
MapAssignGrow/Str/65536-16               12.4ms ± 1%     9.9ms ± 1%   -20.47%  (p=0.000 n=10+9)
MapAssignPreAllocate/Pointer/6-16         310ns ± 1%     281ns ± 0%    -9.41%  (p=0.000 n=10+7)
MapAssignPreAllocate/Pointer/12-16        864ns ± 0%     668ns ± 1%   -22.73%  (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/18-16       1.31µs ± 0%    0.97µs ± 0%   -25.94%  (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/24-16       1.77µs ± 0%    1.26µs ± 0%   -29.07%  (p=0.000 n=10+9)
MapAssignPreAllocate/Pointer/30-16       2.16µs ± 0%    1.60µs ± 1%   -26.13%  (p=0.000 n=10+9)
MapAssignPreAllocate/Pointer/64-16       4.73µs ± 1%    3.23µs ± 0%   -31.72%  (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/128-16      9.18µs ± 0%    6.36µs ± 1%   -30.65%  (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/256-16      18.0µs ± 0%    12.4µs ± 0%   -30.98%  (p=0.000 n=8+10)
MapAssignPreAllocate/Pointer/512-16      36.2µs ± 1%    24.6µs ± 0%   -32.11%  (p=0.000 n=10+8)
MapAssignPreAllocate/Pointer/1024-16     72.4µs ± 1%    50.3µs ± 0%   -30.49%  (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/2048-16      145µs ± 1%     102µs ± 0%   -29.93%  (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/4096-16      291µs ± 0%     209µs ± 1%   -28.18%  (p=0.000 n=10+9)
MapAssignPreAllocate/Pointer/8192-16      590µs ± 0%     444µs ± 1%   -24.76%  (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/65536-16    6.21ms ± 1%    5.68ms ± 1%    -8.48%  (p=0.000 n=9+9)
MapAssignPreAllocate/Int64/6-16          74.7ns ± 2%    71.2ns ± 2%    -4.68%  (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/12-16          540ns ± 0%     360ns ± 0%   -33.39%  (p=0.000 n=9+9)
MapAssignPreAllocate/Int64/18-16          811ns ± 0%     525ns ± 0%   -35.20%  (p=0.000 n=9+9)
MapAssignPreAllocate/Int64/24-16         1.16µs ± 0%    0.68µs ± 1%   -41.94%  (p=0.000 n=9+10)
MapAssignPreAllocate/Int64/30-16         1.33µs ± 1%    0.87µs ± 1%   -34.60%  (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/64-16         2.97µs ± 1%    1.75µs ± 0%   -41.01%  (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/128-16        5.73µs ± 0%    3.39µs ± 0%   -40.79%  (p=0.000 n=9+8)
MapAssignPreAllocate/Int64/256-16        11.0µs ± 0%     6.5µs ± 0%   -40.50%  (p=0.000 n=10+8)
MapAssignPreAllocate/Int64/512-16        21.8µs ± 0%    12.9µs ± 0%   -40.91%  (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/1024-16       43.6µs ± 0%    26.4µs ± 0%   -39.42%  (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/2048-16       88.0µs ± 1%    52.5µs ± 0%   -40.40%  (p=0.000 n=10+8)
MapAssignPreAllocate/Int64/4096-16        175µs ± 0%     108µs ± 0%   -38.60%  (p=0.000 n=9+10)
MapAssignPreAllocate/Int64/8192-16        362µs ± 0%     224µs ± 1%   -38.10%  (p=0.000 n=8+10)
MapAssignPreAllocate/Int64/65536-16      3.78ms ± 0%    3.21ms ± 2%   -15.14%  (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/6-16          72.8ns ± 1%    69.2ns ± 1%    -4.98%  (p=0.000 n=9+10)
MapAssignPreAllocate/Int32/12-16          503ns ± 0%     342ns ± 0%   -32.02%  (p=0.000 n=10+9)
MapAssignPreAllocate/Int32/18-16          770ns ± 1%     498ns ± 1%   -35.29%  (p=0.000 n=10+8)
MapAssignPreAllocate/Int32/24-16         1.11µs ± 1%    0.65µs ± 0%   -41.79%  (p=0.000 n=10+8)
MapAssignPreAllocate/Int32/30-16         1.26µs ± 0%    0.81µs ± 1%   -35.71%  (p=0.000 n=10+8)
MapAssignPreAllocate/Int32/64-16         2.79µs ± 1%    1.63µs ± 0%   -41.60%  (p=0.000 n=10+9)
MapAssignPreAllocate/Int32/128-16        5.60µs ± 1%    3.39µs ± 1%   -39.52%  (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/256-16        11.1µs ± 1%     6.7µs ± 1%   -39.66%  (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/512-16        21.6µs ± 0%    12.2µs ± 0%   -43.30%  (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/1024-16       42.4µs ± 1%    24.2µs ± 0%   -42.91%  (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/2048-16       84.9µs ± 0%    50.0µs ± 0%   -41.08%  (p=0.000 n=10+8)
MapAssignPreAllocate/Int32/4096-16        173µs ± 0%     104µs ± 0%   -40.04%  (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/8192-16        347µs ± 1%     212µs ± 1%   -39.07%  (p=0.000 n=10+8)
MapAssignPreAllocate/Int32/65536-16      3.69ms ± 1%    3.07ms ± 2%   -16.96%  (p=0.000 n=10+10)
MapAssignPreAllocate/Str/6-16            89.6ns ± 2%    94.6ns ± 1%    +5.49%  (p=0.000 n=9+7)
MapAssignPreAllocate/Str/12-16            641ns ± 1%     504ns ± 0%   -21.37%  (p=0.000 n=9+9)
MapAssignPreAllocate/Str/18-16           1.01µs ± 0%    0.76µs ± 0%   -24.98%  (p=0.000 n=7+8)
MapAssignPreAllocate/Str/24-16           1.36µs ± 0%    0.96µs ± 0%   -29.74%  (p=0.000 n=9+9)
MapAssignPreAllocate/Str/30-16           1.68µs ± 0%    1.22µs ± 0%   -27.15%  (p=0.000 n=10+9)
MapAssignPreAllocate/Str/64-16           3.82µs ± 0%    2.43µs ± 0%   -36.29%  (p=0.000 n=10+10)
MapAssignPreAllocate/Str/128-16          7.60µs ± 0%    4.66µs ± 0%   -38.67%  (p=0.000 n=9+10)
MapAssignPreAllocate/Str/256-16          13.8µs ± 0%     9.2µs ± 0%   -33.55%  (p=0.000 n=10+10)
MapAssignPreAllocate/Str/512-16          27.5µs ± 0%    18.2µs ± 0%   -33.89%  (p=0.000 n=10+9)
MapAssignPreAllocate/Str/1024-16         55.4µs ± 0%    37.8µs ± 0%   -31.90%  (p=0.000 n=10+10)
MapAssignPreAllocate/Str/2048-16          111µs ± 0%      77µs ± 0%   -30.20%  (p=0.000 n=9+9)
MapAssignPreAllocate/Str/4096-16          226µs ± 0%     159µs ± 0%   -29.57%  (p=0.000 n=10+9)
MapAssignPreAllocate/Str/8192-16          481µs ± 0%     368µs ± 1%   -23.34%  (p=0.000 n=9+10)
MapAssignPreAllocate/Str/65536-16        4.83ms ± 1%    4.11ms ± 1%   -14.86%  (p=0.000 n=10+10)
MapAssignReuse/Pointer/6-16               330ns ± 2%     290ns ± 1%   -12.18%  (p=0.000 n=10+10)
MapAssignReuse/Pointer/12-16              757ns ± 1%     550ns ± 1%   -27.35%  (p=0.000 n=9+9)
MapAssignReuse/Pointer/18-16             1.16µs ± 1%    0.81µs ± 1%   -30.13%  (p=0.000 n=9+10)
MapAssignReuse/Pointer/24-16             1.61µs ± 1%    1.08µs ± 1%   -32.59%  (p=0.000 n=8+10)
MapAssignReuse/Pointer/30-16             1.91µs ± 2%    1.31µs ± 1%   -31.72%  (p=0.000 n=10+9)
MapAssignReuse/Pointer/64-16             4.12µs ± 1%    2.75µs ± 1%   -33.26%  (p=0.000 n=8+9)
MapAssignReuse/Pointer/128-16            8.22µs ± 2%    5.50µs ± 1%   -33.03%  (p=0.000 n=10+10)
MapAssignReuse/Pointer/256-16            16.4µs ± 1%    11.0µs ± 1%   -33.01%  (p=0.000 n=9+10)
MapAssignReuse/Pointer/512-16            33.1µs ± 2%    21.9µs ± 1%   -33.72%  (p=0.000 n=10+10)
MapAssignReuse/Pointer/1024-16           66.7µs ± 2%    44.8µs ± 1%   -32.78%  (p=0.000 n=10+9)
MapAssignReuse/Pointer/2048-16            132µs ± 2%      92µs ± 1%   -30.11%  (p=0.000 n=10+10)
MapAssignReuse/Pointer/4096-16            268µs ± 1%     187µs ± 1%   -30.25%  (p=0.000 n=9+10)
MapAssignReuse/Pointer/8192-16            546µs ± 4%     389µs ± 1%   -28.67%  (p=0.000 n=10+10)
MapAssignReuse/Pointer/65536-16          5.14ms ± 3%    4.18ms ± 1%   -18.60%  (p=0.000 n=10+10)
MapAssignReuse/Int64/6-16                81.1ns ± 1%    74.8ns ± 2%    -7.75%  (p=0.000 n=10+10)
MapAssignReuse/Int64/12-16                413ns ± 3%     152ns ± 1%   -63.04%  (p=0.000 n=9+10)
MapAssignReuse/Int64/18-16                429ns ± 3%     223ns ± 1%   -48.00%  (p=0.000 n=8+10)
MapAssignReuse/Int64/24-16               1.01µs ± 4%    0.30µs ± 1%   -69.93%  (p=0.000 n=10+9)
MapAssignReuse/Int64/30-16                618ns ± 3%     358ns ± 2%   -42.00%  (p=0.000 n=9+10)
MapAssignReuse/Int64/64-16               1.31µs ± 1%    0.75µs ± 2%   -42.74%  (p=0.000 n=8+8)
MapAssignReuse/Int64/128-16              2.64µs ± 3%    1.51µs ± 2%   -42.96%  (p=0.000 n=9+10)
MapAssignReuse/Int64/256-16              5.23µs ± 4%    3.00µs ± 2%   -42.59%  (p=0.000 n=10+10)
MapAssignReuse/Int64/512-16              10.3µs ± 3%     6.0µs ± 1%   -41.70%  (p=0.000 n=10+8)
MapAssignReuse/Int64/1024-16             21.0µs ± 2%    12.1µs ± 2%   -42.09%  (p=0.000 n=10+10)
MapAssignReuse/Int64/2048-16             42.2µs ± 2%    25.5µs ± 2%   -39.56%  (p=0.000 n=10+10)
MapAssignReuse/Int64/4096-16             85.3µs ± 3%    51.3µs ± 2%   -39.88%  (p=0.000 n=10+10)
MapAssignReuse/Int64/8192-16              176µs ± 3%     107µs ± 2%   -39.54%  (p=0.000 n=10+10)
MapAssignReuse/Int64/65536-16            1.73ms ± 3%    1.16ms ± 2%   -33.33%  (p=0.000 n=10+10)
MapAssignReuse/Int32/6-16                79.6ns ± 1%    72.4ns ± 2%    -9.14%  (p=0.000 n=10+10)
MapAssignReuse/Int32/12-16                356ns ± 7%     150ns ± 1%   -57.73%  (p=0.000 n=9+10)
MapAssignReuse/Int32/18-16                394ns ± 5%     217ns ± 2%   -44.82%  (p=0.000 n=9+9)
MapAssignReuse/Int32/24-16               1.01µs ± 7%    0.29µs ± 2%   -70.98%  (p=0.000 n=10+10)
MapAssignReuse/Int32/30-16                618ns ± 5%     348ns ± 2%   -43.68%  (p=0.000 n=10+10)
MapAssignReuse/Int32/64-16               1.29µs ± 4%    0.73µs ± 2%   -43.23%  (p=0.000 n=10+10)
MapAssignReuse/Int32/128-16              2.61µs ± 3%    1.46µs ± 2%   -43.88%  (p=0.000 n=10+10)
MapAssignReuse/Int32/256-16              5.21µs ± 3%    2.92µs ± 2%   -43.90%  (p=0.000 n=10+10)
MapAssignReuse/Int32/512-16              10.2µs ± 3%     5.9µs ± 1%   -42.54%  (p=0.000 n=10+9)
MapAssignReuse/Int32/1024-16             20.5µs ± 3%    11.8µs ± 1%   -42.72%  (p=0.000 n=10+9)
MapAssignReuse/Int32/2048-16             41.7µs ± 3%    24.8µs ± 2%   -40.50%  (p=0.000 n=9+10)
MapAssignReuse/Int32/4096-16             84.5µs ± 2%    50.8µs ± 1%   -39.93%  (p=0.000 n=10+9)
MapAssignReuse/Int32/8192-16              170µs ± 1%     103µs ± 2%   -39.52%  (p=0.000 n=10+8)
MapAssignReuse/Int32/65536-16            1.68ms ± 3%    1.12ms ± 2%   -33.25%  (p=0.000 n=10+10)
MapAssignReuse/Str/6-16                  96.8ns ± 3%   102.0ns ± 1%    +5.33%  (p=0.000 n=10+10)
MapAssignReuse/Str/12-16                  491ns ± 3%     202ns ± 1%   -58.99%  (p=0.000 n=8+9)
MapAssignReuse/Str/18-16                  500ns ± 4%     292ns ± 2%   -41.68%  (p=0.000 n=10+10)
MapAssignReuse/Str/24-16                 1.13µs ± 3%    0.40µs ± 1%   -64.33%  (p=0.000 n=9+10)
MapAssignReuse/Str/30-16                  783ns ± 5%     480ns ± 2%   -38.77%  (p=0.000 n=10+10)
MapAssignReuse/Str/64-16                 1.53µs ± 2%    1.02µs ± 2%   -33.00%  (p=0.000 n=10+10)
MapAssignReuse/Str/128-16                3.06µs ± 1%    2.04µs ± 2%   -33.22%  (p=0.000 n=10+10)
MapAssignReuse/Str/256-16                6.15µs ± 3%    4.08µs ± 2%   -33.59%  (p=0.000 n=9+10)
MapAssignReuse/Str/512-16                12.3µs ± 3%     8.2µs ± 1%   -33.49%  (p=0.000 n=10+10)
MapAssignReuse/Str/1024-16               25.3µs ± 2%    17.1µs ± 2%   -32.55%  (p=0.000 n=10+10)
MapAssignReuse/Str/2048-16               50.7µs ± 3%    34.6µs ± 2%   -31.79%  (p=0.000 n=10+10)
MapAssignReuse/Str/4096-16                103µs ± 2%      70µs ± 2%   -31.98%  (p=0.000 n=10+10)
MapAssignReuse/Str/8192-16                218µs ± 1%     157µs ± 2%   -28.32%  (p=0.000 n=9+10)
MapAssignReuse/Str/65536-16              2.03ms ± 1%    1.64ms ± 2%   -19.62%  (p=0.000 n=10+9)

name                                   old alloc/op   new alloc/op   delta
MapIter/Int/6-16                         0.00B          0.00B           ~     (all equal)
MapIter/Int/12-16                        0.00B          0.00B           ~     (all equal)
MapIter/Int/18-16                        0.00B          0.00B           ~     (all equal)
MapIter/Int/24-16                        0.00B          0.00B           ~     (all equal)
MapIter/Int/30-16                        0.00B          0.00B           ~     (all equal)
MapIter/Int/64-16                        0.00B          0.00B           ~     (all equal)
MapIter/Int/128-16                       0.00B          0.00B           ~     (all equal)
MapIter/Int/256-16                       0.00B          0.00B           ~     (all equal)
MapIter/Int/512-16                       0.00B          0.00B           ~     (all equal)
MapIter/Int/1024-16                      0.00B          0.00B           ~     (all equal)
MapIter/Int/2048-16                      0.00B          0.00B           ~     (all equal)
MapIter/Int/4096-16                      0.00B          0.00B           ~     (all equal)
MapIter/Int/8192-16                      0.00B          0.00B           ~     (all equal)
MapIter/Int/65536-16                     0.00B          0.00B           ~     (all equal)
MapAccessHit/Int64/6-16                  0.00B          0.00B           ~     (all equal)
MapAccessHit/Int64/12-16                 0.00B          0.00B           ~     (all equal)
MapAccessHit/Int64/18-16                 0.00B          0.00B           ~     (all equal)
MapAccessHit/Int64/24-16                 0.00B          0.00B           ~     (all equal)
MapAccessHit/Int64/30-16                 0.00B          0.00B           ~     (all equal)
MapAccessHit/Int64/64-16                 0.00B          0.00B           ~     (all equal)
MapAccessHit/Int64/128-16                0.00B          0.00B           ~     (all equal)
MapAccessHit/Int64/256-16                0.00B          0.00B           ~     (all equal)
MapAccessHit/Int64/512-16                0.00B          0.00B           ~     (all equal)
MapAccessHit/Int64/1024-16               0.00B          0.00B           ~     (all equal)
MapAccessHit/Int64/2048-16               0.00B          0.00B           ~     (all equal)
MapAccessHit/Int64/4096-16               0.00B          0.00B           ~     (all equal)
MapAccessHit/Int64/8192-16               0.00B          0.00B           ~     (all equal)
MapAccessHit/Int64/65536-16              0.00B          0.00B           ~     (all equal)
MapAccessHit/Int32/6-16                  0.00B          0.00B           ~     (all equal)
MapAccessHit/Int32/12-16                 0.00B          0.00B           ~     (all equal)
MapAccessHit/Int32/18-16                 0.00B          0.00B           ~     (all equal)
MapAccessHit/Int32/24-16                 0.00B          0.00B           ~     (all equal)
MapAccessHit/Int32/30-16                 0.00B          0.00B           ~     (all equal)
MapAccessHit/Int32/64-16                 0.00B          0.00B           ~     (all equal)
MapAccessHit/Int32/128-16                0.00B          0.00B           ~     (all equal)
MapAccessHit/Int32/256-16                0.00B          0.00B           ~     (all equal)
MapAccessHit/Int32/512-16                0.00B          0.00B           ~     (all equal)
MapAccessHit/Int32/1024-16               0.00B          0.00B           ~     (all equal)
MapAccessHit/Int32/2048-16               0.00B          0.00B           ~     (all equal)
MapAccessHit/Int32/4096-16               0.00B          0.00B           ~     (all equal)
MapAccessHit/Int32/8192-16               0.00B          0.00B           ~     (all equal)
MapAccessHit/Int32/65536-16              0.00B          0.00B           ~     (all equal)
MapAccessHit/Str/6-16                    0.00B          0.00B           ~     (all equal)
MapAccessHit/Str/12-16                   0.00B          0.00B           ~     (all equal)
MapAccessHit/Str/18-16                   0.00B          0.00B           ~     (all equal)
MapAccessHit/Str/24-16                   0.00B          0.00B           ~     (all equal)
MapAccessHit/Str/30-16                   0.00B          0.00B           ~     (all equal)
MapAccessHit/Str/64-16                   0.00B          0.00B           ~     (all equal)
MapAccessHit/Str/128-16                  0.00B          0.00B           ~     (all equal)
MapAccessHit/Str/256-16                  0.00B          0.00B           ~     (all equal)
MapAccessHit/Str/512-16                  0.00B          0.00B           ~     (all equal)
MapAccessHit/Str/1024-16                 0.00B          0.00B           ~     (all equal)
MapAccessHit/Str/2048-16                 0.00B          0.00B           ~     (all equal)
MapAccessHit/Str/4096-16                 0.00B          0.00B           ~     (all equal)
MapAccessHit/Str/8192-16                 0.00B          0.00B           ~     (all equal)
MapAccessHit/Str/65536-16                0.00B          0.00B           ~     (all equal)
MapAccessMiss/Int64/6-16                 0.00B          0.00B           ~     (all equal)
MapAccessMiss/Int64/12-16                0.00B          0.00B           ~     (all equal)
MapAccessMiss/Int64/18-16                0.00B          0.00B           ~     (all equal)
MapAccessMiss/Int64/24-16                0.00B          0.00B           ~     (all equal)
MapAccessMiss/Int64/30-16                0.00B          0.00B           ~     (all equal)
MapAccessMiss/Int64/64-16                0.00B          0.00B           ~     (all equal)
MapAccessMiss/Int64/128-16               0.00B          0.00B           ~     (all equal)
MapAccessMiss/Int64/256-16               0.00B          0.00B           ~     (all equal)
MapAccessMiss/Int64/512-16               0.00B          0.00B           ~     (all equal)
MapAccessMiss/Int64/1024-16              0.00B          0.00B           ~     (all equal)
MapAccessMiss/Int64/2048-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int64/4096-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int64/8192-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int64/65536-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int32/6-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int32/12-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int32/18-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int32/24-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int32/30-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int32/64-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int32/128-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int32/256-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int32/512-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int32/1024-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int32/2048-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int32/4096-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int32/8192-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Int32/65536-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Str/6-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Str/12-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Str/18-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Str/24-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Str/30-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Str/64-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Str/128-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Str/256-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Str/512-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Str/1024-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Str/2048-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Str/4096-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Str/8192-16 0.00B 0.00B ~ (all equal)
MapAccessMiss/Str/65536-16 0.00B 0.00B ~ (all equal)
MapAssignGrow/Int64/6-16 0.00B 0.00B ~ (all equal)
MapAssignGrow/Int64/12-16 317B ± 0% 288B ± 0% -9.15% (p=0.000 n=10+10)
MapAssignGrow/Int64/18-16 931B ± 0% 864B ± 0% -7.20% (p=0.000 n=10+10)
MapAssignGrow/Int64/24-16 1.01kB ± 0% 0.86kB ± 0% -14.37% (p=0.000 n=9+10)
MapAssignGrow/Int64/30-16 2.22kB ± 0% 2.02kB ± 0% -9.13% (p=0.000 n=10+10)
MapAssignGrow/Int64/64-16 5.18kB ± 0% 4.32kB ± 0% -16.55% (p=0.000 n=10+10)
MapAssignGrow/Int64/128-16 10.8kB ± 0% 9.2kB ± 0% -15.18% (p=0.000 n=10+10)
MapAssignGrow/Int64/256-16 21.5kB ± 0% 18.7kB ± 0% -13.15% (p=0.000 n=10+10)
MapAssignGrow/Int64/512-16 43.2kB ± 0% 37.1kB ± 0% -14.11% (p=0.000 n=10+10)
MapAssignGrow/Int64/1024-16 86.6kB ± 0% 78.0kB ± 0% -9.84% (p=0.000 n=10+10)
MapAssignGrow/Int64/2048-16 173kB ± 0% 152kB ± 0% -12.43% (p=0.000 n=10+10)
MapAssignGrow/Int64/4096-16 347kB ± 0% 299kB ± 0% -13.72% (p=0.000 n=10+8)
MapAssignGrow/Int64/8192-16 685kB ± 0% 594kB ± 0% -13.32% (p=0.000 n=10+8)
MapAssignGrow/Int64/65536-16 5.45MB ± 0% 4.72MB ± 0% -13.26% (p=0.000 n=10+10)
MapAssignGrow/Int32/6-16 0.00B 0.00B ~ (all equal)
MapAssignGrow/Int32/12-16 248B ± 0% 224B ± 0% -9.68% (p=0.000 n=10+10)
MapAssignGrow/Int32/18-16 728B ± 0% 672B ± 0% -7.69% (p=0.000 n=10+10)
MapAssignGrow/Int32/24-16 793B ± 0% 672B ± 0% -15.26% (p=0.000 n=10+10)
MapAssignGrow/Int32/30-16 1.74kB ± 0% 1.57kB ± 0% -9.70% (p=0.000 n=10+10)
MapAssignGrow/Int32/64-16 4.01kB ± 0% 3.36kB ± 0% -16.15% (p=0.000 n=9+10)
MapAssignGrow/Int32/128-16 8.34kB ± 0% 7.46kB ± 0% -10.56% (p=0.000 n=10+10)
MapAssignGrow/Int32/256-16 17.0kB ± 0% 15.6kB ± 0% -7.90% (p=0.000 n=10+10)
MapAssignGrow/Int32/512-16 34.2kB ± 0% 30.0kB ± 0% -12.26% (p=0.000 n=10+10)
MapAssignGrow/Int32/1024-16 68.5kB ± 0% 58.7kB ± 0% -14.39% (p=0.000 n=10+10)
MapAssignGrow/Int32/2048-16 137kB ± 0% 116kB ± 0% -15.43% (p=0.000 n=10+10)
MapAssignGrow/Int32/4096-16 266kB ± 0% 231kB ± 0% -13.33% (p=0.000 n=10+10)
MapAssignGrow/Int32/8192-16 532kB ± 0% 460kB ± 0% -13.57% (p=0.000 n=10+10)
MapAssignGrow/Int32/65536-16 4.25MB ± 0% 3.67MB ± 0% -13.70% (p=0.000 n=10+8)
MapAssignGrow/Str/6-16 0.00B 0.00B ~ (all equal)
MapAssignGrow/Str/12-16 446B ± 0% 416B ± 0% -6.73% (p=0.000 n=10+10)
MapAssignGrow/Str/18-16 1.38kB ± 0% 1.31kB ± 0% -5.13% (p=0.000 n=10+10)
MapAssignGrow/Str/24-16 1.47kB ± 0% 1.31kB ± 0% -10.63% (p=0.000 n=10+10)
MapAssignGrow/Str/30-16 3.32kB ± 0% 3.10kB ± 0% -6.62% (p=0.000 n=10+10)
MapAssignGrow/Str/64-16 7.76kB ± 0% 6.56kB ± 0% -15.42% (p=0.000 n=10+10)
MapAssignGrow/Str/128-16 16.1kB ± 0% 13.3kB ± 0% -16.91% (p=0.000 n=10+10)
MapAssignGrow/Str/256-16 30.5kB ± 0% 26.9kB ± 0% -11.71% (p=0.000 n=10+10)
MapAssignGrow/Str/512-16 61.1kB ± 0% 54.2kB ± 0% -11.30% (p=0.000 n=10+10)
MapAssignGrow/Str/1024-16 122kB ± 0% 112kB ± 0% -8.66% (p=0.000 n=10+10)
MapAssignGrow/Str/2048-16 244kB ± 0% 218kB ± 0% -10.63% (p=0.000 n=10+10)
MapAssignGrow/Str/4096-16 487kB ± 0% 431kB ± 0% -11.58% (p=0.000 n=10+10)
MapAssignGrow/Str/8192-16 974kB ± 0% 857kB ± 0% -12.05% (p=0.000 n=10+9)
MapAssignGrow/Str/65536-16 7.74MB ± 0% 6.82MB ± 0% -11.89% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/6-16 48.0B ± 0% 48.0B ± 0% ~ (all equal)
MapAssignPreAllocate/Pointer/12-16 405B ± 0% 384B ± 0% -5.09% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/18-16 731B ± 0% 720B ± 0% -1.50% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/24-16 837B ± 0% 768B ± 0% -8.24% (p=0.000 n=9+10)
MapAssignPreAllocate/Pointer/30-16 1.40kB ± 0% 1.39kB ± 0% -0.71% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/64-16 3.22kB ± 0% 2.82kB ± 0% -12.66% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/128-16 6.42kB ± 0% 5.89kB ± 0% -8.34% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/256-16 12.3kB ± 0% 11.5kB ± 0% -6.43% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/512-16 24.6kB ± 0% 22.5kB ± 0% -8.42% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/1024-16 49.2kB ± 0% 49.2kB ± 0% -0.05% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/2048-16 98.3kB ± 0% 90.1kB ± 0% -8.36% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/4096-16 197kB ± 0% 180kB ± 0% -8.34% (p=0.000 n=9+10)
MapAssignPreAllocate/Pointer/8192-16 385kB ± 0% 360kB ± 0% -6.39% (p=0.000 n=10+8)
MapAssignPreAllocate/Pointer/65536-16 3.03MB ± 0% 2.88MB ± 0% -4.87% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/6-16 0.00B 0.00B ~ (all equal)
MapAssignPreAllocate/Int64/12-16 317B ± 0% 288B ± 0% -9.15% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/18-16 591B ± 0% 576B ± 0% -2.54% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/24-16 672B ± 0% 576B ± 0% -14.29% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/30-16 1.17kB ± 0% 1.15kB ± 0% -1.20% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/64-16 2.72kB ± 0% 2.30kB ± 0% -15.29% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/128-16 5.42kB ± 0% 4.86kB ± 0% -10.23% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/256-16 10.3kB ± 0% 9.5kB ± 0% -8.03% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/512-16 20.6kB ± 0% 18.4kB ± 0% -10.39% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/1024-16 41.1kB ± 0% 41.0kB ± 0% -0.37% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/2048-16 82.2kB ± 0% 73.7kB ± 0% -10.30% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/4096-16 164kB ± 0% 147kB ± 0% -10.29% (p=0.006 n=7+9)
MapAssignPreAllocate/Int64/8192-16 321kB ± 0% 295kB ± 0% -7.99% (p=0.000 n=10+9)
MapAssignPreAllocate/Int64/65536-16 2.51MB ± 0% 2.36MB ± 0% -6.19% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/6-16 0.00B 0.00B ~ (all equal)
MapAssignPreAllocate/Int32/12-16 248B ± 0% 224B ± 0% -9.68% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/18-16 461B ± 0% 448B ± 0% -2.82% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/24-16 528B ± 0% 448B ± 0% -15.15% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/30-16 908B ± 0% 896B ± 0% -1.32% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/64-16 2.08kB ± 0% 1.79kB ± 0% -13.85% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/128-16 4.14kB ± 0% 4.10kB ± 0% -1.01% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/256-16 8.25kB ± 0% 8.19kB ± 0% -0.71% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/512-16 16.5kB ± 0% 14.3kB ± 0% -12.98% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/1024-16 32.9kB ± 0% 28.7kB ± 0% -12.91% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/2048-16 65.8kB ± 0% 57.3kB ± 0% -12.87% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/4096-16 123kB ± 0% 115kB ± 0% -7.07% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/8192-16 247kB ± 0% 229kB ± 0% -7.06% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/65536-16 1.96MB ± 0% 1.84MB ± 0% -6.28% (p=0.000 n=10+9)
MapAssignPreAllocate/Str/6-16 0.00B 0.00B ~ (all equal)
MapAssignPreAllocate/Str/12-16 446B ± 0% 416B ± 0% -6.73% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/18-16 912B ± 0% 896B ± 0% -1.75% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/24-16 1.00kB ± 0% 0.90kB ± 0% -10.04% (p=0.002 n=8+10)
MapAssignPreAllocate/Str/30-16 1.81kB ± 0% 1.79kB ± 0% -0.81% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/64-16 4.12kB ± 0% 3.46kB ± 0% -16.12% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/128-16 8.22kB ± 0% 6.78kB ± 0% -17.43% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/256-16 14.4kB ± 0% 13.6kB ± 0% -5.52% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/512-16 28.7kB ± 0% 27.3kB ± 0% -4.99% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/1024-16 57.4kB ± 0% 57.3kB ± 0% -0.04% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/2048-16 115kB ± 0% 106kB ± 0% -7.16% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/4096-16 229kB ± 0% 213kB ± 0% -7.15% (p=0.000 n=10+8)
MapAssignPreAllocate/Str/8192-16 459kB ± 0% 426kB ± 0% -7.15% (p=0.001 n=8+9)
MapAssignPreAllocate/Str/65536-16 3.62MB ± 0% 3.41MB ± 0% -5.88% (p=0.000 n=9+10)
MapAssignReuse/Pointer/6-16 48.0B ± 0% 48.0B ± 0% ~ (all equal)
MapAssignReuse/Pointer/12-16 117B ± 1% 96B ± 0% -17.74% (p=0.000 n=10+10)
MapAssignReuse/Pointer/18-16 155B ± 0% 144B ± 0% -7.10% (p=0.000 n=10+10)
MapAssignReuse/Pointer/24-16 261B ± 0% 192B ± 0% -26.52% (p=0.000 n=10+10)
MapAssignReuse/Pointer/30-16 250B ± 0% 240B ± 0% -4.00% (p=0.000 n=10+10)
MapAssignReuse/Pointer/64-16 512B ± 0% 512B ± 0% ~ (all equal)
MapAssignReuse/Pointer/128-16 1.02kB ± 0% 1.02kB ± 0% ~ (all equal)
MapAssignReuse/Pointer/256-16 2.05kB ± 0% 2.05kB ± 0% ~ (all equal)
MapAssignReuse/Pointer/512-16 4.10kB ± 0% 4.10kB ± 0% ~ (all equal)
MapAssignReuse/Pointer/1024-16 8.19kB ± 0% 8.19kB ± 0% ~ (all equal)
MapAssignReuse/Pointer/2048-16 16.4kB ± 0% 16.4kB ± 0% ~ (all equal)
MapAssignReuse/Pointer/4096-16 32.8kB ± 0% 32.8kB ± 0% ~ (all equal)
MapAssignReuse/Pointer/8192-16 65.5kB ± 0% 65.5kB ± 0% ~ (all equal)
MapAssignReuse/Pointer/65536-16 524kB ± 0% 524kB ± 0% ~ (p=0.137 n=10+8)
MapAssignReuse/Int64/6-16 0.00B 0.00B ~ (all equal)
MapAssignReuse/Int64/12-16 25.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/18-16 13.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/24-16 85.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/30-16 12.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/64-16 8.00B ± 0% 0.00B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/128-16 18.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/256-16 34.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/512-16 66.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/1024-16 128B ± 0% 0B -100.00% (p=0.000 n=9+10)
MapAssignReuse/Int64/2048-16 252B ± 0% 0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/4096-16 506B ± 0% 0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/8192-16 1.03kB ± 0% 0.00kB -100.00% (p=0.000 n=9+10)
MapAssignReuse/Int64/65536-16 8.21kB ± 0% 0.00kB -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/6-16 0.00B 0.00B ~ (all equal)
MapAssignReuse/Int32/12-16 20.7B ± 3% 0.0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/18-16 11.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/24-16 69.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/30-16 10.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/64-16 8.00B ± 0% 0.00B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/128-16 18.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/256-16 34.0B ± 0% 0.0B -100.00% (p=0.002 n=8+10)
MapAssignReuse/Int32/512-16 66.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/1024-16 128B ± 0% 0B -100.00% (p=0.000 n=9+10)
MapAssignReuse/Int32/2048-16 252B ± 0% 0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/4096-16 506B ± 0% 0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/8192-16 1.03kB ± 0% 0.00kB -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/65536-16 8.21kB ± 0% 0.00kB -100.00% (p=0.000 n=10+10)
MapAssignReuse/Str/6-16 0.00B 0.00B ~ (all equal)
MapAssignReuse/Str/12-16 30.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Str/18-16 16.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Str/24-16 100B ± 0% 0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Str/30-16 14.7B ± 5% 0.0B -100.00% (p=0.000 n=10+10)
MapAssignReuse/Str/64-16 0.00B 0.00B ~ (all equal)
MapAssignReuse/Str/128-16 0.00B 0.00B ~ (all equal)
MapAssignReuse/Str/256-16 0.30B ±233% 0.00B ~ (p=0.211 n=10+10)
MapAssignReuse/Str/512-16 0.00B 0.00B ~ (all equal)
MapAssignReuse/Str/1024-16 0.00B 0.00B ~ (all equal)
MapAssignReuse/Str/2048-16 0.00B 0.00B ~ (all equal)
MapAssignReuse/Str/4096-16 0.00B 0.00B ~ (all equal)
MapAssignReuse/Str/8192-16 0.00B 0.00B ~ (all equal)
MapAssignReuse/Str/65536-16 0.00B 0.00B ~ (all equal)

name old allocs/op new allocs/op delta
MapIter/Int/6-16 0.00 0.00 ~ (all equal)
MapIter/Int/12-16 0.00 0.00 ~ (all equal)
MapIter/Int/18-16 0.00 0.00 ~ (all equal)
MapIter/Int/24-16 0.00 0.00 ~ (all equal)
MapIter/Int/30-16 0.00 0.00 ~ (all equal)
MapIter/Int/64-16 0.00 0.00 ~ (all equal)
MapIter/Int/128-16 0.00 0.00 ~ (all equal)
MapIter/Int/256-16 0.00 0.00 ~ (all equal)
MapIter/Int/512-16 0.00 0.00 ~ (all equal)
MapIter/Int/1024-16 0.00 0.00 ~ (all equal)
MapIter/Int/2048-16 0.00 0.00 ~ (all equal)
MapIter/Int/4096-16 0.00 0.00 ~ (all equal)
MapIter/Int/8192-16 0.00 0.00 ~ (all equal)
MapIter/Int/65536-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int64/6-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int64/12-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int64/18-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int64/24-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int64/30-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int64/64-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int64/128-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int64/256-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int64/512-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int64/1024-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int64/2048-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int64/4096-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int64/8192-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int64/65536-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int32/6-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int32/12-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int32/18-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int32/24-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int32/30-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int32/64-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int32/128-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int32/256-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int32/512-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int32/1024-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int32/2048-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int32/4096-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int32/8192-16 0.00 0.00 ~ (all equal)
MapAccessHit/Int32/65536-16 0.00 0.00 ~ (all equal)
MapAccessHit/Str/6-16 0.00 0.00 ~ (all equal)
MapAccessHit/Str/12-16 0.00 0.00 ~ (all equal)
MapAccessHit/Str/18-16 0.00 0.00 ~ (all equal)
MapAccessHit/Str/24-16 0.00 0.00 ~ (all equal)
MapAccessHit/Str/30-16 0.00 0.00 ~ (all equal)
MapAccessHit/Str/64-16 0.00 0.00 ~ (all equal)
MapAccessHit/Str/128-16 0.00 0.00 ~ (all equal)
MapAccessHit/Str/256-16 0.00 0.00 ~ (all equal)
MapAccessHit/Str/512-16 0.00 0.00 ~ (all equal)
MapAccessHit/Str/1024-16 0.00 0.00 ~ (all equal)
MapAccessHit/Str/2048-16 0.00 0.00 ~ (all equal)
MapAccessHit/Str/4096-16 0.00 0.00 ~ (all equal)
MapAccessHit/Str/8192-16 0.00 0.00 ~ (all equal)
MapAccessHit/Str/65536-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int64/6-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int64/12-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int64/18-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int64/24-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int64/30-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int64/64-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int64/128-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int64/256-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int64/512-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int64/1024-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int64/2048-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int64/4096-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int64/8192-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int64/65536-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int32/6-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int32/12-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int32/18-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int32/24-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int32/30-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int32/64-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int32/128-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int32/256-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int32/512-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int32/1024-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int32/2048-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int32/4096-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int32/8192-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Int32/65536-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Str/6-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Str/12-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Str/18-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Str/24-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Str/30-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Str/64-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Str/128-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Str/256-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Str/512-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Str/1024-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Str/2048-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Str/4096-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Str/8192-16 0.00 0.00 ~ (all equal)
MapAccessMiss/Str/65536-16 0.00 0.00 ~ (all equal)
MapAssignGrow/Int64/6-16 0.00 0.00 ~ (all equal)
MapAssignGrow/Int64/12-16 1.00 ± 0% 1.00 ± 0% ~ (all equal)
MapAssignGrow/Int64/18-16 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.000 n=10+10)
MapAssignGrow/Int64/24-16 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignGrow/Int64/30-16 6.00 ± 0% 3.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignGrow/Int64/64-16 12.0 ± 0% 4.0 ± 0% -66.67% (p=0.000 n=10+10)
MapAssignGrow/Int64/128-16 19.0 ± 0% 5.0 ± 0% -73.68% (p=0.000 n=10+10)
MapAssignGrow/Int64/256-16 27.0 ± 0% 6.0 ± 0% -77.78% (p=0.000 n=10+10)
MapAssignGrow/Int64/512-16 42.0 ± 0% 7.0 ± 0% -83.33% (p=0.000 n=10+10)
MapAssignGrow/Int64/1024-16 64.0 ± 0% 8.0 ± 0% -87.50% (p=0.000 n=10+10)
MapAssignGrow/Int64/2048-16 100 ± 1% 9 ± 0% -90.96% (p=0.000 n=10+10)
MapAssignGrow/Int64/4096-16 162 ± 0% 10 ± 0% -93.81% (p=0.000 n=10+10)
MapAssignGrow/Int64/8192-16 274 ± 0% 11 ± 0% -95.99% (p=0.000 n=10+10)
MapAssignGrow/Int64/65536-16 2.35k ± 0% 0.01k ± 0% -99.40% (p=0.000 n=10+10)
MapAssignGrow/Int32/6-16 0.00 0.00 ~ (all equal)
MapAssignGrow/Int32/12-16 1.00 ± 0% 1.00 ± 0% ~ (all equal)
MapAssignGrow/Int32/18-16 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.000 n=10+10)
MapAssignGrow/Int32/24-16 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignGrow/Int32/30-16 6.00 ± 0% 3.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignGrow/Int32/64-16 12.0 ± 0% 4.0 ± 0% -66.67% (p=0.000 n=10+10)
MapAssignGrow/Int32/128-16 19.0 ± 0% 5.0 ± 0% -73.68% (p=0.000 n=10+10)
MapAssignGrow/Int32/256-16 28.0 ± 0% 6.0 ± 0% -78.57% (p=0.000 n=10+10)
MapAssignGrow/Int32/512-16 41.0 ± 0% 7.0 ± 0% -82.93% (p=0.000 n=10+10)
MapAssignGrow/Int32/1024-16 59.0 ± 0% 8.0 ± 0% -86.44% (p=0.000 n=10+10)
MapAssignGrow/Int32/2048-16 86.5 ± 1% 9.0 ± 0% -89.60% (p=0.000 n=10+10)
MapAssignGrow/Int32/4096-16 131 ± 1% 10 ± 0% -92.38% (p=0.000 n=10+10)
MapAssignGrow/Int32/8192-16 284 ± 0% 11 ± 0% -96.13% (p=0.002 n=8+10)
MapAssignGrow/Int32/65536-16 2.37k ± 0% 0.01k ± 0% -99.41% (p=0.000 n=10+10)
MapAssignGrow/Str/6-16 0.00 0.00 ~ (all equal)
MapAssignGrow/Str/12-16 1.00 ± 0% 1.00 ± 0% ~ (all equal)
MapAssignGrow/Str/18-16 2.00 ± 0% 2.00 ± 0% ~ (all equal)
MapAssignGrow/Str/24-16 2.00 ± 0% 2.00 ± 0% ~ (all equal)
MapAssignGrow/Str/30-16 4.00 ± 0% 3.00 ± 0% -25.00% (p=0.000 n=10+10)
MapAssignGrow/Str/64-16 7.00 ± 0% 4.00 ± 0% -42.86% (p=0.000 n=10+10)
MapAssignGrow/Str/128-16 9.00 ± 0% 5.00 ± 0% -44.44% (p=0.000 n=10+10)
MapAssignGrow/Str/256-16 10.0 ± 0% 6.0 ± 0% -40.00% (p=0.000 n=10+10)
MapAssignGrow/Str/512-16 20.0 ± 0% 7.0 ± 0% -65.00% (p=0.000 n=10+10)
MapAssignGrow/Str/1024-16 39.0 ± 0% 8.0 ± 0% -79.49% (p=0.000 n=10+10)
MapAssignGrow/Str/2048-16 74.0 ± 0% 9.0 ± 0% -87.84% (p=0.000 n=10+10)
MapAssignGrow/Str/4096-16 143 ± 0% 10 ± 0% -93.01% (p=0.000 n=10+10)
MapAssignGrow/Str/8192-16 280 ± 0% 11 ± 0% -96.07% (p=0.002 n=8+10)
MapAssignGrow/Str/65536-16 2.33k ± 0% 0.01k ± 0% -99.40% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/6-16 6.00 ± 0% 6.00 ± 0% ~ (all equal)
MapAssignPreAllocate/Pointer/12-16 13.0 ± 0% 13.0 ± 0% ~ (all equal)
MapAssignPreAllocate/Pointer/18-16 19.0 ± 0% 19.0 ± 0% ~ (all equal)
MapAssignPreAllocate/Pointer/24-16 25.0 ± 0% 25.0 ± 0% ~ (all equal)
MapAssignPreAllocate/Pointer/30-16 31.0 ± 0% 31.0 ± 0% ~ (all equal)
MapAssignPreAllocate/Pointer/64-16 66.0 ± 0% 65.0 ± 0% -1.52% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/128-16 130 ± 0% 129 ± 0% -0.77% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/256-16 258 ± 0% 257 ± 0% -0.39% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/512-16 514 ± 0% 513 ± 0% -0.19% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/1024-16 1.03k ± 0% 1.02k ± 0% -0.10% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/2048-16 2.05k ± 0% 2.05k ± 0% -0.05% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/4096-16 4.10k ± 0% 4.10k ± 0% -0.02% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/8192-16 8.19k ± 0% 8.19k ± 0% -0.01% (p=0.000 n=10+10)
MapAssignPreAllocate/Pointer/65536-16 65.5k ± 0% 65.5k ± 0% -0.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/6-16 0.00 0.00 ~ (all equal)
MapAssignPreAllocate/Int64/12-16 1.00 ± 0% 1.00 ± 0% ~ (all equal)
MapAssignPreAllocate/Int64/18-16 1.00 ± 0% 1.00 ± 0% ~ (all equal)
MapAssignPreAllocate/Int64/24-16 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/30-16 1.00 ± 0% 1.00 ± 0% ~ (all equal)
MapAssignPreAllocate/Int64/64-16 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/128-16 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/256-16 4.00 ± 0% 1.00 ± 0% -75.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/512-16 5.00 ± 0% 1.00 ± 0% -80.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/1024-16 6.00 ± 0% 1.00 ± 0% -83.33% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/2048-16 7.00 ± 0% 1.00 ± 0% -85.71% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/4096-16 8.00 ± 0% 1.00 ± 0% -87.50% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/8192-16 9.00 ± 0% 1.00 ± 0% -88.89% (p=0.000 n=10+10)
MapAssignPreAllocate/Int64/65536-16 13.0 ± 0% 1.0 ± 0% -92.31% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/6-16 0.00 0.00 ~ (all equal)
MapAssignPreAllocate/Int32/12-16 1.00 ± 0% 1.00 ± 0% ~ (all equal)
MapAssignPreAllocate/Int32/18-16 1.00 ± 0% 1.00 ± 0% ~ (all equal)
MapAssignPreAllocate/Int32/24-16 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/30-16 1.00 ± 0% 1.00 ± 0% ~ (all equal)
MapAssignPreAllocate/Int32/64-16 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/128-16 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/256-16 4.00 ± 0% 1.00 ± 0% -75.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/512-16 5.00 ± 0% 1.00 ± 0% -80.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/1024-16 6.00 ± 0% 1.00 ± 0% -83.33% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/2048-16 7.00 ± 0% 1.00 ± 0% -85.71% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/4096-16 8.00 ± 0% 1.00 ± 0% -87.50% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/8192-16 9.00 ± 0% 1.00 ± 0% -88.89% (p=0.000 n=10+10)
MapAssignPreAllocate/Int32/65536-16 13.0 ± 0% 1.0 ± 0% -92.31% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/6-16 0.00 0.00 ~ (all equal)
MapAssignPreAllocate/Str/12-16 1.00 ± 0% 1.00 ± 0% ~ (all equal)
MapAssignPreAllocate/Str/18-16 1.00 ± 0% 1.00 ± 0% ~ (all equal)
MapAssignPreAllocate/Str/24-16 1.00 ± 0% 1.00 ± 0% ~ (all equal)
MapAssignPreAllocate/Str/30-16 1.00 ± 0% 1.00 ± 0% ~ (all equal)
MapAssignPreAllocate/Str/64-16 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/128-16 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/256-16 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/512-16 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/1024-16 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/2048-16 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/4096-16 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/8192-16 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignPreAllocate/Str/65536-16 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
MapAssignReuse/Pointer/6-16 6.00 ± 0% 6.00 ± 0% ~ (all equal)
MapAssignReuse/Pointer/12-16 12.0 ± 0% 12.0 ± 0% ~ (all equal)
MapAssignReuse/Pointer/18-16 18.0 ± 0% 18.0 ± 0% ~ (all equal)
MapAssignReuse/Pointer/24-16 24.0 ± 0% 24.0 ± 0% ~ (all equal)
MapAssignReuse/Pointer/30-16 30.0 ± 0% 30.0 ± 0% ~ (all equal)
MapAssignReuse/Pointer/64-16 64.0 ± 0% 64.0 ± 0% ~ (all equal)
MapAssignReuse/Pointer/128-16 128 ± 0% 128 ± 0% ~ (all equal)
MapAssignReuse/Pointer/256-16 256 ± 0% 256 ± 0% ~ (all equal)
MapAssignReuse/Pointer/512-16 512 ± 0% 512 ± 0% ~ (all equal)
MapAssignReuse/Pointer/1024-16 1.02k ± 0% 1.02k ± 0% ~ (all equal)
MapAssignReuse/Pointer/2048-16 2.05k ± 0% 2.05k ± 0% ~ (all equal)
MapAssignReuse/Pointer/4096-16 4.10k ± 0% 4.10k ± 0% ~ (all equal)
MapAssignReuse/Pointer/8192-16 8.19k ± 0% 8.19k ± 0% ~ (all equal)
MapAssignReuse/Pointer/65536-16 65.5k ± 0% 65.5k ± 0% ~ (all equal)
MapAssignReuse/Int64/6-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Int64/12-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Int64/18-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Int64/24-16 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/30-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Int64/64-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Int64/128-16 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/256-16 2.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/512-16 3.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/1024-16 4.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/2048-16 5.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/4096-16 6.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/8192-16 7.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int64/65536-16 11.0 ± 0% 0.0 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/6-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Int32/12-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Int32/18-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Int32/24-16 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/30-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Int32/64-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Int32/128-16 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/256-16 2.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/512-16 3.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/1024-16 4.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/2048-16 5.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/4096-16 6.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/8192-16 7.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Int32/65536-16 11.0 ± 0% 0.0 -100.00% (p=0.000 n=10+10)
MapAssignReuse/Str/6-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Str/12-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Str/18-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Str/24-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Str/30-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Str/64-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Str/128-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Str/256-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Str/512-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Str/1024-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Str/2048-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Str/4096-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Str/8192-16 0.00 0.00 ~ (all equal)
MapAssignReuse/Str/65536-16 0.00 0.00 ~ (all equal)

Benchmark-2

These are the existing map benchmarks from the runtime package.

name                              old time/op    new time/op      delta
MegMap-16                           9.01ns ± 1%      8.83ns ± 1%       -2.00%  (p=0.000 n=8+8)
MegOneMap-16                        4.59ns ± 1%      9.21ns ± 2%     +100.52%  (p=0.000 n=9+10)
MegEqMap-16                         20.4µs ± 2%      20.1µs ± 2%       -1.49%  (p=0.011 n=9+9)
MegEmptyMap-16                      2.18ns ± 3%      2.06ns ± 1%       -5.42%  (p=0.000 n=10+10)
SmallStrMap-16                      8.27ns ± 1%      7.17ns ± 4%      -13.31%  (p=0.000 n=8+10)
MapStringKeysEight_16-16            6.89ns ± 3%      7.79ns ± 1%      +13.09%  (p=0.000 n=9+8)
MapStringKeysEight_32-16            7.75ns ± 5%      7.97ns ± 0%       +2.83%  (p=0.000 n=10+7)
MapStringKeysEight_64-16            7.66ns ± 3%      8.76ns ± 1%      +14.43%  (p=0.000 n=10+7)
MapStringKeysEight_1M-16            7.72ns ± 3%  16483.00ns ± 1%  +213363.36%  (p=0.000 n=10+8)
IntMap-16                           5.21ns ± 1%      6.42ns ±16%      +23.30%  (p=0.000 n=8+10)
MapFirst/1-16                       2.68ns ± 1%      2.45ns ± 1%       -8.78%  (p=0.000 n=10+9)
MapFirst/2-16                       2.68ns ± 1%      2.43ns ± 1%       -9.34%  (p=0.000 n=9+9)
MapFirst/3-16                       2.66ns ± 2%      2.44ns ± 1%       -7.98%  (p=0.000 n=10+9)
MapFirst/4-16                       2.68ns ± 1%      2.44ns ± 0%       -8.67%  (p=0.000 n=8+9)
MapFirst/5-16                       2.69ns ± 1%      2.45ns ± 0%       -8.73%  (p=0.000 n=8+8)
MapFirst/6-16                       2.68ns ± 0%      2.44ns ± 1%       -8.78%  (p=0.000 n=8+9)
MapFirst/7-16                       2.69ns ± 1%      2.45ns ± 1%       -9.10%  (p=0.000 n=10+9)
MapFirst/8-16                       2.67ns ± 1%      4.79ns ± 1%      +79.47%  (p=0.000 n=10+9)
MapFirst/9-16                       4.66ns ± 1%      4.75ns ± 2%       +1.98%  (p=0.000 n=10+9)
MapFirst/10-16                      4.68ns ± 2%      4.73ns ± 1%       +1.19%  (p=0.016 n=9+10)
MapFirst/11-16                      4.69ns ± 2%      4.78ns ± 1%       +1.93%  (p=0.000 n=9+8)
MapFirst/12-16                      4.64ns ± 2%      4.75ns ± 2%       +2.47%  (p=0.005 n=10+10)
MapFirst/13-16                      4.69ns ± 1%      4.78ns ± 1%       +1.86%  (p=0.000 n=8+9)
MapFirst/14-16                      4.68ns ± 2%      4.79ns ± 1%       +2.34%  (p=0.000 n=10+8)
MapFirst/15-16                      4.67ns ± 1%      6.90ns ± 2%      +47.88%  (p=0.000 n=8+10)
MapFirst/16-16                      4.65ns ± 2%      6.95ns ± 1%      +49.60%  (p=0.000 n=10+9)
MapMid/1-16                         2.68ns ± 1%      2.62ns ± 1%       -2.15%  (p=0.000 n=10+9)
MapMid/2-16                         3.23ns ± 1%      2.95ns ± 1%       -8.59%  (p=0.000 n=10+9)
MapMid/3-16                         3.21ns ± 1%      2.94ns ± 0%       -8.33%  (p=0.000 n=10+7)
MapMid/4-16                         3.64ns ± 2%      3.46ns ± 1%       -4.84%  (p=0.000 n=10+8)
MapMid/5-16                         3.67ns ± 1%      3.47ns ± 1%       -5.46%  (p=0.000 n=10+8)
MapMid/6-16                         4.15ns ± 2%      3.93ns ± 1%       -5.25%  (p=0.000 n=10+9)
MapMid/7-16                         4.19ns ± 1%      3.91ns ± 1%       -6.55%  (p=0.000 n=10+10)
MapMid/8-16                         4.84ns ± 2%      5.76ns ±11%      +19.02%  (p=0.000 n=10+10)
MapMid/9-16                         5.66ns ±10%      5.83ns ± 9%         ~     (p=0.393 n=10+10)
MapMid/10-16                        5.81ns ± 5%      6.20ns ± 4%       +6.67%  (p=0.001 n=8+7)
MapMid/11-16                        5.80ns ± 4%      6.17ns ±11%         ~     (p=0.278 n=9+10)
MapMid/12-16                        5.87ns ±10%      6.26ns ± 2%       +6.48%  (p=0.008 n=9+7)
MapMid/13-16                        6.15ns ±13%      5.92ns ± 8%         ~     (p=0.315 n=10+8)
MapMid/14-16                        5.29ns ± 8%      6.18ns ±14%      +16.89%  (p=0.001 n=9+10)
MapMid/15-16                        5.44ns ±12%      7.04ns ± 2%      +29.37%  (p=0.000 n=10+10)
MapMid/16-16                        5.91ns ±11%      7.18ns ± 3%      +21.44%  (p=0.000 n=10+9)
MapLast/1-16                        2.69ns ± 1%      2.64ns ± 2%       -2.00%  (p=0.000 n=8+10)
MapLast/2-16                        3.22ns ± 2%      2.93ns ± 1%       -9.05%  (p=0.000 n=9+10)
MapLast/3-16                        3.66ns ± 2%      3.47ns ± 1%       -5.17%  (p=0.000 n=10+8)
MapLast/4-16                        4.20ns ± 1%      3.88ns ± 2%       -7.54%  (p=0.000 n=8+9)
MapLast/5-16                        4.85ns ± 2%      4.39ns ± 2%       -9.53%  (p=0.000 n=9+9)
MapLast/6-16                        5.33ns ± 1%      5.10ns ± 8%       -4.27%  (p=0.017 n=9+10)
MapLast/7-16                        5.79ns ± 1%      5.33ns ± 2%       -7.93%  (p=0.000 n=9+8)
MapLast/8-16                        6.28ns ± 1%      6.55ns ±17%         ~     (p=0.704 n=9+10)
MapLast/9-16                        6.90ns ±18%      6.87ns ±12%         ~     (p=1.000 n=9+9)
MapLast/10-16                       6.83ns ±15%      7.32ns ± 7%       +7.10%  (p=0.010 n=9+8)
MapLast/11-16                       7.84ns ± 3%      7.46ns ± 5%       -4.78%  (p=0.006 n=7+8)
MapLast/12-16                       7.58ns ± 9%      7.58ns ±12%         ~     (p=0.743 n=8+9)
MapLast/13-16                       9.15ns ±25%     14.06ns ± 2%      +53.70%  (p=0.000 n=10+8)
MapLast/14-16                       6.31ns ±15%      7.95ns ± 8%      +25.94%  (p=0.000 n=9+8)
MapLast/15-16                       6.20ns ±16%      7.11ns ± 2%      +14.59%  (p=0.004 n=10+10)
MapLast/16-16                       6.78ns ±19%      7.19ns ± 1%         ~     (p=0.139 n=9+8)
MapCycle-16                         11.5ns ± 2%      15.8ns ± 2%      +36.83%  (p=0.000 n=10+10)
RepeatedLookupStrMapKey32-16        9.37ns ± 2%      7.81ns ± 1%      -16.67%  (p=0.000 n=9+9)
RepeatedLookupStrMapKey1M-16        16.6µs ± 1%      16.6µs ± 2%         ~     (p=0.604 n=10+9)
MakeMap/[Byte]Byte-16                120ns ± 1%       116ns ± 1%       -3.61%  (p=0.000 n=10+10)
MakeMap/[Int]Int-16                  164ns ± 1%       160ns ± 0%       -2.36%  (p=0.000 n=10+9)
NewEmptyMap-16                      4.32ns ± 1%      4.55ns ± 3%       +5.26%  (p=0.000 n=10+10)
NewSmallMap-16                      19.4ns ± 2%      24.1ns ± 1%      +24.31%  (p=0.000 n=10+10)
MapIter-16                          73.7ns ± 2%      82.3ns ± 3%      +11.60%  (p=0.000 n=10+10)
MapIterEmpty-16                     3.65ns ± 1%      4.11ns ± 2%      +12.80%  (p=0.000 n=9+9)
SameLengthMap-16                    3.14ns ± 2%      3.38ns ± 2%       +7.50%  (p=0.000 n=9+10)
BigKeyMap-16                        10.0ns ± 4%      11.9ns ± 1%      +18.84%  (p=0.000 n=10+10)
BigValMap-16                        10.0ns ± 3%      11.9ns ± 2%      +18.74%  (p=0.000 n=10+10)
SmallKeyMap-16                      8.51ns ± 1%      9.91ns ± 1%      +16.44%  (p=0.000 n=10+9)
MapPopulate/1-16                    10.4ns ± 1%      14.4ns ± 2%      +38.28%  (p=0.000 n=9+9)
MapPopulate/10-16                    608ns ± 1%       495ns ± 1%      -18.58%  (p=0.000 n=10+10)
MapPopulate/100-16                  9.32µs ± 1%      6.28µs ± 1%      -32.59%  (p=0.000 n=9+10)
MapPopulate/1000-16                  111µs ± 0%        86µs ± 0%      -22.28%  (p=0.000 n=10+10)
MapPopulate/10000-16                 965µs ± 0%       745µs ± 1%      -22.78%  (p=0.000 n=10+10)
MapPopulate/100000-16               10.4ms ± 1%       7.4ms ± 0%      -28.46%  (p=0.000 n=10+10)
ComplexAlgMap-16                    20.6ns ± 1%      22.9ns ± 1%      +10.86%  (p=0.000 n=9+10)
GoMapClear/Reflexive/1-16           19.3ns ± 1%      17.9ns ± 1%       -7.39%  (p=0.000 n=10+10)
GoMapClear/Reflexive/10-16          21.2ns ± 1%      18.9ns ± 2%      -10.91%  (p=0.000 n=10+9)
GoMapClear/Reflexive/100-16         37.3ns ± 1%      40.2ns ± 1%       +7.90%  (p=0.000 n=9+9)
GoMapClear/Reflexive/1000-16         345ns ± 2%       433ns ± 1%      +25.73%  (p=0.000 n=10+10)
GoMapClear/Reflexive/10000-16       2.62µs ± 1%      3.94µs ± 1%      +50.33%  (p=0.000 n=10+10)
GoMapClear/NonReflexive/1-16        80.2ns ± 2%      62.8ns ± 2%      -21.61%  (p=0.000 n=9+9)
GoMapClear/NonReflexive/10-16       92.2ns ± 2%      72.5ns ± 2%      -21.42%  (p=0.000 n=10+10)
GoMapClear/NonReflexive/100-16       216ns ± 2%       101ns ± 1%      -53.49%  (p=0.000 n=9+8)
GoMapClear/NonReflexive/1000-16     2.13µs ± 1%      0.34µs ± 2%      -84.02%  (p=0.000 n=10+9)
GoMapClear/NonReflexive/10000-16    16.4µs ± 1%       2.1µs ± 1%      -86.96%  (p=0.000 n=10+8)
MapStringConversion/32/simple-16    7.79ns ± 1%     12.22ns ± 1%      +56.92%  (p=0.000 n=9+9)
MapStringConversion/32/struct-16    7.72ns ± 1%     12.32ns ± 2%      +59.70%  (p=0.000 n=9+10)
MapStringConversion/32/array-16     7.72ns ± 3%     12.00ns ± 3%      +55.34%  (p=0.000 n=10+9)
MapStringConversion/64/simple-16    7.47ns ± 1%     11.77ns ± 3%      +57.65%  (p=0.000 n=9+10)
MapStringConversion/64/struct-16    7.49ns ± 1%     11.94ns ± 2%      +59.38%  (p=0.000 n=9+10)
MapStringConversion/64/array-16     7.50ns ± 1%     11.92ns ± 2%      +58.89%  (p=0.000 n=9+10)
MapInterfaceString-16               11.2ns ± 5%      13.7ns ±61%         ~     (p=1.000 n=8+10)
MapInterfacePtr-16                  10.8ns ±29%      11.3ns ±52%         ~     (p=0.684 n=10+10)
NewEmptyMapHintLessThan8-16         6.54ns ± 1%      6.04ns ± 1%       -7.56%  (p=0.000 n=9+9)
NewEmptyMapHintGreaterThan8-16       271ns ± 2%       268ns ± 1%         ~     (p=0.062 n=10+9)
MapPop100-16                        10.8µs ± 6%       7.2µs ±14%      -33.41%  (p=0.000 n=10+10)
MapPop1000-16                        170µs ± 1%       101µs ± 1%      -40.54%  (p=0.000 n=9+9)
MapPop10000-16                      3.35ms ± 3%      1.74ms ± 4%      -48.15%  (p=0.000 n=10+9)
MapAssign/Int32/256-16              10.1ns ± 4%       8.9ns ± 1%      -12.13%  (p=0.000 n=10+9)
MapAssign/Int32/65536-16            22.0ns ± 1%      12.9ns ± 1%      -41.11%  (p=0.000 n=10+8)
MapAssign/Int64/256-16              10.7ns ± 2%       8.6ns ± 2%      -19.79%  (p=0.000 n=8+9)
MapAssign/Int64/65536-16            23.2ns ± 2%      13.1ns ± 2%      -43.39%  (p=0.000 n=10+9)
MapAssign/Str/256-16                16.6ns ± 4%      13.1ns ± 3%      -21.18%  (p=0.000 n=10+9)
MapAssign/Str/65536-16              27.5ns ± 3%      20.5ns ± 3%      -25.66%  (p=0.000 n=10+9)
MapOperatorAssign/Int32/256-16      9.89ns ± 2%      8.69ns ± 3%      -12.15%  (p=0.000 n=9+9)
MapOperatorAssign/Int32/65536-16    21.9ns ± 1%      12.8ns ± 3%      -41.52%  (p=0.000 n=10+10)
MapOperatorAssign/Int64/256-16      10.9ns ± 3%       8.6ns ± 1%      -21.20%  (p=0.000 n=9+8)
MapOperatorAssign/Int64/65536-16    22.4ns ± 0%      13.2ns ± 2%      -41.15%  (p=0.000 n=7+9)
MapOperatorAssign/Str/256-16        1.43µs ± 2%      1.40µs ± 1%       -1.93%  (p=0.000 n=10+10)
MapOperatorAssign/Str/65536-16       234ns ± 3%       237ns ± 4%         ~     (p=0.138 n=10+10)
MapAppendAssign/Int32/256-16        21.0ns ± 2%      20.1ns ± 4%       -4.43%  (p=0.000 n=8+10)
MapAppendAssign/Int32/65536-16      38.5ns ± 2%      33.9ns ± 2%      -12.00%  (p=0.000 n=10+10)
MapAppendAssign/Int64/256-16        21.4ns ± 3%      20.4ns ± 6%       -4.26%  (p=0.008 n=9+10)
MapAppendAssign/Int64/65536-16      39.8ns ± 2%      34.9ns ± 1%      -12.40%  (p=0.000 n=10+8)
MapAppendAssign/Str/256-16          52.8ns ± 7%      52.1ns ± 5%         ~     (p=0.739 n=10+10)
MapAppendAssign/Str/65536-16        66.5ns ± 2%      67.7ns ± 1%       +1.94%  (p=0.004 n=9+9)
MapDelete/Int32/100-16              27.7ns ± 2%      17.8ns ± 2%      -35.77%  (p=0.000 n=10+10)
MapDelete/Int32/1000-16             22.6ns ± 2%      12.8ns ± 1%      -43.31%  (p=0.000 n=9+9)
MapDelete/Int32/10000-16            25.0ns ± 2%      15.2ns ± 1%      -39.33%  (p=0.000 n=7+10)
MapDelete/Int64/100-16              29.8ns ± 3%      17.6ns ± 2%      -40.98%  (p=0.000 n=10+9)
MapDelete/Int64/1000-16             23.3ns ± 2%      13.1ns ± 1%      -43.77%  (p=0.000 n=10+8)
MapDelete/Int64/10000-16            25.6ns ± 2%      16.1ns ± 1%      -37.13%  (p=0.000 n=9+9)
MapDelete/Str/100-16                38.4ns ± 4%      24.8ns ± 2%      -35.29%  (p=0.000 n=10+10)
MapDelete/Str/1000-16               30.8ns ± 2%      20.7ns ± 1%      -32.72%  (p=0.000 n=10+10)
MapDelete/Str/10000-16              35.8ns ± 2%      24.4ns ± 2%      -31.95%  (p=0.000 n=10+10)
MapDelete/Pointer/100-16            30.7ns ± 2%      21.0ns ± 1%      -31.69%  (p=0.000 n=10+10)
MapDelete/Pointer/1000-16           24.5ns ± 2%      16.6ns ± 1%      -32.37%  (p=0.000 n=10+10)
MapDelete/Pointer/10000-16          27.1ns ± 3%      19.8ns ± 2%      -26.87%  (p=0.000 n=10+10)

name                              old alloc/op   new alloc/op   delta
NewEmptyMap-16                    0.00B          0.00B          ~         (all equal)
NewSmallMap-16                    0.00B          0.00B          ~         (all equal)
MapPopulate/1-16                  0.00B          0.00B          ~         (all equal)
MapPopulate/10-16                 179B ± 0%      176B ± 0%      -1.68%    (p=0.000 n=10+10)
MapPopulate/100-16                3.35kB ± 0%    2.64kB ± 0%    -21.16%   (p=0.000 n=10+10)
MapPopulate/1000-16               53.3kB ± 0%    48.7kB ± 0%    -8.62%    (p=0.000 n=9+10)
MapPopulate/10000-16              428kB ± 0%     368kB ± 0%     -13.87%   (p=0.000 n=9+10)
MapPopulate/100000-16             3.62MB ± 0%    2.89MB ± 0%    -20.08%   (p=0.000 n=8+10)
MapStringConversion/32/simple-16  0.00B          0.00B          ~         (all equal)
MapStringConversion/32/struct-16  0.00B          0.00B          ~         (all equal)
MapStringConversion/32/array-16   0.00B          0.00B          ~         (all equal)
MapStringConversion/64/simple-16  0.00B          0.00B          ~         (all equal)
MapStringConversion/64/struct-16  0.00B          0.00B          ~         (all equal)
MapStringConversion/64/array-16   0.00B          0.00B          ~         (all equal)
NewEmptyMapHintLessThan8-16       0.00B          0.00B          ~         (all equal)
NewEmptyMapHintGreaterThan8-16    1.15kB ± 0%    1.15kB ± 0%    ~         (all equal)
MapAppendAssign/Int32/256-16      45.5B ± 3%     43.5B ±13%     ~         (p=0.291 n=8+10)
MapAppendAssign/Int32/65536-16    18.3B ± 4%     16.7B ± 4%     -8.74%    (p=0.000 n=10+10)
MapAppendAssign/Int64/256-16      42.8B ±12%     44.5B ±15%     ~         (p=0.466 n=10+10)
MapAppendAssign/Int64/65536-16    18.6B ± 8%     17.0B ± 0%     -8.60%    (p=0.000 n=10+9)
MapAppendAssign/Str/256-16        88.1B ± 4%     87.9B ± 4%     ~         (p=1.000 n=10+10)
MapAppendAssign/Str/65536-16      35.0B ± 0%     33.4B ± 4%     -4.57%    (p=0.000 n=6+10)

name                              old allocs/op  new allocs/op  delta
NewEmptyMap-16                    0.00           0.00           ~         (all equal)
NewSmallMap-16                    0.00           0.00           ~         (all equal)
MapPopulate/1-16                  0.00           0.00           ~         (all equal)
MapPopulate/10-16                 1.00 ± 0%      1.00 ± 0%      ~         (all equal)
MapPopulate/100-16                17.0 ± 0%      4.0 ± 0%       -76.47%   (p=0.000 n=10+10)
MapPopulate/1000-16               72.7 ± 1%      8.0 ± 0%       -89.00%   (p=0.000 n=10+10)
MapPopulate/10000-16              319 ± 0%       11 ± 0%        -96.55%   (p=0.002 n=8+10)
MapPopulate/100000-16             4.00k ± 0%     0.01k ± 0%     -99.65%   (p=0.000 n=10+10)
MapStringConversion/32/simple-16  0.00           0.00           ~         (all equal)
MapStringConversion/32/struct-16  0.00           0.00           ~         (all equal)
MapStringConversion/32/array-16   0.00           0.00           ~         (all equal)
MapStringConversion/64/simple-16  0.00           0.00           ~         (all equal)
MapStringConversion/64/struct-16  0.00           0.00           ~         (all equal)
MapStringConversion/64/array-16   0.00           0.00           ~         (all equal)
NewEmptyMapHintLessThan8-16       0.00           0.00           ~         (all equal)
NewEmptyMapHintGreaterThan8-16    1.00 ± 0%      1.00 ± 0%      ~         (all equal)
MapAppendAssign/Int32/256-16      0.00           0.00           ~         (all equal)
MapAppendAssign/Int32/65536-16    0.00           0.00           ~         (all equal)
MapAppendAssign/Int64/256-16      0.00           0.00           ~         (all equal)
MapAppendAssign/Int64/65536-16    0.00           0.00           ~         (all equal)
MapAppendAssign/Str/256-16        0.00           0.00           ~         (all equal)
MapAppendAssign/Str/65536-16      0.00           0.00           ~         (all equal)

Change https://go.dev/cl/426614 mentions this issue: runtime: use SwissTable

cc @golang/runtime

cc @randall77

Overall this sounds great. The main concern I have up front here is that growing happens all at once rather than incrementally. Could you explain more why that is and if it is possible to do so incrementally?

Do iterators get invalidated when the map is modified? This is common for open-addressing hash tables. An important feature of the current map is that you can modify the map while iterating and the iterator still works.

Do iterators get invalidated when the map is modified? This is common for open-addressing hash tables. An important feature of the current map is that you can modify the map while iterating and the iterator still works.

I checked the implementation and it does look like iterators stay valid when the map is modified.

The semantics of Go iterators basically mean you can't move items in the map data structure once you've placed them. So, for instance, you can't move a valid item to fill in a deleted one - you need to always use a tombstone. But other than that, iterators don't add much of a restriction to the implementation.
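For readers less familiar with the guarantee being discussed, here is a minimal sketch against today's built-in map. The function name `visitAndPrune` is made up for illustration. It relies only on the language rule that an entry removed before it is reached is never produced by the range loop.

```go
package main

import "fmt"

// visitAndPrune ranges over m and, on the first visit, deletes every other
// key. Entries removed before they are reached are never produced, so the
// outer loop body runs exactly once for a non-empty map.
func visitAndPrune(m map[int]string) (visited int) {
	for k := range m {
		visited++
		for other := range m {
			if other != k {
				delete(m, other) // deleting during iteration is allowed
			}
		}
	}
	return visited
}

func main() {
	m := map[int]string{1: "a", 2: "b", 3: "c", 4: "d"}
	fmt.Println(visitAndPrune(m), len(m)) // 1 1
}
```

Any replacement implementation has to keep this behavior, which is why a deleted slot cannot simply be back-filled with another live entry.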

This is really exciting. Thanks for doing all of this work.

The benchmarks from runtime.

There are many slowdowns in these benchmarks, some quite significant. Do you understand the source of these slowdowns? Is it something that can be fixed?

The previous implementation uses incremental rehash

This is an important property to maintain because it can significantly impact the tail latency of services. We've done a lot of work throughout the Go runtime over the years to avoid unpredictable performance spikes and it would be a shame to regress here.

I'm not personally familiar with the details of SwissTable. Is incremental rehashing something that can be done and just hasn't been implemented yet, or is it fundamentally difficult?

If it's fundamentally difficult to do directly in SwissTable, I've been kicking around the idea of adopting aspects of extendible hashing for a while, which could be a general solution to incremental resizing. I should probably just file an "idea" issue about this, but I'll try to lay it out here.

The idea is that, once a hash table grows beyond some size threshold based on bounding how long it takes to resize a map, you switch to a two-level scheme. I'm guessing that threshold would be around 4MiB, but that would have to be determined experimentally. In the two-level scheme, the top level is an index array keyed by the top k bits of the hash, which then points to map shards, which are individual SwissTables. The Wikipedia article I linked has some good diagrams for visualizing this. Each shard s contains only the items whose top j_s <= k bits are all identical, and multiple index entries will point to the same shard if that shard's j_s < k. When an individual shard overflows the threshold size, you split just that shard into two new shards s1 and s2 with j_s1 = j_s2 = j_s + 1, and update the index. If j_s1 > k, then you also double the index array in order to increase k by 1.

This would enable incremental growth and bound the resize cost paid by any single map operation. Iterators can continue to work while a map is resized by holding on to the pre-resize shard they are currently iterating (let the GC collect this once all iterators have moved past it). It would also reduce overall memory footprint because the total memory size of a map doesn't have to be a power of two (even if each individual shard is). It would result in less memory fragmentation caused by large maps because it limits the size of each individual allocation. And it would address problems with our current map resizing algorithm where, if a map is grown to a size that puts it into resizing mode, but then modifications stop and the application switches to only reading from the map, accesses are relatively slow and the map continues to consume additional memory (#51410).
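To make the two-level scheme concrete, here is a rough sketch under several assumptions that are not part of the proposal: each shard is a plain built-in map standing in for a SwissTable, the size threshold is a tiny element count (`shardLimit`) rather than a byte size, and all type and function names are hypothetical. It also assumes a well-distributed hash and does not guard against many keys sharing one full 64-bit hash. `maphash.String` needs Go 1.19 or later.

```go
package main

import (
	"fmt"
	"hash/maphash"
)

// shardLimit is a hypothetical, deliberately tiny stand-in for the size
// threshold (the comment above guesses roughly 4MiB for a real table).
const shardLimit = 4

// shard stands in for one SwissTable; a built-in map keeps the sketch short.
type shard struct {
	depth uint // per-shard bit count: keys here agree on their top depth bits
	items map[string]int
}

type table struct {
	seed  maphash.Seed
	k     uint     // the index is keyed by the top k hash bits
	index []*shard // len(index) == 1<<k; several entries may share a shard
}

func newTable() *table {
	return &table{
		seed:  maphash.MakeSeed(),
		index: []*shard{{items: map[string]int{}}},
	}
}

func (t *table) hash(key string) uint64 { return maphash.String(t.seed, key) }

// lookup returns the shard for hash h. With k == 0 the shift count is 64,
// which Go defines to yield 0 for unsigned operands.
func (t *table) lookup(h uint64) *shard { return t.index[h>>(64-t.k)] }

func (t *table) Get(key string) (int, bool) {
	v, ok := t.lookup(t.hash(key)).items[key]
	return v, ok
}

func (t *table) Put(key string, v int) {
	h := t.hash(key)
	s := t.lookup(h)
	s.items[key] = v
	// Only the overflowing shard is split; every other shard is untouched.
	for len(s.items) > shardLimit {
		t.split(s)
		s = t.lookup(h)
	}
}

func (t *table) split(s *shard) {
	if s.depth == t.k { // no spare index bit left: double the index, k++
		grown := make([]*shard, 2*len(t.index))
		for i, sh := range t.index {
			grown[2*i], grown[2*i+1] = sh, sh
		}
		t.index, t.k = grown, t.k+1
	}
	kids := [2]*shard{
		{depth: s.depth + 1, items: map[string]int{}},
		{depth: s.depth + 1, items: map[string]int{}},
	}
	for key, v := range s.items {
		bit := t.hash(key) >> (63 - s.depth) & 1 // first bit not yet shared
		kids[bit].items[key] = v
	}
	for i, sh := range t.index { // linear scan keeps the sketch simple
		if sh == s {
			t.index[i] = kids[uint(i)>>(t.k-s.depth-1)&1]
		}
	}
}

func main() {
	t := newTable()
	for i := 0; i < 100; i++ {
		t.Put(fmt.Sprint("key", i), i)
	}
	v, ok := t.Get("key42")
	fmt.Println(v, ok, len(t.index) == 1<<t.k) // 42 true true
}
```

The work done by any single `Put` is bounded by the size of one shard plus an index update, which is the property the scheme is after. Iterator handling and the choice of a real threshold are left out entirely.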

I placed this in the Unplanned milestone for now since it seems like there's still a bunch to discuss before landing it (it's non-committal). I assigned it to @zhangyunhao116 since you're driving the effort at the moment.

Thanks for your advice! @aclements

We hope to get more feedback from the community and the Go team.

Do you understand the source of these slowdowns? Is it something that can be fixed?

Several factors contribute to the slowdowns in these benchmarks (for the basic version):

  • Most benchmarks from the Go runtime run on small maps (containing very few elements). The original implementation has a marker called emptyRest that breaks out of the current loop as soon as it is encountered. SwissTable does not have this feature. In this case, the previous implementation is faster (the situation is usually reversed for larger maps). This affects the performance of BigKeyMap and MapStringConversion. In most cases, the more elements a map has, the better SwissTable performs.

  • The Go runtime's mapaccess benchmarks only test the hit case (MapFirst, MapMid, MapLast). SwissTable tends to perform better in the miss case.

  • SwissTable grows all at once, which means that in some cases performance degrades significantly. MapStringKeysEight inserts a total of 8 elements; with SwissTable the map now consists of two buckets, which defeats the single-bucket optimization in map_faststr.go. We can observe a similar case in MapFirst, where performance changes significantly when the number of buckets changes. Some of these problems may be solved by adding special-case optimizations, such as growing the capacity after the first bucket is full.

  • We left some optimizations out to keep the CL from containing too many changes, such as:

    • Store bucketmask instead of B, so that we don't need to compute the bucket mask every time. This saves about 0.3ns per operation, roughly a 10% improvement for some of the faster benchmarks.
    • Remove the overflow pointer in the bucket. It will help the overall performance and GC.
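The emptyRest early exit mentioned in the first bullet can be sketched as follows. This is a simplified model, not the runtime's code: the bucket holds only int keys, the constant values are modeled on the original runtime map, and the `useEmptyRest` flag and probe counter exist only to make the difference visible.

```go
package main

import "fmt"

const (
	emptyRest  = 0 // this slot and every later slot in the bucket are empty
	emptyOne   = 1 // this slot is empty, later ones may not be
	minTopHash = 5 // smallest tophash value used for a filled slot
	bucketCnt  = 8
)

type bucket struct {
	tophash [bucketCnt]uint8
	keys    [bucketCnt]int
}

// find returns the slot holding key (or -1) and the number of slots probed.
func (b *bucket) find(top uint8, key int, useEmptyRest bool) (slot, probes int) {
	for i := 0; i < bucketCnt; i++ {
		probes++
		if b.tophash[i] != top {
			if useEmptyRest && b.tophash[i] == emptyRest {
				break // nothing further to look at in this bucket
			}
			continue
		}
		if b.keys[i] == key {
			return i, probes
		}
	}
	return -1, probes
}

func main() {
	b := &bucket{} // zero value: every slot starts as emptyRest
	b.tophash[0], b.keys[0] = 7, 10
	b.tophash[1], b.keys[1] = 9, 20
	fmt.Println(b.find(9, 20, true))   // 1 2
	fmt.Println(b.find(11, 30, true))  // -1 3
	fmt.Println(b.find(11, 30, false)) // -1 8
}
```

With only two of eight slots filled, the early exit stops a failed lookup after three probes instead of eight, which is the small-map effect described above.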

Is incremental rehashing something that can be done and just hasn't been implemented yet, or is it fundamentally difficult?

I think it is fundamentally difficult, since a single empty slot in SwissTable can break the search process (https://faultlore.com/blah/hashbrown-tldr/#deletion).
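A toy illustration of that failure mode, assuming plain linear probing over single slots rather than SwissTable's group probing; the types, the `home` hash, and the `remove` helper with its `mark` parameter are all invented for this sketch. Inserts assume non-negative keys that are not yet present and a table that is never full.

```go
package main

import "fmt"

const (
	empty   = iota // never used: a probe may stop here
	deleted        // tombstone: a probe must continue past it
	full
	size = 8
)

type table struct {
	ctrl [size]uint8
	keys [size]int
}

// home is a toy hash that makes collisions easy to force.
func home(key int) int { return key % size }

func (t *table) insert(key int) {
	for i := home(key); ; i = (i + 1) % size {
		if t.ctrl[i] != full {
			t.ctrl[i], t.keys[i] = full, key
			return
		}
	}
}

func (t *table) find(key int) int {
	for i, n := home(key), 0; n < size; i, n = (i+1)%size, n+1 {
		switch {
		case t.ctrl[i] == empty:
			return -1 // an empty slot ends the probe sequence
		case t.ctrl[i] == full && t.keys[i] == key:
			return i
		}
	}
	return -1
}

// remove marks the key's slot with mark. Passing empty instead of deleted
// reproduces the broken search.
func (t *table) remove(key int, mark uint8) {
	if i := t.find(key); i >= 0 {
		t.ctrl[i] = mark
	}
}

func main() {
	var broken, good table
	for _, k := range []int{1, 9, 17} { // all three share home slot 1
		broken.insert(k)
		good.insert(k)
	}
	broken.remove(9, empty)
	good.remove(9, deleted)
	fmt.Println(broken.find(17), good.find(17)) // -1 3
}
```

Key 17 is still stored in slot 3 of both tables, but the table that cleared slot 2 to empty can no longer reach it.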

If it's fundamentally difficult to do directly in SwissTable, I've been kicking around the idea of adopting aspects of extendible hashing for a while, which could be a general solution to incremental resizing. I should probably just file an "idea" issue about this, but I'll try to lay it out here.

That sounds like a great idea! I'll think about this idea carefully later.
We plan to internally deploy this change to some latency-sensitive services to observe the impact. If there is a significant impact on the latency, we will also consider using different ways to solve it if possible.

I like to use small maps for many purposes, so it worries me a bit that SwissTable seems to perform worse for them. Maybe a special case is needed for small maps?

Maybe a special case is needed for small maps?

Yes, but it has to be considered carefully (optimizing for small maps may hurt performance in other situations).

SwissTable does not always perform worse for small maps, as the benchmarks show (e.g., in most key-miss scenarios, performance improves instead). Could you please tell me the cases where you often use small maps?

Some of the slowdowns in the benchmarks can be fixed. PatchSet 10 should fix the slowdown of Pop in the runtime benchmark. I will update the benchmark result later :)

Well, I often use small maps instead of case statements, or for mapping a few input file names to processors, or for lookups such as os.Substitute can do. I am looking forward to the improved results.

Thanks for the detailed breakdown of the performance losses! My take is that there are two things that need to happen on that front:

  • Work on improving the small map case. I'm okay with some slowdown here, but I also think these are common enough that it's worth some effort to improve.
  • Optionally, improve the runtime's map benchmarks. It seems like we just don't have great benchmarks. :) We should continue to cover the small map case, though. If I'm interpreting the names of the benchmarks from your suite correctly, it looks like you are covering small maps.

You mentioned that SwissTable is better for the "miss" case than the "hit" case. If I understand your layout correctly (I haven't looked closely), you're already packing the tophash and the data together, so you'd expect one cache miss for a hit. Is that right? Does the SIMD version of SwissTable level the field between the hit and miss cases at all?

Stepping back, I think there are three high-level potential concerns to address with this:

Incremental resizing. We've already been discussing this above. :)

DoS prevention. Currently, Go maps have fairly strong denial-of-service properties. The main concern is if an attacker can control the placement of items to induce collisions, leading to n^2 behavior. Having a good, per-map seed that is factored into the hash computation (not just combined at the end) goes a long way toward preventing this. Beyond this, the concern becomes leaking too much information about long-lived maps such that an attacker could reconstruct the seed. Having non-deterministic iteration order helps a lot here. A third problem is that copying elements one-by-one from one map into another, newly allocated map can easily go n^2, but having a per-map seed also mitigates that. I think all of these mechanisms are still in place in your implementation, but wanted to confirm that.
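The difference between a seed that is factored into the hash computation and one that is merely combined at the end can be shown with a deliberately weak hash. In this sketch `weakHash` (a Java-style polynomial hash) and `seedAtEnd` are illustrative names, not anything from the runtime; `hash/maphash` is used only as an example of a seeded hash from the standard library (`maphash.String` needs Go 1.19 or later).

```go
package main

import (
	"fmt"
	"hash/maphash"
)

// weakHash is a polynomial hash chosen only because collisions are easy to
// construct for it: "Aa" and "BB" both hash to 2112.
func weakHash(s string) uint64 {
	var h uint64
	for i := 0; i < len(s); i++ {
		h = h*31 + uint64(s[i])
	}
	return h
}

// seedAtEnd combines the seed only after hashing, so two keys that collide
// under weakHash keep colliding under every seed.
func seedAtEnd(seed uint64, s string) uint64 { return weakHash(s) ^ seed }

func main() {
	for _, seed := range []uint64{1, 2, 3} {
		fmt.Println(seedAtEnd(seed, "Aa") == seedAtEnd(seed, "BB")) // true
	}
	// With a seed that is part of the computation, the same key hashes
	// differently per seed, so a precomputed collision set is of no use.
	s1, s2 := maphash.MakeSeed(), maphash.MakeSeed()
	fmt.Println(maphash.String(s1, "Aa"), maphash.String(s2, "Aa"))
}
```

The last line prints two values that vary from run to run, so no expected output is given for it.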

SIMD, specifically how to actually use SIMD instructions in the map implementation. The most straightforward approach is to write the SIMD bits in assembly, but that has significant maintainability problems and may even perform worse because of the cost of transitioning to and from assembly. So, that suggests adding compiler intrinsics. General SIMD intrinsics are a wide-open area right now and I don't think we want to block the map implementation on solving that problem. :) A lot of proposals have floated around (admittedly, some of which have not been published). We could start trying them out just in the runtime. Or, if I understand correctly, SwissTable doesn't need very many or any particularly fancy SIMD operations, so maybe we just hack in some intrinsics (which we could use as a starting point for more general exploration).
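As a rough indication of how small the required operation set is, the central step (compare every control byte in a group against one tophash byte and get back a bitmask of matches) can even be written without vector instructions. The sketch below is a portable SWAR (SIMD within a register) stand-in over 8 control bytes packed into a uint64. It is not the SSE2 code path discussed here and is not taken from the CL; `groupOf`, `matchByte`, and `slots` are names chosen for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

const (
	lsb  = 0x0101010101010101 // 0x01 in every byte
	low7 = 0x7f7f7f7f7f7f7f7f // low seven bits of every byte
)

// groupOf packs 8 control bytes into one word, slot 0 in the lowest byte.
func groupOf(ctrl [8]byte) uint64 { return binary.LittleEndian.Uint64(ctrl[:]) }

// matchByte returns a mask with the top bit of byte i set exactly when
// control byte i equals h.
func matchByte(group uint64, h uint8) uint64 {
	x := group ^ (lsb * uint64(h)) // bytes equal to h become zero
	// Exact zero-byte test: the per-byte addition cannot carry into the
	// next byte, so there are no false positives.
	y := ((x & low7) + low7) | x | low7
	return ^y
}

// slots expands a match mask into slot indexes, lowest first.
func slots(mask uint64) []int {
	var out []int
	for ; mask != 0; mask &= mask - 1 {
		out = append(out, bits.TrailingZeros64(mask)/8)
	}
	return out
}

func main() {
	ctrl := [8]byte{0x12, 0x34, 0x12, 0x00, 0x7f, 0x13, 0x92, 0x12}
	fmt.Println(slots(matchByte(groupOf(ctrl), 0x12))) // [0 2 7]
}
```

An SSE2 version would do the same for 16 bytes with a byte-wise compare followed by a move-mask, which is the kind of one-off intrinsic being considered.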

CAFxX commented

The main concern is if an attacker can control the placement of items to induce collisions, leading to n^2 behavior. Having a good, per-map seed that is factored into the hash computation (not just combined at the end) goes a long way toward preventing this. Beyond this, the concern becomes leaking too much information about long-lived maps such that an attacker could reconstruct the seed. Having non-deterministic iteration order helps a lot here. A third problem is that copying elements one-by-one from one map into another, newly allocated map can easily go n^2, but having a per-map seed also mitigates that. I think all of these mechanisms are still in place in your implementation, but wanted to confirm that.

Another option that would help alleviate most n^2 concerns is having a per-map seed that changes when the map grows/shrinks.

a per-map seed that changes when the map grows/shrinks.

This is challenging with incremental map growth. FWIW, I believe we do opportunistically reset the seed when the length hits zero.

Thanks for all of the feedback :)

Work on improving the small map case.

PatchSet 11 adds the bucketmask optimization; the benchmarks have improved in almost all cases (I have updated the benchmark results).
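For context, the bucketmask change amounts to the difference between the two lookups below. This is a simplified sketch; the struct and method names are made up and the real map header has many more fields.

```go
package main

import "fmt"

// headerB models the old layout: only log2 of the bucket count is stored,
// so the mask is recomputed on every access.
type headerB struct{ B uint8 }

func (h *headerB) bucketIndex(hash uintptr) uintptr {
	return hash & (uintptr(1)<<h.B - 1)
}

// headerMask stores the mask itself, removing the shift and subtract from
// the hot path.
type headerMask struct{ bucketMask uintptr }

func (h *headerMask) bucketIndex(hash uintptr) uintptr {
	return hash & h.bucketMask
}

func main() {
	old := &headerB{B: 3}               // 8 buckets
	cur := &headerMask{bucketMask: 7}   // same table, mask precomputed
	fmt.Println(old.bucketIndex(0x2d), cur.bucketIndex(0x2d)) // 5 5
}
```

Both forms select the same bucket; the second simply moves the mask computation to the points where the bucket count changes.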

There are still two optimizations that can be added:

  • Using a two-level scheme for map_fast32 and map_fast64. The idea is that we can iterate over the bucket instead of using matchTopHash when iterating performs better. The threshold is currently 2, meaning that if 0 <= buckets <= 2, we iterate over the bucket (buckets == 0 is a special case where we don't need to calculate the hash). For buckets > 2, we use the matchTopHash scheme. We can tune the threshold experimentally, and uint32 and uint64 can have different thresholds.

  • Remove the overflow pointer in the bucket. I have encountered some problems due to memory overlap, which is still a work in progress. It has been added in PatchSet 12.

Improving the runtime's map benchmarks.

Indeed. We can add some new benchmarks to it. I feel like adding the new one into https://github.com/zhangyunhao116/gomapbench first and then migrating some of it into the runtime is a good idea.

DoS prevention.

Yes! All of these mechanisms are still in the current implementation, including the non-deterministic iteration and resetting the map's seed when the length hits zero.

SIMD

We planned to write the SIMD directly in assembly, but experiments outside the runtime show that the performance is almost the same. I wonder whether using ABIInternal would make the assembly run faster?

I think SIMD intrinsics are a great idea! I agree that we can start trying SIMD intrinsics just in the runtime. After some basic optimizations are done, I will start to implement the SIMD version :) Most of the SIMD in SwissTable is not very tricky except for prepareSameSizeGrow, and this function is rarely called (it is only used in sameSizeGrow).

Another challenge with SIMD is that its default bucket size is 16 (the basic version uses 8, the same as before), which means memory consumption may increase significantly in some cases.

Indeed. We can add some new benchmarks to it. I feel like adding the new one into https://github.com/zhangyunhao116/gomapbench first and then migrating some of it into the runtime is a good idea.

Sounds good. Possibly we should fold your package into bent, so they run as part of the performance dashboard.

Yes! All of these mechanisms are still in the current implementation, including the non-deterministic iteration and resetting the map's seed when the length hits zero.

Perfect!

We planned to write the SIMD directly in assembly, but experiments outside the runtime show that the performance is almost the same. I wonder whether using ABIInternal would make the assembly run faster?

Yeah, this doesn't surprise me if you're jumping to assembly just to run a few instructions. ABIInternal would certainly improve that, since it would probably allow the entire call to be registerized. I'm not sure exactly what operations you need, but if they're taking or returning vector values there's still going to be some overhead since ABIInternal doesn't know anything about passing vector registers.

I'm not sure how much effort it is to add a few one-off SIMD intrinsics. You just need SSE, right, not AVX? The operations themselves are probably not hard to add. The register allocator already knows about the XMM registers, but only as non-vector 64-bit floating-point. That mostly shouldn't matter unless it decides to spill one (thanks @cherrymui for pointing that out). If you need AVX, things may get a lot more complicated, or maybe not since the YMM registers are just aliased onto the XMM registers, so we would just call them X registers anyway.

Another challenge with SIMD is that its default bucket size is 16 (the basic version uses 8, the same as before), which means memory consumption may increase significantly in some cases.

It's hard to say how concerning this is. Sure, if an application has a huge number of small maps, perhaps with large key or value types (though they can't go over 128 bytes each), this could add up to a lot. But the relative overhead of this decreases as map size increases. I'll see if we can get some data on this.

Are you considering the trick described in this video, where you need to allocate 16 bytes of control (or a little more for unaligned groups) but the actual bucket array can be smaller? The net result of that might actually be that the tables would be smaller than current tables.

This is exciting, but before going too far down the optimization path, it seems to me it’d be good to figure out incremental growth, as that could be a dealbreaker.

I asked @mcy for useful references to learn more about SwissTables and he pointed me to these:

I totally agree with @josharian . Making sure there's a workable incremental growth story is higher priority than SIMD optimizations.

or a little more for unaligned groups

I have not had a chance to look closely at your code yet, but I just realized you mentioned interleaving the control bytes and the values, so you must be using aligned groups. De-interleaving them may be worth exploring, but again incremental growth is more important to figure out.

Hi there, this is indeed quite exciting!

Two other references I found useful:

  • This thread here, where the authors of Swisstable compare notes with the authors of the related Facebook Folly F14.
  • A related C++ hashtable-benchmarks repo comparing Swisstable, F14, and some others.
    • The numbers are somewhat artificially improved because it bypasses the hashing of values to focus on the performance of the different containers.

(My possibly incorrect understanding of the history is the Facebook team was inspired by the 2017 CppCon Swisstable talk, but before any Swisstable code was released to the public, the Facebook team implemented their own version of the core idea along with some different tradeoffs. One difference of interest might be the recommended F14FastMap variant that based on entry size chooses between values inline vs. values packed in a contiguous array).

Also, FWIW, I had an older rough Swisstable implemented outside the runtime that used SSE for the 16-way probing, but which did not support incremental growth.

Inspired by this issue here, I dusted it off recently and I am attempting to add incremental growth. I have a sketch of how it could work, but it is a bit subtle, so before adding too much noise here, I am trying to implement at least a first cut of incremental growth. I will hopefully post more details here within a few days (which might be "whoops, sorry, flawed idea").

Finally, just to be clear, what I am doing is a simple prototype. I don't want to derail any work on the current CLs, and I would be more than happy for the talented team at ByteDance to be the ones to make something real in the runtime. ;-)

You just need SSE, right, not AVX?

Yes, we actually only need SSE2. AVX2 (including SSE3) isn't faster than SSE2 in our tests, but this is platform-dependent; the results could differ on other CPUs.

Are you considering the trick described in this video, where you need to allocate 16 bytes of control (or a little more for unaligned groups) but the actual bucket array can be smaller?

We don't need this trick yet :) The current implementation still puts the tophash array and the key/value pairs in the same bucket, as before. If we can move all the tophash bytes into a separate field in the future (the plan in the TODOs), I'm happy to add this optimization.

Making sure there's a workable incremental growth story is higher priority than SIMD optimizations.

Agree! I'm benchmarking the SwissTable to see the impact of the growing mode now. The main concern I have for the incremental growth is its overhead, we need a fast algorithm. I will try to implement it later :)

Hi there. A quick update is that I did change my older swisstable implementation at thepudds/swisstable to do incremental growth without invalidating iterators. It hopefully preserves the semantics of the runtime map.

There may still be bugs of course, but there are some tests, along with some fun fuzzing that creates coverage-guided permutations of calling sequences & arguments (via fzgen.Chain) in a self-validating map that attempts to track for example whether or not the same key is allowed to be emitted twice during a range operation based on the history of that key (e.g., a sample interesting sequence generated by the fuzzer might be: populate a map, add key X, start iteration, add key Y to trigger growth, delete X, then re-add X, finish iteration).

The performance results seem promising so far. I will post some benchmark results shortly.

Overall, I suspect some flavor of swisstable implementation in the runtime would be a nice win.

The short version of the iterator approach is:

  • Evacuation status is maintained during growth in a separate growth status slice. (This growth status slice uses memory, but it is not per-element but rather per-group info, and it uses less than the memory overhead of the overflow buckets used by the current runtime map, even for small keys).
  • Iterators hold on to references to the current table and an immutable-during-growth old table. (The runtime iterators also hold on to references to tables, but the runtime's old buckets are not immutable).
  • Iterators walk both the old and current tables, with de-duplication to avoid emitting the same key twice and checking the live tables when needed to emit the golden data. It has some logic to avoid some hashing while doing this, and I think it does less overall hashing during mid-move iteration than the runtime map iterator (but need to confirm the hashing frequency vs. the runtime a bit more).

There are almost certainly better approaches, but this is one example of swisstables respecting the current Go map semantics with incremental resize.

Here are some benchmark results for thepudds/swisstable. These are all for int64 keys/values. The sub-benchmark names are the count of elements in a map. Fill factor is lower than the C++ implementation to match the runtime map fill factor, mainly to make comparisons easier.

Benchmark Results
old is the runtime map, new is this swisstable implementation.
-
name                      old time/op    new time/op    delta
FillGrow/664-4            63.3µs ± 1%    58.9µs ± 2%   -6.87%  (p=0.000 n=20+20)
FillGrow/999-4             112µs ± 3%     105µs ± 2%   -5.77%  (p=0.000 n=20+20)
FillGrow/681575-4          141ms ± 4%     111ms ± 3%  -21.37%  (p=0.000 n=20+19)
FillGrow/1022362-4         262ms ± 4%     207ms ± 4%  -20.95%  (p=0.000 n=20+20)
FillGrow/5452596-4         1.51s ± 5%     1.24s ± 4%  -17.81%  (p=0.000 n=18+19)
FillGrow/8178894-4         2.72s ± 3%     2.22s ± 2%  -18.54%  (p=0.000 n=20+18)
FillPresize/664-4         29.3µs ± 2%    22.5µs ± 3%  -23.22%  (p=0.000 n=20+20)
FillPresize/999-4         45.0µs ± 4%    36.8µs ± 4%  -18.12%  (p=0.000 n=20+20)
FillPresize/681575-4      80.6ms ± 7%    66.1ms ± 7%  -17.93%  (p=0.000 n=20+20)
FillPresize/1022362-4      117ms ± 5%     122ms ± 6%   +4.54%  (p=0.000 n=20+20)
FillPresize/5452596-4      761ms ±15%     866ms ± 3%  +13.80%  (p=0.000 n=19+20)
FillPresize/8178894-4      1.12s ± 7%     1.43s ± 4%  +27.42%  (p=0.000 n=20+20)
GetHitHot/664-4           20.7µs ±15%    16.3µs ±12%  -21.28%  (p=0.000 n=20+18)
GetHitHot/999-4           19.5µs ±15%    16.5µs ± 9%  -15.62%  (p=0.000 n=20+19)
GetHitHot/681575-4        21.5µs ± 8%    17.8µs ±11%  -17.44%  (p=0.000 n=20+20)
GetHitHot/1022362-4       19.8µs ±11%    16.7µs ±11%  -15.32%  (p=0.000 n=19+19)
GetHitHot/5452596-4       21.4µs ±13%    17.0µs ±10%  -20.66%  (p=0.000 n=20+19)
GetHitHot/8178894-4       19.8µs ±17%    17.0µs ± 9%  -14.04%  (p=0.000 n=20+20)
GetMissHot/664-4          17.2µs ±23%    21.6µs ±19%  +25.54%  (p=0.000 n=20+20)
GetMissHot/999-4          15.7µs ±12%    20.2µs ±13%  +28.61%  (p=0.000 n=20+20)
GetMissHot/681575-4       17.7µs ±20%    22.6µs ±23%  +27.68%  (p=0.000 n=20+20)
GetMissHot/1022362-4      15.5µs ± 5%    19.6µs ±10%  +26.30%  (p=0.000 n=20+18)
GetMissHot/5452596-4      17.8µs ±12%    20.9µs ±19%  +17.80%  (p=0.000 n=20+19)
GetMissHot/8178894-4      15.3µs ± 6%    19.8µs ±11%  +28.92%  (p=0.000 n=19+20)
GetAllStartCold/664-4      1.10s ± 1%     0.87s ± 0%  -20.85%  (p=0.000 n=10+8)
GetAllStartCold/999-4      1.15s ± 1%     0.91s ± 1%  -20.34%  (p=0.000 n=10+10)
GetAllStartCold/681575-4   3.22s ± 3%     2.96s ± 1%   -8.04%  (p=0.000 n=9+10)
GetAllStartCold/1022362-4  3.25s ± 1%     3.44s ± 4%   +5.75%  (p=0.000 n=10+10)
GetAllStartCold/5452596-4  4.50s ± 2%     5.90s ± 3%  +31.27%  (p=0.000 n=10+10)
GetAllStartCold/8178894-4  4.77s ± 1%     6.82s ± 3%  +43.15%  (p=0.000 n=9+10)

Some comments on the results:

  • GetAllStartCold difference for larger maps might be greater cold penalty for swisstable's separate control bytes compared to the runtime map layout.
  • GetMissHot result here I think is probably not because of probing at these particular sizes (though that might be a factor at different sizes). Instead, it might be at least in part due to two transitions between Go/asm for a miss -- first transition to look for a match, then second transition to look for the presence of EMPTY slots. A typical hit in contrast only does the first Go/asm transition. (One improvement could be to have a version of the ASM that does both operations in a single transition).

Those seem promising, especially since this is not a fully optimized implementation (for example, there is no real attempt to eliminate bounds and shift checks, the Go/asm interface is heavier weight than it needs to be, there is some code reuse for convenience that causes some unneeded work, and so on).

This is also comparing thepudds/swisstable to mapaccess2_fast64, which takes some shortcuts (e.g., in some cases skipping comparing topHash if comparing the values is faster).

On the other hand, these are only preliminary results, and this implementation ignores some corner cases like NaNs and -0.0 vs +0.0, which at least has the potential to slow things down.

This next benchmark replicates #51410 (though with a larger table to make the pattern clearer), and is the only benchmark here with strings. This is a case of a map getting "stuck" mid-growth, which then slows down iteration. Note that the runtime map and this swisstable implementation have opposite best/worst cases for the percentage of the map in mid-growth evacuation, which are called out in the comments.

Mid-move Iterator Benchmark Results
name                     time/op
RangeString_Swiss/414    2.80µs ± 4%
RangeString_Swiss/415    2.84µs ± 3%
RangeString_Swiss/416    2.82µs ± 4%
RangeString_Swiss/417    5.06µs ± 5% // best mid-move evac % for this implementation
RangeString_Swiss/418    5.61µs ± 6%
RangeString_Swiss/419    6.79µs ± 2%
[...]                  
RangeString_Swiss/445    28.4µs ± 3%
RangeString_Swiss/446    29.2µs ± 2%
RangeString_Swiss/447    30.3µs ± 3% // worst mid-move evac % for this implementation
RangeString_Swiss/448    4.30µs ± 1%
RangeString_Swiss/449    4.27µs ± 1%
-
name                     time/op
RangeString_Std/414      5.77µs ± 3%
RangeString_Std/415      5.76µs ± 4%
RangeString_Std/416      5.86µs ± 6%
RangeString_Std/417      36.6µs ± 2%  // worst mid-move evac % for stdlib map
RangeString_Std/418      35.8µs ± 3%
RangeString_Std/419      34.7µs ± 2%
RangeString_Std/420      34.0µs ± 2%
RangeString_Std/421      33.2µs ± 2%
[...]

For those mid-move iteration results, one caveat is that the current iteration benchmark might not yet be a fair apples-to-apples comparison; in particular, I need to tweak the swisstable calling sequence for iteration to better match the runtime map approach.

Next steps include doing a sweep over a finer-grained set of element counts, as well as smaller and larger table sizes. I think it naturally supports the memory optimization from the later C++ swisstable for small table sizes, but I have not attempted that yet.

I had a few different sketches of how incremental growth could work, but this was a less complex approach compared to some of the other sketches. Part of the reason for this approach is also that aspirationally I'd like to try a simple or even simplistic "throw some atomics" at it, including because part of the appeal of doing growth work when reading is the nice attribute that growth penalties are then short-lived or not forever, which might open up some tradeoffs on the growth approach.

@thepudds Thanks! The github.com/thepudds/swisstable is a nice approach! Agree that we need a simple way to implement it.

I'm considering using a mechanism similar to the original hashmap(also like the thepudds/swisstable, but convert the fixedTable to buckets, for small map, it requires less memory, since we don't need extra headers for each map), do the incremental rehash if we need to resize the hashmap. It looks like a workable plan, and the simplest way for now.

We need to modify some features of the SwissTable:

  1. We need a special sentinel to represent an evacuated bucket. The SwissTable C++ implementation's kSentinel looks like a good example.

There may be some differences compared to the original hashmap:

  1. SwissTable evacuates a bucket chain instead of one bucket with its overflow buckets. For SwissTable, say we have an item that should be in bucket A; in fact, it could be in any bucket on the probe sequence starting from bucket A, up to the first empty slot. In most cases, the length of the bucket chain is less than 2, and most often it is 1.

Here is a simple benchmark: SwissTable(without incremental rehash) VS hashmap(Go1.19)
(It makes me think that incremental rehash is a necessary mechanism for some latency-sensitive servers.)

swisstable-v1-access
swisstable-v1-insert

After some explorations, I think the mechanism used by the original hashmap is not a workable plan for SwissTable. The key problem is that, in the original hashmap, we can use evacuatedX and evacuatedY to make sure that we iterate an item only once if the hash isn't repeatable(it is based on the rule that an old bucket expands to 2 buckets in the new buckets).
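To make the evacuatedX/evacuatedY rule concrete, here is a minimal sketch of the index arithmetic, assuming buckets are selected by the low bits of the hash and the table doubles on growth. The function name `oldAndNew` is made up for illustration and is not a runtime identifier.

```go
package main

import "fmt"

// oldAndNew returns the bucket index of hash before and after the table
// doubles from oldBuckets to 2*oldBuckets buckets. oldBuckets must be a
// power of two. (Hypothetical helper, for illustration only.)
func oldAndNew(hash, oldBuckets uint64) (oldIdx, newIdx uint64) {
	oldIdx = hash & (oldBuckets - 1)
	newIdx = hash & (2*oldBuckets - 1)
	return oldIdx, newIdx
}

func main() {
	const oldBuckets = 8
	for _, h := range []uint64{0x13, 0x1b, 0x2b} {
		o, n := oldAndNew(h, oldBuckets)
		// n is either o (the "X" half) or o+oldBuckets (the "Y" half).
		fmt.Printf("hash %#x: old bucket %d -> new bucket %d\n", h, o, n)
	}
}
```

Because the new index differs from the old one only in a single extra hash bit, an iterator that knows which old bucket it is standing on knows the only two new buckets its entries can live in.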

For SwissTable, we can't ensure that an old bucket expands to 2 buckets during growth. In fact, with triangular probing an item can go to any slot in the new buckets. This breaks the compatibility.
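As a rough illustration of why an item can end up anywhere, the sketch below generates a triangular probe sequence over a power-of-two number of groups. It is a simplified model (it steps in whole groups, and `probeSeq` is a made-up name), not the code from any of the CLs discussed here.

```go
package main

import "fmt"

// probeSeq returns the groups visited by triangular probing in a table
// with numGroups groups (must be a power of two), starting from the
// hash's home group. The cumulative offsets are 1, 3, 6, 10, ...
func probeSeq(hash, numGroups uint64) []uint64 {
	mask := numGroups - 1
	pos := hash & mask
	seq := make([]uint64, 0, numGroups)
	for step := uint64(1); step <= numGroups; step++ {
		seq = append(seq, pos)
		pos = (pos + step) & mask
	}
	return seq
}

func main() {
	fmt.Println(probeSeq(5, 8)) // prints [5 6 0 3 7 4 2 1]
}
```

With a power-of-two group count the sequence visits every group exactly once, so an insert always finds a free slot if one exists, but the slot it finds has no fixed relationship to the item's position in the old table.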

I will try other approaches, extendible hashing is a good plan, but it still requires some time to verify and implement :)

I will try other approaches, extendible hashing is a good plan, but it still requires some time to verify and implement :)

Hi @zhangyunhao116, I'm curious if you have had time to work on this? As of November it sounded like you were thinking about going a different direction than the initial SwissTable CL you had sent, including due to the constraints of the current runtime map around incremental resizing and iterator invalidation and so on.

I'm tempted to try to put together a CL following a couple of the prototype approaches from thepudds/swisstable. I might see a plausible path to success, including respecting the current constraints of the runtime map while still getting good performance.

However, I don't want to duplicate work or step on any toes if you are still working on this or were planning on returning to it soon?

And of course, I'm not 100% sure of success, including maybe the performance ends up not being compelling enough, and I need to flesh things out more including probably changing the layout and more extensive performance testing and so on, but I've been holding off spending time on it in part because I wanted to see what happened here, as well as to explicitly check what you thought about me trying something.

Finally, thanks again for moving this forward!!

Hi @thepudds , feel free to send a CL if you have a workable idea on this :)

While working toward implementing this feature, I'm still weighing whether it is necessary, because it will make the SwissTable more complicated and its correctness harder to verify. And I'm still not sure whether some real-world services require this feature (or, in some rare cases, whether using SIMD can solve the problem?).

These days I'm trying to use the SwissTable in our company services. The SwissTable implementation works well for simple situations like binary tools and RPC servers. The main task, for me now, is to verify its correctness, performance improvement, and corner cases. The extra abstraction from the incremental growth will probably cause performance degradation, especially for the small map.

We still need to collect more feedback from the service owners (from other teams in our company) to see the impact of immediately growing and whether or not some services need this feature.

If we think this feature is necessary for some of our internal services, I will continue to work on it. The feedback collection stage may last about 1 ~ 2 months, and I will update some information on this issue :)

Hey! I had a use case of inserting millions of key-value pairs into a hashmap with only a relatively small amount of look-ups. I had prior experience with extendible hashing and so I implemented a new extendible hashing hashmap tuned for insertion speed in huge maps. I'm not well studied on hashmaps so beware - I basically just winged it and somehow I feel that there's a fatal flaw somewhere. My software and tests work just fine so I dunno.

Inserting 100 * 1000 * 1000 [int64, int64] key-value pairs to it takes ~13s while for the built-in map it takes ~19s. Also the built-in map uses 2x the amount of physical memory. That particular implementation I specialized to my use case so all hashmaps had a fixed amount of 64 buckets and each bucket had 8 key-value pairs.

Motivated by the insertion performance and the reduced memory overhead I modified it to use dynamic amount of buckets so that small maps don't waste memory. See this gist for that hastily modified version of my original implementation. Note that this dynamic version is almost 2x slower for my use case when it comes to inserting and look-ups. Also note that only Put and Get are implemented in a mostly proper manner. No random iteration order, no shrinking after Delete, etc etc etc. In my gist implementation Put has competitive performance against the built-in map but Get and Delete fail somewhat.

A few interesting aspects about it that I want to point out:

  • By storing additional 8 bits of the hash value (triehash8) per key-value pair we get to grow the size of the map 8 times without having to recalculate the full hash for the keys. See the split() function for that where the 8 bits are used to split the key-value pairs correctly to the two maps. These 8 bits can also be used during probing to reduce false positives when matching hashes. Very useful property when your keys are long strings.
  • In my original implementation all hashmaps had 64 buckets each. In the gist version I start with 1 bucket and after a few splits I start doubling the amount of buckets per hashmap until they grow up to 64 buckets. 64 buckets because it got me the best insert performance. So when inserting the 9th key-value pair a new hashmap with 1 bucket is allocated and ~half of the key-value pairs from the first map are moved to it. In total there's 8+8 slots between the two maps and so we got our doubling in size.
  • Growing produces hardly any garbage for the GC to collect if implemented properly. If the hashmap that is being split is reused then there is nothing left behind to collect (if we don't allocate more buckets for the hashmap). Gist does not do this. Iterating could mess up this plan though.
  • All around interesting behaviour when it comes to CPU caches. Growth doesn't pollute the cache with the totality of the old and new map, lookup has some additional cache-miss spots, etc etc.
  • The hashmap can be limited to mostly make only ~smallish allocations and I would assume that it would help the allocator/GC in mysterious ways when all of the hashmaps in the software makes allocations of roughly the same size. The BenchmarkShortUse in main_test.go kind of tries to showcase this and it has really interesting results when multiple CPU cores are available to the Go runtime (works on my machine).
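The first bullet above (splitting on saved hash bits) could look roughly like the following. This is only a guess at the shape of the idea, not the gist's actual code: the `entry` type, the `trie8` field name, and this `split` signature are all assumptions made for illustration.

```go
package main

import "fmt"

// entry is a hypothetical slot: the key/value plus 8 saved hash bits.
type entry struct {
	key, value int64
	trie8      uint8
}

// split partitions entries on bit depth (0..7) of the saved hash bits,
// so no key has to be rehashed for the first 8 splits.
func split(entries []entry, depth uint) (zero, one []entry) {
	for _, e := range entries {
		if (e.trie8>>depth)&1 == 0 {
			zero = append(zero, e)
		} else {
			one = append(one, e)
		}
	}
	return zero, one
}

func main() {
	entries := []entry{
		{key: 1, trie8: 0b00},
		{key: 2, trie8: 0b01},
		{key: 3, trie8: 0b10},
		{key: 4, trie8: 0b11},
	}
	zero, one := split(entries, 0)
	fmt.Println(len(zero), len(one))
}
```

Each split consumes one of the saved bits, which is where the limit of 8 rehash-free splits comes from.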

It would be nice to have someone with a wrinkly brain have a go at extendible hashing, perhaps with a SwissTable in the insides. All the literature that I find on extendible hashing (outside of database stuff) is very surface level and my search engine only brought me to this issue. Thank you for visiting my blog.

*** Next up: Visit a cleaned up version of it here.

I wrote a new Go Swiss Tables implementation that we're intending to use in Pebble and CockroachDB. The README has benchmarking numbers vs Go's builtin map. Per @aclements suggestion, this implementation has a layer of extendible hashing on top of a per-bucket Swiss Table foundation. The default max-bucket-capacity is 4095, and it is sometimes visible in the benchmark numbers where the transition from 1 to more than 1 bucket occurs. The max-bucket-capacity is configurable, and you can set it arbitrarily high if you want absolute performance at the expense of resize latency. The Swiss Tables portion of the code follows the Abseil implementation (the other Go implementations I've examined did not). As noted above, use of actual SIMD instructions isn't feasible due to the lack of intrinsics (the function call overhead to call the assembly routines is too high). If intrinsics become available it should be straightforward to incorporate them into this implementation.
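For readers wondering what probing a group looks like without SIMD instructions, below is a minimal portable sketch that matches 8 control bytes at once using ordinary 64-bit arithmetic. It uses a well-known exact zero-byte detection trick and is not necessarily the code used in this or any other implementation mentioned here; `pack`, `matchByte`, and `slots` are names invented for the example. The control byte values follow Abseil's convention.

```go
package main

import (
	"fmt"
	"math/bits"
)

const (
	lsbs uint64 = 0x0101010101010101
	msbs uint64 = 0x8080808080808080
)

// Control byte values following Abseil's convention: a full slot stores
// the 7-bit H2 hash (0x00..0x7f); special states have the high bit set.
const (
	ctrlEmpty   uint8 = 0x80
	ctrlDeleted uint8 = 0xfe
)

// pack loads 8 control bytes into one word (slot 0 is the low byte).
func pack(ctrl [8]uint8) uint64 {
	var g uint64
	for i, c := range ctrl {
		g |= uint64(c) << (8 * uint(i))
	}
	return g
}

// matchByte returns a mask with 0x80 set in every byte of group equal to b.
func matchByte(group uint64, b uint8) uint64 {
	x := group ^ (lsbs * uint64(b)) // matching bytes become zero
	y := (x & ^msbs) + ^msbs        // high bit set where the low 7 bits are nonzero
	return ^(y | x) & msbs          // high bit set only where the byte is zero
}

// slots converts a match mask into slot indexes within the group.
func slots(mask uint64) []int {
	var out []int
	for mask != 0 {
		out = append(out, bits.TrailingZeros64(mask)/8)
		mask &= mask - 1 // clear the lowest set bit
	}
	return out
}

func main() {
	g := pack([8]uint8{0x12, ctrlEmpty, 0x12, 0x33, ctrlDeleted, 0x00, 0x01, 0x12})
	fmt.Println(slots(matchByte(g, 0x12)))      // prints [0 2 7]
	fmt.Println(slots(matchByte(g, ctrlEmpty))) // prints [1]
}
```

A lookup would compare the full key only for the slots reported by the H2 match, and stop probing once a group reports an empty slot.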

The performance is good, usually much faster than Go's builtin map, particularly at larger map sizes. The performance is also better than github.com/dolthub/swiss, though I credit that package with inspiring this effort. The extendible hashing layer provides incremental resizing. Mutation during iteration has similar semantics to Go's builtin map (existing keys will be visited once, new keys may or may not be visited, deleted keys won't be visited if they were deleted before the iteration reached them).

The Go map implementation has a few tricks that are not easy to reproduce outside of a runtime supported implementation:

  • Specialized implementation of int32, int64, and string keys. In particular, the fast-path to perform linear search (and avoid computing hash(key)) when the map is a single bucket makes map faster at tiny map sizes (<=6 entries). The string key specialization also avoids performing a whole key comparison in certain cases, and looks for pointer equality to avoid comparing the underlying data.
  • The Go map structure (hmap), and the first bucket in the map, can be stack allocated in some cases that are triggered by the benchmarks. It was a bit head scratching initially how map was getting away with zero allocations in some benchmarks.

The code is well commented, but I'm also happy to answer questions about the implementation. I'd be supportive of attempting to replace the existing Go map implementation with this one, though I do not intend to do that myself.

Hi @petermattis, very exciting! This seems very promising.

Some initial questions, and sorry in advance if any of this is off base.

  1. I'm curious if you could say a few brief additional words about the performance hit of the extra allocations mentioned here in the README (including maybe whether you expect that impact could be eliminated if inside the runtime or with some tweaks, vs. if you view it as inherent to the approach):

    The benchmarks where swiss.Map performs worse are due to a fast-path optimization in Go's builtin maps for tiny (6 or fewer element) maps with int32, int64, or string keys, and when growing a map. In the latter case, Go's builtin map is somewhat more parsimonious with allocations at small map sizes and the performance hit of the extra allocation dominates in the operations.

  2. Setting aside for a moment the extensible hashing, in a single Swiss Table (a single shard), it looks like in your implementation the key/value layout within a single Swiss Table might be KV|KV|KV|KV|KV|KV|KV, which I think follows the Abseil C++ implementation, and that's in contrast to the current runtime map key/value layout of KKKKKKKKK|VVVVVVVV per bucket-- is that correct?

    • (There might be a trade-off there, including the runtime map's layout of keys and values can be more compact depending on the alignment, but your implementation's layout can have better locality in some cases, including in some cases finding the entirety of a KV within a single cache line if things are small enough, or in an adjacent cache line, etc.).
  3. Have you had a chance to benchmark cold hits and cold misses? (I might have missed that in quick look so far). As you likely know (including I mentioned this above), the Abseil folks and the Folly folks were collaborating for a period on hashtable benchmarks to help compare the original Swiss Table implementation vs the derived Folly F14 implementation. If interested, you can see their approach for cold hits / cold misses if you poke around for example in the C++ benchmark code here. (I had a WIP and likely imperfect set of cold map Go benchmarks here, partially based on that).

  4. Still setting aside the extensible hashing, for a single Swiss Table in your implementation, is it correct that the control bytes are contiguous and separated from the KV pairs for a single shard? For a map that is sufficiently warm, I think one of the benefits of that is that the control bytes are very compact per element and can be residing in cache, but for sufficiently cold maps, needing to visit the control bytes as well as the actual key/values might translate to extra cache misses compared to how the current runtime map has the tophash bytes interleaved with the keys and values, with each bucket starting with the tophash bytes.

    • (How it plays out I think might depend on key sizes, current occupancy, whether the map completely cold vs. slightly warm, and so on... but for example, in a given sufficiently cold map with sufficiently small keys, the current runtime map might be able to have 1 cache miss to access all of the tophash bytes, key, and value, but the Swiss Table layout in the same situation might have 2 cache misses -- first for the control bytes and second for the key/value. With slightly larger keys or slightly worse luck with a cold map, the runtime map might for example also access an adjacent cache line, but that might still be faster than the Swiss Table layout of two cache misses on cache lines that are further apart).
  5. For a sufficiently cold map, would the bucket directory in the Cockroach implementation also potentially translate to an additional cache miss compared to the current runtime map in some cases?

In general, it might be the case that a new runtime map implementation could be sufficiently faster in enough common scenarios even while slower in some scenarios that it could still be a large net win overall, and that might be the answer for some of these questions around layout, cold map impact, and so on.

In any event, sorry if some of this is wrong or doesn't make sense as questions. (I've looked at some of this in the past and toyed a bit with a swisstable implementation, but not recently, so maybe I am misremembering certain issues or perhaps never fully understood properly 😅).

I'm curious if you could say a few brief additional words about the performance hit of the extra allocations mentioned here in the README (including maybe whether you expect that impact could be eliminated if inside the runtime or with some tweaks, vs. if you view it as inherent to the approach):

The performance hit is minor and only affects tiny maps. The other effect at play on tiny maps is the runtime specialization of int32, int64, and string keys to perform linear search in a single bucket map. A runtime implementation of Swiss Tables would have neither of these limitations. (i.e. They aren't fundamental to Swiss Tables, but the limitations of what can be done without compiler support).

Setting aside for a moment the extensible hashing, in a single Swiss Table (a single shard), it looks like in your implementation the key/value layout within a single Swiss Table might be KV|KV|KV|KV|KV|KV|KV, which I think follows the Abseil C++ implementation, and that's in contrast to the current runtime map key/value layout of KKKKKKKKK|VVVVVVVV per bucket-- is that correct?

That's correct. Interleaving the keys and values seems preferable when there are no buckets (as in the Swiss Tables design). Yes, there can be some space wastage due to this interleaving. Even when there are buckets, interleaving keys and values might be desirable. 8-bytes/key * 8 keys == 64 bytes which is a typical cache line size.

Have you had a chance to benchmark cold hits and cold misses? (I might have missed that in quick look so far). As you likely know (including I mentioned this #54766 (comment)), the Abseil folks and the Folly folks were collaborating for a period on hashtable benchmarks to help compare the original Swiss Table implementation vs the derived Folly F14 implementation. If interested, you can see their approach for cold hits / cold misses if you poke around for example in the C++ benchmark code here. (I had a WIP and likely imperfect set of cold map Go benchmarks here, partially based on that).

I didn't benchmark cold hits and misses directly, though at very large map sizes the performance between the runtime map and swiss.Map converge again. It looks like large map sizes is what the hashtable-benchmark is doing to test cold hits and misses. On my M1 mac I can see this happening when the memory for slots exceeds 8MB and the performance continues to decrease after that. Interleaving the controls with the slots is something to experiment with, though then the indexing becomes more complicated.

Still setting aside the extensible hashing, for a single Swiss Table in your implementation, is it correct that the control bytes are contiguous and separated from the KV pairs for a single shard?

Yes, that's correct.

For a map that is sufficiently warm, I think one of the benefits of that is that the control bytes are very compact per element and can be residing in cache, but for sufficiently cold maps, needing to visit the control bytes as well as the actual key/values might translate to extra cache misses compared to how the current runtime map has the tophash bytes interleaved with the keys and values in each bucket.

As you mention, there are a variety of factors at play here. The control bytes can avoid looking at the keys/values at all on a miss. In large maps, cold misses (i.e. gets for non-existent keys) are faster in swiss.Map. Sometimes significantly so. The size of the keys plays a factor. So does whether the key is entirely contained in the slot or whether there is an indirection (e.g. for string keys).

For a sufficiently cold map, would the bucket directory in the Cockroach implementation also potentially translate to an additional cache miss compared to the current runtime map in some cases?

Yes. There is an indirection involved in the directory. I've been wondering if that can be eliminated, though I don't currently see how to do so completely. The directory is a []*bucket[K,V]. Could it be a []bucket[K,V]? The current implementation requires multiple directory entries to point to the same bucket until it is split (this is somewhat fundamental to extendible hashing). With a []bucket[K,V] we'd have to do a two-step lookup where the entry is found and then the local-depth is examined to find the "true" bucket. That seems like just as many cache misses as the current approach. Perhaps bucket.localDepth could be pulled out into a parallel array of uint8. Yet again there is another cache miss to locate the bucket, but the number of directory entries should be small so perhaps that bucketLocalDepth []uint8 slice can always be expected to be in cache. A million entry map requires only <=1024 directory entries (depends on the load-factor and the max-bucket-capacity). If we want to get really fancy we could shrink localDepth down to a single bit. Unless the hash function is really bad localDepth either equals globalDepth or is globalDepth-1. Now your million entry map needs 128 bytes to determine exactly which bucket to jump to. We'd have to enforce that localDepth is either globalDepth or globalDepth-1 by splitting any buckets which violate this requirement when the directory grows, but I think that would be a rare occurrence.
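For anyone following along who is less familiar with extendible hashing, here is a small generic sketch of a `[]*bucket` directory indexed by the top bits of the hash, where several entries share one bucket until it is split. It is a textbook-style model under simplifying assumptions (buckets only store raw hashes, and every name in it is invented), not the cockroachdb/swiss code.

```go
package main

import "fmt"

// bucket stands in for one Swiss Table; this sketch only stores raw hashes.
type bucket struct {
	localDepth uint
	hashes     []uint64
}

// directory maps the top globalDepth bits of a hash to a bucket.
type directory struct {
	globalDepth uint
	entries     []*bucket
}

func newDirectory() *directory {
	return &directory{entries: []*bucket{{}}}
}

// index selects an entry from the top globalDepth bits of h.
// In Go, h >> 64 is 0, so a depth-0 directory always yields entry 0.
func (d *directory) index(h uint64) int {
	return int(h >> (64 - d.globalDepth))
}

func (d *directory) insert(h uint64) {
	b := d.entries[d.index(h)]
	b.hashes = append(b.hashes, h)
}

// contains reports whether h is in the bucket the directory maps it to.
func (d *directory) contains(h uint64) bool {
	for _, x := range d.entries[d.index(h)].hashes {
		if x == h {
			return true
		}
	}
	return false
}

// split replaces the bucket covering h with two buckets one level deeper,
// doubling the directory first if that bucket is already at globalDepth.
func (d *directory) split(h uint64) {
	old := d.entries[d.index(h)]
	if old.localDepth == d.globalDepth {
		grown := make([]*bucket, 2*len(d.entries))
		for i, b := range d.entries {
			grown[2*i], grown[2*i+1] = b, b
		}
		d.entries = grown
		d.globalDepth++
	}
	depth := old.localDepth + 1
	halves := [2]*bucket{{localDepth: depth}, {localDepth: depth}}
	for _, x := range old.hashes {
		side := (x >> (64 - depth)) & 1 // the depth-th highest hash bit
		halves[side].hashes = append(halves[side].hashes, x)
	}
	for i, b := range d.entries {
		if b == old {
			d.entries[i] = halves[(uint(i)>>(d.globalDepth-depth))&1]
		}
	}
}

func main() {
	d := newDirectory()
	hs := []uint64{0x1000000000000000, 0x5000000000000000, 0x9000000000000000, 0xD000000000000000}
	for _, h := range hs {
		d.insert(h)
	}
	d.split(hs[0])
	d.split(hs[0])
	fmt.Println(d.globalDepth, len(d.entries), d.entries[2] == d.entries[3]) // prints 2 4 true
}
```

After the two splits, entries 2 and 3 still share a bucket whose localDepth is globalDepth-1, which is the situation the single-bit localDepth idea above relies on: at one bit per entry, 1024 directory entries need 1024 / 8 = 128 bytes.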

Yes. There is an indirection involved in the directory. I've been wondering if that can be eliminated, though I don't currently see how to do so completely. The directory is a []*bucket[K,V]. Could it be a []bucket[K,V]?

The answer is yes and it wasn't as complicated as what I was ideating above. See cockroachdb/swiss#21 and https://github.com/cockroachdb/swiss/blob/pmattis/optimize-cache-misses/map.go#L152-L162 for details. That PR also switches to interleaving 8 control bytes and 8 slots. The benchmark numbers were updated to include larger map sizes in order to get a sense for cold access performance and I also included the alloc/op and allocs/op numbers which also compare favorably vs runtime map for most cases.

Hi @petermattis, that's great!

I think cockroachdb/swiss#21 completely addresses my question/concern #4 from my comment above around extra cache misses due to the control bytes being separate, and I think it also addresses concern #5 as well, though TBH I'm still digesting your most recent changes to the bucket directory. (I did have some follow-up questions around why would it be the case that for "very large map sizes the performance between the runtime map and swiss.Map converge again" even for cold hits for your initial version, but I think those questions are hopefully moot now with your latest version).

It looks like large map sizes is what the hashtable-benchmark is doing to test cold hits and misses.

If I understood correctly, I had thought the cold hit and cold miss benchmarks in the C++ hashtable-benchmark repo weren't just doing large map sizes, but rather were attempting to create enough maps so that in total they did not fit in cache (with a benchmark parameter for the total memory used for those maps that defaults to a large amount of memory), which for a given test and given map size could translate to a few large maps or many smaller maps. In other words, I thought it attempted to make the maps start almost completely cold, regardless of size.

In any event, this all looks very promising!

There's maybe still an open question around whether it would be acceptable for the runtime to do KV|KV|KV|KV|KV|KV|KV instead of KKKKKKKKK|VVVVVVVV and have something like a map[uint64]uint32 take an extra 4 bytes per KV of additional space due to alignment, but there's also the space win for a Swiss Table of not having overflow buckets (which in the runtime map might have a median of something like ~6% space overhead for larger maps, though I didn't double-check that number just now). Maybe the runtime could specialize on aligned vs. non-aligned KVs, or maybe someone will have a clever idea about how to get the best of both worlds or split the difference or similar.

At one point I had a prototype CL which did the best packing setup for a bucket given alignment constraints. So for instance, with uint32 keys and uint64 values, it would do KKVVKKVVKKVVKKVV and for uint64 keys and values it would do KVKVKVKVKVKVKVKV.
It just kept a small table in the map type to know where things were laid out.
It wasn't terribly promising, the overhead of the extra indirection didn't pay in most circumstances. But it wasn't horrible - it might be worth revisiting the technique for a new map implementation.
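As a rough illustration of the "small table" idea above, the per-slot offsets for such a packed bucket could be computed as below. This is a hypothetical reconstruction, not the prototype CL: `slotOffsets` and `pattern` are made-up names, and the sketch assumes power-of-two key and value sizes of at most 8 bytes whose alignment equals their size.

```go
package main

import "fmt"

// slotOffsets returns the byte offset of every key and value in a bucket of
// n slots. Keys and values are packed in alternating runs so that no padding
// is needed. Assumptions (not from the prototype CL): keySize and valSize are
// powers of two, alignment equals size, and n is a power of two.
func slotOffsets(keySize, valSize, n int) (keyOff, valOff []int) {
	big, small := keySize, valSize
	if small > big {
		big, small = small, big
	}
	run := big / small // slots per run, e.g. 2 for uint32 keys and uint64 values
	if run > n {
		run = n
	}
	chunk := run * (keySize + valSize) // bytes in one run of keys plus one run of values
	keyOff = make([]int, n)
	valOff = make([]int, n)
	for i := 0; i < n; i++ {
		base := (i / run) * chunk
		keyOff[i] = base + (i%run)*keySize
		valOff[i] = base + run*keySize + (i%run)*valSize
	}
	return keyOff, valOff
}

// pattern renders the layout as a string of K and V letters in address order.
func pattern(keySize, valSize, n int) string {
	keyOff, valOff := slotOffsets(keySize, valSize, n)
	marks := make([]byte, n*(keySize+valSize))
	for i := 0; i < n; i++ {
		marks[keyOff[i]] = 'K'
		marks[valOff[i]] = 'V'
	}
	out := make([]byte, 0, 2*n)
	for _, m := range marks {
		if m != 0 {
			out = append(out, m)
		}
	}
	return string(out)
}

func main() {
	fmt.Println(pattern(4, 8, 8)) // uint32 keys, uint64 values
	fmt.Println(pattern(8, 8, 8)) // uint64 keys, uint64 values
}
```

With 8 slots this reproduces the two layouts described above (KKVVKKVV... and KVKVKVKV...). The extra indirection mentioned in the comment corresponds to loading `keyOff[i]` and `valOff[i]` on every access instead of using a compile-time stride.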

There's maybe still an open question around whether it would be acceptable for the runtime to do KV|KV|KV|KV|KV|KV|KV instead of KKKKKKKKK|VVVVVVVV and have something like a map[uint64]uint32 take an extra 4 bytes per KV of additional space due to alignment, but there's also the space win for a Swiss Table of not having overflow buckets (which in the runtime map might have a median of something like ~6% space overhead for larger maps, though I didn't double-check that number just now). Maybe the runtime could specialize on aligned vs. non-aligned KVs, or maybe someone will have a clever idea about how to get the best of both worlds or split the difference or similar.

I experimented with separated keys and values and saw a minor performance drop in some benchmarks, though I didn't run the full suite. There is nothing difficult about such a change now that keys, values, and controls are bundled together in 8-element groups.

At one point I had a prototype CL which did the best packing setup for a bucket given alignment constraints. So for instance, with uint32 keys and uint64 values, it would do KKVVKKVVKKVVKKVV and for uint64 keys and values it would do KVKVKVKVKVKVKVKV.
It just kept a small table in the map type to know where things were laid out.
It wasn't terribly promising, the overhead of the extra indirection didn't pay in most circumstances. But it wasn't horrible - it might be worth revisiting the technique for a new map implementation.

My old C++ days are whispering partial specialization, though that created some pretty difficult to follow C++ at times. What would be the Go equivalent? How does the Go compiler dispatch to the _fast32, _fast64, and _faststr variants? If there were MapInterleavedKVs[K,V] and MapSeparatedKVs[K,V] types, could the compiler select which one to use based on the types of K and V? (I'm specifically asking whether something within the compiler/runtime could be done, not a language addition to make this possible.)

I experimented with separated keys and values and saw a minor performance drop in some benchmarks, though I didn't run the full suite. There is nothing difficult about such a change now that keys, values, and controls are bundled together in 8-element groups.

Here are some benchmark numbers where old is with interleaved KVs and new is separated KVs.

name                                  old time/op  new time/op   delta
MapGetHit/swissMap/Int64/6-10         4.87ns ± 1%   4.95ns ± 1%   +1.66%  (p=0.008 n=5+5)
MapGetHit/swissMap/Int64/12-10        4.84ns ± 0%   4.94ns ± 0%   +2.09%  (p=0.008 n=5+5)
MapGetHit/swissMap/Int64/18-10        4.86ns ± 1%   4.92ns ± 0%   +1.21%  (p=0.016 n=5+4)
MapGetHit/swissMap/Int64/24-10        5.54ns ±13%   5.58ns ±11%     ~     (p=0.690 n=5+5)
MapGetHit/swissMap/Int64/30-10        4.85ns ± 0%   4.92ns ± 0%   +1.55%  (p=0.008 n=5+5)
MapGetHit/swissMap/Int64/64-10        4.83ns ± 0%   4.93ns ± 6%     ~     (p=0.286 n=4+5)
MapGetHit/swissMap/Int64/128-10       4.85ns ± 1%   4.85ns ± 0%     ~     (p=0.730 n=5+4)
MapGetHit/swissMap/Int64/256-10       4.84ns ± 1%   4.86ns ± 1%     ~     (p=0.222 n=5+5)
MapGetHit/swissMap/Int64/512-10       4.94ns ± 3%   5.01ns ± 4%     ~     (p=0.421 n=5+5)
MapGetHit/swissMap/Int64/1024-10      5.13ns ± 2%   5.16ns ± 3%     ~     (p=1.000 n=5+5)
MapGetHit/swissMap/Int64/2048-10      5.13ns ± 2%   5.23ns ± 2%     ~     (p=0.151 n=5+5)
MapGetHit/swissMap/Int64/4096-10      6.38ns ± 1%   6.64ns ± 1%   +3.97%  (p=0.008 n=5+5)
MapGetHit/swissMap/Int64/8192-10      7.10ns ± 1%   7.43ns ± 1%   +4.71%  (p=0.008 n=5+5)
MapGetHit/swissMap/Int64/32768-10     7.54ns ± 0%   7.88ns ± 2%   +4.48%  (p=0.008 n=5+5)
MapGetHit/swissMap/Int64/65536-10     7.64ns ± 1%   7.87ns ± 0%   +3.10%  (p=0.008 n=5+5)
MapGetHit/swissMap/Int64/131072-10    7.93ns ± 1%   8.16ns ± 1%   +2.88%  (p=0.008 n=5+5)
MapGetHit/swissMap/Int64/262144-10    9.31ns ± 3%  10.04ns ± 3%   +7.81%  (p=0.008 n=5+5)
MapGetHit/swissMap/Int64/524288-10    16.7ns ± 1%   18.2ns ± 3%   +8.98%  (p=0.008 n=5+5)
MapGetHit/swissMap/Int64/1048576-10   24.7ns ± 2%   27.6ns ± 0%  +11.74%  (p=0.008 n=5+5)
MapGetHit/swissMap/Int64/2097152-10   33.3ns ± 1%   37.7ns ± 1%  +13.12%  (p=0.008 n=5+5)
MapGetHit/swissMap/Int64/4194304-10   36.6ns ± 0%   43.0ns ± 1%  +17.37%  (p=0.008 n=5+5)

(I'm not showing the MapGetMiss benchmarks because they were unchanged). Sort of what you would expect. Separating the KVs causes a slowdown for large maps which don't fit in the L2 cache. At small to medium map sizes the performance is basically equivalent. For our usages in Pebble/CockroachDB there isn't any concern about the space wastage. I'm curious how big a concern the space wastage is for runtime map? How often does something like map[uint64]uint32 occur?

I'm curious how big a concern the space wastage is for runtime map? How often does something like map[uint64]uint32 occur?

It is a concern, but not a huge one.
I think the most common, and worst, case would be something like map[uint64]bool. The alternating version's buckets are 50%+ bigger.
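To put numbers on the alignment cost, here is a minimal sketch that compares the size of one 8-slot group under the two layouts. The `interleaved` and `separated` types are illustrative stand-ins, not the actual runtime or cockroachdb/swiss definitions, and the figures assume a 64-bit platform where uint64 is 8-byte aligned.

```go
package main

import (
	"fmt"
	"unsafe"
)

const groupSize = 8 // slots per group, matching the 8 control bytes

// interleaved stores each key next to its value (KV|KV|...).
// Hypothetical type for illustration only.
type interleaved[K comparable, V any] struct {
	ctrl  [groupSize]uint8
	slots [groupSize]struct {
		key K
		val V
	}
}

// separated stores all keys, then all values (KKKKKKKK|VVVVVVVV).
// Hypothetical type for illustration only.
type separated[K comparable, V any] struct {
	ctrl [groupSize]uint8
	keys [groupSize]K
	vals [groupSize]V
}

// sizes returns the size in bytes of one group under each layout.
func sizes[K comparable, V any]() (inter, sep uintptr) {
	return unsafe.Sizeof(interleaved[K, V]{}), unsafe.Sizeof(separated[K, V]{})
}

func main() {
	i1, s1 := sizes[uint64, uint32]()
	fmt.Printf("uint64/uint32: interleaved=%d separated=%d\n", i1, s1)
	i2, s2 := sizes[uint64, bool]()
	fmt.Printf("uint64/bool:   interleaved=%d separated=%d\n", i2, s2)
}
```

On a 64-bit platform this reports 136 vs. 104 bytes for the uint64/uint32 case and 136 vs. 80 bytes for uint64/bool, so the interleaved group is 70% larger in the bool case once the control bytes are included.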

My old C++ days are whispering partial specialization, though that created some pretty difficult to follow C++ at times. What would be the Go equivalent? How does the Go compiler dispatch to the _fast32, _fast64, and _faststr variants?

That could be done, but there are a lot of specializations you'd need. It's not just the compiler-generated calls but lots of other routines that currently aren't specialized, like mapiternext.

Maybe you could get away with just 2 specializations: K*V* and (KV)*. Pick the latter if K and V are fine when alternating.
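One way to read "fine when alternating" is "a struct{K; V} slot needs no padding". The check below is a sketch of that interpretation using reflect; the function names are hypothetical, and a real implementation would make this decision in the compiler from its own type information rather than at run time.

```go
package main

import (
	"fmt"
	"reflect"
)

// alignUp rounds n up to the next multiple of a.
func alignUp(n, a uintptr) uintptr { return (n + a - 1) / a * a }

// interleaveIsFree reports whether a slot laid out as struct{K; V} is exactly
// as large as K and V combined, i.e. alternating keys and values wastes no
// space. Hypothetical helper, not part of the runtime.
func interleaveIsFree(k, v reflect.Type) bool {
	kAlign, vAlign := uintptr(k.Align()), uintptr(v.Align())
	slotAlign := kAlign
	if vAlign > slotAlign {
		slotAlign = vAlign
	}
	// Offset of V after K, then total size rounded up to the slot alignment.
	slot := alignUp(alignUp(k.Size(), vAlign)+v.Size(), slotAlign)
	return slot == k.Size()+v.Size()
}

// interleaveIsFreeFor is a generic convenience wrapper around interleaveIsFree.
func interleaveIsFreeFor[K, V any]() bool {
	return interleaveIsFree(
		reflect.TypeOf((*K)(nil)).Elem(),
		reflect.TypeOf((*V)(nil)).Elem(),
	)
}

func main() {
	fmt.Println("uint64/uint64:", interleaveIsFreeFor[uint64, uint64]())
	fmt.Println("uint32/uint32:", interleaveIsFreeFor[uint32, uint32]())
	fmt.Println("uint64/uint32:", interleaveIsFreeFor[uint64, uint32]())
	fmt.Println("uint64/bool:  ", interleaveIsFreeFor[uint64, bool]())
}
```

On a 64-bit platform the first two cases would pick the alternating layout and the last two would fall back to the grouped one. The result depends on platform alignment rules: on 32-bit targets where uint64 is 4-byte aligned, uint64/uint32 also interleaves without padding.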

This is fantastic work, @petermattis ! I think we should work on getting this merged into the runtime, and I'll work with my team to figure out how we can slot that work in.

I ran the benchmarks myself so that I could verify your results and slice and dice the raw data. I made it a little more benchstat-friendly and added a few basic performance counters to see what was going on, but otherwise didn't change the benchmarks. I'll try to send a PR for those tweaks later today. One thing I haven't done but would like to is compare memory footprint.

Here's a link to my raw benchmark data and the benchstat output. I've organized the benchstat so that each benchmark is a separate top-level group of tables. Finally, here is the data plotted in a couple of different ways. (This was a great exercise for the benchplot program I started a couple weeks ago! 😀) This was all done at commit cockroachdb/swiss@fa7bdb1, which notably does not include cockroachdb/swiss#21.

Some observations

  • Across all of the benchmarks, cache-misses/op and cache-references/op are extremely noisy, so take those with a grain of salt. Nevertheless, there's often a big enough spread between the two maps that there's still a statistically significant difference.
  • MapIter is a ~60% improvement across all of the benchmarks and all metrics. Wow.
  • MapGetHit is an improvement across the board except for small int64 and int32 maps. This shows up in ~20% more instructions executed in these cases, presumably for the reasons mentioned up-thread.
  • MapGetMiss is often slower for larger maps, which might be an issue. We see this in cpu-cycles/op, but not in instructions/op, indicating this is from CPU stalls. The cache counters are noisy, but oddly tend to show the same or fewer cache references, but 2–4x the cache misses in these cases, suggesting that may be the issue. This needs more investigation (perhaps a top-down analysis).
  • MapPutGrow and MapPutPreallocate are slower on very small maps. This shows up in humongously more cache references and misses in these cases, relatively speaking, though the absolute numbers are incredibly small. I'm really not sure what's up with that, but probably applying specialization will fix this, too. The absolute numbers are so small that I'm not worried about this.
  • MapPutReuse is a win across the board and will probably be even more of a win with small map specialization.
  • MapPutDelete is the only benchmark with somewhat mixed results. This is the only benchmark with deletion, suggesting that could use some work. This needs more investigation.

MapGetMiss is often slower for larger maps, which might be an issue. [...] This needs more investigation

One quick comment is that this might already be addressed by cockroachdb/swiss#21.

Here are some of the benchmarks in the current README on master, where you can see the CockroachDB SwissTable doing worse for MapGetMiss once it reaches 4096 elements:

MapGetMiss/Int32/2048-16             11.1ns ± 2%     7.1ns ± 2%    -35.74%  (p=0.000 n=10+10)
MapGetMiss/Int32/4096-16             11.2ns ± 3%    14.2ns ± 8%    +26.59%  (p=0.000 n=10+10)
MapGetMiss/Int32/8192-16             11.2ns ± 1%    14.9ns ±25%    +32.64%  (p=0.001 n=10+10)
MapGetMiss/Int32/65536-16            13.7ns ± 1%    17.6ns ± 3%    +28.21%  (p=0.000 n=9+9)

But those particular performance losses go away for at least those benchmarks with the results listed in cockroachdb/swiss#21:

MapGetMiss/Int32/2048-16               11.5ns ± 2%     8.5ns ± 3%      -25.93%  (p=0.000 n=9+10)
MapGetMiss/Int32/4096-16               11.5ns ± 1%    10.1ns ± 2%      -12.27%  (p=0.000 n=10+10)
MapGetMiss/Int32/8192-16               11.6ns ± 2%    10.3ns ± 0%      -11.61%  (p=0.000 n=10+8)
MapGetMiss/Int32/65536-16              14.2ns ± 2%    11.8ns ± 1%      -16.82%  (p=0.000 n=9+10)

Maybe that is explained by the removal of indirections in the bucket directory in cockroachdb/swiss#21, but maybe something else explains it.

Thanks for taking a look and verifying my numbers, Austin.

One thing I haven't done but would like to is compare memory footprint.

Ack, this is something I've been meaning to do as well. In particular, I want to make sure that for a given map size the memory usage is similar to or better than the builtin map. One issue to be aware of here (in case anyone else wants to do this) is that the builtin map and swiss.Map resize at different times, so just looking at the memory overhead for a handful of sizes can pessimize your view of one implementation or the other. I think we'd want to compute the average space usage per entry across a range of map sizes.
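A rough sketch of that kind of measurement for the builtin map is below. It is an assumption about how one might measure this, not the method anyone in this thread used: it reads the live heap via runtime.MemStats before and after filling a map, and averages bytes per entry over a sweep of sizes so that no single resize boundary dominates. The function names are hypothetical, and the numbers are only approximate because other allocations in the process also show up in HeapAlloc.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink keeps the map under measurement reachable so the GC cannot free it
// before the second heap reading.
var sink map[int64]int64

// bytesPerEntry estimates the live heap bytes per entry of a builtin map
// holding n entries. The result is approximate.
func bytesPerEntry(n int) float64 {
	sink = nil
	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)

	m := make(map[int64]int64)
	for i := 0; i < n; i++ {
		m[int64(i)] = int64(i)
	}
	sink = m

	runtime.GC() // drop buckets abandoned during growth
	runtime.ReadMemStats(&after)
	delta := int64(after.HeapAlloc) - int64(before.HeapAlloc)
	return float64(delta) / float64(n)
}

// averageBytesPerEntry averages bytesPerEntry over sizes lo, lo+step, ..., hi.
func averageBytesPerEntry(lo, hi, step int) float64 {
	var sum float64
	var count int
	for n := lo; n <= hi; n += step {
		sum += bytesPerEntry(n)
		count++
	}
	return sum / float64(count)
}

func main() {
	fmt.Printf("n=1024: %.1f bytes/entry\n", bytesPerEntry(1024))
	fmt.Printf("n=1025: %.1f bytes/entry\n", bytesPerEntry(1025))
	fmt.Printf("average over 1024..4096: %.1f bytes/entry\n",
		averageBytesPerEntry(1024, 4096, 256))
}
```

The same harness could be pointed at another map implementation by swapping the map construction, which would give a like-for-like average rather than a comparison at a few hand-picked sizes.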

MapGetMiss is often slower for larger maps, which might be an issue. [...] This needs more investigation

One quick comment is that this might already be addressed by cockroachdb/swiss#21.

I was about to post a similar response. MapPutDelete also looks quite a bit better with cockroachdb/swiss#21. I expect that PR to get merged this week.

I'll try to send a PR for those tweaks later today.

cockroachdb/swiss#22 and cockroachdb/swiss#23

One quick comment is that this might already be addressed by cockroachdb/swiss#21.

I'll try to run this tonight. (For annoying reasons, I have to run this on my laptop to get the performance counters.)

I'll try to run this tonight.

Comparison with cockroachdb/swiss#21 versus runtime maps: https://gist.github.com/aclements/9fb32ac0a287d2ff360f1bc166cdf4b8 I haven't done a detailed analysis yet, but it's still an overall win.

Comparison of before and after the merge of cockroachdb/swiss#21. Overall impression is that some operations sped up, especially PutDelete and GetMiss on large maps, but iteration, gets, and growing slowed down.

Exciting work @petermattis ! Looks very nice and clean!

In the past few months, I have been trying to deploy the basic version of the swisstable to some services via GOEXPERIMENT at my work. The biggest problem I encountered was that many third-party libraries would copy the definition of structs such as hmap in the runtime and then do some magic things, which often caused strange GC problems.

So I developed a swisstable (via GOEXPERIMENT, but the rehash is still done at once) in the runtime that keeps the original memory layout. This implementation is about 1.3x faster than the original hashmap in the overall results. If anyone needs it, I'm happy to contribute it so that someone can implement extensible hashing based on this version.

And I will run some benchmarks for these implementations in the next few days to provide more information to everyone.

Change https://go.dev/cl/567855 mentions this issue: DO NOT SUBMIT: cmd/compile,runtime: reorder hmap fields

I wrote a new Go Swiss Tables implementation that we're intending to use in Pebble and CockroachDB. The README has benchmarking numbers vs Go's builtin map.

I've experimented with dolthub/swiss on non-AMD64 architectures (ARM, ARM64, etc.) and observed that it underperforms compared to the built-in Go map when AMD64 SIMD instructions are unavailable. I hope cockroachdb/swiss performs better on non-AMD64.

I've experimented with dolthub/swiss on non-AMD64 architectures (ARM, ARM64, etc.) and observed that it underperforms compared to the built-in Go map when AMD64 SIMD instructions are unavailable. I hope cockroachdb/swiss performs better on non-AMD64.

cockroachdb/swiss was developed and initially benchmarked on an M1 Macbook Pro. I haven't tested on graviton yet, but I'd be very surprised if the performance was worse vs Go's builtin map except for at the smallest map sizes. There is no usage of SIMD instructions on either amd64 or arm because in my testing the function call overhead made SIMD a pessimization.
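For readers wondering what matching control bytes without SIMD can look like, here is a sketch of the widely used "SIMD within a register" byte-match trick on a 64-bit word holding 8 control bytes. It is shown as a general technique under my own naming (`matchH2`, `matchedSlots`, `packCtrl` are illustrative); whether cockroachdb/swiss uses exactly this formulation is an assumption, not something stated in this thread.

```go
package main

import (
	"fmt"
	"math/bits"
)

const (
	lsbs = 0x0101010101010101 // low bit of every byte lane
	msbs = 0x8080808080808080 // high bit of every byte lane
)

// matchH2 returns a mask with the high bit set in each byte lane of ctrl that
// equals h. The trick can also flag a lane directly above a true match
// (a false positive caused by borrow propagation), so callers must still
// confirm each candidate by comparing keys.
func matchH2(ctrl uint64, h uint8) uint64 {
	v := ctrl ^ (lsbs * uint64(h)) // lanes equal to h become zero
	return ((v - lsbs) &^ v) & msbs
}

// matchedSlots converts the mask into slot indexes in the range 0..7.
func matchedSlots(ctrl uint64, h uint8) []int {
	var out []int
	for m := matchH2(ctrl, h); m != 0; m &= m - 1 {
		out = append(out, bits.TrailingZeros64(m)>>3)
	}
	return out
}

// packCtrl packs 8 control bytes into one word, slot 0 in the lowest byte.
func packCtrl(b [8]uint8) uint64 {
	var w uint64
	for i, c := range b {
		w |= uint64(c) << (8 * i)
	}
	return w
}

func main() {
	ctrl := packCtrl([8]uint8{0x11, 0x22, 0x33, 0x22, 0x55, 0x66, 0x77, 0x08})
	fmt.Println(matchedSlots(ctrl, 0x22)) // candidate slots for hash fragment 0x22
}
```

Because this compiles to a handful of inlineable integer instructions, it avoids the function call overhead that an assembly SIMD routine would incur in Go.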

@zhangyunhao116

The biggest problem I encountered was that many third-party libraries would copy the definition of structs such as hmap in the runtime and then do some magic things, which often caused strange GC problems.

This is an interesting problem that could indeed be painful if widespread. Do you have some examples of packages doing this?

I wrote https://go.dev/cl/567855 to jumble the order of hmap in order to try to find some of these. I ran this against all of Google's internal Go tests, and found 0 failures. None at all was surprising to me.

That CL doesn't change hiter, though if packages don't look at hmap I'd be a bit surprised if they look at hiter. It also doesn't adjust the runtime._type for maps, internal/abi.MapType. That I know many packages do look at to extract the hash function (even @petermattis's package does this). However, I wouldn't expect that structure to change as much. If it does change, I suspect we can keep Hasher at the same offset if needed to avoid breakage.

I guess I should have tested hiter. For example, gonum depends on its internals in https://github.com/gonum/gonum/blob/v0.14.0/graph/iterator/map.go.

gonum should probably move to using reflect.(*MapIter).Reset instead of that horror.
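For reference, a minimal sketch of the supported alternative: a single reflect.MapIter value reused across maps via Reset (available since Go 1.18), which avoids both depending on hiter's layout and allocating a new iterator per map. The map type and the `sumKeys` and `sumAll` helpers are made up for illustration and are not gonum's API.

```go
package main

import (
	"fmt"
	"reflect"
)

// sumKeys iterates over m with iter, which is re-pointed at m via Reset so
// the same iterator value can be reused for many maps.
func sumKeys(iter *reflect.MapIter, m map[int64]struct{}) int64 {
	iter.Reset(reflect.ValueOf(m))
	var sum int64
	for iter.Next() {
		sum += iter.Key().Int()
	}
	return sum
}

// sumAll returns the key sum of each map, sharing one iterator across all.
func sumAll(maps ...map[int64]struct{}) []int64 {
	var iter reflect.MapIter // zero value is ready for Reset
	sums := make([]int64, 0, len(maps))
	for _, m := range maps {
		sums = append(sums, sumKeys(&iter, m))
	}
	return sums
}

func main() {
	a := map[int64]struct{}{1: {}, 2: {}, 3: {}}
	b := map[int64]struct{}{10: {}, 20: {}}
	fmt.Println(sumAll(a, b))
}
```

Code written this way keeps working if the runtime changes the internal iterator or map header layout, which is the failure mode discussed above.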

AFAIK, many json-related projects do that thing:

Simple tests can't reproduce this problem. In most cases it only occurs in running services, since it panics only at certain GC timings.

I ran the benchmark (v1 branch) for these different implementations.

The performance ranking: swiss0.txt (439.0) > swiss0-samehash.txt (506.9) > swiss1.txt (558.9) > runtime-swisstable.txt (582.5) > runtime.txt (786.7)

If we move the swisstable from a library into the standard library (for example, swiss0.txt -> runtime-swisstable.txt), it may be slower in some cases. The reasons are:

  • For iteration, the library version doesn't need to initialize a hiter; it can keep all of its state on a simple function stack.
  • For other cases, it looks like the compiler introduces more overhead compared to just calling a struct method. This still needs more investigation.

Comparing swiss0-samehash.txt with swiss1.txt gives the following performance results (the two implementations are different; swiss1 is suitable for more situations).

benchstat swiss0-samehash.txt swiss1.txt
goos: linux
goarch: amd64
pkg: github.com/zhangyunhao116/gomapbench
cpu: AMD Ryzen 7 3700X 8-Core Processor             
                                    │ swiss0-samehash.txt │               swiss1.txt               │
                                    │       sec/op        │     sec/op      vs base                │
MapIter/Int/6-16                             20.14n ±  1%    34.10n ±   2%  +69.31% (p=0.000 n=10)
MapIter/Int/12-16                            38.71n ±  4%    49.21n ±   2%  +27.11% (p=0.000 n=10)
MapIter/Int/18-16                            65.28n ±  2%    75.03n ±   2%  +14.93% (p=0.000 n=10)
MapIter/Int/24-16                            74.51n ±  2%    79.71n ±   1%   +6.99% (p=0.000 n=10)
MapIter/Int/30-16                            115.9n ±  4%    123.0n ±   1%   +6.12% (p=0.000 n=10)
MapIter/Int/64-16                            249.5n ±  3%    238.3n ±   4%   -4.49% (p=0.000 n=10)
MapIter/Int/128-16                           565.0n ±  2%    451.1n ±   2%  -20.16% (p=0.000 n=10)
MapIter/Int/256-16                          1273.5n ±  3%    872.2n ±   2%  -31.51% (p=0.000 n=10)
MapIter/Int/512-16                           2.664µ ±  2%    1.792µ ±   2%  -32.75% (p=0.000 n=10)
MapIter/Int/1024-16                          5.355µ ±  2%    4.017µ ±   1%  -24.99% (p=0.000 n=10)
MapIter/Int/2048-16                         10.776µ ±  2%    9.332µ ±   2%  -13.40% (p=0.000 n=10)
MapIter/Int/4096-16                          21.61µ ±  2%    20.17µ ±   2%   -6.68% (p=0.000 n=10)
MapIter/Int/8192-16                          43.97µ ±  2%    41.20µ ±   2%   -6.30% (p=0.000 n=10)
MapIter/Int/65536-16                         421.9µ ±  3%    333.3µ ±   1%  -20.99% (p=0.000 n=10)
MapAccessHit/Int64/6-16                      6.594n ±  5%    6.532n ±   2%        ~ (p=0.160 n=10)
MapAccessHit/Int64/12-16                     6.616n ±  2%    6.816n ±   9%   +3.02% (p=0.005 n=10)
MapAccessHit/Int64/18-16                     6.576n ± 17%    6.537n ±   1%        ~ (p=0.593 n=10)
MapAccessHit/Int64/24-16                     7.577n ± 12%    7.556n ±  13%        ~ (p=0.971 n=10)
MapAccessHit/Int64/30-16                     6.654n ±  1%    6.598n ±   8%        ~ (p=0.469 n=10)
MapAccessHit/Int64/64-16                     6.910n ±  5%    6.950n ±   4%        ~ (p=0.645 n=10)
MapAccessHit/Int64/128-16                    6.892n ±  3%    6.957n ±   4%        ~ (p=0.542 n=10)
MapAccessHit/Int64/256-16                    6.831n ±  3%    6.838n ±   3%        ~ (p=0.927 n=10)
MapAccessHit/Int64/512-16                    6.959n ±  2%    6.887n ±   2%        ~ (p=0.393 n=10)
MapAccessHit/Int64/1024-16                   7.096n ±  2%    6.917n ±   3%   -2.52% (p=0.014 n=10)
MapAccessHit/Int64/2048-16                   7.588n ±  1%    7.449n ±   1%   -1.83% (p=0.004 n=10)
MapAccessHit/Int64/4096-16                   7.799n ±  1%    8.329n ±   1%   +6.78% (p=0.000 n=10)
MapAccessHit/Int64/8192-16                   7.985n ±  2%    8.505n ±   1%   +6.52% (p=0.000 n=10)
MapAccessHit/Int64/65536-16                  10.39n ±  2%    10.88n ±   2%   +4.71% (p=0.000 n=10)
MapAccessHit/Int32/6-16                      6.629n ±  5%    6.695n ±   2%        ~ (p=0.218 n=10)
MapAccessHit/Int32/12-16                     6.700n ±  8%    6.718n ±   4%        ~ (p=0.912 n=10)
MapAccessHit/Int32/18-16                     6.593n ±  1%    6.694n ±  17%        ~ (p=0.075 n=10)
MapAccessHit/Int32/24-16                     7.533n ± 13%    7.436n ±  11%        ~ (p=0.912 n=10)
MapAccessHit/Int32/30-16                     6.692n ±  6%    6.693n ±   6%        ~ (p=0.930 n=10)
MapAccessHit/Int32/64-16                     6.769n ± 10%    6.824n ±   3%        ~ (p=0.796 n=10)
MapAccessHit/Int32/128-16                    6.896n ±  3%    6.905n ±   5%        ~ (p=0.436 n=10)
MapAccessHit/Int32/256-16                    6.806n ±  3%    6.905n ±   4%        ~ (p=0.165 n=10)
MapAccessHit/Int32/512-16                    6.959n ±  3%    6.935n ±   2%        ~ (p=0.725 n=10)
MapAccessHit/Int32/1024-16                   7.058n ±  2%    7.055n ±   2%        ~ (p=0.971 n=10)
MapAccessHit/Int32/2048-16                   7.467n ±  3%    7.490n ±   1%        ~ (p=0.289 n=10)
MapAccessHit/Int32/4096-16                   7.726n ±  1%    8.467n ±   2%   +9.59% (p=0.000 n=10)
MapAccessHit/Int32/8192-16                   7.837n ±  1%    8.754n ±   1%  +11.69% (p=0.000 n=10)
MapAccessHit/Int32/65536-16                  10.19n ±  2%    11.09n ±   2%   +8.83% (p=0.000 n=10)
MapAccessHit/Str/6-16                        9.648n ±  2%    9.252n ±   1%   -4.10% (p=0.000 n=10)
MapAccessHit/Str/12-16                       9.857n ± 19%    9.334n ±  25%        ~ (p=0.165 n=10)
MapAccessHit/Str/18-16                       9.631n ±  1%    9.260n ±   4%   -3.85% (p=0.002 n=10)
MapAccessHit/Str/24-16                      10.293n ± 16%    9.344n ±  20%        ~ (p=0.063 n=10)
MapAccessHit/Str/30-16                       9.731n ±  1%    9.345n ±   8%        ~ (p=0.469 n=10)
MapAccessHit/Str/64-16                      10.086n ±  4%    9.304n ±   8%   -7.75% (p=0.011 n=10)
MapAccessHit/Str/128-16                     10.615n ±  2%    9.980n ±   3%   -5.98% (p=0.000 n=10)
MapAccessHit/Str/256-16                      11.25n ±  1%    10.59n ±   2%   -5.91% (p=0.001 n=10)
MapAccessHit/Str/512-16                      11.61n ±  2%    10.94n ±   2%   -5.81% (p=0.000 n=10)
MapAccessHit/Str/1024-16                     12.39n ±  2%    11.48n ±   2%   -7.38% (p=0.000 n=10)
MapAccessHit/Str/2048-16                     12.65n ±  1%    11.86n ±   2%   -6.28% (p=0.000 n=10)
MapAccessHit/Str/4096-16                     13.09n ±  1%    13.52n ±   2%   +3.32% (p=0.000 n=10)
MapAccessHit/Str/8192-16                     13.97n ±  5%    14.48n ±   1%   +3.65% (p=0.018 n=10)
MapAccessHit/Str/65536-16                    17.27n ±  3%    17.41n ±   2%        ~ (p=0.342 n=10)
MapAccessMiss/Int64/6-16                     5.954n ±  2%    5.832n ±   2%   -2.04% (p=0.001 n=10)
MapAccessMiss/Int64/12-16                   12.695n ± 53%    5.910n ± 123%  -53.45% (p=0.035 n=10)
MapAccessMiss/Int64/18-16                    5.884n ± 57%    5.607n ±   3%   -4.71% (p=0.000 n=10)
MapAccessMiss/Int64/24-16                    9.572n ± 69%    9.649n ±  65%        ~ (p=0.853 n=10)
MapAccessMiss/Int64/30-16                    5.859n ±  1%    5.531n ±   2%   -5.61% (p=0.001 n=10)
MapAccessMiss/Int64/64-16                    6.163n ±  9%    6.479n ±  11%        ~ (p=0.579 n=10)
MapAccessMiss/Int64/128-16                   6.401n ± 13%    6.438n ±  25%        ~ (p=0.631 n=10)
MapAccessMiss/Int64/256-16                   6.482n ±  5%    6.445n ±   5%        ~ (p=0.971 n=10)
MapAccessMiss/Int64/512-16                   6.237n ±  6%    6.471n ±   4%        ~ (p=0.256 n=10)
MapAccessMiss/Int64/1024-16                  6.747n ±  3%    6.473n ±   6%   -4.06% (p=0.029 n=10)
MapAccessMiss/Int64/2048-16                  6.616n ±  3%    6.429n ±   2%   -2.82% (p=0.029 n=10)
MapAccessMiss/Int64/4096-16                  6.866n ±  2%    7.665n ±   2%  +11.63% (p=0.000 n=10)
MapAccessMiss/Int64/8192-16                  7.135n ±  3%    7.993n ±   2%  +12.03% (p=0.000 n=10)
MapAccessMiss/Int64/65536-16                 9.655n ±  5%   10.310n ±   3%   +6.78% (p=0.000 n=10)
MapAccessMiss/Int32/6-16                     5.952n ±  2%    5.899n ±   3%        ~ (p=0.101 n=10)
MapAccessMiss/Int32/12-16                    12.74n ± 54%    13.11n ±  55%        ~ (p=0.093 n=10)
MapAccessMiss/Int32/18-16                    5.761n ±  1%    5.697n ±   3%        ~ (p=0.280 n=10)
MapAccessMiss/Int32/24-16                    9.509n ± 71%   15.860n ±  40%        ~ (p=0.105 n=10)
MapAccessMiss/Int32/30-16                    5.710n ± 31%    5.571n ±  31%   -2.44% (p=0.009 n=10)
MapAccessMiss/Int32/64-16                    6.569n ± 14%    6.081n ±   8%        ~ (p=0.075 n=10)
MapAccessMiss/Int32/128-16                   6.594n ±  8%    6.530n ±  12%        ~ (p=0.739 n=10)
MapAccessMiss/Int32/256-16                   6.369n ±  7%    6.651n ±   6%        ~ (p=0.218 n=10)
MapAccessMiss/Int32/512-16                   6.481n ±  4%    6.536n ±   7%        ~ (p=0.781 n=10)
MapAccessMiss/Int32/1024-16                  6.652n ±  3%    6.502n ±   6%        ~ (p=0.072 n=10)
MapAccessMiss/Int32/2048-16                  6.492n ±  5%    6.479n ±   3%        ~ (p=1.000 n=10)
MapAccessMiss/Int32/4096-16                  6.881n ±  3%    7.679n ±   3%  +11.58% (p=0.000 n=10)
MapAccessMiss/Int32/8192-16                  6.902n ±  3%    7.948n ±   1%  +15.16% (p=0.000 n=10)
MapAccessMiss/Int32/65536-16                 9.342n ±  4%   10.200n ±   3%   +9.18% (p=0.000 n=10)
MapAccessMiss/Str/6-16                       8.100n ± 12%    7.982n ±   7%        ~ (p=0.325 n=10)
MapAccessMiss/Str/12-16                      8.622n ± 15%    9.386n ±  16%   +8.86% (p=0.043 n=10)
MapAccessMiss/Str/18-16                      8.951n ± 11%    8.816n ±   8%        ~ (p=0.165 n=10)
MapAccessMiss/Str/24-16                      9.342n ±  7%    9.331n ±   9%        ~ (p=0.739 n=10)
MapAccessMiss/Str/30-16                      8.056n ±  6%    7.982n ±   7%        ~ (p=0.631 n=10)
MapAccessMiss/Str/64-16                      8.498n ±  6%    8.973n ±  10%        ~ (p=0.063 n=10)
MapAccessMiss/Str/128-16                     8.428n ±  5%    8.785n ±   6%        ~ (p=0.075 n=10)
MapAccessMiss/Str/256-16                     8.534n ±  3%    8.545n ±   6%        ~ (p=0.971 n=10)
MapAccessMiss/Str/512-16                     8.592n ±  2%    8.513n ±   8%        ~ (p=0.481 n=10)
MapAccessMiss/Str/1024-16                    8.809n ±  3%    9.086n ±   7%        ~ (p=0.190 n=10)
MapAccessMiss/Str/2048-16                    8.995n ±  3%    9.083n ±   2%        ~ (p=0.315 n=10)
MapAccessMiss/Str/4096-16                    9.207n ±  1%   10.019n ±   2%   +8.83% (p=0.000 n=10)
MapAccessMiss/Str/8192-16                    9.851n ±  2%   10.410n ±   2%   +5.67% (p=0.000 n=10)
MapAccessMiss/Str/65536-16                   12.14n ±  1%    12.67n ±   2%   +4.37% (p=0.000 n=10)
MapAssignGrow/Int64/6-16                     250.1n ±  1%    308.6n ±   0%  +23.39% (p=0.000 n=10)
MapAssignGrow/Int64/12-16                    590.8n ±  1%    699.9n ±   1%  +18.47% (p=0.000 n=10)
MapAssignGrow/Int64/18-16                    1.124µ ±  0%    1.319µ ±   2%  +17.40% (p=0.000 n=10)
MapAssignGrow/Int64/24-16                    1.254µ ±  0%    1.451µ ±   3%  +15.67% (p=0.000 n=10)
MapAssignGrow/Int64/30-16                    2.223µ ±  0%    2.527µ ±   1%  +13.70% (p=0.000 n=10)
MapAssignGrow/Int64/64-16                    4.581µ ±  1%    5.065µ ±   0%  +10.55% (p=0.000 n=10)
MapAssignGrow/Int64/128-16                   9.175µ ±  0%    9.947µ ±   0%   +8.42% (p=0.000 n=10)
MapAssignGrow/Int64/256-16                   18.19µ ±  0%    19.40µ ±   1%   +6.65% (p=0.000 n=10)
MapAssignGrow/Int64/512-16                   36.20µ ±  1%    38.08µ ±   0%   +5.18% (p=0.000 n=10)
MapAssignGrow/Int64/1024-16                  72.65µ ±  2%    76.01µ ±   1%   +4.62% (p=0.000 n=10)
MapAssignGrow/Int64/2048-16                  144.4µ ±  1%    151.8µ ±   0%   +5.12% (p=0.000 n=10)
MapAssignGrow/Int64/4096-16                  289.3µ ±  1%    356.2µ ±   2%  +23.12% (p=0.000 n=10)
MapAssignGrow/Int64/8192-16                  580.3µ ±  0%    784.5µ ±   1%  +35.19% (p=0.000 n=10)
MapAssignGrow/Int64/65536-16                 5.166m ±  2%    7.346m ±   0%  +42.20% (p=0.000 n=10)
MapAssignGrow/Int32/6-16                     244.7n ±  1%    339.6n ±   1%  +38.83% (p=0.000 n=10)
MapAssignGrow/Int32/12-16                    568.6n ±  1%    720.8n ±   1%  +26.77% (p=0.000 n=10)
MapAssignGrow/Int32/18-16                    1.056µ ±  1%    1.326µ ±   1%  +25.51% (p=0.000 n=10)
MapAssignGrow/Int32/24-16                    1.190µ ±  1%    1.456µ ±   0%  +22.35% (p=0.000 n=10)
MapAssignGrow/Int32/30-16                    2.073µ ±  1%    2.540µ ±   0%  +22.53% (p=0.000 n=10)
MapAssignGrow/Int32/64-16                    4.255µ ±  1%    5.079µ ±   0%  +19.37% (p=0.000 n=10)
MapAssignGrow/Int32/128-16                   8.462µ ±  1%   10.049µ ±   1%  +18.75% (p=0.000 n=10)
MapAssignGrow/Int32/256-16                   16.73µ ±  1%    19.43µ ±   1%  +16.18% (p=0.000 n=10)
MapAssignGrow/Int32/512-16                   33.29µ ±  0%    38.10µ ±   0%  +14.46% (p=0.000 n=10)
MapAssignGrow/Int32/1024-16                  66.10µ ±  1%    76.65µ ±   1%  +15.96% (p=0.000 n=10)
MapAssignGrow/Int32/2048-16                  132.9µ ±  0%    152.6µ ±   1%  +14.80% (p=0.000 n=10)
MapAssignGrow/Int32/4096-16                  266.9µ ±  0%    357.2µ ±   0%  +33.80% (p=0.000 n=10)
MapAssignGrow/Int32/8192-16                  538.2µ ±  0%    790.4µ ±   1%  +46.85% (p=0.000 n=10)
MapAssignGrow/Int32/65536-16                 4.586m ±  1%    7.451m ±   1%  +62.46% (p=0.000 n=10)
MapAssignGrow/Str/6-16                       310.9n ±  0%    408.8n ±   1%  +31.47% (p=0.000 n=10)
MapAssignGrow/Str/12-16                      771.1n ±  0%    933.2n ±   0%  +21.01% (p=0.000 n=10)
MapAssignGrow/Str/18-16                      1.512µ ±  0%    1.798µ ±   0%  +18.88% (p=0.000 n=10)
MapAssignGrow/Str/24-16                      1.678µ ±  0%    2.019µ ±   1%  +20.36% (p=0.000 n=10)
MapAssignGrow/Str/30-16                      2.968µ ±  1%    3.480µ ±   0%  +17.27% (p=0.000 n=10)
MapAssignGrow/Str/64-16                      6.028µ ±  0%    6.911µ ±   0%  +14.64% (p=0.000 n=10)
MapAssignGrow/Str/128-16                     11.89µ ±  1%    13.53µ ±   1%  +13.81% (p=0.000 n=10)
MapAssignGrow/Str/256-16                     23.66µ ±  1%    26.66µ ±   0%  +12.68% (p=0.000 n=10)
MapAssignGrow/Str/512-16                     46.93µ ±  0%    52.29µ ±   1%  +11.44% (p=0.000 n=10)
MapAssignGrow/Str/1024-16                    94.57µ ±  0%   103.92µ ±   1%   +9.89% (p=0.000 n=10)
MapAssignGrow/Str/2048-16                    192.8µ ±  0%    210.6µ ±   1%   +9.22% (p=0.000 n=10)
MapAssignGrow/Str/4096-16                    399.2µ ±  1%    469.2µ ±   4%  +17.54% (p=0.000 n=10)
MapAssignGrow/Str/8192-16                    820.4µ ±  1%   1019.2µ ±   0%  +24.23% (p=0.000 n=10)
MapAssignGrow/Str/65536-16                   10.14m ±  4%    10.83m ±   1%   +6.84% (p=0.000 n=10)
MapAssignPreAllocate/Int64/6-16              252.7n ±  0%    296.3n ±   0%  +17.28% (p=0.000 n=10)
MapAssignPreAllocate/Int64/12-16             377.3n ±  0%    463.5n ±   0%  +22.85% (p=0.000 n=10)
MapAssignPreAllocate/Int64/18-16             536.2n ±  1%    623.5n ±   3%  +16.26% (p=0.000 n=10)
MapAssignPreAllocate/Int64/24-16             662.7n ±  0%    760.0n ±   0%  +14.68% (p=0.000 n=10)
MapAssignPreAllocate/Int64/30-16             828.4n ±  1%    954.9n ±   0%  +15.27% (p=0.000 n=10)
MapAssignPreAllocate/Int64/64-16             1.566µ ±  1%    1.791µ ±   0%  +14.34% (p=0.000 n=10)
MapAssignPreAllocate/Int64/128-16            2.964µ ±  1%    3.226µ ±   2%   +8.84% (p=0.000 n=10)
MapAssignPreAllocate/Int64/256-16            5.588µ ±  1%    6.232µ ±   1%  +11.54% (p=0.000 n=10)
MapAssignPreAllocate/Int64/512-16            10.88µ ±  0%    11.61µ ±   4%   +6.70% (p=0.000 n=10)
MapAssignPreAllocate/Int64/1024-16           22.51µ ±  2%    23.92µ ±   0%   +6.27% (p=0.000 n=10)
MapAssignPreAllocate/Int64/2048-16           45.26µ ±  1%    48.58µ ±   1%   +7.35% (p=0.000 n=10)
MapAssignPreAllocate/Int64/4096-16           91.40µ ±  1%   114.93µ ±   1%  +25.75% (p=0.000 n=10)
MapAssignPreAllocate/Int64/8192-16           186.9µ ±  1%    242.2µ ±   1%  +29.57% (p=0.000 n=10)
MapAssignPreAllocate/Int64/65536-16          2.162m ±  2%    2.873m ±   2%  +32.85% (p=0.000 n=10)
MapAssignPreAllocate/Int32/6-16              241.7n ±  1%    300.2n ±   1%  +24.23% (p=0.000 n=10)
MapAssignPreAllocate/Int32/12-16             370.3n ±  0%    466.8n ±   0%  +26.04% (p=0.000 n=10)
MapAssignPreAllocate/Int32/18-16             525.7n ±  0%    631.8n ±   1%  +20.17% (p=0.000 n=10)
MapAssignPreAllocate/Int32/24-16             649.8n ±  0%    765.5n ±   1%  +17.81% (p=0.000 n=10)
MapAssignPreAllocate/Int32/30-16             801.8n ±  0%    970.3n ±   1%  +21.00% (p=0.000 n=10)
MapAssignPreAllocate/Int32/64-16             1.496µ ±  1%    1.805µ ±   1%  +20.66% (p=0.000 n=10)
MapAssignPreAllocate/Int32/128-16            2.815µ ±  1%    3.355µ ±   1%  +19.17% (p=0.000 n=10)
MapAssignPreAllocate/Int32/256-16            5.400µ ±  1%    6.269µ ±   1%  +16.09% (p=0.000 n=10)
MapAssignPreAllocate/Int32/512-16            10.67µ ±  0%    12.15µ ±   0%  +13.88% (p=0.000 n=10)
MapAssignPreAllocate/Int32/1024-16           21.00µ ±  1%    24.91µ ±   0%  +18.61% (p=0.000 n=10)
MapAssignPreAllocate/Int32/2048-16           44.21µ ±  1%    50.66µ ±   1%  +14.61% (p=0.000 n=10)
MapAssignPreAllocate/Int32/4096-16           89.86µ ±  0%   118.38µ ±   1%  +31.74% (p=0.000 n=10)
MapAssignPreAllocate/Int32/8192-16           184.9µ ±  0%    248.6µ ±   1%  +34.46% (p=0.000 n=10)
MapAssignPreAllocate/Int32/65536-16          2.139m ±  3%    2.969m ±   1%  +38.82% (p=0.000 n=10)
MapAssignPreAllocate/Str/6-16                319.5n ±  0%    368.6n ±   1%  +15.35% (p=0.000 n=10)
MapAssignPreAllocate/Str/12-16               515.3n ±  0%    604.7n ±   2%  +17.34% (p=0.000 n=10)
MapAssignPreAllocate/Str/18-16               740.6n ±  1%    862.4n ±   1%  +16.45% (p=0.000 n=10)
MapAssignPreAllocate/Str/24-16               898.2n ±  1%   1049.0n ±   1%  +16.78% (p=0.000 n=10)
MapAssignPreAllocate/Str/30-16               1.147µ ±  0%    1.341µ ±   0%  +16.92% (p=0.000 n=10)
MapAssignPreAllocate/Str/64-16               2.201µ ±  0%    2.522µ ±   1%  +14.61% (p=0.000 n=10)
MapAssignPreAllocate/Str/128-16              4.070µ ±  1%    4.664µ ±   0%  +14.58% (p=0.000 n=10)
MapAssignPreAllocate/Str/256-16              8.011µ ±  0%    9.105µ ±   2%  +13.67% (p=0.000 n=10)
MapAssignPreAllocate/Str/512-16              15.77µ ±  1%    17.92µ ±   1%  +13.64% (p=0.000 n=10)
MapAssignPreAllocate/Str/1024-16             32.82µ ±  1%    36.80µ ±   0%  +12.13% (p=0.000 n=10)
MapAssignPreAllocate/Str/2048-16             67.33µ ±  1%    75.25µ ±   0%  +11.76% (p=0.000 n=10)
MapAssignPreAllocate/Str/4096-16             140.1µ ±  1%    175.6µ ±   1%  +25.34% (p=0.000 n=10)
MapAssignPreAllocate/Str/8192-16             318.4µ ±  1%    399.6µ ±   1%  +25.51% (p=0.000 n=10)
MapAssignPreAllocate/Str/65536-16            3.702m ±  1%    5.426m ±   1%  +46.58% (p=0.000 n=10)
MapAssignReuse/Int64/6-16                    63.90n ±  5%    73.40n ±   3%  +14.88% (p=0.000 n=10)
MapAssignReuse/Int64/12-16                   118.7n ±  1%    155.2n ±   8%  +30.80% (p=0.000 n=10)
MapAssignReuse/Int64/18-16                   177.3n ±  1%    224.0n ±   3%  +26.40% (p=0.000 n=10)
MapAssignReuse/Int64/24-16                   240.8n ±  2%    280.2n ±   5%  +16.36% (p=0.000 n=10)
MapAssignReuse/Int64/30-16                   281.2n ±  4%    338.4n ±   3%  +20.34% (p=0.000 n=10)
MapAssignReuse/Int64/64-16                   582.7n ±  2%    888.2n ±   2%  +52.43% (p=0.000 n=10)
MapAssignReuse/Int64/128-16                  1.142µ ±  3%    1.421µ ±   3%  +24.39% (p=0.000 n=10)
MapAssignReuse/Int64/256-16                  2.317µ ±  1%    3.186µ ±   5%  +37.48% (p=0.000 n=10)
MapAssignReuse/Int64/512-16                  4.612µ ±  1%    6.098µ ±   3%  +32.22% (p=0.000 n=10)
MapAssignReuse/Int64/1024-16                 9.251µ ±  2%   11.401µ ±   1%  +23.24% (p=0.000 n=10)
MapAssignReuse/Int64/2048-16                 19.71µ ±  1%    24.34µ ±   1%  +23.46% (p=0.000 n=10)
MapAssignReuse/Int64/4096-16                 40.42µ ±  3%    57.29µ ±   2%  +41.75% (p=0.000 n=10)
MapAssignReuse/Int64/8192-16                 83.04µ ±  4%   117.62µ ±   1%  +41.65% (p=0.000 n=10)
MapAssignReuse/Int64/65536-16                900.2µ ±  1%   1321.0µ ±   1%  +46.75% (p=0.000 n=10)
MapAssignReuse/Int32/6-16                    64.00n ±  2%    75.00n ±   2%  +17.20% (p=0.000 n=10)
MapAssignReuse/Int32/12-16                   121.2n ±  2%    151.2n ±   4%  +24.75% (p=0.000 n=10)
MapAssignReuse/Int32/18-16                   176.5n ±  4%    211.9n ±   2%  +20.08% (p=0.000 n=10)
MapAssignReuse/Int32/24-16                   239.8n ±  1%    279.8n ±   1%  +16.66% (p=0.000 n=10)
MapAssignReuse/Int32/30-16                   282.2n ±  2%    353.2n ±   1%  +25.16% (p=0.000 n=10)
MapAssignReuse/Int32/64-16                   580.5n ±  4%    736.4n ±   4%  +26.86% (p=0.000 n=10)
MapAssignReuse/Int32/128-16                  1.149µ ±  3%    1.464µ ±   2%  +27.47% (p=0.000 n=10)
MapAssignReuse/Int32/256-16                  2.293µ ±  1%    2.907µ ±   1%  +26.78% (p=0.000 n=10)
MapAssignReuse/Int32/512-16                  4.560µ ±  1%    5.790µ ±   2%  +26.96% (p=0.000 n=10)
MapAssignReuse/Int32/1024-16                 9.118µ ±  2%   11.758µ ±   1%  +28.96% (p=0.000 n=10)
MapAssignReuse/Int32/2048-16                 19.36µ ±  1%    24.75µ ±   2%  +27.85% (p=0.000 n=10)
MapAssignReuse/Int32/4096-16                 40.15µ ±  1%    58.61µ ±   3%  +45.98% (p=0.000 n=10)
MapAssignReuse/Int32/8192-16                 80.71µ ±  3%   120.79µ ±   4%  +49.66% (p=0.000 n=10)
MapAssignReuse/Int32/65536-16                876.2µ ±  1%   1341.3µ ±   1%  +53.07% (p=0.000 n=10)
MapAssignReuse/Str/6-16                      93.27n ±  1%   101.05n ±   2%   +8.34% (p=0.000 n=10)
MapAssignReuse/Str/12-16                     193.3n ±  1%    199.2n ±   1%   +3.03% (p=0.000 n=10)
MapAssignReuse/Str/18-16                     297.9n ±  1%    294.7n ±   1%   -1.07% (p=0.024 n=10)
MapAssignReuse/Str/24-16                     387.0n ±  1%    394.7n ±   1%   +1.99% (p=0.000 n=10)
MapAssignReuse/Str/30-16                     504.8n ±  2%    494.4n ±   2%   -2.05% (p=0.015 n=10)
MapAssignReuse/Str/64-16                     1.052µ ±  2%    1.044µ ±   2%   -0.67% (p=0.010 n=10)
MapAssignReuse/Str/128-16                    2.107µ ±  1%    2.067µ ±   2%   -1.88% (p=0.001 n=10)
MapAssignReuse/Str/256-16                    4.200µ ±  1%    4.095µ ±   4%   -2.50% (p=0.001 n=10)
MapAssignReuse/Str/512-16                    8.428µ ±  1%    8.169µ ±   1%   -3.07% (p=0.000 n=10)
MapAssignReuse/Str/1024-16                   17.43µ ±  2%    16.99µ ±   2%   -2.52% (p=0.000 n=10)
MapAssignReuse/Str/2048-16                   35.99µ ±  2%    35.03µ ±   2%   -2.68% (p=0.004 n=10)
MapAssignReuse/Str/4096-16                   73.59µ ±  2%    80.23µ ±   4%   +9.02% (p=0.000 n=10)
MapAssignReuse/Str/8192-16                   157.6µ ±  3%    173.0µ ±   2%   +9.79% (p=0.000 n=10)
MapAssignReuse/Str/65536-16                  1.643m ±  4%    1.773m ±   1%   +7.91% (p=0.000 n=10)
geomean                                      506.9n          558.9n         +10.26%

Change https://go.dev/cl/569342 mentions this issue: cmd/compile,runtime: reorder hiter fields

I did another round of testing with https://go.dev/cl/569342. All of the failures I see come from jsoniter -> reflect2 (https://github.com/modern-go/reflect2/blob/master/unsafe_map.go#L104-L130), as mentioned by @zhangyunhao116 in #54766 (comment).

Change https://go.dev/cl/580779 mentions this issue: all: split old and swiss map abi and compiler integration

Change https://go.dev/cl/580778 mentions this issue: all: create swissmap experiment and fork files

Change https://go.dev/cl/580777 mentions this issue: runtime: move zeroVal out of map.go

Change https://go.dev/cl/580916 mentions this issue: DO NOT SUBMIT: runtime: delete swiss map iter implementation

Change https://go.dev/cl/580915 mentions this issue: cmd/compile,runtime: disable swissmap fast variants

Change https://go.dev/cl/582415 mentions this issue: internal/maps: initial swiss table map implementation

Change https://go.dev/cl/582417 mentions this issue: internal/maps: make map natively support 64-bit sizes

Change https://go.dev/cl/582421 mentions this issue: internal/maps: add Clear

Change https://go.dev/cl/582423 mentions this issue: internal/maps: clear zeroed slots

Change https://go.dev/cl/582424 mentions this issue: internal/maps: add iteration

Change https://go.dev/cl/582418 mentions this issue: internal/maps: add basic growing

Change https://go.dev/cl/582416 mentions this issue: internal/maps: add deletion

Change https://go.dev/cl/582420 mentions this issue: internal/maps: add load factor based grow

Change https://go.dev/cl/582422 mentions this issue: internal/maps: add rehash in place

Change https://go.dev/cl/594597 mentions this issue: DO NOT SUBMIT: all: enable GOEXPERIMENT=swissmap by default

Change https://go.dev/cl/594596 mentions this issue: all: wire up swisstable maps

Change https://go.dev/cl/594656 mentions this issue: reflect: add flag tests for MapOf

Change https://go.dev/cl/595116 mentions this issue: internal/runtime/sys: move from runtime/internal/sys

Change https://go.dev/cl/595295 mentions this issue: cmd/gomote,devapp: handle internal/runtime/sys move

Change https://go.dev/cl/595558 mentions this issue: runtime: don't use maps in js note implementation

Change https://go.dev/cl/596295 mentions this issue: runtime: more thorough map benchmarks

Change https://go.dev/cl/602555 mentions this issue: Add linux-amd64-longtest-swissmap builder

Change https://go.dev/cl/604936 mentions this issue: cmd/compile,internal/runtime/maps: add extendible hashing