dolthub/swiss

Advice: I think there are many ways to make go-swiss better

ahfuzhang opened this issue · 2 comments

Give some advice about this lib:

  1. ctrl byte is everything, so no need to set kv when delete.
    those lines are not need:

    swiss/map.go

    Line 141 in c0064a0

    m.groups[g].keys[s] = key

swiss/map.go

Line 190 in c0064a0

m.groups[g].keys[s] = k

swiss/map.go

Line 246 in c0064a0

for i := range m.groups {

2.when init ctrl bytes, we can use simd or unsafe code, not one by one.
see:

swiss/map.go

Line 69 in c0064a0

for i := range m.ctrl {

3.If the group's ctrl bytes are all empty, Iter() can be faster.
see:

swiss/map.go

Line 220 in c0064a0

for n := 0; n < len(groups); n++ {

we can add: if metaMatchEmpty(&m.ctrl[g])==0xffff{continue;}

4.Let ctrl bytes slice address is align to 32, so we can use _mm_load_128() to replace __mm_loadu_128():
see:

swiss/map.go

Line 64 in c0064a0

ctrl: make([]metadata, groups),

alloc more 32 bytes, and use unsafe code to choose a start address of 128bit alignment.

  1. If group count is power of 2, not need to use if to check range.
    see:

    swiss/map.go

    Line 96 in c0064a0

    if g >= uint32(len(m.groups)) {

    g = g & 0xffff // if group count is 65536

6.try groupSize=32, and use avx2 to match 32 bytes each time.

  1. XXHash might be better than golang map hasher.

I think I understand the purpose of those lines on advice 1:

when the content of the object is the same but the pointer is different, replace it with the current object. And when delete, we should set to null for gc.

However, for ultimate performance optimization, it is recommended to provide a value type version. Then remove the operations that are valid for reference types but not valid for value types.