puzpuzpuz/xsync

Is it possible to use zero value of Map/MapOf struct?

elee1766 opened this issue · 5 comments

hi,

followed over here from the golang issue.

was testing using the library as a drop in replacement for existing sync.Map and a generic sync.MapOf[K,V] wrapper that we use, but a big issue is that the zero value isn't valid, which would result in a lot of incompatible refactoring for our code. it worked great when i did initialize them.

ideally, something like this would not panic.

func TestMap_ZeroValueValid(t *testing.T) {
	EnableAssertions()
	m := new(Map)
	v := 42
	m.Store("foo", v)
	m.Store("foo", v)
	DisableAssertions()
}

I have a naive solution that uses a sync.Once, happy to open a PR if this is something you would consider changing.

all i did was just just add a sync.Once

type Map struct {
   totalGrowths int64
   totalShrinks int64
   resizing     int64          // resize in progress flag; updated atomically
   resizeMu     sync.Mutex     // only used along with resizeCond
   resizeCond   sync.Cond      // used to wake up resize waiters (concurrent modifications)
   table        unsafe.Pointer // *mapTable

   initialized sync.Once
}

adding init function

func (m *Map) init(sizeHint int) {
	m.initialized.Do(func() {
		m.resizeCond = *sync.NewCond(&m.resizeMu)
		var table *mapTable
		if sizeHint <= minMapTableCap {
			table = newMapTable(minMapTableLen)
		} else {
			tableLen := nextPowOf2(uint32(sizeHint / entriesPerMapBucket))
			table = newMapTable(int(tableLen))
		}
		atomic.StorePointer(&m.table, unsafe.Pointer(table))
	})
}

changing NewMapPresized function

func NewMapPresized(sizeHint int) *Map {
	m := &Map{}
	m.init(sizeHint)
	return m
}

then i added init to these entrypoints:

func (m *Map) Load(key string) (value interface{}, ok bool) {
	m.init(minMapTableCap)
...
}
func (m *Map) Clear() {
	m.init(minMapTableCap)
...
}
func (m *Map) Size() int {
	m.init(minMapTableCap)
...
}
func (m *Map) Range(f func(key string, value interface{}) bool) {
	m.init(minMapTableCap)
...
}
func (m *Map) doCompute(
	key string,
	valueFn func(oldValue interface{}, loaded bool) (interface{}, bool),
	loadIfExists, computeOnly bool,
) (interface{}, bool) {
	m.init(minMapTableCap)
}

and the same thing for the generic.

First of all, thanks for raising this issue.

Such change would add one more atomic read (atomic.LoadUint32 in case of sync.Once) per each operation. Which this cost is rather low, all users will have to pay it even if they use New* functions. So, I'd like to gather some upvotes on this issue before making this change.

As for your use case, you could implement a simple in-house wrapper struct and use it across your code base:

type Map struct {
	m           *xsync.Map
	initialized sync.Once
}

func (m *Map) init() {
	m.initialized.Do(func() {
		m.m = xsync.NewMap()
	})
}

func (m *Map) Load(key string) (value interface{}, ok bool) {
	m.init()
	return m.m.Load(key)
}

func (m *Map) Store(key string, value interface{}) {
	m.init()
	m.m.Store(key, value)
}

// Other wrapped xsync.Map methods go here...

func main() {
	var m Map
	m.Store("foobar", 42)
	v, _ := m.Load("foobar")
	fmt.Println(v)
}

According to Go memory model, sync.Once introduces a synchronization edge, so plain memory accesses after the m.init() call are fine.

thanks for giving attention. i'll probably just replace directive the package in the project where the performance matters

that makes a lot of sense. If we want to reduce the overhead for those already using the New* functions, could doing something like the below possibly be a compromise? (making the comparison for existing users non atomic) It adds some complexity so I could understand any aversion

type Map struct {
   totalGrowths int64
   totalShrinks int64
   resizing     int64          // resize in progress flag; updated atomically
   resizeMu     sync.Mutex     // only used along with resizeCond
   resizeCond   sync.Cond      // used to wake up resize waiters (concurrent modifications)
   table        unsafe.Pointer // *mapTable

   preinit bool
   initialized sync.Once
}

func NewMapPresized(sizeHint int) *Map {
	m := &Map{}
	m.init(sizeHint)
	m.preinit = true
	return m
}

func (m *Map) init(sizeHint int) {
        if m.preinit {
          return 
        }
	m.initialized.Do(func() {
		m.resizeCond = *sync.NewCond(&m.resizeMu)
		var table *mapTable
		if sizeHint <= minMapTableCap {
			table = newMapTable(minMapTableLen)
		} else {
			tableLen := nextPowOf2(uint32(sizeHint / entriesPerMapBucket))
			table = newMapTable(int(tableLen))
		}
		atomic.StorePointer(&m.table, unsafe.Pointer(table))
	})
}

Yes, I guess that approach would do the job.

ok, i put it here for now https://github.com/elee1766/xsync/blob/elee/map.go

this seems to be okay.

as for MapOf, making the zero value valid would mean being able to reflect out a hasher function i believe? since i dont think you can switch on a type assertion from the generic, i don't know if it's possible to preserve performance without resorting to unsafe. i believe that the package is currently getting some fancy compiler optimizations or something i dont completely understand out of the generic use in mapof which makes it so much faster.

i tried to find out how go themselves picks the function, it looks like it's here? https://github.com/golang/go/blob/master/src/cmd/compile/internal/reflectdata/alg.go#L51 not too sure.

Yes, MapOf would require a setter method for the hasher function which is kind of ugly and error-prone. But since MapOf can't act as a drop-in replacement for sync.Map, I'm not sure if it's worth adding zero value support there.