bytedance/sonic

optimize: extra string/[]byte copies when calling Unmarshal

mmosky opened this issue · 1 comment

When using sonic.Unmarshal, unnecessary string and []byte conversions occur, leading to additional copies.

Initially, the byte slice passed to sonic.Unmarshal is converted to a string, resulting in the first copy.

Subsequently, UnmarshalFromString is invoked, internally employing bytes.NewBufferString, which again converts the string back to a byte slice, resulting in the second copy.
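The two conversions described above can be reproduced with the standard library alone. This is a minimal sketch, not sonic's actual code: `roundTrip` is a hypothetical helper that only mimics the `[]byte` to `string` to `[]byte` chain, and it shows that each step yields independent memory (that is, a copy).

```go
package main

import (
	"bytes"
	"fmt"
)

// roundTrip is a hypothetical helper (not part of sonic) that mimics the
// conversion chain from the issue: []byte -> string -> []byte in a Buffer.
func roundTrip(data []byte) (string, *bytes.Buffer) {
	s := string(data)               // first copy: the string owns its own bytes
	buf := bytes.NewBufferString(s) // second copy: the buffer owns its own []byte
	return s, buf
}

func main() {
	data := []byte("123")
	s, buf := roundTrip(data)

	data[0] = '9'        // mutate the caller's slice
	buf.Bytes()[1] = '8' // mutate the buffer's contents

	// All three values now differ, so none of them share memory.
	fmt.Println(string(data)) // 923
	fmt.Println(s)            // 123
	fmt.Println(buf.String()) // 183
}
```

Because the three values diverge after mutation, each conversion must have allocated and copied rather than aliasing the input.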

The simplest case:

package bench

import (
	"testing"

	"github.com/bytedance/sonic"
)

func BenchmarkSonicUnmarshal(b *testing.B) {
	b.ReportAllocs() // print allocs/op even without -benchmem
	for i := 0; i < b.N; i++ {
		data := []byte{'1'}
		var x int64
		// err := json.Unmarshal(data, &x) // 3 allocs/op
		err := sonic.Unmarshal(data, &x) // 7 allocs/op
		if err != nil {
			b.Fatal(err)
		}
	}
}

Unmarshal's argument is a []byte, which is MUTABLE in Go and therefore must be converted to an IMMUTABLE string. This is the standard way to comply with Go's spec.
As for performance, you can call UnmarshalFromString or use StreamDecoder in the first place. It all depends on you, as long as YOU KNOW WHAT YOUR BUSINESS IS DOING.
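For readers who want to follow the UnmarshalFromString suggestion without paying for the `[]byte` to `string` copy themselves, one common approach is a zero-copy view built with the `unsafe` package (Go 1.20+). The sketch below uses only the standard library; `bytesToStringNoCopy` and `sharesMemory` are hypothetical helpers, not sonic APIs. It is only valid under the assumption that the caller never modifies the slice while the string is in use, which is the "know what your business is doing" condition.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToStringNoCopy is a hypothetical helper (not part of sonic) that
// reinterprets b as a string without copying. Requires Go 1.20+.
// The caller must never modify b while the returned string is in use.
func bytesToStringNoCopy(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// sharesMemory reports whether s and b start at the same address.
func sharesMemory(s string, b []byte) bool {
	return len(b) > 0 && unsafe.StringData(s) == unsafe.SliceData(b)
}

func main() {
	data := []byte("123")
	view := bytesToStringNoCopy(data)

	fmt.Println(view)                     // 123
	fmt.Println(sharesMemory(view, data)) // true

	// view could now be passed to a string-based decoder such as
	// sonic.UnmarshalFromString instead of calling Unmarshal with data.
}
```

This removes only the caller-side conversion; whether any further copy happens inside the decoder depends on the library version.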