performance: reducing object allocation overhead via interning
Jesse-Cameron opened this issue · 0 comments
Hi friends,
I wanted to start a discussion about the performance of go-jsonnet, specifically how long it takes go-jsonnet to run bench.03.
Locally, when I run this snippet in a Go benchmark, it takes ~320ms and allocates ~190MB, which feels slower than it should be.
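For anyone wanting to collect the same kind of numbers (time, bytes, and allocation count per op), here is a minimal sketch using only the standard `testing` package. It is not the actual go-jsonnet benchmark: `evaluate` is a hypothetical stand-in workload, and in a real run its body would be replaced by evaluating bench.03 through the VM.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// evaluate is a hypothetical stand-in for evaluating a Jsonnet program.
// It exists only so the benchmark has something that allocates.
func evaluate(n int) string {
	parts := make([]string, 0, n)
	for i := 0; i < n; i++ {
		parts = append(parts, fmt.Sprintf("field%d", i))
	}
	return strings.Join(parts, ",")
}

// measure runs the workload under the standard benchmark harness and
// returns the result, which carries time, bytes, and allocations per op.
func measure(n int) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			_ = evaluate(n)
		}
	})
}

func main() {
	r := measure(1000)
	fmt.Printf("%d ns/op, %d B/op, %d allocs/op\n",
		r.NsPerOp(), r.AllocedBytesPerOp(), r.AllocsPerOp())
}
```

Tracking bytes per op and allocations per op separately matters here, because a change can shrink one without moving the other.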
Investigation
I started with some initial profiling using pprof, but found no low-hanging fruit that I felt could dramatically improve perf. My working theory is that each time a new function call is pushed onto the call stack, there's a really high allocation overhead.
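As a rough illustration of this kind of profiling, the sketch below captures a heap profile with the standard `runtime/pprof` package. The `heapProfile` helper and the workload in `main` are hypothetical names made up for this example, not part of go-jsonnet; in practice, `go test -bench` with `-memprofile` is an equally valid route.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// heapProfile runs work and returns a heap profile in pprof's
// gzip-compressed protobuf format. (Hypothetical helper for illustration.)
func heapProfile(work func()) ([]byte, error) {
	work()
	runtime.GC() // flush recent allocations into the profile statistics
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	var sink [][]byte
	data, err := heapProfile(func() {
		// Stand-in workload: many small allocations.
		for i := 0; i < 100000; i++ {
			sink = append(sink, make([]byte, 64))
		}
	})
	if err != nil {
		fmt.Println("profile failed:", err)
		return
	}
	fmt.Printf("captured %d bytes of heap profile for %d allocations\n", len(data), len(sink))
}
```

Writing the returned bytes to a file and opening it with `go tool pprof -sample_index=alloc_space` shows which call sites account for the most allocated memory. Heap profiling is sampled, so very small workloads may not show up in the output.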
After reading through issue #111, I wanted to experiment with string interning to reduce allocations. Using the go4.org/intern package, I replaced the Identifier type and updated the references to it.
diff --git a/ast/ast.go b/ast/ast.go
index 90e970f..94b2a19 100644
--- a/ast/ast.go
+++ b/ast/ast.go
@@ -19,15 +19,25 @@ package ast
import (
"fmt"
+
+ "go4.org/intern"
)
// Identifier represents a variable / parameter / field name.
// +gen set
-type Identifier string
+type Identifier *intern.Value
+func NewIdentifier(s string) Identifier {
+ return Identifier(intern.GetByString(s))
+}
+
+func GetString(i *intern.Value) string {
+ return i.Get().(string)
+}
+
You can check the whole changeset here. Getting the stdast dump to work with the interned strings was a bit of fun 😅!
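To make the idea behind the change concrete, here is a minimal, stdlib-only sketch of string interning. It is not how go4.org/intern is implemented (as far as I understand, that package lets unused values be garbage collected, while this pool never releases entries), and the `Interner` type is a hypothetical name used only for illustration.

```go
package main

import "fmt"

// Interner is a hypothetical, non-concurrent string pool. Equal strings
// map to the same pointer, so comparing identifiers is a pointer compare.
type Interner struct {
	pool map[string]*string
}

// NewInterner returns an empty pool.
func NewInterner() *Interner {
	return &Interner{pool: make(map[string]*string)}
}

// Intern returns the canonical pointer for s. A new entry is stored only
// the first time a given string value is seen.
func (in *Interner) Intern(s string) *string {
	if p, ok := in.pool[s]; ok {
		return p
	}
	p := &s
	in.pool[s] = p
	return p
}

func main() {
	in := NewInterner()
	a := in.Intern("std")
	b := in.Intern(string([]byte{'s', 't', 'd'})) // equal value, built separately
	c := in.Intern("self")
	fmt.Println(a == b, a == c, *a) // prints "true false std"
}
```

The trade-off is that every lookup pays for a map access, so interning tends to help when the same identifiers are stored or compared many times, and helps less when the strings were never being copied in the first place.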
Unfortunately, the performance improvement from this change left me wanting more. Shaving a handful of ms and MB off of execution doesn't feel like it will be noticeable to users.
name   old time/op     new time/op     delta
_VM-8  316ms ± 0%      298ms ± 0%      ~  (p=1.000 n=1+1)

name   old alloc/op    new alloc/op    delta
_VM-8  186MB ± 0%      155MB ± 0%      ~  (p=1.000 n=1+1)

name   old allocs/op   new allocs/op   delta
_VM-8  3.64M ± 0%      3.64M ± 0%      ~  (p=1.000 n=1+1)
Discussion
If the maintainers think it's worth it, I'm more than happy to clean up the code above and submit a PR. But I'm still questioning a few things:
- Is there a more effective way to reduce allocations outside of interning?
- Does adding string interning offer enough ROI to be worth it?
- Did I miss something obvious with my interning implementation that caps the perf gain to a certain amount?
Thanks for reading this far! Keen to hear folks' thoughts 😁 😁