Performance expectations (implementers and JS developers)
littledan opened this issue · 18 comments
When I've talked with JavaScript developers about this feature, they mention expectations of cheap sharing with workers, cheap structural sharing (e.g., with #[...a, ...b]
not copying everything), and cheap equality. These are key parts of the motivation for the proposal. Something to investigate early is, can these be implemented realistically in JS engines? We should document the state of our understanding in the README, since it's so core to the proposal's use in reality, even if it doesn't affect specification text.
Ok, I purposefully did not talk about structural sharing as I wanted to leave that to the implementers but I can definitely try to spec that out or make it clear it is a motivation.
Well, we will inevitably leave it for implementers, but it would be nice to hear their take on it before getting too far, given JS dev interest.
What implementers can we cc on this issue?
ping @littledan for advice
I believe @syg is looking into this for V8. @kmiller68 and @jorendorff may have thoughts for JSC and SpiderMonkey, and @phoddie for Moddable.
For now, I think we should say something like the following about the performance goals:
A core goal of this proposal is to give JavaScript engines the capability to implement the same kinds of optimizations for this feature as libraries for functional data structures.
When discussing it at more length , we could say:
This proposal is designed to enable classical optimizations for purely functional data structures, including but not limited to:
- Optimizations for making deep equality checks fast:
- For returning
true
quickly, intern ("hash-cons") some data structures- For returning
false
quickly, maintain a hash up the tree of the contents of some structures- Optimizations for manipulating data structures
- In some cases, reuse existing data structures (e.g., when manipulated with object spread), similar to ropes or typical implementations of functional data structures
- In other cases, as determined by the engine, use a flat representation like existing JavaScript object implementations
These optimizations are analogous to the way that modern JavaScript engines handle string concatenation, with various different internal types of strings. The validity of these optimizations rests on the unobservability of the identity of records and tuples. It's not expected that all engines will act identically with respect to these optimizations, but rather, they will each make decisions about particular heuristics to use. Before Stage 4 of this proposal, we plan to publish a guide for best practices for cross-engine optimizable use of Records and Tuples, based on the implementation experience that we will have at that point.
FWIW, from Hermes point of view, this will probably add binary size and complexity (one more case to be handled at runtime). The tradeoffs might be different for a JIT, but the wins in the case of an interpreter are not obvious to me.
I think it probably makes sense for lots of JS implementations not not implement any of these optimizations. They are complex and non-deterministic. However, I don't see how this relates to JIT vs interpreter; this is about the choice of data structures.
In a highly tuned interpreter, every non-predictable branch matters. An additional data structure to support in critical path (like reading a property) will cause a slowdown. JITs can avoid that by specializing the code at runtime.
As programmer, I spect two benefits of inmutable structures, the one is secutiry of not changes, the second is speed improve, specially with a big size data. Of course, the second depend of each JS engine, but it's important.
I think an implementation could implement this such that property read speed is unaffected, if it already has "hidden classes" and reasonable caching—whether it has a JIT or not.
For implementations without those well-understood optimizations, it seems an awkward position (at least) to optimize by blocking the evolution of the language...
Doesn't high-quality interoperability between JS and WebAssembly basically require something along these lines? One danger of TC39 not doing it is that the wasm folks will realize they need it, and design something from scratch—likely ignoring the opportunity to do something that's independently good for JS—and standardize it elsewhere.
I am not advocating against this change, but I am explaining my view as an implementor of a highly optimized JS interpreter. I don't really see how this language feature would help us improve performance, except perhaps when testing for equality. Of course, I might be missing something and we might (and probably will) come up with more optimizations in the future.
The goal of the Records and Tuples proposal isn't to speed up existing JS programs but to enable the intuitive, understandable expression of JS data structure manipulations, in a way that will meet the goals needed by JS programmers, including reasonably good performance for realistic applications.
I share @jorendorff 's impression of "hidden classes"/fast paths being pretty independent of JIT vs interpreter. With or without this proposal, it's possible to make a naive, slow JS implementation. This proposal, like all features, does add some implementation burden. I'd like to assess that burden to make sure it's not too much, by developing various implementations between Stage 2 and 4 of this proposal.
I think high-quality interaction between JS and Wasm would need something a bit different from what Records and Tuples provides. It would need to expose mutable data structures that permit TypedArray-like numeric primitives as well as references to JS objects (and primitives). I hope we can eventually provide these capabilities in JS, but they are outside of the scope of this proposal.
Hidden classes and IC have different performance tradeoffs in an interpreter compared to a JIT, because the interpreter cannot specialize for a different fast path in a specific instance (many interpreters run in constrained environments where patching bytecode is not possible). So, for example, adding a new type of object layout makes everything a little slower even if that new object layout is never used by a particular app. Consequently, to avoid that cost, interpreters (Hermes specifically) will likely use exactly the same memory layout for objects and records and for arrays and tuples. Optimizations assuming a different memory layout will probably not be feasible for us.
Anyway, someone asked about implementors view earlier. That's my 5c.
@tmikov I have a thought on performance but I could be way off-base. Would immutable data structures like records and tuples allow the JS runtime to allocate data onto the stack and thereby remove some need for garbage collection? I recently did a toy project with the Mandelbrot set and tried a naive implementation with complex numbers being objects or fixed length arrays and using pure functions to do complex math operations. To calculate the Mandelbrot set, you need to do a bunch of complex number calculations. I found that using pure functions and representing Complex numbers as objects generated a ton of garbage and increased calculation time by a factor of ~2 (in V8). With strict guarantees of data size, wouldn't V8 more easily be able to store those simple objects on the stack and remove them when they go out of scope (thereby eliminating the need for GC on those objects)?
That makes sense in my head but I could have the wrong mental model.
I would expect performance gains and fewer bugs e.g. in React when React would use Records for its props. A lot of rerenders could be skipped when react just compares the records by identity. This would eliminate the need for manual memoization in many cases in React projects too.
As @fabb, VDOM based libraries with a VDOM like React wouldn't have to rerender because of identity. If you want to read about that, there is a cool article done by Sebastian Lober on it.
After discussing with engines in the past few years, we are coming to the point where no performance expectations should be had about Record & Tuple: the main feature is guarantees and improved equality semantics, not faster execution.
The spec cannot guarantee faster than linear time creation and/or comparison of Record & Tuple. Therefore, for Stage 3, we will not introduce any performance expectation, either in the spec, either in the proposal writeup.