Speculative optimization in tracing JIT compilers
lukego opened this issue · 0 comments
We often think of code in static languages like C/C++ as being compiled into more specialized machine code than code in dynamic languages like Lua. This makes intuitive sense because source code for static languages contains more specific information than dynamic language source code.
However, RaptorJIT (and the whole LuaJIT family) actually generates more specialized machine code than C/C++ compilers. How can this be?
The reason is that RaptorJIT infers how code works by running it instead of by analyzing its source code (see also #24.) The abstractions of dynamic languages cease to exist at runtime: they are all resolved as a natural consequence of running the code. Each variable gets a value of some specific type, each call enters some specific definition, each object has some concrete type, each branch is either taken or not taken, and each value has specific characteristics (e.g. a particular hashtable has N slots.) This is the information that RaptorJIT uses to generate optimized code.
(RaptorJIT would consider type declarations in the source code to be redundant: why tell me things that I am going to see for myself anyway?)
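To make this concrete, here is a small sketch in plain Lua. It is not how RaptorJIT records anything internally; the `record` helper and the `observed` table are hypothetical names invented for illustration. The point is only that source code which says nothing about types still sees perfectly concrete types once it runs.

```lua
-- Generic source: nothing here says what `a` and `b` are.
local function add(a, b)
  return a + b
end

-- Hypothetical helper: note the concrete types seen on each call.
-- (A real tracing JIT gathers this kind of information itself while
-- the code runs; this wrapper is only an illustration.)
local observed = {}
local function record(a, b)
  observed[#observed + 1] = type(a) .. "+" .. type(b)
  return add(a, b)
end

for i = 1, 3 do
  record(i, 1.5)
end

-- Every call in this run saw the same concrete operand types.
print(table.concat(observed, ", "))
```

In this run every call observes `number+number`, even though `add` would accept anything that supports `+`. That gap between what the source permits and what actually happens is the information the JIT exploits.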
So the JIT is able to generate extremely specialized machine code using the details inferred from running the code, more specialized even than what a C/C++ compiler would generate. Whether it should is another question. The information inferred by running the code tells us exactly how that code executed one time, but it does not guarantee that it will always run that way in the future. Optimizations based on this information are therefore speculative: the optimizer predicts that the program will continue to run the same way it did when it was optimized. If these predictions usually come true then the program will run fast, but if they don't then it will run slowly.
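One way to picture a speculative optimization is as a fast path protected by a check. The sketch below is a Lua-level analogy only: `specialized_add`, `generic_add`, `fallbacks` and the `Vec` type are all hypothetical names, and a real JIT does this in generated machine code rather than in Lua source. It assumes the code was "optimized" after only ever seeing numbers.

```lua
-- Generic version: works for any operands that support `+`.
local function generic_add(a, b)
  return a + b
end

-- Hypothetical "specialized" version: it speculates that both
-- operands are numbers, because that is what was seen earlier.
local fallbacks = 0
local function specialized_add(a, b)
  -- Check the prediction before trusting the fast path.
  if type(a) == "number" and type(b) == "number" then
    return a + b -- prediction came true: cheap path
  end
  -- Prediction failed: count it and fall back to the general code.
  fallbacks = fallbacks + 1
  return generic_add(a, b)
end

-- A table type with an __add metamethod, so `+` still works
-- when the prediction fails.
local Vec = {}
Vec.__add = function(x, y)
  return setmetatable({ v = x.v + y.v }, Vec)
end
local function vec(v)
  return setmetatable({ v = v }, Vec)
end

print(specialized_add(1, 2))             -- prediction holds
print(specialized_add(vec(1), vec(2)).v) -- prediction fails, still correct
print(fallbacks)                         -- number of failed predictions
```

The result is correct either way. What changes is the cost: calls that match the prediction stay on the cheap path, and calls that break it pay for the check plus the general code.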
How much of this speculative optimization do we really want to do? The RaptorJIT answer is "a hell of a lot." Our goal is to write high-level Lua code without any special annotations and to have performance competitive with C. It follows that the compiler has to generate machine code that is aggressively specialized based on the information available. It also follows that we need to understand the compiler well enough to write programs that hit its sweet spots by making speculative predictions come true.
Hence this blog series!