Google's EXEgesis project aims to improve code generation in compilers, via:
- Providing machine-readable lists of instructions for hardware vendors and microarchitectures.
- Inferring latencies and µOps scheduling for each instruction/microarchitecture pair.
- Providing tools for debugging the performance of code based on this data.
For a high-level overview of our efforts, see the slides for a tech talk about EXEgesis (July 2017).
This repository provides a set of tools for extracting data about instructions and latencies from canonical sources and converting them into machine-readable form. Some require parsing PDF files; others are more straightforward.
When latencies and µOps scheduling are not available in the documentation, we automatically generate benchmarks to measure them.
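One common way to measure an instruction's latency is to time a chain of copies in which each copy depends on the result of the previous one, then divide the elapsed cycles by the chain length. The Python sketch below illustrates that general idea only; it is not EXEgesis's actual benchmark generator, and the function names (`dependency_chain`, `estimate_latency`) and the cycle counts are hypothetical.

```python
def dependency_chain(instruction: str, length: int) -> str:
    """Return assembly text that repeats one instruction `length` times.

    When the instruction reads and writes the same register, each copy
    must wait for the previous one, so run time is dominated by latency.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return "\n".join([instruction] * length)


def estimate_latency(total_cycles: float, overhead_cycles: float, length: int) -> float:
    """Estimate per-instruction latency from a measured dependency chain."""
    if length <= 0:
        raise ValueError("length must be positive")
    return (total_cycles - overhead_cycles) / length


# `adc al, 1` reads and writes AL (and the carry flag), so copies form a chain.
chain = dependency_chain("adc al, 1", 4)
print(chain)

# Hypothetical measurement: 2050 cycles for 1000 chained instructions,
# of which 50 cycles are assumed to be fixed measurement overhead.
print(estimate_latency(total_cycles=2050, overhead_cycles=50, length=1000))
```

Subtracting a separately measured overhead and using a long chain both reduce the influence of fixed costs on the estimate.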
The output data is available in the form of a Protocol Buffer message.
It includes:
- A textual description, e.g. `Add with carry imm8 to AL.`
- The raw encoding, e.g. `14 ib`, and the equivalent LLVM mnemonic, e.g. `ADC8i8`.
- Per-microarchitecture instruction latencies, e.g. `min_latency: 2, max_latency: 2`.
- Per-microarchitecture instruction schedulings, e.g. `Port 0 or 1 or 5 or 6`.
  - This identifies the execution units on which the µOps can be scheduled.
  - For example, the Intel Haswell microarchitecture has 8 execution ports (0 through 7), and the `Add with carry imm8 to AL` instruction above can execute on ports 0, 1, 5 or 6.
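To make the fields listed above concrete, here is a minimal Python sketch of one such record. The class and field names (`InstructionInfo`, `parse_ports`, and so on) are hypothetical and do not reflect the actual EXEgesis Protocol Buffer schema; only the example values are taken from the list above.

```python
import re
from dataclasses import dataclass


def parse_ports(text: str) -> frozenset:
    """Parse a scheduling string such as 'Port 0 or 1 or 5 or 6' into port numbers."""
    return frozenset(int(n) for n in re.findall(r"\d+", text))


@dataclass(frozen=True)
class InstructionInfo:
    # Hypothetical field names, chosen for illustration only.
    description: str
    raw_encoding: str
    llvm_mnemonic: str
    min_latency: int
    max_latency: int
    ports: frozenset

    def can_execute_on(self, port: int) -> bool:
        """Return True if a µOp of this instruction can be scheduled on `port`."""
        return port in self.ports


adc_al_imm8 = InstructionInfo(
    description="Add with carry imm8 to AL.",
    raw_encoding="14 ib",
    llvm_mnemonic="ADC8i8",
    min_latency=2,
    max_latency=2,
    ports=parse_ports("Port 0 or 1 or 5 or 6"),
)

print(sorted(adc_al_imm8.ports))
print(adc_al_imm8.can_execute_on(2))
```

A consumer such as a scheduler model could query `can_execute_on` to decide which ports are candidates for a given instruction.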
Status by architecture:
- Intel x86-64 - done
- IBM POWER - underway
- ARM Cortex - underway
- Issue tracker: https://github.com/google/EXEgesis/issues
- Mailing list: exegesis-discuss@googlegroups.com
We welcome patches -- see CONTRIBUTING for more information on how to submit a patch.