Ithemal is a data driven model for predicting throughput of a basic block of x86-64 instructions.
More details about Ithemal's approach can be found in our paper.
- Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks
Charith Mendis, Alex Renda, Saman Amarasinghe, Michael Carbin
Proceedings of the 36th International Conference on Machine Learning (ICML) 2019. Bibtex
You first need to install docker and docker-compose.
It is easiest to run Ithemal within the provided docker environment.
To build the docker environment, run docker/docker_build.sh
.
No user interaction is required during the build, despite the various prompts seemingly asking for input.
Once the docker environment is built, connect to it with docker/docker_connect.sh
.
This will drop you into a tmux shell in the container.
It will also start a Jupyter notebook exposed on port 8888 with the password ithemal
; nothing depends on this except for your own convenience, so feel free to disable exposing the notebook by removing the port forwarding on lines 37 and 38 in docker/docker-compose.yml
.
The file system in the container is mounted from the local file system, so changes to the file system on the host will propagate to the docker instance, and vice versa.
The container will continue running in the background, even if you exit.
The container can be stopped with docker/docker_stop.sh
from the host machine.
To detach from the container while keeping jobs running, use the normal tmux detach command of Control-b d
; running docker/docker_connect.sh
will drop you back into the same session.
Models can be downloaded from the Ithemal models repository. Models are split into two parts: the model architecture and the model data (this is an unfortunate historical artifact more than a good design decision). The model architecture contains the code and the token embedding map for the model, and the model data contains the learned model tensors. The versions of the model reported in the paper are:
- The Haswell model
- The Skylake model
- The Ivy Bridge model
Ithemal can be used as a drop-in replacement for the throughput prediction capabilities of IACA
or llvm-mca
via the learning/pytorch/ithemal/predict.py
script.
Ithemal uses the same convention as IACA
to denote what code should be predicted; this can be achieved with any of the files in learning/pytorch/examples
, or by consulting the IACA documentation.
Once you have downloaded one of the models from the previous section, and you have compiled a piece of code to some file, you can generate prediction for this code with something like:
python learning/pytorch/ithemal/predict.py --verbose --model predictor.dump --model-data trained.mdl --file a.out
Raw data is represented as a list of tuples containing (code_id, timing, code_intel, code_xml)
, where code_id
is a unique identifier for the code, timing
is the float representing the timing of the block, code_intel
is the newline-separated human-readable block string (tis is only for debugging and can be empty or None
), and code_xml
is the result of the tokenizer on that block (detailed in the Canonicalization section).
To store datasets, we torch.save
and torch.load
this list of tuples.
The first 80% of a dataset is loaded as the train set, and the last 20% is the test set.
For an example of one of these datasets, look at a small sample of our training data.
To canonicalize a basic block so that it can be used as input for Ithemal, first get the hex representation of the basic block you want to predict (i.e. via xxd
, objdump
, or equivalent).
For instance, the instruction push 0x000000e3
is represented in hex as 68e3000000
.
Next, run the tokenizer as follows:
data_collection/build/bin/tokenizer {HEX_CODE} --token
which will output an XML representation of the basic block, with all implicit operands enumerated.
To train a model, pick a suitable EXPERIMENT_NAME
and EXPERIMENT_TIME
, and run the following command:
python learning/pytorch/ithemal/run_ithemal.py --data {DATA_FILE} --use-rnn train --experiment-name {EXPERIMENT_NAME} --experiment-time {EXPERIMENT_TIME} --sgd --threads 4 --trainers 6 --weird-lr --decay-lr --epochs 100
which will train a model with the parameters reported in the paper.
The results of this are saved into learning/pytorch/saved/EXPERIMENT_NAME/EXPERIMENT_TIME/
, which we will refer to as RESULT_DIR
from now on.
The training loss is printed live as the model is trained, and also saved into RESULT_DIR/loss_report.log
, which is a tab-separated list of epoch, elapsed time, training loss, number of active parallel trainers
.
The results of the trained model on the test set are stored in RESULT_DIR/validation_results.txt
, which consists of a list of of the predicted,actual
value of each item in the test set, followed by the overall loss of the trained model on the test set at the end.
Finally, the resulting trained models and predictor dumps (for use in the command line API above) are saved in RESULT_DIR/trained.mdl
and RESULT_DIR/predictor.dump
respectively.
- The command-line and training infrastructure is in
learning/pytorch/ithemal/
. - The models and optimizer are in
learning/pytorch/models/graph_models.py
andlearning/pytorch/models/train.py
- The data infrastructure is primarily in
learning/pytorch/data/data_cost.py
- The representation of instructions and basic blocks are in
common/common_libs/utilities.py
- The canonicalization and tokenization are in
data_collection/tokenizer/tokenizer.c
anddata_collection/common/
This repo contains a preliminary version of our data collection infrastructure; we plan on releasing a more detailed description and infrastructure soon.