A toy compiler written in C++17 that translates SysY (a C-like toy language) into ARM-v7a assembly.

TrivialCompiler

TrivialCompiler is a compiler written in C++17 that translates SysY (a C-like toy language) into ARM-v7a assembly.

License

Copyright (C) 2020 Chenhao Li, Jiajie Chen, Shengqi Chen

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

Architecture

[Figure: Architecture of TrivialCompiler]

Erratum: the .bc label in the picture should read .ll.

Compiling

mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug # use Release for better performance, Debug to enable ASan
make -j
./TrivialCompiler -h # show usage

Usage

./TrivialCompiler [-l ir_file] [-S] [-p] [-d] [-o output_file] [-O level] input_file

Options:

  • -p: print the names of all passes to run and exit
  • -d: enable debug mode (WARNING: produces an excessive amount of output)
  • -O: set the optimization level to level (currently has no effect on behaviour)
  • -l: dump LLVM IR (text format) to ir_file and exit (by running frontend only)
  • -o: write assembly to output_file

You must specify either -l or -o, or nothing will actually happen.

Refer to CMakeLists.txt for how to convert the LLVM IR or assembly into an executable for ARM-v7a, using llc or gcc for assembling and linking.
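As a rough sketch of the two output modes described above, the commands can be wrapped in two small shell functions. The function names, the TC override variable, and the file names in the usage note are illustrative assumptions, not part of the project.

```shell
# TC is a hypothetical override for the compiler path; it defaults to the
# binary produced in the build directory.

# Run the frontend only and dump textual LLVM IR (-l).
emit_ir() {   # usage: emit_ir input_file ir_file
  "${TC:-./TrivialCompiler}" -l "$2" "$1"
}

# Run the whole compiler and write ARM-v7a assembly (-o).
emit_asm() {  # usage: emit_asm input_file output_file
  "${TC:-./TrivialCompiler}" -o "$2" "$1"
}
```

For example, `emit_asm test.sy test.s` would write the assembly for a (hypothetical) source file test.sy to test.s. Turning that assembly into an executable is a separate step, covered by CMakeLists.txt.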

Testing

We use ctest to automatically test TrivialCompiler against several modern compilers. To run the tests, install the following additional packages and their dependencies:

  • llvm (to test IR output)
  • g++-arm-linux-gnueabihf
  • qemu-user (if not running on ARM-v7a architecture)

Several sets of test cases are provided, each controlled by a configurable CMake flag:

  • FUNC_TEST (default ON): function test cases provided by the contest committee
  • PERF_TEST (default ON): performance test cases provided by the contest committee
  • CUSTOM_TEST (default ON): test cases written by the authors

Further flags control whether other compilers are used for comparison with TrivialCompiler:

  • GCC (default OFF): compile the test cases with GCC (-Ofast) for comparison
  • CLANG (default OFF): compile the test cases with Clang (-Ofast) for comparison; requires clang to be installed

After configuring CMake, use ctest under your build directory to run all tests.
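The flags above are passed to CMake as -D options. The helper below is a minimal sketch of one possible configuration; the function name, the DRY_RUN switch, and the particular ON/OFF choices are assumptions made for illustration.

```shell
# Hypothetical helper; call it from inside the build directory.
# Set DRY_RUN=1 to print the commands instead of executing them.
configure_and_test() {
  run=${DRY_RUN:+echo}
  # Skip the performance tests and additionally compare against GCC.
  $run cmake .. -DFUNC_TEST=ON -DPERF_TEST=OFF -DCUSTOM_TEST=ON -DGCC=ON -DCLANG=OFF &&
  # Rebuild so that the new configuration takes effect.
  $run make -j &&
  # --output-on-failure prints the output of failing tests to the terminal.
  $run ctest --output-on-failure
}
```

Running it once with `DRY_RUN=1` shows the three commands without touching the build directory.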

The results, including stdout and stderr, are written to build/Testing/Temporary/LastTest.log. You can use utils/extract_result.py to analyze the results and write them to a JSON file.
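For a quick pass/fail count without the Python script, a sketch along the following lines may be enough. It is not a replacement for utils/extract_result.py, the function name is hypothetical, and it assumes the log marks each test with a "Test Passed." or "Test Failed." line, as CTest logs typically do.

```shell
# Count passed and failed tests in a CTest log (assumed line markers, see above).
summarize_log() {  # usage: summarize_log build/Testing/Temporary/LastTest.log
  # grep -c exits non-zero when the count is 0, so "|| true" keeps the
  # function usable under "set -e"; the count itself is still printed.
  passed=$(grep -c '^Test Passed\.' "$1" || true)
  failed=$(grep -c '^Test Failed\.' "$1" || true)
  echo "passed=$passed failed=$failed"
}
```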

Parser Generation

The parser for the standard SysY language is located at srv/conv/parser.{cpp,hpp}. These files are generated from parser.toml by lalr1, a parser generator developed by @MashPlant.

If you want to regenerate the parser from a modified version of parser.toml, first install parser_gen (requires a nightly Rust toolchain):

cargo install --git https://github.com/MashPlant/lalr1 --features="clap toml"

Then invoke utils/gen_parser.py to generate and split the output. The script also slightly modifies the generated code so that it better fits our usage.

Please note that there is a minor problem with the generated lexer: the output tables may differ from run to run (the lexer still behaves consistently, of course). So if you want to contribute to the project and have regenerated srv/conv/parser.{cpp,hpp} without actually modifying the lexer, it would be helpful to change the lexer tables back before committing, so that the commit log stays clearer.