
A Clang-based C/C++ fact extractor. Generates a tuple-attribute file from corresponding source code facts.

Primary language: C++. License: GNU General Public License v3.0 (GPL-3.0).

ClangEx - A Fast C/C++ Fact Extractor

This readme contains information about ClangEx, as well as setup and usage details.

What is ClangEx?

ClangEx is a C/C++ fact extractor that uses Clang libraries to extract details from C or C++ source code to a lightweight but powerful program model. This program model can be visualized or queried using relational algebra to discover problems in source code on an ad-hoc basis.

The model generated by ClangEx is encoded in the Tuple-Attribute (TA) format. This format was developed by Ric Holt at the University of Waterloo and is discussed in this paper. The TA format contains entities, their relationships, and attributes that describe these entities and relationships.
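To make the three ingredients of a TA file (entities, relationships, and attributes) more concrete, here is a small hand-written sketch. The entity and relation names (cFunction, call), the attributes, and the count_ta_facts helper are illustrative assumptions and are not taken from ClangEx's actual metamodel or output; only the general TA layout is being shown.

```shell
# Write a tiny, hand-written TA file to a temporary location.
# NOTE: cFunction, call, label, and weight are hypothetical names for illustration.
ta_file="${TMPDIR:-/tmp}/example.ta"
cat > "$ta_file" <<'EOF'
FACT TUPLE :
$INSTANCE main cFunction
$INSTANCE helper cFunction
call main helper

FACT ATTRIBUTE :
main { label = "main" }
helper { label = "helper" }
(call main helper) { weight = 1 }
EOF

# Count entities ($INSTANCE lines) and relationship tuples in the FACT TUPLE section.
count_ta_facts() {
    awk '
        /^FACT ATTRIBUTE/ { in_tuples = 0 }
        /^FACT TUPLE/     { in_tuples = 1; next }
        in_tuples && /^\$INSTANCE/ { entities++; next }
        in_tuples && NF == 3       { relations++ }
        END { printf "entities=%d relations=%d\n", entities, relations }
    ' "$1"
}

count_ta_facts "$ta_file"
```

In this sketch, each $INSTANCE line declares an entity, every other three-word line in the tuple section relates two entities, and the attribute section attaches key-value pairs to either kind of fact.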

While ClangEx is developed to extract facts to a program model, users must download additional tools from the Software Architecture Group (SWAG) at the University of Waterloo to visualize or query their programs. This guide will cover how these tools are configured in the "setup" section.

Features of ClangEx

Currently, ClangEx supports the following:

  • Support for both C and C++, including C++11 and C++14.

  • A more detailed program model compared to other extractors due to access to Clang's Abstract Syntax Tree (AST).

  • The ability to generate a model for selected source files. This means that not all source files in a program have to be processed at once. This is achieved through ClangEx's manual ID linking system.

  • A fluid metamodel. Users can choose to exclude certain portions of ClangEx's metamodel to generate TA files that suit them. For instance, TA models can be generated that don't include variables or classes.

ClangEx Metamodel

The following diagram highlights the information ClangEx extracts from a target C/C++ program:

[Diagram: ClangEx metamodel]

Installation Details

Prerequisites

ClangEx is based on Clang 5.0.0 and requires CMake 3.0.0 or greater to build. The Boost libraries, version 1.6.0 or greater, are also required on the computer building ClangEx. If you already meet any of these prerequisites, feel free to skip the associated section below.

Installing CMake

First, CMake should be installed. On Linux, this is as simple as running:

$ sudo apt-get install cmake

This will install the most current version of CMake available from your sources list. Be sure to check that the installed version is 3.0.0 or greater. This can be done by running:

$ cmake --version
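If you would rather not compare version numbers by eye, the check can be scripted. The following is a minimal sketch, assuming GNU coreutils (for sort -V) and that the first line of the cmake --version output has the form "cmake version X.Y.Z". The version_ge helper is a name introduced here for illustration.

```shell
# Returns success (0) when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Pull the version number out of the first line of `cmake --version`.
if command -v cmake >/dev/null 2>&1; then
    installed=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$installed" "3.0.0"; then
        echo "CMake $installed is new enough"
    else
        echo "CMake $installed is too old; 3.0.0 or greater is required"
    fi
else
    echo "CMake is not installed"
fi
```

Version sort is used instead of a plain string comparison so that, for example, 3.10.0 is correctly treated as newer than 3.9.0.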

If you want the most current version of CMake, a bit of additional work is required. Download the Linux source distribution of the most recent release from the CMake website. The commands below show how CMake is installed from the 3.7.0 Linux source version. Change the version label in the download link to download a different version.

First, we download and unzip the source files:

$ wget https://cmake.org/files/v3.7/cmake-3.7.0.tar.gz
$ tar xvzf cmake-3.7.0.tar.gz
$ cd cmake-3.7.0

Next, install CMake by configuring the Makefile and running it. This can be done by doing the following:

$ ./configure
$ make
$ make install

That's it! You can check if CMake is correctly installed by running:

$ cmake --version

Installing Clang

The best way to install Clang on your system is to download the source code and compile it yourself. Clang 4.0.0 or greater is required. This guide will cover how to install Clang from source.

First, Clang must be downloaded from the Clang and LLVM website. To do this, simply run the following:

$ svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
$ cd llvm/tools
$ svn co http://llvm.org/svn/llvm-project/cfe/trunk clang
$ cd clang/tools
$ svn co http://llvm.org/svn/llvm-project/clang-tools-extra/trunk extra
$ cd ../../../..

Next, we need to build LLVM and Clang. Sit back and make a coffee during this process as this can take up to several hours depending on your computer. The following steps will build Clang to a directory called Clang-Build adjacent to the Clang install directory. To change this, simply replace the Clang-Build directory with any other directory and location you choose.

There are two ways to build LLVM and Clang.

1) Build using make:

This is the standard way of building. To do this, run the following:

$ mkdir Clang-Build
$ cd Clang-Build
$ cmake -G "Unix Makefiles" ../llvm
$ make
$ make install 

2) Build using ninja:

Ninja is a lightweight build tool that promises to be faster than make and other build tools. To do this, run the following:

$ mkdir Clang-Build
$ cd Clang-Build
$ cmake -GNinja ../llvm
$ ninja
$ ninja install 

That's it! With this, Clang and LLVM will be installed. To ensure proper installation, run the following to check:

$ clang --version

Obtaining Boost Libraries

Boost libraries are also required. This process is very simple on Debian/Ubuntu systems. You need Boost version 1.6 to proceed.

Simply run the following command to download and install Boost libraries to your system's include path:

$ sudo apt-get install libboost-all-dev

IMPORTANT NOTE: Boost libraries are also needed on your system even if you are simply running the executable built on another system. Follow the instructions above to get the necessary Boost libraries to run the portable executable.
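To confirm which Boost version ended up on your system, you can read it from the Boost version header. This is a minimal sketch, assuming the headers were installed under /usr/include/boost (the usual location for the Debian/Ubuntu package, but an assumption here). The boost_version helper is a name introduced for illustration.

```shell
# Print the Boost version recorded in a version.hpp header, e.g. "1_65_1" -> "1.65.1".
# $1 is the header path; /usr/include/boost/version.hpp is an assumed default.
boost_version() {
    header="${1:-/usr/include/boost/version.hpp}"
    [ -f "$header" ] || return 1
    sed -n 's/^#define BOOST_LIB_VERSION "\(.*\)"$/\1/p' "$header" | tr '_' '.'
}

boost_version || echo "Boost headers not found in the default location"
```

If your headers live somewhere else, pass the full path to version.hpp as the first argument.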

Building ClangEx

Now that the prerequisites are all satisfied, you can download and build ClangEx! If all prerequisites are truly satisfied, ClangEx should build without issue.

First, we must clone the current version of ClangEx from GitHub. This will be downloaded to your current working directory. The ClangEx repository has all required files and libraries.

To download, run the following:

$ git clone https://github.com/bmuscede/ClangEx.git

Next, we want to build the source code. This process may take several minutes due to the size of the Clang libraries. This guide will build ClangEx to the ClangEx-Build directory that is adjacent to the ClangEx repository. If you want to build to a different directory, replace ClangEx-Build in the following commands with the directory of your choice.

Before ClangEx can be built, two separate environment variables must be set: LLVM_PATH and CLANG_VER. The LLVM_PATH variable tells ClangEx where LLVM and Clang were built. The CLANG_VER variable is the version of Clang installed. To set these variables, open the .bashrc file located in your home directory and add the following lines to the bottom of the file:

export LLVM_PATH=<PATH_TO_CLANG-BUILD>
export CLANG_VER=<VERSION_OF_CLANG>

Restart the terminal to ensure these variables have been exported.
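Before running the build, it can help to confirm that both variables are actually visible in the new terminal. The following is a minimal sketch. The check_clangex_env helper is a name introduced here for illustration and is not part of ClangEx; it only checks that LLVM_PATH names an existing directory and that CLANG_VER is non-empty.

```shell
# Report whether the variables the ClangEx build expects are set and usable.
check_clangex_env() {
    rc=0
    if [ -z "${LLVM_PATH:-}" ]; then
        echo "LLVM_PATH is not set"
        rc=1
    elif [ ! -d "$LLVM_PATH" ]; then
        echo "LLVM_PATH does not point to a directory: $LLVM_PATH"
        rc=1
    fi
    if [ -z "${CLANG_VER:-}" ]; then
        echo "CLANG_VER is not set"
        rc=1
    fi
    return "$rc"
}

if check_clangex_env; then
    echo "Environment looks ready for the ClangEx build"
fi
```

If either message appears, re-check the lines added to .bashrc and open a fresh terminal before continuing.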

To build, run the following command:

$ mkdir ClangEx-Build
$ cd ClangEx-Build
$ cmake -G "Unix Makefiles" ../ClangEx
$ make

To verify that ClangEx built, ensure the ClangEx-Build directory contains an include subdirectory and that the ClangEx executable exists. Additionally, run the following to check that it runs:

$ ./ClangEx --help

You should see a help message with all available commands.

Installing Additional Analysis Tools

There are two specific tools that are required to perform analysis on TA program models generated by ClangEx. Both of these tools allow for querying and visualizing ClangEx Models. This guide will specify how to install these programs.

First, download the SWAGKit tarball from the University of Waterloo's SWAG website and unzip it to the directory of your choice:

$ wget http://www.swag.uwaterloo.ca/swagkit/distro/swagkit_linux86_bin_v3.03b.tar.gz
$ tar -xvzf swagkit_linux86_bin_v3.03b.tar.gz

Next, we want to add SWAGKit's binary directory to your PATH environment variable. To do this, run the following commands. Note: Replace the path to SWAGKit shown in the following commands with the path to SWAGKit on your system!

$ echo "#SWAGKit Environment Variables:" >> ~/.bashrc
$ echo "export SWAGKIT_PATH=<REPLACE_WITH_SWAGKIT_PATH>" >> ~/.bashrc
$ echo 'export PATH=$SWAGKIT_PATH/bin:$PATH' >> ~/.bashrc

That's it! To test that this worked, run the following. The bash command starts a new shell that loads the updated environment variables, and grok and lsedit are two widely used analysis programs.

$ bash
$ grok
$ lsedit

If both Grok and LSEdit started successfully, SWAGKit was configured on your computer. You are now able to run ClangEx and analyze program models!
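If either program fails to start, a quick first check is whether it can be found on your PATH at all. This is a minimal sketch. The missing_tools helper is a name introduced here for illustration, and it only tests that the commands resolve, not that they run correctly.

```shell
# Print any of the named commands that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools grok lsedit)
if [ -z "$missing" ]; then
    echo "grok and lsedit are both on PATH"
else
    echo "Not found on PATH:" $missing
fi
```

A tool reported as missing usually means SWAGKIT_PATH points to the wrong directory or the new lines in .bashrc have not been loaded yet.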

Running ClangEx

ClangEx usage instructions can be found in the Appendix of my thesis located here.