Graph Build

A distributed and cached Make-style build tool that uses the ninja-build file format.

What is Graph Build?

Graph Build is a build tool in the Make lineage. It is an experiment in handling caching and distributed building at the level of the build tool.

In the C/C++ compilation model, substituting the compiler is a popular approach to caching. ccache, clcache and sccache are based on this setup. These commands are compiler launchers: their argument is the compiler command itself, which they inspect, and they store the results indexed by a filtered version of the original command. If the cache already contains the results of a previous call, the launcher simply hands them out. An analogy is memoization of pure functions. The filtering occurs because certain compiler options do not change the output file or the compiler's messages at all, so the cache can safely discard these arguments and make the caching more forgiving in these cases.
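The launcher idea can be sketched in a few lines of Rust. This is only an illustration and not the code of ccache, clcache, sccache or Graph Build: the `IGNORED_FLAGS` list, the `Cache` type and the in-memory map are assumptions made for the example, and a real cache would use a content hash that is stable across toolchain versions instead of `DefaultHasher`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Flags assumed, for illustration only, not to affect the compiler output.
const IGNORED_FLAGS: &[&str] = &["-v", "-fdiagnostics-color=always"];

/// Drop the arguments that are assumed not to influence the result.
fn filter_args<'a>(args: &[&'a str]) -> Vec<&'a str> {
    args.iter()
        .copied()
        .filter(|a| !IGNORED_FLAGS.contains(a))
        .collect()
}

/// Hash the filtered command line together with the input contents.
fn cache_key(args: &[&str], input: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    filter_args(args).hash(&mut hasher);
    input.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical in-memory stand-in for an on-disk or remote cache.
struct Cache {
    entries: HashMap<u64, Vec<u8>>,
}

impl Cache {
    fn new() -> Self {
        Cache { entries: HashMap::new() }
    }

    /// Return the stored output on a hit, otherwise run `compile` and
    /// store its result. The boolean is `true` on a cache hit.
    fn get_or_compile<F>(&mut self, args: &[&str], input: &[u8], compile: F) -> (Vec<u8>, bool)
    where
        F: FnOnce() -> Vec<u8>,
    {
        let key = cache_key(args, input);
        if let Some(output) = self.entries.get(&key) {
            return (output.clone(), true);
        }
        let output = compile();
        self.entries.insert(key, output.clone());
        (output, false)
    }
}
```

Calling `get_or_compile` twice with the same input, once with `-v` and once without, runs the closure only the first time, because both command lines filter down to the same key.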

Separately, there are solutions for distributed compilation, but so far I have not encountered one that is satisfactory on all the major platforms and compilers.

Graph Build captures outputs and the corresponding logs at the level of the build tool. Make-style build tools are designed to reduce recompilation by executing commands only for modified inputs and for the partial results they subsequently invalidate. Compiler caching, distributed compilation and build tools seem like a natural match: compiler caches and incremental builds serve similar purposes.
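The invalidation rule described above can be sketched as a breadth-first walk over the dependency graph. This is a minimal illustration under assumptions, not Graph Build's implementation: the function name `targets_to_rebuild` and the "input to outputs" map are made up for the example, and detecting which inputs were modified (timestamps, hashes) is left out.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Given edges "input -> outputs built from it" and the list of modified
/// source files, return every target that has to be rebuilt.
fn targets_to_rebuild(
    dependents: &HashMap<&str, Vec<&str>>,
    modified: &[&str],
) -> HashSet<String> {
    let mut dirty = HashSet::new();
    let mut queue: VecDeque<&str> = modified.iter().copied().collect();
    while let Some(node) = queue.pop_front() {
        if let Some(outputs) = dependents.get(node) {
            for &out in outputs {
                // Enqueue each invalidated target only once.
                if dirty.insert(out.to_string()) {
                    queue.push_back(out);
                }
            }
        }
    }
    dirty
}
```

With the edges `a.c -> a.o`, `b.c -> b.o` and both object files feeding `app`, modifying `a.c` marks `a.o` and `app` for rebuilding and leaves `b.o` untouched.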

Make-style build tools can ease the edit-build-test development cycle for the individual developer: a developer often changes a single line in a single file, which might affect only a small part of the entire software, and so the tool can use the dependency graph to reduce the build time. However the state is kept, it is bound to the machine where the build happens, and it may contain absolute path references that are not easily transferable to other systems.

In contrast to this incremental benefit with brittle state, compiler caches match the compiler arguments and input files to cache entries, and allow for path substitutions, providing a solid, transferable solution for speeding up builds. As a drawback, they still have to perform a large number of file operations, which can be a considerable amount of work for the system, whereas incremental build tools do close to a no-op for untouched nodes.
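One way to picture the path substitution is to rewrite the machine-specific project root to a fixed placeholder before the command is used as a cache key. The sketch below is an assumption about how this could look, not how any particular cache does it: `normalize_args` and the `$ROOT` placeholder are hypothetical, and the plain substring replacement ignores details such as symlinks or path separators.

```rust
/// Replace the absolute project root with a fixed placeholder in every
/// argument, so that the same command yields the same key on any machine.
fn normalize_args(args: &[&str], project_root: &str) -> Vec<String> {
    // Hypothetical placeholder; any string that cannot occur in a real path works.
    const PLACEHOLDER: &str = "$ROOT";
    args.iter()
        .map(|arg| arg.replace(project_root, PLACEHOLDER))
        .collect()
}
```

Two checkouts in different directories then produce identical normalized commands, which is what makes the cache entries transferable between machines.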

Compiler caches and distributed compilation can be seen as closely related solutions, because distributed compilation requires considerable management of build artifacts. Incremental build tools generally do not get involved much in how build artifacts are produced or where they are placed; they are more about keeping track of these artifacts. In distributed compilation models the artifacts must be moved between build nodes and must therefore be managed by the tool directly.

Design considerations

Use of Rust

Build tools coordinate the building of large and complicated software and should therefore be as light on operational requirements as possible. This is the main reason for choosing a systems programming language. C, C++ and Rust are all good choices for this work, but for this experiment I chose Rust for the personal reason of wanting to learn it by using it.

Use of ninja-build format

ninja-build is a great project that I have personally benefited from a lot and found a great source of inspiration for this work. Ninja uses a language that is more purpose-driven than Make's and can therefore be worked with very efficiently. Also, the ninja-build format can be generated by a number of meta-build systems, including CMake, one of the most popular tools for building C/C++ applications and libraries.
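To give a feel for how regular the format is, here is a rough sketch that parses a single Ninja `build` statement of the form `build outputs: rule inputs | implicit || order-only`. It is not Graph Build's parser: the `BuildEdge` struct and `parse_build_line` are names invented for the example, and `$` escapes, variables, implicit outputs and multi-line statements are deliberately ignored.

```rust
/// One edge of the build graph, as read from a single `build` statement.
#[derive(Debug, PartialEq)]
struct BuildEdge {
    outputs: Vec<String>,
    rule: String,
    inputs: Vec<String>,
    implicit: Vec<String>,
    order_only: Vec<String>,
}

/// Parse `build <outputs>: <rule> <inputs> [| <implicit>] [|| <order-only>]`.
/// Returns `None` for anything that is not a simple build statement.
fn parse_build_line(line: &str) -> Option<BuildEdge> {
    let rest = line.trim().strip_prefix("build ")?;
    let (outs, rhs) = rest.split_once(':')?;
    let mut tokens = rhs.split_whitespace();
    let rule = tokens.next()?.to_string();
    let mut edge = BuildEdge {
        outputs: outs.split_whitespace().map(String::from).collect(),
        rule,
        inputs: Vec::new(),
        implicit: Vec::new(),
        order_only: Vec::new(),
    };
    if edge.outputs.is_empty() {
        return None;
    }
    // 0 = explicit inputs, 1 = implicit deps (after `|`), 2 = order-only (after `||`).
    let mut section = 0;
    for token in tokens {
        match token {
            "|" => section = 1,
            "||" => section = 2,
            path => match section {
                0 => edge.inputs.push(path.to_string()),
                1 => edge.implicit.push(path.to_string()),
                _ => edge.order_only.push(path.to_string()),
            },
        }
    }
    Some(edge)
}
```

For the line `build foo.o: cc foo.c | foo.h || gen`, the sketch yields the output `foo.o`, the rule `cc`, the explicit input `foo.c`, the implicit dependency `foo.h` and the order-only dependency `gen`.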