/nautilus

Nautilus is a lightweight tracing JIT compiler for C++

Primary LanguageC++MIT LicenseMIT

Nautilus: A tracing jit compiler for C++

Build Nautilus

Nautilus is a lightweight and adaptable just-in-time (JIT) compiler for C++ projects. It offers:

  1. A high-level code generation API that accommodates C++ control flows.
  2. A tracing JIT compiler that produces a lightweight intermediate representation (IR) from imperative code fragments.
  3. Multiple code-generation backends, allowing users to balance compilation latency and code quality at runtime (see benchmarks).

Nautilus is used for the query compiler of NebulaStream, a data management system from the DIMA group at TU Berlin. Learn more about Nebula Stream at https://www.nebula.stream

Example

The example below demonstrates Nautilus with a simplified aggregation operator, ConditionalSum. This function aggregates integer values based on a boolean mask. Nautilus introduce val<> objects to capture all executed operations in an intermediate representation during tracing. Depending on the execution context, it can utilize a bytecode interpreter or generate efficient MLIR or C++ code. This enables Nautilus to trade of performance characteristics and to optimize the generated code towards the target hardware.

val<int32_t> conditionalSum(val<int32_t> size, val<bool*> mask, val<int32_t*> array) {
	val<int32_t> sum = 0;
	for (val<int32_t> i = 0; i < size; i++) {
		// check mask
		if (mask[i]) {
			// load value from array at position i
			val<int32_t> value = array[i];
			// add value to sum
			sum += value;
		}
	}
	return sum;
}

int main(int, char*[]) {
	engine::Options options;
	options.setOption("engine.backend", "cpp");
	// options.setOption("engine.Compilation", false);
	auto engine = engine::NautilusEngine(options);
	auto function = engine.registerFunction(conditionalSum);
	auto mask = new bool[4] {true, true, false, true};
	auto array = new int32_t[4] {1, 2, 3, 4};
	auto result = function(4, mask, array);
	std::cout << "Result: " << result << std::endl;
	return 0;
}

Build:

To build Nautilus from source execute use cmake:

mkdir build
cd build
cmake ..
cmake --build . --target nautilus

Components:

The codebase is structured in the following components:

Component Description
include Contains the public api of Nautilus, e.g., val objects.
tracing Hosts core functionality for tracing generic C++ code.
compiler Implements the Nautilus compiler, including its IR, optimization passes, and various generation backends.

Publication:

This paper discusses Nautilus's architecture and its usage in the NebulaStream query compiler. Note that it references an earlier version of the code-generation API, which has changed.

@article{10.1145/3654968,
    author = {Grulich, Philipp M. and Lepping, Aljoscha P. and Nugroho, Dwi P. A. and Pandey, Varun and Del Monte, Bonaventura and Zeuch, Steffen and Markl, Volker},
    title = {Query Compilation Without Regrets},
    year = {2024},
    issue_date = {June 2024},
    volume = {2},
    number = {3},
    url = {https://doi.org/10.1145/3654968},
    doi = {10.1145/3654968},
    journal = {Proc. ACM Manag. Data},
    articleno = {165},
    numpages = {28},
}

Related Work:

The following work is related to Nautilus and influenced our design decisions.

  • Tidy Tuples and Flying Start: This paper describes the low-latency query compilation approach of Umbra. This work was one of the main motivations for the creation of the Nautilus project and its use in NebulaStream.

  • Flounder: Flounder is simple low latency jit compiler that based on AsmJit, which is designed for query compilation.

  • Build-It: BuildIt is a framework for developing Domain Specific Languages in C++. It pioneered the capability of extracting control-flow information form imperative C++ code.

  • GraalVM: The GraalVM project provides a framework to implement AST interpreters that can be turned into high-performance code through partial evaluation.

  • MLIR: The MLIR project provides a novel approach to building reusable and extensible compiler infrastructure. Nautilus leverages it as a foundation for its high-performance compilation backend.

  • MIR: The MIR projects provides a lightweight jit compiler that targets low compilation latency. Nautilus leverages MIR as a low latency compilation backend.