/hyrise

Hyrise is a research in-memory database.

Primary language: C++. License: MIT.

Welcome to Hyrise

This is the repository for the current Hyrise version, which has been rewritten from scratch. The new code base is easier to set up, to understand, and to contribute to. Not all features of the old version are supported yet, but we are working on that.

Papers that were published before October 2017 were based on the previous version of Hyrise, which can be found here.

Supported Benchmarks

We support a number of benchmarks out of the box. This makes it easy to generate performance numbers without having to set up data generation, load CSVs, or find a query runner. You can run them using the ./hyriseBenchmark* binaries.

Benchmark     Notes
TPC-C         In development, no proper optimization done yet
TPC-DS        Query Plans
TPC-H         Query Plans
Join Order
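To see which benchmark binaries a build produced, a small helper along the following lines can help. This is only a sketch: list_benchmarks is a hypothetical name, and the only assumption taken from the text above is that the binaries match the ./hyriseBenchmark* pattern. It deliberately passes no command-line flags, since those are not described here.

```shell
# Hypothetical helper: print every executable in directory $1 whose name
# matches hyriseBenchmark*, one per line.
list_benchmarks() {
  for candidate in "$1"/hyriseBenchmark*; do
    # Skip the unexpanded glob pattern and files that are not executable.
    if [ -f "$candidate" ] && [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
    fi
  done
}
```

Calling list_benchmarks . from the build directory prints the available benchmark binaries, or nothing if none have been built yet.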

Getting started

Have a look at our contributor guidelines.

You can find definitions of most of the terms and abbreviations used in the code in the glossary. If you cannot find something that you are looking for, feel free to open an issue.

The Step by Step Guide is a good starting point to get to know Hyrise.

Native Setup

You can install the dependencies on your own or use the install.sh script (recommended), which installs all of the dependencies and submodules listed in it. The install script was tested under macOS Catalina (10.15) and Ubuntu 19.10 (apt-get).

See dependencies for a detailed list of dependencies to use with brew install or apt-get install, depending on your platform. As compilers, we generally use the most recent versions of clang and gcc (gcc on Linux only). Please make sure that the system compiler points to the most recent version, or tell cmake which compiler to use (see below). Older versions may work, but are neither tested nor supported.

Setup using Docker

To get all dependencies of Hyrise in a docker image, run

docker-compose build

You can start the container via

docker-compose run --rm hyrise

Inside the container, run ./install.sh to download the required submodules.

Building and Tooling

It is highly recommended to perform out-of-source builds, i.e., to create a separate directory for the build. Advisable names for this directory would be cmake-build-{debug,release}, depending on the build type. Within this directory, call cmake .. to configure the build. Subsequent calls to CMake, e.g., when adding files to the build, are not necessary; the generated Makefiles take care of that.
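The naming scheme can be sketched as two small shell functions. Both function names are hypothetical, and -DCMAKE_BUILD_TYPE is the standard CMake variable for selecting the build type. The second function only creates the directory and prints the configure command instead of running it, so nothing is configured by accident.

```shell
# Hypothetical helper: map a CMake build type to the suggested directory
# name, e.g. Debug -> cmake-build-debug.
build_dir_for() {
  printf 'cmake-build-%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

# Hypothetical helper: create the build directory for build type $1 below
# the current directory and print the configure command to run in it.
prepare_build_dir() {
  mkdir -p "$(build_dir_for "$1")"
  printf 'cd %s && cmake -DCMAKE_BUILD_TYPE=%s ..\n' "$(build_dir_for "$1")" "$1"
}
```

Run from the project root, prepare_build_dir Release creates cmake-build-release and prints cd cmake-build-release && cmake -DCMAKE_BUILD_TYPE=Release .. for you to run.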

Compiler choice

CMake will default to your system's default compiler. To use a different one, call cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ .. in a clean build directory. See dependencies for supported compiler versions.

ccache

For development, we strongly suggest using ccache, which significantly reduces the time needed for recompiles. Especially when switching branches, this can reduce the time to recompile from several minutes to a minute or less. To use ccache, simply add -DCMAKE_CXX_COMPILER_LAUNCHER=ccache to your cmake call.
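The compiler and ccache options can be combined into a single configure call. The following sketch assembles such a call and adds the ccache launcher only when ccache is actually installed. configure_command is a hypothetical name; the flags are the ones named in the text above. The function prints the command rather than running it.

```shell
# Hypothetical helper: print a cmake configure command for the C compiler $1
# and the C++ compiler $2. The ccache launcher flag is appended only if a
# ccache executable is found on PATH.
configure_command() {
  cmd="cmake -DCMAKE_C_COMPILER=$1 -DCMAKE_CXX_COMPILER=$2"
  if command -v ccache >/dev/null 2>&1; then
    cmd="$cmd -DCMAKE_CXX_COMPILER_LAUNCHER=ccache"
  fi
  printf '%s ..\n' "$cmd"
}
```

For example, configure_command clang clang++ prints the call to run in a clean build directory. Whether the launcher flag appears depends on the machine.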

Build

Simply call make -j*, where * denotes the number of threads to use.
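If you do not want to hard-code the thread count, it can be derived from the number of available cores. This is a sketch, not part of the Hyrise scripts: cpu_count is a hypothetical name, nproc is the usual tool on Linux, and sysctl -n hw.ncpu is its macOS counterpart.

```shell
# Hypothetical helper: print the number of CPU cores, falling back to 1
# if neither nproc (Linux) nor sysctl hw.ncpu (macOS) is available.
cpu_count() {
  nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 1
}

# Print the resulting make invocation instead of running it.
echo "make -j$(cpu_count)"
```

Using one thread per core is a common starting point; a lower number may be preferable on machines with little memory.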

By default, debug binaries are created. To configure a build directory for a release build, make sure it is empty and call CMake like cmake -DCMAKE_BUILD_TYPE=Release ..

Lint

./scripts/lint.sh (uses Google's cpplint, which needs Python 2.7)

Format

./scripts/format.sh (clang-format is used)

Test

Calling make hyriseTest from the build directory builds all available tests. The binary can be executed with ./<YourBuildDirectory>/hyriseTest. Note that the tests, sanitizers, etc. need to be executed from the project root so that the table files can be found.
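Because the working directory matters, it can be convenient to wrap the test call so that it always starts from the project root. The following is a minimal sketch; run_hyrise_tests is a hypothetical name, and the build directory name is whatever you chose for your build.

```shell
# Hypothetical helper: run hyriseTest from the project root so that relative
# table paths resolve.
# $1: path to the project root, $2: build directory relative to the root.
run_hyrise_tests() {
  if [ ! -x "$1/$2/hyriseTest" ]; then
    echo "hyriseTest not found in $1/$2; run 'make hyriseTest' first" >&2
    return 1
  fi
  # The subshell keeps the caller's working directory unchanged.
  ( cd "$1" && "./$2/hyriseTest" )
}
```

For example, run_hyrise_tests . cmake-build-debug runs the tests of a debug build when called from the project root.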

Coverage

./scripts/coverage.sh will print a summary to the command line and create detailed HTML reports at ./coverage/index.html

Coverage is supported only with clang on macOS and only with gcc on Linux.

Address/UndefinedBehavior Sanitizers

cmake -DENABLE_ADDR_UB_SANITIZATION=ON will generate Makefiles with AddressSanitizer and UndefinedBehaviorSanitizer options. Compile and run the binaries as usual; any detected issues are printed to the console. Execution stops at the first detected error and prints a summary. To convert addresses to actual source code locations, make sure llvm-symbolizer is installed (it is included in the llvm package) and available in $PATH. To specify a custom location for the symbolizer, set $ASAN_SYMBOLIZER_PATH to the path of the executable. This seems to work out of the box on macOS; if not, make sure llvm is installed. The binary can be executed with LSAN_OPTIONS=suppressions=asan-ignore.txt ./<YourBuildDirectory>/hyriseTest.
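Before starting a sanitizer run, you can check whether a symbolizer will be found. The sketch below mirrors the two options described above: a custom location in $ASAN_SYMBOLIZER_PATH, or llvm-symbolizer on $PATH. symbolizer_available is a hypothetical name and not part of the Hyrise scripts.

```shell
# Hypothetical helper: succeed if a symbolizer can be located.
# A non-empty ASAN_SYMBOLIZER_PATH must point to an executable file;
# otherwise llvm-symbolizer has to be available on PATH.
symbolizer_available() {
  if [ -n "${ASAN_SYMBOLIZER_PATH:-}" ]; then
    [ -x "$ASAN_SYMBOLIZER_PATH" ]
  else
    command -v llvm-symbolizer >/dev/null 2>&1
  fi
}
```

A typical use is symbolizer_available || echo "install llvm or set ASAN_SYMBOLIZER_PATH" >&2 before executing the sanitizer-enabled hyriseTest binary.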

cmake -DENABLE_THREAD_SANITIZATION=ON will work as above but with the ThreadSanitizer. Some sanitizers are mutually exclusive, which is why we use two configurations for this.

Compile Times

When trying to optimize the time spent building the project, it is often helpful to have an idea of how much time is spent where. scripts/compile_time.sh helps with that. Get usage instructions by running it without any arguments.

Maintainers

  • Jan Kossmann
  • Markus Dreseler
  • Martin Boissier
  • Stefan Klauck

Contact: firstname.lastname@hpi.de

Contributors

  • Yannick Bäumer
  • Lawrence Benson
  • Timo Djürken
  • Fabian Dumke
  • Fabian Engel
  • Moritz Eyssen
  • Martin Fischer
  • Christian Flach
  • Pedro Flemming
  • Mathias Flüggen
  • Johannes Frohnhofen
  • Pascal Führlich
  • Adrian Holfter
  • Sven Ihde
  • Jonathan Janetzki
  • Michael Janke
  • Max Jendruk
  • David Justen
  • Marvin Keller
  • Mirko Krause
  • Eva Krebs
  • Sven Lehmann
  • Tom Lichtenstein
  • Alexander Löser
  • Jan Mattfeld
  • Arne Mayer
  • Julian Menzler
  • Torben Meyer
  • Leander Neiß
  • Hendrik Rätz
  • Alexander Riese
  • Johannes Schneider
  • David Schumann
  • Simon Siegert
  • Arthur Silber
  • Toni Stachewicz
  • Daniel Stolpe
  • Jonathan Striebel
  • Nils Thamm
  • Carsten Walther
  • Marcel Weisgut
  • Lukas Wenzel
  • Fabian Wiebe
  • Tim Zimmermann