CNCF Early Stage Sandbox Project
In short, Tremor is an event processing system. It was originally designed as a replacement for software such as Logstash or Telegraf. However, Tremor has outgrown this single use case by supporting more complex workflows such as aggregation, rollups, an ETL language, and a query language.
More about the history and architecture can be found in the documentation.
Tremor is built for users that have a high message volume to deal with and want to build pipelines to process, route, or limit this event stream. While Tremor specializes in interacting with Kafka, other message systems should be easily pluggable.
Tremor has been successfully used to replace Logstash as a Kafka to Elasticsearch ingress. In this scenario, it reduced the required compute resources by about 80% (YMMV) when decoding, classifying, and rate-limiting the traffic. A secondary but perhaps more important effect was that Tremor's dynamic backpressure and rate-limiting allowed the Elasticsearch system to stay healthy and current despite overwhelming amounts of logs during spikes.
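To make the idea of rate-limiting in front of a downstream system concrete, here is a minimal token-bucket sketch. It is a generic illustration of the technique, not Tremor's own implementation; the class name, parameters, and the injectable clock are assumptions made for this example.

```python
import time


class TokenBucket:
    """Allow roughly `rate` events per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum number of stored tokens
        self.clock = clock        # injectable so the bucket can be tested
        self.tokens = capacity
        self.last = clock()

    def allow(self) -> bool:
        """Return True if one event may pass now, False if it should be shed."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

An event for which `allow()` returns False would be dropped or delayed before it reaches the downstream system, which is what keeps that system healthy during spikes.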
Kafka optimizes its connection lifetime for long-lived, persistent connections, and the rather long connection negotiation phase is a result of that optimization. This can be a disadvantage for languages with a short runtime, such as PHP, or for tools that only run briefly, such as CLI tools. Tremor can provide an HTTP(S) to Kafka bridge that puts events on a queue without going through the Kafka connection setup, relying only on HTTP as the transport.
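From the point of view of a short-lived client, using such a bridge amounts to a single HTTP POST per event. The sketch below uses only the Python standard library; the endpoint URL (host, port, and path) is a hypothetical placeholder, not a value defined by Tremor, and depends entirely on how the bridge is configured.

```python
import json
import urllib.request

# Hypothetical endpoint: host, port, and path are assumptions for illustration.
BRIDGE_URL = "http://localhost:8080/events"


def build_event_request(event: dict, url: str = BRIDGE_URL) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST request carrying one JSON event."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_event(event: dict, url: str = BRIDGE_URL, timeout: float = 2.0) -> int:
    """Send one event over HTTP and return the response status code."""
    request = build_event_request(event, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

The client never opens a Kafka connection; it only pays the cost of one HTTP request.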
- You are currently using software such as Logstash or Telegraf
- You have a high volume of events to handle
- You want to protect a downstream system from overload
- You wish to perform ETL-like tasks on data.
Note: Some of those restrictions are subject to change as tremor is a growing project.
We currently do not recommend tremor where:
- Your event structure is not mappable to a JSON-like data structure.
  - If in doubt, please reach out and create a ticket so we can assist and advise.
  - In many cases (textual formats) a preprocessor, postprocessor, or codec is sufficient, and these are relatively easy to contribute.
- You need connectivity to a system, protocol, or technology that is not currently supported directly or indirectly by the existing set of onramps and offramps.
  - If in doubt, please reach out and create a ticket so we can assist and advise.
We accept and encourage contributions, no matter how small. If Tremor is compelling for your use case or project, please get in touch or raise a ticket; we are happy to collaborate and to guide contributions and contributors.
We provide usage examples of this in the docs/workshop folder. Those examples include a docker-compose.yaml for running them and can serve as a starting point for deploying Tremor.
Tremor runs in a Docker image. If you wish to build a local image, clone this repository and run either make image or docker-compose build. Both will create an image called tremor-runtime:latest.
Note that because the image builds Tremor in release mode, it requires substantial resources. We recommend allowing Docker to use at least 12, preferably 16, gigabytes of memory and as many cores as you can spare. Depending on the system, building the image can take up to an hour.
If you are not comfortable with managing library packages on your system or don't have experience with it, please use the Docker image provided above. Local builds are not supported and are purely at your own risk.
For local builds, Tremor requires Rust 2018 (version 1.31 or later), along with all the tools needed to build Rust programs. For example, on CentOS the packages gcc, make, cmake, clang, openssl, and libstdc++ are required. For other distributions or operating systems, please install the equivalent packages.
NOTE: AVX2, SSE4.2, or NEON is needed to build simd-json, which Tremor uses. If you are building in a VM, check which processor instructions are passed through to it, for example with lscpu | grep Flags
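The same check can be scripted. The following sketch scans a CPU flags listing (such as the output of lscpu or the contents of /proc/cpuinfo) for the relevant extensions. The flag spellings are assumptions based on how Linux commonly reports them; in particular, 64-bit ARM systems typically list NEON support as asimd.

```python
# Flag names as they commonly appear in lscpu or /proc/cpuinfo output (assumed).
# "asimd" is how 64-bit ARM Linux typically reports NEON support.
SIMD_FLAGS = {"avx2": "AVX2", "sse4_2": "SSE4.2", "neon": "NEON", "asimd": "NEON"}


def detect_simd(cpu_flags_text: str) -> set:
    """Return the SIMD extensions named in a CPU flags listing."""
    # Split on whitespace after removing the "Flags:" style separators.
    tokens = set(cpu_flags_text.lower().replace(":", " ").split())
    return {name for flag, name in SIMD_FLAGS.items() if flag in tokens}
```

An empty result suggests that none of the required instruction sets are visible inside the VM, in which case the simd-json build is likely to fail.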
For a more detailed guide on local builds, please refer to the tremor development docs.
To run tremor locally and introspect its Docker environment, do the following:
make image
docker run tremorproject/tremor:latest
To get a local shell, find the container ID of the running Docker container and use it to attach a shell to the container.
docker ps
This returns:
CONTAINER ID   IMAGE            COMMAND                CREATED          STATUS          PORTS   NAMES
fa7e3b4cec86   tremor-runtime   "/tremor-runtime.sh"   43 seconds ago   Up 42 seconds           gracious_shannon
Executing a shell on that container will then give you local access:
docker exec -it fa7e3b4cec86 sh
Tremor uses YAML or tremor-query to configure pipelines. For use in Docker, those files should be mounted to /etc/tremor/config.
Tremor works by chaining operations that have inputs, outputs, and additional configuration. OnRamps - the operations that ingest data - take a unique role in this.
The documentation for the different operations can be found in the docs. The onramps and op modules hold the relevant information.
For each operation, the Config struct defines the parameters that can be passed to configure it, and the description holds additional details and examples.
A list that defines the onramps started in Tremor. Each onramp lives in a separate thread. Along with its configuration, it has the key pipeline, which names the pipeline that data from that onramp is sent to. At least one needs to be present.
onramps:
  - onramp::file:
    file: my-file.json
    pipeline: main
Please look at the demo for a fully documented example.
To use the configuration file as part of the Docker container, mount the configuration files to /etc/tremor/config.
Note: Docker should run with at least 4GB of memory!
To run the demo, run make demo. This requires the tremor-runtime image to exist on your machine.
The demo mode logically follows the flow outlined below. It reads the data from data.json.xz, sends it at a fixed rate to the demo bucket on Kafka, and from there reads it into the tremor container to apply classification and bucketing. Finally, it off-ramps statistics of the data based on those steps.
╔════════════════════╗   ╔════════════════════╗   ╔════════════════════╗
║      loadgen       ║   ║       Kafka        ║   ║       tremor       ║
║ ╔════════════════╗ ║   ║ ┌────────────────┐ ║   ║ ┌────────────────┐ ║
║ ║ tremor-runtime ║─╬───╬▶│  bucket: demo  │─╬───╬▶│ tremor-runtime │ ║
║ ╚════════════════╝ ║   ║ └────────────────┘ ║   ║ └────────────────┘ ║
║         ▲          ║   ╚════════════════════╝   ║         │          ║
║         │          ║                            ║         │          ║
║         │          ║                            ║         ▼          ║
║ ┌────────────────┐ ║                            ║ ┌────────────────┐ ║
║ │  data.json.xz  │ ║                            ║ │     tremor     │ ║
║ └────────────────┘ ║                            ║ └────────────────┘ ║
╚════════════════════╝                            ║         │          ║
                                                  ║         │          ║
                                                  ║         ▼          ║
                                                  ║ ┌────────────────┐ ║
                                                  ║ │    grouping    │ ║
                                                  ║ └────────────────┘ ║
                                                  ║         │          ║
                                                  ║         │          ║
                                                  ║         ▼          ║
                                                  ║ ┌────────────────┐ ║
                                                  ║ │  stats output  │ ║
                                                  ║ └────────────────┘ ║
                                                  ╚════════════════════╝
The demo can be configured in (for example) the demo/configs/tremor/config/config.yaml file.
Configuration lives in demo/configs.
The test data is read from the demo/data/data.json.xz file. This file needs to contain one event (in this case, a valid JSON object) per line and be compressed with xz.
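If you want to supply your own test data, the format described above can be produced and checked with a few lines of Python. This is a sketch using only the standard library; the function names are made up for this example and are not part of Tremor.

```python
import json
import lzma


def write_events(path: str, events: list) -> None:
    """Write events as one JSON object per line, compressed with xz."""
    with lzma.open(path, "wt", encoding="utf-8") as f:
        for event in events:
            f.write(json.dumps(event) + "\n")


def validate_events(path: str) -> int:
    """Count the events in an xz file; raise ValueError on an invalid line."""
    count = 0
    with lzma.open(path, "rt", encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines, e.g. a trailing newline
            event = json.loads(line)  # raises a ValueError subclass if malformed
            if not isinstance(event, dict):
                raise ValueError(f"line {lineno}: expected a JSON object")
            count += 1
    return count
```

Running `validate_events` on a candidate file before starting the demo catches malformed lines early rather than in the middle of a run.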
The tremor-runtime supports a micro-benchmarking framework via a specialized on-ramp (blaster) and off-ramp (blackhole), which act as Tremor input and output adapters. Benchmarks (via blackhole) write high dynamic range histogram latency reports to standard output in a format that is compatible with HDR Histogram's plot files service.
To execute a benchmark, build tremor in release mode and run the examples from the tremor repo base directory:
./bench/run <name>
To compile and run with NEON, use:
RUSTFLAGS="-C target-cpu=native" cargo +nightly build --features neon --all