/sf-processor

SysFlow edge processing pipeline

Primary LanguageGoApache License 2.0Apache-2.0

Build Status Docker Pulls GitHub tag (latest by date) Documentation Status GitHub

Supported tags and respective Dockerfile links

Quick reference

What is SysFlow?

The SysFlow Telemetry Pipeline is a framework for monitoring cloud workloads and for creating performance and security analytics. The goal of this project is to build all the plumbing required for system telemetry so that users can focus on writing and sharing analytics on a scalable, common open-source platform. The backbone of the telemetry pipeline is a new data format called SysFlow, which lifts raw system event information into an abstraction that describes process behaviors, and their relationships with containers, files, and network. This object-relational format is highly compact, yet it provides broad visibility into container clouds. We have also built several APIs that allow users to process SysFlow with their favorite toolkits. Learn more about SysFlow in the SysFlow documentation.

The SysFlow framework consists of the following sub-projects:

  • sf-apis provides the SysFlow schema and programatic APIs in go, python, and C++.
  • sf-collector monitors and collects system call and event information from hosts and exports them in the SysFlow format using Apache Avro object serialization.
  • sf-processor provides a performance optimized policy engine for processing, enriching, filtering SysFlow events, generating alerts, and exporting the processed data to various targets.
  • sf-exporter exports SysFlow traces to S3-compliant storage systems for archival purposes.
  • sf-deployments contains deployment packages for SysFlow, including Docker, Helm, and OpenShift.
  • sysflow is the documentation repository and issue tracker for the SysFlow framework.

About this image

The SysFlow processor is a lighweight edge analytics pipeline that can process and enrich SysFlow data. The processor is written in golang, and allows users to build and configure various pipelines using a set of built-in and custom plugins and drivers. Pipeline plugins are producer-consumer objects that follow an interface and pass data to one another through pre-defined channels in a multi-threaded environment. By contrast, a driver represents a data source, which pushes data to the plugins. The processor currently supports two builtin drivers, including one that reads sysflow from a file, and another that reads streaming sysflow over a domain socket. Plugins and drivers are configured using a JSON file.

Please check Sysflow Processor for documentation on deployment and configuration options.

How to use this image

Starting the processor

The easiest way to run the SysFlow Processor is by using docker-compose. The provided docker-compose.processor.yml file deploys the SysFlow processor and collector. The rsyslog endpoint should be configured in ./config/.env.processor. Collector settings can be changed in ./config/.env.collector. Additional settings can be configured directly in the compose file.

docker-compose -f docker-compose.processor.yml up

Instructions for docker-compose, helm, and oc operator deployments are available here. Alternatively, you can install the SysFlow Processor using its binary installers available in the release pages.

License

View license information for the software contained in this image.

As with all Docker images, these likely also contain other software which may be under other licenses (such as Bash, etc from the base distribution, along with any direct or indirect dependencies of the primary software being contained).

As for any pre-built image usage, it is the image user's responsibility to ensure that any use of this image complies with any relevant licenses for all software contained within.