dragonfly-mle: A C repository from Veshialle

DEPRECATED

This project is no longer active. Please refer to: https://github.com/counterflow-ai/dragonfly2

NAME

dragonfly-mle - Dragonfly Machine Learning Engine (MLE)

SYNOPSIS

dragonfly-mle [-p ] [ -v] [ -r root directory ] [ -c chroot directory ]

DESCRIPTION

A scalable, scriptable, streaming application engine for network threat detection built on Redis and LuaJIT. MLE provides a powerful framework for operationalizing anomaly detection algorithms, threat intelligence lookups, and machine learning predictions with trained models. MLE is lightweight, fast, and flexible. It is designed to run in tandem with a deep packet inspection engine like Suricata. Executing user-defined analyzers implemented in Lua, it can process hundreds of thousands of events per second.

OPTIONS

-p Drop privilege -v Verbose mode

-r root directory The base directory for dragonfly

-c chroot directory Change root directory by invoking the chroot() system call

FEATURES

Designed to integrate with Suricata
Implemented in C with scalable multi-threaded execution paths
User-defined LuaJIT scripting with native support for json and redis
Native support for Redis ML operations ( https://oss.redislabs.com/redisml/ )
Able to run as a Dockerized application

ARCHITECTURE

The MLE pipeline implemented as a user-configurable system of queues with three types of event processors:

Input processor - pulls messages out of a source, normalizes the data into JSON format, and routes it to the appropriate analyzer queue for processing. Message sources are either files, Unix sockets, or kafka brokers. Normalization and ETL operations are performed by a user-defined Lua script.
Analysis processor - pulls messages out of the queue, analyzes the event, and routes results to the appropriate output queue for processing. Analyzers are implemented as user-defined Lua scripts.
Output processor - pulls messages out of the queue and delivers it to the appropriate sink. Message sinks are either file, Unix sockets, or Kafka brokers.

CONFIGURATION

The MLE pipeline is defined in a file named config.lua, which is located in the scripts sub directory under the dragonfly root directory.

${DRAGONFLY_ROOT}/scripts/config.lua

This file requires three constructs implemented as Lua tables.

The input table contains configuration for message sources. Messages can be alerts and/or network security monitoring events. Valid source types include file, tails, kafka, and ipc.

inputs = {
  { tag="eve", uri="tail:///var/log/suricata/eve.json", script="eve-etl-lua"},
  { tag="flow", uri="file:///var/log/suricata/flow.json", script="flow-etl-lua"},
  { tag="dns", uri="ipc:///opt/var/log/suricata/dns.json", script="dns-etl-lua"}
}

The analyzer table contains configuration for user-definable analyzers.

analyzers = {
   {tag="flow", script="example-flow.lua"},
   {tag="http", script="example-http.lua"},
   {tag="tls", script="example-tls.lua"}
}

The output table contains configuration for output sinks. Valid sink types include file, kafka, and ipc.

outputs = {
   {tag="eve", uri="file://eve-alerts.log"},
   {tag="tls", uri="ipc://tls-alerts.log"},
}

DIRECTORY STRUCTURE

To operate successfully, MLE requires a root directory that includes the following structure:

Directory	Description
${DRAGONFLY_ROOT}	base directory
${DRAGONFLY_ROOT}/config	location of config.lua file
${DRAGONFLY_ROOT}/filter	directory for filtering scripts
${DRAGONFLY_ROOT}/analyzer	directory for analyzer scripts
${DRAGONFLY_ROOT}/logs	directory used by the output processor
${DRAGONFLY_ROOT}/bin	location of dragonfly-mle program

QUICK START

Using Docker, this example assumes there is an instance of Suricata already installed and running on the host and it is logging to eve.json in directory /var/log/suricata/log.

 $ git clone https://github.com/counterflow-ai/dragonfly-mle.git
 $ cd dragonfly-mle
 $ docker build -t dragonfly .
 $ docker run -it -v /var/log/suricata/log:/var/log/suricata dragonfly

For better grasp on how things function, be sure to study the Dockerfile, config.lua and the example scripts referenced. Remember to rebuild the Docker image whenever any changes are made to any of the scripts.

EXAMPLES

DNS processing
Flow processing
TLS processing

BUILTIN MLE LUA FUNCTIONS

analyzer_event ()
output_event ()
timer_event ()
http_get ( )

TODO

Implement Kafka consumer and producer
Implement Parquet output/index
documentation @ https://readme.io

LICENSE

GNU General Public License, version 2

Veshialle/dragonfly-mle