/graylog-guide-syslog-kafka

This guide will give you a little help on using Graylog with a Kafka input to receive syslog data.

Apache License 2.0

Sending Syslog via Kafka into Graylog

If your setup needs to buffer log messages during the transport to Graylog or Graylog is not accessible from all network segments, you can use Apache Kafka as a message broker from which Graylog will pull messages, once they are available.

Please be aware that Graylog will connect to Apache ZooKeeper and fetch the topics matched by the configured regular expression. Adding SSL/TLS or authentication information is not possible with the latest stable version of Graylog (2.1.0 at the time of writing).


NOTE: This guide will not give you a complete copy & paste howto, but it will guide you through the setup process and provide additional information where necessary.

Please do not follow the described steps blindly if you don't know how to deal with common issues yourself.


In the scenario used in this guide, a syslog message will run through the following stages:

  • Message sent from rsyslog to Logstash via TCP or UDP
  • Message sent from Logstash to Apache Kafka
  • Message pulled and consumed from Apache Kafka by Graylog (via Kafka input)
  • Structured syslog information extracted from JSON payload by Graylog

If you run rsyslog 8.7.0 or higher with support for Apache Kafka, the message can run through the following stages:

  • Message sent from rsyslog to Apache Kafka
  • Message pulled and consumed from Apache Kafka by Graylog (via Kafka input)
  • Structured syslog information extracted from JSON payload by Graylog

We assume that there is an Apache Kafka instance running on kafka.int.example.org (192.168.100.10) and a Graylog instance is running on graylog.int.example.org (192.168.1.10). Additionally, the logs will be generated by Linux systems syslog.o1.example.org (192.168.50.30) and syslog.o2.example.org (192.168.2.30).

All systems are running Ubuntu Linux, so you might need to adjust some configuration paths on other operating systems.

Prepare Apache Kafka

If you do not have a running Apache Kafka cluster, you can follow the quickstart guide, but be aware that this is not a hardened production-ready setup!
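For a quick test you can create the topic used later in this guide by hand. The script path, ZooKeeper port, and the single-partition layout below are assumptions for a minimal single-node installation, not a hardened setup:

```shell
# Create the "logs" topic (single partition, no replication - test layout only).
bin/kafka-topics.sh --create --zookeeper 192.168.100.10:2181 \
    --replication-factor 1 --partitions 1 --topic logs

# Verify that the topic exists.
bin/kafka-topics.sh --list --zookeeper 192.168.100.10:2181
```
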

Send messages with rsyslog

With rsyslog, you can use templates to format messages. Formatting the messages directly at the source will help to have a clean, predictable workflow.

In order to identify log messages via the fully qualified domain name (FQDN) of the system that created them, we use the configuration option PreserveFQDN - but you will need working DNS resolution for this.

rsyslog will send the log message via UDP to the local Logstash instance (listening on 127.0.0.1:5514).

PreserveFQDN on
template(name="ls_json"
         type="list"
         option.json="on") {
           constant(value="{")
             constant(value="\"@timestamp\":\"")     property(name="timereported" dateFormat="rfc3339")
             constant(value="\",\"@version\":\"1")
             constant(value="\",\"message\":\"")     property(name="msg")
             constant(value="\",\"host\":\"")        property(name="hostname")
             constant(value="\",\"severity\":\"")    property(name="syslogseverity-text")
             constant(value="\",\"facility\":\"")    property(name="syslogfacility-text")
             constant(value="\",\"programname\":\"") property(name="programname")
             constant(value="\",\"procid\":\"")      property(name="procid")
           constant(value="\"}\n")
         }

*.* @127.0.0.1:5514;ls_json

The configuration above needs to be saved to /etc/rsyslog.d/90-logstash.conf on the syslog hosts, syslog.o1.example.org and syslog.o2.example.org in our example. Additionally, rsyslog must be restarted with the command service rsyslog restart to read the new configuration.
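To sanity-check the template output before wiring up Kafka, you can run a rendered sample line through a JSON parser, roughly what Graylog's JSON extractor will do later. The field names below mirror the ls_json template; the field values are made up:

```python
import json

# A sample line as the ls_json template above would render it
# (values are illustrative, not real log output).
sample = ('{"@timestamp":"2016-10-05T12:00:00+02:00","@version":"1",'
          '"message":"session opened for user root","host":"syslog.o1.example.org",'
          '"severity":"info","facility":"auth","programname":"su","procid":"1234"}')

event = json.loads(sample)

# These are the fields the Graylog JSON extractor will pick apart.
print(event["host"])      # syslog.o1.example.org
print(event["severity"])  # info
```

If the parse fails here, it will also fail in Logstash's json codec and in Graylog.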

Route messages with rsyslog

If you have rsyslog 8.7.0 or higher, you can use the rsyslog Kafka output module omkafka to send the messages from rsyslog directly to Apache Kafka:

module(load="omkafka")
action(type="omkafka" topic="logs" broker=["192.168.100.10:9092"] template="ls_json")

Route messages with Logstash

If your rsyslog does not support the Kafka output module, you can use Logstash to forward messages to Graylog.

Logstash will listen on localhost (127.0.0.1) on port 5514/udp for messages that are coming from rsyslog and will forward them to the Apache Kafka cluster.

input {
    udp {
        port => 5514
        host => "127.0.0.1"
        type => "syslog"
        codec => "json"
    }
}

output {
    kafka {
        bootstrap_servers => "192.168.100.10:9092"
        topic_id => "logs"
    }
}

Additional information about the configuration options can be found in the Kafka output module documentation of Logstash.
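Before configuring Graylog, you can verify that messages actually arrive in the topic by attaching a console consumer on the Kafka node; the script path is an assumption that depends on your installation:

```shell
# Print everything in the "logs" topic from the beginning.
bin/kafka-console-consumer.sh --bootstrap-server 192.168.100.10:9092 \
    --topic logs --from-beginning
```

Each printed line should be one JSON document in the shape produced by the ls_json template.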

Consume messages with Graylog

Now the log messages need to be pulled and consumed by Graylog.

Create a Syslog Kafka input and configure it according to the information from the previous steps in this guide (ZooKeeper address and a topic filter regular expression matching the logs topic). Also set the option Allow overwrite date.

Start the newly created Syslog Kafka input to consume the first messages and create a JSON extractor. Additionally, create a second extractor of type Copy input on the field host and store it in the field source. You might want a third Copy input extractor to store Logstash's @timestamp field in the timestamp message field used by Graylog.

What's next?

You could use the rsyslog Linux systems as Syslog proxies for every possible source in the same network and add more systems to your setup.

Credits