nested message field in message field gets lost when overlaying parsed message
warmfusion opened this issue · 4 comments
Given a JSON-structured rsyslog event which has a field named message that contains a JSON-formatted string from an application,
And a Fluentd configuration such that the message field from rsyslog is parsed as JSON and overlaid onto the original event,
When an event is emitted by an app which includes message as part of its log,
Then the burrow plugin parses the JSON and overlays it onto the main record, but removes the message field (which was parsed out of the original message).
Rsyslog Configuration
# A lovely template
template(name="json-photon"
type="list") {
constant(value="{")
constant(value="\"@timestamp\":\"") property(name="timereported" dateFormat="rfc3339")
constant(value="\",\"type\":\"syslog")
constant(value="\",\"tag\":\"") property(name="syslogtag" format="json")
constant(value="\",\"relayhost\":\"") property(name="fromhost")
constant(value="\",\"relayip\":\"") property(name="fromhost-ip")
constant(value="\",\"logsource\":\"") property(name="source")
constant(value="\",\"hostname\":\"") property(name="hostname" caseconversion="lower")
constant(value="\",\"program\":\"") property(name="programname")
constant(value="\",\"priority\":\"") property(name="pri")
constant(value="\",\"severity\":\"") property(name="syslogseverity")
constant(value="\",\"facility\":\"") property(name="syslogfacility")
constant(value="\",\"severity_label\":\"") property(name="syslogseverity-text")
constant(value="\",\"facility_label\":\"") property(name="syslogfacility-text")
constant(value="\",\"message\":\"") property(name="msg" format="json")
constant(value="\"}\n")
}
# This line sends all lines to defined IP address at port 10034,
# using the "json-photon" format template
*.* @fluent:10034;json-photon
FluentD Configuration
<system>
log_level trace
</system>
<source>
port 10034
bind 0.0.0.0
tag syslog
format json
@type udp
@id syslog-source
</source>
<filter>
@type stdout
@id debug-filter
</filter>
<match syslog>
@type rewrite_tag_filter
@id syslog-matcher
<rule>
key tag
pattern /^(\w+)/
tag syslog.$1
</rule>
</match>
# Attempt to filter all events as if they have json
# and when they do, expand the object and
# overlay to the base of this record
# then reemit without the `syslog` prefix for processing
# like normal things
<match syslog.**>
@type burrow
key_name message
format json
remove_prefix syslog
action overlay
</match>
<match **>
@type stdout
@id debug-matcher
</match>
Python Test Code
from logging.handlers import SysLogHandler
import logging

logger = logging.getLogger()

# Add handler to the logger
handler = SysLogHandler()

# Add formatter to the handler
formatter = logging.Formatter('python: { "loggerName":"%(name)s", "timestamp":"%(asctime)s", "pathName":"%(pathname)s", "logRecordCreationTime":"%(created)f", "functionName":"%(funcName)s", "levelNo":"%(levelno)s", "lineNo":"%(lineno)d", "time":"%(msecs)d", "levelName":"%(levelname)s", "message":"%(message)s"}')
handler.setFormatter(formatter)
logger.addHandler(handler)

# logger.info("Test INFO Message")
# logging.warning("Hello WARNING message")
# logging.error("Hello ERROR message")
logging.critical("Hello CRITICAL message")
Results
Fluentd logs show (JSON formatted for readability):
fluent_1 | 2019-11-23 12:24:02.203101000 +0000 syslog.python:
{
"@timestamp": "2019-11-23T12:24:02.202400+00:00",
"type": "syslog",
"tag": "python:",
"relayhost": "172.18.0.1",
"relayip": "172.18.0.1",
"logsource": "172.18.0.1",
"hostname": "172.18.0.1",
"program": "python",
"priority": "10",
"severity": "2",
"facility": "1",
"severity_label": "crit",
"facility_label": "user",
"message": " { \"loggerName\":\"root\", \"timestamp\":\"2019-11-23 12:24:02,191\", \"pathName\":\"send_syslog_message.py\", \"logRecordCreationTime\":\"1574511842.191565\", \"functionName\":\"<module>\", \"levelNo\":\"50\", \"lineNo\":\"22\", \"time\":\"191\", \"levelName\":\"CRITICAL\", \"message\":\"Hello CRITICAL message\"}"
}
This includes the message field from rsyslog, which contains the JSON-formatted string from the application.
Once burrow parses the event, we then get:
fluent_1 | 1970-01-01 00:03:11.000000000 +0000 python:
{
"@timestamp": "2019-11-23T12:24:02.202400+00:00",
"type": "syslog",
"tag": "python:",
"relayhost": "172.18.0.1",
"relayip": "172.18.0.1",
"logsource": "172.18.0.1",
"hostname": "172.18.0.1",
"program": "python",
"priority": "10",
"severity": "2",
"facility": "1",
"severity_label": "crit",
"facility_label": "user",
"loggerName": "root",
"timestamp": "2019-11-23 12:24:02,191",
"pathName": "send_syslog_message.py",
"logRecordCreationTime": "1574511842.191565",
"functionName": "<module>",
"levelNo": "50",
"lineNo": "22",
"levelName": "CRITICAL"
}
This no longer includes the message key.
Expected
The message key from the original event should be replaced by the message key parsed from the JSON string that the original record contained.
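In other words, a sketch of the expected record (same fields as the second output above, shown abbreviated, with the nested value carried through):
{
  ...
  "levelNo": "50",
  "lineNo": "22",
  "levelName": "CRITICAL",
  "message": "Hello CRITICAL message"
}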
Hello,
It looks like you want to set keep_key to true in your burrow configuration. You can read about that in the README here: https://github.com/vanilla/fluent-plugin-burrow/blob/master/README.md
action can be one of four values: inplace, overlay, replace, and prefix. When using overlay, the default behaviour is to remove the original encoded key, but keep_key disables that.
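For example, a minimal sketch based on the match block from your configuration, with keep_key added:
<match syslog.**>
  @type burrow
  key_name message
  format json
  remove_prefix syslog
  action overlay
  keep_key true
</match>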
Hope this fixes your issue.
That isn't what I need; that would presumably keep the message block for all events, whether or not the nested value includes a message key.
The key that should be kept is the one from the object inside the message, not the original message itself.
I now see what you're saying. This might be a bug.
It could be that https://github.com/vanilla/fluent-plugin-burrow/blob/master/lib/fluent/plugin/filter_burrow.rb#L88-L93 should be moved before the merge.
It still means that if you have a nested key you might overwrite the original value, but I'm not sure there's anything that can be done about that if overlay is chosen. Something like this:
def filter(tag, time, record)
  raw_value = record[@key_name]
  if raw_value then
    new_time, new_values = nil, nil
    @parser.parse(raw_value) do |parsed_time, parsed_values|
      new_time = parsed_time
      new_values = parsed_values
    end
    if new_values then
      original_time = record[@record_time_key]
      new_time ||= original_time

      # Remove the key before we merge the new values, in case the merge re-creates the key
      if ['overlay', 'replace', 'prefix'].include? @action
        if not @keep_key and record.has_key?(@key_name)
          record.delete(@key_name)
        end
      end

      # Overlay new record on top of original record?
      new_record = case @action
        when 'inplace'
          record.merge({@key_name => new_values})
        when 'overlay'
          record.merge(new_values)
        when 'replace'
          new_values
        when 'prefix'
          record.merge({@data_prefix => new_values})
      end

      # Preserve 'time' key?
      if @keep_time
        new_record[@record_time_key] = original_time
      end

      new_record
    end
  end
end
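For reference, a minimal illustration of why the ordering matters (hypothetical values, not taken from the plugin): deleting the encoded key before the merge lets a nested message key from the parsed payload survive, whereas deleting it afterwards removes the nested value too.
record     = { "program" => "python", "message" => "{\"message\": \"Hello CRITICAL message\"}" }
new_values = { "levelName" => "CRITICAL", "message" => "Hello CRITICAL message" }

# Current order: merge first, then delete the key -- the nested message value is lost as well.
buggy = record.merge(new_values)
buggy.delete("message")
# buggy => { "program" => "python", "levelName" => "CRITICAL" }

# Proposed order: delete the encoded key first, then merge -- the nested message value survives.
record.delete("message")
fixed = record.merge(new_values)
# fixed => { "program" => "python", "levelName" => "CRITICAL", "message" => "Hello CRITICAL message" }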