Graylog2/graylog-plugin-beats

Generic beats support broke existing setups

bernd opened this issue · 11 comments

bernd commented

Since we merged #29, my existing Graylog setup with packetbeat breaks with index failures.

2018-02-15 19:37:46,228 WARN : org.graylog2.indexer.messages.Messages - Failed to index message: index=<testgraylog_2> id=<5067b300-127f-11e8-a7e7-02427ac964ae> error=<{"type":"mapper_parsing_exception","reason":"failed to parse [status]","caused_by":{"type":"number_format_exception","reason":"For input string: \"OK\""}}>

Before this PR, the packetbeat status field was indexed as packetbeat_status; now Graylog tries to index it as status. This fails in my setup because status is already mapped as a long in the index, and packetbeat sends a string.
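To illustrate the conflict, here is a minimal Python sketch (not the plugin's actual code) of how a per-beat prefix keeps packetbeat's string status away from an existing numeric status field. The function name apply_prefix and the sample event are hypothetical and exist only for this illustration.

```python
def apply_prefix(fields, beat_name, use_prefix):
    """Return a copy of fields, optionally prefixing each key with '<beat_name>_'."""
    if not use_prefix:
        return dict(fields)
    return {f"{beat_name}_{key}": value for key, value in fields.items()}


# Hypothetical packetbeat event: 'status' is a string here, while the
# existing index already maps 'status' as a long.
event = {"status": "OK", "method": "GET"}

# With the prefix, the string lands in 'packetbeat_status' and cannot collide.
print(apply_prefix(event, "packetbeat", use_prefix=True))

# Without the prefix, the string is written to 'status' and the indexer
# rejects it with a mapper_parsing_exception like the one logged above.
print(apply_prefix(event, "packetbeat", use_prefix=False))
```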

bernd commented

I am also seeing fields like the following, with a leading @ character, which were not there before.
image

(The screenshot has fields with a packetbeat_ prefix because I changed the code to add the prefix again by default)

@bernd What's the complete configuration of your Beats input, especially the beats_prefix setting?

Regarding the fields with leading "@", they're there because the Beats codec now doesn't just emit a few hand picked attributes from the Beats messages, but all of them.

bernd commented

@joschi The beats_prefix setting does not exist because the input was created before that setting was introduced in #29.

{
	"_id" : ObjectId("59d23b632e28122698486bf9"),
	"creator_user_id" : "bernd",
	"configuration" : {
		"recv_buffer_size" : 1048576,
		"port" : 5044,
		"tls_key_file" : "",
		"tls_enable" : false,
		"tls_key_password" : "",
		"tcp_keepalive" : false,
		"tls_client_auth_cert_file" : "",
		"tls_client_auth" : "disabled",
		"override_source" : null,
		"bind_address" : "0.0.0.0",
		"tls_cert_file" : ""
	},
	"name" : "Beats",
	"created_at" : ISODate("2017-12-13T10:57:21.596Z"),
	"global" : true,
	"type" : "org.graylog.plugins.beats.BeatsInput",
	"title" : "Beats",
	"content_pack" : null
}

@bernd Try enabling the Beats prefix setting (the default is false).

bernd commented

@bernd @kroepke Existing users can still use the old naming scheme by enabling the beats_prefix setting, new users will get the "prefix-less" variant by default.

This was part of the review discussion in #29 and was approved, so I don't know why this is a surprise.

bernd commented

I think the problem is that filebeat data was never prefixed, but others always were.
For some reason I missed that both during the discussion and during the review.

Now the problem is: There's no way to unify handling prefixes while staying backwards compatible for all beats types, without creating a "legacy mode" flag. I'd really hate that :(

But yes, you are right, the intention was to stay compatible.

Off the top of my head I can think of only one option here:
When migrating existing inputs, add a combination of flags that

  • turn off prefixes for filebeat
  • turn on prefixes for all others

When starting a fresh input:

  • turn off prefixes

Essentially this would be a "legacy behavior" flag. If we don't do this, most existing integrations with dashboards and streams are likely to break silently, I'm afraid.
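The flag combination proposed above could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the plugin's implementation: the key legacy_prefix_behavior and both function names are hypothetical, and only beats_prefix comes from this thread.

```python
def migrate_input_config(config):
    """Return a migrated copy of an input configuration.

    Inputs created before beats_prefix existed have no such key; they are
    marked with the hypothetical 'legacy_prefix_behavior' flag so their
    field names stay as they were.
    """
    migrated = dict(config)
    if "beats_prefix" not in migrated:
        migrated["legacy_prefix_behavior"] = True
        migrated["beats_prefix"] = False
    return migrated


def should_prefix(beat_type, config):
    """Decide whether fields of this beat type get a '<beat_type>_' prefix."""
    if config.get("legacy_prefix_behavior", False):
        # Old behavior as described in this thread:
        # filebeat was never prefixed, all other beats were.
        return beat_type != "filebeat"
    # Fresh inputs: follow beats_prefix, which defaults to off.
    return config.get("beats_prefix", False)


# An existing input (no beats_prefix key) keeps the old field names.
old_input = migrate_input_config({"port": 5044})
print(should_prefix("packetbeat", old_input))
print(should_prefix("filebeat", old_input))

# A freshly created input is prefix-less for every beat type.
new_input = migrate_input_config({"port": 5044, "beats_prefix": False})
print(should_prefix("packetbeat", new_input))
```

The cost of this approach is the extra flag itself: the input has to carry a setting that only exists for backwards compatibility.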

bernd commented

Another option would be to create a new input/codec with the new behavior and change the name of the existing one to "legacy beats" (or something similar). That way we don't have to add any migrations and can avoid creating complicated conditions in the input and even more options for the user.

bernd commented

@joschi @kroepke Also, the facility in my setup changed from packetbeat to beats.

Now that the beats plugin got merged into Graylog server, we should come up with a fix for this situation.