
Allow the creation of custom data streams

Currently the Elasticsearch output in Logstash does not allow the creation of a custom data stream type. If you use the data_stream_* settings of the output, it validates data_stream_type and only allows the following values:

  • logs
  • metrics
  • synthetics
  • traces

All of those types are also used by Elastic Agent and have system-managed templates and lifecycle policies. To use data streams in Logstash today, you need to create a template for the type you want while making sure that this template does not override the system templates. This makes things more complex, and there is always the risk that a human error overrides the templates used by Elastic Agent and breaks things.

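To use custom data streams in Logstash you need a workaround in the output: disable the plugin's data stream support and write to the data stream name as if it were a regular index, as in the example below: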

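output {
    elasticsearch {
        hosts => ["HOSTS"]
        # name of the custom data stream, written to as a plain index
        index => "data-stream-name"
        # data streams only accept the "create" action
        action => "create"
        http_compression => true
        # turn off the plugin's data stream handling to skip data_stream_type validation
        data_stream => false
        # the template and lifecycle policy are managed outside Logstash
        manage_template => false
        ilm_enabled => false
        cacert => 'ca.crt'
        user => 'USER'
        password => 'PASSWORD'
    }
}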

While this works, Logstash should allow the creation of data streams of custom types, which is not possible now.

Thoughts on implementation options:

  1. migrate the config option's validation to a regexp like \A(?!\.{1,2}$)[[:lower:][:digit:]][[:lower:][:digit:]\._+]{0,252}\Z that successfully rejects known-invalid index prefixes while letting likely-valid ones through (limitation: the composed index name length cannot be validated from a single component alone).
  2. add a validator to Validator Support mixin that does the same as (1) more readably/efficiently
  3. validate the composed index name for data stream in #initialize or #register if and only if data_stream is effectively true
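
To make the options above more concrete, here is a rough Ruby sketch of how (1) and (3) could fit together. It is not the plugin's actual code: the constant and method names (DATA_STREAM_TYPE_PATTERN, MAX_COMPOSED_BYTES, valid_type?, composed_data_stream_name) are hypothetical, the 255-byte limit on the composed name is an assumption based on the Elasticsearch index name limit, and only the type component is checked against the regexp because that is the only setting discussed here.

```ruby
# Hypothetical sketch, not the plugin's real implementation.

# Pattern from option 1: lowercase letters or digits first, then up to
# 252 more lowercase letters, digits, '.', '_' or '+'.
DATA_STREAM_TYPE_PATTERN =
  /\A(?!\.{1,2}$)[[:lower:][:digit:]][[:lower:][:digit:]\._+]{0,252}\Z/

# Assumption: the composed name is subject to a 255-byte limit.
MAX_COMPOSED_BYTES = 255

# Option 1: validate the single data_stream_type component.
def valid_type?(value)
  value.is_a?(String) && DATA_STREAM_TYPE_PATTERN.match?(value)
end

# Option 3: validate the composed name only when data streams are
# effectively enabled. Returns the composed name, or nil when disabled.
def composed_data_stream_name(type, dataset, namespace, data_stream_enabled)
  return nil unless data_stream_enabled

  unless valid_type?(type)
    raise ArgumentError, "invalid data_stream_type: #{type.inspect}"
  end

  # Data stream names follow the type-dataset-namespace scheme.
  composed = [type, dataset, namespace].join("-")
  if composed.bytesize > MAX_COMPOSED_BYTES
    raise ArgumentError, "data stream name exceeds #{MAX_COMPOSED_BYTES} bytes"
  end

  composed
end
```

With this sketch, composed_data_stream_name("custom", "generic", "default", true) yields "custom-generic-default", while a type such as "Custom" raises an ArgumentError. When data_stream is not in effect the check is skipped entirely, which is the point of option (3).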