fluent/fluent-bit

Docker logs in tail input - no metadata

konstantin-kornienko opened this issue · 6 comments

Is your feature request related to a problem? Please describe.
Docker logs collected by Fluent Bit are missing container metadata.

Describe the solution you'd like
I think a filter that enriches logs with Docker container metadata would solve the issue.

Describe alternatives you've considered
For now, I have switched to Filebeat.

As a possible workaround, I developed a Lua script to enrich records with Docker metadata:

[INPUT]
  Name   tail
  Path   /var/lib/docker/containers/*/*.log
  Parser docker
  Refresh_Interval 30
  Ignore_Older 6h
  Docker_Mode  On
  Tag source.docker.<file_name>
  Tag_Regex (?<file_name>.+)

[FILTER]
  Name   lua
  Match  source.docker.*
  script docker-metadata.lua
  call   encrich_with_docker_metadata

docker-metadata.lua:

DOCKER_VAR_DIR = '/var/lib/docker/containers/'
DOCKER_CONTAINER_CONFIG_FILE = '/config.v2.json'
CACHE_TTL_SEC = 300
DOCKER_CONTAINER_METADATA = {
  ['docker.container_name'] = '\"Name\":\"/?(.-)\"',
  ['docker.container_image'] = '\"Image\":\"/?(.-)\"',
  ['docker.container_started'] = '\"StartedAt\":\"/?(.-)\"'
}
cache = {}

-- record tag should contain docker log file path
-- last part of log file name is container id
function get_container_id_from_tag(tag)
  return tag:match'^.+/(.-)-json_log$'
end

-- Gets metadata from config.v2.json file for container
function get_container_metadata_from_disk(container_id)
  local docker_config_file = DOCKER_VAR_DIR .. container_id .. DOCKER_CONTAINER_CONFIG_FILE
  local fl = io.open(docker_config_file, 'r')
  if fl == nil then
    return nil
  end

  -- scan the json file with patterns and create record for cache
  local data = { time = os.time() }
  local found = false
  for line in fl:lines() do
    for key, regex in pairs(DOCKER_CONTAINER_METADATA) do
      local match = line:match(regex)
      if match then
        data[key] = match
        found = true
      end
    end
  end
  fl:close()

  -- data always holds 'time', so track matches explicitly
  if not found then
    return nil
  end
  return data
end

function encrich_with_docker_metadata(tag, timestamp, record)
  -- Get container id from tag
  local container_id = get_container_id_from_tag(tag)
  if not container_id then
    -- code 0: keep the record unmodified
    return 0, timestamp, record
  end

  -- Add container_id to record
  local new_record = record
  new_record['docker.container_id'] = container_id

  -- Check if we have fresh cache record for container
  local cached_data = cache[container_id]
  if cached_data == nil or ( os.time() - cached_data['time'] > CACHE_TTL_SEC) then
    cached_data = get_container_metadata_from_disk(container_id)
    cache[container_id] = cached_data
    new_record['source'] = 'disk' -- for troubleshooting only
  else
    new_record['source'] = 'cache' -- for troubleshooting only
  end

  -- Metadata found in cache or got from disk, enrich record
  if cached_data then
    for key, regex in pairs(DOCKER_CONTAINER_METADATA) do
      new_record[key] = cached_data[key]
    end
  end

  return 1, timestamp, new_record
end

Addition: it looks like os.time() isn't working in Fluent Bit's Lua for some reason, so caching doesn't work as expected: it caches information about a container forever.
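One possible way around the os.time() problem is to use the record timestamp that the filter already receives as the clock for the cache. This is only a sketch: `get_cached` and `loader` are hypothetical names, and it assumes the `timestamp` argument is a number of seconds.

```lua
-- Sketch: TTL cache driven by the record timestamp instead of os.time().
-- Assumption: `now` is the numeric `timestamp` argument (in seconds)
-- that the Lua filter callback receives.
local CACHE_TTL_SEC = 300
local cache = {}

-- Returns cached metadata for container_id. Calls loader(container_id)
-- when the entry is missing or older than CACHE_TTL_SEC.
function get_cached(container_id, now, loader)
  local entry = cache[container_id]
  if entry == nil or (now - entry.time) > CACHE_TTL_SEC then
    local data = loader(container_id)
    if data == nil then
      -- nothing on disk: drop any old entry and report a miss
      cache[container_id] = nil
      return nil
    end
    entry = { time = now, data = data }
    cache[container_id] = entry
  end
  return entry.data
end
```

Inside the filter this would be called as get_cached(container_id, timestamp, get_container_metadata_from_disk). Record timestamps can lag behind wall-clock time when older log files are being read, so the TTL is approximate.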

I would also like to have this. Kubernetes is covered, but most of our services are still on AWS ECS, so we've had to switch to Fluentd in the meantime.

Fixed code to work for latest version in case anyone needs it...

(Also took out cache ttl code since os.time wasn't working)

fluent-bit.conf:

[INPUT]
    Name   tail
    Path   /var/lib/docker/containers/*/*.log
    Parser docker
    Refresh_Interval 30
    Ignore_Older 6h
    Docker_Mode  On
    Tag source.docker.<container_id>
    Tag_Regex (.*\/(?<container_id>.*)-json\.log)

[FILTER]
    Name   lua
    Match  source.docker.*
    script /fluent-bit/bin/docker-metadata.lua
    call   encrich_with_docker_metadata

docker-metadata.lua:

DOCKER_VAR_DIR = '/var/lib/docker/containers/'
DOCKER_CONTAINER_CONFIG_FILE = '/config.v2.json'
DOCKER_CONTAINER_METADATA = {
  ['docker.container_name'] = '\"Name\":\"/?(.-)\"',
  ['docker.container_image'] = '\"Image\":\"/?(.-)\"',
  ['docker.container_started'] = '\"StartedAt\":\"/?(.-)\"'
}

cache = {}

-- Gets metadata from config.v2.json file for container
function get_container_metadata_from_disk(container_id)
  local docker_config_file = DOCKER_VAR_DIR .. container_id .. DOCKER_CONTAINER_CONFIG_FILE
  local fl = io.open(docker_config_file, 'r')

  if fl == nil then
    return nil
  end

  -- Parse json file and create record for cache
  local data = {}
  for line in fl:lines() do
    for key, regex in pairs(DOCKER_CONTAINER_METADATA) do
      local match = line:match(regex)
      if match then
        data[key] = match
      end
    end
  end
  fl:close()

  if next(data) == nil then
    return nil
  else
    return data
  end
end

function encrich_with_docker_metadata(tag, timestamp, record)
  -- Get container id from tag
  local container_id = tag:match'.*%.(.*)'
  if not container_id then
    return 0, timestamp, record
  end

  -- Add container_id to record
  local new_record = record
  new_record['docker.container_id'] = container_id

  -- Check if we have a cache record for container; load and store it if not
  local cached_data = cache[container_id]
  if cached_data == nil then
    cached_data = get_container_metadata_from_disk(container_id)
    cache[container_id] = cached_data
  end

  -- Metadata found in cache or got from disk, enrich record
  if cached_data then
    for key, regex in pairs(DOCKER_CONTAINER_METADATA) do
      new_record[key] = cached_data[key]
    end
  end

  return 1, timestamp, new_record
end
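The tag-to-ID step can be checked outside Fluent Bit with a plain Lua interpreter. A minimal sketch, assuming tags of the form source.docker.<container_id> as set in the config above; `container_id_from_tag` is a hypothetical wrapper and the sample ID is made up.

```lua
-- Same pattern as in the filter above: greedy match up to the last dot,
-- then capture everything after it.
function container_id_from_tag(tag)
  return tag:match('.*%.(.*)')
end

print(container_id_from_tag('source.docker.0123456789ab'))  -- prints 0123456789ab
print(container_id_from_tag('nodots'))                      -- prints nil
```

A tag that ends with a dot yields an empty string rather than nil, and an empty string is truthy in Lua, so the `if not container_id` check in the filter would not catch that case.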

Here's a derivative of @konstantin-kornienko's idea using lua-cjson for metadata and the json parser for structured log messages:

fluent-bit.conf:

[SERVICE]
    daemon                  false
    flush                   1
    log_level               warning
    parsers_file            parsers.conf
    http_server             true
    http_listen             0.0.0.0
    http_port               2020

[INPUT]
    name                    tail
    tag                     docker.<container_id>
    tag_regex               (?<container_id>[^/]+)-json\.log$
    path                    /var/lib/docker/containers/*/*-json.log
    db                      /var/log/fluent-bit-docker.pos
    parser                  docker
    docker_mode             true
    buffer_chunk_size       64k
    buffer_max_size         64k
    mem_buf_limit           16m
    skip_long_lines         true
    refresh_interval        10

[FILTER]
    name                    parser
    match                   docker.*
    key_name                log
    parser                  json

[FILTER]
    name                    lua
    match                   docker.*
    script                  filters.lua
    call                    enrich

parsers.conf:

[PARSER]
    name                    docker
    format                  json
    time_key                time
    time_format             %Y-%m-%dT%H:%M:%S.%L
    time_keep               false

[PARSER]
    name                    json
    format                  json
    time_key                time
    time_format             %d/%b/%Y:%H:%M:%S %z

filters.lua:

local cjson = require("cjson")

local cache = {}

local function get_metadata(container_id)
    -- Read config file
    local config_file_path = "/var/lib/docker/containers/" .. container_id .. "/config.v2.json"
    local config_file = io.open(config_file_path, "rb")
    if not config_file then
        return nil
    end
    local config_json = config_file:read("*a")
    config_file:close()

    -- Map json config
    local config = cjson.decode(config_json)

    return {
        id = config.ID,
        name = config.Name:gsub("^/", ""),
        hostname = config.Config.Hostname,
        image = config.Config.Image,
        image_id = config.Image,
        labels = config.Config.Labels
    }
end

function enrich(tag, timestamp, record)
    -- Get container id from tag
    local container_id = tag:match("docker%.(.+)")
    if not container_id then
        return 0, timestamp, record
    end

    -- Get metadata from cache or config
    local metadata = cache[container_id]
    if not metadata then
        metadata = get_metadata(container_id)
        if metadata then
            cache[container_id] = metadata
        end
    end

    if metadata then
        record.docker = metadata
    end

    return 2, timestamp, record
end
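lua-cjson's decode raises a Lua error on malformed input, and config.Name:gsub fails if Name is missing, so a half-written config.v2.json could make the filter callback error out. A hedged sketch of a more defensive mapping follows; `safe_decode` and `map_metadata` are hypothetical helpers, and the decoder is passed in as an argument so the snippet can be tried without lua-cjson installed.

```lua
-- Sketch: decode JSON without letting a decoder error escape the filter.
-- `decode` is the decoder in use, for example cjson.decode.
function safe_decode(decode, text)
  local ok, result = pcall(decode, text)
  if ok and type(result) == 'table' then
    return result
  end
  return nil
end

-- Map the decoded config to the same fields as get_metadata above,
-- tolerating a missing Name or Config section.
function map_metadata(config)
  local inner = config.Config or {}
  local name = config.Name
  if type(name) == 'string' then
    name = name:gsub('^/', '')  -- strip the leading slash
  end
  return {
    id = config.ID,
    name = name,
    hostname = inner.Hostname,
    image = inner.Image,
    image_id = config.Image,
    labels = inner.Labels
  }
end
```

In get_metadata this would replace the direct cjson.decode call: decode with safe_decode(cjson.decode, config_json), return nil when that yields nil, and otherwise return map_metadata(config).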

Install the native lua module using LuaRocks:

luarocks install lua-cjson

Or build your own Docker image and throw in lrexlib-pcre2 for native regexes: 😛

FROM fluent/fluent-bit:1.5.4 as fluent-bit

FROM ubuntu:focal as lua-libs

RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y libpcre2-dev luarocks

RUN luarocks install lua-cjson \
    && luarocks install lrexlib-pcre2

# https://github.com/fluent/fluent-bit/blob/master/Dockerfile#L60
FROM ubuntu:focal

COPY --from=fluent-bit \
    /usr/lib/x86_64-linux-gnu/libsasl*.so* \
    # /usr/lib/x86_64-linux-gnu/libz* \
    # /lib/x86_64-linux-gnu/libz* \
    /usr/lib/x86_64-linux-gnu/libssl.so* \
    /usr/lib/x86_64-linux-gnu/libcrypto.so* \
    /usr/lib/x86_64-linux-gnu/

COPY --from=fluent-bit \
    /usr/lib/x86_64-linux-gnu/libpq.so* \
    /usr/lib/x86_64-linux-gnu/libgssapi* \
    /usr/lib/x86_64-linux-gnu/libldap* \
    /usr/lib/x86_64-linux-gnu/libkrb* \
    /usr/lib/x86_64-linux-gnu/libk5crypto* \
    /usr/lib/x86_64-linux-gnu/liblber* \
    # /usr/lib/x86_64-linux-gnu/libgnutls* \
    # /usr/lib/x86_64-linux-gnu/libp11-kit* \
    # /usr/lib/x86_64-linux-gnu/libidn2* \
    # /usr/lib/x86_64-linux-gnu/libunistring* \
    # /usr/lib/x86_64-linux-gnu/libtasn1* \
    # /usr/lib/x86_64-linux-gnu/libnettle* \
    # /usr/lib/x86_64-linux-gnu/libhogweed* \
    # /usr/lib/x86_64-linux-gnu/libgmp* \
    # /usr/lib/x86_64-linux-gnu/libffi* \
    # /lib/x86_64-linux-gnu/libcom_err* \
    /lib/x86_64-linux-gnu/libkeyutils* \
    /lib/x86_64-linux-gnu/

COPY --from=fluent-bit /fluent-bit/bin/ /fluent-bit/bin/
COPY --from=lua-libs /usr/local/lib/lua/ /usr/local/lib/lua/
COPY *.conf *.lua /fluent-bit/etc/

RUN ldd /fluent-bit/bin/fluent-bit | sort \
    && /fluent-bit/bin/fluent-bit --version

ENTRYPOINT [ "/fluent-bit/bin/fluent-bit" ]
CMD [ "-c", "/fluent-bit/etc/fluent-bit.conf" ]

I created an image: https://github.com/jidckii/fluent-bit
It also adds a rule for multiline logs.
Maybe it will come in handy for someone.

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

This issue was closed because it has been stalled for 5 days with no activity.