kiwigrid/helm-charts

[fluentd-elasticsearch] parses dates wrong when creating indices


Is this a request for help?:


Is this a BUG REPORT or FEATURE REQUEST? (choose one):
This might be a BUG

ES Version:
6.7

Version of Helm and Kubernetes:
Kubernetes v1.11.7 - Argocd v1.2.0

Which chart in which version:
fluentd-elasticsearch 4.9.1

What happened:
Elasticsearch indices get created with the wrong date. Over the course of 2 minutes the following indices are created:
x-logstash-2064.03.25
x-logstash-2064.03.27
x-logstash-2064.03.08
x-logstash-2019.10.09
x-logstash-1970.01.01
x-logstash-2064.03.26

It looks like the datetime is being parsed incorrectly.

What you expected to happen:
One index should be created for the day with the current timestamp
x-logstash-2019.10.09 (today)

How to reproduce it (as minimally and precisely as possible):
Values that are applied can be accessed here:
https://gist.github.com/davideagle/dd2844807df904331f23bec045697451

Anything else we need to know:

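For context, the index date suffix comes from each event's time: with logstash_format enabled, fluent-plugin-elasticsearch writes to an index named from logstash_prefix plus the event's @timestamp formatted with logstash_dateformat, so a badly parsed time ends up in a strange index. A rough sketch of the relevant output settings (host/port are placeholders, and the x-logstash prefix is taken from the index names above):

```
<match **>
  @type elasticsearch
  host elasticsearch-client        # placeholder
  port 9200                        # placeholder
  logstash_format true
  logstash_prefix x-logstash
  logstash_dateformat %Y.%m.%d     # index becomes x-logstash-YYYY.MM.DD, based on the event's @timestamp
</match>
```
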
Does it work with chart version 4.9.0?
If so, it might be related to #169.

@monotek I'm sorry, I grabbed the fluentd version instead of the chart version. I downgraded from 4.9.1 to 4.9.0 and I see the same behavior as before; multiple indices get created within minutes.

Please check in Elasticsearch / Kibana which apps send these dates.
After that, please check the actual logs for the timestamps.
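
For example, a terms aggregation on the broken index should show which containers those entries come from (just a sketch; the .keyword sub-field assumes the usual dynamic mapping for string fields in ES 6.x):

```
GET x-logstash-1970.01.01/_search
{
  "size": 0,
  "aggs": {
    "containers": {
      "terms": {
        "field": "kubernetes.container_name.keyword",
        "size": 20
      }
    }
  }
}
```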

Here is a log entry from a kubernetes-external-secrets pod whose timestamp translates to Thu, 10 Oct 2019 10:08:13 GMT; however, when it is inserted into Elasticsearch it gets a timestamp of "2064-04-22T19:23:08.000Z":

{"level":50,"time":1570702172674,"pid":17,"hostname":"kubernetes-external-secrets-5cb85dbf8d-rr69n","msg":"failure while polling the secrets","v":1}

```
{
  "_index": "x-logstash-2064.04.22",
  "_type": "_doc",
  "_id": "e8UotW0B-qgoVCfSdZbS",
  "_version": 1,
  "_score": null,
  "_source": {
    "stream": "stdout",
    "docker": {
      "container_id": "6b0292b13fcfb15a4f0050f32719e257813b1928b0c5843382663712e4e61938"
    },
    "kubernetes": {
      "container_name": "kubernetes-external-secrets",
      "namespace_name": "operations",
      "pod_name": "kubernetes-external-secrets-5cb85dbf8d-rr69n",
      "container_image": "godaddy/kubernetes-external-secrets:1.3.1",
      "container_image_id": "docker-pullable://godaddy/kubernetes-external-secrets@sha256:a2708a12128e4129457ca1162b6cef5801fa5e38e6a5f408ac15c34c35775629",
      "pod_id": "8a2d452a-df7d-11e9-b426-06887de226b0",
      "host": "ip-10-252-195-177.eu-west-1.compute.internal",
      "labels": {
        "pod-template-hash": "1764186948",
        "app_kubernetes_io/instance": "kubernetes-external-secrets",
        "app_kubernetes_io/name": "kubernetes-external-secrets"
      },
      "master_url": "https://100.64.0.1:443/api",
      "namespace_id": "76200989-de01-11e9-a941-0a20d608e880",
      "namespace_labels": {
        "app_kubernetes_io/instance": "namespaces"
      }
    },
    "level": 50,
    "pid": 17,
    "hostname": "kubernetes-external-secrets-5cb85dbf8d-rr69n",
    "msg": "failure while polling the secrets",
    "v": 1,
    "@timestamp": "2064-04-22T19:23:08.000000000+00:00",
    "tag": "kubernetes.var.log.containers.kubernetes-external-secrets-5cb85dbf8d-rr69n_operations_kubernetes-external-secrets-6b0292b13fcfb15a4f0050f32719e257813b1928b0c5843382663712e4e61938.log"
  },
  "fields": {
    "@timestamp": [
      "2064-04-22T19:23:08.000Z"
    ]
  },
  "sort": [
    2976117788000
  ]
}
```

Hmm... weird... never seen this before.

All entries that end up in x-logstash-1970.01.01 have the same timestamp of January 1st 1970, 00:33:39.000.

Example log entry:

{"name":"micro.doctor","hostname":"micro-doctor-5f585b8c46-gggp6","pid":27,"level":"DEBUG","msg":"Error retrieving cached copy of service health for service customer-analytics. Unable to send health checks for this service to DataDog! Error:Health endpoint http://customer-analytics.default.svc.cluster.local:10000/health/ returned an empty body or unexpected status code 404","time":"2019-10-10T13:56:59.669Z","v":0}

```
{
  "_index": "x-logstash-1970.01.01",
  "_type": "_doc",
  "_id": "l80otW0B5bOudMZKqmFj",
  "_version": 1,
  "_score": null,
  "_source": {
    "stream": "stdout",
    "docker": {
      "container_id": "88b9565e2743826074980f78f83bbf718154ef9601530367d7ab2e198bd23575"
    },
    "kubernetes": {
      "container_name": "micro-doctor",
      "namespace_name": "default",
      "pod_name": "micro-doctor-5f585b8c46-rfnb4",
      "container_image": "docker/micro-doctor:sha-a63926e",
      "container_image_id": "docker-pullable://docker/micro-doctor@sha256:1218f0dd63de9ce7796956447a5b1a9b0b5fb7e9521573b41203c527738fe07e",
      "pod_id": "10484748-d314-11e9-b426-06887de226b0",
      "host": "ip-10-252-196-24.eu-west-1.compute.internal",
      "labels": {
        "name": "micro-doctor",
        "pod-template-hash": "1914164702",
        "tier": "frontend",
        "type": "micro"
      },
      "master_url": "https://100.64.0.1:443/api",
      "namespace_id": "844f446d-4186-11e8-a9e2-0a5d6f615bb2"
    },
    "name": "micro.doctor",
    "hostname": "micro-doctor-5f585b8c46-rfnb4",
    "pid": 27,
    "level": "DEBUG",
    "msg": "Error retrieving cached copy of service health for service customer-analytics. Unable to send health checks for this service to DataDog! Error:Health endpoint http://customer-analytics.default.svc.cluster.local:10000/health/ returned an empty body or unexpected status code 404",
    "v": 0,
    "@timestamp": "1970-01-01T00:33:39.000000730+00:00",
    "tag": "kubernetes.var.log.containers.micro-doctor-5f585b8c46-rfnb4_default_micro-doctor-88b9565e2743826074980f78f83bbf718154ef9601530367d7ab2e198bd23575.log"
  },
  "fields": {
    "@timestamp": [
      "1970-01-01T00:33:39.000Z"
    ]
  },
  "sort": [
    2019000
  ]
}
```

I think I've identified the problem, to some extent at least:

Fluent::Plugin::ElasticsearchErrorHandler::ElasticsearchError error="400 - Rejected by Elasticsearch [error type]:
mapper_parsing_exception [reason]:
failed to parse field [fields.time] of type [long] in document
time=1970-01-01 00:33:39.000000000 +0000
"fields.time"=>"2019-10-11T14:24:33Z"

containersInputConf: the source time_format %Y-%m-%dT%H:%M:%S.%NZ does not match logs whose time field uses %Y-%m-%dT%H:%M:%SZ, i.e. where the subseconds are missing from the source log.

Any suggestions on how one would accomplish handling both 2019-10-11T14:24:33Z and 2019-10-11T14:24:33.001Z time fields?
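
One idea I'm looking at, assuming I'm reading the Fluentd docs correctly (newer Fluentd is supposed to support time_type mixed together with time_format_fallbacks to try several formats in order; the option names are from the docs and not tested here, so treat this as a sketch rather than a drop-in fix):

```
<source>
  @id fluentd-containers.log
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/containers.log.pos
  tag kubernetes.*
  read_from_head true
  <parse>
    @type json
    time_key time
    # assumption: time_type mixed + time_format_fallbacks try each format
    # in order, so timestamps with and without subseconds both match
    time_type mixed
    time_format %Y-%m-%dT%H:%M:%S.%N%z
    time_format_fallbacks %Y-%m-%dT%H:%M:%S%z
  </parse>
</source>
```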

I get this same error. I am almost positive it happens when logs from apps I am running (like external secrets) have their own timestamp field. That seems to cause some issues. I will look into fluentd patches.

Same error here: an index with 1970.01.01 is created and the timestamps are all the same:
"@timestamp" : "1970-01-01T00:33:39.644558916+00:00"

I can correctly inject a time_key with the parsed time (which I'm using as the time field):

<source>
      @id fluentd-containers.log
      @type tail
      path /var/log/containers/stuff*.log
      pos_file /var/log/containers.log.pos
      tag raw.kubernetes.*
      read_from_head true
      <parse>
        @type json
        time_key time
        time_format "%Y-%m-%dT%H:%M:%S.%N%Z"
      </parse>
</source>

I have the same issue: an index with 1970.01.01 is created, with the same timestamp of January 1st 1970, 01:33:39.000, alongside the other indices. Any help on how to get only the correct, current time?

I solved my problem. I had a filter parser that didn't set the time key, which I guess overwrote the timestamp and reset it to the default. So either parse the time once, or remember to set it again later in subsequent parses:

<source>
      @id fluentd-containers.log
      @type tail
      path /var/log/containers/vault*.log
      pos_file /var/log/containers.log.pos
      tag kubernetes.*
      read_from_head true
      <parse>
        @type json
        time_format "%Y-%m-%dT%H:%M:%S.%N%Z"
      </parse>
</source>
<filter kubernetes.**>
      @id filter_parser
      @type parser
      key_name log
      reserve_data true
      remove_key_name_field true
      <parse>                                   # <---
        @type json                             # <--- this section will overwrite timestamps if you already set it previously
        time_format "%Y-%m-%dT%H:%M:%S.%N%Z"    # <--- if needed you can parse it here instead
      </parse>                                   # <---
</filter>

@StefanMadsen Thanks solved my issue.

To summarize...
None of you used the default configs, but applied some override?
If so, I tend to close the issue.

@monotek I'm revisiting this issue now. I installed without any overrides, and here are two services where one gets parsed correctly but the other does not:

2019-12-20T13:04:32.156Z vs 2019-12-20T12:50:30Z

January 1st 1970, 00:33:39.000

{
  "correlationId": "0a66d68a-e815-4de3-b374-e827f46b068e",
  "languageId": "",
  "level": "info",
  "methodName": "AddToCache",
  "msg": "Done adding item to cache, key: %!(EXTRA string=health:weather)",
  "responseTime": "",
  "time": "2019-12-20T12:50:30Z",
  "version": ""
}

December 20th 2019, 13:04:32.156

{
  "name": "micro.booking-api",
  "hostname": "booking-api-794d79fc97-qjzn5",
  "pid": 27,
  "level": "WARN",
  "msg": "msg",
  "time": "2019-12-20T13:04:32.156Z",
  "v": 0
}

Could you guys test whether the PR below fixes the problem for you?

#257