[fluentd-elasticsearch] parses dates wrong when creating indices
Closed this issue · 15 comments
Is this a request for help?:
Is this a BUG REPORT or FEATURE REQUEST? (choose one):
This might be a BUG
**ES Version**
6.7
Version of Helm and Kubernetes:
Kubernetes v1.11.7 - Argocd v1.2.0
Which chart in which version:
fluentd-elasticsearch 4.9.1
What happened:
Elasticsearch indices get created with the wrong date. Over the course of 2 minutes, the following indices are created:
x-logstash-2064.03.25
x-logstash-2064.03.27
x-logstash-2064.03.08
x-logstash-2019.10.09
x-logstash-1970.01.01
x-logstash-2064.03.26
It is likely that the datetime is being parsed incorrectly.
What you expected to happen:
One index should be created for the day with the current timestamp
x-logstash-2019.10.09 (today)
How to reproduce it (as minimally and precisely as possible):
Values that are applied can be accessed here:
https://gist.github.com/davideagle/dd2844807df904331f23bec045697451
Anything else we need to know:
@monotek I'm sorry, I grabbed the fluentd version instead of the chart version. I downgraded from 4.9.1 to 4.9.0 and I see the same behavior as before: multiple indices get created within minutes.
Please check which apps send these dates in Elasticsearch / Kibana.
After that, please check the actual logs for the timestamps.
Here is a log entry from a kubernetes-external-secrets pod where the timestamp translates to Thu, 10 Oct 2019 10:08:13 GMT; however, when it is inserted into Elasticsearch it gets the timestamp "2064-04-22T19:23:08.000Z":
{"level":50,"time":1570702172674,"pid":17,"hostname":"kubernetes-external-secrets-5cb85dbf8d-rr69n","msg":"failure while polling the secrets","v":1}
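As a quick sanity check (mine, not part of the chart): the `time` field in that log line is epoch milliseconds, and converting it gives a valid October 10, 2019 timestamp, nowhere near 2064. Python shown for illustration; fluentd's own parsing is Ruby-based.

```python
from datetime import datetime, timezone

# "time":1570702172674 is epoch *milliseconds*; integer-divide by 1000
# to get whole seconds before converting.
ts = datetime.fromtimestamp(1570702172674 // 1000, tz=timezone.utc)
print(ts.isoformat())  # 2019-10-10T10:09:32+00:00
```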
```json
{
  "_index": "x-logstash-2064.04.22",
  "_type": "_doc",
  "_id": "e8UotW0B-qgoVCfSdZbS",
  "_version": 1,
  "_score": null,
  "_source": {
    "stream": "stdout",
    "docker": {
      "container_id": "6b0292b13fcfb15a4f0050f32719e257813b1928b0c5843382663712e4e61938"
    },
    "kubernetes": {
      "container_name": "kubernetes-external-secrets",
      "namespace_name": "operations",
      "pod_name": "kubernetes-external-secrets-5cb85dbf8d-rr69n",
      "container_image": "godaddy/kubernetes-external-secrets:1.3.1",
      "container_image_id": "docker-pullable://godaddy/kubernetes-external-secrets@sha256:a2708a12128e4129457ca1162b6cef5801fa5e38e6a5f408ac15c34c35775629",
      "pod_id": "8a2d452a-df7d-11e9-b426-06887de226b0",
      "host": "ip-10-252-195-177.eu-west-1.compute.internal",
      "labels": {
        "pod-template-hash": "1764186948",
        "app_kubernetes_io/instance": "kubernetes-external-secrets",
        "app_kubernetes_io/name": "kubernetes-external-secrets"
      },
      "master_url": "https://100.64.0.1:443/api",
      "namespace_id": "76200989-de01-11e9-a941-0a20d608e880",
      "namespace_labels": {
        "app_kubernetes_io/instance": "namespaces"
      }
    },
    "level": 50,
    "pid": 17,
    "hostname": "kubernetes-external-secrets-5cb85dbf8d-rr69n",
    "msg": "failure while polling the secrets",
    "v": 1,
    "@timestamp": "2064-04-22T19:23:08.000000000+00:00",
    "tag": "kubernetes.var.log.containers.kubernetes-external-secrets-5cb85dbf8d-rr69n_operations_kubernetes-external-secrets-6b0292b13fcfb15a4f0050f32719e257813b1928b0c5843382663712e4e61938.log"
  },
  "fields": {
    "@timestamp": [
      "2064-04-22T19:23:08.000Z"
    ]
  },
  "sort": [
    2976117788000
  ]
}
```
Hmm... weird... never seen this before.
All entries that end up in x-logstash-1970.01.01 have the same timestamp of January 1st 1970, 00:33:39.000.
Example log entry:
{"name":"micro.doctor","hostname":"micro-doctor-5f585b8c46-gggp6","pid":27,"level":"DEBUG","msg":"Error retrieving cached copy of service health for service customer-analytics. Unable to send health checks for this service to DataDog! Error:Health endpoint http://customer-analytics.default.svc.cluster.local:10000/health/ returned an empty body or unexpected status code 404","time":"2019-10-10T13:56:59.669Z","v":0}
```json
{
  "_index": "x-logstash-1970.01.01",
  "_type": "_doc",
  "_id": "l80otW0B5bOudMZKqmFj",
  "_version": 1,
  "_score": null,
  "_source": {
    "stream": "stdout",
    "docker": {
      "container_id": "88b9565e2743826074980f78f83bbf718154ef9601530367d7ab2e198bd23575"
    },
    "kubernetes": {
      "container_name": "micro-doctor",
      "namespace_name": "default",
      "pod_name": "micro-doctor-5f585b8c46-rfnb4",
      "container_image": "docker/micro-doctor:sha-a63926e",
      "container_image_id": "docker-pullable://docker/micro-doctor@sha256:1218f0dd63de9ce7796956447a5b1a9b0b5fb7e9521573b41203c527738fe07e",
      "pod_id": "10484748-d314-11e9-b426-06887de226b0",
      "host": "ip-10-252-196-24.eu-west-1.compute.internal",
      "labels": {
        "name": "micro-doctor",
        "pod-template-hash": "1914164702",
        "tier": "frontend",
        "type": "micro"
      },
      "master_url": "https://100.64.0.1:443/api",
      "namespace_id": "844f446d-4186-11e8-a9e2-0a5d6f615bb2"
    },
    "name": "micro.doctor",
    "hostname": "micro-doctor-5f585b8c46-rfnb4",
    "pid": 27,
    "level": "DEBUG",
    "msg": "Error retrieving cached copy of service health for service customer-analytics. Unable to send health checks for this service to DataDog! Error:Health endpoint http://customer-analytics.default.svc.cluster.local:10000/health/ returned an empty body or unexpected status code 404",
    "v": 0,
    "@timestamp": "1970-01-01T00:33:39.000000730+00:00",
    "tag": "kubernetes.var.log.containers.micro-doctor-5f585b8c46-rfnb4_default_micro-doctor-88b9565e2743826074980f78f83bbf718154ef9601530367d7ab2e198bd23575.log"
  },
  "fields": {
    "@timestamp": [
      "1970-01-01T00:33:39.000Z"
    ]
  },
  "sort": [
    2019000
  ]
}
```
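A possible explanation (my assumption, based on the numbers alone): 2019 interpreted as seconds since the Unix epoch is exactly 00:33:39 on 1970-01-01, and the document's sort value of 2019000 milliseconds is exactly 2019 seconds. So it looks as if the leading year "2019" of the time field is being swallowed as an epoch-seconds value. A minimal check, in Python for illustration:

```python
from datetime import datetime, timezone

# "2019" read as seconds since the Unix epoch is 33 minutes and
# 39 seconds past midnight on 1970-01-01 -- the exact @timestamp
# seen on every entry in the x-logstash-1970.01.01 index.
t = datetime.fromtimestamp(2019, tz=timezone.utc)
print(t.isoformat())  # 1970-01-01T00:33:39+00:00
```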
I think I've identified the problem, to some extent at least:
Fluent::Plugin::ElasticsearchErrorHandler::ElasticsearchError error="400 - Rejected by Elasticsearch [error type]:
mapper_parsing_exception [reason]:
failed to parse field [fields.time] of type [long] in document
time=1970-01-01 00:33:39.000000000 +0000
"fields.time"=>"2019-10-11T14:24:33Z"
containersInputConf: the source time_format %Y-%m-%dT%H:%M:%S.%NZ does not match a time of the format %Y-%m-%dT%H:%M:%SZ, where the subseconds are missing from the source log.
Any suggestions on how one would accomplish handling both 2019-10-11T14:24:33Z and 2019-10-11T14:24:33.001Z time fields?
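One general approach (this is not a built-in fluentd feature, just a sketch of the try-each-format idea, written in Python with `%f` as a stand-in for fluentd's `%N`): attempt the subsecond pattern first and fall back to the whole-second one. The `parse_ts` helper name is hypothetical.

```python
from datetime import datetime, timezone

# Try the subsecond pattern first, then fall back to whole seconds.
FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ")

def parse_ts(value: str) -> datetime:
    for fmt in FORMATS:
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # pattern mismatch; try the next format
    raise ValueError(f"unrecognized timestamp: {value!r}")

print(parse_ts("2019-10-11T14:24:33Z"))      # 2019-10-11 14:24:33+00:00
print(parse_ts("2019-10-11T14:24:33.001Z"))  # 2019-10-11 14:24:33.001000+00:00
```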
I get this same error. I am almost positive it happens when logs (like the external-secrets ones I am running) have their own timestamp field. That seems to cause some issues. I will look into fluentd patches.
Same error here: an index with 1970.01.01 is created and the timestamps are all the same:
"@timestamp" : "1970-01-01T00:33:39.644558916+00:00"
I can correctly inject a time_key with the parsed time (which I'm using as the time field).
I have the same issue: an index with 1970.01.01 is created with the same timestamp, January 1st 1970, 01:33:39.000, along with the other indices. Any help on how to get only the correct time?
I solved my problem. I had a filter parser that didn't set the time key, which I guess overwrote the timestamp and reset it to the default. So either parse the time once, or remember to set it again in subsequent parses:
<source>
  @id fluentd-containers.log
  @type tail
  path /var/log/containers/vault*.log
  pos_file /var/log/containers.log.pos
  tag kubernetes.*
  read_from_head true
  <parse>
    @type json
    time_format "%Y-%m-%dT%H:%M:%S.%N%Z"
  </parse>
</source>

<filter kubernetes.**>
  @id filter_parser
  @type parser
  key_name log
  reserve_data true
  remove_key_name_field true
  <parse>                                  # <---
    @type json                             # <--- this section will overwrite the timestamp if you already set it previously
    time_format "%Y-%m-%dT%H:%M:%S.%N%Z"   # <--- if needed you can parse it here instead
  </parse>                                 # <---
</filter>
@StefanMadsen Thanks, that solved my issue.
To summarize....
You all haven't used the default configs, but some override?
If so, I'd tend to close the issue.
@monotek I'm revisiting this issue now. I installed without any overrides, and of these two services one gets parsed correctly while the other does not:
2019-12-20T13:04:32.156Z vs 2019-12-20T12:50:30Z
January 1st 1970, 00:33:39.000
{
  "correlationId": "0a66d68a-e815-4de3-b374-e827f46b068e",
  "languageId": "",
  "level": "info",
  "methodName": "AddToCache",
  "msg": "Done adding item to cache, key: %!(EXTRA string=health:weather)",
  "responseTime": "",
  "time": "2019-12-20T12:50:30Z",
  "version": ""
}
December 20th 2019, 13:04:32.156
{
  "name": "micro.booking-api",
  "hostname": "booking-api-794d79fc97-qjzn5",
  "pid": 27,
  "level": "WARN",
  "msg": "msg",
  "time": "2019-12-20T13:04:32.156Z",
  "v": 0
}
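For what it's worth, the only difference between these two time fields is the fractional seconds: a strict pattern with a mandatory subsecond component accepts the first and rejects the second. Sketched in Python, with `%f` standing in for fluentd's `%N`:

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S.%fZ"  # %f here stands in for fluentd's %N

# The entry with fractional seconds parses fine...
ok = datetime.strptime("2019-12-20T13:04:32.156Z", FMT)
print(ok)  # 2019-12-20 13:04:32.156000

# ...the one without does not.
try:
    datetime.strptime("2019-12-20T12:50:30Z", FMT)
except ValueError:
    print("rejected: no fractional seconds, pattern mismatch")
```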