kubernetes-sigs/karpenter

Errors when pushing logs from kinesis to opensearch

andrewhibbert opened this issue · 1 comment

Description

Observed Behavior:

Similar to aws/karpenter-provider-aws#7191

We are getting multiple errors, though they hit different fields, either message or pods:

[ERROR] BulkIndexError: ('1 document(s) failed to index.', [{'index': {'_index': 'logstash-2024.10.30', '_type': '_doc', '_id': 'Fgjx25IB7LJa9YX0S7VB', 'status': 400, 'error': {'type': 'mapper_parsing_exception', 'reason': "failed to parse field [message] of type [text] in document with id 'Fgjx25IB7LJa9YX0S7VB'. Preview of field's value: '{controller=node.termination, level=INFO, Node={name=ip-10-138-108-179.eu-west-1.compute.internal}, logger=controller, commit=0f8788c, name=ip-10-138-108-179.eu-west-1.compute.internal, namespace=, controllerGroup=, reconcileID=9d73b284-23c1-4765-8c28-ff04c62caaa5, time=2024-10-29T05:41:19.898Z, message=deleted node, controllerKind=Node}'", 'caused_by': {'type': 'illegal_state_exception', 'reason': "Can't get text on a START_OBJECT at 1:502"}}, 'data': {'log': '2024-10-29T05:41:19.898637365Z stdout F {"level":"INFO","time":"2024-10-29T05:41:19.898Z","logger":"controller","message":"deleted node","commit":"0f8788c","controller":"node.termination","controllerGroup":"","controllerKind":"Node","Node":{"name":"ip-10-138-108-179.eu-west-1.compute.internal"},"namespace":"","name":"ip-10-138-108-179.eu-west-1.compute.internal","reconcileID":"9d73b284-23c1-4765-8c28-ff04c62caaa5"}', 'logtag': 'F', 'message': {'Node': {'name': 'ip-10-138-108-179.eu-west-1.compute.internal'}, 'commit': '0f8788c', 'controller': 'node.termination', 'controllerGroup': '', 'controllerKind': 'Node', 'level': 'INFO', 'logger': 'controller', 'message': 'deleted node', 'name': 'ip-10-138-108-179.eu-west-1.compute.internal', 'namespace': '', 'reconcileID': '9d73b284-23c1-4765-8c28-ff04c62caaa5', 'time': '2024-10-29T05:41:19.898Z'}, 'stream': 'stdout', 'time': '2024-10-29T05:41:19.898637365Z', '@logging_platform.processed_by': 'tio-logging-platform/processor/logging-pipeline-processor', '@logging_platform.processed_at': '2024-10-30T05:41:16.706062', '@logging_platform.kinesis_timestamp': '2024-10-29T05:41:20.822000', '@logging_platform.kinesis_sequence_number': '49657065793124131575734976410801005501323173181012316306', '@logging_platform.kinesis_compressed': False, '@logging_platform.processing_time': 2.8135999855294358e-05}}}])
[ERROR] BulkIndexError: ('1 document(s) failed to index.', [{'index': {'_index': 'logstash-2024.10.30', '_type': '_doc', '_id': 'lOF83pIBoOWmGoh4sRVZ', 'status': 400, 'error': {'type': 'mapper_parsing_exception', 'reason': "failed to parse field [pods] of type [long] in document with id 'lOF83pIBoOWmGoh4sRVZ'. Preview of field's value: 'enrichment-services-np-00/enrichment-services-np-00-streaming-entityfilt-service-asy4stkb, enrichment-services-np-00/enrichment-services-np-00-classification-filter-service-as6l6l7, enrichment-services-np-00/enrichment-services-np-00-entity-filter-service-async-6dd9pjxwb, enrichment-services-np-00/enrichment-services-np-00-rx-reg-client-service-async-6dc9gnlq7'", 'caused_by': {'type': 'illegal_argument_exception', 'reason': 'For input string: "enrichment-services-np-00/enrichment-services-np-00-streaming-entityfilt-service-asy4stkb, enrichment-services-np-00/enrichment-services-np-00-classification-filter-service-as6l6l7, enrichment-services-np-00/enrichment-services-np-00-entity-filter-service-async-6dd9pjxwb, enrichment-services-np-00/enrichment-services-np-00-rx-reg-client-service-async-6dc9gnlq7"'}}, 'data': {'commit': '0f8788c', 'controller': 'provisioner', 'kubernetes': {'annotations': {'CapacityProvisioned': '1vCPU 3GB', 'Logging': 'LoggingEnabled', 'kubectl_kubernetes_io/restartedAt': '2024-10-29T19:27:08Z'}, 'container_hash': 'public.ecr.aws/karpenter/controller@sha256:32258259250f675a09f91d63991657e366f98e6dbfaa953e4f3a92f01ac5ecca', 'container_image': 'sha256:d94f9c61ef09e9b887a815fdeb9b3f70f3333046b24a949f0f4c5d8b0bc945ba', 'container_name': 'controller', 'docker_id': '76c6c9177cbddcbda696693a059090b1848c8c7f08ddf68d249bfc3e277d1090', 'host': 'fargate-ip-10-138-108-204.eu-west-1.compute.internal', 'labels': {'app_kubernetes_io/instance': 'karpenter', 'app_kubernetes_io/name': 'karpenter', 'eks_amazonaws_com/fargate-profile': 'karpenter', 'pod-template-hash': 'c655494bd'}, 'namespace_name': 'kube-system', 'pod_id': '472c45a7-8acb-40e4-8905-d837b83f0f8e', 'pod_name': 'karpenter-c655494bd-llbsd'}, 'level': 'INFO', 'logger': 'controller', 'logtag': 'F', 'message': 'pod(s) have a preferred Anti-Affinity which can prevent consolidation', 'name': '', 'namespace': '', 'pods': 'enrichment-services-np-00/enrichment-services-np-00-streaming-entityfilt-service-asy4stkb, enrichment-services-np-00/enrichment-services-np-00-classification-filter-service-as6l6l7, enrichment-services-np-00/enrichment-services-np-00-entity-filter-service-async-6dd9pjxwb, enrichment-services-np-00/enrichment-services-np-00-rx-reg-client-service-async-6dc9gnlq7', 'reconcileID': 'd4bde5c1-a35c-4d4b-9928-e3395985c243', 'stream': 'stdout', 'time': '2024-10-30T02:09:14.5615', '@logging_platform.processed_by': 'tio-logging-platform/processor/logging-pipeline-processor', '@logging_platform.processed_at': '2024-10-30T17:32:46.725534', '@logging_platform.kinesis_timestamp': '2024-10-30T02:09:15.109000', '@logging_platform.kinesis_sequence_number': '49656920125370118620058707135358308705122856611505770386', '@logging_platform.kinesis_compressed': False, '@logging_platform.processing_time': 3.749799998331582e-05}}}])

So two things really:

  1. How do we avoid the message error? Karpenter logs in JSON while other pods do not, so we need a fluent-bit config that works for both (see the sketch after this list)
  2. As in the other issue, it seems pods is sometimes a string and sometimes a number (see the mapping sketch after this list)

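On the indexing side, one workaround for the first point is to normalise records before they are bulk-indexed, so that message is always a string whatever the source pod logs. A minimal sketch, assuming the processor builds plain dicts before calling opensearchpy.helpers.bulk; the message_fields key is an illustrative name, not something the pipeline already has:

```python
import json

def normalize_message(record: dict) -> dict:
    """Ensure `message` is always a string so it fits the existing text mapping.

    Karpenter emits JSON log lines, so the parsed line can land in `message`
    as an object; plain-text pods already produce a string there.
    """
    message = record.get("message")
    if isinstance(message, dict):
        # Keep the structured fields under a separate (illustrative) key and
        # store the human-readable line, or the serialised JSON, in `message`.
        record["message_fields"] = message
        record["message"] = message.get("message") or json.dumps(message)
    return record
```

The same thing could instead be done at the fluent-bit layer, e.g. by having the kubernetes filter merge parsed JSON under a dedicated key (Merge_Log_Key) rather than into the field that OpenSearch maps as text, so message keeps whatever string the container wrote.
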
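For the pods conflict, the value genuinely alternates between a pod count and a comma-separated list of pod names, so the index mapping has to accept both. A minimal sketch, assuming opensearch-py and the logstash-* daily indices seen in the errors above; the template name and endpoint are illustrative:

```python
from opensearchpy import OpenSearch

client = OpenSearch(hosts=["https://your-opensearch-endpoint:9200"])  # illustrative endpoint

# Pin `pods` to keyword so both numeric counts and comma-separated pod-name
# strings index cleanly (keyword fields store numeric values as strings).
client.indices.put_template(
    name="logstash-karpenter-fields",  # illustrative template name
    body={
        "index_patterns": ["logstash-*"],
        "mappings": {
            "properties": {
                "pods": {"type": "keyword"},
            },
        },
    },
)
```

Indices that already mapped pods as long keep that mapping; the template only takes effect for indices created after it is installed, i.e. from the next daily rollover.
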
Expected Behavior:

See above: logs from Karpenter and from other (non-JSON) pods should both index into OpenSearch without mapping errors

Reproduction Steps (Please include YAML):

Versions:

  • Chart Version:
  • Kubernetes Version (kubectl version):

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

/triage accepted