RichardoC/kube-audit-rest

full-elastic-stack: vector error messages, unable to push some data to elasticsearch

fjellvannet opened this issue · 7 comments

I have deployed the full-elastic-stack example in my three-node microk8s cluster. It mostly works, in that data is arriving in Elasticsearch, but the following error messages come up regularly:

vector 2024-06-26T08:27:25.641854Z ERROR sink{component_kind="sink" component_id=elastic-sink component_type=elasticsearch component_name=elastic-sink}:request{request_id=17701}: vector_common::internal_event::component_events_dropped: Events dropped intentional=false count=3 reason="Service call failed. No retries or retries exhausted." internal_log_rate_limit=true
vector 2024-06-26T08:28:21.686515Z ERROR sink{component_kind="sink" component_id=elastic-sink component_type=elasticsearch component_name=elastic-sink}:request{request_id=17728}: vector::sinks::util::retries: Not retriable; dropping the request. reason="error type: illegal_argument_exception, reason: Limit of total fields [1000] has been exceeded" internal_log_rate_limit=true
vector 2024-06-26T08:28:21.686613Z ERROR sink{component_kind="sink" component_id=elastic-sink component_type=elasticsearch component_name=elastic-sink}:request{request_id=17728}: vector_common::internal_event::service: Service call failed. No retries or retries exhausted. error=None request_id=17728 error_type="request_failed" stage="sending" internal_log_rate_limit=true
vector 2024-06-26T08:28:21.686638Z ERROR sink{component_kind="sink" component_id=elastic-sink component_type=elasticsearch component_name=elastic-sink}:request{request_id=17728}: vector_common::internal_event::component_events_dropped: Events dropped intentional=false count=12 reason="Service call failed. No retries or retries exhausted." internal_log_rate_limit=true

So it seems that some data can be pushed while other data cannot. Is there a way to debug this further or figure out what exactly is happening? Unfortunately, the error messages themselves are not very clear.

The issue is most likely this line here https://github.com/RichardoC/kube-audit-rest/blob/main/examples/full-elastic-stack/k8s/kube-audit-rest.yaml#L23 where max_depth is 5; try using 2 instead and see if that fixes the issue. You'll have to drop the existing indices, as this changes the index schema. The logs will be less nice to read but should all be ingested.
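Roughly, this is the kind of change being suggested. This is a sketch only: I'm assuming the example parses each audit event with VRL's parse_json, whose max_depth argument controls how many levels of nested JSON get expanded into individual fields; the transform and source names below are illustrative, and the linked kube-audit-rest.yaml is the source of truth.

```yaml
# Sketch only - transform/source names are illustrative,
# the linked kube-audit-rest.yaml is the source of truth.
transforms:
  parse-audit-events:
    type: remap
    inputs:
      - audit-log-source
    source: |
      # Nesting beyond max_depth is left as JSON strings rather than
      # expanded into fields, so Elasticsearch dynamically maps far
      # fewer fields per event.
      . = parse_json!(string!(.message), max_depth: 2)
```

With a lower max_depth, the deeply nested parts of each audit event stay as strings, which is what keeps the dynamic mapping under Elasticsearch's field limit.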

This issue is likely happening because Elasticsearch tries to guess the schema of your messages based on what you send it. This can mean that if you later send a more complicated message, Elasticsearch rejects it because your message has fields it wasn't expecting.
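For context, the limit in that error is Elasticsearch's per-index index.mapping.total_fields.limit, which defaults to 1000 dynamically mapped fields. Lowering max_depth keeps the audit events under that limit; raising the limit via an index template is the other knob, though it mostly postpones the problem. A sketch is below; the template name and index pattern are made up, and the body would normally be sent as JSON via PUT _index_template/...

```yaml
# Sketch only - template name and index pattern are illustrative.
# PUT _index_template/kube-audit-rest
index_patterns:
  - "kube-audit-rest-*"
template:
  settings:
    # Default is 1000; every distinct key that gets dynamically
    # mapped counts towards this per-index limit.
    index.mapping.total_fields.limit: 2000
```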

If this fixes your issue, mind opening a PR for changing this default?

This fixed it indeed! I actually spent half a day fixing exactly the same issue in Logstash (the request and response objects not having a regular structure), and I solved it the same way there.

I'd love to make a PR, but I don't have write access to this repository, so I couldn't push my branch.

I also have two other PRs ready that fix some other smaller issues I had with the example.

Also wanted to say that this tool seems awesome! Now that it works, it is so much better than having no access to audit logs at all.

I'm actually evaluating this product in my master's thesis about intrusion detection in Kubernetes. I can gladly tell you that you and this tool are cited in there!

That's great to hear! Mind sharing the link if it's open access? Would be great to add that citation as a testimonial!

For the permission issue: you should be able to fork the repository, make the changes, and then open a PR. If you run into any problems, let me know, as it'd be great to get those issues fixed.

It will be published once it's delivered, but it's still a work in progress :)

I'll drop the link here when it has been published ;)

Thanks, good luck with the writeup!

Mind if I close out this issue now that we've fixed the dropped messages?

The issue is more than fixed! Thank you

Still impressed by the lightning-fast answer!