openshift/cluster-logging-operator

Vector application log syslog forwarding getting socket_send writer_failed errors on OpenShift worker nodes

Describe the bug
Hello, we are seeing the errors below in the Vector collector pods:

2024-06-12T18:27:49.383056Z WARN vector::internal_events::file::source: Currently ignoring file too small to fingerprint. file=/var/log/pods/impakt_impakt-strapi-7cd85864bf-tsd8f_e9a49eb6-9674-4bdd-b1d7-276eecb0b36e/init/0.log
2024-06-12T18:27:51.790074Z ERROR sink{component_kind="sink" component_id=output_siem_logpoint component_type=socket}: vector::internal_events::socket: Error sending data. error=Connection refused (os error 111) error_code="socket_send" error_type="writer_failed" stage="sending" mode=udp internal_log_rate_limit=true
2024-06-12T18:27:51.790110Z ERROR sink{component_kind="sink" component_id=output_siem_logpoint component_type=socket}: vector_common::internal_event::component_events_dropped: Events dropped intentional=false count=1 reason="Error sending data." internal_log_rate_limit=true
2024-06-12T18:27:52.051340Z ERROR sink{component_kind="sink" component_id=output_siem_logpoint component_type=socket}: vector::internal_events::socket: Internal log [Error sending data.] is being suppressed to avoid flooding.
2024-06-12T18:27:52.051371Z ERROR sink{component_kind="sink" component_id=output_siem_logpoint component_type=socket}: vector_common::internal_event::component_events_dropped: Internal log [Events dropped] is being suppressed to avoid flooding.
2024-06-12T18:28:01.851747Z ERROR sink{component_kind="sink" component_id=output_siem_logpoint component_type=socket}: vector::internal_events::socket: Internal log [Error sending data.] has been suppressed 5 times.

Environment

  • OKD 4.13
  • ClusterLogging instance 5.9.0

Expected behavior
Collector pods should forward logs to the external SIEM.

Actual behavior
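The output_siem_logpoint socket sink (mode=udp) fails with "Connection refused (os error 111)" and the events are dropped instead of being forwarded.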

@imdmahajankanika Hello, error_type="writer_failed" means that Vector was unable to write log data to the socket. Can you please double-check the network connection to the SIEM? Are you able to send some test data to the same SIEM instance with another tool, for example netcat?
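
If netcat is not available on the node or in a debug pod, the same check can be approximated with a few lines of Python. This is only a rough sketch, not what Vector does internally: `probe_udp`, the return strings, and the test payload are names made up for this example, and the host and port in the usage comment are placeholders for your real SIEM endpoint. It relies on the fact that, on Linux, a connected UDP socket reports an ICMP "port unreachable" reply as "Connection refused" on a later call, which is the same os error 111 shown in the log above.

```python
import socket


def probe_udp(host: str, port: int, payload: bytes, timeout: float = 1.0) -> str:
    """Send test datagrams over a connected UDP socket and report the OS result.

    Hypothetical helper for this issue. Returns "refused", "no-error",
    or "error: <details>". IPv4 only (AF_INET) to keep the sketch short.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            # connect() on UDP sends nothing; it pins the peer address so the
            # kernel can hand ICMP errors back to this socket.
            sock.connect((host, port))
            sock.send(payload)
            try:
                # Give an ICMP "port unreachable" reply time to arrive.
                # A timeout here only means nothing came back.
                sock.recv(1)
            except socket.timeout:
                pass
            # A pending ICMP error, if any, surfaces on this second send.
            sock.send(payload)
        except ConnectionRefusedError:
            return "refused"
        except OSError as exc:
            return f"error: {exc}"
    return "no-error"


# Example call (placeholder endpoint, replace with the real SIEM address):
#   probe_udp("siem.example.com", 514, b"<14>udp-probe: test message")
# <14> is syslog PRI for facility "user" (1 * 8) + severity "info" (6).
```

A result of "refused" from the same node as the collector pod would suggest that nothing is listening on that UDP port or that something in between is rejecting the traffic. Because UDP is connectionless, "no-error" does not prove that the SIEM actually received the message, so it is still worth checking on the SIEM side.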

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale