bloomberg/goldpinger

Setting the HTTP_TARGETS_TIMEOUT value makes results unstable

tuxerrante opened this issue · 1 comment

Describe the bug
After setting this value the target appears and disappears from the UI.
I have also tried with and without double quotes, and with and without the 'ms' suffix.
Logs from the daemonset don't show much, and we see no option to increase the logging level through an environment variable.

To Reproduce
Steps to reproduce the behavior:

  1. Set the value in values.yaml, for example:

         extraEnv:
           - name: DISPLAY_NODENAME
             value: "true"
           - name: HTTP_TARGETS
             value: https://my.website/en
           - name: HTTP_TARGETS_TIMEOUT
             value: "1000ms"

  2. Roll out the change
  3. Wait a few minutes
  4. See the error in the UI

Expected behavior
The target website should appear green after increasing the timeout from the default "500ms" to "1000ms".
https://github.com/bloomberg/goldpinger/blob/95363554e4078ce20e4fe746ce98b332e472e469/pkg/goldpinger/config.go#LL56C1-L56C1
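
To check whether 1000ms is a realistic budget for the target, the response time can be sampled independently of goldpinger. The following is only a rough sketch: `time_request` and `fits_timeout` are hypothetical helper names, the URL is the one from the values.yaml above, and latency measured from a workstation may differ from what the pods see inside the cluster.

```python
import time
import urllib.request


def time_request(url, timeout_s=5.0):
    """Return the wall-clock seconds needed to fetch url, body included.

    timeout_s is deliberately generous so that slow responses are still
    measured instead of being cut off.
    """
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        resp.read()
    return time.monotonic() - start


def fits_timeout(samples_s, timeout_ms):
    """True if every sampled latency (in seconds) is below timeout_ms."""
    return all(s * 1000.0 < timeout_ms for s in samples_s)


if __name__ == "__main__":
    url = "https://my.website/en"  # target taken from the report
    samples = [time_request(url) for _ in range(5)]
    for timeout_ms in (500, 1000):
        print(timeout_ms, "ms sufficient:", fits_timeout(samples, timeout_ms))
```

If the samples already fit within 1000ms, the flapping described here is less likely to be caused by the target's latency and more likely by how the timeout setting itself is handled.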

Environment (please complete the following information):

  • AKS v1.25.5
  • Goldpinger v3.7.0

Found 25 pods, using pod/pau-monitor-goldpinger-gcvpf
{"level":"info","ts":"2023-06-01T07:59:43.427Z","caller":"goldpinger/main.go:114","message":"Goldpinger","version":"v3.7.0","build":"Tue Oct 25 19:39:28 UTC 2022"}
{"level":"info","ts":"2023-06-01T07:59:43.427Z","caller":"goldpinger/main.go:125","message":"Kubeconfig not specified, trying to use in cluster config"}
{"level":"info","ts":"2023-06-01T07:59:43.428Z","caller":"goldpinger/main.go:147","message":"PodIP not set: pinging all pods"}
{"level":"info","ts":"2023-06-01T07:59:43.428Z","caller":"goldpinger/main.go:150","message":"--ping-number set to 0: pinging all pods"}
{"level":"info","ts":"2023-06-01T07:59:43.428Z","caller":"goldpinger/main.go:153","message":"IPVersions not set: settings to 4 (IPv4)"}
{"level":"info","ts":"2023-06-01T07:59:43.624Z","caller":"goldpinger/main.go:183","message":"All good, starting serving the API"}

"error": "Get "http://10.0.3.145:80/check\": context deadline exceeded"


After removing HTTP_TARGETS_TIMEOUT, the nodes come back green and the target website turns red.

I see the same behavior with goldpinger 3.9. We had some HTTP targets that took longer than 500ms, and when we tried to increase the timeout, the UI basically disappeared. The mere presence of HTTP_TARGETS_TIMEOUT seems to cause the issue. Is there any update on a possible solution or workaround?