Kong/kong

Abnormal routing occurred on the Kong upstream

Closed this issue · 6 comments

Is there an existing issue for this?

  • I have searched the existing issues

Kong version ($ kong version)

Kong 3.4.0

Current Behavior

I used Kong's upstream load balancing to balance requests across two targets. I did not configure health checks (the defaults were used). When one of the targets went offline, I called that target's API through Kong.

Expected Behavior

The request was occasionally forwarded to the offline target, resulting in two 502 status codes and a "-" being returned, which caused the request to fail. The message in access.log is:
{"clientip":"127.0.0.1","time":"2024-08-15T19:42:33+08:00","request_uri":"/innerService/dataService/exportData","http_version":"","method":"POST","status":"499","body_bytes_sent":"0","http_referer":"","UA":"python-requests/2.9.1","http_x_forwarded_for":"","request_time":"300.001","upstream_response_time":"127.310,127.232,45.440","domain":"127.0.0.1","upstream_addr":"10.10.52.141:20080, 10.10.52.141:20080, 10.10.52.141:20080","upstream_status":"502, 502, -"}

When the response is normal, the request is eventually forwarded to the healthy target, and the message in access.log looks like this (there is a 200 in upstream_status):
{"clientip":"127.0.0.1","time":"2024-08-15T19:44:42+08:00","request_uri":"/innerService/dataService/exportData","http_version":"","method":"POST","status":"200","body_bytes_sent":"77","http_referer":"","UA":"python-requests/2.9.1","http_x_forwarded_for":"","request_time":"127.368","upstream_response_time":"127.315,0.034","domain":"127.0.0.1","upstream_addr":"10.10.52.141:20080,10.10.2.228:20080","upstream_status":"502, 200"}

Can you provide some ideas, please?

Steps To Reproduce

No response

Anything else?

The config info in the Kong DB is:
table(upstream):
id | 14547d8e-a6d8-41d0-ae8e-c946a275a73c
created_at | 2024-08-12 12:02:05+08
name | rest-service-upstream
hash_on | none
hash_fallback | none
hash_on_header |
hash_fallback_header |
hash_on_cookie |
hash_on_cookie_path | /
slots | 10000
healthchecks | {"active": {"type": "http", "headers": null, "healthy": {"interval": 0, "successes": 0, "http_statuses": [200, 302]}, "timeout": 1, "http_path": "/", "https_sni": null, "unhealthy": {"interval": 0, "timeouts": 0, "tcp_failures": 0, "http_failures": 0, "http_statuses": [429, 404, 500, 501, 502, 503, 504, 505]}, "concurrency": 10, "https_verify_certificate": true}, "passive": {"type": "http", "healthy": {"successes": 0, "http_statuses": [200, 201, 202, 203, 204, 205, 206, 207, 208, 226, 300, 301, 302, 303, 304, 305, 306, 307, 308]}, "unhealthy": {"timeouts": 0, "tcp_failures": 0, "http_failures": 0, "http_statuses": [429, 500, 503]}}, "threshold": 0}
tags |
algorithm | round-robin
host_header |
client_certificate_id |
ws_id | 7b069e34-9542-48cb-8a16-15eb144a3e5d
hash_on_query_arg |
hash_fallback_query_arg |
hash_on_uri_capture |
hash_fallback_uri_capture |
use_srv_name | f
updated_at | 2024-08-12 12:02:05+08

table(target):
-[ RECORD 1 ]----------------------------------------------------------------------------------------------------
id | 1450be12-4598-4c43-a811-514db9a78610
created_at | 2024-08-12 12:02:05.866+08
upstream_id | 14547d8e-a6d8-41d0-ae8e-c946a275a73c
target | 10.10.2.228:20080
weight | 100
tags |
ws_id | 7b069e34-9542-48cb-8a16-15eb144a3e5d
cache_key | targets:14547d8e-a6d8-41d0-ae8e-c946a275a73c:10.10.2.228:20080::::7b069e34-9542-48cb-8a16-15eb144a3e5d
updated_at | 2024-08-12 12:02:05.866+08
-[ RECORD 2 ]----------------------------------------------------------------------------------------------------
id | 751e80b0-706f-4939-8673-5259477d28dc
created_at | 2024-08-12 12:02:05.971+08
upstream_id | 14547d8e-a6d8-41d0-ae8e-c946a275a73c
target | 10.10.52.141:20080
weight | 100
tags |
ws_id | 7b069e34-9542-48cb-8a16-15eb144a3e5d
cache_key | targets:14547d8e-a6d8-41d0-ae8e-c946a275a73c:10.10.52.141:20080::::7b069e34-9542-48cb-8a16-15eb144a3e5d
updated_at | 2024-08-12 12:02:05.971+08
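
For context, the upstream and targets above could have been created along these lines (a sketch only, assuming the default Admin API at http://localhost:8001; names and addresses are taken from the DB dump, and omitted fields fall back to Kong's defaults, i.e. round-robin with no health checks):

```python
import requests

ADMIN = "http://localhost:8001"  # assumed default Admin API address

# Upstream with default settings: round-robin balancing, no active or
# passive health checks (all intervals and thresholds default to 0).
requests.post(f"{ADMIN}/upstreams",
              json={"name": "rest-service-upstream"}).raise_for_status()

# The two targets from the DB dump, equal weight.
for target in ("10.10.2.228:20080", "10.10.52.141:20080"):
    requests.post(f"{ADMIN}/upstreams/rest-service-upstream/targets",
                  json={"target": target, "weight": 100}).raise_for_status()
```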

I configured two targets, one online and one offline, and intermittent routing failures also occurred. I did not configure any health checks either.
The log looks the same: "upstream_addr":"offline:8080,offline:8080,offline:8080","upstream_status":"502, 502, -"
It confuses me.

#8948
Does round-robin mode (the default) also have this problem?

The request was occasionally forwarded to the offline target, resulting in two 502 status codes and a "-" being returned, which caused the request to fail. The message in access.log is:

If you don't configure an active health check, Kong/Nginx initially doesn't know which peers are alive; it can only try every peer to detect which one is available. This is the passive health-check mechanism. If you don't want Kong to automatically perform this trial-and-error against the backend, it's recommended to configure the health check module.
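
For anyone hitting the same thing: enabling the checks recommended above could look roughly like this (a sketch with illustrative values only, not an official recommendation; the field names match the healthchecks schema shown in the DB dump, and the Admin API address is assumed):

```python
import requests

ADMIN = "http://localhost:8001"  # assumed default Admin API address

# Enable both active probing and passive (traffic-based) checks.
# The intervals and thresholds below are illustrative, not recommendations.
healthchecks = {
    "active": {
        "type": "http",
        "http_path": "/",  # should be a cheap endpoint the targets actually serve
        "healthy": {"interval": 5, "successes": 2},
        "unhealthy": {"interval": 5, "http_failures": 2,
                      "tcp_failures": 2, "timeouts": 2},
    },
    "passive": {
        "unhealthy": {"http_failures": 3, "tcp_failures": 3, "timeouts": 3},
    },
}

requests.patch(f"{ADMIN}/upstreams/rest-service-upstream",
               json={"healthchecks": healthchecks}).raise_for_status()
```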

OK, when I configured the health check, it worked fine. Thank you!
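
(For reference, once checks are enabled the per-target health state can be inspected through the Admin API; a quick sketch, again assuming the default Admin API address:)

```python
import requests

ADMIN = "http://localhost:8001"  # assumed default Admin API address

# Each entry reports HEALTHY, UNHEALTHY, DNS_ERROR, or HEALTHCHECKS_OFF.
health = requests.get(f"{ADMIN}/upstreams/rest-service-upstream/health").json()
for target in health["data"]:
    print(target["target"], target["health"])
```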

hi @shilaidun

#8948 Does round-robin mode (the default) also have this problem?

No, round-robin does not have this problem; it can try different upstream addresses.

I tried sending a request to Kong with a route that has two targets, :8080 and :8081, and got two different upstream connect errors:

2024/08/22 18:08:49 [error] 40852#0: *652 kevent() reported that connect() failed (61: Connection refused) while connecting to upstream, client: 127.0.0.1, server: kong, request: "GET /mock HTTP/1.1", upstream: "http://127.0.0.1:8080/", host: "localhost:8000", request_id: "1a757800da107e6b449a5e0b9206cf7f"                                                                 

2024/08/22 18:08:49 [error] 40852#0: *652 kevent() reported that connect() failed (61: Connection refused) while connecting to upstream, client: 127.0.0.1, server: kong, request: "GET /mock HTTP/1.1", upstream: "http://127.0.0.1:8081/", host: "localhost:8000", request_id: "1a757800da107e6b449a5e0b9206cf7f"

Hi @shilaidun, I think the original author's problem has been resolved. If you still have any other problems, I recommend filing a new issue. I'm closing this one.