Blutgang exits with code 137 when one of many endpoints is in a pending state
barnabasbusa commented
Describe the bug
This bug seems to happen on Kubernetes only. Blutgang exits with code 137 when a specific RPC endpoint is in a pending state, for example when the node that was running the given pod has gone offline and the pod can't be rescheduled to any other node. Exit code 137 corresponds to SIGKILL (128 + 9), which on Kubernetes typically means the container was killed by the kubelet, often as an OOM kill.
To Reproduce
Steps to reproduce the behavior:
- Create a Kubernetes cluster with 2 nodes
- Schedule one RPC endpoint on each node (pinned with a node selector)
- Set up Blutgang to load-balance between those two RPC endpoints
- Shut down one of the physical nodes so that its RPC endpoint pod is left in a pending state, while the ingress is still active (a shortcut for simulating this is sketched below)
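As a shortcut, the same pending state can be simulated without powering off hardware by cordoning the node and deleting the pod, so the replacement can never be scheduled; the node name and pod label below are placeholders, not taken from the actual cluster:

```sh
# Mark the node unschedulable, then delete the endpoint pod; because the pod
# is pinned to this node via a nodeSelector, the replacement cannot be placed
# anywhere and stays in Pending indefinitely.
kubectl cordon worker-2                            # placeholder node name
kubectl delete pod -l app=mainnet-besu-lighthouse  # placeholder label
kubectl get pods -w                                # replacement stays Pending
```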
Expected behavior
I would expect this endpoint to be excluded from the list of load-balanced RPC endpoints.
Specs:
- k3s with kube version 1.27.10
- Physical nodes run Debian 12, x86
- blutgang version: 0.3.5
Blutgang options:
```toml
[blutgang]
do_clear = true
address = "0.0.0.0:3000"
ma_length = 100
sort_on_startup = true
health_check = true
header_check = true
ttl = 300
max_retries = 32
expected_block_time = 13000
health_check_ttl = 2000
supress_rpc_check = false

[admin]
enabled = true
address = "0.0.0.0:5715"
readonly = true
jwt = false
key = ""

[sled]
db_path = "/data/blutgang-cache"
mode = "HighThroughput"
cache_capacity = 1000000000
compression = false
print_profile = false
flush_every_ms = 240

[mainnet-besu-teku]
url = "http://mainnet-besu-teku:8545"
ws_url = "ws://mainnet-besu-teku:8545"
max_consecutive = 150
max_per_second = 200

# pending pod
[mainnet-besu-lighthouse]
url = "http://mainnet-besu-lighthouse:8545"
ws_url = "ws://mainnet-besu-lighthouse:8545"
max_consecutive = 150
max_per_second = 200
```
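To observe the crash from the outside, one can send any standard JSON-RPC request through Blutgang while the second endpoint is pending, then read the container's last exit code; the `blutgang` hostname and the pod label are placeholders, not taken from the actual deployment:

```sh
# Standard Ethereum JSON-RPC request routed through Blutgang (placeholder host).
curl -s -X POST http://blutgang:3000 \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'

# After the crash, the last terminated exit code should read 137 (placeholder label).
kubectl get pod -l app=blutgang -o \
  jsonpath='{.items[0].status.containerStatuses[0].lastState.terminated.exitCode}'
```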