CPU, socket queue, and response time spikes every ~8 sec
Yukigaru opened this issue · 2 comments
I send transactions from my trading app with /broadcast_tx_sync
to my fully operational dydx node and measure the response round-trip time. The problem is that at times this metric gets very high:
Min duration: ~1-2 ms. Max duration: up to 1-2 seconds.
Note: the trading app runs on the same host as the node, so network latency is minimal.
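Roughly how I measure it (a simplified sketch, not my actual app code; the default RPC port 26657 and the placeholder tx payload are assumptions):

```go
// roundtrip.go: minimal sketch of timing a /broadcast_tx_sync round trip.
// Assumptions: the node's RPC listens on localhost:26657 (the default) and
// txPayload stands in for a real signed, encoded transaction.
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"time"
)

func main() {
	// Hypothetical placeholder; a real call sends a signed tx.
	txPayload := `"placeholder-tx-bytes"`

	endpoint := "http://localhost:26657/broadcast_tx_sync?tx=" + url.QueryEscape(txPayload)

	start := time.Now()
	resp, err := http.Get(endpoint)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body) // read the full response before stopping the clock

	fmt.Printf("round-trip: %v\n", time.Since(start))
}
```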
I checked the dydx log and found that in these cases the "Received new short term order"
message appears 1-2 seconds after the order was sent, which means the node did not receive the Tx for a long time. I then checked with the ss
tool and found that every ~8 sec the read queues (Recv-Q)
grow large (up to hundreds of kilobytes) for about half of the node's network sockets and stay large for 1-2 sec, which means the goroutines are not reading data from the sockets in time (a rough sketch of that check follows the summary below). A Go profile shows the following notable time consumers at those moments:
- goleveldb functions (~30%)
- checkTx (~20%)
- consensus functions (~40%)
To summarize: the dydx node has performance spikes every ~8 sec, which are visible in CPU utilization, in RPC response times, and in socket statistics.
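For reference, the socket check can be reproduced with something like this (a simplified sketch, not the exact commands I used; the 64 KB threshold and 1-second sampling interval are arbitrary choices):

```go
// recvq_watch.go: minimal sketch of watching TCP Recv-Q sizes via the `ss` tool,
// flagging sockets whose read queue grows past a threshold.
package main

import (
	"fmt"
	"os/exec"
	"strconv"
	"strings"
	"time"
)

func main() {
	const threshold = 64 * 1024 // bytes; arbitrary cutoff for a "large" read queue

	for {
		out, err := exec.Command("ss", "-t", "-n").Output()
		if err != nil {
			fmt.Println("ss failed:", err)
			return
		}
		for _, line := range strings.Split(string(out), "\n")[1:] { // skip header row
			fields := strings.Fields(line)
			if len(fields) < 5 {
				continue
			}
			recvQ, err := strconv.Atoi(fields[1]) // Recv-Q column
			if err != nil {
				continue
			}
			if recvQ > threshold {
				fmt.Printf("%s Recv-Q=%d local=%s peer=%s\n",
					time.Now().Format(time.RFC3339), recvQ, fields[3], fields[4])
			}
		}
		time.Sleep(1 * time.Second)
	}
}
```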
- Are these spikes a known issue?
- Are there obvious reasons (from a code/config standpoint) why the node can take so long to respond (1-2 sec)?
- Can I safely use rocksdb for the DB (instead of goleveldb)? (see the config sketch below)
- Is there typical optimization advice for configs? (I've seen the Tendermint guides, but I'm not sure they apply here.)
- Can [mempool] recheck be disabled? (see the config sketch below)
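For the last two questions, the knobs I'm asking about live in the node's Tendermint/CometBFT config.toml; a sketch of what I have in mind (option names taken from upstream Tendermint/CometBFT docs, not verified against the January dydx build):

```toml
# config.toml sketch (please verify option names against the config shipped
# with the actual dydx release).

# Backend for the node's key-value stores. "rocksdb" generally requires a node
# binary built with rocksdb support (cgo build tag); otherwise it won't start.
db_backend = "rocksdb"

[mempool]
# When enabled, all remaining mempool txs are rechecked (CheckTx) after every
# block. Disabling it reduces CheckTx load but lets stale txs linger longer.
recheck = false
```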
Thank you!
OS: Ubuntu 20.04, kernel 5.15.0
Dydx chain release: from around January
Fixed with the new update.