Trader service gets stuck at `decision_receive_round` due to RPC request size when using recommended endpoint
Closed this issue · 4 comments
When running the service (with ./run_service.sh
), I see in the logs that it gets stuck at the decision_receive_round
stage due to an error when making a request to the RPC endpoint:
[2024-01-05 11:32:44,952] [ERROR] [agent] Could not successfully interact with the valory/mech:0.1.0 contract using 'get_response': Message(sender=valory/ledger:0.19.0,to=valory/trader_abci:0.1.0,code=500,data=b'',dialogue_reference=('522ce329d4b9643eed513f5b872dba5fe073f96afee1b16fab15b37063f1d84a', '81b5e949f79fda0bdd2aa2f0617ad5619ff6cf6915c99c4f4f4deba7f6c308ee'),message=Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 413 Client Error: Request Entity Too Large for url: https://solitary-evocative-breeze.xdai.quiknode.pro/<REDACTED_TOKEN>/
,message_id=-1,performative=error,target=1)
I set up a free QuickNode endpoint, as suggested in the README. I can see that I have not exceeded the request limit for the endpoint, though there is a credit limit of 330/second. According to the quicknode docs, the problem could either be a rate limit issue, or just the size of the request.
Without visibility in the logs of the request being sent at this point, I'm not sure how to debug this issue.
For reference, here's what I see when I run log analysis:
% cd trader; poetry run autonomy analyse logs --from-dir trader_service/abci_build/persistent_data/logs/ --agent aea_0 --fsm --reset-db; cd ..
Agent aea_0
|_ Period 0
| |_ registration_startup_round | Event.DONE
| |_ update_bets_round | Event.DONE
| |_ sampling_round | Event.DONE
| |_ tool_selection_round | Event.DONE
| |_ decision_request_round | Event.DONE
| |_ pre_tx_settlement_round | Event.CHECKS_PASSED
| |_ randomness_transaction_submission_round | Event.DONE
| |_ select_keeper_transaction_submission_a_round | Event.DONE
| |_ collect_signature_round | Event.DONE
| |_ finalization_round | Event.DONE
| |_ validate_transaction_round | Event.DONE
| |_ post_tx_settlement_round | Event.DECISION_REQUESTING_DONE
| |_ decision_receive_round | Event.ROUND_TIMEOUT
| |_ decision_receive_round | Event.ROUND_TIMEOUT
| |_ decision_receive_round | Event.ROUND_TIMEOUT
| |_ decision_receive_round | Event.ROUND_TIMEOUT
|_ End
Here's what I see when I generate a report:
% cd trader; poetry run python ../report.py; cd ..
==============
Service report
==============
Performance
================================================================================
Staking
--------------------------------------------------------------------------------
Is service staked? No
Prediction market trading
--------------------------------------------------------------------------------
ROI on closed markets 0.00 %
Trades on last 3 days 0 trades on 0 markets
Service
================================================================================
ID 265
Agent
--------------------------------------------------------------------------------
Status Running
Address 0x3319092f74d16F63b1e33054CBe215559725b984
xDAI Balance 0.07 xDAI
Safe
--------------------------------------------------------------------------------
Address 0x89E31da13E33a21A5f26FE957048ec40f4d852fB
xDAI Balance 0.59 xDAI
WxDAI Balance 0.00 WxDAI
Owner/Operator
--------------------------------------------------------------------------------
Address 0xfdc79F9d63373d52BAca4896F091b105cbB862cA
xDAI Balance 0.08 xDAI
And these are the docker logs from around the error:
% docker logs trader_abci_0
[2024-01-05 11:32:35,104] [INFO] [agent] Entered in the 'decision_receive_behaviour' behaviour
[2024-01-05 11:32:35,262] [INFO] [agent] Filtering the mech's events from block 31796032 for a response to our request with id 41369436667226394143474032173796524152512667375537417353188963147475296024191.
[2024-01-05 11:32:35,476] [INFO] [agent] The mech (0x77af31De935740567Cf4fF1986D04B2c964A786a) has not delivered a response yet for request with id 41369436667226394143474032173796524152512667375537417353188963147475296024191.
[2024-01-05 11:32:35,476] [INFO] [agent] Retrying in 1 seconds.
[2024-01-05 11:32:36,100] [INFO] [agent] arrived block with timestamp: 2024-01-05 11:32:35.068709
[2024-01-05 11:32:36,101] [INFO] [agent] current AbciApp time: 2024-01-05 11:32:34.054765
[2024-01-05 11:32:36,102] [INFO] Created a new local deadline for the next `begin_block` request from the Tendermint node: 2024-01-05 11:33:36.102435
[2024-01-05 11:32:36,478] [INFO] [agent] Filtering the mech's events from block 31796032 for a response to our request with id 41369436667226394143474032173796524152512667375537417353188963147475296024191.
[2024-01-05 11:32:36,621] [INFO] [agent] The mech (0x77af31De935740567Cf4fF1986D04B2c964A786a) has not delivered a response yet for request with id 41369436667226394143474032173796524152512667375537417353188963147475296024191.
[2024-01-05 11:32:36,622] [INFO] [agent] Retrying in 1 seconds.
[2024-01-05 11:32:37,120] [INFO] [agent] arrived block with timestamp: 2024-01-05 11:32:36.088339
[2024-01-05 11:32:37,121] [INFO] [agent] current AbciApp time: 2024-01-05 11:32:35.068709
[2024-01-05 11:32:37,122] [INFO] Created a new local deadline for the next `begin_block` request from the Tendermint node: 2024-01-05 11:33:37.122051
Repeats 'Filtering' -> 'Retrying' loop a few more times
[2024-01-05 11:32:40,991] [INFO] [agent] Filtering the mech's events from block 31796032 for a response to our request with id 41369436667226394143474032173796524152512667375537417353188963147475296024191.
[2024-01-05 11:32:41,206] [INFO] [agent] arrived block with timestamp: 2024-01-05 11:32:40.172447
[2024-01-05 11:32:41,207] [INFO] [agent] current AbciApp time: 2024-01-05 11:32:39.149972
[2024-01-05 11:32:41,207] [INFO] Created a new local deadline for the next `begin_block` request from the Tendermint node: 2024-01-05 11:33:41.207887
[2024-01-05 11:32:42,223] [INFO] [agent] arrived block with timestamp: 2024-01-05 11:32:41.190873
[2024-01-05 11:32:42,224] [INFO] [agent] current AbciApp time: 2024-01-05 11:32:40.172447
[2024-01-05 11:32:42,225] [INFO] Created a new local deadline for the next `begin_block` request from the Tendermint node: 2024-01-05 11:33:42.224975
[2024-01-05 11:32:42,476] [ERROR] [agent] Could not successfully interact with the valory/mech:0.1.0 contract using 'get_response': Message(sender=valory/ledger:0.19.0,to=valory/trader_abci:0.1.0,code=500,data=b'',dialogue_reference=('c3575489e0eec073910a8a13c39c1bad732b0212e9cbd0f41e5ee9b6e99e8b0e', '424b4f7c17f642800c279bf1803fc1490157476e8181022c1b90208b88a8ce9b'),message=Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 413 Client Error: Request Entity Too Large for url: https://solitary-evocative-breeze.xdai.quiknode.pro/1a5fe57c2e7a83cf685c6bdcd72b9eb6f597cdcf/
,message_id=-1,performative=error,target=1)
[2024-01-05 11:32:42,476] [INFO] [agent] Retrying in 1 seconds.
Now stuck in this^ loop forever.
Can you try with the nodies RPC (https://www.nodies.app/) and see if the error persists? Would help in understanding if thisis RPC specific.
Hi @DavidMinarsch, thanks for the suggestion. I don't get these errors when using the nodies RPC, and the agent runs to completion of a full Period
now. Would you like me to put up a PR changing the README to suggest using this RPC client instead?
Thanks for confirming. There's a PR open to fix this in the README already - should be in later today. Thanks for offering. Closing here.