valory-xyz/trader-quickstart

Trader service gets stuck at `decision_receive_round` due to RPC request size when using recommended endpoint

Closed this issue · 4 comments

When running the service (with ./run_service.sh), I see in the logs that it gets stuck at the decision_receive_round stage due to an error when making a request to the RPC endpoint:

[2024-01-05 11:32:44,952] [ERROR] [agent] Could not successfully interact with the valory/mech:0.1.0 contract using 'get_response': Message(sender=valory/ledger:0.19.0,to=valory/trader_abci:0.1.0,code=500,data=b'',dialogue_reference=('522ce329d4b9643eed513f5b872dba5fe073f96afee1b16fab15b37063f1d84a', '81b5e949f79fda0bdd2aa2f0617ad5619ff6cf6915c99c4f4f4deba7f6c308ee'),message=Traceback (most recent call last):

  File "/usr/local/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)

requests.exceptions.HTTPError: 413 Client Error: Request Entity Too Large for url: https://solitary-evocative-breeze.xdai.quiknode.pro/<REDACTED_TOKEN>/
,message_id=-1,performative=error,target=1)

I set up a free QuickNode endpoint, as suggested in the README. I can see that I have not exceeded the request limit for the endpoint, though there is a credit limit of 330/second. According to the quicknode docs, the problem could either be a rate limit issue, or just the size of the request.

Without visibility in the logs of the request being sent at this point, I'm not sure how to debug this issue.

For reference, here's what I see when I run log analysis:

% cd trader; poetry run autonomy analyse logs --from-dir trader_service/abci_build/persistent_data/logs/ --agent aea_0 --fsm --reset-db; cd ..

Agent aea_0
|_ Period 0
| |_ registration_startup_round | Event.DONE
| |_ update_bets_round | Event.DONE
| |_ sampling_round | Event.DONE
| |_ tool_selection_round | Event.DONE
| |_ decision_request_round | Event.DONE
| |_ pre_tx_settlement_round | Event.CHECKS_PASSED
| |_ randomness_transaction_submission_round | Event.DONE
| |_ select_keeper_transaction_submission_a_round | Event.DONE
| |_ collect_signature_round | Event.DONE
| |_ finalization_round | Event.DONE
| |_ validate_transaction_round | Event.DONE
| |_ post_tx_settlement_round | Event.DECISION_REQUESTING_DONE
| |_ decision_receive_round | Event.ROUND_TIMEOUT
| |_ decision_receive_round | Event.ROUND_TIMEOUT
| |_ decision_receive_round | Event.ROUND_TIMEOUT
| |_ decision_receive_round | Event.ROUND_TIMEOUT
|_ End

Here's what I see when I generate a report:

% cd trader; poetry run python ../report.py; cd ..

==============
Service report
==============


Performance
================================================================================

Staking
--------------------------------------------------------------------------------
Is service staked?            No 

Prediction market trading
--------------------------------------------------------------------------------
ROI on closed markets         0.00 %     
Trades on last 3 days         0 trades on 0 markets 


Service
================================================================================
ID                            265        

Agent
--------------------------------------------------------------------------------
Status                        Running 
Address                       0x3319092f74d16F63b1e33054CBe215559725b984 
xDAI Balance                  0.07 xDAI  

Safe 
--------------------------------------------------------------------------------
Address                       0x89E31da13E33a21A5f26FE957048ec40f4d852fB 
xDAI Balance                  0.59 xDAI  
WxDAI Balance                 0.00 WxDAI 

Owner/Operator
--------------------------------------------------------------------------------
Address                       0xfdc79F9d63373d52BAca4896F091b105cbB862cA 
xDAI Balance                  0.08 xDAI

And these are the docker logs from around the error:

% docker logs trader_abci_0

[2024-01-05 11:32:35,104] [INFO] [agent] Entered in the 'decision_receive_behaviour' behaviour
[2024-01-05 11:32:35,262] [INFO] [agent] Filtering the mech's events from block 31796032 for a response to our request with id 41369436667226394143474032173796524152512667375537417353188963147475296024191.
[2024-01-05 11:32:35,476] [INFO] [agent] The mech (0x77af31De935740567Cf4fF1986D04B2c964A786a) has not delivered a response yet for request with id 41369436667226394143474032173796524152512667375537417353188963147475296024191.
[2024-01-05 11:32:35,476] [INFO] [agent] Retrying in 1 seconds.
[2024-01-05 11:32:36,100] [INFO] [agent] arrived block with timestamp: 2024-01-05 11:32:35.068709
[2024-01-05 11:32:36,101] [INFO] [agent] current AbciApp time: 2024-01-05 11:32:34.054765
[2024-01-05 11:32:36,102] [INFO] Created a new local deadline for the next `begin_block` request from the Tendermint node: 2024-01-05 11:33:36.102435
[2024-01-05 11:32:36,478] [INFO] [agent] Filtering the mech's events from block 31796032 for a response to our request with id 41369436667226394143474032173796524152512667375537417353188963147475296024191.
[2024-01-05 11:32:36,621] [INFO] [agent] The mech (0x77af31De935740567Cf4fF1986D04B2c964A786a) has not delivered a response yet for request with id 41369436667226394143474032173796524152512667375537417353188963147475296024191.
[2024-01-05 11:32:36,622] [INFO] [agent] Retrying in 1 seconds.
[2024-01-05 11:32:37,120] [INFO] [agent] arrived block with timestamp: 2024-01-05 11:32:36.088339
[2024-01-05 11:32:37,121] [INFO] [agent] current AbciApp time: 2024-01-05 11:32:35.068709
[2024-01-05 11:32:37,122] [INFO] Created a new local deadline for the next `begin_block` request from the Tendermint node: 2024-01-05 11:33:37.122051

Repeats 'Filtering' -> 'Retrying' loop a few more times

[2024-01-05 11:32:40,991] [INFO] [agent] Filtering the mech's events from block 31796032 for a response to our request with id 41369436667226394143474032173796524152512667375537417353188963147475296024191.
[2024-01-05 11:32:41,206] [INFO] [agent] arrived block with timestamp: 2024-01-05 11:32:40.172447
[2024-01-05 11:32:41,207] [INFO] [agent] current AbciApp time: 2024-01-05 11:32:39.149972
[2024-01-05 11:32:41,207] [INFO] Created a new local deadline for the next `begin_block` request from the Tendermint node: 2024-01-05 11:33:41.207887
[2024-01-05 11:32:42,223] [INFO] [agent] arrived block with timestamp: 2024-01-05 11:32:41.190873
[2024-01-05 11:32:42,224] [INFO] [agent] current AbciApp time: 2024-01-05 11:32:40.172447
[2024-01-05 11:32:42,225] [INFO] Created a new local deadline for the next `begin_block` request from the Tendermint node: 2024-01-05 11:33:42.224975
[2024-01-05 11:32:42,476] [ERROR] [agent] Could not successfully interact with the valory/mech:0.1.0 contract using 'get_response': Message(sender=valory/ledger:0.19.0,to=valory/trader_abci:0.1.0,code=500,data=b'',dialogue_reference=('c3575489e0eec073910a8a13c39c1bad732b0212e9cbd0f41e5ee9b6e99e8b0e', '424b4f7c17f642800c279bf1803fc1490157476e8181022c1b90208b88a8ce9b'),message=Traceback (most recent call last):

  File "/usr/local/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)

requests.exceptions.HTTPError: 413 Client Error: Request Entity Too Large for url: https://solitary-evocative-breeze.xdai.quiknode.pro/1a5fe57c2e7a83cf685c6bdcd72b9eb6f597cdcf/
,message_id=-1,performative=error,target=1)
[2024-01-05 11:32:42,476] [INFO] [agent] Retrying in 1 seconds.

Now stuck in this^ loop forever.

Can you try with the nodies RPC (https://www.nodies.app/) and see if the error persists? Would help in understanding if thisis RPC specific.

Hi @DavidMinarsch, thanks for the suggestion. I don't get these errors when using the nodies RPC, and the agent runs to completion of a full Period now. Would you like me to put up a PR changing the README to suggest using this RPC client instead?

Thanks for confirming. There's a PR open to fix this in the README already - should be in later today. Thanks for offering. Closing here.