bug: RLNv2 rate limit imposed sooner than expected
Opened this issue · 8 comments
Problem
During interop test_valid_payloads_at_slow_rate execution, 20th message sent over LRN relay was rejected with error
ERROR tests.relay.test_rln:test_rln.py:32 Payload A very long string failed: Error: 400 Client Error: Bad Request for url: http://127.0.0.1:63912/relay/v1/messages/%2Fwaku%2F2%2Frs%2F1%2F0 with response: b'Failed to publish: RLN validation failed'
Impact
Low (revised from High) occurrence, high impact.
To reproduce
- Please checkout waku-org/waku-interop-tests@6b10c18
- cd waku-interop-tests
- python -m venv .venv
- source .venv/bin/activate
- pip install -r requirements.txt
- pre-commit install
- pytest tests/relay/test_rln.py -k 'test_valid_payloads_at_slow_rate'
Expected behavior
All 100 messages delivered as expected by configuration rln_relay_user_message_limit=100, rln_relay_epoch_sec=600.
Screenshots/logs
test.log
node1_2024-07-31_18-06-09__bb5caa43-2be1-4c92-b006-c674bb4c2814__wakuorg_nwaku:latest.log
node1_2024-07-31_18-06-09__eba792d6-3fc8-4662-8afe-2c127d016536__wakuorg_nwaku:latest.log
node1_2024-07-31_18-06-24__42b05617-7bc8-4300-bd23-317e2a311bfc__wakuorg_nwaku:latest.log
node2_2024-07-31_18-06-09__eba792d6-3fc8-4662-8afe-2c127d016536__wakuorg_nwaku:latest.log
nwaku version/commit hash
NWaku - wakunode2-v0.31.0-rc.1-10-ge4e01f (e4e01fa)
Thanks, I managed to reproduce it with the interop tests.
The problem is here:
WRN 2024-07-31 11:59:51.671+00:00 invalid message: invalid proof topics="waku rln_relay" tid=1 file=rln_relay.nim:240 payloadLen=26
https://github.com/waku-org/nwaku/blob/v0.31.1-rc.0/waku/waku_rln_relay/rln_relay.nim#L240
I have seen everything in the past (invalid root, spam, etc) but not sure under which circumstances a proof can be invalid, assuming honest nwaku nodes. Quite interesting I can confirm it happens always at message index 20. I have even tried modifying the content of such message, but still the same problem.
Will further investigate.
Seems the issue is the following:
- Since
rln-relay-dynamic
is not set, "static RLN" is used. Which is a hardcoded set of memberships where20
is hardcoded as the max limit.
Solution:
- Use
rln-relay-dynamic=true
to use the onchain RLN version, which allows to configure parameters such as user message limit. - Or if you want to use static RLN with a configurable user message limit, we would need to add support for this, since currently is hardcoded to
20
.
Changes for it to work with onchain rln aka rln-relay-dynamic=true
If you want to have a static RLN with configurable message size, we can add a flag for this:
(right now is hardcoded to 20)
To close this issue as well:
- This would have been prevented if nwaku errored when providing both a contract/eth-endpoint and using static RLN. This configuration does not make sense and should fail.
To close this issue as well:
- This would have been prevented if nwaku errored when providing both a contract/eth-endpoint and using static RLN. This configuration does not make sense and should fail.
I agree with your findings. May I propose we re-qualify this issue to improvement ? In the meantime, I will change my tests to filter out unnecessary params.
If you want to have a static RLN with configurable message size, we can add a flag for this:
(right now is hardcoded to 20)
This proposal will self document itself. Looks good to me.
So to be sure, we should:
1- return an error if a node is run with static RLN + providing a contract/endpoint
2- make static RLN message limit configurable
Is it correct? @romanzac
So to be sure, we should:
1- return an error if a node is run with static RLN + providing a contract/endpoint 2- make static RLN message limit configurable
Is it correct? @romanzac
Yes, if static RLN is not useful for troubleshooting of dynamic RLN normally used in production. When there is an additional need to remove config params while troubleshooting, it adds work. I would suggest warning message and ignore the contract/endpoint. Plus make static RLN message limit configurable.