Errors: `InvalidSequenceTokenException` and `DataAlreadyAcceptedException`
danilvalov opened this issue · 3 comments
We use several instances and send all logs to one CloudWatch log group. And the last time we see the following errors:
Stack: InvalidSequenceTokenException: The given sequenceToken is invalid. The next expected sequenceToken is: 49556355883619415453973615294685985237216996688226435970
at Request.extractError (/project/node_modules/aws-sdk/lib/protocol/json.js:43:27)
at Request.callListeners (/project/node_modules/aws-sdk/lib/sequential_executor.js:105:20)
at Request.emit (/project/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
at Request.emit (/project/node_modules/aws-sdk/lib/request.js:668:14)
at Request.transition (/project/node_modules/aws-sdk/lib/request.js:22:10)
at AcceptorStateMachine.runTo (/project/node_modules/aws-sdk/lib/state_machine.js:14:12)
at /project/node_modules/aws-sdk/lib/state_machine.js:26:10
at Request.<anonymous> (/project/node_modules/aws-sdk/lib/request.js:38:9)
at Request.<anonymous> (/project/node_modules/aws-sdk/lib/request.js:670:12)
at Request.callListeners (/project/node_modules/aws-sdk/lib/sequential_executor.js:115:18)
at Request.emit (/project/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
at Request.emit (/project/node_modules/aws-sdk/lib/request.js:668:14)
at Request.transition (/project/node_modules/aws-sdk/lib/request.js:22:10)
at AcceptorStateMachine.runTo (/project/node_modules/aws-sdk/lib/state_machine.js:14:12)
at /project/node_modules/aws-sdk/lib/state_machine.js:26:10
at Request.<anonymous> (/project/node_modules/aws-sdk/lib/request.js:38:9)
at Request.<anonymous> (/project/node_modules/aws-sdk/lib/request.js:670:12)
at Request.callListeners (/project/node_modules/aws-sdk/lib/sequential_executor.js:115:18)
at callNextListener (/project/node_modules/aws-sdk/lib/sequential_executor.js:95:12)
at IncomingMessage.onEnd (/project/node_modules/aws-sdk/lib/event_listeners.js:211:11)
at emitNone (events.js:91:20)
at IncomingMessage.emit (events.js:185:7)
and
Stack: DataAlreadyAcceptedException: The given batch of log events has already been accepted. The next batch can be sent with sequenceToken: 49556355883619415453973615474902976018709125113130140546
at Request.extractError (/project/node_modules/aws-sdk/lib/protocol/json.js:43:27)
at Request.callListeners (/project/node_modules/aws-sdk/lib/sequential_executor.js:105:20)
at Request.emit (/project/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
at Request.emit (/project/node_modules/aws-sdk/lib/request.js:668:14)
at Request.transition (/project/node_modules/aws-sdk/lib/request.js:22:10)
at AcceptorStateMachine.runTo (/project/node_modules/aws-sdk/lib/state_machine.js:14:12)
at /project/node_modules/aws-sdk/lib/state_machine.js:26:10
at Request.<anonymous> (/project/node_modules/aws-sdk/lib/request.js:38:9)
at Request.<anonymous> (/project/node_modules/aws-sdk/lib/request.js:670:12)
at Request.callListeners (/project/node_modules/aws-sdk/lib/sequential_executor.js:115:18)
at Request.emit (/project/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
at Request.emit (/project/node_modules/aws-sdk/lib/request.js:668:14)
at Request.transition (/project/node_modules/aws-sdk/lib/request.js:22:10)
at AcceptorStateMachine.runTo (/project/node_modules/aws-sdk/lib/state_machine.js:14:12)
at /project/node_modules/aws-sdk/lib/state_machine.js:26:10
at Request.<anonymous> (/project/node_modules/aws-sdk/lib/request.js:38:9)
at Request.<anonymous> (/project/node_modules/aws-sdk/lib/request.js:670:12)
at Request.callListeners (/project/node_modules/aws-sdk/lib/sequential_executor.js:115:18)
at callNextListener (/project/node_modules/aws-sdk/lib/sequential_executor.js:95:12)
at IncomingMessage.onEnd (/project/node_modules/aws-sdk/lib/event_listeners.js:211:11)
at emitNone (events.js:91:20)
at IncomingMessage.emit (events.js:185:7)
How to fix it?
This happens because of concurrency.
CloudWatch won't let you append to the logs without obtaining the stream's sequence token, which changes everytime you append. By default, the logger will fetch the sequence token right before appending. That still leaves a tiny amount of time between fetching the token and using it to post the logs, and if another client appends to the same log stream in that time, the token that has just been fetched will already be invalid.
The token cannot be omitted when posting the logs, so the initial fetch always needs to happen. Conversely, there's no way to obtain the token and immediately use it in an atomic operation. Thus, the only way to circumvent this would be to retry the fetch-and-post sequence.
To reduce the probability of running into this, you could also reduce the logging interval. Currently, logs are posted every two seconds. Unfortunately, the interval isn't exposed as an option yet, so your best bet is to go into relay.js in node_modules and change the number manually. Of course, you could also use separate streams, but you shouldn't have to change your architecture because of this.
I'll definitely put the retry logic on my to-do list. It should also help if there's a temporary network outage or some other unexpected event.
Just published 1.4.0 to hopefully improve the situation. Closing this issue; please reopen if the issue reappears!
@timdp
Thank you! It works very good now!