[bug][app] box app may timeout if retry kicked in
chunyong-lin opened this issue · 2 comments
Background
We observe that box app may timed out when the retry kicked in during when the lambda timeout is almost ended. The close code is
streamalert/streamalert/apps/_apps/box.py
Lines 134 to 140 in d31c03c
Following lambda logs from the box app say that the lambda still has 17.67 seconds left and it continued pulling more logs which the retries happened and lambda function timed out.
[INFO] 2020-02-20 22:53:07,020.20Z 33333333-4444-5555-6666-777777777777 [_gather] Function executed in 7.5856 seconds.
[INFO] 2020-02-20 22:53:07,020.20Z 33333333-4444-5555-6666-777777777777 Lambda remaining seconds: 17.67
[WARNING] 2020-02-20 22:53:10,132.132Z 33333333-4444-5555-6666-777777777777 �[31mRequest "GET https://api.box.com/2.0/events" failed with ReadTimeout exception: ReadTimeout(ReadTimeoutError("HTTPSConnectionPool(host='api.box.com', port=443): Read timed out. (read timeout=3.05)"))�[0m
[WARNING] 2020-02-20 22:53:16,351.351Z 33333333-4444-5555-6666-777777777777 �[31mRequest "GET https://api.box.com/2.0/events" failed with ReadTimeout exception: ReadTimeout(ReadTimeoutError("HTTPSConnectionPool(host='api.box.com', port=443): Read timed out. (read timeout=6.1)"))�[0m
END RequestId: 33333333-4444-5555-6666-777777777777
REPORT RequestId: 33333333-4444-5555-6666-777777777777 Duration: 300100.39 ms Billed Duration: 300000 ms Memory Size: 128 MB Max Memory Used: 128 MB
2020-02-20T22:53:24.794Z 33333333-4444-5555-6666-777777777777 Task timed out after 300.10 seconds
Steps to Reproduce
It is not very easy to reproduce this issue only if you have many logs to pull and the retries would be happened during very last pull in a single lambda invocation.
Desired Change
I am thinking we may need to check remaining time before retries happened and give up the future pull if the remaining time is not enough in a single lambda invocation.
I don't love getting rid of the alert especially since it doesn't page. Would be nice to have this top of mind. I'm on-call next so I can take a look at fixing this.
Sounds good @blakemotl I will close this PR.