retries and timeouts: Add a new test suite for retries and timeouts
This test suite verifies that Linkerd's retries and timeouts features work correctly. The tests mostly run through the instructions covered in the docs.
Setting up
- Install the `booksapp` sample application
- Install the required ServiceProfiles
Retries
- Execute the `routes` command to verify the success rates for various routes. These may be lower than expected due to the deliberately introduced failures (which shall be rectified with retries)
- Enable retries by unmarshalling the ServiceProfile object and setting `isRetryable: true` for various routes
- Execute the `routes` command to verify that the "effective_success" is greater than before
Timeouts
- Testing timeouts shall work similarly to retries. The tests execute the `routes` command and note the value of "effective_success" for any of the routes, depending on the edge selected.
- The ServiceProfile YAML for `deploy/voting` is unmarshalled into a ServiceProfile object and a Timeout value is set under RouteSpec (say, "25s"). The object is then marshalled back to YAML and piped to `kubectl apply`
- Finally, the output of the `routes` command is used to verify the value of "effective_success"
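
Both the retries and the timeouts checks compare "effective_success" between two runs of `routes`. A minimal sketch of pulling that value out of the output is shown below. The input shape is an assumption made for illustration: a flat JSON array of rows carrying `route` and `effective_success` fields. The actual CLI output may be structured differently, so `routeRow` and `effectiveSuccess` are hypothetical names that would need adapting.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// routeRow is a hypothetical, minimal view of one row of `routes` output.
// The pointer distinguishes "field missing" from a genuine value of 0.
type routeRow struct {
	Route            string   `json:"route"`
	EffectiveSuccess *float64 `json:"effective_success"`
}

// effectiveSuccess returns the effective_success value recorded for the
// named route, or an error if the route or the value is absent.
func effectiveSuccess(rows []byte, route string) (float64, error) {
	var parsed []routeRow
	if err := json.Unmarshal(rows, &parsed); err != nil {
		return 0, fmt.Errorf("decoding routes output: %w", err)
	}
	for _, row := range parsed {
		if row.Route != route {
			continue
		}
		if row.EffectiveSuccess == nil {
			return 0, fmt.Errorf("route %q has no effective_success value", route)
		}
		return *row.EffectiveSuccess, nil
	}
	return 0, fmt.Errorf("route %q not found", route)
}
```

A test would call this once before and once after applying the modified ServiceProfile, then compare the two numbers.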
Additionally, it would be nice to have the sample application configured to monitor the occurrences of retries and timeouts. For example, a service or a set of services may be configured to have routes dedicated to testing retries and timeouts: a service that accepts a request with 3 parameters, `succeed-after-retries`, `id`, and `delay`. The service returns 200 OK only after `succeed-after-retries`, and also keeps track of how many times the service was called before that. The service may also sleep for `delay` before servicing a request, as a way of validating timeouts.
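
A rough sketch of such a service is given below, to make the idea concrete. Several details are assumptions rather than part of the proposal: the parameters are read from the query string, `delay` is parsed as a Go duration such as `2s`, failed attempts return 503, the call count is reported in a hypothetical `X-Call-Count` header, and "after `succeed-after-retries`" is read as "the first attempt plus that many retries must have been seen".

```go
package main

import (
	"net/http"
	"strconv"
	"sync"
	"time"
)

// attemptTracker counts how many times each request id has been seen.
type attemptTracker struct {
	mu    sync.Mutex
	calls map[string]int
}

func newAttemptTracker() *attemptTracker {
	return &attemptTracker{calls: make(map[string]int)}
}

// record registers one more call for id and returns the status to send along
// with the running call count. With succeedAfter = N, the first N calls fail
// and call N+1 (the original attempt plus N retries) succeeds.
func (t *attemptTracker) record(id string, succeedAfter int) (status, calls int) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.calls[id]++
	calls = t.calls[id]
	if calls > succeedAfter {
		return http.StatusOK, calls
	}
	return http.StatusServiceUnavailable, calls // assumed failure code
}

// ServeHTTP reads succeed-after-retries, id, and delay from the query string,
// sleeps for delay if one is given, and then answers according to record.
func (t *attemptTracker) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	q := r.URL.Query()
	succeedAfter, err := strconv.Atoi(q.Get("succeed-after-retries"))
	if err != nil || succeedAfter < 0 {
		http.Error(w, "invalid succeed-after-retries", http.StatusBadRequest)
		return
	}
	if d := q.Get("delay"); d != "" {
		delay, err := time.ParseDuration(d)
		if err != nil {
			http.Error(w, "invalid delay", http.StatusBadRequest)
			return
		}
		time.Sleep(delay) // simulate a slow route so timeouts can trigger
	}
	status, calls := t.record(q.Get("id"), succeedAfter)
	w.Header().Set("X-Call-Count", strconv.Itoa(calls)) // hypothetical header
	w.WriteHeader(status)
}
```

The tracker implements `http.Handler`, so it could be served with something like `http.ListenAndServe(":8080", newAttemptTracker())` (the port is arbitrary). Keeping the counting logic in `record` separate from the HTTP plumbing makes it easy to unit test without starting a server.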