Disruptor test exceeding time limit when combined with a browser test
Closed this issue · 5 comments
During last week's k6 Office Hours, we demonstrated how to do hybrid testing with k6 browser and xk6-disruptor. One of the issues found was the disruptor test was exceeding the time limit specified. After the browser test has been completed, the disruptor scenario is still in progress.
Script used:
import { LoadAndCheck } from "../foundations/lib/frontend/basic.js";
import { ServiceDisruptor } from "k6/x/disruptor";
const BASE_URL = __ENV.BASE_URL || "http://localhost:3333";
const scenarios = {
disrupt: {
executor: "shared-iterations",
iterations: 1,
vus: 1,
exec: "disrupt",
},
browser: {
executor: "constant-vus",
vus: 1,
duration: "10s",
startTime: "10s",
exec: "browser"
},
};
if (__ENV.DISABLE_DISRUPT) {
delete scenarios["disrupt"];
}
export const options = {
scenarios,
};
const fault = {
averageDelay: "1000ms",
// errorRate: 0.1,
// errorCode: 500,
}
export function disrupt() {
const disruptor = new ServiceDisruptor("pizza-info", "pizza-ns");
const targets = disruptor.targets();
if (targets.length == 0) {
throw new Error("expected list to have one target");
}
disruptor.injectHTTPFaults(fault, "20s");
}
export async function browser() {
await LoadAndCheck(BASE_URL, false);
}
The script can also be accessed here.
Expected result:
The test should have been completed after 30 seconds.
Actual result:
The test was completed in 01m16.5s
Please take a look at the video clip below.
k6-browser-and-disruptor.mov
One possibility is that creating the disruptor and injecting the agent in the target pod(s) takes too long. It would be interesting to measure this time or at least print a log after the disruptor is created to see how long it takes.
To add a little bit of context from the k6 browser side:
In the video that you posted @mdcruz , the execution is a little bit different than what I'm getting in my local executions, in your case seems like the browser actually finishes its execution and is the disruptor the one that is stuck. Also the browser errors are a little bit misleading (we've had some issues with that).
In my case I'm running the test with k6 browser v0.10.0
and also with the current main
HEAD (212dabc) adapting the test syntax, which you can build such as:
xk6 build --with github.com/grafana/xk6-browser@v0.10.0 --with github.com/grafana/xk6-disruptor
And
xk6 build --with github.com/grafana/xk6-browser@212dabc --with github.com/grafana/xk6-disruptor
In these cases I see the opposite behavior, where the disruptor finishes, but it takes longer for the browser to finish (~40s total test run), which makes sense as I will explain later.
This is an example of output when running the test:
/\ |‾‾| /‾‾/ /‾‾/
/\ / \ | |/ / / /
/ \/ \ | ( / ‾‾\
/ \ | |\ \ | (‾) |
/ __________ \ |__| \__\ \_____/ .io
execution: local
script: advanced/01.browser-with-disruptor-new.js
output: -
scenarios: (100.00%) 2 scenarios, 2 max VUs, 10m30s max duration (incl. graceful stop):
* disrupt: 1 iterations shared among 1 VUs (maxDuration: 10m0s, exec: disrupt, gracefulStop: 30s)
* browser: 1 looping VUs for 10s (exec: browser, startTime: 10s, gracefulStop: 30s)
ERRO[0013] Failed to load resource: the server responded with a status of 500 (Internal Server Error) browser_source=network line_number=0 source=browser stacktrace="<nil>" url="http://localhost:3333/_app/immutable/chunks/_layout.da46b06b.js"
ERRO[0015] {"message":"Failed to fetch dynamically imported module: http:…ocalhost:3333/_app/immutable/chunks/0.42b5aad3.js","stack":"TypeError: Failed to fetch dynamically imported mo…ocalhost:3333/_app/immutable/chunks/0.42b5aad3.js"} source=browser-console-api
ERRO[0015] Failed to load resource: the server responded with a status of 500 (Internal Server Error) browser_source=network line_number=0 source=browser stacktrace="<nil>" url="http://localhost:3333/_app/immutable/chunks/1.a917d3d6.js"
ERRO[0015] Failed to load resource: the server responded with a status of 500 (Internal Server Error) browser_source=network line_number=0 source=browser stacktrace="<nil>" url="http://localhost:3333/_app/immutable/entry/error.svelte.ecbfdf8a.js"
ERRO[0043] Uncaught (in promise) GoError: getting text content of "h1": timed out after 30s
running at github.com/grafana/xk6-browser/api.Locator.TextContent-fm (native)
disrupt at LoadAndCheck (file:///home/ka3de/devel/go/src/github.com/grafana/quickpizza/k6/foundations/lib/frontend/basic-new.js:13:16(34))
browser at browser (file:///home/ka3de/devel/go/src/github.com/grafana/quickpizza/k6/advanced/01.browser-with-disruptor-new.js:55:21(8)) executor=constant-vus scenario=browser
browser_data_received.......: 90 kB 2.1 kB/s
browser_data_sent...........: 6.4 kB 150 B/s
browser_http_req_duration...: avg=1.37s min=1s med=1.03s max=2.02s p(90)=2.02s p(95)=2.02s
browser_http_req_failed.....: 0.00% ✓ 0 ✗ 14
browser_web_vital_ttfb......: avg=1.05s min=1.05s med=1.05s max=1.05s p(90)=1.05s p(95)=1.05s
data_received...............: 0 B 0 B/s
data_sent...................: 0 B 0 B/s
iteration_duration..........: avg=26.26s min=20.29s med=26.26s max=32.23s p(90)=31.03s p(95)=31.63s
iterations..................: 2 0.04709/s
vus.........................: 1 min=1 max=2
vus_max.....................: 2 min=2 max=2
running (00m42.5s), 0/2 VUs, 2 complete and 0 interrupted iterations
disrupt ✓ [======================================] 1 VUs 00m20.3s/10m0s 1/1 shared iters
browser ✓ [======================================] 1 VUs 10s
The key point to understand the extra time that it takes for the browser module to finish is the line:
ERRO[0043] Uncaught (in promise) GoError: getting text content of "h1": timed out after 30s
This is a consequence of this locator used inside a check in the basic.js
script:
check(page, {
'header': page.locator('h1').textContent() == 'Looking to break out of your pizza routine?',
});
Because the disruptor is making the website return random 500 HTTP status code in responses, some of the page.goto
actions fail with the log error messages that we see in the output, but still this locator action is performed after the [attempted] navigation. The Locator
will automatically wait looking for the specified selector up until the default timeout (30s), but because in this case the page content was not returned from the server, it fails after the TO, which is what makes the whole test run for longer time.
I can confirm that I can reproduce the behavior described by @ka3de: the disruptor
scenario ends in 20s
and the browser
takes ~40s
to finish, which is consistent with the 10s
+ 30s
timeout hypothesis.
Thanks, both! In this case, I don't think there is anything we can do since the timings are expected. I'm also getting similar results when running this test with k6 v0.45.0.
@mdcruz As the results could not be reproduced. We are closing this issue. If this situation ever happens again, please feel free to open another issue and reference this one.