KhronosGroup/WebGL

Standardize tests to support asynchronous excecution

Closed this issue · 7 comments

Background
Chromium has historically used a fixed timeout when running the WebGL conformance tests. This works, but some of our hardware + software combinations can result in tests taking a rather long time (sometimes multiple minutes), which meant that we had to use a large (5 minute) fixed timeout.

We somewhat recently switched to using a heartbeat mechanism instead, as we've had luck with that when running the WebGPU conformance tests. This has been a partial success for our WebGL tests, as most tests play fairly nice with it and we're able to detect issues such as hung renderer processes much faster. However, it has also resulted in flaky timeouts for tests that aren't able to send heartbeats very frequently, such as conformance/uniforms/no-over-optimization-on-uniform-array-*.

In order to help address this, we were planning on wrapping some relevant function calls with heartbeats similar to what is done for WebGPU. However, while implementing that, we found another issue causing flakiness that is caused by the way the WebGL tests are written.

Problem Description
Chromium's heartbeat mechanism relies on setting up a websocket connection with our Python test harness by injecting our JavaScript harness code into the page before other scripts can run. However, since establishing the websocket connection is asynchronous, it is possible for long-running test code on the page to start running before we are ready. This results in us not being able to send heartbeats during the test, effectively making us use a fixed timeout again.

We've tried a couple of Chromium-side solutions to address this, but neither is satisfactory. We first tried performing the websocket setup and communication in a worker, but because worker setup is also asynchronous, we ran into the same issue of test code blocking setup until the test was finished. We also tried using modified copies of the HTML files that temporarily replace <script> tags with something else and restore them once we're done with our setup. This works, but requires quite a bit of hacky/fragile code that we would prefer to avoid, partially due to the fact that there is currently a mix of synchronous and asynchronous WebGL conformance tests that make things more complicated.

As concrete examples, see conformance/uniforms/no-over-optimization-on-uniform-array-00.html and conformance/context/context-hidden-alpha.html. The uniforms test is synchronous, running the test as part of an inline script as part of page load and relying on the loading of js-test-post.js afterwards to signal that the test is done. On the other hand, the context test is asynchronous, as it defines the test code inline, but relies on body.onload to actually start the test. Instead of relying on the inclusion of js-test-post.js, it instead calls finishTest, which effectively does the same thing.

Proposed Solution
The proposed solution is to move all WebGL conformance tests to be asynchronous while also updating the WebGL test harness functionality to support delaying test execution until the harness is ready to start. These changes will be backwards-compatible so that existing uses of the tests will still function without changes to test runners.

js-test-pre.js would need to be updated to have a new startTest function that takes a test function and any arguments to pass to it. If webglTestHarness is defined and indicates that it has asynchronous setup, this information will be passed on to the harness so that it can start the test when it is ready. Otherwise, the test function is registered to be run on page load. Example implementation:

function startTest(testFunction, ...args) {
  if (webglTestHarness !== undefined && webglTestHarness.hasAsynchronousSetup) {
    webglTestHarness.registerTest(testFunction, args);
  } else {
    addEventListener('load', (event) => { testFunction(...args); });
  }
}

Existing synchronous tests can be converted by doing the following:

  1. Adding a finishTest() call to the end of existing inline code
  2. Wrapping existing inline test code + the new call in a function
  3. Adding a startTest() call at the end of the inline code, outside of the new function
  4. Removing the js-test-post.js <script> and the manual successfullyParsed assignment

As an example, here's a sample before and after.

<!-- Before -->
...
<script>
// Some test code
var successfullyParsed = true;
</script>
<script src="../../js/js-test-post.js"></script>
...

<!-- After -->
...
<script>
function init() {
  // Some test code
}
startTest(init);
</script>
...

Existing asynchronous tests can be converted by simply moving the onload function into startTest. As an example, here's a sample before and after.

<!-- Before -->
...
<script>
function init() {
  // Some test code
}
</script>
</body onload="init()">
...

<!-- After -->
...
<script>
function init() {
  // Some test code
}
startTest(init);
</script>
</body>
...

Input on this would be appreciated, both on whether this solution would be acceptable and the proposed names/implementation.

Could also perhaps be

<script>
function startTest(testFunction, ...args) {
  if (webglTestHarness !== undefined && webglTestHarness.registerTest !== undefined) {
    webglTestHarness.registerTest(testFunction, args);
  } else {
    testFunction(...args);
  }
}
function init() {
  // Some test code
}
</script>
<body onload="startTest(init)">

Making startTest synchronous and relying on something else calling it asynchronously SGTM. Probably slightly more flexible that way in case there is a legitimate reason to try to run the test synchronously.

kdashg commented

Firefox's CI harness integration for webgl-cts is where we would handle this, since we can delay loading of the single-test iframe until setup is done. Is something like this onerous for you?

I think there are minor advantages to having the harness be based around some await harnessSetup(); await test() ordering, but it's a lot of tests to refactor!

kdashg commented

In fact, looking again, our CI must do async setup before starting the harness because we have prefs to set.

Here's the test listing that CI sees:
https://searchfox.org/mozilla-central/source/dom/canvas/test/webgl-conf/generated-mochitest.ini#11203

Here is that actual test page that our CI runs:
https://searchfox.org/mozilla-central/source/dom/canvas/test/webgl-conf/generated/test_2_conformance__context__context-hidden-alpha.html

First nested iframe:
https://searchfox.org/mozilla-central/source/dom/canvas/test/webgl-conf/mochi-single.html
We have this iframe because our CI can't apply ?foo=bar to test urls.

Second nested iframe:
https://searchfox.org/mozilla-central/source/dom/canvas/test/webgl-conf/checkout/conformance/context/context-hidden-alpha.html
We have this iframe because our CI's pref-setting API is callback-based.

Changing our test framework to do everything in an iframe would probably be possible, although probably a fair bit of work. Right now, we simply load the WebGL conformance HTML test pages as-is without any sort of modification.

My thinking is that if we're going to have to make some substantial changes somewhere, we might as well do so upstream so that things are more consistent while also maybe making things easier for some other downstream consumers. Am I correct in believing that Firefox's tests won't be negatively impacted by the proposed changes? Seems to me like the tests should just run normally as soon as the iframe loads like they do now.

Here's a sample commit that makes the proposed changes and updates a handful of tests that Chromium has had the most issues with bsheedy-work@cdf557d

The tests continue to work properly in Chromium without any Chromium-side changes, and we see the same behavior as we do without any changes (websocket connection often established after the test completes). With the Chromium-side changes to support registering tests, though, the tests continue to work properly, but we consistently set up the websocket connection before starting the test.

After some additional discussions internally and some hammering on our code to make sure everything looks okay, we've decided to go with Firefox's approach and run our tests in an iframe.