bUnit-dev/bUnit

Multi-threaded test execution appears to cause a race condition when using SetParametersAndRender

Skintkingle opened this issue · 21 comments

Our CIs that run our automated component tests run in a parallel way using NUnit, grouped by component.
I've spent a few hours today trying to sniff out (and replicate) the issue in the BUnit repo without success - so this is a partial bug report partial "can anyone lend a hand trying to replicate it in BUnit". It smells like there's a bug here, but I can't put my finger on it.

Something about the execution environment in a parallel test run causes different threads to be used between intial render and subsequent renders. My suspicion is this recent change and the use of Dispatcher.InvokeAsync. When the SynchronizationContext's don't match it goes async and returns the call. This is what I'm suspicious our parallel test execution is somehow causing.

This is causing inconsistent test failures as assertions we make straight after SetParametersAndRender is called sometimes work, and sometimes don't work.

Further, I can work around the issue we're seeing in our codebase with an extension method like this:

public static async Task SetParametersAndRenderAsync<TComponent>(this IRenderedComponentBase<TComponent> renderedComponent, Action<ComponentParameterCollectionBuilder<TComponent>> parameterBuilder)
    where TComponent : IComponent
{
    renderedComponent.SetParametersAndRender(parameterBuilder);
    await renderedComponent.InvokeAsync(() => { });
}

Simply put forces our test to wait for the Dispatcher.InvokeAsync to finish, before our no-op one runs.

The time I have spent trying to write a failing test within the BUnit tests has only given me success. No matter how contrived the test becomes.

I have something that looks like this at the moment (but it passes, so is relatively useless :)).

[Fact(DisplayName = "SetParametersAndRender from non-UI thread waits until render completion before returning")]
public async Task Test209()
{
	var cut = RenderComponent<SlowComponent>();
	cut.RenderCount.ShouldBe(1);

	await Renderer.Dispatcher.InvokeAsync(() =>
	{
		cut.SetParametersAndRender();
	});

	cut.RenderCount.ShouldBe(2);
}

It's gone through a few iterations, but this is the point where I've stopped trying and thought reaching out for some help from the community to identify the issue may be my best course of action :)

Expected behavior:

SetParametersAndRender, when it branches due to a SynchronizationContext mismatch, to still be blocking call until render completes.

Version info:

  • bUnit version: 1.22.19 (and current main for test attempt)
  • .NET Runtime and Blazor version: Net6
  • OS type and version: Windows 11
egil commented

Hey @Skintkingle, could this be fixed by using one of the WaitForX methods, that exists exactly for the purpose of helping with components that do async work during their life cycles?

See more here: https://bunit.dev/docs/verification/async-assertion.html

If I understand your case correctly, the test would look like this instead:

[Fact(DisplayName = "SetParametersAndRender from non-UI thread waits until render completion before returning")]
public async Task Test209()
{
	var cut = RenderComponent<SlowComponent>();
	cut.RenderCount.ShouldBe(1);

	cut.SetParametersAndRender();

	cut.WaitForAssertion(() => cut.RenderCount.ShouldBe(2));
}

Hi @egil Thanks for lending a hand. The components themselves aren't doing anything async, so the tests shouldn't need any async-y waiting. The issue is (I think) to do with the executing environment of the test. When run in parallel (in NUnit), the SynchronizationContext the Dispatcher starts with must be different from the one the test is executed with, so the test goes async when it hits the Dispatcher.InvokeAsync in the SetDirectParameters call.

The example test I was playing with is doing Task-y stuff purely to try and replicate a test executing in parallel. I will have a little more of a play today, read up on XUnit and work out how to run tests in parallel. I think Collections? I'll give it a go and see if I can get across what I'm seeing more clearly.

I injected some Console.WriteLine(System.Threading.Thread.CurrentThread.ManagedThreadId); lines at the beginning of some of our tests, in OnParameterSet of the components under test, and in the rendered part of those Components. During initial rendering, all 3 Ids match, during a SetParametersAndRender call, the threadId from "OnParameterSet" and the render body, is different to the one the test started with (and did the initial render with).

On reflection, it might be an NUnit thing. I'll try a bit today to replicate it using XUnit in the bunit repo. If that fails I'll make a really simple test using NUnit as the test framework, and see what happens.

egil commented

The inner workings of the Blazor renderer can be surprising. The runtime will actively attempt not to execute code asynchronously if it can, so when Dispatcher.InvokeAsync is used to invoke SetDirectParameters, it will actually run it synchronously on the same thread for as long as it can, until it gets to async code. Since that is not the case, the idea is that it should actually continue to execute synchronously on the same thread which calls the method.

Now, to my knowledge NUnit doesn't set up a synchronization context itself as xUnit does, so there may be something related to that we need to investigate.

Just to add my two cents: Also to my knowledge NUnit does not have a SynchronizationContext.
But also we don't have any(more).

Just for the slight off-chance: Does your component in question override SetParametersAsync?

Thanks for the thoughts :). These are similar thoughts I've been having. Especially when looking at the AspNetCore Dispatcher code and seeing its async decision making is based on the SynchronizationContext instance.

In our real components there's a couple that seem to exibit the issue more regularly than the others. one of them overrides OnParametersSetAsync. The other overrides OnParametersSet, ShouldRender, and OnAfterRenderAsync, which does some JSInterop stuff in an async way.

Otherwise I can't seem to reproduce the issue using XUnit. I've made 2 classes with similar tests in them for simplicity sake, and run them in parallel. It all seems fine so far. I'll try overriding OnParametersSetAsync and see if different decisions are made by the renderer

I think you might be on to something with the async overrides. And I seem to be able to reproduce it in NUnit now (And curiously, even when NUnit isn't set to run in parallel, too!) but still can't get XUnit to fail reliably. If I run any 1 of the tests with "Run until failure", it never seems to fail. If I select 2 tests, and "Run until failure". Even though I haven't yet turned on parallelisation in NUnit, the 2nd one fails, after only 1 or 2 runs.

The XUnit tests sometimes fail and sometimes don't and I really don't understand what causes them to fail or not. I've tried a Task.Delay(1000), and they don't fail, but a Task.Yield() makes them fail first time. It follows a similar pattern to NUnit though. If I run just 1 of them it never fails. But if I run both of them, it fails.

I have a branch I can push if either of you are interested in seeing it. My test component looks like this:

<span>@Text</span>
@{
    Console.WriteLine($"Rendering Thread: {Environment.CurrentManagedThreadId}");
    System.Threading.Thread.Sleep(1000); // The render takes some time, to help encourage the race condition out!
}

@code {
    [Parameter]
    public string Text { get; set; }

    protected override async Task OnParametersSetAsync()
    {
        Console.WriteLine($"OnParameterSet Thread: {Environment.CurrentManagedThreadId}");
        await Task.Yield();
    }
}

And each of my tests look almost exactly the same, like this:

[Test]
public void NUnitTest001()
{
	Console.WriteLine($"Test Thread: {Environment.CurrentManagedThreadId}");
	var cut = RenderComponent<SlowComponent>();
	Assert.That(cut.RenderCount, Is.EqualTo(1));

	cut.SetParametersAndRender();

	Assert.That(cut.RenderCount, Is.EqualTo(2));
}

NUnit and XUnit tests differ only in the logging. NUnit uses the Consoles output, XUnit I passed it the ITestOutputHelper, otherwise the components are identical.

egil commented

Can you share the complete test classes?

Sure thing.

I've made a bunit.core.nunit.tests project, Within it are 2 test classes with 1 test each.
I will only provide one, because they are exactly the same, save for one is called ParallelTestExecution1, and the other is called ParallelTestExecution2

using Bunit;
using Assert = NUnit.Framework.Assert;

namespace bunit.core.nunit.tests.Rendering;

[TestFixture]
public class ParallelExecutionTests1 : TestContextWrapper
{
    [SetUp]
    public void SetUp()
    {
	    TestContext = new Bunit.TestContext();
    }

    [Test]
    public void NUnitTest001()
    {
	    Console.WriteLine($"Test Thread: {Environment.CurrentManagedThreadId}");
	    var cut = RenderComponent<SlowComponent>();
	    Assert.That(cut.RenderCount, Is.EqualTo(1));

	    cut.SetParametersAndRender();

	    cut.WaitForAssertion(() => Assert.That(cut.RenderCount, Is.EqualTo(2)));
    }
}

SlowComponent.cs, within bunit.testassets/SampleComponents looks like this:

<span>@Text</span>
@{
    Console.WriteLine($"Rendering Thread: {Environment.CurrentManagedThreadId}");
    System.Threading.Thread.Sleep(1000); // The render takes some time, to help encourage the race condition out!
}

@code {
    [Parameter]
    public string Text { get; set; }

    protected override async Task OnParametersSetAsync()
    {
        Console.WriteLine($"OnParameterSet Thread: {Environment.CurrentManagedThreadId}");
        await Task.Yield();
    }
}

The XUnit test classes look like this (again, only providing one because they are basically copy pastes with one saying 1, and the other saying 2. :))

namespace Bunit.Rendering;

public class ParallelExecutionTests1 : TestContext
{
    private ITestOutputHelper _outputHelper { get; set; }
    public ParallelExecutionTests1(ITestOutputHelper outputHelper)
    {
	    DefaultWaitTimeout = TimeSpan.FromSeconds(30);
	    Services.AddXunitLogger(outputHelper);
	    _outputHelper = outputHelper;
    }

    [Fact(DisplayName = "When run In Parallel one, SetParametersAndRender from non-UI thread waits until render completion before returning")]
    public void Test001Parallel()
    {
	    _outputHelper.WriteLine($"Test1 Execution Thread: {Environment.CurrentManagedThreadId}");
	    var cut = RenderComponent<SlowXUnitComponent>(parameters => parameters.Add(c => c.OutputHelper, _outputHelper));
	    cut.RenderCount.ShouldBe(1);

	    cut.SetParametersAndRender();

	    cut.WaitForAssertion(() => cut.RenderCount.ShouldBe(2));
    }
}

And finally SlowXUnitComponent looks like this

@using Xunit.Abstractions;
<span>@Text</span>
@{
    OutputHelper.WriteLine($"Rendering Thread: {Environment.CurrentManagedThreadId}");
    System.Threading.Thread.Sleep(1000); // The render takes some time, to help encourage the race condition out!
}

@code {
    [Parameter]
    public string Text { get; set; }

    [Parameter]
    public ITestOutputHelper OutputHelper { get; set; }

    protected override async Task OnParametersSetAsync()
    {
        OutputHelper.WriteLine($"OnParametersSetAsync Thread: {Environment.CurrentManagedThreadId}");
        await Task.Yield();
    }
}
egil commented

I have actually been thinking if it was possible for us to use source generators to automatically create a copy of all our tests for both NUnit, XUnit, and MSTest.

It does require us to only use features available in all frameworks, but should be doable. That way we can verify bUnit with all major test frameworks.

egil commented

A general note: there is no shared state/statics between TestContext and other bUnit types, so if there are issues with running things in parallel, the issues are showing up due to the sharing of CPU time/memory/system resources.

I think the Parallel aspect of my original statement is not quite the whole picture. Parallel execution seems to be needed with XUnit to make the failure happen. But just running the NUnit tests in series causes it to fail. (I never even got to configuring parallelisation in NUnit, and started seeing the issue).

egil commented

Sorry, I am trying to understand what the issue is. Is the issue that you are not seeing the expected RenderCount or that there is something in the rendered markup that is not correct?

The change in PR you reference in the issue does change the behaviour slightly by allowing the Blazor renderer to batch together multiple synchronous renders into one, just as it happens in production when parameters are updated on a child component. The motivation from our side was to make sure we have like the production runtime, and I do see that it could cause the expected RenderCount to change, if you were using that in your tests.

The render count I was using in these tests just as a contrived example of inconsistent results. Our tests were actually using assertions that Events were un/subscribed. The test error said it wasn't there, Followed by the output clearly showing that it was there led me to the initial investigation driven by the fact something odd is going on and it felt very racey. (Moq checked the event subscriptions, and it wasn't there. and when it came to logging out all the event subscriptions to help understand what happened, it had actually registered, so it was in that subsequent output that facilitates the failed Assert).

It also happened when we upgraded BUnit from 1.20.8 to 1.22.19 so I went to investigate what changed in BUnit between those 2 versions that could have started causing a race condition in our test execution.

Any other information from then on has been from generally investigating how bunit operates when tests run in parallel because that's where I thought the issue stems from.
When I run our tests locally on my machine the managed thread Ids were always the same. When the CI ran the tests in parallel the logged managed thread Ids were different, which further re-enforced (confirmation bias, maybe) the fact it was an async re-entrance race condition-y thing.
From investigating in the BUnit repo. Running any 1 of those tests I mentioned above over and over again (using run until failure in VS) never fails. But running 2 tests at the same time, they start to fail. I have no concrete explanation as to why this is the case... It might be something in how Visual Studio does "Run until Failure" such that something isn't "leaking" when only 1 is repeated.

All that being said, Unfortunately, I'm not being given masses of time to play with this anymore, even though I want to really understand what's going on here. As the workaround we came up with (Do an await Dispatcher.InvokeAsync(() => { }); straight after the SetParametersAndRender call) made our tests consistent again.

egil commented

Interesting. We will have a look. Can you confirm that you have shared the test cases with us that causes the issue when running in parallel?

Yessir! (Sorry for the late response). The tests provided above were causing the issue in the bunit repo. NUnit/XUnit manifest it in different ways, but both manifest it!

Hello @egil

We have encountered the same issue with the SetParametersAndRender method. Some of our xUnit tests are failing sporadically.

While investigating the issue, we noticed that in the SetDirectParameters method of the TestRenderer class, the task returned from the Dispatcher.InvokeAsync is not being awaited. Shouldn't the method SetDirectParameters do an await on the task returned by the Dispatcher? Such that that the error handling works on the lines following the invocation. Also this way the code does not run in parallel and cause synchronization problems during test execution.

As a test we have adapted the code of bUnit so that the task is awaited in the SetDirectParameter method. The fix has solved our problem. The tests now run stable in a loop and no longer fail sporadically. The change is documented in the following patch file: fix_async_issue.patch

I hope this helps. Please don't hesitate to ask for further clarifications. Hopefully this fix gets released soon.

egil commented

Hey all,

Sorry for the radio silence. I have been travelling last week so there has been no time to follow up on this.

@Schaeri, thanks for the patch, however, we probably have to resort to calling GetAwaiter().GetResult() for now since we cannot do a breaking change to the API with bUnit v1.

We missed this behavior change in the release. Originally, we were awaiting the InvokeAsync task like so:

var result = renderedComponent.InvokeAsync(() =>
	renderedComponent.Instance.SetParametersAsync(parameters));

if (result.IsFaulted && result.Exception is not null)
{
	if (result.Exception.InnerExceptions.Count == 1)
	{
		ExceptionDispatchInfo.Capture(result.Exception.InnerExceptions[0]).Throw();
	}
	else
	{
		ExceptionDispatchInfo.Capture(result.Exception).Throw();
	}
}
else if (!result.IsCompleted)
{
	result.GetAwaiter().GetResult();
}

I will adjust our new implementation to match this behavior and that should hopefully fix the issues you are seeing. PR incoming.

Thanks a lot for your support.

egil commented

There should be a 1.23.8-preview of bUnit available on nuget.org in a few minutes. Please take it for a spin and let me know if it fixes your issues.

This change is almost exactly what I was hoping for. Thank you. Just taking it for a spin now!

Our tests now pass, with our workaround removed, on the latest 1.23.8-preview. Great stuff, thanks @egil