Race Condition when managing the Groups collection
Closed this issue · 2 comments
It looks like reading/writing to the Groups collection in ServerSentEventsService
is not properly synchronized/thread-safe. Although writing is properly synchronised with a semaphore, it is still possible for other threads to be reading & enumerating the dictionary during a write, which causes them to throw, eg on line this: https://github.com/tpeczek/Lib.AspNetCore.ServerSentEvents/blob/main/Lib.AspNetCore.ServerSentEvents/ServerSentEventsService.cs#L392.
In the real world this can happen when eg a client is disconnecting, while at the same time another client is connecting and a new group is being registered for them. The thread handling the disconnect will see a System.InvalidOperationException: Collection was modified; enumeration operation may not execute.
exception while trying to clean up the resources for that client. I believe this will then result in a memory leak for that client since a reference to them will remain in the groups collection.
Heres a super contrived reproduction application:
Program.cs
using System;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Lib.AspNetCore.ServerSentEvents;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Options;
namespace SSERaceConditionDemo
{
public class Program
{
public static async Task Main(string[] args)
{
var host = Host.CreateDefaultBuilder(args)
.ConfigureWebHostDefaults(webBuilder => webBuilder.UseStartup<Startup>());
var webHostTask = host.Build().RunAsync();
var httpClient = new HttpClient();
var random = new Random();
// Create 10 clients which constantly connect & disconnect from the service
var tasks = Enumerable.Range(0, 10)
.Select(i =>
{
return Task.Run(async () =>
{
while (true)
{
var timeout = random.Next(100, 200);
try
{
await httpClient.GetAsync("http://localhost:5000/", new CancellationTokenSource(timeout).Token);
}
catch (OperationCanceledException)
{
}
}
});
}).ToArray();
await webHostTask;
}
public class Startup
{
public void ConfigureServices(IServiceCollection services)
{
services.AddAuthentication();
services.AddAuthorization();
services.AddSingleton<DemoServerSentEventsService>();
}
public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
{
app.UseAuthentication();
app.UseAuthorization();
app.UseMiddleware<ServerSentEventsMiddleware<DemoServerSentEventsService>>();
}
}
public class DemoServerSentEventsService : ServerSentEventsService
{
public DemoServerSentEventsService(IOptions<ServerSentEventsServiceOptions<ServerSentEventsService>> options) : base(options)
{
}
public override async Task OnConnectAsync(HttpRequest request, IServerSentEventsClient client)
{
// Add the client to a silly number of groups - This makes enumeration take longer during DC's, increasing the chance of the race condition.
for (var i = 0; i < 10000; i++)
{
await AddToGroupAsync(Guid.NewGuid().ToString(), client);
}
await base.OnConnectAsync(request, client);
}
}
}
}
Csproj:
<Project Sdk="Microsoft.NET.Sdk.Web">
<PropertyGroup>
<TargetFramework>net5.0</TargetFramework>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Lib.AspNetCore.ServerSentEvents" Version="6.0.0" />
</ItemGroup>
</Project>
Run this in release mode and it'll spew the exception repeatedly.
I think the simplest solution is to simply swap out the Dictionary with a ConcurrentDictionary
and drop the Semaphore
for synchronising it since it'll be unnecessary
Thank you for bringing this up with so many details. I'll take a look as soon as possible (probably next week).