dotnet/runtime

Services resolving incorrectly in Development environment but not Production

jbogard opened this issue · 15 comments

Is there an existing issue for this?

  • I have searched the existing issues

Describe the bug

For an IEnumerable<> of services resolved using GetServices, we found that the list of services returned differs when using --environment=Development versus any other value, including blank.

We have two service implementations defined. When in Production, service A and B are in the enumerable. In development, service A is in the list twice, without B.

Expected Behavior

Should always resolve service A and B regardless of the environment.

Steps To Reproduce

Given types:

public interface IMarker { }
public interface IBaseService<T> { }

public class BaseService : IBaseService<A> { }

public class GenericService<T> : IBaseService<T>
    where T : IMarker
{

}

public class A : IMarker { }

This test passes (I use Shouldly for assertions, sorry):

[Fact]
public void Should_resolve_correctly_directly()
{
    var services = new ServiceCollection();

    services.AddTransient<IBaseService<A>, BaseService>();
    services.AddTransient(typeof(IBaseService<>), typeof(GenericService<>));

    var serviceProvider = services.BuildServiceProvider();
    var handlers = serviceProvider
        .GetServices<IBaseService<A>>()
        .ToList();

    handlers.Count.ShouldBe(2);
    var handlersTypes = handlers
        .Select(h => h.GetType())
        .ToList();
    handlersTypes.ShouldContain(typeof(BaseService));
    handlersTypes.ShouldContain(typeof(GenericService<A>));
}

This test passes using the WebApplication.CreateBuilder method:

[Fact]
public void Should_resolve_correctly_with_no_env_set()
{
    var builder = WebApplication.CreateBuilder();

    builder.Services.AddTransient<IBaseService<A>, BaseService>();
    builder.Services.AddTransient(typeof(IBaseService<>), typeof(GenericService<>));

    var serviceProvider = builder.Services.BuildServiceProvider();
    var handlers = serviceProvider
        .GetServices<IBaseService<A>>()
        .ToList();

    handlers.Count.ShouldBe(2);
    var handlersTypes = handlers
        .Select(h => h.GetType())
        .ToList();
    handlersTypes.ShouldContain(typeof(BaseService));
    handlersTypes.ShouldContain(typeof(GenericService<A>));

    var app = builder.Build();

    var appHandlers = app.Services
        .GetServices<IBaseService<A>>()
        .ToList();

    appHandlers.Count.ShouldBe(2);
    var appHandlersTypes = appHandlers
        .Select(h => h.GetType())
        .ToList();
    appHandlersTypes.ShouldContain(typeof(BaseService));
    appHandlersTypes.ShouldContain(typeof(GenericService<A>));
}

This test fails:

[Fact]
public void Should_resolve_correctly_in_development()
{
    var builder = WebApplication.CreateBuilder(new WebApplicationOptions
    {
        EnvironmentName = Environments.Development
    });

    builder.Services.AddTransient<IBaseService<A>, BaseService>();
    builder.Services.AddTransient(typeof(IBaseService<>), typeof(GenericService<>));

    var serviceProvider = builder.Services.BuildServiceProvider();
    var handlers = serviceProvider
        .GetServices<IBaseService<A>>()
        .ToList();

    handlers.Count.ShouldBe(2);
    var handlersTypes = handlers
        .Select(h => h.GetType())
        .ToList();
    handlersTypes.ShouldContain(typeof(BaseService));
    handlersTypes.ShouldContain(typeof(GenericService<A>));

    var app = builder.Build();

    var appHandlers = app.Services
        .GetServices<IBaseService<A>>()
        .ToList();

    appHandlers.Count.ShouldBe(2);
    var appHandlersTypes = appHandlers
        .Select(h => h.GetType())
        .ToList();
    appHandlersTypes.ShouldContain(typeof(BaseService));
    appHandlersTypes.ShouldContain(typeof(GenericService<A>));
}

The first set of assertions succeeds when I build the service provider directly from builder.Services. Once I call builder.Build and use the app.Services to resolve, it gives me the incorrect set of results. Whether or not I build the service provider does not change the second assertion.

Exceptions (if any)

No response

.NET Version

6.0.101

Anything else?

No response

Setting EnvironmentName = Environments.Development causes ValidateOnBuild = true to be set on the ServiceContainer. Here's a reduced repro without web components:

        [Fact]
        public void Should_resolve_correctly_directly()
        {
            var services = new ServiceCollection();

            services.AddTransient<IBaseService<A>, BaseService>();
            services.AddTransient(typeof(IBaseService<>), typeof(GenericService<>));

            var serviceProvider = services.BuildServiceProvider(new ServiceProviderOptions() { ValidateOnBuild = true });
            var handlers = serviceProvider
                .GetServices<IBaseService<A>>()
                .ToList();

            handlers.Count.ShouldBe(2);
            var handlersTypes = handlers
                .Select(h => h.GetType())
                .ToList();
            handlersTypes.ShouldContain(typeof(BaseService));
            handlersTypes.ShouldContain(typeof(GenericService<A>));
        }

Tagging subscribers to this area: @dotnet/area-extensions-dependencyinjection
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: jbogard
Assignees: -
Labels:

untriaged, area-Extensions-DependencyInjection, area-runtime

Milestone: -

I can't help but think this is partially my fault from #39540

Likely another caching bug. We had one like this already.

The initial fix we attempted in the closed PR #68053 would introduce a breaking change.

The alternative fix I tried with @davidfowl was to get the callsite in a different way within ServiceProvider.ValidateService(..) rather than skipping validation for generic types by making the following change:

-ServiceCallSite? callSite = CallSiteFactory.GetCallSite(descriptor, new CallSiteChain());
+ServiceCallSite? callSite = CallSiteFactory.GetCallSite(typeof(IEnumerable<>).MakeGenericType(descriptor.ServiceType), new CallSiteChain());

But this approach introduces 3 failing tests (one of them is the ClosedServicesPreferredOverOpenGenericServices test), which is concerning. It would be good to debug ClosedServicesPreferredOverOpenGenericServices with this alternative fix applied before moving this out of 7.0, in case it points to an underlying bug we don't yet know about.
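For readers following along, here is a minimal sketch of the behavior that the test name above describes: when a closed registration and an open generic registration both match, resolving a single service (not an IEnumerable<>) should return the closed one. It reuses the types from the original repro. The `ClosedVsOpenDemo` class and its `ResolveSingle` helper are assumptions made for illustration; this is not the actual test from the repository.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

public interface IMarker { }
public interface IBaseService<T> { }
public class A : IMarker { }
public class BaseService : IBaseService<A> { }
public class GenericService<T> : IBaseService<T> where T : IMarker { }

// Hypothetical helper class, not part of the runtime test suite.
public static class ClosedVsOpenDemo
{
    // Resolves a single IBaseService<A> and returns its concrete type.
    public static Type ResolveSingle()
    {
        var services = new ServiceCollection();

        // Closed registration first, open generic second (same order as the repro).
        services.AddTransient<IBaseService<A>, BaseService>();
        services.AddTransient(typeof(IBaseService<>), typeof(GenericService<>));

        using var provider = services.BuildServiceProvider();

        // Single resolution: the closed registration is expected to be preferred
        // over the open generic one, even though the open generic was added last.
        return provider.GetRequiredService<IBaseService<A>>().GetType();
    }

    public static void Main()
    {
        Console.WriteLine(ResolveSingle().Name);
    }
}
```

The enumerable case in this issue is a separate code path, so a change that fixes IEnumerable<> resolution still has to leave this single-service preference intact.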

I was just bitten by this... We recently moved from a web host builder to host builder, and the fact that ValidateOnBuild was only set when HostingEnvironment.IsDevelopment() (CreateDefaultServiceProviderOptions) is tricky. In our case, the class had some string types in the ctor but it was never resolved by the DI container at runtime so it didn't matter that it couldn't resolve it. After the migration to the generic host provider, on dev boxes the service stopped starting up. But our CI pipeline/prod don't catch the error.
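A minimal sketch of the situation described above, assuming a made-up `ReportWriter` class whose constructor takes a string that is never registered. Without ValidateOnBuild the provider builds fine as long as nothing resolves the class; with ValidateOnBuild the build itself fails. The class and helper names are hypothetical.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical service: its constructor needs a string the container cannot supply.
public class ReportWriter
{
    public ReportWriter(string outputPath) => OutputPath = outputPath;
    public string OutputPath { get; }
}

public static class ValidateOnBuildDemo
{
    // Returns true if building the provider throws because of validation.
    public static bool BuildThrows(bool validateOnBuild)
    {
        var services = new ServiceCollection();
        services.AddTransient<ReportWriter>();

        try
        {
            using var provider = services.BuildServiceProvider(
                new ServiceProviderOptions { ValidateOnBuild = validateOnBuild });
            // ReportWriter is never resolved here, so nothing fails at runtime.
            return false;
        }
        catch (AggregateException)
        {
            // ValidateOnBuild reports every service that cannot be constructed.
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(BuildThrows(validateOnBuild: false));
        Console.WriteLine(BuildThrows(validateOnBuild: true));
    }
}
```

This is why the same registrations can start cleanly in CI and production but fail on a developer machine when the option is tied to the environment.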

I'd prefer if ValidateOnBuild was always on or always off, not depending on the HostingEnvironment.

Or always on

Always on is fine with me. We're always trying to minimize differences between environments.
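Until the default changes, one way to get the same behavior in every environment is to set the options explicitly. The sketch below assumes the generic host and uses the existing UseDefaultServiceProvider extension; the `HostSetup` class and `BuildHost` method are illustrative names only.

```csharp
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

// Hypothetical wrapper class for illustration.
public static class HostSetup
{
    public static IHost BuildHost(string[] args) =>
        Host.CreateDefaultBuilder(args)
            // Replace the environment-dependent defaults so that Development,
            // CI, and Production all validate the container the same way.
            .UseDefaultServiceProvider((context, options) =>
            {
                options.ValidateOnBuild = true;
                options.ValidateScopes = true;
            })
            .ConfigureServices(services =>
            {
                // Application registrations go here.
            })
            .Build();

    public static void Main(string[] args) => BuildHost(args).Run();
}
```

Note that on versions affected by this issue, turning ValidateOnBuild on everywhere would also make the incorrect IEnumerable<> results show up in every environment, not just Development; setting both options to false is the other consistent choice.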

Another bug repro, ran into this recently:

void Main()
{
    // BUG https://github.com/dotnet/runtime/issues/65145
    void Check(bool registerOpenGenericFirst, bool validateOnBuild, bool validateScopes)
    {
        var services = new ServiceCollection();
        if (registerOpenGenericFirst)
        {
            services.AddSingleton(typeof(IOpenGeneric<>), typeof(OpenGeneric<>));
            services.AddSingleton<IOpenGeneric<Foo>>(_ => new OpenGeneric<Foo>() { Value = new Foo("specific Foo") });
        }
        else
        {
            services.AddSingleton<IOpenGeneric<Foo>>(_ => new OpenGeneric<Foo>() { Value = new Foo("specific Foo") });
            services.AddSingleton(typeof(IOpenGeneric<>), typeof(OpenGeneric<>));
        }
        using var sp = services.BuildServiceProvider(new ServiceProviderOptions() { ValidateOnBuild = validateOnBuild, ValidateScopes = validateScopes, });
        sp.GetService<IEnumerable<IOpenGeneric<Foo>>>().Dump($"RegisterOpenGenericFirst={registerOpenGenericFirst}, ValidateOnBuild={validateOnBuild}, ValidateScopes={validateScopes}");

    }
    Check(registerOpenGenericFirst: true, validateOnBuild: true, validateScopes: true);
    Check(registerOpenGenericFirst: true, validateOnBuild: false, validateScopes: true);
    Check(registerOpenGenericFirst: false, validateOnBuild: true, validateScopes: true);
    Check(registerOpenGenericFirst: false, validateOnBuild: false, validateScopes: true);
}
interface IOpenGeneric<T> where T: new() { T? Value { get; }}
record OpenGeneric<T> : IOpenGeneric<T> where T: new() { public T? Value { get; set; } = new(); }
record Foo 
{ 
    public Foo(): this("Bar") {}
    public Foo(string test) => Test = test;
    public string Test;
}

@tarekgh / @davidfowl Is it possible to fix this at least in version 7.0?

.NET 7.0 is done. This issue is planned for .NET 8.0 now.

@buyaa-n @tarekgh , the release of .NET 8 is near, will it include the fix for this issue?

@buyaa-n @steveharter can help advise on this. @vchirikov are you interested in submitting a PR for it?

Pretty sure this is fixed, can someone validate?

@davidfowl, I've checked: the bug was still there in preview 4, but it looks like it has been fixed in preview 6 (at least in my repro)

Proof: [screenshot]

Thanks :)

Thanks @vchirikov for confirming. Closing the issue per your comment.