aws/aws-xray-sdk-dotnet

Using a custom DbCommandInterceptor along with X-Ray's EFInterceptor throws InvalidCastException

srprash opened this issue · 3 comments

In my sample ASPNetCore Webapp which uses EntityFramework with a Postgresql database context, I have an empty and very basic DbCommandInterceptor added to the database context in conjunction with AddXRayInterceptor method which adds the EFInterceptor to trace EF queries to my postgres db.
When I start the application and try to do a GET on the base url (http://localhost:18111/user), I get an InvalidCastException: Unable to cast object of type 'Amazon.XRay.Recorder.Core.Internal.Entities.Segment' to type 'Amazon.XRay.Recorder.Core.Internal.Entities.Subsegment error message.

Stack trace:

System.InvalidCastException: Unable to cast object of type 'Amazon.XRay.Recorder.Core.Internal.Entities.Segment' to type 'Amazon.XRay.Recorder.Core.Internal.Entities.Subsegment'.
   at Amazon.XRay.Recorder.Core.AWSXRayRecorderImpl.PrepEndSubsegment() in aws-xray-sdk-dotnet\sdk\src\Core\AWSXRayRecorderImpl.cs:line 826
   at Amazon.XRay.Recorder.Core.AWSXRayRecorderImpl.ProcessEndSubsegment(Nullable`1 timestamp) in aws-xray-sdk-dotnet\sdk\src\Core\AWSXRayRecorderImpl.cs:line 787
   at Amazon.XRay.Recorder.Core.AWSXRayRecorder.EndSubsegment(Nullable`1 timestamp) in aws-xray-sdk-dotnet\sdk\src\Core\AWSXRayRecorder.netcore.cs:line 331

Interesting thing to note is that POST requests on the url do not throw this error. Digging around by putting breakpoints, I noticed that POST calls for saving data into the database are synchronous while GET is asynchronous.
In case of GET, the ReaderExecutingAsync method is called right before the db call to begin a subsegment, and ReaderExecutedAsync is called after the db command is complete to close the subsegment. These methods are called on different threads and therefore the TraceContext is different. The subsegment opened in the first thread is not present on the second thread which tries to close the subsegment. Rather only the Segment which is created by these thread's parent is present in the second thread, therefore the InvalidCastException occurs.

Another point is that even though GET requests are asynchronous even when just the X-Ray's EFInterceptor is being used, the error does not occur. The TraceContext is consistent in this case.

My sample app's startup.cs:

namespace InterceptorExample
{
    public class MyDbCommandInterceptor : DbCommandInterceptor
    {
        // An empty interceptor
    }

    public class Startup
    {
        // This method gets called by the runtime. Use this method to add services to the container.
        public void ConfigureServices(IServiceCollection services)
        {
            services.AddControllers();

            services.AddDbContext<DatabaseContext>(options =>
            {
                options.UseNpgsql("Server=localhost;Port=1234;Database=postgres;User Id=dummy;Password=dummy;");

                // Add a simple do-nothing interceptor
                options.AddInterceptors(new MyDbCommandInterceptor());

                // Add XRay interceptor
                options.AddXRayInterceptor();
            });
        }

        // This method gets called by the runtime. Use this method to configure the HTTP request pipeline.
        public void Configure(IApplicationBuilder app)
        {
            app.UseXRay("Sample Application");

            app.UseDeveloperExceptionPage();

            app.UseRouting();

            app.UseEndpoints(endpoints =>
            {
                endpoints.MapControllers();
            });
        }
    }
}

Tested this with a simpler and no X-Ray sample app. I've come to the conclusion that its definitely something wrong with AsyncLocal context retention when more than one DbCommandInterceptors are used in the db call chain.
Please find details about the repro code here.