microsoft/fhir-server

All queries failing suddenly

Jaco-Minnaar opened this issue · 9 comments

Describe the bug
We are running an instance of the FHIR Server from the Docker file and running into some issues. Since at least 12 hours ago, all queries are failing with the following exception and trace:

2024-05-08T05:41:26.650255158Z fail: Microsoft.Health.Fhir.CosmosDb.Features.Storage.CosmosFhirDataStore[0]
2024-05-08T05:41:26.650270158Z     CosmosFhirDataStore. 'Search Parameters' failed. Elapsed: 00:00:00.0006074. Memory: 186966472.
2024-05-08T05:41:26.650273858Z     Microsoft.Health.Fhir.Core.Exceptions.InitializationException: Failed to initialize search parameters
2024-05-08T05:41:26.650277358Z         at Microsoft.Health.Fhir.Core.Features.Definition.SearchParameterDefinitionManager.EnsureInitialized() in /repo/src/Microsoft.Health.Fhir.Core/Features/Definition/SearchParameterDefinitionManager.cs:line 106
2024-05-08T05:41:26.650280958Z         at Microsoft.Health.Fhir.Core.Features.Definition.SearchParameterDefinitionManager.GetSearchParameters(String resourceType) in /repo/src/Microsoft.Health.Fhir.Core/Features/Definition/SearchParameterDefinitionManager.cs:line 112
2024-05-08T05:41:26.650284658Z         at Microsoft.Health.Fhir.Core.Features.Definition.SearchableSearchParameterDefinitionManager.GetSearchParameters(String resourceType) in /repo/src/Microsoft.Health.Fhir.Core/Features/Definition/SearchableSearchParameterDefinitionManager.cs:line 49
2024-05-08T05:41:26.650310959Z         at Microsoft.Health.Fhir.Core.Features.Conformance.CapabilityStatementBuilder.SyncSearchParamsAsync(String resourceType) in /repo/src/Microsoft.Health.Fhir.Core/Features/Conformance/CapabilityStatementBuilder.cs:line 185
2024-05-08T05:41:26.650314759Z         at Microsoft.Health.Fhir.Core.Features.Conformance.CapabilityStatementBuilder.SyncSearchParametersAsync() in /repo/src/Microsoft.Health.Fhir.Core/Features/Conformance/CapabilityStatementBuilder.cs:line 348
2024-05-08T05:41:26.650318059Z         at Microsoft.Health.Fhir.CosmosDb.Features.Storage.CosmosFhirDataStore.Build(ICapabilityStatementBuilder builder) in /repo/src/Microsoft.Health.Fhir.CosmosDb/Features/Storage/CosmosFhirDataStore.cs:line 806
2024-05-08T05:41:26.650655764Z fail: Microsoft.Health.Fhir.Core.Features.Conformance.SystemConformanceProvider[0]
2024-05-08T05:41:26.650665064Z     SystemConformanceProvider: Failed running 'Microsoft.Health.Fhir.CosmosDb.Features.Storage.CosmosFhirDataStore' when building a new CapabilityStatement.
2024-05-08T05:41:26.650668964Z     Microsoft.Health.Fhir.Core.Exceptions.InitializationException: Failed to initialize search parameters
2024-05-08T05:41:26.650672264Z         at Microsoft.Health.Fhir.Core.Features.Definition.SearchParameterDefinitionManager.EnsureInitialized() in /repo/src/Microsoft.Health.Fhir.Core/Features/Definition/SearchParameterDefinitionManager.cs:line 106
2024-05-08T05:41:26.650675665Z         at Microsoft.Health.Fhir.Core.Features.Definition.SearchParameterDefinitionManager.GetSearchParameters(String resourceType) in /repo/src/Microsoft.Health.Fhir.Core/Features/Definition/SearchParameterDefinitionManager.cs:line 112
2024-05-08T05:41:26.650678965Z         at Microsoft.Health.Fhir.Core.Features.Definition.SearchableSearchParameterDefinitionManager.GetSearchParameters(String resourceType) in /repo/src/Microsoft.Health.Fhir.Core/Features/Definition/SearchableSearchParameterDefinitionManager.cs:line 49
2024-05-08T05:41:26.650682665Z         at Microsoft.Health.Fhir.Core.Features.Conformance.CapabilityStatementBuilder.SyncSearchParamsAsync(String resourceType) in /repo/src/Microsoft.Health.Fhir.Core/Features/Conformance/CapabilityStatementBuilder.cs:line 185
2024-05-08T05:41:26.650686065Z         at Microsoft.Health.Fhir.Core.Features.Conformance.CapabilityStatementBuilder.SyncSearchParametersAsync() in /repo/src/Microsoft.Health.Fhir.Core/Features/Conformance/CapabilityStatementBuilder.cs:line 348
2024-05-08T05:41:26.650689565Z         at Microsoft.Health.Fhir.CosmosDb.Features.Storage.CosmosFhirDataStore.Build(ICapabilityStatementBuilder builder) in /repo/src/Microsoft.Health.Fhir.CosmosDb/Features/Storage/CosmosFhirDataStore.cs:line 806
2024-05-08T05:41:26.650692965Z         at Microsoft.Health.Fhir.Core.Features.Conformance.SystemConformanceProvider.GetCapabilityStatementOnStartup(CancellationToken cancellationToken) in /repo/src/Microsoft.Health.Fhir.Core/Features/Conformance/SystemConformanceProvider.cs:line 146

The response content is the following:

{
    "resourceType": "OperationOutcome",
    "id": "ee3a3476-4e9c-4f28-a960-699f23b1cab6",
    "meta": {
        "lastUpdated": "2024-05-08T05:51:35.2237241+00:00"
    },
    "issue": [
        {
            "severity": "error",
            "code": "unknown",
            "diagnostics": ""
        }
    ]
}

We are using CosmosDB and not sure if the issue lies with the FHIR server instance or the connection with the CosmosDB instance.

We have 3 environments with the same setup and all of them have the same issue.

FHIR Version?
R4

Data provider?
CosmosDB

To Reproduce
Steps to reproduce the behavior:

  1. Start FHIR server (successful)
  2. Do any query.
  3. See query failing.

Expected behavior
Successful queries.

Actual behavior
No access to any data on FHIR Server

If you're using containers, you can try going into the webapp/deployment center in portal and pick an earlier tag:
https://github.com/microsoft/fhir-server/releases
If it is because of the last release then try rolling back to image tag 4.0.116

If you're using containers, you can try going into the webapp/deployment center in portal and pick an earlier tag:
https://github.com/microsoft/fhir-server/releases
If it is because of the last release then try rolling back to image tag 4.0.116

We have tried rolling back to the previous version and seem to be having a working system now. Thanks!

What could have caused this issue?

In the last release we added a check on startup to make sure search parameters were fully loaded before allowing requests to the service. The error you are seeing implies the search parameters were never loaded.
Are there any other errors farther up in the logs? I'm curious if any exceptions were logged during startup.

The only other exception I see is this, but it looks like it's the same exception, just further down in the stack trace.

2024-05-09T07:48:56.667551201Z fail: Microsoft.Health.Fhir.Api.Features.Filters.OperationOutcomeExceptionFilterAttribute[0]
2024-05-09T07:48:56.667576402Z       InternalServerError error returned
2024-05-09T07:48:56.667584002Z       Microsoft.Health.Fhir.Core.Exceptions.InitializationException: Failed to initialize search parameters
2024-05-09T07:48:56.667587402Z          at Microsoft.Health.Fhir.Core.Features.Definition.SearchParameterDefinitionManager.EnsureInitialized() in /repo/src/Microsoft.Health.Fhir.Core/Features/Definition/SearchParameterDefinitionManager.cs:line 106
2024-05-09T07:48:56.667591302Z          at Microsoft.Health.Fhir.Core.Features.Definition.SearchParameterDefinitionManager.GetSearchParameters(String resourceType) in /repo/src/Microsoft.Health.Fhir.Core/Features/Definition/SearchParameterDefinitionManager.cs:line 112
2024-05-09T07:48:56.667595102Z          at Microsoft.Health.Fhir.Core.Features.Definition.SearchableSearchParameterDefinitionManager.GetSearchParameters(String resourceType) in /repo/src/Microsoft.Health.Fhir.Core/Features/Definition/SearchableSearchParameterDefinitionManager.cs:line 49
2024-05-09T07:48:56.667599203Z          at Microsoft.Health.Fhir.Core.Features.Conformance.CapabilityStatementBuilder.SyncSearchParamsAsync(String resourceType) in /repo/src/Microsoft.Health.Fhir.Core/Features/Conformance/CapabilityStatementBuilder.cs:line 185
2024-05-09T07:48:56.667602703Z          at Microsoft.Health.Fhir.Core.Features.Conformance.CapabilityStatementBuilder.SyncSearchParametersAsync() in /repo/src/Microsoft.Health.Fhir.Core/Features/Conformance/CapabilityStatementBuilder.cs:line 348
2024-05-09T07:48:56.667606103Z          at Microsoft.Health.Fhir.CosmosDb.Features.Storage.CosmosFhirDataStore.Build(ICapabilityStatementBuilder builder) in /repo/src/Microsoft.Health.Fhir.CosmosDb/Features/Storage/CosmosFhirDataStore.cs:line 806
2024-05-09T07:48:56.667609403Z          at Microsoft.Health.Fhir.Core.Features.Conformance.SystemConformanceProvider.GetCapabilityStatementOnStartup(CancellationToken cancellationToken) in /repo/src/Microsoft.Health.Fhir.Core/Features/Conformance/SystemConformanceProvider.cs:line 146
2024-05-09T07:48:56.667613603Z          at Microsoft.Health.Fhir.Core.Features.Conformance.SystemConformanceProvider.GetMetadata(CancellationToken cancellationToken) in /repo/src/Microsoft.Health.Fhir.Core/Features/Conformance/SystemConformanceProvider.cs:line 349
2024-05-09T07:48:56.667629203Z          at Microsoft.Health.Fhir.Core.Features.Conformance.GetCapabilitiesHandler.Handle(GetCapabilitiesRequest request, CancellationToken cancellationToken) in /repo/src/Microsoft.Health.Fhir.Core/Features/Conformance/GetCapabilitiesHandler.cs:line 29
2024-05-09T07:48:56.667633204Z          at MediatR.Pipeline.RequestExceptionProcessorBehavior`2.Handle(TRequest request, RequestHandlerDelegate`1 next, CancellationToken cancellationToken)
2024-05-09T07:48:56.667636704Z          at MediatR.Pipeline.RequestExceptionProcessorBehavior`2.Handle(TRequest request, RequestHandlerDelegate`1 next, CancellationToken cancellationToken)
2024-05-09T07:48:56.667640104Z          at MediatR.Pipeline.RequestExceptionActionProcessorBehavior`2.Handle(TRequest request, RequestHandlerDelegate`1 next, CancellationToken cancellationToken)
2024-05-09T07:48:56.667643404Z          at MediatR.Pipeline.RequestExceptionActionProcessorBehavior`2.Handle(TRequest request, RequestHandlerDelegate`1 next, CancellationToken cancellationToken)
2024-05-09T07:48:56.667646704Z          at MediatR.Pipeline.RequestPreProcessorBehavior`2.Handle(TRequest request, RequestHandlerDelegate`1 next, CancellationToken cancellationToken)
2024-05-09T07:48:56.667649804Z          at Microsoft.Health.Fhir.Core.Extensions.FhirMediatorExtensions.GetCapabilitiesAsync(IMediator mediator, CancellationToken cancellationToken) in /repo/src/Microsoft.Health.Fhir.Shared.Core/Extensions/FhirMediatorExtensions.cs:line 157
2024-05-09T07:48:56.667653204Z          at Microsoft.Health.Fhir.Api.Controllers.FhirController.Metadata() in /repo/src/Microsoft.Health.Fhir.Shared.Api/Controllers/FhirController.cs:line 636
2024-05-09T07:48:56.667656404Z          at Microsoft.AspNetCore.Mvc.Infrastructure.ActionMethodExecutor.TaskOfIActionResultExecutor.Execute(ActionContext actionContext, IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
2024-05-09T07:48:56.667659704Z          at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeActionMethodAsync>g__Awaited|12_0(ControllerActionInvoker invoker, ValueTask`1 actionResultValueTask)
2024-05-09T07:48:56.667663804Z          at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeNextActionFilterAsync>g__Awaited|10_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)

Thanks @Jaco-Minnaar
The root cause of this is likely a search parameter in a bad state. I've identified a bug that is preventing a more useful exception from being surfaced, but I'm afraid I can't repro the failure you are seeing without knowing the state of the search parameter. I'm working on a change that will let the better error message surface, which may help us find the root of the issue.

@Jaco-Minnaar This issue should be resolved with #3849
Could you try release 4.0.139 and see if the crash is still present? We have also added in more error logging, so you may see a more helpful error on startup pointing to what the root of the issue is. This log shouldn't impact your ability to use your service.

@Jaco-Minnaar closing the issue due to no response

Apologies for no response, we were only able to test the new release in the last week. Everything seems to be fine now.