Project-MONAI/monai-deploy-informatics-gateway

ExportRequestEvent looping

JoeBatt1989 opened this issue · 6 comments

Description

When MIG processes an ExportRequestEvent and the destination port is incorrect, the export appears to retry in an excessive loop. Meanwhile, other ExportRequestEvents are not picked up from the queue and remain in the ready state.

Steps to reproduce

  1. Register a destination with the correct IP but an incorrect port for the receiving DICOM server
  2. Create Workflow with an export task with that destination referenced
  3. Send data to MIG

Expected behavior

The task is marked as failed after a few attempts to create an association, and the next message on the queue is picked up.

Actual behavior

The task remains in the dispatched state, and the logs show that the association is attempted many times.

Configuration

Regression?

Other information

AD shared the workflow and logs. @mocsharp, this is one for us to test and investigate next week.

I tried to reproduce the issue by adding a DICOM destination with a bad port number and here's the log:

mdl-wm         | info: Monai.Deploy.Messaging.RabbitMQ.RabbitMQMessageSubscriberService[10002]
mdl-wm         |       Message received from queue md.tasks.update for md.tasks.update.
mdl-wm         | dbug: Monai.Deploy.WorkflowManager.WorkfowExecuter.Common.ArtifactMapper[32]
mdl-wm         |       Verifying artifact existence on bucket monaideploy: env_MONAI_OUTPUTPATH=2cc38ddc-6942-4cb5-bcd9-1d327f86d698/workflows/200ea8fa-a331-405b-8476-c63125cb93a9/9ab060da-a0fd-4af9-989a-2567eb279069/env_MONAI_OUTPUTPATH.
mdl-wm         | info: Monai.Deploy.Messaging.RabbitMQ.RabbitMQMessagePublisherService[10000]
mdl-wm         |       Publishing message to rabbitmq/monaideploy. Exchange=monaideploy, Routing Key=md.export.request.monaiscu.
mdl-wm         | info: Monai.Deploy.WorkflowManager.WorkfowExecuter.Services.WorkflowExecuterService[28]
mdl-wm         |       {"message":"Task Complete","object":{"ExecutionId":"9ab060da-a0fd-4af9-989a-2567eb279069","TaskId":"lung","WorkflowInstanceId":"200ea8fa-a331-405b-8476-c63125cb93a9","WorkflowId":"3bdd4d82-fe22-40b8-821b-53e4a6b68248","CorrelationId":"88aba874-61eb-4484-a550-86a358dbc4eb","TaskStatus":"Succeeded","TaskType":"docker","TaskStartTime":"2022-09-19T21:11:39.009Z","TaskEndTime":"2022-09-19T21:12:19.8466956Z","TaskStatsObject":{},"PatientDetails":{"patient_id":"covid-19-A-0001","patient_name":"{\n  \"Alphabetic\": \"covid-19-A-0001 Clara\"\n}","patient_sex":"M","patient_dob":"2006-01-01T00:00:00Z","patient_age":null,"patient_hospital_id":null}}}
mdl-wm         | info: Monai.Deploy.Messaging.RabbitMQ.RabbitMQMessageSubscriberService[10004]
mdl-wm         |       Sending message acknowledgement for message 2c968863-5e4d-4170-b94e-f26901b742fc.
mdl-wm         | info: Monai.Deploy.Messaging.RabbitMQ.RabbitMQMessageSubscriberService[10005]
mdl-wm         |       Ackowledge sent for message 2c968863-5e4d-4170-b94e-f26901b742fc.
mdl-ig         | info: Monai.Deploy.Messaging.RabbitMQ.RabbitMQMessageSubscriberService[10002]
mdl-ig         |       Message received from queue md.export.request.monaiscu for md.export.request.monaiscu.
mdl-ig         | info: Monai.Deploy.InformaticsGateway.Services.Export.ScuExportService[525]
mdl-ig         |       Sending job to ORTHANC@172.29.0.100:8899...
mdl-ig         | [DEBUG] [IDLE] --> [CONNECTING]
mdl-ig         | [DEBUG] [CONNECTING] --> [COMPLETED]
mdl-ig         | [DEBUG] [COMPLETED] DICOM client completed with an error
mdl-ig         | [WARNING] [COMPLETED] An error occurred and no active connection was detected, so no cleanup will happen!
mdl-ig         | fail: Monai.Deploy.InformaticsGateway.Services.Export.ScuExportService[524]
mdl-ig         |       Error exporting to DICOM destination. Waiting 00:00:00.2500000 before next retry. Retry attempt 1.
mdl-ig         |       System.AggregateException: One or more errors occurred. (Connection refused)
mdl-ig         |        ---> System.Net.Sockets.SocketException (111): Connection refused
mdl-ig         |          at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
mdl-ig         |          at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
mdl-ig         |          at System.Threading.Tasks.ValueTask.ValueTaskSourceAsTask.<>c.<.cctor>b__4_0(Object state)
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at System.Net.Sockets.TcpClient.CompleteConnectAsync(Task task)
mdl-ig         |          --- End of inner exception stack trace ---
mdl-ig         |          at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
mdl-ig         |          at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
mdl-ig         |          at System.Threading.Tasks.Task.Wait()
mdl-ig         |          at FellowOakDicom.Network.DesktopNetworkStream..ctor(String host, Int32 port, Boolean useTls, Boolean noDelay, Boolean ignoreSslPolicyErrors, Int32 millisecondsTimeout)
mdl-ig         |          at FellowOakDicom.Network.DesktopNetworkManager.CreateNetworkStreamImpl(String host, Int32 port, Boolean useTls, Boolean noDelay, Boolean ignoreSslPolicyErrors, Int32 millisecondsTimeout)
mdl-ig         |          at FellowOakDicom.Network.NetworkManager.FellowOakDicom.Network.INetworkManager.CreateNetworkStream(String host, Int32 port, Boolean useTls, Boolean noDelay, Boolean ignoreSslPolicyErrors, Int32 millisecondsTimeout)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientConnectState.<>c__DisplayClass4_0.<Connect>b__0()
mdl-ig         |          at System.Threading.Tasks.Task`1.InnerInvoke()
mdl-ig         |          at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientConnectState.Connect(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientConnectState.GetNextStateAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientCompletedState.GetNextStateAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.DicomClient.Transition(IDicomClientState newState, DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientConnectState.GetNextStateAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.DicomClient.Transition(IDicomClientState newState, DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientIdleState.SendAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.DicomClient.SendAsync(CancellationToken cancellationToken, DicomClientCancellationMode cancellationMode)
mdl-ig         |          at Monai.Deploy.InformaticsGateway.Services.Export.ScuExportService.<>c__DisplayClass14_0.<<HandleDesination>b__4>d.MoveNext() in /app/src/InformaticsGateway/Services/Export/ScuExportService.cs:line 123
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at Polly.AsyncPolicy.<>c__DisplayClass40_0.<<ImplementationAsync>b__0>d.MoveNext()
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at Polly.Retry.AsyncRetryEngine.ImplementationAsync[TResult](Func`3 action, Context context, CancellationToken cancellationToken, ExceptionPredicates shouldRetryExceptionPredicates, ResultPredicates`1 shouldRetryResultPredicates, Func`5 onRetryAsync, Int32 permittedRetryCount, IEnumerable`1 sleepDurationsEnumerable, Func`4 sleepDurationProvider, Boolean continueOnCapturedContext)
mdl-ig         | info: Monai.Deploy.InformaticsGateway.Services.Export.ScuExportService[525]
mdl-ig         |       Sending job to ORTHANC@172.29.0.100:8899...
mdl-ig         | [DEBUG] [COMPLETED] --> [IDLE]
mdl-ig         | [DEBUG] [IDLE] More requests to send (and no cancellation requested yet), automatically opening new association
mdl-ig         | [DEBUG] [IDLE] --> [CONNECTING]
mdl-ig         | [DEBUG] [CONNECTING] --> [COMPLETED]
mdl-ig         | [DEBUG] [COMPLETED] DICOM client completed with an error
mdl-ig         | [WARNING] [COMPLETED] An error occurred and no active connection was detected, so no cleanup will happen!
mdl-ig         | fail: Monai.Deploy.InformaticsGateway.Services.Export.ScuExportService[524]
mdl-ig         |       Error exporting to DICOM destination. Waiting 00:00:00.5000000 before next retry. Retry attempt 2.
mdl-ig         |       System.AggregateException: One or more errors occurred. (Connection refused)
mdl-ig         |        ---> System.Net.Sockets.SocketException (111): Connection refused
mdl-ig         |          at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
mdl-ig         |          at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
mdl-ig         |          at System.Threading.Tasks.ValueTask.ValueTaskSourceAsTask.<>c.<.cctor>b__4_0(Object state)
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at System.Net.Sockets.TcpClient.CompleteConnectAsync(Task task)
mdl-ig         |          --- End of inner exception stack trace ---
mdl-ig         |          at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
mdl-ig         |          at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
mdl-ig         |          at System.Threading.Tasks.Task.Wait()
mdl-ig         |          at FellowOakDicom.Network.DesktopNetworkStream..ctor(String host, Int32 port, Boolean useTls, Boolean noDelay, Boolean ignoreSslPolicyErrors, Int32 millisecondsTimeout)
mdl-ig         |          at FellowOakDicom.Network.DesktopNetworkManager.CreateNetworkStreamImpl(String host, Int32 port, Boolean useTls, Boolean noDelay, Boolean ignoreSslPolicyErrors, Int32 millisecondsTimeout)
mdl-ig         |          at FellowOakDicom.Network.NetworkManager.FellowOakDicom.Network.INetworkManager.CreateNetworkStream(String host, Int32 port, Boolean useTls, Boolean noDelay, Boolean ignoreSslPolicyErrors, Int32 millisecondsTimeout)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientConnectState.<>c__DisplayClass4_0.<Connect>b__0()
mdl-ig         |          at System.Threading.Tasks.Task`1.InnerInvoke()
mdl-ig         |          at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientConnectState.Connect(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientConnectState.GetNextStateAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientCompletedState.GetNextStateAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.DicomClient.Transition(IDicomClientState newState, DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientConnectState.GetNextStateAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.DicomClient.Transition(IDicomClientState newState, DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientIdleState.GetNextStateAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.DicomClient.Transition(IDicomClientState newState, DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientCompletedState.SendAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.DicomClient.SendAsync(CancellationToken cancellationToken, DicomClientCancellationMode cancellationMode)
mdl-ig         |          at Monai.Deploy.InformaticsGateway.Services.Export.ScuExportService.<>c__DisplayClass14_0.<<HandleDesination>b__4>d.MoveNext() in /app/src/InformaticsGateway/Services/Export/ScuExportService.cs:line 123
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at Polly.AsyncPolicy.<>c__DisplayClass40_0.<<ImplementationAsync>b__0>d.MoveNext()
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at Polly.Retry.AsyncRetryEngine.ImplementationAsync[TResult](Func`3 action, Context context, CancellationToken cancellationToken, ExceptionPredicates shouldRetryExceptionPredicates, ResultPredicates`1 shouldRetryResultPredicates, Func`5 onRetryAsync, Int32 permittedRetryCount, IEnumerable`1 sleepDurationsEnumerable, Func`4 sleepDurationProvider, Boolean continueOnCapturedContext)
mdl-ig         | info: Monai.Deploy.InformaticsGateway.Services.Export.ScuExportService[525]
mdl-ig         |       Sending job to ORTHANC@172.29.0.100:8899...
mdl-ig         | [DEBUG] [COMPLETED] --> [IDLE]
mdl-ig         | [DEBUG] [IDLE] More requests to send (and no cancellation requested yet), automatically opening new association
mdl-ig         | [DEBUG] [IDLE] --> [CONNECTING]
mdl-ig         | [DEBUG] [CONNECTING] --> [COMPLETED]
mdl-ig         | [DEBUG] [COMPLETED] DICOM client completed with an error
mdl-ig         | [WARNING] [COMPLETED] An error occurred and no active connection was detected, so no cleanup will happen!
mdl-ig         | fail: Monai.Deploy.InformaticsGateway.Services.Export.ScuExportService[524]
mdl-ig         |       Error exporting to DICOM destination. Waiting 00:00:01 before next retry. Retry attempt 3.
mdl-ig         |       System.AggregateException: One or more errors occurred. (Connection refused)
mdl-ig         |        ---> System.Net.Sockets.SocketException (111): Connection refused
mdl-ig         |          at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
mdl-ig         |          at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
mdl-ig         |          at System.Threading.Tasks.ValueTask.ValueTaskSourceAsTask.<>c.<.cctor>b__4_0(Object state)
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at System.Net.Sockets.TcpClient.CompleteConnectAsync(Task task)
mdl-ig         |          --- End of inner exception stack trace ---
mdl-ig         |          at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
mdl-ig         |          at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
mdl-ig         |          at System.Threading.Tasks.Task.Wait()
mdl-ig         |          at FellowOakDicom.Network.DesktopNetworkStream..ctor(String host, Int32 port, Boolean useTls, Boolean noDelay, Boolean ignoreSslPolicyErrors, Int32 millisecondsTimeout)
mdl-ig         |          at FellowOakDicom.Network.DesktopNetworkManager.CreateNetworkStreamImpl(String host, Int32 port, Boolean useTls, Boolean noDelay, Boolean ignoreSslPolicyErrors, Int32 millisecondsTimeout)
mdl-ig         |          at FellowOakDicom.Network.NetworkManager.FellowOakDicom.Network.INetworkManager.CreateNetworkStream(String host, Int32 port, Boolean useTls, Boolean noDelay, Boolean ignoreSslPolicyErrors, Int32 millisecondsTimeout)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientConnectState.<>c__DisplayClass4_0.<Connect>b__0()
mdl-ig         |          at System.Threading.Tasks.Task`1.InnerInvoke()
mdl-ig         |          at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientConnectState.Connect(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientConnectState.GetNextStateAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientCompletedState.GetNextStateAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.DicomClient.Transition(IDicomClientState newState, DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientConnectState.GetNextStateAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.DicomClient.Transition(IDicomClientState newState, DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientIdleState.GetNextStateAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.DicomClient.Transition(IDicomClientState newState, DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientCompletedState.SendAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.DicomClient.SendAsync(CancellationToken cancellationToken, DicomClientCancellationMode cancellationMode)
mdl-ig         |          at Monai.Deploy.InformaticsGateway.Services.Export.ScuExportService.<>c__DisplayClass14_0.<<HandleDesination>b__4>d.MoveNext() in /app/src/InformaticsGateway/Services/Export/ScuExportService.cs:line 123
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at Polly.AsyncPolicy.<>c__DisplayClass40_0.<<ImplementationAsync>b__0>d.MoveNext()
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at Polly.Retry.AsyncRetryEngine.ImplementationAsync[TResult](Func`3 action, Context context, CancellationToken cancellationToken, ExceptionPredicates shouldRetryExceptionPredicates, ResultPredicates`1 shouldRetryResultPredicates, Func`5 onRetryAsync, Int32 permittedRetryCount, IEnumerable`1 sleepDurationsEnumerable, Func`4 sleepDurationProvider, Boolean continueOnCapturedContext)
mdl-ig         | info: Monai.Deploy.InformaticsGateway.Services.Export.ScuExportService[525]
mdl-ig         |       Sending job to ORTHANC@172.29.0.100:8899...
mdl-ig         | [DEBUG] [COMPLETED] --> [IDLE]
mdl-ig         | [DEBUG] [IDLE] More requests to send (and no cancellation requested yet), automatically opening new association
mdl-ig         | [DEBUG] [IDLE] --> [CONNECTING]
mdl-ig         | [DEBUG] [CONNECTING] --> [COMPLETED]
mdl-ig         | [DEBUG] [COMPLETED] DICOM client completed with an error
mdl-ig         | [WARNING] [COMPLETED] An error occurred and no active connection was detected, so no cleanup will happen!
mdl-ig         | fail: Monai.Deploy.InformaticsGateway.Services.Export.ScuExportService[530]
mdl-ig         |       Association aborted with error Connection refused.
mdl-ig         |       System.AggregateException: One or more errors occurred. (Connection refused)
mdl-ig         |        ---> System.Net.Sockets.SocketException (111): Connection refused
mdl-ig         |          at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
mdl-ig         |          at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
mdl-ig         |          at System.Threading.Tasks.ValueTask.ValueTaskSourceAsTask.<>c.<.cctor>b__4_0(Object state)
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at System.Net.Sockets.TcpClient.CompleteConnectAsync(Task task)
mdl-ig         |          --- End of inner exception stack trace ---
mdl-ig         |          at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
mdl-ig         |          at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
mdl-ig         |          at System.Threading.Tasks.Task.Wait()
mdl-ig         |          at FellowOakDicom.Network.DesktopNetworkStream..ctor(String host, Int32 port, Boolean useTls, Boolean noDelay, Boolean ignoreSslPolicyErrors, Int32 millisecondsTimeout)
mdl-ig         |          at FellowOakDicom.Network.DesktopNetworkManager.CreateNetworkStreamImpl(String host, Int32 port, Boolean useTls, Boolean noDelay, Boolean ignoreSslPolicyErrors, Int32 millisecondsTimeout)
mdl-ig         |          at FellowOakDicom.Network.NetworkManager.FellowOakDicom.Network.INetworkManager.CreateNetworkStream(String host, Int32 port, Boolean useTls, Boolean noDelay, Boolean ignoreSslPolicyErrors, Int32 millisecondsTimeout)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientConnectState.<>c__DisplayClass4_0.<Connect>b__0()
mdl-ig         |          at System.Threading.Tasks.Task`1.InnerInvoke()
mdl-ig         |          at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientConnectState.Connect(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientConnectState.GetNextStateAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientCompletedState.GetNextStateAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.DicomClient.Transition(IDicomClientState newState, DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientConnectState.GetNextStateAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.DicomClient.Transition(IDicomClientState newState, DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientIdleState.GetNextStateAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.DicomClient.Transition(IDicomClientState newState, DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.States.DicomClientCompletedState.SendAsync(DicomClientCancellation cancellation)
mdl-ig         |          at FellowOakDicom.Network.Client.DicomClient.SendAsync(CancellationToken cancellationToken, DicomClientCancellationMode cancellationMode)
mdl-ig         |          at Monai.Deploy.InformaticsGateway.Services.Export.ScuExportService.<>c__DisplayClass14_0.<<HandleDesination>b__4>d.MoveNext() in /app/src/InformaticsGateway/Services/Export/ScuExportService.cs:line 123
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at Polly.AsyncPolicy.<>c__DisplayClass40_0.<<ImplementationAsync>b__0>d.MoveNext()
mdl-ig         |       --- End of stack trace from previous location ---
mdl-ig         |          at Polly.Retry.AsyncRetryEngine.ImplementationAsync[TResult](Func`3 action, Context context, CancellationToken cancellationToken, ExceptionPredicates shouldRetryExceptionPredicates, ResultPredicates`1 shouldRetryResultPredicates, Func`5 onRetryAsync, Int32 permittedRetryCount, IEnumerable`1 sleepDurationsEnumerable, Func`4 sleepDurationProvider, Boolean continueOnCapturedContext)
mdl-ig         |          at Polly.AsyncPolicy.ExecuteAsync(Func`3 action, Context context, CancellationToken cancellationToken, Boolean continueOnCapturedContext)
mdl-ig         |          at Monai.Deploy.InformaticsGateway.Services.Export.ScuExportService.HandleDesination(ExportRequestDataMessage exportRequestData, String destinationName, CancellationToken cancellationToken) in /app/src/InformaticsGateway/Services/Export/ScuExportService.cs:line 112
mdl-ig         | info: Monai.Deploy.InformaticsGateway.Services.Export.ScuExportService[505]
mdl-ig         |       Export task completed with 1 failures out of 1.
mdl-ig         | info: Monai.Deploy.InformaticsGateway.Services.Export.ScuExportService[509]
mdl-ig         |       Sending acknowledgement.
mdl-ig         | info: Monai.Deploy.Messaging.RabbitMQ.RabbitMQMessageSubscriberService[10004]
mdl-ig         |       Sending message acknowledgement for message 0cdf4e7c-df72-40dd-beec-9f90c385f356.
mdl-ig         | info: Monai.Deploy.Messaging.RabbitMQ.RabbitMQMessageSubscriberService[10005]
mdl-ig         |       Ackowledge sent for message 0cdf4e7c-df72-40dd-beec-9f90c385f356.
mdl-ig         | info: Monai.Deploy.InformaticsGateway.Services.Export.ScuExportService[511]
mdl-ig         |       Publishing export complete message.
mdl-ig         | info: Monai.Deploy.Messaging.RabbitMQ.RabbitMQMessagePublisherService[10000]
mdl-ig         |       Publishing message to rabbitmq/monaideploy. Exchange=monaideploy, Routing Key=md.export.complete.
mdl-wm         | info: Monai.Deploy.Messaging.RabbitMQ.RabbitMQMessageSubscriberService[10002]
mdl-wm         |       Message received from queue md.export.complete for md.export.complete.
mdl-wm         | info: Monai.Deploy.WorkflowManager.WorkfowExecuter.Services.WorkflowExecuterService[28]
mdl-wm         |       {"message":"Task Complete","object":{"ExecutionId":"c7fe562d-2075-4bea-badd-af57a9e23865","TaskId":"export-lung-seg","WorkflowInstanceId":"200ea8fa-a331-405b-8476-c63125cb93a9","WorkflowId":"3bdd4d82-fe22-40b8-821b-53e4a6b68248","CorrelationId":"88aba874-61eb-4484-a550-86a358dbc4eb","TaskStatus":"Failed","TaskType":"export","TaskStartTime":"2022-09-19T21:12:19.806Z","TaskEndTime":"2022-09-19T21:12:21.8733409Z","TaskStatsObject":{},"PatientDetails":{"patient_id":"covid-19-A-0001","patient_name":"{\n  \"Alphabetic\": \"covid-19-A-0001 Clara\"\n}","patient_sex":"M","patient_dob":"2006-01-01T00:00:00Z","patient_age":null,"patient_hospital_id":null}}}
mdl-wm         | info: Monai.Deploy.Messaging.RabbitMQ.RabbitMQMessageSubscriberService[10004]
mdl-wm         |       Sending message acknowledgement for message 0cdf4e7c-df72-40dd-beec-9f90c385f356.
mdl-wm         | info: Monai.Deploy.Messaging.RabbitMQ.RabbitMQMessageSubscriberService[10005]
mdl-wm         |       Ackowledge sent for message 0cdf4e7c-df72-40dd-beec-9f90c385f356.

The log shows that IG retried the export 3 times (4 attempts in total), then gave up, sending an Ack message followed by an export complete message.
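For anyone reading the log above, the waits between attempts (0.25 s, 0.5 s, 1 s) double each time, and the fourth failure ends the export. Here is a rough Python sketch of that retry pattern. It is only an illustration, not the gateway's actual C# code (which goes through Polly, per the stack trace); `backoff_delays`, `export_with_retry`, `send`, and the default values are hypothetical names chosen to match the numbers in the log.

```python
import time


def backoff_delays(retries=3, base=0.25):
    """Delays before each retry, doubling every time: 0.25, 0.5, 1.0."""
    return [base * 2 ** i for i in range(retries)]


def export_with_retry(send, retries=3, base=0.25, sleep=time.sleep):
    """Call send() once, then retry up to `retries` times on connection refusal.

    Returns (succeeded, attempts). `send` is a hypothetical callable that
    raises ConnectionRefusedError when the destination cannot be reached.
    """
    attempts = 0
    # One initial attempt plus one attempt per delay; None marks the last try.
    for delay in backoff_delays(retries, base) + [None]:
        attempts += 1
        try:
            send()
            return True, attempts
        except ConnectionRefusedError:
            if delay is None:
                # Retries exhausted: report failure so the caller can still
                # acknowledge the message and publish the result.
                return False, attempts
            sleep(delay)
```

With a `send` that always raises `ConnectionRefusedError`, this makes 4 attempts and returns a failure, which corresponds to the "1 failures out of 1" line in the log.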

@neildsouth do you have some logs showing this please?

@neildsouth, where can I find the logs? Thanks

logs.zip
These should be the ones from the night in question.

After going through the logs, I see that export messages are being acknowledged.

The design was changed because we want to be able to export to multiple destinations from a single message. To reduce memory usage, each file is downloaded and then passed down the export pipeline, which exports it to each of the specified destinations. This means that if a destination is unreachable, every file is retried 3 times (4 attempts in total). With 3 files, that is 12 attempts.
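To make the arithmetic concrete, here is a small Python sketch of how the number of association attempts grows with the number of files. `total_attempts` is a hypothetical helper, and the `destinations` multiplier is an assumption: the comment above only gives the single-destination figure.

```python
def total_attempts(files, retries=3, destinations=1):
    """Connection attempts made when every destination is unreachable.

    Each file gets one initial attempt plus `retries` retries.
    Multiplying by `destinations` is an assumption for illustration.
    """
    return files * destinations * (1 + retries)


print(total_attempts(3))  # 3 files x 4 attempts each = 12
```

This is why a single export request with many files can look like a loop in the logs even though each file stops after its fourth attempt.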

In the future, we can improve this by downloading all files to the local filesystem before attempting to export them.

@JoeBatt1989 I'm closing this for now; if you find that an export request message doesn't ever get ack'ed, please reopen this issue.