OrleansContrib/Orleans.Clustering.Kubernetes

Directory.RegisterAsync S10.100.0.133:11111:286399169*cli/693b40b9@9aa135a9 failed.

4c74356b41 opened this issue · 3 comments

Hey, I'm getting a weird issue:

[19:19:43:723 WRN] UnregisterManyAsync 1 failed.
System.InvalidOperationException: Grain directory is stopping
   at Orleans.Runtime.GrainDirectory.LocalGrainDirectory.CheckIfShouldForward(GrainId grainId, Int32 hopCount, String operationDescription)
   at Orleans.Runtime.GrainDirectory.LocalGrainDirectory.UnregisterOrPutInForwardList(IEnumerable`1 addresses, UnregistrationCause cause, Int32 hopCount, Dictionary`2& forward, List`1 tasks, String context)
   at Orleans.Runtime.GrainDirectory.LocalGrainDirectory.UnregisterManyAsync(List`1 addresses, UnregistrationCause cause, Int32 hopCount)
   at Orleans.Runtime.Scheduler.AsyncClosureWorkItem.Execute()
   at Orleans.Runtime.Catalog.FinishDestroyActivations(List`1 list, Int32 number, MultiTaskCompletionSource tcs)
[19:19:43:724 WRN] UnregisterManyAsync 3 failed.
System.InvalidOperationException: Grain directory is stopping
   at Orleans.Runtime.GrainDirectory.LocalGrainDirectory.CheckIfShouldForward(GrainId grainId, Int32 hopCount, String operationDescription)
   at Orleans.Runtime.GrainDirectory.LocalGrainDirectory.UnregisterOrPutInForwardList(IEnumerable`1 addresses, UnregistrationCause cause, Int32 hopCount, Dictionary`2& forward, List`1 tasks, String context)
   at Orleans.Runtime.GrainDirectory.LocalGrainDirectory.UnregisterManyAsync(List`1 addresses, UnregistrationCause cause, Int32 hopCount)
   at Orleans.Runtime.Scheduler.AsyncClosureWorkItem.Execute()
   at Orleans.Runtime.Catalog.FinishDestroyActivations(List`1 list, Int32 number, MultiTaskCompletionSource tcs)
[19:19:44:527 ERR] Directory.RegisterAsync S10.100.0.133:11111:286399169*cli/693b40b9@9aa135a9 failed.
System.InvalidOperationException: Grain directory is stopping
   at Orleans.Runtime.GrainDirectory.LocalGrainDirectory.CheckIfShouldForward(GrainId grainId, Int32 hopCount, String operationDescription)
   at Orleans.Runtime.GrainDirectory.LocalGrainDirectory.RegisterAsync(ActivationAddress address, Boolean singleActivation, Int32 hopCount)
   at Orleans.OrleansTaskExtentions.LogException(Task task, ILogger logger, ErrorCode errorCode, String message)
[19:19:44:528 ERR] OnClientRefreshTimer has thrown an exceptions.
System.InvalidOperationException: Grain directory is stopping
   at Orleans.Runtime.GrainDirectory.LocalGrainDirectory.CheckIfShouldForward(GrainId grainId, Int32 hopCount, String operationDescription)
   at Orleans.Runtime.GrainDirectory.LocalGrainDirectory.RegisterAsync(ActivationAddress address, Boolean singleActivation, Int32 hopCount)
   at Orleans.OrleansTaskExtentions.LogException(Task task, ILogger logger, ErrorCode errorCode, String message)
   at Orleans.Runtime.ClientObserverRegistrar.OnClientRefreshTimer(Object data)
[19:19:44:560 ERR] RunClientMessagePump has thrown exception
System.OperationCanceledException: The operation was canceled.
   at System.Collections.Concurrent.BlockingCollection`1.TryTakeWithNoTimeValidation(T& item, Int32 millisecondsTimeout, CancellationToken cancellationToken, CancellationTokenSource combinedTokenSource)
   at System.Collections.Concurrent.BlockingCollection`1.TryTake(T& item, Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at System.Collections.Concurrent.BlockingCollection`1.Take(CancellationToken cancellationToken)
   at Orleans.Runtime.HostedClient.RunClientMessagePump()
[19:19:44:582 WRN] Process is exiting

Not really sure what the issue could be; the sample app works fine. This is an existing Orleans app I'm trying to convert to run on top of a Kubernetes cluster. The CRDs are there, and I can see it creating custom resources:

address: 10.100.0.133
apiVersion: orleans.dot.net/v1
clusterId: clarityorleansclusterid
generation: 286399169
hostname: cli-784b77474b-w2wsr
iAmAliveTime: "2019-01-28T19:19:43.8210054+00:00"
kind: OrleansSilo
metadata:
  name: 10.100.0.133-11111-286399169
  namespace: orleans
port: 11111
proxyPort: 30000
siloName: Silo_63485
startTime: "2019-01-28T19:19:30.0103034+00:00"
status: dead
suspectingSilos:
- 10.100.0.133:11111@286399169
suspectingTimes:
- 2019-01-28 19:19:43.821 GMT
---
apiVersion: orleans.dot.net/v1
clusterId: clarityorleansclusterid
clusterVersion: 4
kind: OrleansClusterVersion
metadata:
  name: clarityorleansclusterid
  namespace: orleans

Any pointers?

Never mind, it was a silly leftover configuration: the application waited for a keystroke before shutting itself down.

Sorry for taking so long to reply.

I was about to say that it was either a networking issue or your silo was not shut down properly and other silos were trying to reach it. Probably the latter.

Glad you solved it.

Please let me know if you run into any other problems.

Best regards,
Gutemberg

Nah, it was waiting for console input, and obviously the container was shut down because of that.
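
For anyone hitting the same symptoms: a container normally has no stdin attached, so a host that waits on `Console.ReadLine()` or `Console.ReadKey()` tends to return (or throw) right away and stop the silo seconds after startup. Below is a minimal sketch of waiting for the termination signal instead, assuming .NET Core and C# 7.1 or later. `StartSiloAsync` and `StopSiloAsync` are hypothetical placeholders for whatever starts and stops your silo (in Orleans 2.x that is probably `ISiloHost.StartAsync()` and `StopAsync()`), and the 25-second timeout is an arbitrary choice.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Program
{
    // Hypothetical placeholders: replace with the real silo start/stop calls.
    private static Task StartSiloAsync() => Task.CompletedTask;
    private static Task StopSiloAsync() => Task.CompletedTask;

    public static async Task Main()
    {
        // Completed when the process is asked to stop.
        var shutdownRequested = new TaskCompletionSource<bool>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        // Set once the silo has finished stopping.
        var cleanupDone = new ManualResetEventSlim(false);

        // Ctrl+C (SIGINT), useful when running locally.
        Console.CancelKeyPress += (sender, e) =>
        {
            e.Cancel = true; // do not kill the process; let Main stop the silo
            shutdownRequested.TrySetResult(true);
        };

        // On .NET Core, SIGTERM (what Kubernetes sends to a pod) raises ProcessExit.
        AppDomain.CurrentDomain.ProcessExit += (sender, e) =>
        {
            shutdownRequested.TrySetResult(true);
            // Hold the exit until Main has stopped the silo, up to 25 seconds.
            cleanupDone.Wait(TimeSpan.FromSeconds(25));
        };

        await StartSiloAsync();

        // Wait for a signal instead of a keystroke.
        await shutdownRequested.Task;

        await StopSiloAsync();
        cleanupDone.Set();
    }
}
```

The wait inside the `ProcessExit` handler should stay below the pod's `terminationGracePeriodSeconds`, otherwise Kubernetes may kill the process before the silo has finished leaving the cluster.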