dotnet/orleans

Grain State changes when silo restarts

gungorenu opened this issue · 4 comments

Hello, I want to ask about a weird situation where my grain state data changes when I restart my service. I wonder if I am doing something wrong.

First of all, here is my grain state (cleaned up a little):

[GenerateSerializer]
public class PublishGrainState : TaleEntityStateBase
{
    public PublishGrainState()
    {
        LastExecutedPage = new ChapterPagePair();
    }

    [Id(4)]
    public ChapterPagePair LastExecutedPage { get; set; }
}

[GenerateSerializer]
public class ChapterPagePair
{
    public ChapterPagePair()
    {
        Chapter = -1;
        Page = -1;
    }

    [Id(0)]
    public int Chapter { get; init; }
    [Id(1)]
    public int Page { get; init; }
}

As you can see, I set the chapter and page to -1 in the constructor, but during operations they are set to other values, starting from 0.
In the debugger I can see that I overwrite the properties with, for example, 30 and 0 (chapter/page), or other combinations.

Then I stop my "client" and restart it. My grain has the code below to get the last executed page (the real code is a little different and I cleaned it up, but the functionality is almost the same).

        private readonly IPersistentState<PublishGrainState> _state;
        public PublishGrain([PersistentState("persistentState", "TaleSvcStorage")] IPersistentState<PublishGrainState> persistentState)
        {
            _state = persistentState;
        }

        public PublishGrainState State => _state.State;

        public Task<ChapterPagePair> LastExecutedPage()
        {
            if (State.Status != ControllerGrainStatus.Executed &&
                State.Status != ControllerGrainStatus.Published &&
                State.Status != ControllerGrainStatus.Idle) return Task.FromResult(new ChapterPagePair { Chapter = -1, Page = -1 });
            return Task.FromResult(State.LastExecutedPage);
        }

        async Task SaveState(Func<PublishGrainState, Task> action)
        {
            await action(_state.State);
            await _state.WriteStateAsync();
        }
        
        // a random call
        public async Task OnExecuteComplete(int callerChapter, int callerPage, ExecutionResult result)
        {
            // ....
            await SaveState(async (state) =>
            {
                // ....
                state.ExecuteResults[callerChapter] = result;
                state.Status = ControllerGrainStatus.Executed;
                state.LastExecutedPage = new ChapterPagePair { Chapter = callerChapter, Page = callerPage };
            });
        }

In this call I can see the state info just fine (all properties are correct, including LastExecutedPage) "until I restart my server". Once I restart my server, the response becomes 30/-1 even though the object has not changed.

I have other objects with chapter/page pairs such as 29/17, 26/23, 7/8 and so on. They all work correctly except the last one, which is supposed to be 30/0 but comes back as 30/-1. Note: I see this in my silo, not only in the client, so the page value is -1 in the silo too.

Funny thing: when I set Page to -100 in the ChapterPagePair constructor, I see -100 in the response after a restart, but the chapter is always correct, 30.

I changed the properties from "init" to "set" but it did not help. I also changed the code to use clones of the object when assigning to the state. That did not help either.

What could be wrong here? I cannot see the DB content (it is encrypted) and I do not know if there is a way to see it, but all other objects work except this one, so I wonder if there is a rule about constructors in grain states.

Any idea?
Thanks

Note: I use Orleans 8.1.0. I do not know whether such an issue is fixed in 8.2.0. Maybe I can upgrade to the latest version and try again.

Are you calling await _state.WriteStateAsync() to persist the state?

@ReubenBond Yes, but only when I need to write or update something. For read operations I don't.

I added the sample save-state code to the first post.

I think I figured it out. The serializer does not serialize 0 because it is the default value for int, but for me 0 is meaningful, so I cannot treat 0 as a default. I think we can close this. Orleans has no issue with it; it is the serializer that is the culprit. It does not serialize 0s, so the value is never written and never read back, and after a restart it stays at the -1 set by the constructor.
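
The effect described above can be reproduced with a minimal sketch in plain C#. It assumes a serializer that omits members equal to their type's default value. `DefaultSkippingSerializer` is a hypothetical stand-in written only for illustration; it is not the Orleans API or the API of any storage provider, and the thread does not say which serializer was actually in use.

```csharp
using System;
using System.Collections.Generic;

// Same shape as the ChapterPagePair from the first post, without the Orleans attributes.
public class ChapterPagePair
{
    public ChapterPagePair()
    {
        Chapter = -1;
        Page = -1;
    }

    public int Chapter { get; set; }
    public int Page { get; set; }
}

// Hypothetical model of a serializer that skips default-valued members.
public static class DefaultSkippingSerializer
{
    public static Dictionary<string, int> Serialize(ChapterPagePair value)
    {
        var fields = new Dictionary<string, int>();
        // 0 is default(int), so a member holding 0 is not written at all.
        if (value.Chapter != 0) fields["Chapter"] = value.Chapter;
        if (value.Page != 0) fields["Page"] = value.Page;
        return fields;
    }

    public static ChapterPagePair Deserialize(Dictionary<string, int> fields)
    {
        // The constructor runs first and sets both members to -1.
        var result = new ChapterPagePair();
        // Only members present in the payload overwrite the constructor values.
        if (fields.TryGetValue("Chapter", out var chapter)) result.Chapter = chapter;
        if (fields.TryGetValue("Page", out var page)) result.Page = page;
        return result;
    }
}
```

Under that assumption, round-tripping `new ChapterPagePair { Chapter = 30, Page = 0 }` writes only `Chapter`, so the deserialized object has Chapter 30 and Page -1. Pairs such as 29/17 survive because neither member is 0.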

Sorry for the trouble.

No worries at all, I'm glad you got it sorted out!