Weird behaviour with NonDet and Stays modality

Smallest reproduction I could build:

module Specs.ConvergenceSpec where

import Control.Monad (guard)
import Data.Either qualified as Either
import Data.Type.Rec (RecF)
import Language.Spectacle
import Test.Hspec (Spec, hspec, it, shouldSatisfy)

main :: IO ()
main = hspec spec

initialStepsLeft :: Word
initialStepsLeft = 4

spec :: Spec
spec = do
  it "model checks specification with stepNegative and stepPositive actions" $ do
    result <-
      modelcheck
        ( specification $
            ConF #stepNegative (takeStep (pure pred)) $
              ConF #stepPositive (takeStep (pure succ)) NilF
        )
    result `shouldSatisfy` Either.isRight
  it "model checks specification with a single takeStep action that can go negative or positive" $ do
    result <-
      modelcheck
        ( specification $
            ConF #takeStep (takeStep (oneOf [pred, succ])) NilF
        )
    result `shouldSatisfy` Either.isRight

specification ::
  RecF (ActionType '["stepsLeft" # Word, "position" # Int]) acts ->
  Specification
    '["stepsLeft" # Word, "position" # Int]
    acts
    '["staysAtZero" # 'Stays]
specification specNext =
  Specification
    { specInit =
        ConF #stepsLeft (pure initialStepsLeft) $
          ConF #position (pure 0) NilF
    , specNext = specNext
    , specProp =
        ConF #staysAtZero atZero NilF
    }

takeStep ::
  Action '["stepsLeft" # Word, "position" # Int] (Int -> Int) ->
  ActionType '["stepsLeft" # Word, "position" # Int] 'WeakFair
takeStep possibleSteps = ActionWF do
  stepsLeft <- plain #stepsLeft
  guard (stepsLeft > 0)
  position <- plain #position
  delta <- possibleSteps
  let newPosition = delta position
  guard (abs newPosition < fromIntegral stepsLeft) -- never go so far that we can't come back
  #position .= pure newPosition
  #stepsLeft .= pure (pred stepsLeft)
  pure True

atZero :: TemporalType '["stepsLeft" # Word, "position" # Int] 'Stays
atZero = PropFG $ ((0 :: Int) ==) <$> plain #position

The spec models a behaviour where we start at 0, there are a finite number of steps to be taken, and we never take a step away from 0 if it would take us to a distance that is too far to go back. We should be able to move around, but eventually, we should end up stuck at position == 0 (with either 1 or 0 steps left, depending on whether the initial number of steps is odd or even).

In the first test case, we have two possible actions: stepNegative tries to decrease the position, stepPositive tries to increase the position. In the second test case, we have a single action takeStep that uses oneOf to decide whether to decrease or increase the position.

I would expect both cases to be equivalent,. However, the first test case passes, while the second fails.

For some reason, I'm only able to reproduce if initialStepsLeft is >= 4.

On the left, the trace I get for the spec with two separate actions; on the right, the trace I get for the spec with a single, non-deterministic action.

Notice the state with the orange rectangles, 0xb3e6be4b (stepsLeft = 1, position = 1). It shows up in both traces, but it only gets "stuck" and repeats itself in the non-deterministic implementation. The same happens with state 0x5c1941b4 (stepsLeft = 1, position = -1).

I've explored a little and found that the "two separate actions" implementation also fails if I run it with initialStepsLeft = 6.
I suspect the real problem here is something involving this line:

spectacle/src/Language/Spectacle/Model.hs

Line 228 in 76f9838

local (over weakFairActions (Set.delete actionWF)) do

I think this might be excluding some traces where the same action is called twice in a row. It also interacts with the queuedActionsAt list in some way I can't really understand. I think it requires that the same state be achieved through several different paths.
Apparently having more labeled actions make this less likely to happen, but it still happens when if the tree gets wide enough.

I would expect both cases to be equivalent,. However, the first test case passes, while the second fails.

For some reason, I'm only able to reproduce if initialStepsLeft is >= 4.

Why do you expect them to be equivalent? When you split the actions into stepNegative/stepPositive you are also placing weak fair constraints on each of the actions. This is not true in the case where the implementation of the action is thread through with (oneOf [..] >>=).

Hi, thanks for replying!

See #47 (comment), I no longer think this is related to NonDet specifically. The two-action model fails when the number of steps is 6, with a similar behaviour in the trace (a branch just hangs where it seems like it should be able to make progress.)

Below I discuss the NonDet thing anyway, more as a learning opportunity for myself 😅 I'm not super experienced in formal methods, my mental model might be mistaken here.

I expect that the weak fair constraint would require the action to eventually be taken if it's available, and that the action is available as long as it has at least 1 possible result.

So in the single-action case I'd expect that weak fairness would require us to eventually attempt takeStep and, if it's possible to go positive or negative, we would do so (in different worlds). I do not expect for the model to stutter forever in a state where it's possible for takeStep to make progress.

I'm not sure if this means "one weak-fair action with two possible results" should always be equivalent to "two weak-fair actions", but in this case it seems to me that the two traces should be equivalent.

Again, thanks for taking the time to respond!

I no longer think this is related to NonDet specifically. The two-action model fails when the number of steps is 6, with a similar behaviour in the trace (a branch just hangs where it seems like it should be able to make progress.)

The weak fairness constraint specified on two distinct action formula should imply that both are interleaved fairly (so that one action or stutter-steps cannot be applied infinitely many times)

the run here is stuttering infinitely, so we never get to satisfy that position reaches and stays as 0. This is also the case if we set the initial value of stepsLeft to be 6.

I really like this test though, so I've transcribed it in #48 modolo hspec. Thanks for taking the time to write it. I'll be sure to double check the specification later on today to make sure I agree with everything the model checker is asserting.

both are interleaved fairly (so that one action or stutter-steps cannot be applied infinitely many times)
...
the run here is stuttering infinitely

Hm, right, but I think I don't understand why this happens here. If we have weak fair actions A and B, and we're in a series of states where B cannot run but A can, we should be able to take multiple A steps, right?

I tried to reproduce the specs in TLA, see below. TLC seems to accept both models. Maybe there's some subtle distinction that I'm missing?

---------------------------- MODULE ConvergenceNonDetAction ----------------------------
EXTENDS TLC, Integers

abs(n) == IF n < 0 THEN -n ELSE n

VARIABLES position, stepsLeft

EventuallyAlwaysAtZero == <>[](position = 0 /\ stepsLeft \in {0,1})

vars == << position, stepsLeft >>

Init == (* Global variables *)
        /\ position = 0
        /\ stepsLeft \in 0..10

TakeStep == /\ stepsLeft > 0
            /\ \E newPosition \in {position - 1, position + 1}:
                 /\ abs(newPosition) < stepsLeft
                 /\ position' = newPosition
                 /\ stepsLeft' = stepsLeft - 1

Next == TakeStep

Spec == /\ Init /\ [][Next]_vars
        /\ WF_vars(TakeStep)

=============================================================================

And here with two weak-fair actions:

---------------------------- MODULE ConvergenceTwoActions ----------------------------
EXTENDS TLC, Integers

abs(n) == IF n < 0 THEN -n ELSE n

VARIABLES position, stepsLeft

EventuallyAlwaysAtZero == <>[](position = 0 /\ stepsLeft \in {0,1})

vars == << position, stepsLeft >>

Init == (* Global variables *)
        /\ position = 0
        /\ stepsLeft \in 0..10

StepNegative == /\ stepsLeft > 0
                /\ \E newPosition \in {position - 1}:
                    /\ abs(newPosition) < stepsLeft
                    /\ position' = newPosition
                    /\ stepsLeft' = stepsLeft - 1

StepPositive == /\ stepsLeft > 0
                /\ \E newPosition \in {position + 1}:
                    /\ abs(newPosition) < stepsLeft
                    /\ position' = newPosition
                    /\ stepsLeft' = stepsLeft - 1

Next == StepNegative \/ StepPositive

Spec == /\ Init /\ [][Next]_vars
        /\ WF_vars(StepNegative)
        /\ WF_vars(StepPositive)

=============================================================================

cfg file:

\* SPECIFICATION definition
SPECIFICATION
Spec
\* PROPERTY definition
PROPERTY
EventuallyAlwaysAtZero

I believe it may just be that the implementation of oneOf is folding over the empty alternative, causing a no-op action to be allowed; however, I still haven't had the time have a look for myself.

If this were true then I believe something along the lines foldr1 ((<|>) . pure) [succ, pred] would remove ... <|> empty and make these two specifications the same. Otherwise there is likely an issue in the model checker, but I don't suspect that to be the case.

if x <|> empty /= x then this seems like it violates some typeclass laws?

@gusbicalho we've come back around to this issue. Thank you for creating it (and sorry for the almost one-year-long delay).

@riz0id and I both agree there's a bug here, we think there might be two. @riz0id wrote a specification that more faithfully matches the TLA+ version you've given us (thanks for that example) and spectacle's exists isn't behaving correctly. We wrote a version that uses oneOf and that produces a trace we expect. However, the model checker doesn't appear to be correctly checking the <>[] (stays) formula and we think we know where the bug is (checkStays is not actually the composition of checkInfinitely and checkAlways!)

I'm taking a stab at fixing it, then we will probably fix the exists problem next.

Notably, we haven't written any specifications for Spectacle that exercised the checkStays model checking function until this issue! So, thanks.

Hi, glad to see that you're going back to work on spectacle!
Back when I open the issue I was evaluating whether to use it for a model at my job, but in the end we end up going with TLA+ due to this bug. I was bummed, because the experience of using spectacle (with all the power of Haskell) was so great.
When you get this fixed, I look forward to giving it another try!