ivanperez-keera/dunai

strange asymmetry in composition of MSF (MaybeT m) a b using <|>

tort opened this issue · 2 comments

tort commented

dunai-0.12.1

We are going to combine the following arrows using <|>

:{
constMEmpty :: MSF (MaybeT Identity) Int Int
constMEmpty = constM empty
constMPure :: MSF (MaybeT Identity) Int Int
constMPure = constM $ pure 3
msf1 :: MSF (MaybeT Identity) Int Int
msf1 = arrM $ \x -> if x <= 5 then pure x else empty
msf2 :: MSF (MaybeT Identity) Int Int
msf2 = arrM $ \x -> if x >= 5 then pure x else empty
:}

The following looks perfectly expectable

λ: embed (constMEmpty <|> constMPure) [1..10]
MaybeT (Identity (Just [3,3,3,3,3,3,3,3,3,3]))
it :: MaybeT Identity [Int]

λ: embed (constMPure <|> constMEmpty) [1..10]
MaybeT (Identity (Just [3,3,3,3,3,3,3,3,3,3]))
it :: MaybeT Identity [Int]

The following is very strange. Why msf1 failure is not covered by never failing alternative?

λ: embed (msf1 <|> constMPure) [1..10]
MaybeT (Identity Nothing)
it :: MaybeT Identity [Int]

λ: embed (constMPure <|> msf1) [1..10]
MaybeT (Identity (Just [3,3,3,3,3,3,3,3,3,3]))
it :: MaybeT Identity [Int]

How come msf2 behaves differently from msf1? They both succeed only on a part of the values range

λ: embed (msf2 <|> constMPure) [1..10]
MaybeT (Identity (Just [3,3,3,3,3,3,3,3,3,3]))
it :: MaybeT Identity [Int]

λ: embed (constMPure <|> msf2) [1..10]
MaybeT (Identity (Just [3,3,3,3,3,3,3,3,3,3]))
it :: MaybeT Identity [Int]

In the same time <|> seems to work correct

λ: fmap fst $ runIdentity $ runMaybeT $ unMSF (msf1 <|> constMPure) 1
Just 1
it :: Maybe Int
(0.01 secs, 902,072 bytes)
λ: fmap fst $ runIdentity $ runMaybeT $ unMSF (msf1 <|> constMPure) 6
Just 3
it :: Maybe Int
(0.01 secs, 902,416 bytes)
λ: fmap fst $ runIdentity $ runMaybeT $ unMSF (constMPure <|> msf1) 1
Just 3
it :: Maybe Int
(0.01 secs, 902,064 bytes)

It's a miracle

λ: embed (msf1 <|> constMPure) [5]
MaybeT (Identity (Just [5]))
it :: MaybeT Identity [Int]
(0.01 secs, 919,656 bytes)
λ: embed (msf1 <|> constMPure) [5..6]
MaybeT (Identity Nothing)
it :: MaybeT Identity [Int]
(0.02 secs, 917,976 bytes)
λ: embed (msf1 <|> constMPure) [6]
MaybeT (Identity (Just [3]))
it :: MaybeT Identity [Int]
(0.01 secs, 919,976 bytes)
λ: embed (msf1 <|> constMPure) [6..7]
MaybeT (Identity (Just [3,3]))
it :: MaybeT Identity [Int]
(0.01 secs, 922,512 bytes)

Looks like an issue.

I can see that this is unexpected. But I think it's not a bug. Let's see how one of your examples evaluates. (I'll simplify the type signature a bit)

-- Plug in the definition of embed
embed (msf1 <|> constMPure) [5..6]
= do
  (b, sf') <- unMSF (msf1 <|> constMPure) 5
  bs       <- embed sf' [6]
  return (b:bs)

-- Plug in the definition of <|> for MSFs.
= do
  (b, sf') <- unMSF msf1 5 `mplus` unMSF constMPure 5
  bs       <- embed sf' [6]
  return (b:bs)

-- Plug in the definition of your examples
= do
  (b, sf') <- unMSF (arrM $ \x -> if x <= 5 then pure x else empty) 5 `mplus` unMSF (constM $ pure 3) 5
  bs       <- embed sf' [6]
  return (b:bs)

-- Expand arrM for both cases
= do
  (b, sf') <- (do b <- if 5 <= 5 then pure 5 else empty; return (b, (arrM $ \x -> if x <= 5 then pure x else empty)))
                    `mplus`
              (do b <- pure 3; return (b, constM $ pure 3))
  bs       <- embed sf' [6]
  return (b:bs)

-- For msf1, we get a Just this time, for constMPure we always get a Just
= do
  (b, sf') <- (do b <- Just 5; return (b, (arrM $ \x -> if x <= 5 then pure x else empty)))
                    `mplus`
              (do b <- Just 3; return (b, constM $ pure 3))
  bs       <- embed sf' [6]
  return (b:bs)

-- Reduce Monad and MonadPlus instances for Maybe
= do
  (b, sf') <- (return (5, (arrM $ \x -> if x <= 5 then pure x else empty)))
                    `mplus`
              (return (3, constM $ pure 3))
  bs       <- embed sf' [6]
  return (b:bs)
= do
  (b, sf') <- (return (5, (arrM $ \x -> if x <= 5 then pure x else empty)))
  bs       <- embed sf' [6]
  return (b:bs)

-- First value is completed. Whooops, where is the second MSF gone??
= do
  bs       <- embed (arrM $ \x -> if x <= 5 then pure x else empty) [6]
  return (5:bs)

-- Let's go on evaluating embed on the first MSF, this is analogous to the earlier steps
= do
  (b, sf') <- unMSF (arrM $ \x -> if x <= 5 then pure x else empty) 6
  bs       <- embed sf' []
  return (5:b:bs)
= do
  (b, sf') <- (do b <- if 6 <= 5 then pure 6 else empty; return (b, (arrM $ \x -> if x <= 5 then pure x else empty)))
  bs       <- embed sf' []
  return (5:b:bs)

-- 6 <= 5 is False, hence everything left collapses to Nothing.
= do
  (b, sf') <- (do b <- Nothing; return (b, (arrM $ \x -> if x <= 5 then pure x else empty)))
  bs       <- embed sf' []
  return (5:b:bs)
= do
  (b, sf') <- Nothing
  bs       <- embed sf' []
  return (5:b:bs)
= Nothing

This is very verbose to read through in detail, so let me try and summarize it in words.

  1. Call unMSF with the first input value on msf1 <|> constMPure.
  2. This returns (up to expansion and aliases) Just (5, msf1) <|> Just (3, constMPure). Yes, the result is the output value and the continuation together, covered in a Maybe.
  3. <|> on Maybe chooses the leftmost value. That is, if there are several branches, they are not explored in parallel, but they are collapsed onto the leftmost branch that is present.
  4. Later, msf1 turns into Nothing, so the whole evaluation becomes Nothing. (dunai doesn't have a concept of "rollback" or "backtrack" to an earlier branch that would have succeeded.)

The takeaways are:

  • Maybe only allows at most one successful path. If there are two paths, the leftmost is chosen, even if it later turns out to be unsuccessful.
  • There is no backtracking in the time direction.

You might see the behaviour you're hoping for by switching Maybe for ListT, but the usual caveat applies: The old-fashioned ListT is not a monad, and it is hard removed from the ecosystem in GHC 9.6 onwards. You might want to use https://hackage.haskell.org/package/free-listt (shameless self-advertisement) or https://hackage.haskell.org/package/list-transformer/ or some other similar package, but you would have to write widthFirst or similar functions to escape the ListT context yourself. Do feel free to open a discussion and ping me there if you have questions how that might work, once you've chosen your monad.

tort commented

Thank you for the answer! Really appreciate the effort you put into the explanation. I managed to notice that <|> evaluates only once, but could not understand what is the meaning of <|> for MSF (MaybeT m).

  • Maybe only allows at most one successful path. If there are two paths, the leftmost is chosen, even if it later turns out to be unsuccessful.

This is exactly the interpretation i was looking for.

  • There is no backtracking in the time direction.

This one i have yet to understand.

You might see the behaviour you're hoping for by switching Maybe for ListT, but the usual caveat applies: The old-fashioned ListT is not a monad, and it is hard removed from the ecosystem in GHC 9.6 onwards. You might want to use https://hackage.haskell.org/package/free-listt (shameless self-advertisement) or https://hackage.haskell.org/package/list-transformer/ or some other similar package, but you would have to write widthFirst or similar functions to escape the ListT context yourself. Do feel free to open a discussion and ping me there if you have questions how that might work, once you've chosen your monad.

Thank you for the hint! As it was easy to build intuition about running independent behaviors with Applicative (MSF m a), i hoped that Alternative (MSF m a) would provide means of choice on the result of an arrow without running all the arrows with subsequent foldr ((<|>) @maybe) Nothing . I'll definitely check the packages you suggested.