gregwebs/Shelly.hs

asyncSh is prone to output corruption

maurges opened this issue · 3 comments

Script to reproduce:

#!/usr/bin/env stack
{- stack runhaskell --resolver=lts-19.7 --system-ghc --package=shelly --package=text --package=async -}
{-# LANGUAGE OverloadedStrings, ExtendedDefaultRules  #-}
{-# OPTIONS_GHC -fno-warn-type-defaults #-}
import Shelly
import qualified Data.Text as T
default (T.Text)

main = shelly $ do
    asyncSh $ echo "a quick brown fox jumped over the lazy dog"
    asyncSh $ echo "a quick brown fox jumped over the lazy dog"
    asyncSh $ echo "a quick brown fox jumped over the lazy dog"
    asyncSh $ echo "a quick brown fox jumped over the lazy dog"
    asyncSh $ echo "a quick brown fox jumped over the lazy dog"
    asyncSh $ echo "a quick brown fox jumped over the lazy dog"
    asyncSh $ echo "a quick brown fox jumped over the lazy dog"
    asyncSh $ echo "a quick brown fox jumped over the lazy dog"
    asyncSh $ echo "a quick brown fox jumped over the lazy dog"
    asyncSh $ echo "a quick brown fox jumped over the lazy dog"
    asyncSh $ echo "a quick brown fox jumped over the lazy dog"

On my macbook pro m1 it prints:

[morj<2>@MacBook-Pro]:~/Projects/temp> ./test.hs
a quick brown fox jumped over thaeaaaaaaaaa      quic[morj<2>@MacBook-Pro]:~/Projects/temp>

It would be great if subprocess output would be buffered line by line, so that at least inside the lines the output is normal.

After digging for a little bit I found that shelly actually does cache the output linewise! The problem is that it's logged via Data.Text.IO.putStrLn, which corrupts it. So this is partly GHC's fault.

Here's how one can force it to not jamble:

#!/usr/bin/env stack
{- stack runhaskell --resolver=lts-19.7 --system-ghc --package=shelly --package=text --package=async -}
{-# LANGUAGE OverloadedStrings, ExtendedDefaultRules  #-}
{-# OPTIONS_GHC -fno-warn-type-defaults #-}
import Shelly
import Control.Monad.IO.Class (MonadIO)
import qualified Data.Text as T
import qualified Data.Text.IO as T
import Control.Concurrent.MVar (newMVar, withMVar)
import Control.Concurrent.Async (wait, Async)
default (T.Text)

main = do
    mutex <- newMVar ()
    let logOutput msg = withMVar mutex $ \() -> T.putStrLn msg
    shelly . log_stdout_with logOutput . log_stderr_with logOutput $ do
        a1 <- asyncSh $ cmd "echo" "1 a quick brown fox jumped over the lazy dog\n"
        a2 <- asyncSh $ cmd "echo" "2 a quick brown fox jumped over the lazy dog"
        a3 <- asyncSh $ cmd "echo" "3 a quick brown fox jumped over the lazy dog"
        a4 <- asyncSh $ cmd "echo" "4 a quick brown fox jumped over the lazy dog"
        a5 <- asyncSh $ cmd "echo" "5 a quick brown fox jumped over the lazy dog"
        a6 <- asyncSh $ cmd "echo" "6 a quick brown fox jumped over the lazy dog"
        a7 <- asyncSh $ cmd "echo" "7 a quick brown fox jumped over the lazy dog"
        a8 <- asyncSh $ cmd "echo" "8 a quick brown fox jumped over the lazy dog"
        a9 <- asyncSh $ cmd "echo" "9 a quick brown fox jumped over the lazy dog"
        a10 <- asyncSh $ cmd "echo" "10 a quick brown fox jumped over the lazy dog"
        a11 <- asyncSh $ cmd "echo" "11 a quick brown fox jumped over the lazy dog"
        l <- awaitAll [a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11]
        echo "done"

awaitAll :: MonadIO m => [Async a] -> m [a]
awaitAll [] = pure []
awaitAll (x:xs) = (:) <$> liftIO (wait x) <*> awaitAll xs

(I also inserted some logic to wait for all output)

But here we run into another problem that shelly's own internal functions (and the 'echo' family) don't respect these log settings and just do putStrLn regardless!

I think this locking behaviour should be the default, and that all internal shelly logs should also be protected with locks.

Thanks for the investigation and the reproducer!

I think this locking behaviour should be the default, and that all internal shelly logs should also be protected with locks.

Would such a change be backwards-compatible?

Ok, what I propose is a change to behaviour like this:

  1. Executing the 'shelly' function creates a new mutex object for all 'Sh' operations.
  2. Every logging operation: echo, inspect, calls to sPutStdout in run_ and foldLines - should lock this mutex for the duration of the putstr call itself.

This does not change the function API. This does slightly change the behaviour of the functions, but I believe that no one relied on the old behaviour anyway.

In more details:

This doesn't change the behaviour of single-threaded scripts, since for them no logs occur simultaneously, and so they can't detect the locks at all. I think this is the most of the shelly users: if more people used multithreaded, someone would have open this issue sooner. =)

This changes the behaviour of multi-threaded scripts: it changes the atomicity of the logging operation. It's not easy to find the exact details on atomic guarantees of putStrLn, but it's either byte-atomic or char-atomic. With locks for the duration of putStrLn, the operation itself becomes atomic, or you might say it's line-atomic since shelly operates in terms of lines. In terms of defined-behaiour, it doesn't change with this addition since interleaving non-atomic operations yields undefined order, which might as well be line-atomic.
It also changes the atomicity of custom sPutStdout/sPutStderr functions: calling to them is now mutually exclusive. It means that if a user expects one log to occur during another log, they may be stuck waiting forever. This might deadlock scripts written with this expectation in mind, but I don't think anyone actually wrote scripts like that.

So, who would be affected by this change? Of those who wrote multithreaded scripts, of those of them who used custom sPutStdout, those who relied on one sPutStdout to occur during another sPutStdout.