kazu-yamamoto/logger

Check-then-act race condition for short log entries in concurrent logging, results in entries being not in order

Closed this issue · 3 comments

Synopsis

Due to check-then-act race condition and not enough synchronization in https://hackage.haskell.org/package/fast-logger-3.1.1/docs/src/System.Log.FastLogger.Logger.html#pushLog in otherwise case, if the log messages are shorter than buffer size and there are multiple threads logging messages at once using a single buffer, messages are not logged in order.

Reproduction conditions

  1. Use threaded runtime, GHC 8.10.7
  2. Create a logger (e.g. newFDLoggerSet) with a single buffer (Just 1) with a big buffer size. Default 4096 should be enough.
  3. Start logging from multiple threads messages significantly shorter than the buffer size. Log the thread identifier t e.g.:
numCapabilities <- getNumCapabilities
let concurrency = numCapabilities * 2500
    logEntriesCount = 100
forConcurrently_ [1 .. concurrency] $ \t ->
  for_ [1 .. logEntriesCount] $ \i ->
    pushLogStrLn loggerSet $ "thread id: " <> (show t) <> " log line nr: " <> (show i)

t is assumed to be a thread identifier in the further discussion (do not confuse with ThreadId).

Expected result

Messages from each thread (with the same thread identifier t) appear in order.

Expected result

There are a few messages in a single thread (the same t) that appear not in order.

Analysis

This is the problematic code fragment of pushLog function linked above:

| otherwise = do
    mmsg <- atomicModifyIORef' ref checkBuf
    case mmsg of
        Nothing  -> return ()
        Just msg -> withMVar mbuf $ \buf -> writeLogStr fdref buf size msg
  where
    checkBuf ologmsg@(LogStr olen _)
      | size < olen + nlen = (nlogmsg, Just ologmsg)
      | otherwise          = (ologmsg <> nlogmsg, Nothing)

if two threads execute atomicModifyIORef' and for both of them checkBuf matches the first guard, the olomsg will get flushed in both cases and we don't have any guarantees about the order in which writeLogStr will be executed for both threads.

OK. I understand.
One approach to fix this is prepare a write queue and a writer thread for each buffer.
Are there anything else?

@ruicc @khibino You might be interested in this issue. In short, atomicModifyRef' is done in order but there is no guarantee to execute writeLogStr in order.

ruicc commented

Ordering is not a big problem for me because log services I'm using (e.g. datadog, bigquery) can handle it.