Yesod is catching Timeout, which breaks the timeout middleware
pbrisbin opened this issue · 10 comments
It seems Yesod is catching the Timeout
exception used by the standard timeout
function and turning it into a 500 instead of letting to get back to timeout
. This means that when we use the Timeout
middleware, instead of getting a 503 as desired, we get see InternalError "<<timeout>>"
come through as a 500.
Steps
Given the following executable:
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
module Main where
import Control.Concurrent (threadDelay)
import Data.Maybe (fromMaybe)
import Data.Text (Text)
import Network.HTTP.Types.Status (status200)
import Network.Wai (responseLBS)
import Network.Wai.Handler.Warp as Warp
import Network.Wai.Middleware.Timeout as Middleware
import System.Environment (lookupEnv)
import Text.Read (readMaybe)
import Yesod hiding (respond)
data App = App
mkYesod "App" [parseRoutes|
/ RootR GET
|]
getRootR :: Handler Text
getRootR = do
liftIO $ threadDelay $ 3 * 1000000
pure "Hello word\n"
instance Yesod App
simpleApp :: Application
simpleApp _req respond = do
threadDelay $ 3 * 1000000
respond $ responseLBS status200 [] "Hello world\n"
main :: IO ()
main = do
app <- do
x <- lookupEnv "YESOD"
if x == Just "1" then toWaiAppPlain App else pure simpleApp
port <- fromMaybe 3000 . (readMaybe =<<) <$> lookupEnv "PORT"
Warp.run port $ Middleware.timeout 2 app
Expected behavior
When run using the simpleApp
, it does as expected (a 503):
stack exec -- timeout-example
% curl -D - localhost:3000
HTTP/1.1 503 Service Unavailable
Transfer-Encoding: chunked
Date: Mon, 27 Jun 2022 20:35:32 GMT
Server: Warp/3.3.21
Actual behavior
When run using App
with yesod, the Timeout
turns into a 500:
% YESOD=1 stack exec -- timeout-example
27/Jun/2022:16:40:38 -0400 [Error#yesod-core] <<timeout>> @(yesod-core-1.6.23.1-BngrWz5NDXK4xc2KyG7z7M:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:691:6)
% curl -D - localhost:3000
HTTP/1.1 500 Internal Server Error
Date: Mon, 27 Jun 2022 20:40:38 GMT
Server: Warp/3.3.21
Content-Length: 157
Content-Type: text/html; charset=utf-8
Set-Cookie: _SESSION=vYgezGHwk6Qauaj0RdmWc3XmIZh0HaWbZ7phPFzTHrdoLpzE/UrxshoB+rp1pgMbzE5PUMWbb65dIfdCY30rYn9QWkh21VocKCGttI0FAgzT1lH9F4Owj2sWZ/PqFfBq0Daylw0SHfGt230IZZ18oT5je1RR1CGjD00WkZlY3KrUivN0y2E=; Path=/; Expires=Mon, 27-Jun-2022 22:40:36 GMT; HttpOnly
Vary: Accept, Accept-Language
X-XSS-Protection: 1; mode=block
<!DOCTYPE html>
<html><head><title>Internal Server Error</title></head><body><h1>Internal Server Error</h1>
<pre><<timeout>></pre>
</body></html>%
Versions
The above does not reproduce in lts-19.0, and appears in lts-19.1. From that, I assumed the bug is in yesod-core-1.6.22.0
. I eventually found this code, which appears kind of intentional.
Yes this indeed looks intentional, #1753
it's better if yesod handles [async exceptions] so that we get a nice rendered error page.
I think Timeout
was an oversight, but I agree with this benefit generally. I'm not sure if we can have it both ways?
I've found another circumstance in which this has broken our error-handling/reporting:
We see a lot of InvalidRequest Warp: Client closed connection prematurely
noise in our logs starting with this update. Warp has always considered this a "normal" exception not worth reporting (and I agree),
By Yesod taking over and not letting it get to Warp, the defaultShouldDisplayException
behavior (which we leveraged to keep this noise out of our logs) no longer works.
Only thing I can think of is adding a specific list of exceptions inside Yesod that it should not catch and should instead propagate.
Well, I've been thinking about this quite a bit. If it were up to me, the PR should be reverted based on the advice in your blog post on the matter:
[W]ith asynchronous exceptions, you never want to recover ... If you swallow those exceptions ... you break the very nature of the async exception mechanism. [W]ith async exceptions, you are allowed to clean up, but never recover.
I'd considering rescuing to the framework's error page a form of "recovery". And I think violating that rule will continue to run us into edge-case trouble. Timeout
is just the canary in the coal mine.
Your original justification for this fix was,
I can't think of a good reason to let async exceptions propagate in this kind of case, since ultimately Warp will catch and (poorly) handle those anyway
I think Timeout
is a "good reason" and I've also mentioned that we actually want Warp to handle them because it owns more knowledge about when things can be safely ignored. If you find Warp's handling "poor", you can modify onException
. I'd dig into the user-requirement that async exceptions render "my Yesod error page" and see if maybe "a better error page" is acceptable.
Correct me if I'm wrong, but I think the original Issue (#1751) could also be solved in user code through Yesod middleware,
instance Yesod App where
yesodMiddleware h =
defaultYesodMiddleware h `catchSyncOrAsync` \ex ->
-- render your 500 page
We could even supply an out of the box middleware for it, and even include it in defaultYesodMiddleware
. That seems like a relatively elegant way to make this an opt-in/out for users.
That's just my 2c though. A safe-list approach would certainly solve the Timeout
case, which seems like the most pressing "bug".
I see now the original issue already mentioned it's going against the blog post's advice, and the complexity with re-throwing AllocationLimitExceeded
in particular means my middleware suggestion wouldn't work and they'd end up with the workaround they were already doing anyway. So I guess I'm not really raising anything new here, apologies.
I honestly don't feel great about any of this. Pinging @jappeace. @pbrisbin is raising some valid concerns with the previous async exception behavior change. Here's how I'd summarize the situation:
- Generally, catching an async exception is a bad thing
- However, in the web server context, that's not true. Since each connection is handled by its own green thread, catching those async exceptions and turning them into valid responses to the client makes all the sense in the world.
- The issue here is: who should do the catching?
I can't really say that either is "more correct" than the other. A configuration option to determine which async exceptions (if any) should be caught by Yesod is the best middle ground I can think of.
Thanks for the summary Michael!
in the web server context, that's not true. ... catching those async exceptions and turning them into valid responses to the client makes all the sense in the world.
FWIW, I think we should be careful to distinguish Yesod as the web framework and Warp as the web server. I agree with you that the server bounding the request thread in error handling and ensuring valid client responses makes complete sense. Which is why I consider the framework taking that ability away from the server an over-reach.
That said,
A configuration option to determine which async exceptions (if any) should be caught by Yesod
Would absolutely work for us.
Making it configurable would also work for us.
I think we don't want to catch the connection closed exception (doesn't matter since the connection is closed after all), but we do want to catch the timeout one.
Cool. Would one of you be interested in submitting a PR with this kind of change?
yes, I"ll try doing this wensday