Handling panics & timeouts
bkolobara opened this issue · 3 comments
Currently the process handling the request (middleware & handler) is the one holding the `TcpStream`, so if it fails (panics) the browser will not get a response.
We should introduce a "supervisor" process. This process wouldn't be of the `Supervisor` type as it wouldn't have generic behaviour.

- I would spawn it as an `AbstractProcess`.
- It should set `host::api::process::die_when_link_dies(1)` in the `init` method and spawn a linked sub-process.
- Both the sub-process (handler) and the "supervisor" should hold onto the `TcpStream`.
- The sub-process should use the stream to parse the incoming request data.
- Once the `Response` data is available, it should send it as a message to the supervisor.
- The supervisor should then write the `Response` to the `TcpStream`.
- If at any time the sub-process fails (the link breaks and `handle_link_trapped` gets called), the supervisor should return a `500 Internal Server Error`.
- We could also use the newly added `send_after` function, so that the supervisor can send a timeout message to itself. If the sub-process doesn't finish in, let's say, 60 seconds, the supervisor should write a `Request Timed Out` response back to the browser and panic, so it kills the linked sub-process too.
- We could also use this mechanism to put memory and compute limitations on the sub-process by spawning it with a specific configuration (`ProcessConfig`). A sketch of the whole flow follows below.
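A minimal sketch of that flow, with std threads and an mpsc channel standing in for lunatic's linked processes and mailbox, and `recv_timeout` standing in for the `send_after` self-message; `handle_request` is a hypothetical placeholder for the middleware/handler chain, not part of any existing API:

```rust
use std::io::Write;
use std::net::{TcpListener, TcpStream};
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for the middleware & handler chain: it would parse
// the request from the stream and produce the response body.
fn handle_request(stream: &TcpStream) -> Vec<u8> {
    let _ = stream; // request parsing elided
    b"hello".to_vec()
}

// "Supervisor": the only writer to the TcpStream. The handler runs detached
// (standing in for a linked sub-process) and sends the finished response
// back as a message.
fn supervise(mut stream: TcpStream) {
    let (tx, rx) = mpsc::channel::<Vec<u8>>();
    let reader = stream.try_clone().expect("both sides hold the stream");
    thread::spawn(move || {
        // If this panics, `tx` is dropped and `rx` reports a disconnect,
        // which plays the role of the broken link / handle_link_trapped.
        let _ = tx.send(handle_request(&reader));
    });

    // recv_timeout stands in for the send_after timeout message.
    match rx.recv_timeout(Duration::from_secs(60)) {
        Ok(body) => {
            let _ = write!(
                stream,
                "HTTP/1.1 200 OK\r\nContent-Length: {}\r\n\r\n",
                body.len()
            );
            let _ = stream.write_all(&body);
        }
        Err(mpsc::RecvTimeoutError::Timeout) => {
            let _ = stream.write_all(b"HTTP/1.1 408 Request Timeout\r\n\r\n");
            // A lunatic supervisor would panic here so the link also kills
            // the sub-process; a plain std thread can't be killed this way.
        }
        Err(mpsc::RecvTimeoutError::Disconnected) => {
            let _ = stream.write_all(b"HTTP/1.1 500 Internal Server Error\r\n\r\n");
        }
    }
}

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:3000")?;
    for stream in listener.incoming() {
        let stream = stream?;
        thread::spawn(move || supervise(stream));
    }
    Ok(())
}
```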
> Once the `Response` data is available, it should send it as a message to the supervisor.
I think sending back the response, even encoded as `Vec<u8>`, is quite costly in terms of performance, right? Unless we add something like Erlang's immutable binary data to just pass around pointers to large data, I think it's better if the handler writes the response.
> I think sending back the response, even encoded as `Vec<u8>`, is quite costly in terms of performance, right? Unless we add something like Erlang's immutable binary data to just pass around pointers to large data, I think it's better if the handler writes the response.
I was thinking the same, but it would not work in practice to have two writers. For example, say the supervisor has a timeout of 60 seconds: it sends itself a message using `send_after` that is delayed for 60 seconds. The sub-process might have already started writing something to the `TcpStream` when the supervisor suddenly writes a `408 Request Timed Out` into the same stream, and the two get mixed together. The response needs to stay atomic and be answered by only one process.
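To make the hazard concrete, here is a toy illustration of two uncoordinated writers (std threads and a `Vec<u8>` standing in for two lunatic processes sharing one `TcpStream`); depending on scheduling, the supervisor's 408 lands in the middle of the handler's half-written 200:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Shared "stream" (a Vec<u8> standing in for the TcpStream).
    let stream = Arc::new(Mutex::new(Vec::new()));

    // Handler: writes its 200 response in several chunks.
    let s = Arc::clone(&stream);
    let handler = thread::spawn(move || {
        for chunk in ["HTTP/1.1 200 OK\r\n", "\r\n", "hello"] {
            s.lock().unwrap().extend_from_slice(chunk.as_bytes());
            thread::yield_now(); // let the other writer cut in between chunks
        }
    });

    // Supervisor: its timeout fires mid-response and it writes its own reply.
    stream
        .lock()
        .unwrap()
        .extend_from_slice(b"HTTP/1.1 408 Request Timeout\r\n\r\n");

    handler.join().unwrap();
    // Depending on scheduling, the 408 is spliced into the middle of the 200.
    println!("{}", String::from_utf8_lossy(&stream.lock().unwrap()));
}
```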
As you mentioned, in the future we will need a way to share bigger buffers with low overhead in lunatic. So it's a problem we will need to solve once this becomes a bottleneck, but I think we need to solve it in a more general way and not just as part of this web framework. I already have some ideas on how to do this; they could be combined with vectored I/O and avoiding serialisation, and it would be completely safe to do so in scenarios like this, where one process is "ending" and giving up control over its linear memory to another.
Philpax was doing something similar in the past, and it turns out that for a 4 MB response it only takes 4 ms.
> Philpax — 05/23/2022
> Switched over to `serde_bytes` and it's much better now, thank you!
> 4 milliseconds, 4743875 bytes
> might be nice to have that in a FAQ somewhere
I would just go with a simple solution for now, before we start optimising for message size.
Yeah, having just a single writer is of course better. But I was thinking maybe the supervisor could receive a `start_writing` request and "allow" the handler to write. That being said, if it's only 4 ms for a 4 MB response, we can just send a message to the supervisor, and that should not be a problem for a long time. If/when we later optimise sharing large data in the VM, this problem will cease to exist.
We should, however, look into using `serde_bytes` for encoding the data for this.
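For reference, `serde_bytes` only needs a field attribute; a minimal sketch, where `ResponseMsg` is a hypothetical message type for the handler-to-supervisor response:

```rust
use serde::{Deserialize, Serialize};

// `ResponseMsg` is a hypothetical handler-to-supervisor message type.
// The attribute makes serializers treat the buffer as one contiguous byte
// string instead of a sequence of individual u8 values, which is what
// makes large payloads like Philpax's 4 MB response cheap to encode.
#[derive(Serialize, Deserialize)]
struct ResponseMsg {
    status: u16,
    #[serde(with = "serde_bytes")]
    body: Vec<u8>,
}
```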