unikernels crash on receipt of malformed TCP options
Closed this issue · 8 comments
Unikernels running with the direct
stack crash upon receipt of a TCP packet that contains a variable-length TCP option, when the length is set to 0. For example, setting the TCP options bytefield to 01 02 00 00 (padding, MSS with length 0, end-of-options) and sending a SYN packet to an open port on a unikernel running the example services
code results in the following console output:
mm.c: Cannot handle page request order 8!
Fatal error: out of memory.
exit: 2
kernel.c: Do_exit called!
The unikernel then exits.
I'm looking into the root cause, which I expect is somewhere in Options.unmarshal
.
This is a really interesting bug. A quick code review in mirage-tcpip
reveals this:
let unmarshal buf =
let open Cstruct in
let i = iter
(fun buf ->
match get_uint8 buf 0 with
|0 -> None (* EOF *)
|1 -> Some 1 (* NOP *)
|n -> Some (get_uint8 buf 1)
)
(fun buf ->
match get_uint8 buf 0 with
|0 -> assert false
|1 -> Noop
|2 -> MSS (BE.get_uint16 buf 2)
We don't respect the length of the MSS field here, and so assume that we can read a length of 2 bytes from the Cstruct. Ordinarily, this should result in a bounds check failure since the Cstruct view has a smaller size. So why didn't we get a bounds check failure in this case?
The BE.get_uint16
comes from the ocplib-endian library, which does optimized conversion to-and-from Bigarrays and OCaml integers. In 4.00.1, it runs normal OCaml code, but in 4.01.0, compiler-builtin functions are provided which can do this parsing without allocating a lot of intermediate values. My hypothesis (which needs to be confirmed) is that in OCaml 4.01.0, the bounds checking is not performed, resulting in junk memory being read as a 16-bit integer. We then attempt to allocate a buffer of this (potentially very large) MSS to send traffic in, resulting in the VM running out of memory and crashing.
This is a relatively benign bug (without type safety, it could easily be possible to corrupt memory allocator data structures, but is "just" a denial of service as-is), but it's extremely bad practise to read memory locations that we should not have access to.
Things it would be good to verify and/or fix in the code:
- In OCaml 4.00.1, does this result in a bounds check exception?
- Why don't we enforce a maximum sensible MSS size in
options.ml
? - This would be an excellent regression test in
cstruct
to check ifBE.get_*
functions do respect the bounds of cstruct views.
The actual packet used in this report isn't a valid TCP packet, so I'm not too worried about just rejecting it. Ideally without a crash :-)
+1*several billion
concur this is an excellent case for a unit/regression test in cstruct
-- this kind of behaviour is part of the contract between developer and library/api in that case, and certainly should be ensured by all means necessary...
On 07/02/2014 07:56 AM, Anil Madhavapeddy wrote:
This is a really interesting bug. A quick code review in
|mirage-tcpip| reveals this:[snip]
Things it would be good to verify and/or fix in the code:
- In OCaml 4.00.1, does this result in a bounds check exception?
- Why don't we enforce a maximum sensible MSS size in |options.ml|?
- This would be an excellent regression test in |cstruct| to check
if |BE.get_*| functions do respect the bounds of cstruct views.
I have a fix for the length checking in Options.ml - in implementing
this I noticed that other options also need length (and sanity) checking
and got a little ratholed. PR coming soon.
I'm checking now to see whether this behavior occurs with a 4.00.1 compiler.
The actual packet used in this report isn't a valid TCP packet, so I'm
not too worried about just rejecting it. Ideally without a crash :-)I concur!
-Mindy
A unikernel compiled in a 4.00.1 environment (via opam switch
) also crashes on receipt of this packet with
mm.c: Cannot handle page request order 8!
Fatal error: out of memory.
exit: 2
kernel.c: Do_exit called!
@avsm our use of cstruct is encapsulated in try .. with (which then produces a monadic fail -- thus we don't do length checks in code, but we catch the exceptions https://github.com/mirleft/ocaml-tls/blob/master/lib/reader.mli and https://github.com/mirleft/ocaml-tls/blob/master/lib/reader.ml#L21 )
The bug is due to the Cstruct.BE and LE modules not checking that the length of their view has been exceeded. There's a bound check error if the overall buffer is violated. In the case here, we are reading into the rest of the tcp packet
On 2 Jul 2014, at 18:40, Mindy Preston notifications@github.com wrote:
A unikernel compiled in a 4.00.1 environment (via opam switch) also crashes on receipt of this packet with
mm.c: Cannot handle page request order 8!
Fatal error: out of memory.
exit: 2
kernel.c: Do_exit called!
—
Reply to this email directly or view it on GitHub.
Started fixing the underlying issue in mirage/ocaml-cstruct#25
Cstruct 1.3.0 is now in ocaml/opam-repository#2322