mirage/mirage-tcpip

unikernels crash on receipt of malformed TCP options

Closed this issue · 8 comments

Unikernels running with the direct stack crash upon receipt of a TCP packet that contains a variable-length TCP option, when the length is set to 0. For example, setting the TCP options bytefield to 01 02 00 00 (padding, MSS with length 0, end-of-options) and sending a SYN packet to an open port on a unikernel running the example services code results in the following console output:

mm.c: Cannot handle page request order 8!
Fatal error: out of memory.
exit: 2
kernel.c: Do_exit called!

The unikernel then exits.

I'm looking into the root cause, which I expect is somewhere in Options.unmarshal.

avsm commented

This is a really interesting bug. A quick code review in mirage-tcpip reveals this:

let unmarshal buf =
  let open Cstruct in
  let i = iter
    (fun buf ->
      match get_uint8 buf 0 with
      |0 -> None   (* EOF *)
      |1 -> Some 1 (* NOP *)
      |n -> Some (get_uint8 buf 1)
    )
    (fun buf ->
      match get_uint8 buf 0 with
      |0 -> assert false
      |1 -> Noop
      |2 -> MSS (BE.get_uint16 buf 2)

We don't respect the length of the MSS field here, and so assume that we can read a length of 2 bytes from the Cstruct. Ordinarily, this should result in a bounds check failure since the Cstruct view has a smaller size. So why didn't we get a bounds check failure in this case?

The BE.get_uint16 comes from the ocplib-endian library, which does optimized conversion to-and-from Bigarrays and OCaml integers. In 4.00.1, it runs normal OCaml code, but in 4.01.0, compiler-builtin functions are provided which can do this parsing without allocating a lot of intermediate values. My hypothesis (which needs to be confirmed) is that in OCaml 4.01.0, the bounds checking is not performed, resulting in junk memory being read as a 16-bit integer. We then attempt to allocate a buffer of this (potentially very large) MSS to send traffic in, resulting in the VM running out of memory and crashing.

This is a relatively benign bug (without type safety, it could easily be possible to corrupt memory allocator data structures, but is "just" a denial of service as-is), but it's extremely bad practise to read memory locations that we should not have access to.

Things it would be good to verify and/or fix in the code:

  • In OCaml 4.00.1, does this result in a bounds check exception?
  • Why don't we enforce a maximum sensible MSS size in options.ml?
  • This would be an excellent regression test in cstruct to check if BE.get_* functions do respect the bounds of cstruct views.

The actual packet used in this report isn't a valid TCP packet, so I'm not too worried about just rejecting it. Ideally without a crash :-)

mor1 commented

+1*several billion

concur this is an excellent case for a unit/regression test in cstruct -- this kind of behaviour is part of the contract between developer and library/api in that case, and certainly should be ensured by all means necessary...

On 07/02/2014 07:56 AM, Anil Madhavapeddy wrote:

This is a really interesting bug. A quick code review in
|mirage-tcpip| reveals this:

[snip]

Things it would be good to verify and/or fix in the code:

  • In OCaml 4.00.1, does this result in a bounds check exception?
  • Why don't we enforce a maximum sensible MSS size in |options.ml|?
  • This would be an excellent regression test in |cstruct| to check
    if |BE.get_*| functions do respect the bounds of cstruct views.

I have a fix for the length checking in Options.ml - in implementing
this I noticed that other options also need length (and sanity) checking
and got a little ratholed. PR coming soon.

I'm checking now to see whether this behavior occurs with a 4.00.1 compiler.

The actual packet used in this report isn't a valid TCP packet, so I'm
not too worried about just rejecting it. Ideally without a crash :-)

I concur!

-Mindy

A unikernel compiled in a 4.00.1 environment (via opam switch) also crashes on receipt of this packet with

mm.c: Cannot handle page request order 8!
Fatal error: out of memory.
exit: 2
kernel.c: Do_exit called!

@avsm our use of cstruct is encapsulated in try .. with (which then produces a monadic fail -- thus we don't do length checks in code, but we catch the exceptions https://github.com/mirleft/ocaml-tls/blob/master/lib/reader.mli and https://github.com/mirleft/ocaml-tls/blob/master/lib/reader.ml#L21 )

avsm commented

The bug is due to the Cstruct.BE and LE modules not checking that the length of their view has been exceeded. There's a bound check error if the overall buffer is violated. In the case here, we are reading into the rest of the tcp packet

On 2 Jul 2014, at 18:40, Mindy Preston notifications@github.com wrote:

A unikernel compiled in a 4.00.1 environment (via opam switch) also crashes on receipt of this packet with

mm.c: Cannot handle page request order 8!
Fatal error: out of memory.
exit: 2
kernel.c: Do_exit called!

Reply to this email directly or view it on GitHub.

avsm commented

Started fixing the underlying issue in mirage/ocaml-cstruct#25

avsm commented

Cstruct 1.3.0 is now in ocaml/opam-repository#2322