google/flatbuffers

Rust: various unsafe traits/functions meant for gencode in public API

TethysSvensson opened this issue ยท 28 comments

The following safe code will segfault:

fn main() {
    let ptr: &&u8 = flatbuffers::follow_cast_ref(&[1, 2, 3, 4, 5, 6, 7, 8], 0);
    println!("{}", **ptr);
}

Thanks for documenting these. I'm going to close the other issues and combine them with this one:

  • #6628 - Follow<'a> for &'a str
  • #6629 - Follow for slices
  • #6630 - SafeSliceAccess should be unsafe.
  • #6631 - emplace_scalar_array
  • #6632 - Push for scalars

For most of these, since these are typically used within the Flatbuffers library functions or generated code and I think the usual usage is safe, I'm not going to make fixing this the highest priority. The safety model we're going for is that flatbuffers are constructed with a FlatBufferBuilder and read from the root_* variants -- those APIs must be safe. We could mark the above types/traits them as unsafe to be rigorous, and/or move them to a module like flatbuffers::for_gencode_only which will make their internal use more obvious (though obviously that'd still be sub optimal).

I do want to deprecate SafeSliceAccess, though. That one was an especially bad idea, and getting unaligned access is in the public API (sigh).

Let's try to fix this for the 3.0 release #6636

This is quite serious:

namespace MyGame.Sample;

enum Color:byte { Red = 0, Green, Blue = 2 }

is compiled to

// automatically generated by the FlatBuffers compiler, do not modify



use std::mem;
use std::cmp::Ordering;

extern crate flatbuffers;
use self::flatbuffers::{EndianScalar, Follow};

#[allow(unused_imports, dead_code)]
pub mod my_game {

  use std::mem;
  use std::cmp::Ordering;

  extern crate flatbuffers;
  use self::flatbuffers::{EndianScalar, Follow};
#[allow(unused_imports, dead_code)]
pub mod sample {

  use std::mem;
  use std::cmp::Ordering;

  extern crate flatbuffers;
  use self::flatbuffers::{EndianScalar, Follow};

#[deprecated(since = "2.0.0", note = "Use associated constants instead. This will no longer be generated in 2021.")]
pub const ENUM_MIN_COLOR: i8 = 0;
#[deprecated(since = "2.0.0", note = "Use associated constants instead. This will no longer be generated in 2021.")]
pub const ENUM_MAX_COLOR: i8 = 2;
#[deprecated(since = "2.0.0", note = "Use associated constants instead. This will no longer be generated in 2021.")]
#[allow(non_camel_case_types)]
pub const ENUM_VALUES_COLOR: [Color; 3] = [
  Color::Red,
  Color::Green,
  Color::Blue,
];

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Default)]
#[repr(transparent)]
pub struct Color(pub i8);
#[allow(non_upper_case_globals)]
impl Color {
  pub const Red: Self = Self(0);
  pub const Green: Self = Self(1);
  pub const Blue: Self = Self(2);

  pub const ENUM_MIN: i8 = 0;
  pub const ENUM_MAX: i8 = 2;
  pub const ENUM_VALUES: &'static [Self] = &[
    Self::Red,
    Self::Green,
    Self::Blue,
  ];
  /// Returns the variant's name or "" if unknown.
  pub fn variant_name(self) -> Option<&'static str> {
    match self {
      Self::Red => Some("Red"),
      Self::Green => Some("Green"),
      Self::Blue => Some("Blue"),
      _ => None,
    }
  }
}
// ...
impl<'a> flatbuffers::Follow<'a> for Color {
  type Inner = Self;
  #[inline]
  fn follow(buf: &'a [u8], loc: usize) -> Self::Inner {
    let b = unsafe {
      flatbuffers::read_scalar_at::<i8>(buf, loc)
    };
    Self(b)
  }
}

impl flatbuffers::Push for Color {
    type Output = Color;
    #[inline]
    fn push(&self, dst: &mut [u8], _rest: &[u8]) {
        unsafe { flatbuffers::emplace_scalar::<i8>(dst, self.0); }
    }
}
// ...

which allows both read and write out of bounds in safe code. Specifically, the following
tests do not pass miri:

mod generated;

#[cfg(test)]
mod test {
    use super::*;

    use flatbuffers::Follow;
    use flatbuffers::Push;

    #[test]
    fn write() {
        let color = generated::Color::Blue;
        color.push(&mut [], &[]);
    }

    #[test]
    fn read() {
        generated::Color::follow(&[1], 1);
    }
}

and constitute CWE-787 and CWE-125 respectively, ranking 1st and 3rd in the most dangerous vulnerabilities at https://cwe.mitre.org/top25/archive/2021/2021_cwe_top25.html.

Yep a bunch of the traits, especially Follow, are unsafe and should be refactored as such. Using it outside the generated code is not intended and a great way of shooting yourself in the foot. @aardappel, I'm not going to be available to contribute for a while, but marking Follow and friends stuff unsafe is an easy first contribution if you find anyone willing.

Not sure what rustaceans are around to take this on..
@rw

Is this all about marking those traits and their methods unsafe? If so, I can probably take this.

Yes and please do!

I started by marking Follow::follow as unsafe in #6951

rw commented

@CasperN WDYT about making "follow" and "push" checked and/or typesafe, so that users can't read or write invalid data? I'd like to avoid turning everything in the library into unsafe when we could possibly solve the problem in a more Rust-y way.

@rw @CasperN in the mcve provided here, the vulnerabilities come from read_scalar and emplace_scalar which are not doing any bound checks, probably for performance reasons.

One thing we could do is add a debug assert so that at least it would panic in debug mode. That doesn't help much with release mode though but I see this as a similar problem as integer over/under-flow which are disabled in release mode because the performance impact is too big.

I haven't looked into the generated code too much, but if we are confident it is sound and safe, instead of exposing those traits directly to user, we could generate a sealed trait to only allow the generated code to call push and follow.

One thing we could do is add a debug assert so that at least it would panic in debug mode. That doesn't help much with release mode though but I see this as a similar problem as integer over/under-flow which are disabled in release mode because the performance impact is too big.

Sounds reasonable to me

I haven't looked into the generated code too much, but if we are confident it is sound and safe, instead of exposing those traits directly to user, we could generate a sealed trait to only allow the generated code to call push and follow.

I'd say we're fairly confident but there's always more we can do. Specifically the fuzzing story could be improved: cargo fuzz is a coverage guided fuzzer which, unlike quickcheck, mutates examples to optimize for code coverage, which is totally necessary given the number of branches in monster_test. We may even need custom mutators to make fuzzing efficient. Having that running continuously with ASAN would make me a lot more confident in our safety.

we could generate a sealed trait to only allow the generated code to call push and follow.

I thought about sealed traits too... I'm not sure it will work because the generated code is part of the client's crate, not our own. I guess we could call the trait GeneratedCodeOnlyVerySubtleSeriouslyDontDoItYourself or something, but I don't think there can be compiler enforcement.

WDYT about making "follow" and "push" checked and/or typesafe, so that users can't read or write invalid data? I'd like to avoid turning everything in the library into unsafe when we could possibly solve the problem in a more Rust-y way.

I feel like we've gone back and forth on this discussion a few times. Originally, I advocated for bounds checks but there was pushback for performance reasons. Making follow and friends unsafe is the natural consequence to removing checks. I actually think this is okay, given the existence of the verifier.

When we were working on the verifier I think there were a few ways of thinking about reading data from a flatbuffer

  1. read it without checks, but give users the option to verify first, and opting out requires unsafe
  2. read it with checks and panic
  3. read it with checks and wrap in a result type
  4. (there were actually a few more even more sophisticated solutions, which I loved but decided were impractical1)

(2) was not great as the checks add unnecessary overhead to every field access, after verification. (3) was probably the most rust-y solution but there'd be a lot of result type propagation which was deemed unergonomic and also repeated access to the same field will incur the overhead of checks. Ultimately, we went with (1) because its the most similar to the C++ API, incurred the least complexity both for maintenance and usability, while solving most of the problem. See #6161 for the full discussion.

Footnotes

  1. We could have 2 kinds of tables, verified and unverified; and 3 accessors for each field in the unverified table; foo (which checks and panics), try_foo (result typed) and unchecked_foo (unsafe/fast). There's also ideas around of deep vs lazy verification of object trees, only verifying subtrees... We can get arbitrarily intricate here โ†ฉ

Am I right that now reading an untrusted Flatbuffer input with Rust can lead to some segfaults and now there is no way to read values in some checked way? I mean some "Buffer verifier" functionality in the Rusty Flatbuffer implementation.

EDITED: Just read this #6269 and I am a little bit disappointed because seems like the verifier is implemented but the official Flatbuffer page disagrees with me.

@CasperN @rw maybe you can clarify my question above? (sorry for the explicit ping)

@zamazan4ik

Am I right that now reading an untrusted Flatbuffer input with Rust can lead to some segfaults and now there is no way to read values in some checked way?

No, because we do have a verifier and you need unsafe to opt out of it. Though if you do opt out I think you can get an out-of-bound read. Sorry about the misleading docs, my bad, I think I was last to touch them.

Fwiw, if popular demand in Rust is to add checks and take the performance hit, even with the verifier, I don't mind going in that direction.

No, because we do have a verifier and you need unsafe to opt out of it. Though if you do opt out I think you can get an out-of-bound read. Sorry about the misleading docs, my bad, I think I was last to touch them.

Thanks for the clarification. Would be awesome if you will fix the documentation too :)

Fwiw, if popular demand in Rust is to add checks and take the performance hit, even with the verifier, I don't mind going in that direction.

Not sure about other people's use cases. For me is enough that flatbuffers with the verifier don't segfault/panic.

Fwiw, if popular demand in Rust is to add checks and take the performance hit, even with the verifier, I don't mind going in that direction.

Essentially, I would love to see flatbuffers as requiring the user to explicitly call a safe runtime checked cast to a verified table, or an unsafe zero-cost cast to a verified table. Then all the other accessors in the generated code get to assume the table was already verified and omit the runtime cost, because it is an invariant on the table type itself.

Looking at the api of Follow, i don't think there is a way for that low level offset-based api to be safe and not bounds checked. So yes I'm in the camp of "mark it unsafe and don't bounds check". The generated accessor code that calls into Follow will be marked safe and also will be zero cost, because those accessors would assume pre-verified tables, and my thought is that flatbuffers 3.0 would make it so that its impossible to get such a table without safely calling the verifier to promote the slice to the table via root(), or unsafely casting the slice to the table with root_unchecked().

I've submitted #7518 which I think eliminates any unsoundness, but I'm not intimately familiar with this crate and so would appreciate any additional eyes on it, assuming this is the path we want to take ๐Ÿ˜„

@tustvold Trying to fix any soundness issue is probably a bit of a high bar. ๐Ÿ˜‰

There are still bugs in the parser like #7007.

a bit of a high bar

Eh... Nothing like aiming high ๐Ÿ˜„ I think it would eliminate any known practical unsoundness issues, or am I missing something?

I still think there are a lot of ways to bypass this. For instance this one:

In schema.fbs:

table Foo {
  x: uint32;
}

In main.rs:

mod schema_generated;

fn main() {
    #[rustfmt::skip]
    let root = flatbuffers::root::<schema_generated::Foo>(&[
        12, 0, 0, 0, // offset to root
        0, 0, // padding
        // <vtable>
        6, 0, // vtable size
        8, 0, // object size
        4, 0, // offset for field x
        // </vtable>
        // <object>
        6, 0, 0, 0, // (negative) offset for vtable
        2, 0, 0, 0,
        // </object>
        0xc3, 0x28, // extra stuff after object
    ])
    .unwrap();
    // re-interpret the uint32 field as a string field (without offset)
    // this calls from_utf8_unchecked internally
    let field = root._tab.get::<&str>(4, None).unwrap();

    // this should never be possible with a valid string
    assert!(std::str::from_utf8(field.as_bytes()).is_err());
}

Disappointing, I presumed the verifier was doing utf8 validation. Fortunately that is easy enough to add ๐Ÿ˜„

@tustvold It is, but the schema does not contain a string -- and the get function allows re-interpreting the data.

Aah, awesome, was sure I'd missed some methods, thanks for pointing it. Will do another audit tomorrow morning ๐Ÿ‘

Personally I think a lot of these issues would be solved by doing late validation instead of using the verifier. That would also be faster, because you don't have to go over the data twice.

But even if you don't want this, I think re-implementing read_scalar_at as a safe function would help a lot with unsafe infecting everything.

There is also the fact that safe_slice is inherently undefined behavior according to the Rust ABI, which e.g. requires u32 to be 4-byte aligned. This is an example of this undefined behavior:

schema.fbs:

table Foo {
  x: [uint32];
}

main.rs:

mod schema_generated;

static DATA: &[u8] = &[
    0, // A single byte of padding
    12, 0, 0, 0, // offset to root
    0, 0, // padding before vtable
    // <vtable>
    6, 0, // vtable size
    8, 0, // object size
    4, 0, // offset for field x
    // </vtable>
    // <object>
    6, 0, 0, 0, // (negative) offset for vtable
    4, 0, 0, 0, // offset to vector
    // </object>
    3, 0, 0, 0, // vector length
    1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0, // vector data
];

fn main() {
    let root = flatbuffers::root::<schema_generated::Foo>(&DATA[1..]).unwrap();
    let slice: &[u32] = root.x().unwrap().safe_slice();
    println!("Value: {slice:?}");
    println!("Required alignment: {}", std::mem::align_of::<u32>());
    println!("Pointer: {:p}", slice.as_ptr());
    println!(
        "Correctly aligned? {}",
        slice.as_ptr() as usize % std::mem::align_of::<u32>() == 0
    );
}

I've updated #7518 to also address the alignment issues, unfortunately this does require a user-facing breaking change, namely we can't treat as many things as slices, but I think this is fine. There is something a little amusing that we can provide slices of structs but not primitives ๐Ÿ˜† PTAL and let me know if I've missed anything

I you need to mark init_from_table as unsafe as well in the generated code. For now that is all I can find, but I would not at all be surprised if more issues are hiding around.