rust-embedded-community/serde-json-core

deserialize_str with escaped strings is broken

jordens opened this issue · 13 comments

serde_json_core needs to refuse deserializing a borrowed string with escape sequences.

serde_json_core::from_slice::<&str>(br#""\n""#) incorrectly gives Ok(("\\n", 4)).

serde_json::from_slice::<&str>(br#""\n""#) correctly says Error("invalid type: string \"\\n\", expected a borrowed string", line: 1, column: 4).

@ryan-summers This is resolved by the merging of #83

t-moe commented

Is it fixed?

 let mut escape_buf = [0; 8];
 assert_eq!(from_slice_escaped::<&str>(br#""a\nb""#, &mut escape_buf), Ok(("a\nb", 6)));

fails with

Err(CustomErrorWithMessage("invalid type: string \"a\\nb\", expected a borrowed string"))

while this one is ok:

#[derive(Deserialize, PartialEq, Debug)]
  struct Test(heapless::String<10>);

  let mut escape_buf = [0; 8];
  assert_eq!(
      from_slice_escaped::<Test>(br#""a\nb""#, &mut escape_buf),
      Ok((Test(heapless::String::try_from("a\nb").unwrap()), 6))
  );

You can't deserialize into a &str type because the destination data type needs to own the memory for the unescaped string from what I understand. I believe you should be able to do:

 let mut escape_buf = [0; 8];
 assert_eq!(from_slice_escaped::<heapless::String<8>>(br#""a\nb""#, &mut escape_buf), Ok(("a\nb".into(), 6)));
t-moe commented

Ah yes. Thank you!

It's definitely not fixed.

Right now I would classify it as more of an intentional design choice. The from_slice and from_string APIs currently don't do attempt at string de-escaping. If the user wants to de-escape strings, they can use the new from_*_escaped APIs. If we want to directly mirror the serde-json API semantics, then indeed this doesn't align with what they do

That's not the point. It doesn't need to do de-escaping.
It should error out (as described) instead of performing the wrong deserialization.

If the user is explicitly opting in to an API that does not claim to do string escaping, I would expect a deserialization of r#""\n""# to result in a deserialized "\n", or am I misunderstanding something?

Are you referring to the removal of the inner quotes? I'm just a bit confused here

t-moe commented

I think a good default for from_slice would be to refuse to deserialize strings containing escape sequences.
An API which simply ignores them is unsafe and should be explicitly marked as such. (e.g. from_slice_ignore_escapes).

I understand that you dont want to change the behavior from_slice now though, as this would be a breaking change... Not sure how to proceed here. But I agree with @jordens that the current behavior is harmful for the users IMHO.

The only correct deserialization is one that understands escaping. It's part of Json. If it can't, it must error. Not doing so is a bug, likely even a severe security issue.

Could someone please reopen this and mark this as a security vulnerability?

I don't have permissions for any security-related issues on this repository. CC @eldruin can you help us out here?

Sorry for the delay.
What do you mean by marking as security vulnerability?
We can add a "security-vulnerability" or similarly-named tag to this issue or publish a full blown security advisory in this repo. Here is the template for that:

### Impact
_What kind of vulnerability is it? Who is impacted?_

### Patches
_Has the problem been patched? What versions should users upgrade to?_

### Workarounds
_Is there a way for users to fix or remediate the vulnerability without upgrading?_

### References
_Are there any links users can visit to find out more?_

Plus the CVSS scoring.