serde-rs/test

`serde_test`: Don't forward (almost) everything to `deserialize_any`

jhpratt opened this issue · 3 comments

By forwarding things like deserialize_bool to deserialize_any, the deserialize_bool behavior can then deserialize a string, for example. This is unexpected and lets through a ton of stuff that, in my opinion, shouldn't be. serde_test is for testing, not general purpose use, so it should remain strict in what it accepts. deserialize_bool should reject everything other than true or false, unless there's some reason I'm missing?

@dtolnay Is this something you'd be interested in? I can try to work on it if you are. This issue has caused me to miss multiple bugs in my code, and I'm sure I'm not the only one.

Yeah I am open to this. I would want to better understand what currently passing (reasonable, correct) tests this would break though, but maybe that would be clearer from seeing the implementation of the change.

I attempted to simply error on the fallback to deserialize_any, but it's actually more involved than I expected. Here's the diff I have:

diff --git a/serde_test/src/de.rs b/serde_test/src/de.rs
index 673a0c0e..dce6206c 100644
--- a/serde_test/src/de.rs
+++ b/serde_test/src/de.rs
@@ -278,7 +278,10 @@ impl<'de, 'a> de::Deserializer<'de> for &'a mut Deserializer<'de> {
             {
                 visitor.visit_enum(DeserializerEnumVisitor { de: self })
             }
-            _ => self.deserialize_any(visitor),
+            token => Err(de::Error::invalid_type(
+                token.into_unexpected(),
+                &format!("enum {}", name).as_str(),
+            )),
         }
     }
 
@@ -291,7 +294,10 @@ impl<'de, 'a> de::Deserializer<'de> for &'a mut Deserializer<'de> {
                 assert_next_token!(self, Token::UnitStruct { name: name });
                 visitor.visit_unit()
             }
-            _ => self.deserialize_any(visitor),
+            token => Err(de::Error::invalid_type(
+                token.into_unexpected(),
+                &format!("unit struct {}", name).as_str(),
+            )),
         }
     }
 
@@ -333,7 +339,7 @@ impl<'de, 'a> de::Deserializer<'de> for &'a mut Deserializer<'de> {
                 self.next_token();
                 self.visit_seq(Some(len), Token::TupleStructEnd, visitor)
             }
-            _ => self.deserialize_any(visitor),
+            token => Err(de::Error::invalid_type(token.into_unexpected(), &"tuple")),
         }
     }
 
@@ -367,7 +373,10 @@ impl<'de, 'a> de::Deserializer<'de> for &'a mut Deserializer<'de> {
                 assert_next_token!(self, Token::TupleStruct { name: name, len: n });
                 self.visit_seq(Some(len), Token::TupleStructEnd, visitor)
             }
-            _ => self.deserialize_any(visitor),
+            token => Err(de::Error::invalid_type(
+                token.into_unexpected(),
+                &"tuple struct",
+            )),
         }
     }
 
@@ -389,7 +398,7 @@ impl<'de, 'a> de::Deserializer<'de> for &'a mut Deserializer<'de> {
                 self.next_token();
                 self.visit_map(Some(fields.len()), Token::MapEnd, visitor)
             }
-            _ => self.deserialize_any(visitor),
+            token => Err(de::Error::invalid_type(token.into_unexpected(), &"struct")),
         }
     }
 
diff --git a/serde_test/src/token.rs b/serde_test/src/token.rs
index 22513614..fb5bbf0d 100644
--- a/serde_test/src/token.rs
+++ b/serde_test/src/token.rs
@@ -1,5 +1,7 @@
 use std::fmt::{self, Debug, Display};
 
+use serde::de::Unexpected;
+
 #[derive(Copy, Clone, PartialEq, Debug)]
 pub enum Token {
     /// A serialized `bool`.
@@ -517,3 +519,44 @@ impl Display for Token {
         Debug::fmt(self, formatter)
     }
 }
+
+impl Token {
+    pub(crate) fn into_unexpected(self) -> Unexpected<'static> {
+        match self {
+            Token::Bool(val) => Unexpected::Bool(val),
+            Token::I8(val) => Unexpected::Signed(val as i64),
+            Token::I16(val) => Unexpected::Signed(val as i64),
+            Token::I32(val) => Unexpected::Signed(val as i64),
+            Token::I64(val) => Unexpected::Signed(val),
+            Token::U8(val) => Unexpected::Unsigned(val as u64),
+            Token::U16(val) => Unexpected::Unsigned(val as u64),
+            Token::U32(val) => Unexpected::Unsigned(val as u64),
+            Token::U64(val) => Unexpected::Unsigned(val),
+            Token::F32(val) => Unexpected::Float(val as f64),
+            Token::F64(val) => Unexpected::Float(val),
+            Token::Char(val) => Unexpected::Char(val),
+            Token::Str(val) | Token::BorrowedStr(val) | Token::String(val) => Unexpected::Str(val),
+            Token::Bytes(val) | Token::BorrowedBytes(val) | Token::ByteBuf(val) => {
+                Unexpected::Bytes(val)
+            }
+            Token::None | Token::Some => Unexpected::Option,
+            Token::Unit | Token::UnitStruct { .. } => Unexpected::Unit,
+            Token::UnitVariant { .. } => Unexpected::UnitVariant,
+            Token::NewtypeStruct { .. } => Unexpected::NewtypeStruct,
+            Token::NewtypeVariant { .. } => Unexpected::NewtypeVariant,
+            Token::Seq { .. }
+            | Token::Struct { .. }
+            | Token::Tuple { .. }
+            | Token::TupleStruct { .. } => Unexpected::Seq,
+            Token::SeqEnd | Token::TupleEnd | Token::TupleStructEnd => todo!(),
+            Token::TupleVariant { .. } => Unexpected::TupleVariant,
+            Token::TupleVariantEnd => todo!(),
+            Token::Map { .. } => Unexpected::Map,
+            Token::MapEnd => todo!(),
+            Token::StructEnd => todo!(),
+            Token::StructVariant { .. } => Unexpected::StructVariant,
+            Token::StructVariantEnd => todo!(),
+            Token::Enum { .. } => Unexpected::Enum,
+        }
+    }
+}

This is obviously not complete given the presence of todo!(), but there are tests in this repository failing aside from that. I've no idea if the conversion is fully accurate, as I'm not as knowledgeable with regard to serde's internal data structures.

Even if the into_unexpected method worked, there's still the issue of having custom "expected" strings that I don't see any way to handle.