ron-rs/ron

C-like enum nested inside of an internally tagged enum fails to deserialize

Closed this issue · 6 comments

Trying to roundtrip this type Rust-RON-Rust:

#[derive(Debug, Serialize, Deserialize)]
enum InnerEnum {
    UnitVariant,
}

#[derive(Debug, Serialize, Deserialize)]
struct Container {
    field: InnerEnum,
}

#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "tag")]
enum OuterEnum {  // <- this type
    Variant(Container),
}

gives the following RON:

(
    tag: "Variant",
    field: UnitVariant,
)

which produces the following error when trying to deserialize it:

4:2: Expected string or map but found a unit value instead

(playground-ish)

Some notes:

  1. Deserialization to ron::Value reveals that the variant of the InnerEnum is lost and replaced with Unit.
  2. The error location is particularly unhelpful: it points to the end of the file.
Full repro code duplicated here for resillience
/*
[dependencies]
serde = { version = "1", features = ["derive"] }
ron = "=0.8.0"
*/

use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
enum InnerEnum {
    UnitVariant,
}

#[derive(Debug, Serialize, Deserialize)]
struct Container {
    field: InnerEnum,
}

#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "tag")]
enum OuterEnum {
    Variant(Container),
}

fn main() -> Result<(), ron::Error> {
    let value = OuterEnum::Variant(Container {
        field: InnerEnum::UnitVariant,
    });
    let serialized = ron::ser::to_string_pretty(&value, <_>::default())?;
    println!("--- serialized ---\n{serialized}\n");
    let ron_value: ron::Value = ron::from_str(&serialized)?;
    println!("--- deserialized as ron::Value ---\n{ron_value:#?}\n");
    let err = ron::from_str::<OuterEnum>(&serialized).unwrap_err();
    println!("--- deserialization error ---\n{err}");

    Ok(())
}

Possibly related to #217, but doesn’t use neither untagged nor externally tagged enums.

juntyr commented

Thank you for the detailed report @GoldsteinE :) I've expanded it a bit to cover all enum cases:

use serde_derive::{Serialize, Deserialize};

#[derive(Debug, Serialize, Deserialize, PartialEq, Eq)]
enum InnerEnum {
    UnitVariant,
}

#[derive(Debug, Serialize, Deserialize, PartialEq, Eq)]
struct Container {
    field: InnerEnum,
}

#[derive(Debug, Serialize, Deserialize, PartialEq, Eq)]
enum OuterEnum {
    Variant(Container),
}

#[derive(Debug, Serialize, Deserialize, PartialEq, Eq)]
#[serde(tag = "tag")]
enum OuterEnumInternal {
    Variant(Container),
}

#[derive(Debug, Serialize, Deserialize, PartialEq, Eq)]
#[serde(tag = "tag", content = "c")]
enum OuterEnumAdjacent {
    Variant(Container),
}

#[derive(Debug, Serialize, Deserialize, PartialEq, Eq)]
#[serde(untagged)]
enum OuterEnumUntagged {
    Variant(Container),
}

#[test]
fn test_enum_in_enum_roundtrip() {
    let outer = OuterEnum::Variant(Container { field: InnerEnum::UnitVariant });

    let ron = ron::to_string(&outer).unwrap();

    assert_eq!(ron, "Variant((field:UnitVariant))");

    let de = ron::from_str::<OuterEnum>(&ron);

    assert_eq!(de, Ok(outer));
}

#[test]
fn test_enum_in_internally_tagged_roundtrip() {
    let outer = OuterEnumInternal::Variant(Container { field: InnerEnum::UnitVariant });

    let ron = ron::to_string(&outer).unwrap();

    assert_eq!(ron, "(tag:\"Variant\",field:UnitVariant)");

    // Wrong JSONy RON would correctly deserialise here
    assert_eq!(ron::from_str::<OuterEnumInternal>("(tag:\"Variant\",field:\"UnitVariant\")").as_ref(), Ok(&outer));

    let de = ron::from_str::<OuterEnumInternal>(&ron);

    // FAILS
    assert_eq!(de, Ok(outer));
}

#[test]
fn test_enum_in_adjacently_tagged_roundtrip() {
    let outer = OuterEnumAdjacent::Variant(Container { field: InnerEnum::UnitVariant });

    let ron = ron::to_string(&outer).unwrap();

    assert_eq!(ron, "(tag:\"Variant\",c:(field:UnitVariant))");

    let de = ron::from_str::<OuterEnumAdjacent>(&ron);

    assert_eq!(de, Ok(outer));
}

#[test]
fn test_enum_in_untagged_roundtrip() {
    let outer = OuterEnumUntagged::Variant(Container { field: InnerEnum::UnitVariant });

    let ron = ron::to_string(&outer).unwrap();

    assert_eq!(ron, "(field:UnitVariant)");

    // Wrong JSONy RON would correctly deserialise here
    assert_eq!(ron::from_str::<OuterEnumUntagged>("(field:\"UnitVariant\")").as_ref(), Ok(&outer));

    let de = ron::from_str::<OuterEnumUntagged>(&ron);

    // FAILS
    assert_eq!(de, Ok(outer));
}

Normal enums and adjacently tagged enums work here, but untagged enums and internally tagged enums fail. In both cases, the problem is that serde is built for JSON and deserialises the input into an effectively-JSON Value, which does not preserve the identity of the UnitVariant enum variant. #409 would fix this issue but comes with its own problems. For now the RON stance has been that RON does not fully support using RON as a self-describing format, i.e. anything that needs to go through deserialize_any, like untagged and internally tagged enums and struct flattening. I'm trying to slowly change that though. However, until serde gives the formats more control over how these special functionalities are implemented (e.g. though a format-specific Value type), supporting them in RON requires nasty hacks that are difficult to maintain since RON just is not JSON.

I think that “RON does not support untagged and internally tagged enums (and also flattening)” should be documented in some visible place, because it’s features that are normally expected from a serde format, and people use enums a lot in “Rusty” code. It’s especially important given that README says:

It's designed to support all of Serde's data model, so structs, enums, tuples, arrays, generic maps, and primitive values.

and also

Note the following advantages of RON over JSON: ... enums are supported (and less verbose than their JSON representation)

which is really misleading, given that serde_json actually supports enums more fully (while they’re obviously not a first-class feature in JSON).

The (only?) way to get the information about RONs limitations now is to read through the issues, which the user isn’t supposed to do IMO. I think a “Limitations / known bugs” section in the README and/or the docs.rs/ron page would be really helpful to the new users.

The other thing that I think is worth exploring is a way to make the error message more useful. Currently it doesn’t even point to the correct span, so the issue takes some effort to minimize and debug.

juntyr commented

which is really misleading, given that serde_json actually supports enums more fully (while they’re obviously not a first-class feature in JSON).

I think I would disagree with that on a technicality though I see why it seems like it from a user perspective. We do support the serde full data model, serde's attributes are implemented outside of that which makes them so difficult to support in any format that does not behave like JSON. In fact, RON does support enums while JSON does not and it means that to support these attributes we would have to transform everything inside deserialize_any into pseudo-JSON that serde can understand.

I do agree that adding a limitations section would be useful to point this out to new users.

The other thing that I think is worth exploring is a way to make the error message more useful. Currently it doesn’t even point to the correct span, so the issue takes some effort to minimize and debug.

The problem is that serde attributes are entirely independent of the serde formats and not something we have any control over. They are implemented in serde with the assumption that dataformats work similar to JSON with no ability for the formats to hook into. So unfortunately the entire enum tagging happens outside of RON (effectively inside the generated Deserialize impl) and we have no way to detect the error or report the error in a better way. I fully agree that this is not ideal, but I unfortunately do not know how it could be solved since serde is just flawed in that respect.

The same holds for the span. The way internally tagged enums work is that serde uses deserialize_any to create a JSON map from whatever is inside RON. Once that map has been created, i.e. RON's job is done since we have successfully provided a map, it tries to deserialise the variant from that. That second step now fails, but it fails inside the generated Deserialize impl after the full RON fragment describing the enum has been read in. Therefore, once the error is generated, RON has no idea where exactly it comes from.

juntyr commented

I've opened #450 - does that limitations section look good to you?

I think it certainly helps, thanks.

juntyr commented

I've found a way untagged and internally tagged enums could be supported at minimal code cost but with some heavy sighs -> #451

#403 is also still hanging around as an idea for how struct flattening could be supported. Unfortunately it is again a case where serde just imposes JSON semantics and thus requires some uncomfortable hacks.

What are your thoughts?