dtolnay/serde-yaml

externally tagged enums not correctly parsed in 0.9.25

Closed this issue · 5 comments

Hi,

I hope I'm not missing something obvious, since I'm still new to Rust, but I've found that the current version of serde_yaml does not behave as the docs describe when parsing externally tagged enums, whereas reverting to 0.8 does comply with the docs.

Here is some code:

#![allow(dead_code)]
use serde::Deserialize;

#[derive(Deserialize, Debug)]
#[serde(rename_all = "snake_case")]
enum Message {
    Request { id: String, method: String },
    Response { id: String, result: String },
}
fn main() {
    let yaml = r#"
    - request:
       id: abc
       method: abc
    - response:
       id: abc
       result: abc
        "#;
    

    dbg!(serde_yaml::from_str::<Vec<Message>>(yaml).unwrap());
}

According to the Serde docs, this complies with the default way of tagging enums, the externally tagged way. With serde_yaml 0.8, this works fine. With 0.9.25, the current published version, this fails with:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error(".[0]: invalid type: map, expected a YAML tag starting with '!'", line: 2, column: 7)', src/main.rs:21:53

This is the dependencies section in Cargo.toml:

[dependencies]
serde = { version = "1.0.186", features = ["derive"] }
serde_yaml =  "0.9.25"

Switching from external tagging to internal tagging removes the error; however, the style of the resulting YAML is less ideal for my use case.

Again, if this is not a bug but a me-issue, I apologize, and will be thankful for a quick clarification! I've been assuming that the serde docs are up to date and therefore my example should work.

Pierric.

Externally tagged enums use YAML's tag syntax. See:

This should work:

- !Request
  id: abc
  method: abc
- !Response
  id: abc
  result: abc

Hi,

Thank you for your reply. I understand the YAML specs might have led to that decision, and it makes sense to implement what the specs say.

However, I wonder if this is the best move given how YAML is used in many places in real life. Many projects rely on "JSON-style singleton maps" rather than the !Tag format:

  • significantly, this is how Ansible files work, e.g.:
- copy:
    src: /here
    dest: /there
    mode: 0644
- docker_container:
    name: mycontainer
    status: started
...
  • many, many projects, from small to big, use YAML for configuration files, likely because it is a rather human-friendly way of managing deep data structures. The !Tag approach is, in my opinion, less intuitive and less human-friendly than singleton maps.

It seems to me that this strict adherence to the !Tag concept loses potential for future projects. I wonder if the two styles could co-exist, with at least the ability to still deserialize from singleton maps, even if serialization then defaults to the official way. Or the style (!Tag or singleton map) could be an option on the type, with something like #[serde(tagged_as_map)] (or whichever wording is more fitting). Could that be considered? Otherwise I'm worried we'll need an extra library for deserializing this (albeit informal) style of YAML.

Thanks,
Pierric.

astraw commented

@Pierric82 these things can co-exist. If using serde derive, you can tag the relevant field with #[serde(with = "serde_yaml::with::singleton_map")]. See https://docs.rs/serde_yaml/0.9.25/serde_yaml/with/singleton_map/index.html for an example.

Hi @astraw,

Thanks for making that clear, I was not aware of it. I've looked it up a bit and for my needs, singleton_map_recursive did the job!

Regards,
Pierric.

I feel this needs more documentation...

I was trying to parse a vector of singleton maps like this:

- enumVariantA: some text
- enumVariantB: some other text
...

The solution didn't seem obvious, so here is how I resolved it, in case someone needs it in the future:

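use serde::{de::DeserializeOwned, Deserialize, Serialize};

// Transparent newtype that (de)serializes the inner enum as a singleton map
// instead of a YAML !Tag.
#[derive(Serialize, Deserialize)]
#[serde(transparent, bound = "F: Serialize + DeserializeOwned")]
pub struct Wrapper<F>(
    #[serde(with = "serde_yaml::with::singleton_map")]
    pub F,
);

// `Item` is the enum whose variants appear as the map keys above.
pub type ItemList = Vec<Wrapper<Item>>;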