double quotes lost when deserializing and serializing strings containing only numbers on serde_yaml 0.9
Closed this issue · 6 comments
I'm reading some yaml, editing and then saving it again after doing to_string.
For some reason I don't understand, some String values that contain only digits lose their double quotes after to_string.
I'm feeding this yaml to something that expects Strings, not numbers, and I'm expecting serde_yaml to respect whichever type the value was.
Here's a minimal reproducible test, which passes with serde_yaml 0.8.26 but fails with 0.9:
extern crate serde;
extern crate serde_yaml;

#[test]
fn can_serialize_with_quotes() {
    use serde_yaml::Mapping;

    let original_config = r#"---
configuration:
  agent: "007"
"#;
    let config: Mapping = serde_yaml::from_str(original_config).expect("should parse yaml");
    let config = serde_yaml::to_string(&config).expect("should serialize");
    assert_eq!(config, original_config);
}
Basically, this input:
configuration:
  agent: "007"
ends up as:
configuration:
  agent: 007
(There is also another difference between 0.8 and 0.9 regarding having or not having ---\n at the beginning of the String, but this is not relevant as far as I can tell; the issue is that a String value ends up being converted to a Number value.)
Is there any way to avoid this String->Number conversion or some other way to make this test pass?
This is behaving correctly as far as I can tell. 007 is a !!str in yaml, not a !!int. If a different library you are using is interpreting it as an int, that is a bug in the other library.
Here is the spec section that determines what untagged scalars are int: https://yaml.org/spec/1.2.2/#1022-tag-resolution.
sorry, I should have checked this properly. I think you're right. thanks!
I'm not sure I agree with this resolution. @dtolnay any chance you can take another look?
From the linked article I see that:
Scalars with the "?" non-specific tag (that is, plain scalars) are matched with an extended list of regular expressions.
One of which is [-+]? [0-9]+, which resolves to tag:yaml.org,2002:int (Base 10).
I think 007 is a plain scalar and matches the above regular expression, so it should be interpreted as an int. Therefore the round-trip of "007" to 007 is not valid, and parsing 007 as an integer is valid.
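To make the regular-expression argument concrete, here is a minimal, standard-library-only sketch of the base-10 pattern `[-+]? [0-9]+` quoted above. The function name `matches_core_schema_int` is hypothetical and is not part of serde_yaml or any other library; the sketch covers only the base-10 case, not the other patterns in the spec.

```rust
/// Returns true if `s` matches the base-10 integer pattern `[-+]? [0-9]+`
/// quoted from the YAML 1.2.2 tag-resolution section (the space in the
/// pattern is only there for readability).
/// Hypothetical helper for illustration; not taken from any library.
fn matches_core_schema_int(s: &str) -> bool {
    // Strip at most one leading sign character.
    let digits = s
        .strip_prefix(|c: char| c == '-' || c == '+')
        .unwrap_or(s);
    // Require at least one character, all of them ASCII digits.
    !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_digit())
}
```

Under this check, the plain scalar 007 matches, which is why a string value "007" would need to stay quoted on output to remain a string for parsers that apply this rule.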
I've tested this with rust-yaml, yq, and PyYaml, which all agree so far that 007 is an int.
E.g.
use serde_yaml; // 0.9.24
use yaml_rust; // 0.4.5

fn main() {
    use serde_yaml::Mapping;

    let original_config = r#"---
agent: "007"
"#;
    let parsed_serde_yaml: Mapping = serde_yaml::from_str(original_config).unwrap();
    // serde_yaml knows it's a string
    assert_eq!(parsed_serde_yaml["agent"], serde_yaml::Value::String("007".into()));

    let serialized_serde_yaml = serde_yaml::to_string(&parsed_serde_yaml).unwrap();
    // Serializes it back to 007, no quotes.
    assert_eq!(serialized_serde_yaml, r#"agent: 007
"#);

    // serde_yaml parses it back as a string, so we're self-consistent.
    let parsed_serde_yaml: Mapping = serde_yaml::from_str(&serialized_serde_yaml).unwrap();
    assert_eq!(parsed_serde_yaml["agent"], serde_yaml::Value::String("007".into()));

    // But yaml_rust parses it as an integer
    let parsed_yaml_rust = yaml_rust::YamlLoader::load_from_str(&serialized_serde_yaml).unwrap();
    let doc = &parsed_yaml_rust[0];
    // thread 'main' panicked at 'assertion failed: `(left == right)`
    //   left: `Integer(7)`,
    //  right: `String("007")`',
    assert_eq!(doc["agent"], yaml_rust::Yaml::String("007".into()));
}
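The `Integer(7)` in the panic message above is what one would expect from any resolver that tries an integer parse on plain scalars before falling back to a string. The sketch below illustrates that behaviour using only the standard library; the `Scalar` enum and `resolve_plain_scalar` function are hypothetical names, and this is an assumption about how such a resolver might work, not a copy of yaml_rust's code.

```rust
/// Hypothetical, simplified value type for illustration only.
#[derive(Debug, PartialEq)]
enum Scalar {
    Integer(i64),
    String(String),
}

/// Resolve an unquoted (plain) scalar: try an integer parse first and fall
/// back to a string. This is an assumed strategy, not any library's code.
fn resolve_plain_scalar(s: &str) -> Scalar {
    match s.parse::<i64>() {
        // Rust's integer parsing accepts leading zeros, so "007" becomes 7
        // and the original spelling is lost.
        Ok(n) => Scalar::Integer(n),
        Err(_) => Scalar::String(s.to_string()),
    }
}
```

With this strategy, `resolve_plain_scalar("007")` yields `Scalar::Integer(7)`, so the leading zeros cannot be recovered once the quotes are dropped.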
Yeah, good call.
Fixed in serde_yaml 0.9.25.
I'm getting similar behaviour with 0.9.26, but not with 0.8.26: deserializing and then serializing produces an unquoted string as output.
The input:
- name: KUBERNETES
  value: "yes"
Output:
- name: KUBERNETES
  value: yes
Having a similar issue with a key containing a string with the single character y. Even if I create the mapping value as a string, y is not quoted when outputting the yaml. This ends up being interpreted as a bool instead of a string value by kubectl.
Reference: https://yaml.org/type/bool.html
Question: Is there a way of forcing values of type string to always be quoted when outputting as a string?
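As a rough illustration of which string values are at risk, the sketch below checks a string against the YAML 1.1 bool spellings listed at the reference above (y, yes, no, on, off and their case variants) and wraps matches in double quotes. Both function names are hypothetical; this is a hand-rolled, standard-library-only check, not a serde_yaml option or API.

```rust
/// Returns true if `s` is one of the spellings the YAML 1.1 bool type
/// (https://yaml.org/type/bool.html) treats as a boolean.
/// Hypothetical helper for illustration; not part of serde_yaml.
fn is_yaml11_bool(s: &str) -> bool {
    matches!(
        s,
        "y" | "Y" | "yes" | "Yes" | "YES"
            | "n" | "N" | "no" | "No" | "NO"
            | "true" | "True" | "TRUE"
            | "false" | "False" | "FALSE"
            | "on" | "On" | "ON"
            | "off" | "Off" | "OFF"
    )
}

/// Wraps bool-like strings in double quotes so a YAML 1.1 consumer reads
/// them as strings. No escaping is needed here because none of the
/// bool spellings contain quotes or backslashes.
fn quote_if_bool_like(s: &str) -> String {
    if is_yaml11_bool(s) {
        format!("\"{}\"", s)
    } else {
        s.to_string()
    }
}
```

For example, `quote_if_bool_like("yes")` returns the quoted form, while a value such as KUBERNETES passes through unchanged.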