Round-tripping string values that look like backward compatible arrays
Luthaf opened this issue · 9 comments
I realized I was not very clear in my last email, so here is a separate issue, hopefully clearer.
I'm wondering about the possibility to round-trip string values, i.e. store some data in extended XYZ and then be able to read it again.
In particular, there is some ambiguity here between string values and backward compatible arrays. If the user asks to store the string 1 3 4 5 (with spaces inside) under a given key in an extended XYZ comment line, simply writing it as key="1 3 4 5" is not enough, since that will later be read as an array of 4 integers. It is possible to use a single-element {} array instead, i.e. write this as key={"1 3 4 5"}, which will then be read as intended by the parser. The example from my email was about a string containing a single number or boolean value (True/...), which cannot be written directly as key="3", but the same applies to strings that look like arrays.
This means that any extended XYZ writer must check all string values to ensure they would parse back as strings, and otherwise use the {} array trick to make sure the type of the value is preserved in the extended XYZ comment line.
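To make the writer-side check concrete, here is a minimal sketch of what it could look like. The function names, the token patterns, and the set of boolean spellings are all assumptions made for illustration; the actual extended XYZ grammar may accept more forms. The sketch also assumes the value contains no double quotes or other characters that need escaping.

```python
import re

# Assumed token patterns; the real extended XYZ grammar may differ.
_INT = re.compile(r"[+-]?\d+")
_FLOAT = re.compile(r"[+-]?(\d+\.\d*|\.\d+|\d+)([eEdD][+-]?\d+)?")
# Assumed boolean spellings (the issue only mentions "True/...").
_BOOL = {"T", "F", "True", "False", "TRUE", "FALSE", "true", "false"}


def looks_like_non_string(value: str) -> bool:
    """Return True if every whitespace-separated token looks like a number or a boolean."""
    tokens = value.split()
    if not tokens:
        return False
    return all(
        bool(_INT.fullmatch(t)) or bool(_FLOAT.fullmatch(t)) or t in _BOOL
        for t in tokens
    )


def format_string_entry(key: str, value: str) -> str:
    """Format key=value so that a string value would be read back as a string."""
    if looks_like_non_string(value):
        # Wrap in a single-element {} array, as described in the issue.
        return f'{key}={{"{value}"}}'
    return f'{key}="{value}"'


print(format_string_entry("key", "1 3 4 5"))
print(format_string_entry("key", "hello world"))
```

This is the "partial parser" mentioned above: the writer has to know which token shapes the reader would interpret as numbers or booleans before it can decide how to quote a string.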
Do you agree with my reading of the spec and is this the intended behaviour?
I fear it will make writing extended XYZ a bit more complex, since every piece of software that needs to write this format now also needs to implement a (partial) parser for it; but I also understand this is allowed for backward compatibility reasons.
Single-quoted strings would solve the issue here: writers could always use single quotes when writing strings that need to be quoted.
I'm also happy to support single-quoted strings. I need to check whether they work in the current implementation, and update the spec.
Are we OK with defining single quotes as strings only, even though the old Fortran (extxyz.c, really) implementation treats ', ", and {} equivalently?
Yes, I think that's fine - most writers will have used " for all arrays
OK, let's add single quotes as a string-only container, and then writers can always use that for strings unambiguously.
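With single quotes defined as a string-only container, the writer side no longer needs to inspect the value. A minimal sketch, assuming a hypothetical helper name and assuming nothing about escaping (this thread does not settle escaping rules, so the sketch simply rejects values containing a single quote or a newline):

```python
def format_string_entry_single_quoted(key: str, value: str) -> str:
    """Always write string values in single quotes, which would be string-only."""
    if "'" in value or "\n" in value:
        # Escaping rules are not covered here; refuse rather than guess.
        raise ValueError("value needs escaping, which this sketch does not handle")
    return f"{key}='{value}'"


print(format_string_entry_single_quoted("key", "1 3 4 5"))
print(format_string_entry_single_quoted("key", "True"))
```

Compared with checking whether each value would parse as a number, boolean, or array, this keeps the writer free of any parsing logic.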