golang/go

proposal: encoding/xml: allow unmarshaling arbitrary local names with a given namespace via a glob

SamWhited opened this issue · 6 comments

Currently one can use a tag such as

`xml:"localname"`

to unmarshal an element with a given local name and an arbitrary namespace. However, it doesn't work the other way around (a known namespace and an arbitrary local name). It would be nice if there were some kind of glob unmarshaling so that something like XMPP error conditions:

<defined-condition xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
<invalid-xml xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>

could be unmarshaled via:

`xml:"urn:ietf:params:xml:ns:xmpp-stanzas *"`
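For comparison, here is a rough sketch of how this can be approximated today with the existing `,any` tag, filtering on the namespace by hand after unmarshaling. The wrapping `<error>` element and the names `StanzaError`, `Condition`, and `conditionName` are assumptions made for illustration; they are not part of the proposal or of CL 19812.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// stanzaNS is the namespace used by the error conditions above.
const stanzaNS = "urn:ietf:params:xml:ns:xmpp-stanzas"

// Condition records only the name of the element it was decoded from.
// (Hypothetical type, for illustration.)
type Condition struct {
	XMLName xml.Name
}

// StanzaError collects every otherwise unmatched child element via ",any".
// (Hypothetical type; the <error> wrapper is an assumption.)
type StanzaError struct {
	XMLName  xml.Name    `xml:"error"`
	Children []Condition `xml:",any"`
}

// conditionName returns the local name of the first child element that
// lives in stanzaNS, whatever that local name happens to be.
func conditionName(data []byte) (string, error) {
	var e StanzaError
	if err := xml.Unmarshal(data, &e); err != nil {
		return "", err
	}
	for _, c := range e.Children {
		if c.XMLName.Space == stanzaNS {
			return c.XMLName.Local, nil
		}
	}
	return "", fmt.Errorf("no child element in namespace %q", stanzaNS)
}

func main() {
	doc := []byte(`<error><invalid-xml xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/></error>`)
	name, err := conditionName(doc)
	fmt.Println(name, err)
}
```

The namespace check happens in ordinary code rather than in the struct tag, which is the boilerplate the proposed glob would remove.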

CL https://golang.org/cl/19812 mentions this issue.

adg commented

Someone who understands XML should look at this.

adg commented

ping @rsc

rsc commented

How does the * interact with the > syntax? Today you can have a tag string like "a>b>c". Does "a>*>c" work? During Marshal, what does it mean to have a * tag? That's obviously not the name of the element.

This seems plausible but there are a bunch of little details that may interact with * that we need to enumerate and define answers for. Given the xml.Element definition in the other proposal, a natural type for a field with tag * would be xml.Element.

SamWhited commented

> This seems plausible but there are a bunch of little details that may interact with * that we need to enumerate and define answers for.

Since submitting this I've thought about it a bit more, and all these little things start to add up to a larger query language / DSL in tags than I'm really comfortable with (it starts to become xpath, which is a truly terrifying thought). That being said:

> How does the * interact with the > syntax? Today you can have a tag string like "a>b>c". Does "a>*>c" work?

In my mind this makes sense; it would match:

<a><z><c>value</c></z></a>
<a><y><c>value</c></y></a>
<a><yournamehere><c>value</c></yournamehere></a>

… etc.

possibly finding the first match. Maybe multiple fields with the same * tag would find subsequent matches?

Then again, for the sake of keeping things simple maybe it shouldn't be allowed to interact at all.
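As a point of reference, the "first match" reading of `a>*>c` can be sketched with what encoding/xml already offers: an `,any` field captures the unknown middle element, and a normal tag picks out `c` inside it. This is only an approximation of the idea under discussion, not the proposed syntax; the type and function names (`middle`, `outer`, `firstC`) are made up for the example.

```go
package main

import (
	"encoding/xml"
	"errors"
	"fmt"
)

// middle stands in for the "*" step: any element name is accepted and
// recorded in XMLName. C stays nil when the element has no <c> child.
type middle struct {
	XMLName xml.Name
	C       *string `xml:"c"`
}

// outer is the "a" step; ",any" collects all of its child elements.
type outer struct {
	XMLName xml.Name `xml:"a"`
	Middle  []middle `xml:",any"`
}

// firstC returns the name of the first child of <a> that contains a <c>
// element, together with that <c> element's character data.
func firstC(data []byte) (name, value string, err error) {
	var o outer
	if err := xml.Unmarshal(data, &o); err != nil {
		return "", "", err
	}
	for _, m := range o.Middle {
		if m.C != nil {
			return m.XMLName.Local, *m.C, nil
		}
	}
	return "", "", errors.New("no a>*>c match")
}

func main() {
	for _, doc := range []string{
		`<a><z><c>value</c></z></a>`,
		`<a><y><c>value</c></y></a>`,
		`<a><yournamehere><c>value</c></yournamehere></a>`,
	} {
		name, value, err := firstC([]byte(doc))
		fmt.Println(name, value, err)
	}
}
```

Collecting into a slice also gives one possible answer to the "subsequent matches" question, since every middle element is kept in document order.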

> During Marshal, what does it mean to have a * tag? That's obviously not the name of the element.

I think that in this case it would fall back to the name of the field; that is the following:

struct {
	XMLName xml.Name `xml:"iq"`
	Ping    `xml:"urn:xmpp:ping *"`
}

would marshal to:

<iq>
  <Ping xmlns="urn:xmpp:ping"></Ping>
</iq>

Maybe a different character should be selected, since people are used to thinking of * as a selector and this is more of a placeholder. Perhaps _?

If the * is actually in the xml.Name, though, we wind up with the same question (and I have no idea what the answer is).

rsc commented

Sounds like this needs more thought / is not a good idea, so declining the proposal.