golang/go

encoding/xml: restore Go 1.4 name space behavior

rsc opened this issue · 16 comments

rsc commented

There have been changes to the name space behavior in encoding/xml in this cycle. Those changes affect the output of existing programs, possibly breaking them. (I know of at least one real case.) And yet it's still not clear that the name space story is complete. Is it right to make breaking changes at all? Will we make more breaking changes in Go 1.6 to complete or extend the story? Should we make all the breaking changes at once? Should we arrange that you have to opt in to the breaking changes? Should we declare encoding/xml done, like syscall, and require people who want different behavior to use forks?

All this needs to be more carefully considered, and there is not time left for that in the Go 1.5 cycle.

I propose to rewind the name space behavior back to what we had in Go 1.4 and then look at this all very carefully in the Go 1.6 cycle.

@rogpeppe @nigeltao

CL https://golang.org/cl/12570 mentions this issue.

I agree with @rsc. I started hacking on this stuff a bit in an attempt to close issues for Go1.5 and started to feel the same way ("you are in a hall of mirrors, and some of them reflect from a universe where the namespaces are all different"), but didn't have enough data to pull the emergency brake.

I do not fully agree.. XML processing, for anything other than really basic stuff, is somewhat broken, especially when namespaces are concerned. We need to start making improvements and if we wait to the magical point in the future when we can do it all at once, then it will NEVER work as we will never get to that magical point. I think right now people are doing hackery to "make it work". This makes code very brittle overall and is what is probably going to break when these changes are made.

Please reconsider. Lets get some of the changes is now, and have everyone test and find what else still needs to be done. Then lets get a high priority items set for it to be finished in 1.6.

I agree that it's a pity that there are breaking changes, but I think it's
still worth making them now.

Note that the most important changes that I made were to the low level
tokenization, which was fundamentally broken to the point where
you could not round-trip any XML containing namespaces through
Encoder/Decoder without producing broken XML. This is now fixed.

I tried to make as few changes as were necessary to the struct marshaling
and unmarshaling, because a) I didn't want to break things and b) that
wasn't what I was trying to fix at the time and I didn't want to get bogged
down in it.

I don't see that any of the above-mentioned issues that are broken as a result of the
low level tokenization changes, and can only think one issue
where existing valid XML output was broken as a result of this (issue #11431)
and that was still a struct-encoding issue.

So, I agree that encoding/xml is broken but I'd argue that the foundations of
it are now sound, and that we don't have to fix everything now. Also,
we'll get to find more issues when more people actually use it.

On the other hand, I do see that it's a pity to have to make breaking changes
twice (even if, strictly speaking, it's within the remit of the compatibility guarantee
because it's fixing bugs). Another solution might be to move the current
version of encoding/xml to somewhere under golang.org/x, do the reversion
and make more breaking changes (with appropriate fanfare) in Go 1.6,
including a potential rethink of the way that namespaces work with structs,
potentially even renaming the package in std so we can remain backwardly compatible.

For us, it won't make too much difference because we have to use a fork of
encoding/xml anyway as we have to use the Go version in Ubuntu LTS.

Thanks for making those changes @rogpeppe I am grateful for them. I just want to see them put in to 1.5. Please!

@rsc's proposed https://go-review.googlesource.com/#/c/12570/ doesn't revert all of the encoding/xml changes since 1.4. It says:


This CL reverts:

5ae822b encoding/xml: minor changes
bb7e665 encoding/xml: fix xmlns= behavior
9f9d66d encoding/xml: fix default namespace of tags
b69ea01 encoding/xml: fix namespaces in a>b tags
3be158d encoding/xml: encoding name spaces correctly

and adjusts tests from

a9dddb5 encoding/xml: add more EncodeToken tests.


These git hashes map to gerrit CLs:

5ae822b https://go-review.googlesource.com/11832
bb7e665 https://go-review.googlesource.com/11635
9f9d66d https://go-review.googlesource.com/6927
b69ea01 https://go-review.googlesource.com/5910
3be158d https://go-review.googlesource.com/2660
a9dddb5 https://go-review.googlesource.com/4167

These are all about encoding in-memory structures to the wire format, right? Decoding from the wire to in-memory is unchanged? Or did you (@rogpeppe) make tokenization fixes for decoding as well as encoding?

I'm inclined to do the conservative thing and roll back, and provide an interim golang.org/x package during the 1.6 cycle if there's demand for it.

CC @rsto for any golang.org/x/net/webdav implications. At the least, we will need to update some golden tests.

rsto commented

In terms of net/webdav it would be a pity to see 3be158d https://go-review.googlesource.com/2660 rolled back, but even if we can't find a workaround in net/webdav most WebDAV clients should manage.

That being said, it would be great to have a temporary fork of encoding/xml at golang.org/x, and I'd volunteer time to support maintenance.

@nigeltao I'll wait until 12570 lands and a decision has been made about any interim fork of encoding/xml.

ebfe commented

It would be really unfortunate if we had to wait another 6 month for
@rogpeppe s namespace prefix changes.

More like tragic if we have to wait another 6 months.

I believe this should be a permanent fork, encoding/xml should be
considered "finished" and new development and future improvements should
occur elsewhere.

On Fri, 24 Jul 2015 04:54 Robert Stepanek notifications@github.com wrote:

In terms of net/webdav it would be a pity to see 3be158d
3be158d
https://go-review.googlesource.com/2660 rolled back, but even if we can't
find a workaround in net/webdav most WebDAV clients should manage.

That being said, it would be great to have a temporary fork of
encoding/xml at golang.org/x, and I'd volunteer time to support
maintenance.

@nigeltao https://github.com/nigeltao I'll wait until 12570 lands and a
decision has been made about any interim fork of encoding/xml.


Reply to this email directly or view it on GitHub
#11841 (comment).

If encoding/xml is to be considered "finished" then all the more reason to
go with the fixes I've made, because it is seriously broken without them.

All I have done is fixed bugs. AFAICS all the currently extant issues
were present before I did so.

Fixing more bugs would break more code. I still haven't seen an example
of some code which is broken now where it was working before.

FWIW I think it's well worth having encoding/xml in the standard library.

I am not sure how you could ever consider something so broken to be FINISHED... It should not be marked finished until it actually works and does what the community needs it to do. I think most of us are not yet expecting it to be fully featured like the Java version. But it needs to do most of the normal things. The fact that namespaces do not work is a big problem.

"finished" was a poor choice of words, hence the quotes.

My position is the limitations of the Go 1 contract, as well as implicit guarantees on previous behaviour that we only discover after the fact, will make it extremely difficult for the XML package to evolve.

I believe a better solution would be to develop a new XML package outside the standard repo.

If we do this, which I can see your rational, then we need to mark the current one as broken, and put a link to the new one and say use that. We do not want people building tooling around an official package that is broken and then having them say, "well Go must not be that stable or not very good because this official core package is broken and no one has fixed it.". Having been in software development for a very long time I would argue that breaking changes in minor release are justified and okay, when the original intent of the package does not work and some how made it through QA.

What I am saying is, if we were to survey the community, and ask them..... "Do you want the main repo fixed, even if it requires breaking changes, or do you want a new parallel package that may never get merged back in to the main repo"? My guess is the vast majority would just say, "Fix the main repo".

I thought the question wasn't a choice between "fix the main repo now (with breaking changes)" or "use a separate package forever". I thought it was between "make breaking changes twice, with Go 1.5 and Go 1.6" or "make breaking changes only once, in Go 1.6, and in the meantime, supply a separate, temporary package for those who want it (but wind it down once Go 1.6 is out)".

rsc commented

I don't believe we should freeze encoding/xml completely. As time goes on, of course, the bar for changes to any standard packages keeps going up. But there are clearly things that are not right about the handling of name spaces, and it would be good to fix those, provided we can figure out what the right behaviors are. My concern for Go 1.5 is that I don't believe we actually know those answers, and the release is upon us. I do want to try to understand name space behavior better for the next cycle and try to make a fix. I hope that a long term fork only for more sensible name space behavior is not necessary.

It's a bit extreme to call waiting for Go 1.6 "tragic". No other packages in the standard library force the use of encoding/xml (unlike, say, net or net/url), so it is trivial for people who need different behavior to use an internal copy of encoding/xml in the interim. @rogpeppe even noted that for Ubuntu they can't assume Go 1.5 so they have to use a copy anyway. Webdav can do the same (x/net/webdav/internal/xml). If you're willing to depend on the vendor experiment you can use your/code/vendor/encoding/xml and not even have to change your import paths.

I would prefer not to see a golang.org/x/exp/xml. I'd rather people make forks with the changes they need instead of trying to have a fork that's all things to all people.

Again, fixing xml name spaces is on my short list for Go 1.6.