Rebase off ruamel? - many new valuable features
mikedlr opened this issue · 7 comments
ruamel.yaml is a fork of PyYAML that has been much more active in recent years. According to the author, it passes all PyYAML tests, so it should be more or less backwards compatible.
At the same time it has many new features and code improvements such as
- support for newer versions of YAML, such as 1.2
- support for editing YAML whilst maintaining comments
- unified Python 2 and Python 3 codebase
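To make the YAML 1.2 item above concrete, here is a minimal sketch of one well-known difference between the spec versions: the YAML 1.1 bool type accepts words such as yes, no, on and off, while the YAML 1.2 core schema only accepts the true/false spellings. The helper names (resolve_kind, changed_between_versions) are hypothetical and this is not how either PyYAML or ruamel.yaml implements resolution; it only compares the word lists, and anything that is not a boolean is lumped together as "other".

```python
# Plain scalars that resolve to booleans under each spec version.
# YAML 1.1 bool type definition:
YAML_1_1_BOOLS = {
    "y", "Y", "yes", "Yes", "YES", "n", "N", "no", "No", "NO",
    "true", "True", "TRUE", "false", "False", "FALSE",
    "on", "On", "ON", "off", "Off", "OFF",
}
# YAML 1.2 core schema:
YAML_1_2_CORE_BOOLS = {"true", "True", "TRUE", "false", "False", "FALSE"}


def resolve_kind(scalar, version):
    """Return "bool" if the plain scalar is a boolean under `version`, else "other".

    Hypothetical helper for illustration; `version` is "1.1" or "1.2".
    """
    bools = YAML_1_1_BOOLS if version == "1.1" else YAML_1_2_CORE_BOOLS
    return "bool" if scalar in bools else "other"


def changed_between_versions(scalars):
    """List the scalars whose boolean status differs between 1.1 and 1.2."""
    return [
        s for s in scalars
        if resolve_kind(s, "1.1") != resolve_kind(s, "1.2")
    ]


# "yes" and "off" are booleans under 1.1 only; "true" is a boolean under both.
print(changed_between_versions(["yes", "true", "off", "hello"]))
```

Differences like this are one reason a switch of the default spec version could change what existing documents load as, which is relevant to the compatibility concerns raised later in the thread.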
It would be really useful if these features could be available to all the users of PyYAML. Currently there are many more changes in ruamel.yaml than in PyYAML. Also I believe it already has some of the fixes. This means it would be easiest to rebase off ruamel.yaml and then port all of the new work on PyYAML. I'd like to propose that.
Probably the easiest approach would be to just import the ruamel.yaml code into the current repo, but there could be other proposals, such as starting to use the existing ruamel repository (unfortunately it's a Mercurial repo, so it wouldn't be fully compatible with GitHub).
Any comments on this? Reasons not to?
There are presently too many users of PyYAML that could break if we were to just wantonly pull in ruamel.yaml. If folks want to help pyyaml by contributing fixes and YAML 1.2 compatibility that's more than welcome. Rebasing off a completely separate project (regardless of origin) carries too much risk in my opinion. We would need far better testing before we could be certain that the updates wouldn't break downstream consumers of PyYAML which have been relying on the stability of releases for the last few years.
On the other hand, PyYAML has been practically dead for the last several years so there shouldn't be an urgent need for a backwards-compatible update. That is to say, that if you bumped version into 4.x, those upgrading should understand that things have changed and they must pay special attention if they wish to upgrade, as usual.
Related ticket in Bitbucket: https://bitbucket.org/ruamel/yaml/issues/81/merge-back-into-pyyaml
The author wrote there:
I don't think there are any bugs that are fixed in PyYAML that have not been fixed in ruamel.yaml. It e.g. still passes all of the PyYAML test. Merging is as easy as replacing.
He also states:
But that leaves the question open as to why not contribute any changes here instead of trying to catch up, wouldn't that be much more sensible?
I don't personally, as a user of the library, care too much about the name. I do, however, humbly acknowledge and appreciate the work the authors have put into the projects (both @xitology and @AvdN, plus whoever has contributed). It would just feel a bit weird to re-implement code that already exists in ruamel.yaml (e.g. YAML 1.2 support). I mean, it's not as if these libraries were competitors that do things a bit differently. On the contrary, they are based on the same code, and ruamel.yaml just has bugfixes and features added.
So, if PyYAML is dead and ruamel.yaml keeps on going, just mark PyYAML dead, send the word out and put the effort into ruamel.yaml. Or merge the ruamel.yaml code into PyYAML (I would prefer this due to naming & history), bump the version and mark ruamel.yaml dead. Anthon seems to have some objections to using GitHub and, understandably, would like to preserve the name of his library, but I like to believe things are always open for discussion :)
@tuukkamustonen PyYAML isn't dead. We're also not reckless.
That is to say, that if you bumped version into 4.x, those upgrading should understand that things have changed and they must pay special attention if they wish to upgrade, as usual.
Yes, except in most cases people don't take care with their dependencies. I'm also a requests maintainer. When requests bumped to 1.0 (and then to 2.0) we documented our breaking changes and still were treated as if we had done some world ending thing. Now, PyYAML isn't downloaded nearly as much as requests, but I'm not keen to make the same mistakes.
Many projects use PyYAML in their requirements without a pin or an upper cap (e.g., <4.0) because they have come to expect the stability of this project.
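As a rough illustration of the pin / upper cap point above, the sketch below shows why an uncapped requirement would accept a hypothetical 4.0 release while a "<4.0" cap would not. The helpers parse_version and allowed are made up for this example and only handle plain dotted integers; real installers follow the much richer PEP 440 rules.

```python
def parse_version(text, width=3):
    """Turn "3.12" into (3, 12, 0). Only handles plain dotted integers."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (width - len(parts))  # pad so "4" and "4.0" compare equal
    return tuple(parts)


def allowed(candidate, upper_cap=None):
    """Return True if `candidate` satisfies an optional exclusive upper cap."""
    if upper_cap is None:
        # No cap: every release is accepted, including a breaking one.
        return True
    return parse_version(candidate) < parse_version(upper_cap)


# Hypothetical version numbers, for illustration only.
print(allowed("4.0"))                     # uncapped requirement accepts 4.0
print(allowed("4.0", upper_cap="4.0"))    # "<4.0" rejects 4.0
print(allowed("3.12", upper_cap="4.0"))   # "<4.0" still accepts 3.12
```

In other words, projects that list the dependency with no cap would pick up a major version bump automatically on their next fresh install.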
We don't want to duplicate the work that went into ruamel.yaml but it looks as though Anthon is frustrated with the author for not being responsive and further frustrated with the maintainers for moving the project to GitHub. I have been focusing more on libyaml right now, but pyyaml will soon be on my list. I plan to migrate the open pull requests on bitbucket (which include Anthon's https://bitbucket.org/xi/pyyaml/pull-requests/6/merge-of-python2-and-python3-codebases/diff) and review them as they are. So in all likelihood we'll end up converging on the same code-base, but it needs to be reviewed carefully and tested appropriately to ensure we continue to provide the stability to our users that they expect.
It's silly to not do this just because you may hurt folks who have bad practices with regard to dependency tracking.
- It seems as though reports of PyYaml's death have been greatly exaggerated. There are commits in the last 7 months, and the most recent closed issue was 12 days ago.
- Apparently, ruamel.yaml has made some forward strides, and I don't know if pyyaml has kept up.
It seems like this would be a good idea because pyyaml has the most history and ruamel.yaml is more up-to-date (or so they say).
Status report?
We actually would like to make a PyYAML release soon. I just started to try and help out here, but I don't have much experience with Python so far, so what I can do is limited.
Maybe @ingydotnet can comment on the plan.
As far as I know ruamel and PyYAML have different implementations, but that's about all I know.
If you look at http://matrix.yaml.io/ there are some differences between them, although these are mostly edge cases.
I think there are many other issues to discuss if the plan is to replace PyYAML with ruamel.
I could take on a small task.