schacon/hg-git

github merge button leaves extra clutter "heads"

Closed this issue · 2 comments

I am seeing a problem similar to #217. Let me see if I can describe the problem a little better. Note that some of this may just be me misunderstanding a "correct" behavior of GitHub; however, there is one specific thing in here that I think must count as a "real" bug.

There's a project I've been working with ( https://github.com/ivansafrin/Polycode ) which has many forks and has had several pull requests, some of which are still open after a few weeks and some of which are closed. When I pull from this repository, I see several "heads" (revisions with no children, list accessible using hg heads). What it looks like at first is that the different branches from the pull requests are being pulled into the single hg repository, and some of these branches terminate at heads. Okay, that sounds sensible.

But! Then I look closer and things get weird. I make a new clone of ivansafrin's Polycode with Mercurial 1.8.2, then I compare the github "network" graph to the revision graph pretty-printed by MacHg. I see:

Network: https://github.com/ivansafrin/Polycode/network
MacHg: http://i.imgur.com/WyJ1g.png

hg heads lists six heads; four were created by the "GitHub Merge Button" user. These heads, such as r96, r91 and r87 in the MacHg screenshot, do not correspond to anything meaningful: none of those revisions appear in the GitHub "network" graph. The extra heads all appear to correspond to moments when Ivan merged a pull request. For example, in GitHub revision f6303 Ivan commits with "Merge pull request #21 from LeeRichmond/master". This shows up in the MacHg screenshot as r97, but at the same time an extra garbage revision, r96, created by the GitHub Merge Button appears. (Extra trivia: if I diff r96 and r97, there are no differences at all; if I type "hg book", only one revision is listed as having a bookmark.)

What it looks like to me is that on the git side, r96 and r97 (for example) are either equal or r97 is somehow a child of r96, or the r96/"merge button" revisions are just somehow invisible. But to the mercurial side, they are for some reason visible. And since they do not exist from the perspective of the git users they will never be operated on, or receive a child revision, which means they will never go away and the number of these garbage revisions will increase forever as more pull requests occur... So:

  • This looks to me (?) like a bug.
  • If this is not a bug but just something about the github data model that doesn't translate to mercurial well, then there is still a usability issue, either with hg-git or with the "github merge button" implementation-- since I expect most hg-git users will be using github, creating a situation where using pull requests on github screws up the hg heads feature is probably bad for hg-git as a platform
  • Failing that, what is the "correct" way to dispose of these garbage revisions? If I say hg commit --close-branch on those revisions they do leave the heads list, but I am not sure what happens if I then push back to GitHub, and of course that change will only take effect on my branch... I can't very well send a pull request for them back to ivansafrin, because merging a pull request creates new clutter revisions :|

There's one last thing that confuses me-- and this is the "real bug" I promised at the top. What I describe in the paragraphs above is a fresh clone from ivansafrin/Polycode. However, my "working repository" has two paths:

$ hg paths
default = git+ssh://git@github.com:mcclure/Polycode.git
upstream = git+ssh://git@github.com:ivansafrin/Polycode.git

I push and pull from default, and I regularly hg pull upstream to get the new revisions from the main repository (I think this is the correct way to do things). I've been doing this for a few weeks, and over time, additional "github merge button" revisions have appeared out of nowhere! In my "working repository" currently if I hg heads | grep user: I see seven "Github Merge Button" heads, as opposed to the four I see if I clone ivansafrin/Polycode fresh (or clone mcclure/Polycode fresh and then pull from ivansafrin/Polycode).
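For anyone wanting to reproduce the count without eyeballing grep output, here is a rough Python sketch that tallies heads per author. It assumes hg is on the PATH and uses the standard `{rev}` and `{author}` template keywords; the helper names `read_heads` and `heads_by_author` are made up for illustration and are not part of Mercurial or hg-git.

```python
import subprocess
from collections import Counter


def heads_by_author(template_output):
    """Count heads per author from lines of the form 'rev<TAB>author'."""
    counts = Counter()
    for line in template_output.splitlines():
        rev, sep, author = line.partition("\t")
        if sep:  # skip blank or malformed lines
            counts[author.strip()] += 1
    return counts


def read_heads(repo_path="."):
    """Run `hg heads` in repo_path with a rev<TAB>author template (needs hg installed)."""
    result = subprocess.run(
        ["hg", "heads", "--template", "{rev}\t{author}\n"],
        cwd=repo_path, capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Print each author with the number of heads they own.
    for author, n in heads_by_author(read_heads()).most_common():
        print(n, author)
```

Running this in both the fresh clone and the working repository should make the difference in "GitHub Merge Button" heads easy to compare.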

In the comments for issue 217, it was pointed out "Perhaps the source of your pull request is rewriting history and therefore causing you to gain some defunct changes on the hg side?". This may be what is happening-- if I hg log there are somehow 104 revisions in the working repository as opposed to 101 revisions in a fresh checkout of ivansafrin/Polycode or mcclure/Polycode. However my understanding is that the maintainer of ivansafrin/Polycode is not doing anything special or magic, just using the github web interface and the github-for-mac gui client, so if github's merge tool is rewriting history in a way which is confusing github's hg-git plugin this is probably a usability problem also from github's perspective (and the "what is the correct way to make the extra revisions go away?" question stands here too).

I hope this makes sense, thanks.

OK, this is something that was introduced recently where we're (GitHub) exposing references outside the heads/ namespace for pull requests and temporary merge products for those pull requests. You can see what it looks like here:

$ git ls-remote origin
9caa40dca60562aade441b6c14020e1716b55de3    HEAD
9caa40dca60562aade441b6c14020e1716b55de3    refs/heads/master
bedc1bda8c38455df74e5acf54b0df5cb09c9494    refs/pull/13/head
1aeec1014a007294bb9e30500637144fd92a5960    refs/pull/13/merge
623b42fb1d75e6112cf09f22b76d78662436e5dc    refs/pull/14/head
d54049aaee01743743043dffecfcd1b7da897ffc    refs/pull/14/merge
992e27761dbb848c8c7334241bd9801e25b7dbdf    refs/pull/20/head
be4e98c07ddcaed29f487d3ec8d719e0688f975a    refs/pull/20/merge
4070760842102898f42c9eeb336a3a37880baece    refs/pull/21/head
b08f983ac5f8af53e8190b0b2b43082f2c0be619    refs/pull/21/merge

The real issue is not that we added these, but that hg-git is converting commits outside the refs/heads namespace to changesets, which it should not be doing. It breaks with this, but it would also break with other stuff kept there, like replacement refs or notes. I can look at the hg-git source (which I haven't touched in a while) to see if this is a quick fix, but this is an artifact of that bug. Once that is fixed, you should be able to get these out with a rebase strip, I believe.
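To make the intended behavior concrete, here is a minimal Python sketch of the kind of filtering I mean, applied to `git ls-remote` style output like the listing above. This is not the hg-git source: the function names `parse_ls_remote` and `convertible_refs` are hypothetical, and keeping `refs/tags/` alongside `refs/heads/` is my assumption about what an importer would want.

```python
SAMPLE = """\
9caa40dca60562aade441b6c14020e1716b55de3    HEAD
9caa40dca60562aade441b6c14020e1716b55de3    refs/heads/master
bedc1bda8c38455df74e5acf54b0df5cb09c9494    refs/pull/13/head
1aeec1014a007294bb9e30500637144fd92a5960    refs/pull/13/merge
"""


def parse_ls_remote(text):
    """Map ref name -> sha from `git ls-remote` style output."""
    refs = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # ignore blank or malformed lines
            sha, name = parts
            refs[name] = sha
    return refs


def convertible_refs(refs, prefixes=("refs/heads/", "refs/tags/")):
    """Keep only refs in the namespaces that should become changesets.

    The default prefixes are an assumption for this sketch; refs/pull/*,
    notes, and replacement refs all fall outside them and are dropped.
    """
    return {name: sha for name, sha in refs.items() if name.startswith(prefixes)}


if __name__ == "__main__":
    for name, sha in convertible_refs(parse_ls_remote(SAMPLE)).items():
        print(sha, name)
```

With a filter like this in front of the conversion step, the refs/pull/*/merge commits would never be imported, so they could not show up as extra heads.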

I'm closing this; wasn't it fixed long ago by #223?