mackyle/topgit

Some scenarios result in unnecessary extra merge commits to get everyone "in sync"

pmolodo opened this issue · 3 comments

ie, in general, if we have two committers, A and B, and an upstream maintainer U, then if this happens:

1) U makes a change to a base branch
2) A fetches U, runs update (which creates new merge commits)
3) B fetches A, then runs update (which may or may not create new merge commits)

At this point, A will probably expect that, since B ran update after pulling A's changes... that they should have a "completely synced" version of the, and they will expect this to happen:

4) A fetches B, then runs update (no new merge commits made, fast-forward only)

Now, I know that it may not always be feasible or easy to meet A's expectation here... but I think that it should at least be the goal. Unfortunately, if the scenario is that there's a step 2.5, between 2 and 3:

2.5) B makes a commit on a topgit branch

...then it may be the case that our expectation is not met, and at 4), when A runs the update, they end up making a new merge commit.

For the workflow at our company, at least, this issue has created a number of headaches for our developers. We've ended up with scenarios with what I've called "battling top-bases" as a result, with a lot of confusion... and confusion tends to create merge conflicts / mistakes.

I have a script which provides a replication scenario for this.


In theory, however, this should be relatively simple to avoid. I think the problem is that, right now, the general way in which an update for a topgit-branch mybranch happens is:

  1. We update top-bases/mybranch by merging in any of it's local dependencies
  2. We handle merging of remote branches (at least for remote/top-bases/mybranch... and possibly handle remote/mybranch, too, I think)
  3. We merge top-bases/mybranch into mybranch

The problem with this ordering is that the remote branches in 2) are likely (or, at least, it's possible) to contain some of the merge work we've already done in 1)... but by the time we get to 2), it's too late, and some merges are now duplicated between top-bases/mybranch and remote/top-bases/mybranch. I think if we just made the policy to merge remotes FIRST, we could avoid some unnecessary extra merge-duplication. That is, I think we should have an ordering that looks roughly like:

  1. We merge remote/top-bases/mybranch into top-bases/mybranch
  2. We merge remote/mybranch into mybranch
  3. We merge the dependencies into top-bases/mybranch
  4. We merge top-bases/mybranch into mybranch

The key is that at steps 1 and 2, we may be doing fast-forward merges... so that when we get to steps 3 and 4, the merges may already be done, so we end up re-using more work.

(As for why I think we should do step 2) before 3), see this issue: #6)

I took a quick stab at doing this myself, by switching around the ordering of pieces within update_branch_internal() in tg-update.sh (ie, swapping the "First, take care of our base" section with the "Second, update our head with the remote branch" section) - but I think there's some assumptions about order of operations, and I ran through the time I had available to try to sort it out. I'm hoping this would be a bit easier to do for you...

Here's a replication script - it's a bash script, but github wouldn't allow that file type, so I figured I'd just change the extension, rather than put it in a zip file...

merge_order_test.txt

If you carefully watch the messages go by during an update -- for example, using the test script you provided -- you will see that it indeed claims to be updating the local branch's base with the remote base.

That being exactly what you want it to do. :)

Unfortunately, you've uncovered a bug that's been around since topgit-0.19.4. Another side effect of the bug is that tg summary still shows things not being up-to-date after the problematic tg update has been run.

In any case, the fix has been included in the topgit-0.19.13 release and there's a new test for this situation t5021 to make sure it doesn't come back.

Great! Glad to hear this has been fixed!