Why is squashed merging your preferred workflow?
marten opened this issue ยท 8 comments
I'm wondering why you chose to adopt this workflow. Do you really want to get rid of the individual commits, or do you only like the simplified history? We're looking at using this tool because it closely matches our current workflow, with the biggest difference being the squashing.
It's probably a lack of understanding on my part, but I'm wondering what advantage I'm missing, since the cleaner history can also be gotten with git log --first-parent
and seeing everything that was merged in a specific commit can be achieved with git show -m SHA1
(provided you do --no-ff
merge commits).
This might also be something to add to the README or Wiki, since I can't be the only one wondering this.
Hi Marten!
So you definitely aren't the first one to bring this up and we're thinking about building a way to configure the type of merge, but here's some background.
When it really comes down to it, the only place we care about enforcing a particular style of commit is in the master
branch. We don't care if you make a thousand commits to get there, the only thing we care about is the individual features that come in from each (small) pull request.
And while the history is nice, the biggest advantage of using the squash merge is that over time, git blame
becomes way more useful. You get to see for every line of code in your project, not only the person who changed it, but their commit in the full context of why that change was made, including an easy-to-reference link to the pull request and ideally (through the pull request description), a link to the ticket tracker. So we can tie any line of code all the way back to the ticket that caused it's creation.
And over time, that's all we really care about in the history. Who made this change and why was it made. Squash merging allows us to do that while still giving all of our developers the individual freedom to develop in the way that suits them best. To try and enforce commit styles in branches owned by other devs is to me, micromanagement that will go against the best results.
Hope this helps!
Thanks,
Nick
P.S. I'll close this and link to it from the README.
@nhance Sorry! I tend to use strong words to express myself as long as I'm not using them to insult people publicly I'd never call someone something, that would be rude and is hardly ever actually true anyways. But calling people "not something" is something that I'll use more freely, but possible shouldn't.
Anyways, what I meant was something along the lines of "It's probably a lack of understanding on my part, but...". Given that this issue thread is going to be linked up, and that there's an edit button available, I hope you'll forgive me for rewriting history to make my intentions clearer. ;)
And I hadn't thought about blame. That's a good point.
I'm usually not in favor of misusing issues trackers for discussions that should better be led on mailing lists, but now that it started here and even is linked from the main page, I'd like to add a few things:
- Even without squash-merge you can quite easily find out which merge introduced a specific commit, e.g. by using git-when-merged.
- I disagree that
git blame
"becomes way way more useful" with squash-merge, because most of the time when you blame you'd like to understand why / in which context some one did a change (and who). A good developer puts exactly that reasoning into the commit message that you lose when squash-merging. - In general, you lose granularity in your history, which would be good to have e.g. for
git bisect
, too, to track down commits that introduce bugs. Sure, less commits (as a result of squash-merging) may lead to fastergit bisect
times, but on the other hand you cannot just revert that single commit from the original feature branch anymore that introduced the bug, but keep the rest. - With a lot of feature branches, at least I sometimes forget whether a particular branch was already merged or not. When you use squash-merge, you lose the capability to quickly check this with a simple
git log feature --not upstream
or similar.
This thread is a little old, but hopefully it's okay to revive it. I'm curious if there was ever any progress in configuring different kinds of merge workflows?
The idea of using a squash merge is one of the really attractive things about reflow to me, because it allows one to use pull requests without the baggage of merge commits all over the place that mess up a clean linear history. I totally get that this makes viewing the git log easier, git blame, etc. Totally on board there.
However, there are times when a PR / feature can validly consist of logically distinct changes. Not a whole bunch of them, but at least some times more than one. Have you thought of git merge --ff-only
? Seems to accomplish the same goal while still allowing multiple commits to get merged to master. Though, I suppose this does require rebasing the feature branch against master and force-pushing the remote feature branch before deliver
. Maybe it is this difference in workflow that doesn't fit as well inside reflow
?
I totally get that this makes viewing the git log easier, git blame, etc. Totally on board there.
Squash merges do not make git blame
easier. git blame
does not really care if it has to go through a few more commits. In exchange, you get the exact commit and context (from the commit message) if you did not squash. Finally, it might be that different people contributed commits on the squash-merged branch, and you might ending up blaming the wrong guy in that case.
Have you thought of
git merge --ff-only
?
So what if the merge fails because the target branch cannot be fast-forwarded? Then you would probably have to rebase
. So you could as well skip the merge and always rebase, as in the case git merge --ff-only
would succeed rebase
would be a no-op.
There has been some discussion on changing the squash merge (see #111 and #149 for some reasons why); and I believe we have decided to alter our preferred workflow soon as we've found that squashing commits (via rebase) in the feature branch just before merging to master yields a little better result than the squash merge into master directly. It still does not resolve the issue of clearly labeled attribution for multi-authored feature branches, as the authors will be hidden in the squashed commit message and attribution given only to the author performing the squash. However, there are some best practices already for handling that.
That said, we are currently working on building out workflow customizations for altering our standardized process for each command. For a very rough idea of what we have planned you can view my WIP on using it here. So even if you still prefer the squash-merge to master approach that exists now you should be able to add your own customization fairly easily.
the baggage of merge commits all over the place that mess up a clean linear history
I feel like that is a reason one might use if they aren't aware of the many options git has for examining history. Specifically: git log --oneline --first-parent
which if done on master, gives one-line-per-feature-branch-merge no matter what method of merging you use.
Lines like:
{sha} Merge {branch-name} (#pullnumber) into master
A "linear history" doesn't really describe git, nor software development. Things are done in parallel in real life and it makes sense to not fake that in git. When reviewing history git can be made to give the commits in almost any order / fashion desired, there are a million options.
We switched our model from a rebase based workflow to the squashed merge workflow because we have a large team of developers (around 60) at many levels of familiarity with git, and rebase as a standard part of the process requires force pushing, which is confusing and scary to those unfamiliar with it. Squash merge reduced resistance to git reflow from many people on the team because now regular merges can be used everywhere to get the distributed updates from the rest of the team as often as desired and those merge commits will not clog up the git history anymore thanks to the final squash merge.
All of the fancy things you are mentioning are great for git experts, but the reality is we work with lots of people who don't know git, and although they could learn it, I'd rather they not mess up my repos with rebase and force pushes while they learn. merge, merge, merge, then squash merge, is safest.
git reflow reduces git to a simplified workflow that just works, and is only 4 commands people need to learn.