Should you squash commits or merge them as they are?
There are certainly strong proponents on both sides of this debate. It becomes more complex when you consider the third option - rebasing. I never rebase, so I’m not going to discuss it here. The choice between merging commits as they are and “squashing” them is not always clear. Each side will fiercely extol the advantages of its preferred approach and the disadvantages of the other. It soon becomes a philosophical debate.
Merging commits as they are could be considered the default behaviour of Git. The source control history is preserved exactly as it happened. The argument for merging commits exactly as they are is that the commits form a history of the codebase and are therefore valuable because of the story they tell. The commits can be used to track down bugs, for example. It is easier to find when a bug was introduced if there are lots of small commits instead of many compressed (squashed) into one. The argument against this is that too much noise in the history makes that kind of bug hunting harder in practice.
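The bug-hunting argument can be illustrated with a small sketch. Git's `git bisect` command performs a binary search over the history to find the commit that introduced a problem. The Python below is not Git's implementation, only a model of the idea: the commit messages, the `first_bad_commit` helper, and the assumption that a bug stays present once introduced are all hypothetical.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary search for the earliest bad commit.

    Assumes commits are ordered oldest to newest, the newest one is bad,
    and every commit after the first bad one is also bad.
    """
    low, high = 0, len(commits) - 1
    while low < high:
        mid = (low + high) // 2
        if is_bad(commits[mid]):
            high = mid       # the bug is here or earlier
        else:
            low = mid + 1    # the bug arrived later
    return commits[low]


# Hypothetical history merged as-is: five small commits.
granular_history = ["add form", "add validation", "fix typo", "add caching", "update docs"]
broken_from = granular_history.index("add caching")

culprit = first_bad_commit(
    granular_history,
    lambda commit: granular_history.index(commit) >= broken_from,
)

# The same work squashed into a single commit: the search can only
# ever point at the whole feature.
squashed_history = ["add form, validation, typo fix, caching and docs"]
squashed_culprit = first_bad_commit(squashed_history, lambda commit: True)

print(culprit)
print(squashed_culprit)
```

With the granular history the search lands on one small commit. With the squashed history it can only say that the bug is somewhere in the whole feature.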
Proponents of squash merges argue that they are more valuable than merge commits because an entire new feature or bug fix can be compressed into a single commit, which is therefore easier to code review and to read at some point in the future. Another benefit, related to the noise mentioned above, is that squash merges keep clutter out of the source control history, such as typo fixes, files that were accidentally missed earlier, etc. It keeps the history “clean”, as they say. However, proponents of merge commits vehemently dislike this concept, feeling that much history and context is lost in the process.
So, what’s the best choice?
I can honestly see the advantages and disadvantages of both sides of the debate. I don’t have a good answer for which should be the default option for a codebase. I like the ability to see every commit, but I also like a cleaner-looking history. I think this puts me in the minority simply because I haven’t become enthusiastically attached to one way or the other.
I think that factors such as team size (including teams of one developer) and how the source history is used (if at all) should be considered before picking an approach. For example, I am much less inclined to use squash merges on the repository for this site as I am the only developer. However, at times I have squash merged when I’ve had many commits with messages I didn’t want in the source control history. On the other hand, I am more inclined to use squash merges on team projects where other developers may read the source history.
Another important consideration is the use of automated tooling and processes that read commit messages as part of the CI/CD build pipeline. It is common for such tools to add new features and breaking changes to the CHANGELOG.md file. How those tools are able to do this can depend on whether all the commits are present or whether they have been squashed.
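To make the tooling point concrete, here is a minimal sketch of how a changelog generator might read commit subjects. It assumes a `type: description` message convention with `!` marking a breaking change. That convention, the section names, and the `changelog_sections` helper are illustrative assumptions rather than the behaviour of any particular tool.

```python
import re
from collections import defaultdict

# Assumed convention: "type: description", with "!" before the colon
# marking a breaking change (for example "feat!: remove legacy API").
SUBJECT_PATTERN = re.compile(r"^(?P<type>\w+)(?P<breaking>!)?: (?P<description>.+)$")

# Hypothetical mapping from commit type to changelog section.
SECTION_BY_TYPE = {"feat": "Features", "fix": "Bug Fixes"}


def changelog_sections(subjects):
    """Group commit subjects into changelog sections, skipping unrecognised ones."""
    sections = defaultdict(list)
    for subject in subjects:
        match = SUBJECT_PATTERN.match(subject)
        if match is None:
            continue  # e.g. "typo": no recognised prefix, so it is ignored
        if match["breaking"]:
            sections["Breaking Changes"].append(match["description"])
        elif match["type"] in SECTION_BY_TYPE:
            sections[SECTION_BY_TYPE[match["type"]]].append(match["description"])
    return dict(sections)


# Hypothetical branch merged with every commit present.
merged_subjects = [
    "feat: add dark mode",
    "fix: correct footer link",
    "feat!: remove legacy API",
    "typo",
]

# The same branch squashed under a single subject line.
squashed_subjects = ["feat: add dark mode"]

merged_changelog = changelog_sections(merged_subjects)
squashed_changelog = changelog_sections(squashed_subjects)

print(merged_changelog)
print(squashed_changelog)
```

In this sketch the squashed branch yields a single changelog entry, and the bug fix and the breaking change disappear unless the squash commit's message is written to carry them. A real tool may also read commit bodies or pull request titles, so the outcome depends on how it is configured.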
What can I agree on then?
Whichever side of the debate you, the reader, are on, I certainly hope that as professionals, we all agree that keeping the build green (build pipelines running successfully and all tests passing) should be considered not just a default but expected for any software project regardless of chosen source control processes.
Improving your Git experience
Regarding Git tooling, I’ve found GitKraken to be a valuable asset. I use it for almost all of my Git usage simply because Git branches and commit history are far easier to understand visually - regardless of merges or squash merges. It’s not free, but it is absolutely worth it for the time savings and a less frustrating Git experience. Consider buying it through my affiliate link to support quality software and have a chance of winning a $100 Amazon gift card.
Wait, so why don’t you like rebasing?
I don’t plan on going to great lengths to explain my dislike for rebasing, but I knew there would be a lot of questions, so I’ll write a small note on it. Fundamentally, I dislike rebasing because of its possibly destructive nature. Sure, there are also a number of proponents of rebasing - usually spouting pithy advice such as “It’s not destructive if you use Git right”.
Git is already hard enough to use, with a very poor and inconsistent developer experience that leads to too many mistakes. Telling developers to just “use Git right”, when a rebase could result in overwriting a team member’s work, feels somewhat thoughtless. Not to mention, there are many explanations and guides with elaborate diagrams showing fictional repositories being rebased and “rewriting history” (that’s the destructive “oops I deleted your code” part) that I find scary.
Finally, merge conflicts during a rebase result in some of the worst developer experiences imaginable. The conflicts are presented one commit at a time, ad nauseam. This is confusing and makes it easy to pick the wrong side of a conflict. Reversing a rebase is difficult. Merge conflicts during a merge are presented all at once - I’d much rather pair with the developer my commits are conflicting with and work together through a single commit than potentially dozens.