TBH, at least at every company I've worked at recently, the style of the commits is more or less irrelevant to me save for one very important piece: The commit starts with the ID of the ticket for the code change (or some placeholder for very small commits where a ticket wasn't necessary, e.g. [no-tix]). The biggest benefit of this is that these commits can then be automatically linked to the ticket which often has a lot more detail and conversation.
99% of the time when I'm looking at commit history, I'm not just generically scanning. I'm usually looking for a particular change, or when something touched a particular file, etc. Even the example given in the linked page isn't really relevant to me, because I'm rarely "quickly scanning" a whole commit log for something in particular - I'd always just use a search feature (in GitHub or some git IDE plugin) for that.
To me, the ticket id is extra info and goes at the end of the message body, not in the title at all.
I've seen far too many people think that putting the ticket id into the commit message can be a replacement for writing a good commit message. I've even seen people think that "Fix #123" is all they need to put in there.
The commit history should be the primary source of truth for what changes have happened.
I've also been at orgs where they've switched from one ticket tracking software to another, and haven't maintained history, or even when they do, the ticket ids end up changing afterward.
While I agree the commit message should summarize what happened in the commit, I pretty strongly believe that the linkage between git and the ticketing system should be considered an absolute "first class relationship". There is simply too much information/history to consider that linkage just a "nice to have".
Regarding migrating to new issue tracking systems, "not maintaining history" is absolutely a critical fail at that org. Every decent tracking system I can think of has ways to import issues from other systems - skipping that step is pretty inexcusable for all but the earliest stage projects. And the vast majority of them have ways to transparently map ticket IDs. E.g. in Jira if you migrate a project and give it a new project ID, the old IDs can still be used and they will transparently redirect to the new ID.
I’ve never seen a ticket system migration keep links alive. I’m sure there are professional IT centric orgs out there that prioritize this, but in my experience, the version history (master branch) is what’s kept in practice, to keep migration cost down. Git blame is the best you can hope for, long-term.
I've found that there are mainly two kinds of tickets:
1. The perfunctory kind, where you make the ticket just to track the work, when the work is already obvious and well-known. There's no discussion in the ticket; you just create it, do the work, and close it.
2. The kind where you end up with a lot of back and forth between various stakeholders about what the requirement/feature/whatever is, or about how to reproduce a bug that needs to be fixed, or something like that. In this case, there's a lot to wade through, and while the history can be an interesting artifact as to how everyone got to the end state of the git commit that closed the issue, it's overly wordy.
And sure, there are tickets that are somewhere in between.
But ultimately I think the kinds of tickets that are closer to #1 don't even need a reference in the git commit, because pretty much all of the ticket's text should be in the commit message anyway. And for #2, there's way too much information in the ticket anyway; it should be distilled down to the important details for the commit message. And yes, a reference to the ticket should be provided as well, for people who want or need to wade in to get the full context. But that goes at the end of the commit message, after the change has been adequately described.
> "not maintaining history" is absolutely a critical fail at that org
Absolutely agree! But that doesn't mean it doesn't happen (and often!), and when it does, all your ticket references become useless, so you'd better have a good, thorough description in the commit message, or all context is lost.
> Every decent tracking system I can think of has ways to import issues from other systems
But almost universally you end up with new ticket IDs when that happens, so the references in the commit messages become useless. Sure, many of those importers will start off the description in the imported ticket with something like "Imported from JIRA ID FOO-123", but that requires extra effort to search. Sure, if you migrate from one instance of JIRA to another instance of JIRA, there are ways to keep those links intact/redirected, but I'm not aware of any ticketing system that somehow preserves URLs/IDs from completely different ticketing systems.
> I think the kinds of tickets that are closer to #1 don't even need a reference in the git commit, because pretty much all of the ticket's text should be in the commit message anyway
Not really, because even tickets without body or comments have links. At the minimum, a decently configured tracker must have links to all the other commits involved. And even when there is only one commit, it's useful information unless all tickets in your project are done in exactly one commit.
What if there was another way to handle things like #2? What if you got everyone that needed to be there in the same "room" and had a synchronous conversation in order to resolve all ambiguity?
We don't call them tickets, because we aren't "ticket takers" and we don't expect them to contain all of the information -- how could they, it's our job as implementers to discover and understand. They are work items, and they are placeholders for conversations. We use high-bandwidth communication to reduce, and often eliminate, the "back and forth" that is so common on many teams.
We keep logs of our work both on a personal and project level, which includes decisions from these high-bandwidth conferences. When we have stakeholders that are hard to get meetings with we schedule a regular cadence with them so we can get their time at least once a day/week/whatever.
If you work in a regulated industry (medical, aerospace, auto) you need traceability of all code changes, so even your case 1 needs a ticket reference. Was it a bug? A feature change? Is it linked to requirements? Has it been tested and verified? Which release will it be in?
It's still a critical fail, and it's not like you can blame not properly migrating your ticketing system on someone not having permissions. And in practice, I find the opposite to be true: many more stakeholders with necessary reason to see the state of an issue are likely to have access to the issue tracking system than source control.
But more importantly, this thread somehow got off the rails into a debate about only leaving a ticket ID in the commit. I never argued for that, and I don't think anyone else did either. A commit message should still state what happened, code-wise, in a particular commit (e.g. one ticket may require many commits or PRs). But I still think the ticket ID is the most important piece of info in the message, as it ties the commit to important context about why the code change was made in the first place.
Colour me old fashioned, but these tickets are usually web documents that can be located by a URL. It’s ok to use URLs for links!
They are too long to fit into a title but fit nicely into the body text…
See https://bugs.com/8765309
…with the advantage that many tools already know how to make these magic strings into clickable links. No need to rely on parsing #<id> or !<id> or whatever.
Having worked at several places for a while, I've found you end up changing the tracking system url over time. This has happened to me at the job I worked at for 8 years, the one for 3 years (actually more than the others), and the recent 10 year tenure.
The first place was pre-jira popularity and we maintained our own system, or used systems from our clients.
The second place was a game studio and had many more bugs than any other place I've worked. One game title we had, we had around 29k tickets, and cleared out 7k in our biggest week. The speed of the UI actually impacted the effectiveness of our throughput at one point. We started with Traq, then hansoft. But I think jira won out over hansoft due to the ability to customize it and then sony just maintaining a jira for us that their teams used to coordinate with us, so kinda a no brainer.
The last one is interesting as Amazon has many tracking systems, and even if you use jira, over time the jiras got so crowded that they had several; we used jira-4 (out of 7 or 8) before I set up one for our org. At twitch we've been consistent in using one jira instance since I've been there, but I think we've changed the url 4 times during my tenure.
In all of these places we've had many codebases that outlasted the urls. All of those places I've been we just use the short numbers like AND-3421.
In the cases I needed to click them I could just use alfred or a tampermonkey script. It's a lot shorter as well.
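For what it's worth, a minimal sketch of that kind of helper in Python (not the Alfred or Tampermonkey script mentioned above, just the same idea). The base URL and the key pattern are assumptions made up for illustration. The short keys stay in the commit history, and only the one base URL has to change when the tracker moves.

```python
import re

# Hypothetical base URL: keeping it in one place means a tracker move
# only requires changing this value, not the commit history.
TRACKER_BASE_URL = "https://tracker.example.com/browse/"

# Assumption: ticket keys look like "AND-3421" (uppercase project key, dash, digits).
TICKET_RE = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def ticket_urls(text, base_url=TRACKER_BASE_URL):
    """Return a URL for every distinct ticket key in text, in order of appearance."""
    seen = []
    for key in TICKET_RE.findall(text):
        if key not in seen:
            seen.append(key)
    return [base_url + key for key in seen]


print(ticket_urls("AND-3421: fix crash on resume (follow-up to AND-3390)"))
```

Passing a different `base_url` is all it takes to point old commit messages at a new tracker, provided the keys themselves survived the migration.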
My work isn't what I would call highly professional in terms of infra, but for us the old URL from our Jira move is dead and gone (including the parent domain) but the ticket references work in the new instance (redirecting to a new project name also). I think you just have to pick your poison, nothing's going to work all the time without small amounts of effort from IT.
Definitely - we recently moved issue tracking to another system after 15+ years and found there were too many links to abandon, so we run a service that just redirects old URLs.
> The commit history should be the primary source of truth for what changes have happened.
It's just super impractical. The work is discussed / documented in the project management systems, I'm not going to copy it extra (from design document, the ticket description, ticket discussion, screenshots) for the SCM. Too much work, too small benefit.
> I've also been at orgs where they've switched from one ticket tracking software to another, and haven't maintained history, or even when they do, the ticket ids end up changing afterward.
I've been in orgs where they've switched SCM and didn't keep history. Neither should happen.
> It's just super impractical. The work is discussed / documented in the project management systems
I agree, it is super impractical to have to read the whole discussion again instead of having a comprehensive description of the agreed upon result of that discussion. Therefore, writing good commit messages is really valuable.
My team uses git to develop in separate branches and merge into master once ready. It's quite practical for that purpose. We don't use it for documentation of changes made, we have better tools for that.
> 99% of the time when I'm looking at commit history, I'm not just generically scanning. I'm usually looking for a particular change, or when something touched a particular file,
Exactly this! In our company's commit messages, I start with the component or file I've touched, such as 'Config: update year in example value' or 'make check: reduce false positives for erroneous backslash detector' (context: we clone the repository for short-lived projects which result in a document, so having an up-to-date year value is handy and we check the PDF build for common issues)
Not only does it help others find when a particular change was made, it also gives context when they're scratching their head "what do you mean updated year value, where do we hardcode year values? ... Oooh, in the config file where we keep example values, right that makes sense!"
> when I'm looking at blame/history I'm usually very aware of what the code is doing already. My question at those times is less about "what" and more about "why".
By saying the change intends to reduce false positives, the regex change makes a lot more sense. Commit messages are often in the style of "improved backslash detector". You might wonder why the old version was broken, or how it is an improvement if it detects fewer instances of the problem. By knowing the why (false positive issues), things click into place.
Author here. We have actually managed to use the same task tracking tool for the three years our project has been running: Linear. We don't do pull requests, and we don't put our work item number in our commit messages.
I can count on zero fingers the number of times I have ever wished I had this.
We do, however, keep daily work logs (each person submits them) so we have a log of everything that everyone worked on. We have Slack history, which is sometimes referred to.
I'm curious what folks find this useful for. What are the scenarios where one has to refer back to an issue from a commit?
By the way, I also have the experience of inheriting a code base where the commit messages were mostly (or just) issue numbers. Guess what I didn't inherit? The issue tracker. The commit messages were useless to me.
Relying on Slack history is living dangerously. Viz., your org gets swallowed (or, is already a cog in a bigger enterprise) and Infosec/CISO starts to truncate all histories past NN days.
Got it, thank you for the links. As I suspected, we just don't have these problems. Rather, we have other things that cover these needs when they arise that we have full ownership of (i.e., it won't matter if our issue tracker is replaced).
Specifically those are work logs (individual daily logs), project logs (logs for larger projects which may consist of a work cell of a few people and several work items), material logs (logs for specific equipment, virtual or otherwise), work in progress notes (notes for ongoing initiatives, including checklists, etc.), etc. They are all text files contained in git repositories that are always available for reference. We can also refer to Slack conversations, but most of our conversations are done live so as to avoid hand-offs, so we must rely on recording of decisions by folks on the team.
We use our work tracking system just to track work -- placeholders for conversation.
> The commit starts with the ID of the ticket for the code change (or some placeholder for very small commits where a ticket wasn't necessary, e.g. [no-tix]).
Taking this a bit further you can make sure your branch names have the ticket in them, which you then extract in a git hook (prepare-commit-msg) and put as a trailer at the end of the message using ‘git interpret-trailers’
This reduces clutter when looking at a list of messages (eg git log --oneline) and makes it easier to script things based on the ticket numbers (other git commands can pull out the trailers)
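A rough sketch of such a hook in Python, in case it helps. The trailer name `Ticket` and the key pattern (`AND-3421` style) are assumptions, and `git branch --show-current` needs Git 2.22 or newer. Treat it as a starting point rather than a finished hook.

```python
#!/usr/bin/env python3
"""Sketch of a prepare-commit-msg hook that adds a ticket trailer."""
import re
import subprocess
import sys

# Assumption: ticket keys look like "AND-3421" (uppercase project key, dash, digits).
TICKET_RE = re.compile(r"[A-Z][A-Z0-9]+-\d+")


def ticket_from_branch(branch):
    """Return the first ticket key found in a branch name, or None."""
    match = TICKET_RE.search(branch)
    return match.group(0) if match else None


def current_branch():
    """Ask git for the current branch name (empty string on a detached HEAD)."""
    result = subprocess.run(
        ["git", "branch", "--show-current"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def add_ticket_trailer(message_file, ticket):
    """Append a "Ticket: <key>" trailer to the commit message file in place."""
    subprocess.run(
        ["git", "interpret-trailers", "--in-place",
         "--trailer", f"Ticket: {ticket}", message_file],
        check=True,
    )


# Git calls the hook with the path of the commit message file as the first argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    ticket = ticket_from_branch(current_branch())
    if ticket:
        add_ticket_trailer(sys.argv[1], ticket)
```

Saved as `.git/hooks/prepare-commit-msg` and made executable, this leaves the subject line alone and only touches the end of the message. Something like `git log --format='%(trailers:key=Ticket,valueonly)'` should then pull the values back out for scripting.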
This tends to become useless after a few years because of tool or software migrations, changes in JIRA/whatever structure, ticket retention policies etc.
I've been using the issue key prefix for years now and I never thought about this aspect. If a board gets archived, for example, those issues need to remain available. You're also just tied to that tool like you mentioned. It would probably be good practice then to also include a blurb in the PR description so that never gets lost.
Point of order, PRs get lost too. Git repositories can be moved from GitHub to GitLab, or tar'd up and sent in an email when a project gets handed off to its new maintainers. We try to keep things that matter in text files in repositories. We will always own those. Tools are just tools for transient activities (work items in our task tracker are just placeholders for conversations).
Or even the most mundane: corporate mandates that your project name change because of RULE with new access requirements.
You get a new repo in GitHub, and all GitHub activity is lost for REASON.
Tools never persist. I’d even say that ticket numbers in commit messages are harmful, since they make it much easier for the committer to be sloppy and write “Fix bug for JIRA123”. You can say “that’s a people problem”, but with a growing team I can either fix that people problem over and over, or I can ask for no external ticket ids in commit messages.
Very much agree. I've seen it leak into conversation in slack too... "Hey, I have a question about JIRA123, should we use blue for the button?". There's no greater way to tell someone you don't care about their time than presuming to force them to look up work items every time you want to refer to something.
I believe that if teams look hard into the reasons they think that that practice is valuable, they will uncover even greater culture and process problems to address. Once they are addressed, they will no longer need them, and they'll even start to see the harm in them.
Most teams won't do that, however, and I will be dragged out to the town square for even suggesting it's possible (even though our team operates extremely well without this practice or anything resembling it).
I agree that the ticket id should go somewhere in the commit message. Certainly if things like pull requests are tied to tickets. However I get a little concerned that such sentiments are so popular here. Because the ticket id is a nice contextual link. But the commit message should be self-contained with regards to the change.[1]
It’s a version control system. That includes an explanation as well. The explanation for each revision is integral.
And I don’t think people here really disagree with that. Just that they might prefer to put all that integral information somewhere else—the information (princess) is in another castle, Mario. Why? Why not at least just copy-paste the relevant information into the version control system proper? I see this all the time; people often do the same with GitHub pull requests—paragraphs upon paragraphs in the pull request description and then just a “Merge <convoluted URL>” in the merge commit message!
It’s not that I blame people for this. Maybe Git is just so unergonomic that many would prefer to offload everything into another interface (even Jira).
[1] My tickets will usually have all the history. Maybe a lot of back and forth. The commit message though will just have what was done when all was said and done—no back and forth, just the result. And some of my tickets probably don’t even have a description body since what I ended up doing is just described in the commit message.
I prefer to go the other way. Put the commit ID(s) and/or links in the ticketing system as part of ticket comments. Now if the ticket system changes or is exported to another system the commit ID(s) will still be in the record. They are much less likely to change plus you can have multiple commits for an issue.
Wait until corporate drops jira, or migrates prem to cloud or vice versa with data loss, or revokes your access to an old project id the app was run on.
Git commit info needs to live in git. Ticketing systems will always die. The source remains.
I look at commit messages every day. I actually don't love a message that just describes the code change (I can figure that out on my own if I look at it long enough). That's fine for bigger or more misleading changes, but when I'm looking at blame/history I'm usually very aware of what the code is doing already. My question at those times is less about "what" and more about "why". Why was this change made? Sometimes that can be found by looking at accompanying tests (if the original author was so gracious), but if it's not in the commit (or only obtusely so, as many developers I know tend to commit in what I'll call "stream-of-consciousness" style) then I've got to go outside the codebase to research by interviewing people (assuming they're still around and remember) or trawling through our project management system to try and line up released features/fixes with the change. It quickly becomes an inexact science.
That said, in my experience, the larger an organization, the more commit messages get looked at. On my side projects where only 1-3 people are involved, commit messages only really show up on the PRs as things are going in. Sometimes we complete our own PRs to keep up the velocity since we're all working async part-time. Rarely if ever do we look back to figure out why we did something because the project is so small.
I agree with your first paragraph, but I don't think it's super relevant to the article: the why of a commit usually takes some time to explain, and the subject line is nowhere near sufficient, so that usually goes into the first paragraphs. For non-trivial commits I try to follow the "why, what, how" template: first explain the issue / concern / background for the change, then clarify the change if the diff is not super obvious or is a bit large, and finally explain the implementation changes if there were options.
In my experience the former and latter are the most important when blame-diving, I want to know the context / driving force / use case for a change, but I also want to know the scope of consideration, because e.g. sometimes I'm thinking of a fix which was considered and rejected for still-valid reasons, while at other times it was just considered unnecessary back then (YAGNI) or the blockers have since been removed.
Either way the subject needs to be snappy, so that it can quickly be selected for consideration or not e.g. if I'm blame diving and I see a commit which just does indenting / formatting, I probably can skip over it without looking at the actual change.
I mostly agree, though if I could only have “what” or “why” I would choose “why” every time. To me this is similar to describing the problem in product design. I wish the “why” was more often concise enough to be the prominent feature in a commit, because commits are difficult/annoying to amend (I’d be surprised if more than a couple people where I work knew how) and if something is going to be left out, it is the “why”.
Can't disagree with that. I think the order of importance is why > how > what, because the first two are the least likely / possible to infer from the commit contents. So an explicit record is most crucial.
> My question at those times is less about "what" and more about "why". Why was this change made?
Isn't the essence of the commit message body to delve into the 'why'?
The ability to swiftly glean the 'what' from the commit subject line aids in identifying the appropriate commit for review. The commit message body can then have a detailed explanation of the reasons behind the change.
Putting "what" in the message can be fluff, in the same way repeating the code in a comment is fluff. But the broad strokes or considerations of the what may not be obvious from the diff. Expressing what the commit intends as prose can help with understanding, with discarding irrelevant change data, with post-mortem analysis, ...
It is also useful for conveying intent. If the commit message describes one thing, and the code does the opposite (or doesn't do it at all), that's something the "gatekeeper" can and should call out.
I've certainly sent out changes for review where I forgot to include a crucial file (or subdirectory).
About to say this as well, but would like to add: when I have a bug dealing with Foo in Bar and I see a commit message like “Updated Foo to support Baz\nSee issue #123 for additional context. Also discussed in chat (link).”, I’m going to review that commit immediately without having to do any bisecting/blaming. Easy peasy.
Mostly though, I tend to see just “support baz” as a commit message, which drives me bonkers.
What if only people that had context did reviewing? As in, what if when someone needed a review, it was their job to ensure that the reviewer had the context necessary. That is, it was "push" rather than "pull".
What if a team's process was set up in such a way that separate phase-gate reviews were not typically necessary. That is, the team tended to work in pairs where there was constant reviewing going on, and the team was practiced at pulling people in when necessary to review specific parts, e.g., things they had never done before or needed additional guidance on.
Do we all remember waterfall development? Where PMs would write specs, then they would throw those specs over the wall to devs, who would then throw code over the wall to testers? Does anyone feel like, perhaps, we've done it again with pull requests and we've started favoring asynchronous hand-offs, rather than one-piece-flow?
1. I'm moving value from this address to that address through %rax register.
2. I'm copying this value from that place to that.
3. I'm making sure that these values are equal.
4. An invariant was broken and I need to restore it.
If you look at it, each next line is an answer to a "why" question applied to the previous line. Answers to "why" questions become the new "what", and then "why" is applied to them again.
The point I want to make is that it is easier to move backward through this list (in descending order) than forward. When you know (4) you can easily figure out what needs to be done. But if you have (1) you need to infer the next points one by one. Someone who just wrote the patch is probably unaware of the difference, because they know all the answers and can easily move from one to another, so they don't feel the difficulty. But the person who reads the patch starts with (1) and needs to climb the ladder in the opposite direction from the person who wrote it.
In bigger teams, scanning every diff instead of just the commit subject lines is significantly more time-consuming (a frequent activity in my experience).
I often decline proposed commits that lack concise, informative subject lines, and I hope other larger teams also do for efficiency and clarity. If that's considered gatekeeping, then I'm content to be one.
Understanding a diff usually requires knowing and remembering the codebase well, very clear code, a reasonably short and atomic diff, and time.
Commit messages, if good, are extremely helpful to check what changed e.g. in an open-source application you're using. Of course the commit messages can lie, so you should also check the diffs, but it's much easier to confirm what a diff does than deducing it.
Not to mention that for many changes you can only make a guess at what the intention was; if you have a message saying what was meant to be done, you can check that it really was done correctly.
Those are two very different things when it comes to commit messages.
For history, the linked argument is arguing that subject-first is better. Personally I'd prefer `[component] verb details` though, which should fix the skippability issues mentioned elsewhere. And of course you can enable diffstat to get a bit of an idea of the structure.
For blame, you have the full file context in front of you. Some blame frontends only show author, but assuming you have the commit message, what you really want out of it is a rough idea of the historical significance of that line. So definitely verbs first.
The subject line says what the purpose of the change is (CI/CD, tests, bug fix, new feature), what component it primarily focuses on, and then a plain text one line summary.
For example:
ci(GitLab): Add rust build directory to cache
This should improve build times by caching most of the compile tree between builds. In testing, I’ve observed builds drop from 2m to 30s when scheduled on a node with a primed cache.
It also forces me to keep my commits focused. If I can’t create a one line summary of what I’ve done, it probably needs to be broken up into multiple commits.
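A quick sketch of how a subject line in that shape could be checked mechanically, written in Python. The allowed type list and the 72-character limit are illustrative choices of mine, not something from the comment above or from any standard.

```python
import re

# Assumed shape: type(component): summary,
# e.g. "ci(GitLab): Add rust build directory to cache".
# The type list and length limit below are illustrative, not a standard.
ALLOWED_TYPES = ("ci", "test", "fix", "feat", "docs", "chore")
MAX_SUBJECT_LENGTH = 72
SUBJECT_RE = re.compile(
    r"^(?P<type>[a-z]+)\((?P<component>[^()]+)\): (?P<summary>\S.*)$"
)


def lint_subject(subject):
    """Return a list of problems with a subject line; an empty list means it passes."""
    match = SUBJECT_RE.match(subject)
    if match is None:
        return ["expected 'type(component): summary'"]
    problems = []
    if match.group("type") not in ALLOWED_TYPES:
        problems.append(f"unknown type '{match.group('type')}'")
    if len(subject) > MAX_SUBJECT_LENGTH:
        problems.append(f"subject longer than {MAX_SUBJECT_LENGTH} characters")
    return problems


print(lint_subject("ci(GitLab): Add rust build directory to cache"))
print(lint_subject("support baz"))
```

A check like this can only enforce the shape. Whether the summary is actually informative still needs a human reviewer.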
Conventional commits are great, especially if you add in commit linting.
Being able to programmatically increment semantic versions and automatically generate relevant changelogs is awesome.
It’s also nice to implement Commitizen[0] for a little hand holding until folks get used to the linting.
I used to care a lot about doing things the way that felt right to me, but now I just want some common standard that is easy for everyone to follow, easy to automate, and easy to verify programmatically.
Things like conventional commits and semantic versioning aren’t perfect, but they are quite good and apply broadly to many use cases with common tooling and conventions.
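To make the "programmatically increment semantic versions" part concrete, here is a small Python sketch of the usual Conventional Commits mapping: `fix` gives a patch bump, `feat` a minor bump, and a `!` or a `BREAKING CHANGE:` footer a major bump. The function names are made up for illustration, and real release tooling handles many more cases (pre-releases, 0.x versions, reverts).

```python
import re

# Header shape: type, optional (scope), optional "!", then ": ".
HEADER_RE = re.compile(r"^(?P<type>[a-z]+)(\([^()]+\))?(?P<breaking>!)?: ")


def bump_level(message):
    """Classify one commit message as 'major', 'minor', 'patch', or None."""
    header = message.splitlines()[0] if message else ""
    match = HEADER_RE.match(header)
    if match is None:
        return None
    if match.group("breaking") or "BREAKING CHANGE:" in message:
        return "major"
    if match.group("type") == "feat":
        return "minor"
    if match.group("type") == "fix":
        return "patch"
    return None


def next_version(version, messages):
    """Apply the largest bump implied by the messages to a 'MAJOR.MINOR.PATCH' string."""
    major, minor, patch = (int(part) for part in version.split("."))
    levels = {bump_level(message) for message in messages}
    if "major" in levels:
        return f"{major + 1}.0.0"
    if "minor" in levels:
        return f"{major}.{minor + 1}.0"
    if "patch" in levels:
        return f"{major}.{minor}.{patch + 1}"
    return version


print(next_version("1.4.2", ["feat(api): add export endpoint", "fix: typo"]))
```

Commits that don't match the header shape are simply ignored here, which is one reason the linting matters: an unparseable subject silently contributes nothing to the version bump.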
"feat" is something that adds value to product's end users. The things we put on release notes. "chore" is something that adds value to the engineering team. The things we have to do to keep being productive.
feat is just the short form of feature; chore is for annoying stuff like fixing lint issues or maybe package updates that don't require any code changes etc... those are the little chores we do to keep everything flowing.
It is irrational, but "feat" tells me the change is a marvel, something to be looked at. Webster says "an act or product of skill". Nah, it's a feature. "feat" sounds pompous. Shout it from the rooftops: everyone, this is a feat! Not a small feat. Aah. Feat.
"chore" is too vague for me. A "feat" can be a "chore" if I don't want to work on it. "chore"-s like the one you describe are occasionally feats (the no-quote kind).
What's tremendously useful for us to have in the commit message, is the ticket ID the work is related to in our issue management system, to the point where it is practically mandatory.
I see it as a matter of scoping communications. The issue management system includes a broader set of people than the git repo. You want the commit linked to the issue so that you can see the entire history behind the commit, including the business decisions, the designer making a call, the back and forth with QA etc. That can all be useful a year or two down the road if you want to understand why something was done a certain way.
Now there is some info about the change that perhaps only the devs would ever care about, and probably that can sit in the commit messages. That sort of happens naturally but there's not really a lot of it for us. But that seems to me to be the relevant thing from an organizational perspective, no one writes commit messages except for devs, and almost nobody except the devs reads them.
This is what I came here to say. Assuming the team is working from JIRA or something, put the ticket number in the commit message. The ticket itself is where your “why” goes.
It's really nice to have the actual reasoning in the commit history though. But maybe I was just scarred by the project that switched project management software like 5 times over its lifetime (but kept all of the commits).
I’ve been through this in a job. The commit messages were borderline worthless, ostensibly the context was in a JIRA ticket that hadn’t existed in a decade, and I couldn’t figure out if a change was a mistake or something clever and poorly documented. It’s frustrating, a huge waste of time, and runs the risk of introducing subtle defects into the code today.
I can't stand the ticket number in the message. It's an immediate waste of 6-10 characters on meaningless metadata that GitHub can pull from the body and link for you. Further, the numbers are sequential, so frequently VERY similar, and highly prone to typos, which is harmful to anyone even minorly dyslexic.
`git commit -m "Fix actual words that humans read" -m "Fixes: #234353"`
You could probably script this based on branch if you're one of those people who names branches after ticket numbers.
But my actual advice is that I think -m is a bad pattern that leads to bad commit messages, and omitting the message argument opens your EDITOR, and you should use that.
I wonder if Atlassian encourages and sells this as an ideal use case [for JIRA]. What a great way to make it really painful to leave their system if, in order to make sense of your repo history, you need to maintain the references to their system. They are making your repo dependent on their ticket system...
I look at past commits probably every week or two to track down why something is the way it currently is.
Having said that, I care very little about style and grammar of commit messages. Ideally I want them to say why the change was made, what the intended effect was, and where I can look for related work (bugs that tracked the work and have related commits or investigation attached to them, docs, whatever)
Hit rate on the information I'd like is not great, but if I get at least one of the three things then it's something I can work with.
I'd say I look at commits ~weekly, especially when trying to understand 'WTF did he do that?' - the context from tracing the commits (and ofttimes back to the individual PR when the commit is poorly formatted) is valuable.
Blame & diffs are much more useful to me in the context of GitHub. I don't care about someone's subjective ~50 char description of a change. The important part is the pull request itself and that it references whatever issue prompted the change.
Commit messages have only ever been useful in tracking my own intermediate work products. Often times, I will leave helpful bread crumbs in my commit notes if I know I won't be working on something for a few days. We strictly do squash+merge, so I don't have to worry about these things causing trouble for others. All of our commits into master have some standardized "Issue #1234" note as automatically copied from the PR title (which we do have ~standardized).
If we didn't have some git wrapper like GitHub available, then I suspect we'd be significantly more aggressive with policy around our commit messages.
I do it almost daily too. What _is_ rare, is finding a commit message that tells me anything useful that I couldn't have derived from the diff also. I don't need your life story, but sometimes some context about why something was needed is great.
Author here. Yes, the diff should tell you more, but you shouldn't need to look at the diff until you have an idea what the contents of that diff are about. It's like the links on the hacker news home page -- clicking them will always tell you more, but the index should be geared for the use case, which is scanning, typically.
I'm not sure that suits everyone's most common use case. (I'm not arguing that it may well suit yours.) I guess on paper it makes sense that commit messages are the top-level entry point to changes. But since we're packaging up a lot of our work in merge requests, the commit messages already get snowed under. Folks arrive in a merge request, and often can't care less about the commit messages that make up the merge request, they may read the message on the merge request, familiarize themselves (ideally) with the story in (ugh) Jira, and then jump to the diff for the review they came there to do. It's the later version of you that is browsing through the code base and wants to understand how a certain current piece of code came about, who then clicks on the gutter in, let's say, IntelliJ, and clicks through to the commit, where both the diff and the commit message are shown.
Indeed, and since we don’t do pull requests (we pair, mob, and/or do live code reviews), the after-the-fact git blame or the git log to see what has changed recently is what tends to matter.
I agree that when I did PRs I didn’t tend to care about the commit messages when I was reviewing them, but even when I did do PRs I cared about them later any time I was trying to understand how, why, or who forensically.
Exactly. I'm in git history all the time. But the messages themselves…eh. I'm basically always looking for a change in a specific file or set of files, and then I just pull up the diffs.
It is so odd to me as I don't remember a single instance during my 10 years of experience having done that. Only if I was asked by someone else in the org why something is the way it is have I gone to history to see what the ticket number is for the referred change to get an explanation and this is more related to the product.
I wonder what is so different about our workflows, that some people do it all the time and others almost never.
Are we working on somehow very different things?
To me I find that code is the source of truth and current state, it doesn't really matter to me as much why or how something came to be as I look more into how I can make the current thing work with what I am to do.
I do, and I also work on legacy systems. It just never occurs to me that commit history would somehow help me instead of just looking at the current state of the code.
I would say however that the commit messages mostly are just squashed pull requests with ticket names and single sentence so I'd agree that they wouldn't be very good for that purpose, but I don't see anyone at the org complaining about this.
It's mostly on the Internet I read how others use the git history, but I haven't seen anyone really doing a lot of that in real life.
> I would say however that the commit messages mostly are just squashed pull requests with ticket names and single sentence so I'd agree that they wouldn't be very good for that purpose, but I don't see anyone at the org complaining about this.
For me it's usually blame, look for a line that hasn't been touched by some triviality, then from that commit message either to a ticket reference, or, if absent or unavailable, to history shortly before and after. So yes, I do look at old history. In a codebase that has seen a lot of squash rewriting that won't be of much help of course.
I personally look more often at past commits, but I've still never really had a need to scan a list of changes like this. I really don't see the point of the advantage they're claiming, even if it's real.
What I normally do is, while investigating a tricky bug and finding some lines that don't make sense, look through history to see why those lines were introduced (to see if they fix something else), and perhaps when (to see which versions may be affected by a hard-to-reproduce bug). But that typically pinpoints a commit already, and I just need the commit message to explain why it was there.
It does sort of feel a bit cargo culty sometimes. I expect commit messages are situationally important, as a function of the size and maturity of the project, the size of the development team (esp. the amount of developer churn).
That said, there's probably some fringe benefit to describing what you do in some fashion. It's a great way of making sure you understand what's happening yourself.
How does that work? Like a cargo cult behavior is a ritual you do because you think it will bring about some good through unknown means. Disregarding commit messages is the absence of a ritual.
The difference is that if disregarding commit messages has no effect, then it's still beneficial to do since you're saving effort vs writing elaborate commit messages.
Maybe version control is a cargo cult, then? You might as well just make commits without commit messages if the revision comment is useless. Maybe just squash all commits between versions so you just have one commit per version.
This is a demonstrably faulty conclusion. Even without commit messages, version control lets you do things that you would not be able to do otherwise, such as to roll back a failed refactoring attempt, for example.
It's still a substantial improvement to have known snapshots of the state of the code. It doesn't have to be in VCS, like you can roll it old school and save full zips of the code base, but snapshots are a tangible benefit that VCS offers that does not hinge in the slightest upon well written commit messages.
<10 instances of looking at the commit history in >10 years is exceptionally little. I think you may have a different use-case if your commits are write-only, and crafting commit message texts is probably a waste of time at that rate.
Which is a very interesting take, for what it's worth. Might be worth a blog post what circumstances lead to this working well for your organization!
The subject-first style felt grating at first, but comparing the two I found myself reading more per line than I did in the verb-first approach.
Fix skip fix skip fix skip add skip
I knew nothing at the end. The second style had me grasp more, as there were no obvious hooks to skip without knowing what had changed.
But as others have said, commit messages are a cesspool of bikeshedding and ultimately useless, unless every commit is a self-contained chunk of work which it rarely is.
On the side, I enjoyed all of this author's thoughts in the repo. Good find.
Author here. Thank you! I agree that there can be plenty of bike shedding about them, which is why we have an approach that is based on human psychology and can be backed up in that way. We don't often put too much more in the content of a commit message -- the diff, plus the repo it's in (we have many repos) tell you most of what you need to know.
If you want more context, every team member has a daily work log that is available to the entire team, so you can see more about what they worked on and why in that work log.
> I agree that there can be plenty of bike shedding about them, which is why we have an approach that is based on human psychology and can be backed up in that way.
You optimized for a micro benchmark and presented no data.
It was a design decision, and like most design decisions, it was based on first principles and a-priori knowledge. If we had to base all of our design decisions on data and benchmarks, we wouldn’t be able to design anything. I can assure you, that if a better format with better properties came around, we would likely switch to it (assuming we accounted for switching cost and the variation we would be introducing).
eg:
cocoaui: fix images being incorrectly aligned on high dpi displays
We were calculating the position of the image in logical pixels, but not converting that to actual display pixels for rendering.
I think it's a really nice style that makes commit messages really easy to scan in a short log, and lets you ignore commits easily that don't touch the area you're interested in
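To make the "ignore commits that don't touch your area" part concrete, here is a rough sketch of filtering a short log by that prefix. `commits_for_component` is a made-up helper, and it assumes every subject is written as `<component>: <summary>`. In the usage note, only the `cocoaui` subject comes from the example above; the other two are invented for illustration.

```python
from typing import Iterable, List


def commits_for_component(subjects: Iterable[str], component: str) -> List[str]:
    """Keep only subjects written as '<component>: <summary>'."""
    prefix = f"{component}:"
    return [s for s in subjects if s.startswith(prefix)]
```

Given the subjects `cocoaui: fix images being incorrectly aligned on high dpi displays`, `gtkui: fix crash in preferences dialog` and `docs: update build instructions`, asking for `cocoaui` keeps only the first one.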
Spent 5 minutes trying to figure out where field_name is in the second list... I thought it was the same commits rewritten to be more scannable, but they are from a different project.
(I scanned, naturally, the article itself!)
Or more generally, I thought the point of the article would be that the function / variable names from the first list would be moved to the beginning of sentences. But in the 2nd example they are completely absent, favoring much vaguer / high level descriptions. So we are unable to see the before/after effect for the same list of commits, I think most of the potential impact of the author's intended point is lost.
(Also, the first half extols the virtues of being able to read quickly, while the second half tells you to use longer sentences for everything for apparently no reason?)
Author here. I chose two real sets of commit messages rather than to fabricate one. Fabricating one may have made the second list easier to scan -- you'd already know what you were looking for and you would have some familiarity with the content. In any case, it's certainly not a perfect experiment or demonstration.
The specific function/variable names do not often make appearances in our commit messages (though they do sometimes, as do class names). They do, however, make very clear appearances in diffs. Commit messages stand on their own for their own purpose -- it's an index, and indexes are for scanning. If you need to read further, you zoom in.
Also, I wasn't trying to say to "use longer sentences for everything," where did you get that from? I offered some specific clarifications on how we write some common commit messages, and they may be longer than some alternatives, but our reasons are for clarity and consistency. We don't mind slightly longer commit messages, in large part because the first few words tend to be the most important.
Like everything in software development, this is just personal preference without any actual evidence, a substitute for actually making software with quantitative quality metrics that users actually care about. Let's fiddle with our commit messages, PR formats, code formatting, etc. etc. etc. instead of addressing the fact that our website has 6 different fonts and makes 30 HTTP requests to show a few hundred bytes of text.
I totally agree, I have seen people get so fixated on this in addition to:
- review feedback format
- variable naming
- keeping line lengths to 80 characters
- log message formats (sweet jesus the amount of time wasted debating that alone)
Whenever I see these conventions about commit formatting they generally seem to focus on the first line (notable exceptions being Git and the Linux kernel among others I assume), one line is rarely enough to describe the change – I’m not advocating for writing an essay for a 5 line change though in some cases it might be warranted.
Professionally I rarely see useful commit messages, by useful I mean something that could be read by someone without context and get a general understanding of why a change was made. Frequently I have seen “updates” / “wip” etc. in the master branch.
More frequently I see that the quality of commit message decreases with an engineer’s seniority – though like everything there are exceptions.
Author here. You're well within your rights to believe it is personal preference, but it's somewhat hard to refute the fact that it's easier to scan lists when you put the most significant (i.e., the part that varies the most) first. Imagine if every item on Amazon's list of products was of the format: "Product for purchase, Headphone, Sennheiser Momentum 4"
Author here. The psychology portion isn't subjective -- things are easier to scan if the more important part is first. One would have a hard time refuting that. Also, there is some familiarity bias that comes into play when reading a commit message style that is foreign, so the experiment I offer may not be super compelling. We are used to what we are used to, and that takes some time to overcome. I know it took me a while to overcome it both in writing, scanning, and reading them. It was worth it for me and our team.
> things are easier to scan if the more important part is first
You skipped the step where "most important" is defined.
> "Specifier can be configured when constructing a store"
This is just about the worst type of commit message, in my opinion, only marginally better than an empty commit message.
Ok. Specifier can be configured when constructing a store. Is it supposed to? Is it the state before or after the commit? Is this just context, and maybe this commit changes documentation to reflect that reality?
You even say that reading a bunch of them is uncomfortable.
> It's not about the self. It's about the code.
Actually, it's about the change. When something broke, what changed. When a new release comes out, what changed.
Some people prefer the past form "Added feature X" instead of "Add feature X". On that I just say "be consistent".
Also note that your first batch of examples are inconsistent. "Whitespaces", "We don't use `::` to denote class methods", "Avoid __method__", past/present verb form, etc. The difference isn't just the form, but whether commits actually follow any style at all.
> I eventually recognized the benefit of scannability
Scannability is hugely important. People who disagree with your style of commit message don't disagree with you on that.
> For example, "Widget tests" is preferred over "Widget tests are added".
What about the widget tests?
> When describing a version increase, be explicit about both the old and the new version. E.g., "Package version is increased from 1.1.1 to 1.2.0"
Are version numbers ever decreased?
(1.1.2 can be released after 1.2.0, yes, but that's not a decrease even if something was backported)
We didn't need it. It was based on a-priori knowledge. For example, we don't write email subjects like: "Convey information about new release", or "Ask question about time off". We know from how the human brain processes content (eye tracking studies) that they scan in an F-shaped pattern. This puts emphasis on the content on the left. If the left side has no differentiation, this is, de facto, harder to scan.
You don't have to agree with me. As my colleague said, there are other, even more subtle things we are trying to convey and reinforce: https://news.ycombinator.com/item?id=38836919
> You skipped the step where "most important" is defined.
No, we literally do this every time we write a commit message. This is a usability issue. What is the most important thing to the reader?
> Ok. Specifier can be configured when constructing a store. Is it supposed to? Is it the state before or after the commit? Is this just context, and maybe this commit changes documentation to reflect that reality?
And yet, when all of your commit messages are in a consistent voice and mood, there is no ambiguity. Also, all of your questions are answerable from a self-evident perspective -- if you can rely on your teammates to care about your ability to understand what they write. Of course it's supposed to, otherwise why would I commit it? Of course it's the state after the commit, why would I talk about what was (How many newspaper articles about a celebrity's death have a headline saying they are sick?)? Of course it's not about documentation, it would say it was if it was. In short, you can make up as many holes as you want to poke through.
> You even say that reading a bunch of them is uncomfortable.
I said at first. See: familiarity bias
> What about the widget tests?
This is convention that one could argue requires memorization. I don't actually think it does, however, because of how we as humans interpret subjects/headlines.
> Are version numbers ever decreased?
They have been, and I hope it's self-evident how that commit message would be constructed on our project.
In any case, I'm not here to quibble about commit message formats. I documented ours and gave supporting rationale which is based on UX knowledge (and research, at the very least, see eye tracking studies). If you think that "Fix", "Fix", "Add", "Change", are the most important part of the commit message, then I'm not really going to be able to or even bother trying to convince you otherwise.
>> What's your evidence or supporting research?
> We didn't need it. It was based on a-priori knowledge.
Well, you may hold these truths to be self-evident, but that's hardly an argument. And since you say "It's likely what your team uses, as it is the commit message style that is typically recommended as a 'best practice'", you acknowledge that you hold a minority opinion.
"What I'm saying is self-evidently true" is unlikely to sway anyone.
And I find it dishonest to phrase that as "The psychology portion isn't subjective".
From your colleague:
> The reason that the "subject-first" approach is used is to get contributors to talk specifically to the impacts made to the software itself, rather than talk about themselves and their labors.
I don't recognize that description at all about the majority opinion. Yes, some people write terrible commit messages, but that's not the best practice you seek to upend.
> Of course it's the state after the commit, why would I talk about what was
You'd be surprised. It's more common in bug reports, but I've seen this in commit messages too.
>> Are version numbers ever decreased?
> They have been,
Oh, really? Do you have an example?
> In any case, I'm not here to quibble about commit message formats.
Thanks for clarifying that you base this policy on nothing but your subjective opinion, that there is no research to support it, and that everyone else is doing it wrong, because your subjective is clearly objective.
Nothing wrong with that. I do have a problem with you misrepresenting this as objectively better, with a handwavy "scan in F shape". Don't say it's based on research when it's just your "common sense". That's unprofessional.
> you acknowledge that you hold a minority opinion
An opinion being minority has no reflection on whether or not it is correct. See hand washing in the medical profession before it was understood, or batch sizes in the automobile industry before Toyota ate the west's lunch.
> And I find it dishonest to phrase that as "The psychology portion isn't subjective".
You can find it dishonest if you wish, but I'm trying to point to a body of knowledge I can't adequately convey in an article or a hackernews comment. I'm not trying to pull one over on you, I promise. As I said in another response, this is a design decision. Most design decisions are based on first principles, a-priori knowledge, expertise, a wide range of studies and observations over time, etc. If a person can't write down (convincingly) all of the tacit knowledge required to make the same decision, they are not being dishonest.
I'll say it this way: We made this decision based on objective factors. I do not expect everyone to be able to see our decision or our reasons in the same objective light and I'm not going to try to convince someone that doesn't. But telling me it's purely subjective is gaslighting, and impugning my integrity over it is just rude.
> You'd be surprised.
No, I wouldn't. I would be surprised to see it on our team, or any other well-disciplined team. This article is a tiny (and frankly pedestrian) part of our overall development process. I'm sure you'd disagree with just about everything else that we do as well, and that's OK as well.
> Oh, really? Do you have an example?
Sure:
mail package version is decreased from 2.8.0 to 2.7.1
This happened when we were forced to pin to an older version of the mail gem because of a regression in 2.8.0.
Note that the scenario you described: "(1.1.2 can be released after 1.2.0, yes, but that's not a decrease even if something was backported)" isn't relevant to our practice. We don't maintain multiple major/minor versions at one time of any of our packages. If we did, they would be on separate branches as the version number can only exist in one place (the gemspec) and the same exact rules would apply. That is, it wouldn't be a version decrease, it would be an increase from 1.1.1 to 1.1.2 on the 1.1 branch.
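As a rough illustration of the convention being argued over, a helper could derive the wording from the two versions. This is only a sketch under assumptions: `version_change_subject` is a made-up name, it is not the author's tooling, and it assumes purely numeric dotted versions.

```python
from typing import Tuple


def _parse(version: str) -> Tuple[int, ...]:
    """Turn '2.8.0' into (2, 8, 0); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in version.split("."))


def version_change_subject(old: str, new: str, package: str = "") -> str:
    """Build a subject naming both versions and the direction of the change."""
    if _parse(old) == _parse(new):
        raise ValueError("old and new versions are the same")
    verb = "increased" if _parse(new) > _parse(old) else "decreased"
    lead = f"{package} package" if package else "Package"
    return f"{lead} version is {verb} from {old} to {new}"
```

Called with `("2.8.0", "2.7.1", "mail")` it yields the pinned-dependency subject quoted above, and with `("1.1.1", "1.2.0")` it yields the increase example from the article.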
> An opinion being minority has no reflection on whether or not it is correct.
I entirely agree. But if you find yourself in the minority, then I encourage you to objectively consider what's more likely: Are you right, or are most other people right? Or maybe there is no right and wrong (aside from being consistent)?
Sometimes the minority is right. But usually that's not the case.
> See hand washing in the medical profession before it was understood.
Ooookayy. I would not compare any technical decision in this space to that. A bit grandiose. More traditionally one says "10k flies can't be wrong: poop must taste great!".
> If a person can't write down (convincingly) all of the tacit knowledge required to make the same decision, they are not being dishonest.
I agree. As long as they don't claim otherwise. I'm sure you're an expert, with lots of experience. Experience that creates a good judgement, and good intuition. But that also describes the majority of the experts, who, turns out, disagree with you.
>> You'd be surprised.
> No, I wouldn't.
Then... you disagree with the "of course", you said.
> mail package version is decreased from 2.8.0 to 2.7.1
What does that mean? A dependency version decrease? If this is a commit on the mail package itself, then what does it mean?
I'm going to assume that's a dependency version change.
"10k flies can't be wrong" is one thing, but "the majority expert opinion in my field is wrong"... is almost always wrong.
Most theories are wrong, even by experts. It's not even embarrassing to be wrong.
Commit summaries are typically viewed as lists (though not always, and in that case you often have more context, e.g., the rest of the message and the diff)
We optimize them as we would list entries in any other thing we design for human consumption.
Btw, we didn't refer to these specific articles when we made our decision, I found them just now.
I updated my article with links to these. Thanks again for the discussion.
> Ooookayy. I would not compare any technical decision in this space to that. A bit grandiose.
One has to be able to understand an analogy for the point the person is trying to make -- it's not an equivalence. Obviously most software issues don't rise to the level of life or death.
That said, let's try a few others:
- Waterfall
- DCOM
- CORBA
- Web Services
- Microservices (as implemented by web developers -- per-entity web api to web api to web api cascading failure)
- Monorepos (not globally, with sufficient support these can work)
- Devs shouldn't write tests
- Fat Model, Skinny Controller
- Write tests. Not too many. Mostly integration.
- Single Page Apps for all
- SCRUM
- Story points
Most, if not all of those were embraced by "most experts", but are now demonstrated problematic or at least are knowably wrong from fundamental design principles. Note that statistically speaking, "most experts" are, at best, early majority. There aren't very many in the Innovators and Early Adopters, and crossing that chasm is rare, and hard.
We have a history in our community of intransigence, repeating mistakes, and expert beginners being the most outspoken portion of our community. They are the ones that have the most to gain by blogging and self-promotion and there are significantly more of them than people who may have different things to say for good reason. Note that right now, I am speaking generally, not debating a commit message approach.
> Then... you disagree with the "of course", you said.
No, the "of course" was: of course no one I work with would make that mistake (more than once). It would be an extremely beginner mistake, and we would have failed as coaches on our team to not pair with a person long enough to help them understand how to write proper commit messages.
> I'm going to assume that's a dependency version change.
Yes, it's a pinned version for that dependency. You don't have any context on that particular commit, so it's not surprising that it's ambiguous.
> Most theories are wrong, even by experts. It's not even embarrassing to be wrong.
Indeed! I have been wrong many times in my career. Hell, I invented the "auto-mocking container" which, in hindsight, was so stupid (and btw, my colleague you referenced warned me about it at the time -- he's always been a few steps ahead) that I never should have popularized it. Thankfully, it fell into the past along with IoC containers (for the most part).
I have plenty of other stories about being wrong as well, and as I've said elsewhere, if we find a better way to write commit messages we will do that. For now, from our analysis, the way we do it is best for us -- observably (I'll say that, rather than objectively, though I do believe we were as objective as possible in our analysis).
"Change List", Perforce's name for the rough equivalent of a commit.
It's called a change list even though it's a single commit because Perforce tracks changes per file, so if you modify multiple files in a single commit, you are introducing a list of changes in their model.
In my experience consistency is more important than any specific style of commit messages. Similar to style guides for code; there's plenty of ways to approach it, but the real value is the consistency in how things are written across all developers.
> The second style likely feels foreign, and possibly uncomfortable. It's passive voice and present tense — all the things that we aren't supposed to make our commit messages.
Nitpick: it's passive voice and indicative mood (and sometimes subjunctive) -- there's nothing wrong with present tense. "Fix typo" is active voice, present tense, and imperative mood. (The Rails commits that dip into past tense bother me slightly, but whatever.)
Broadly, I have a few opinions on commit messages. The style doesn't really matter as much to me, although I'm a relatively strong adherent to "one-line summary, followed by paragraph(s) of additional context" (as is standard in the Linux kernel, and supposed best practice at Google even if it isn't universal by a long shot).
One is that the commit message should be useful for anchoring a search for potentially relevant changes, and for providing broader context re: why the change was made.
At the same time, I waffle between putting more description in the commit message, versus just commenting the code (or making the code clearer).
The last is more pragmatic: when I'm searching for a specific change, I'm often looking at the history of a particular file (blame or otherwise). I can quickly filter out all the "fix typo" messages or "[LSC]" (large-scale change, term used at Google for various company-wide code health refactors). Or if I'm trying to figure out which change introduced a bug, I'll probably bisect it one way or another (I often try to short-circuit that bisection toward the end to save the last few iterations). Either way, I don't actually spend that much time reading through commit messages until I've identified a potentially problematic change.
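For anyone who hasn't bisected by hand, the idea is just a binary search over history. A minimal sketch, assuming the commits are listed oldest to newest, the first is known good, the last is known bad, and the bug never disappears once introduced. `first_bad_commit` and `is_bad` are hypothetical names; in practice `is_bad` would check out the commit and run a test, and `git bisect` does the bookkeeping for you.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search an oldest-to-newest list for the first bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    once a commit is bad every later commit is bad too.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]
```

The number of test runs grows with the logarithm of the range, which is why skipping the last few iterations by reading the remaining diffs is often a fair trade.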
We don't tend to put much in the commit messages themselves. The code often speaks for itself (and if it doesn't, it isn't likely that the author would have had much more to say about it at the time, as our norm is to comment "why" when necessary). We also have work logs we can refer to that sometimes have the why, or at least some of the context for the work we are trying to better understand.
There are pros and cons to both verb-first and subject-first commit styles.
Verb-first commits can be organised in a few categories.
Since there are few commit verbs (fix, add, refactor...), always use the same verb prefix for bug and feature messages and you will have nice categories to scan/filter:
- add feature
- fix bug
- refactor logic
On the other hand, subject-first indeed puts the important part first and lets you search for a term:
- Instance configure template method called from constructor
- Store's project method is an alias for fetch
- Title is changed
Depending on the use case (are you often searching for something, or would you like to highlight the number of bugs/features in the last release?), one style is better suited than the other.
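Both use cases are easy to sketch. The helpers below are illustrative only (`count_by_leading_verb` and `search_subjects` are made-up names): the first tallies verb-first subjects by their leading word, the second does a case-insensitive term search, which works with either style.

```python
from collections import Counter
from typing import Iterable, List


def count_by_leading_verb(subjects: Iterable[str]) -> Counter:
    """Tally commit subjects by their first word, lowercased."""
    counts: Counter = Counter()
    for subject in subjects:
        words = subject.split()
        if words:
            counts[words[0].lower()] += 1
    return counts


def search_subjects(subjects: Iterable[str], term: str) -> List[str]:
    """Return the subjects that mention the term, ignoring case."""
    needle = term.lower()
    return [s for s in subjects if needle in s.lower()]
```

The tally is only as good as the team's consistency: if some commits say "fixed" or "bugfix" instead of "fix", they end up in separate buckets.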
For important changes, I like the Linux kernel style:
one-line summary, details (problem, impact, solution...)
I like to be able to read the commit line following the sentence: "This commit shall ...".
For example from TFA:
"Fix code example in the field_name method"
gives:
"This commit shall fix code example in the field_name method".
OK, cool, I may or may not merge it, pull it, whatever-it. But I know what it'll do should I use it.
Now from what TFA recommends:
"Default reader batch size is 1000"
Means nothing. Tells nothing. Is 1000 good or bad? No clue. Is it causing a bug? A performance issue? Was it 1000 before the commit or after applying it? Zero information.
I'll pass and keep using the "best practice".
Using the best practices the commit would read:
"Change default reader batch size to 500" or "Change default reader batch size to 1000" or maybe "Add default reader batch size" (btw the commit line in TFA is so bad that I'm not sure at all it's 500 or 1000 or something else I should put here to make my point).
The "best practices" aren't a bias. They're the result of people thinking long and hard as to how to make commit lines as clear as possible.
I look at how multi-million lines codebase like Linux or Emacs are doing it and use that as the authority. If it works well enough for these projects, it works well enough for smaller projects.
> I like to be able to read the commit line following the sentence: "This commit shall ...".
I'd guess there's a near zero chance you do this when looking at (scanning) a list of commits. This is a mnemonic for writing commit messages in the commonly accepted style. If you do that in order to help you understand a commit message every time you read one, that may be a personal thing.
> Means nothing. Tells nothing. Is 1000 good or bad? No clue. Is it causing a bug? A performance issue? Was it 1000 before the commit or after applying it? Zero information.
> Using the best practices the commit would read:
> "Change default reader batch size to 500" or "Change default reader batch size to 1000"
You realize you could ask the same questions about the versions you put forward, yes? Aside from the "Was it 1000 before the commit..." which, on our team, would be such a ridiculous question it wouldn't even enter the mind. As I said somewhere else, the headline for a celebrity dying does not ever read: "Some Celebrity Is Sick"
> I'll pass and keep using the "best practice".
Works for me :)
> I look at how multi-million lines codebase like Linux or Emacs are doing it and use that as the authority. If it works well enough for these projects, it works well enough for smaller projects.
Linux and Emacs have completely different circumstances than most software projects. The idea that cargo culting their practices is good for "smaller projects" is exactly why I wrote my article on best practices. By the way, I contribute to Emacs and I don't find their particular commit message format good or useful. It's particularly onerous, redundant, and does not tend to provide useful additional information (in particular the mandated change log portions -- people do often supply additional information, which can be useful). I tend to find the devel list and bug list more useful for the context of a change, which fits with our experience as well with work logs and project logs.
I prefer the first style. It reads way less weirdly. I spend way more time reading commit messages than "manually searching" by scanning them. I either arrive at the commit from git blame, or I use this really cool feature of computers called "ctrl-F".
- Blame view, where I usually want to know the ticket number so I can track down a story which has the business rules the dev was implementing.
- Bisect, where I want to know if this works or not without having to do a test run every time.
This format is alright I guess, but the best thing you can do honestly is just
TICKET: [done/works but wip/does not work] [describe what you’re thinking]
AB-1234: does not work, trying to rewrite this controller to remove all the duplication, /api removed temporarily
To me, that doesn’t have to be set in stone. No one likes working with a commit message stickler when standardized formats are only useful ~20 times a year to save 10 minutes each. (In my experience)
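For what it's worth, a subject line in that shape is also easy for a script to read. A minimal sketch in Python, assuming the three statuses are spelled exactly as above and that ticket IDs look like `AB-1234`; `parse_subject` and `is_known_broken` are made-up names for illustration, not part of any tool:

```python
import re
from typing import NamedTuple, Optional

# Assumption: statuses are written exactly as in the format above.
STATUSES = ("done", "works but wip", "does not work")

# Assumption: ticket IDs look like "AB-1234" (uppercase project key, dash, number).
SUBJECT = re.compile(
    r"^(?P<ticket>[A-Z]+-\d+): "
    r"(?P<status>" + "|".join(re.escape(s) for s in STATUSES) + r")"
    r"[,:]?\s*(?P<note>.*)$"
)


class Subject(NamedTuple):
    ticket: str
    status: str
    note: str


def parse_subject(subject: str) -> Optional[Subject]:
    """Split a subject line into ticket, status and free-form note, or return None."""
    match = SUBJECT.match(subject)
    if match is None:
        return None
    return Subject(match["ticket"], match["status"], match["note"])


def is_known_broken(subject: str) -> bool:
    """True when the author flagged the commit as not working."""
    parsed = parse_subject(subject)
    return parsed is not None and parsed.status == "does not work"
```

During a bisect, commits where `is_known_broken` returns true could be handed to `git bisect skip` instead of being test-run one by one.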
Error messages. Get to the meat of it in the first 3 words, there's a high chance that's the only part that will get read, or even displayed. It must be the most meaningful part of the error message.
I find commit messages are more useful when we consider machines to be the primary consumers of them.
Using conventional commit style messages allows us to generate changelogs and modify our semver versions automatically. Generating changelogs by hand is extremely tedious. Modifying semver by hand leads to caring too much about the number.
Like others have pointed out, the context and "why" can easily be tracked by linking out to the ticket/task that the work is associated with.
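To make the "machines as consumers" point concrete, here is a rough sketch of how a version bump could be derived from Conventional Commits style messages. It is not how any particular release tool does it. It assumes the usual mapping (`fix` is a patch, `feat` is a minor, a `!` or a `BREAKING CHANGE:` footer is a major), and `bump_level` and `next_version` are hypothetical helper names:

```python
import re

# Header shape: type, optional (scope), optional "!", then ": description".
HEADER = re.compile(
    r"^(?P<type>\w+)(?:\((?P<scope>[^)]+)\))?(?P<breaking>!)?: (?P<desc>.+)$"
)

# Rank the bump levels so the largest one seen wins.
RANK = {None: 0, "patch": 1, "minor": 2, "major": 3}


def bump_level(messages):
    """Return "major", "minor", "patch" or None for a list of commit messages."""
    level = None
    for message in messages:
        header = message.splitlines()[0] if message else ""
        match = HEADER.match(header)
        if match is None:
            continue  # not a conventional commit, ignore it
        if match["breaking"] or "BREAKING CHANGE:" in message:
            candidate = "major"
        elif match["type"] == "feat":
            candidate = "minor"
        elif match["type"] == "fix":
            candidate = "patch"
        else:
            candidate = None
        if RANK[candidate] > RANK[level]:
            level = candidate
    return level


def next_version(current, level):
    """Apply a bump level to a plain "MAJOR.MINOR.PATCH" version string."""
    major, minor, patch = (int(part) for part in current.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return current
```

The number falls out of the commit types, which is the "not caring too much about the number" part.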
I love that (when) commit messages are for people first. I don’t want to read stilted and robotic commit messages just because someone wanted to shave off five minutes when creating whatever software artifact like a changelog.
ADDED for new features.
CHANGED for changes in existing functionality.
DEPRECATED for soon-to-be removed features.
REMOVED for now removed features.
FIXED for any bug fixes.
SECURITY in case of vulnerabilities.
I’m against any style or enforcement of style of commit messages that breaks the changelog.
I use git log history for changelogs with a little dressing for markdown. I could pull this from Jira but it’s worse. Stories aren’t changes.
So while I understand that some folks use git log for history and care about their code style so much as to enforce a commit message stylecop, it’s overkill in my world and would only cause you to have to amend your commit or write a new message on squash.
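The "little dressing for markdown" can be very little indeed. A sketch, assuming the log was printed with something like `git log --pretty=format:'%h%x09%s'` (abbreviated hash, a tab, then the subject); `to_markdown` and the heading are illustrative names, not anyone's actual script:

```python
def to_markdown(log_output, heading="Unreleased"):
    """Turn "<hash><TAB><subject>" lines into a markdown changelog section."""
    lines = [f"## {heading}", ""]
    for row in log_output.splitlines():
        if not row.strip():
            continue  # skip blank lines in the log output
        sha, _, subject = row.partition("\t")
        lines.append(f"- {subject} (`{sha}`)")
    return "\n".join(lines)
```

The subjects go into the changelog verbatim, which is why a commit style that makes them unreadable to outsiders breaks this approach.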
The problem with the experiment is that when I’m scanning, typically I’m never scanning and then getting tested on the contents after the fact. It actually took me longer to spot any mention of store in the second example after being told what I was looking for than it did for me to find field_name after being told to look for it. YMMV.
The rationale of subject-first is clear. But what about the other rules: correct instead of fix, rather than instead of rename... They seem pretty arbitrary to me.
For example, correct reinforces the notion that things can be wrong, mistakes can be made, so we correct them. This is essential to our team culture.
"Rather than" came from Eventide's norms and I wasn't there when it was discussed initially. It works for concepts other than renames, so it is more flexible, but I can't say that was the reason for the initial decision.
Author here. Commits aren't actions -- they are a log of an action being taken. You, the committer, are committing. What you commit, goes into a commit log. We choose to make our commit messages describe the change, rather than the person that made the change. It's about the work, not the worker.
I'm trying to imagine my commits coming alive and doing things to my code. I think you may be taking the "When this commit is merged..." thing a bit far, as commits don't actually do anything. They just are. We don't tend to do ourselves any services by anthropomorphizing aspects of our work. I'm happy to concede that we see things very differently.
I was responding to the author of the comment I was responding to, who was anthropomorphizing the commit itself claiming that it did something.
I agree that the person who writes the code is irrelevant, though I realize that in a different way. I also recognize that it varies from Git/Linux/etc. We do it for usability and other reasons and one only needs to look at git itself to recognize that the original authors did not have usability in mind when designing it. What they built is incredibly powerful, but it's rare for usability and technical prowess to overlap.
> I was responding to the author of the comment I was responding to, who was anthropomorphizing the commit itself claiming that it did something.
Uh huh. The comment that you replied to is pretty much in line with the wider “patch does X to the code” which isn’t about anthropomorphizing. That’s the spin that you put on it.
It’s about as much anthrop. as saying that a falling piano crushed someone’s leg.
> We do it for usability and other reasons and one only needs to look at git itself to recognize that the original authors did not have usability in mind when designing it. What they built is incredibly powerful, but it's rare for usability and technical prowess to overlap.
It’s also rare to find commit message quality that is on the level of the regular contributors to that project. Which has nothing to do with how unfortunate the UX of the tool is.
Again, I was responding to this particular text: "And the work does an action on the code." You can interpret that the way that you interpret it. I saw it as a part of a wider tendency in the development community and responded to it accordingly. Perhaps it wasn't what was intended. At this point, it's rather a distraction so I'm happy to concede that it is ambiguous and the author was likely attempting to describe the common "When applied, this commit will..." inferred prefix.
> It’s also rare to find commit message quality that is on the level of the regular contributors to that project
Ehhh... maybe. One implicit trade-off here is how low detail the second set is, and that'll happen any time you force a structure - mandated structure takes more space on average, sometimes significantly more. Passive voice wastes a lot of space too (in English anyway).
Personally I think this is better solved by a changelog doc (which you can wordsmith for outsiders, which is an entirely different target than internal developers) and `git blame` (I never browse all of history to find changes to a method, just blame it for the latest or `git log -p -L` to check a series of changes to a file/func/etc).
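For anyone who hasn't used the `-L` forms, here is a small sketch of the commands meant above, written as Python helpers that only build the argument lists. The helper names, `field_name` and `lib/reader.rb` are placeholders for illustration:

```python
from typing import List


def log_function_history(func: str, path: str) -> List[str]:
    """argv for the patch history of one function: git log -p -L:<func>:<file>."""
    return ["git", "log", "-p", f"-L:{func}:{path}"]


def log_line_history(start: int, end: int, path: str) -> List[str]:
    """argv for the patch history of a line range: git log -p -L<start>,<end>:<file>."""
    return ["git", "log", "-p", f"-L{start},{end}:{path}"]


def blame_lines(start: int, end: int, path: str) -> List[str]:
    """argv for blaming only a line range of a file."""
    return ["git", "blame", f"-L{start},{end}", "--", path]


# Inside a repository, any of these can be run with, for example:
#   subprocess.run(log_function_history("field_name", "lib/reader.rb"), check=True)
```

None of these depend on how the subject lines are worded, since the lookup is by location in the code.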
- There are additional organizational and process mechanics in-play that are out-of-scope for just a write-up on commit messages
- The team is made up of around 15 people supporting around 400 repos and about 8 live products
- The team DOES NOT use pull requests. Contributors are expected to be socially capable and fully engaged in the inter-personal, realtime communication that supports high-performance, low-handoff work.
- The ticketing system is NOT the primary means of communication between team members. It's merely a catalog of work items with only high-level supporting notes. A work item, as the old saying goes, is a "placeholder for a conversation", rather than a work order that can be executed upon by a contributor operating in isolation.
- The original author is describing an exceptional software product organization. He's not debating the merits of various subjective perspectives on what kind of personal flair might drive the content of commit messages. Instead, he's describing one small piece of a methodology that explicitly discourages personal flair and explicitly encourages (and provides supports for) direct communication, and rigorous attention to the team's well-considered, proven, and communicated norms.
- Contributors on the team are expected to maintain their own work logs in the spirit of engineer's logs from engineering industries. If the "why" of a change is important, it will be recorded in the work log. Senior technical leaders (at a minimum) keep up with the individual work logs looking for hazards and maintaining an understanding of the learning that's taking place in the team.
- Contributors on the team are expected to maintain equipment logs for the components that they're making changes to. This isn't the same thing as a SCM log, as an SCM log is for the SCM, not the product under development.
- The reason that the "subject-first" approach is used is to get contributors to talk specifically to the impacts made to the software itself, rather than talk about themselves and their labors. It's about training developers (and designers, and operators, etc) to be more objective about the work they do, and to communicate it in an objective way. Mentions of "changed", "fixed", etc, are records from the perspective of the worker, not the product. When scanning the list of changes made to a work product, the team members are primarily interested in the impacts to that piece of equipment first and foremost. The work product change logs are voiced in terms of the work product, and not the worker. Again, the reflections, learning, questions, observations, etc, of the humans involved in the work go into the human's own logs, and are supported by the person-to-person processes that are beyond the scope of the article on commit messages.
- These 15 people with their 400 repos and 8 live products were winning industry awards within a couple months of go-live within the first year of the team's charter. There's nothing average about the team, the product, and the methods in-play. The commit messages are just one detail of the finely-honed process and culture that were intentionally designed, socialized, and painstakingly-supported to achieve its objectives.
- The author of the article is describing one facet of an overall product development system for a high-performance software organization. He's not trying to spark a dialog about the myriad opinions of how commit messages are written across the breadth of software development within processes and cultures that may not go much beyond the cross-referencing of items between ticketing systems with an SCM system, which is arguably a higher-ceremony approach from an artifacts perspective, but also a "bare minimum" from a process and culture perspective.
It's not entirely pertinent, but just in case there's some question as to whether this work is being done in a business of some scale or other at some particular time or other in a business's lifecycle that accounts for how the development system works, and whether it's a special case: the business context of the team's work is a $1B multi-national with about 2,000 employees working in the regulatory side of corporate law, finance, and venture capital.
The business context isn't the enabling factor or the deciding factor. The product development system in-effect is the key factor, and it was shaped explicitly for its outcomes irrespective of the business context. The development system has been used at various companies up-and-down the axis of business scale and scope.
The single greatest contributing factor of the team and its work is that they carry very little technical debt, and the methodology required to work this way touches every molecule of process, organization, and culture. It would not be recognizable from the perspective of typical mid-curve software development, and it would not be understood by examining any single process element in isolation.
> The team DOES NOT use pull requests. Contributors are expected to be socially capable and fully engaged in the inter-personal, realtime communication that supports high-performance, low-handoff work.
I am interested to know more about your workflow given that you do not use pull requests, could you please elaborate on that?
I cannot imagine that you’re all just pushing to master.
sbellware's reply says just about everything of import, but from a tactical perspective, yes, we do push to master. Sometimes we push regular commits and sometimes (if we branched) we push merge commits. We have hundreds of repos and it's very rare for there to be conflicts of note. We do reviewing as we work via pairing (and on demand, live, when necessary).
I've always found it odd (at least since 2001-ish) that developers see the 1-work-for-1-person as some kind of desirable norm, rather than some arbitrary fixation that has been burned into them since the start of their careers.
It's wishful thinking at best to expect effectiveness and productivity to come from the isolation of developers to their own isolated work items. It's hopes and dreams of promoted-beyond-their-competency managers to parallelize the workforce to maximum capacity, rather than to focus on the ability to make progress without slowing down, and then optimize from that baseline.
It's odd to me when someone has to call out "pairing", rather than just assume that work items are taken on by work cells and work groups, and that the more important objective is continuity of knowledge and understanding amongst the workers so that momentum can always be transferred immediately without having to reset and relearn.
The 1-work-for-1-person ethos is so obviously counterproductive that I struggle to understand how it's still so prevalent. Except that human history is rife with popular mythology, and humanity's efforts have been undermined by it for as long as there have been humans.
It speaks to the dark ages period of software that we live in that even the production systems in software work are so antiquated that they still try to grasp at the most implausible straws, not to mention ones that have already been debunked and disproven by industry at large.
It speaks to the reality that software development largely still operates as a craft rather than an industry. Software developers barely pay attention to industry as a body of knowledge, and the sorry state of both software products and software jobs reflects it.
When the day comes when "pairing" never even has to be stated, software work will have entered its modern age of industry. It should be presumed that the process of momentum transfer between knowledge workers is the singular element that dictates whether a team can continue to build momentum in perpetuity, or sputters out prematurely early in a project's lifetime.
We always have to mention "pairing" when talking to people from those dark ages, and those people will never have a good frame of reference for it due to seeing it through the lens of a more primitive grasp of industry and the body of knowledge that surrounds it. It's almost impossible to explain a high-functioning production system to anyone who has made no time in their professional lives to become as qualified in industrial theory, organizational theory, and production systems as they have to learn the latest convoluted front-end toolkit.
We're like cavemen working with tools gifted to us by some advanced civilization. We've been trained to use high-tech, but we barely understand the systems of work that produce it, and we don't allow ourselves to benefit from the things we can learn from more advanced and more mature fields.
But then, as long as the software tooling available to us is as bad as the things that we make with it, we'll never have the buffer of spare time needed to make the study. And down and down we go.
Pull requests are a relatively recent addition to software development. What did you do before pull requests? If you haven't been around for that long, and have had a career that always had pull requests, then pull requests seem like an inevitable and irreducible inherent feature of software work. But they're not inherent to software work.
I'm going to generalize "pull request" into other terminology so that I can describe the deeper issue:
A pull request is an asynchronous handoff, and asynchronous handoffs are a known sub-optimizer of production processes of all kinds, not just software. It causes batching and queueing, and the harmful effects of those on processes are also well-understood.
Why would I want to introduce asynchronous handoffs into a process that we've worked so hard to optimize at every level and in every turn?
Our process is what it was before GitHub popularized pull requests, and popularized workflows that increased the engagement with their digital product.
The problem is that developers aren't asking whether pull requests are even a good thing to build a development process around. Given the detrimental effects of instituting asynchrony as a desirable target, I find it surprising that software workers have gone all-in on something that is an obvious production system anti-pattern.
But to a degree, I'm also not surprised. There's a prominence of a certain kind of psychology in software development that wants to work in isolation without having to be bothered by things like realtime communication and coordination. It's seen as a kind of privilege in that cohort of software developers, but it's one of the most harmful things that can be done to software work (or any work where more than one person is involved).
Software developers tend to be antisocial, and there are enough antisocial developers gathered in this industry that they can enable and reinforce the idea that being antisocial is desirable.
We have a zero-tolerance posture when it comes to antisocial tendencies amongst developers. Our processes are all realtime and synchronous. The level of productivity that we're after requires a zero-tolerance posture with regards to batching and queueing and handoffs. We collapse the whole rotten mess of software development down to the fundamentals of production systems and reap many multiples of productivity relative to industry norms specifically by harnessing the momentum that comes from always moving forward, rather than contending with the pervasive start-stop interrupted processes that are common in software development due to failure to recognize the effects of batch handoffs.
The specific details of the work flows aren't as important to disregarding pull requests as the character and behavior of the people in the process. We're not antisocial, and we don't accept antisocial behavior in the team. We work very closely with each other in tightly coordinated and highly-communicative ways in everything we do. At any given time, a very small number of active work items are being worked by a single developer.
In the end, we know more about production systems as a field of study and body of knowledge than the average team of devs with 5-to-10-years of experience. And we know a lot about structural design. We marry these two concerns into a process that would be quite recognizable to someone who has studied Lean and the work methods that Edwards Deming championed that changed the face of human productivity decades ago - in every field of human endeavor EXCEPT software development.
So, for a team that is irretrievably mired in asynchrony, and operating on a belief system that has never questioned the workflow fundamentals and the detrimental effect that asynchrony has on the upper potential of productivity, I can see the utility of pull requests. But it's not something that I'd willfully volunteer for. I won't trade the vastly-improved lifestyle of rejecting the myth that technical debt is inevitable for the luxuries of working in asynchronous isolation. This is a lesson that American production industries learned when Japanese producers started beating western producers at their own game by exploiting the virtuous relationship of quality and productivity.
The lesson is that productivity and quality are indistinguishable at the upper limits of high-performance work and work teams. The level of quality that can generate multiples of end-to-end productivity of an entire process and production system doesn't survive in the presence of a tolerance of asynchronous handoffs.
We don't use pull requests because we want to have a higher quality of software development life, and we can't have that if we indulge our endemic sense of entitlement to working in the dark all by ourselves.
There's an old saying that was oft-repeated in the days before Scrum executed its ethnic cleansing campaign on XP: If code reviews are good, do them constantly and continuously.
We don't do pull requests because every moment of our work is a code review simultaneously with code production. When you learn how to put together an entire production system with these fundamentals as the non-negotiable parts of the work, the productivity that will emerge will be revolutionary.
Most software teams can't imagine how good and how easy software work can be when the target is the upper echelons of quality.
That said, software work has been removed from this body of knowledge for so long (largely due to the rapid increase in the software development labor pool that started around 2010) that software development culture at large no longer possesses the knowledge of what mistakes are in software work.
Until software developers gain a fluency in software structural design, no amount of fiddling with team workflows will result in anything but negligible, middling improvements.
Erl and Yourdon's books are definitely still relevant, but they have to be put into historical context and historical perspective at this point in order to extract the enduring value that they have.
I wouldn't literally build a SOA system the way that it was done back in the mid-2000s, so I know what to disregard in a legacy SOA book, and what to give attention to. Same for an OO book from the 90s. But without having that sense of history, and a fluency in fundamentals and principles, the necessary background that puts these books into context and makes them useful today would simply be missing.
It's not uncommon for developers to pick up a "classic" software book and come away with all of the wrong ideas.
> The words "equipment log" and "work product" sound pretty unique.
We call them "material logs", as equipment logs are more or less a subset of those. Just think about the sheet in every bathroom at a store that says when it was last cleaned, or the clipboard attached to the factory machine that lists its maintenance record and problems. Or the patient record outside of the patient's door.
"work product" just means, the product of our work. Nothing special about that one, I don't think.
> Are Erl's SOA books and Yourdon's on OOP still relevant?
I haven't read either, personally, though I have Yourdon's book en route. We use the term "structural design" quite a bit, so I'm curious to see Yourdon's take on what they call "structured design". From everything I've read about it, it sounds very similar to a lot of how we think about things as it seems to be the basis for much of the coupling and cohesion thought in our industry.
I'd also recommend studying Lean (not Lean Startup, which has as much to do with Lean as non-fat yogurt does).
Right. We don't name the methodology. If we did, people would claim to be "doing it" just by mentioning the name of the thing. It's the individual practices that are talked about, and that's on purpose. We can't afford to have the system of practices be obscured by the kind of grasping at status that can come from being able to claim something by name. It's a means of protecting the methodology itself from decay, and of protecting the people on the team from the shallowness that inevitably comes from making something "a thing".
Meh, as in this doesn’t feel like it really matters. Like other commenters, there are exceptionally few situations when I find myself just reading the commit log, so the claimed psychological advantage appears moot in the context of the common/standard software development process (which is very much not what the author’s team does, so I acknowledge this approach brings specific measurable advantages for them).
When I do need to read the commit log, it’s typically for a third party component and I look for broad categories I need to pay attention to (security fix? Behaviour change?) — this aligns well with the conventional commits standard and well managed release notes. But, to not have to read commit logs, we also do the rest of the conventional software development workflow (PRs, team meetings, etc) that serves to build the awareness of what’s being done.
I prefer the first version, I didn't find the second version to be more "scannable" and it seems to me to be more esoteric requiring study to realize what was recorded. The first version is clear, each commit makes sense and is what I'd be grepping for when trying to find a commit.