This is one of those interesting bits where things feel more like personality types as opposed to ways to create maintainable software. I have yet to see a project fail because commit messages were or were not in imperative form or if the first letter was capitalized. I have no problem doing it if someone in the team finds it important, and I have no issue if someone on the team does not do it.
The same is generally true about rebasing etc.
Due to the high visibility of commit messages, a lot of details certainly look pretty bike shed-ish.
What can be important is simply a general concept of meaningful messages when they make sense. For example, bug fixes that may not be obvious by looking at the code, should probably have a reasonably detailed description of the bug. Commits implementing a brand new feature and involving a ton of changes, probably don't need a whole lot more than what the feature is and what is the expected behavior. Possibly some info about potential future bugs that could happen. When updating the readme, "Updating documentation" is just fine plenty of times.
Otherwise, probably just make sure you include some important keywords to make it easier for people to git-grep and I think most projects will be just fine.
There are certain personality types that really enjoy organization and rules, and the feeling of security that comes from having a well-defined set of procedures everyone must follow. It really isn't about making software more maintainable based on any existing research we have about how maintainable software is. But if following those rules benefits some of my coworkers' mental health, it really doesn't take me any additional effort, so I have no problem doing it.
I don’t completely agree. By taking the time to write commit messages you are documenting your code at a fine-grained level. Think of it like a short email that explains to your (future) colleagues and future self what you changed and especially why you did it.
A good git log prevents unnecessary headaches and saves a lot of time. Typical situation: you just found a couple of lines whose purpose is not completely obvious. If the author is still there, you both waste time discussing. If she/he left the company, you have to figure it out by yourself.
Now imagine using git log to identify when these lines were introduced/changed and getting an explanation as to why.
So yeah, writing good commit messages can feel like a waste of time. Like writing good code can feel like a waste of time. You should not be doing it to please your colleague, but rather to contribute to a maintainable codebase. It’s worth the extra minutes here and there.
> Think of it like a short email that explains to your (future) colleagues and future self
Note that commit messages don’t live with the code. If you find yourself writing why something should be in the new state, consider putting that “explain why” as a comment next to the code, so anyone’s text editor can read it.
Hmm, I think it depends on the case. Sometimes it's appropriate to leave comments in the file, while sometimes trivial changes don't need a comment and can be explained only in the commit message.
Oh and while it doesn't "live" with the code, I can at least look up a line in question and see which commits have changed it, along with their commit messages.
>Oh and while it doesn't "live" with the code, I can at least look up a line in question and see which commits have changed it, along with their commit messages.
The first job I had, I was working on software that had originally been developed by a software engineer but had then been passed to the stewardship of non-programmers.
Originally, it had been checked into CVS (yeah, it was that old), but the non-programmers didn't know what that was or how to use it, so when they started making experimental edits, they just copied the source folder to a new location and labelled it "NEW <project>". By the time I got involved there were over 200 of these folders on the network drive, it was unclear which one if any represented the head revision, and the machine with CVS on it had long ago been lost and existed only as a legend.
This wasn't insignificant software by the way. I don't feel like telling you exactly what it was, but it was an intense physics simulation and it has quietly impacted probably around a million lives over the last few decades.
The point is, don't assume your commit messages are as permanent as your code, they're not. Non-developers just barely grasp source code being important but they don't understand VCS history /at all/. Put important comments in your code.
> By the time I got involved there were over 200 of these folders on the network drive, it was unclear which one if any represented the head revision, and the machine with CVS on it had long ago been lost and existed only as a legend.
Fortunately, DVCSs don't have that problem because the meta-information is stored in a folder within the source. So unless the non-developer deletes that folder in every single project, then the information won't be lost (but it wouldn't get updated either).
That actually depends on how it gets copied and pasted.
I can think of at least two ways off the top of my head to move the source from folder A to folder B while leaving hidden folders behind. They both assume you're using a graphical file explorer, like non-developers are likely to do.
I think a good way to look at it is to think of "explain why" in commit messages as explaining why the code was changed from the previous state to the new state. This may need explanations on how some of the code works, but it doesn't have to.
The "explain why" that explains how the current code state works however should sit next to the current code.
They're associated with a line within a file. git and other SCMs provide tools that let you see which commit is associated with a given line, what the state of the file was at the time the commit was made, and when lines in that file were removed subsequent to that commit.
> If you find yourself writing why something should be in the new state, consider putting that “explain why” as a comment next to the code, so anyone’s text editor can read it.
The problem here is that there's no guarantee that the comment gets updated when the code does. Whereas, a commit is always associated with the code at the time it was written. If the code is changed, then that old commit is no longer associated with it.
The guarantee would be the same guarantee you get when you update a codebase: that you (and code reviewers) look at the nearby code and see if it still makes sense in the context of your change.
> if the code is changed, then that old commit is no longer associated with it
Right, this is my problem with putting “why” always only in the commit message: Sometimes the “why” is both durable and shouldn’t be buried by later commits. Suppose you refactor the code: you’ve put it in a new place or you add trailing commas or you replace positional args with keyword args. Now the most recent commit is the one that explains that transition. Code is a UI that a future engineer will use to understand the system. An important prose explanation is part of that UI and you might decide it should present itself as an affordance.
This is an exercise of technical-communication judgement: I’m proposing that you should ask where the explanation should go. Should it be hidden for those looking through the history? Probably. Should it be clearly visible alongside the code? If it is important and durable.
Nobody questions it is important to write clear commit messages. The issue is being nitpicky about what exact form is being used like whether to capitalize first letter or not.
When I look at the repository I am happy if I can understand the messages at all. It's challenging to get people to group changes in a logical way so that the commit makes sense. I am not going to pick on somebody because he used the wrong tense or did not capitalize the first word of the message if the changes are grouped logically and the commit mentions the correct ticket and purpose of the commit.
I used to be in the camp that cared a lot about formatting, but over time realized that it's more my OCD than any practical reason.
At this point, the only reason why I like consistent code formatting rules is because they prevent bickering between people who have strong opinions on how things should be. When there's a rule, even if people don't like it, they abide by it. When there are no rules, everybody who thinks that there should be rules starts bikeshedding what they ought to be.
But as far as code quality goes, I don't think it matters much.
I'm in that camp that doesn't enforce consistent styling in code or even particularly believe in the value of linting at all! ^_^
I'm pretty much just echoing what others have said, but when I look back at the things that have caused problems / mattered in a codebase, things like formatting are generally ranked an order of magnitude below everything else.
If people care, I'm happy to oblige, but it always seems like a missing the forest for the trees kind of thing.
Because commit messages, just like email messages, aren't part of the product. They are part of the communication between developers. You could just as well require people on a mailing list to never top-post and always spell correctly, to reduce cognitive load. Some people actually do that, but they are pretty annoying. :)
People absolutely do question if it is important to write clear commit messages. I've had quite a few conversations where people have said things along the lines of "why do you even care? no one ever looks at the commit messages".
> The issue is being nitpicky about what exact form is being used like whether to capitalize first letter or not.
How many code bases have you worked with that don't bother enforcing some type of standard with spacing/usage of tabs, indentation, variable names, method names, import order etc? People don't typically view those rules/conventions as nitpicky. Why should that change with commit messages?
If that's the case, why have standards when it comes to writing code, e.g. when to use camel case? Why have styles at all for writing in a language like English that tell you how titles should be formatted? Consistent format and structure reduces cognitive load IMO, because it breeds familiarity. Also, Git itself uses the imperative mood whenever it creates a commit on your behalf. Consistency is good.
Also it's a good litmus test for whether you really know what you changed. If you can't describe what you did in a couple of lines, you either did too much or lacked clarity.
Commit messages are a poor place to document code because they don't stay with the code. If the code isn't obvious, comments work wonders and often make the commit message more focused on documenting commit history not actual code.
I agree that when writing confusing code, it is best to add a comment to explain it. However, in cases where the line isn't commented, being able to `git blame` a section/line and read the commit message it was part of to understand its context can be invaluable at times.
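As a rough illustration of that blame workflow: `git blame --line-porcelain <file>` prints, for every line, a header that starts with the commit hash and carries fields such as `author` and `summary`, followed by the source line itself prefixed with a tab. The Python sketch below turns that text into one record per line. The function name `parse_line_porcelain` is made up for this example, and it assumes well-formed output with `\n` line endings.

```python
def parse_line_porcelain(output: str) -> list[dict]:
    """Parse `git blame --line-porcelain` text into one dict per source line."""
    records = []
    current = None
    for raw in output.split("\n"):
        if current is None:
            if not raw.strip():
                continue  # skip empty padding, e.g. after the final newline
            # Entry header: <sha> <original line> <final line> [<group size>]
            sha, _orig_line, final_line, *_ = raw.split()
            current = {"sha": sha, "line": int(final_line)}
        elif raw.startswith("\t"):
            # A tab-prefixed line is the source line and closes the entry.
            current["code"] = raw[1:]
            records.append(current)
            current = None
        else:
            # Remaining header lines are "<key> <value>", e.g. "summary Fix typo".
            key, _, value = raw.partition(" ")
            current[key] = value
    return records
```

Note that the `summary` field is only the subject line; `git show <sha>` prints the full message for a hash found this way.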
A comment in the code can do both, and it should. You should explain why your code is the way it is, not just what it does. Anybody can see what code does by just reading the code.
There's still a distinction between describing why the code is the way it is, and why it evolved the way it is over time. The latter is what commits capture.
Commit messages are great for explaining the order in which things happened and talking about changes that apply to disparate pieces of code.
I would still argue that comments should explain why code has changed over time -- comments should provide context for someone editing the code, which can include descriptions of what the code used to do, why it has changed, why it hasn't changed (what was already tried and why it wasn't good enough), what else would be impacted by a change, and information about the quality of the code (is it a quick hack, or has it been optimized extensively?)
Naturally you don't need to supply all of this, and if the code is changing very frequently it might be easier to put that stuff in a commit message. (It also helps when you use small, pure functions, because they tend to not need much context to understand.)
> Commit messages are a poor place to document code because they don't stay with the code.
They actually do stay with the code, but only as long as the code is present in the code base. Code can be updated independently of its associated comment, whereas the commit message associated with a line changes whenever the line does.
They stay with the code as long as the code is absolutely untouched, which is often a lot shorter than the time a comment about the code remains relevant. Code gets moved, reformatted, ... without details about it becoming stale.
Unless you have large chunks of code moved around wholesale, you can usually track all commits that touched a particular line of code. I've done that while debugging codebases that were over a decade old. And seeing those commits was incredibly helpful, despite the code itself being heavily commented.
> I have yet to see a project fail because commit messages were or were not in imperative form or if the first letter was capitalized.
This is probably unintentional, but I'm pretty sure this is a straw man. No one was implying that not having commit message rules means project failure, but it might waste more time if the messages ever need to be reviewed by others or revisited in a git blame etc. Just like other project-wide rules, they help improve legibility and communication. Besides, I find these rules to be few, general, and well reasoned enough to apply to any project.
I glanced at these years ago and just remembered them (50/72 + imperative mood; the remaining ones are just basic language stuff). I've never imposed them on anyone else but have recommended them; I mainly do it for myself (and I have benefited). The three I mention are the most important, the rest emerge from them (i.e. imperative mood encourages you to talk about the effect, not just describe what the code already describes).
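For anyone who wants to try those three rules on their own messages without enforcing them on a team, here is a minimal Python sketch of a checker. It assumes the usual reading of 50/72: a subject line of at most 50 characters, then a blank line, then body lines of at most 72 characters. The name `check_commit_message` and the suffix-based imperative test are hypothetical and deliberately crude; a first word ending in "ed" or "ing" gets flagged, which will misfire on words like "Bring".

```python
SUBJECT_MAX = 50  # the "50" in 50/72
BODY_MAX = 72     # the "72" in 50/72

# Assumption for illustration: these endings usually signal past or
# progressive tense ("Fixed", "Fixing") rather than imperative ("Fix").
NON_IMPERATIVE_SUFFIXES = ("ed", "ing")


def check_commit_message(message: str) -> list[str]:
    """Return human-readable problems; an empty list means the message passes."""
    problems = []
    lines = message.rstrip("\n").split("\n")
    subject = lines[0]

    if len(subject) > SUBJECT_MAX:
        problems.append(f"subject is {len(subject)} chars (max {SUBJECT_MAX})")

    if len(lines) > 1 and lines[1] != "":
        problems.append("second line should be blank")

    # Body starts on line 3 (1-based) when the message has one.
    for number, line in enumerate(lines[2:], start=3):
        if len(line) > BODY_MAX:
            problems.append(f"line {number} is {len(line)} chars (max {BODY_MAX})")

    words = subject.split()
    first_word = words[0] if words else ""
    if first_word.lower().endswith(NON_IMPERATIVE_SUFFIXES):
        problems.append(f"'{first_word}' does not look imperative")

    return problems


print(check_commit_message("Fix null check in parser"))    # []
print(check_commit_message("Fixed null check in parser"))  # ["'Fixed' does not look imperative"]
```

Something like this could run from a personal commit hook, which keeps it a habit for yourself rather than a rule for others.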
> There are certain personality types that really enjoy organization and rules
I have a similar disdain for excessive rule enforcement, but don't draw conclusions from that about how to use these. "Rules" here are just a way of formalising some principles that may help you write more useful and consumable commit messages. They do not have to be integrated into your org or project or whatever and enforced in order to be beneficial.
> I have yet to see a project fail because commit messages were or were not in imperative form or if the first letter was capitalized
Well, I have seen failed projects where the complete lack of commit sanity wasn't the cause, but a very clear sign, of what was wrong. Perfect examples are commits labelled 'no changes' (the author's way of saying 'whitespace/formatting only changes', I think) but where addition of braces in languages sensitive to that actually caused behaviour changes. And more bugs on top of the existing trainwreck.
> What can be important is simply a general concept of meaningful messages
This, basically. All these commit message rules, when applied, are just another way to force the committer to reflect on what he/she is writing (both the message and the code), whether it makes sense, and so on. I've seen so many cases where someone had problems describing in simple terms what a commit was doing and why, just because the code was, well, a mess. Forcing the author to dissect both the commit message and the code then usually makes it very clear how to improve both.
> I have seen failed projects where the complete lack of commit sanity wasn't the cause, but a very clear sign, of what was wrong.
Clean commit messages fall into the same area as “good git hygiene” for me.
When I was consulting (mostly in the areas of DevOps and CI/CD), git hygiene was the very first thing of any new assignment I would take a close look at.
I joined a project where one of the main contributing factors in its failure was poor git hygiene. The real killer was a mess of branches that everyone had direct push access to. One of their repositories, which contained cfengine infrastructure code, was branched per environment. The configuration drift between the branches was so great that the dev team decided it was easier to throw away the entire codebase and rebuild production from scratch.
This is how I fell into training developers how to use git, because clean history is a very good indicator of the overall health of the project.
One of the things I taught myself and then started teaching others is “always keep the head of your branch in a clean working state”. I recognized the common development commit pattern “do a thing” and then immediately afterwards “fix the thing I just did”.
Another, very popular "personality trait" I see is people that have to constantly upgrade and replace tools and frameworks for minute or unknown benefits. This instead of asking themselves, "What is absolutely the best use of my time that will bring most benefits to the project?"
I recently joined a failing internal project to help out. It had a constant stream of production failures, ranging from not being able to deploy a new version to production for 6 consecutive deployment weekends to 10% of user interactions going wrong (like a user submitting something and it happening twice, or success being reported when nothing happened, or it just blowing up in his face and leaving him hanging).
It's been interesting to observe seemingly intelligent people avoiding really necessary tasks (setup oversight and monitoring, understand underlying problems and their impact, prioritize, fix). Instead, everybody spent their time attempting another Java upgrade (from 6 to 8), a Spring upgrade, a switch from Subversion to Git or from Maven to Gradle, setting up automated code formatting rules, or picking on others for trying to write simple code ("no, you can't check null with ==, that's what Optional is for...").
I understand some of that was trying to shift the blame, but these guys didn't really have to shift blame for the application being in a sorry state. Most of them joined recently and haven't been able to contribute anything substantial because of various rules that were designed to prevent anything substantial from getting done.
I really liked the literal 30 seconds of silence after I asked, in a meeting with management and the team, whether it was their conclusion that back in the days when Java 6 and Spring 2 were contemporary it was not possible to build reliable applications with them.
Sounds to me like someone looked at the production failures and declared "These failures are due to technical debt and poor code quality. What can we do to fix that?"
And "we have dependencies that have been EOLed and need to be upgraded" is easy to quantify and fix - whereas "The codebase is too large, complex and unfamiliar for us to recognise bugs in one another's contributions" is difficult to quantify and fix.
> necessary tasks (setup oversight and monitoring, understand underlying problems and their impact, prioritize, fix)
Necessary tasks are boring and it takes persistence to continuously do what needs to be done. On the other hand, checking out various frameworks and endless rewriting may be considered fun by some. They're learning new things.
For the record I know how it feels as I've got a similar experience at work, people flock to new and shiny where old and tested works well and gets the job done.
Political players will use these things to gain power and benefits. They will get someone fired because the bike shed had a minor issue, and meantime will get everyone to ignore the power backups to the power plant.
1000 times this. There is also a significant drawback to having too many rules about commit messages, which is that it can lead to horrible passive-aggressive behavior: "Dear XYZ, I have undone your commit abf9de because the commit description was improper. In future, please follow the guidelines found at http://ABC. It saves work for all of us. Thanks in advance."
Keep in mind that there are different degrees here. Commit messages should be informative, yes, but worrying about whether they start with a capital letter or are written in past or present tense is a waste of energy.
I wouldn't be surprised if having to read several pieces of any kind of text, be it a commit message or code, which are not written in a consistent manner is more of a cognitive load than reading pieces which are. (And higher cognitive load probably leads to literally more energy being used.) So if those assumptions hold, and assuming you want to waste the least amount of energy possible, the question becomes: what process costs more energy? Making sure the text is consistent and adheres to certain standards, or not doing that and wasting more energy interpreting it and/or having to worry about that?
Generally speaking though, I rarely care about reading multiple commit messages at a time. Outside of a PR (where to be honest, I'm far more interested in the PR description than the individual commit messages provided they're not something horrible like "changes"), I generally only look at commits in the context of a blame, which means I really need to understand one commit in isolation.
Good commit messages are super important in this case, of course, but the capitalization, verb tenses, and where line breaks occur have little to no real benefit, particularly if they're focused on over actually having a good commit message.
Well, then you can go and prove that. :) Because the lack of evidence in either direction (wrt cognitive load and commit messages) points us towards the null hypothesis. In my own experience, fixing bugs in C code long abandoned by the original authors, it doesn't matter one bit if the commit message reads like a novel or literally just "fsjdlkfdjlsfl fskjl". All that matters is the change is self-contained and references bug tracker ids and mailing list discussions.
Also in my experience, passive-aggressive power trippers love anal retentive rules. Hence adding more than necessary carries a significant risk.
A simple rule of thumb to prevent this: if it made past code review, it stays, because, by definition, nobody cared enough to notice it and/or call it out.
(This goes only for style guide violations, of course, not behavioral changes.)
>I have yet to see a project fail because commit messages were or were not in imperative form or if the first letter was capitalized
Project failure is a pretty useless metric to go by; very few non-trivial/obvious reasons exist that would bring down an entire project on their own. And ofc, small things add up, larger than the sum of their parts and all that.
But anyways, I agree this is a bit of a personality thing, but the most important aspect is that these kinds of issues fall into the bucket of “all options are fine, as long as only one is used”. That is, consistency is key; the flavor of it is minor. And that's trivially obvious in most situations; it's much easier to read git logs if they all look/feel the same. It’s certainly more difficult otherwise. Not project-ending, but not nothing, and I’d argue it's probably enough more than nothing that it's worth establishing a baseline.
> I have yet to see a project fail because commit messages
That could be said about a lot of things. I have not seen a project fail because they didn't use tests, so you shouldn't bother to write any?
To use a close analogy, surprisingly many professional developers need to be reminded about the proper use of indentation and comments. Not because the project doesn't build otherwise but because code generally needs to be read much more than it is written. The only code that is never read again is failed code.
The same is true of version control. A couple of years ago it was very popular to think in terms of storytelling. Your version control system is the story about why your software ended up the way it did. It's a story that every new developer will need to read, partially. Even if you are the masterful engineer that instinctively knows why "made bugfix" was written that way and not another, at least put in the effort for those of us who aren't.
> That could be said about a lot of things. I have not seen a project fail because they didn't use tests, so you shouldn't bother to write any?
This is a false equivalency. Commit messages do not validate that the program functions. I can just throw random garbage into the commit messages and the program with tests will still function.
I know what you are trying to convey, I just think it's a bad comparison.
This is the start of a Strunk&White-level style guide. There are reasons for things beyond what the OP goes into; for example, the imperative form in English is also the infinitive, which is the most basic form of the word and the one that someone ESL would know or could easily recognize and look up ("caught exception" vs "catch exception"). This is consistent with wording your bug report titles the same way.
I played around with several different tenses and styles for a while and eventually came to nearly the same conclusions as the OP. If another style has genuine merit, then by all means, do what feels right. But if you've never given it a thought, just do these basic things. Use `git add -p` (to make atomic commits) and spend a few seconds crafting each of your commit messages to conform to this standard style. In the long run, having a standard style guide actually makes things easier for you as a committer.
Agreed, I think too many people assume their code will live in their small team forever and ever. When you have a few thousand people committing daily it's a different story: you need to keep consistency high, and the jumping between past and present tense is gnarly.
Also, capitalising your sentence isn't a stupid nitpick. Broken window theory is real, and if you're too lazy to capitalise your sentences in published work I'd hate to read your emails.
> This is one of those interesting bits where things feel more like personality types as opposed to ways to create maintainable software. I have yet to see a project fail because commit messages were or were not in imperative form or if the first letter was capitalized.
I think this is the same thing as using a linter and code formatter to enforce a certain coding convention for a given code base. I personally find projects that are written in inconsistent ways harder to reason about and harder to maintain because you have to keep track of multiple ways of doing various things rather than using a standard way of doing those same things.
With commit messages, having a consistent style makes them faster to read and easier to parse (whether manually or through tooling).
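To make the tooling point a bit more concrete, here is a small Python sketch of what a consistent layout buys you: if every message is "subject, blank line, body" and tickets are always written the same way, a few lines are enough to turn a message into structured data. The `PROJ-123` ticket format, the `ParsedCommit` type and the function name are assumptions for illustration, not something any particular tracker or git itself prescribes.

```python
import re
from dataclasses import dataclass, field

# Assumption: ticket references look like "PROJ-123". Adjust to your tracker.
# Note that this pattern would also match strings such as "UTF-8".
TICKET_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


@dataclass
class ParsedCommit:
    subject: str
    body: str
    tickets: list[str] = field(default_factory=list)


def parse_commit_message(message: str) -> ParsedCommit:
    """Split a message into subject and body, and collect ticket references."""
    subject, _, rest = message.strip().partition("\n")
    # dict.fromkeys drops duplicates while keeping first-seen order.
    tickets = list(dict.fromkeys(TICKET_PATTERN.findall(message)))
    return ParsedCommit(subject=subject.strip(), body=rest.strip(), tickets=tickets)


parsed = parse_commit_message(
    "Fix double submit on retry (SHOP-42)\n"
    "\n"
    "The retry path re-sent the form.\n"
    "See also SHOP-42 and OPS-7.\n"
)
print(parsed.subject)  # Fix double submit on retry (SHOP-42)
print(parsed.tickets)  # ['SHOP-42', 'OPS-7']
```

With inconsistent messages the same script needs special cases for every habit in the team, which is roughly the cost being argued about here.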
> Commits implementing a brand new feature and involving a ton of changes, probably don't need a whole lot more than what the feature is and what is the expected behavior
Describing the motivation behind adding a particular feature and its possible advantages and disadvantages could be useful. It takes a lot less time to type that up than to figure it out by asking a bunch of other people or reading through disparate sources of incomplete documentation.