The most productive engineers I work with just stack PRs. No diff is ever more than 200 lines changed. Sure, sometimes it's annoying to fix things in the stack, but I've found that keeping PRs small is an overall velocity (and safety) win in the long run.
Diffs that are > 500 lines either tend to sit in code review for several days or get the lazy accept and break something. If I absolutely cannot stack for some reason, I'll put up the PR, then set up time with some teammates to review it together. That way it gets merged that day and I avoid massive rebase conflicts (or I take their feedback and go back to the drawing board).
More info about PR stacking — I wish there was an easier git/GitHub interface for doing this:
I often do this because I move fast when I get momentum. But to be honest, even the best reviewers I’ve worked with find it a pain on their end. It doesn’t necessarily lead to better change sets or better reviews. And if there’s significant rework in a prerequisite, it’s significant cognitive overhead to keep the stack tidy. In a legacy codebase sometimes significant rework isn’t a few easily rebased/edited changes, sometimes it’s an entirely different effort, even if the downstream work still depends on it.
It’s good to do small iterative changes, but it’s not a silver bullet for review challenges.
Re an easier tool: I'm one of the co-founders of Graphite (https://graphite.dev); we built it literally to help you stack PRs on GitHub (former FB'ers who missed a stack-based workflow). It doesn't require your team to change their workflow. If you're interested in giving it a spin, I'm happy to let anyone here through the waitlist; feel free to just DM me on Twitter (@tomasreimers) or email me at tomas@graphite.dev
Crap! I just switched to a workplace that doesn't use GitHub. This looks amazing. Certainly better than the quirky collection of shell scripts I used before...
I've heard good things from our customers about spr for GitHub (https://github.com/ejoffe/spr). It extends git with useful commands and hooks directly into GitHub PRs. We'll be adding some direct support for the tool in Reviewable shortly.
> The most productive engineers I work with just stack PRs. No diff is ever more than 200 lines changed. Sure, sometimes it's annoying to fix things in the stack, but I've found that keeping PRs small is an overall velocity (and safety) win in the long run.
Sadly, the culture in some places mandates one pull/merge request per Jira issue/feature and therefore this isn't viable.
And even when it's not a top-down decision, your colleagues might just say they actually want to see all of the changes together once they're done, because that's what they're used to. That means you get feedback pretty late in development, and it will probably force you to refactor large parts of your code.
On GitHub, you can change the PR target branch to the previous PR's branch in the stack to get an incremental diff. If you merge a PR lower in the stack, a PR based on it will automatically be retargeted.
With that in mind, you can change the top of the stack PR to the default branch to get a complete diff.
I know this, and I suspect the person you replied to also knows this, but I have never in my career as a programmer found anyone whose review I needed, who would have been comfortable doing this without my help.
I haven't always worked with the "best and brightest" (and that's not a dig; the people I'm working with now are probably in the top 0.001% of Git users), but nobody who doesn't work on Git as a discipline in and of itself can live up to this standard.
> I know this, and I suspect the person you replied to also knows this, but I have never in my career as a programmer found anyone whose review I needed, who would have been comfortable doing this without my help.
Well, I wouldn't say never, but you are spot on about the rest!
Many don't want to deal with the additional cognitive load. Of course, not everyone uses GitHub either: as far as I know, GitLab doesn't really have a good UI for stacking merge requests (their terminology for the equivalent of GitHub's pull requests), and as a consequence you end up having to write comments about the order in which things should be reviewed.
Now, admittedly, I've personally done it in cases where it makes sense (e.g. separating bunches of refactoring or preparatory work from the introduction of new features), but it was definitely the exception, not the normal way of doing things.
I feel like if I started stacking my changes, some of the more established developers would furrow their brows and complain in the stand-up about needing to spend extra time on review, as they have in the past. Essentially, it's not enough for an idea to be good (or at least a good fit, given its drawbacks); you also need to read the room!
I feel like if you `git rebase --interactive` before you open a PR, and squash together the commits that form each logical unit (writing a good commit message), you get the same thing. Commits are the atomic unit of change and “stack” implicitly.
I think you're talking about the opposite of what the comment you're responding to is.
I think the comment you responded to is talking about deconstructing a single chunk of work into multiple commits and opening a PR for each one, rather than combining several units of work into one PR.
It's both the opposite and the same, depending on the reviewer. I have two colleagues: one prefers reviewing the full changeset, and one prefers reviewing commit by commit.
The first might be pretty happy to see a bundle of smaller patches, and the second would be very upset, because their workflow is now X times larger if they want to request changes.
I sometimes found myself going, "I could split this PR up into smaller ones but I'll skip it just this once" and it's been a mistake every single time.
Few things are a mistake so reliably as that.
These days, I always split it up into multiple PRs.
I try to manage most large change sets in the same way, never knew it had a name. For me, the main pain point that keeps me from using it all the time is working across forks, but that depends on how your organization handles code and might not be a problem for you. The same approach is also useful when you want to chunk off a part of a feature for review, but keep working on further functionality while the first part is in review.
I really like the idea, but I find that there is a big gap in the tooling to do this (and to make it easy to review).
It would be nice to open just one PR with several sets of changes where you could provide context for each set of changes.
With GitHub you could do this by having a commit for each set of changes, but you can't add a description for each commit.
You end up with a PR that points to the previous PR, then you have to decide whether to run CI on every PR in the stack or just the last one, and then you hit merge issues. With all this hassle, people usually just open one big PR with all the changes, or decide against doing that small refactor (because it would have to go in yet another PR and the hassle isn't worth it). Overall this ends up decreasing code quality in the codebase.
We once upon a time had a thing where we would have feature branches, and off of those we'd make task branches that would be merged (or a PR) back into the feature branch, then a final review and validation before the feature branch goes back into main. It kinda worked for us. This was nearly ten years ago.
- 200 lines of statically typed, pure functional code with high test coverage
- 20 lines of code with I/O, parallelism, mutable state and/or interacting with external APIs
I find it way less cognitively draining to review a big chunk of blue ocean code with good patterns, than even small amounts of code which requires me to maintain any kind of state in my head, or try to think about the consequences of how this change interacts with the broader system. I'd say that some kinds of code are easily 10x more cognitively draining than others.
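A toy illustration of that difference, with functions made up for the example: the pure function can be checked line by line in isolation, while the cached variant drags shared mutable state into the review, so its correctness depends on every other caller.

```python
def normalize(scores):
    # Pure: output depends only on the input; a reviewer can verify
    # this locally without thinking about the rest of the system.
    total = sum(scores)
    return [s / total for s in scores]

_cache = {}  # shared mutable state

def normalize_cached(key, scores):
    # Stateful: correctness now depends on every other caller that
    # touches _cache, and on when they touch it.
    if key not in _cache:
        _cache[key] = normalize(scores)
    return _cache[key]

print(normalize([1, 1, 2]))  # [0.25, 0.25, 0.5]
```

Note the hidden hazard the reviewer must now spot: calling `normalize_cached` twice with the same key but different scores silently returns the stale first result.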
I'd like to see someone try to review a 200 line regexp. They might take about the same amount of time to write per line, but they definitely don't take the same amount of time to read.
It sounds like you have an illegible, compact regex in mind. If you write regexes like that for others to read, I feel sorry for the other people.
If you write the regex like you would production code, i.e. with whitespace, newlines, and comments[1], I imagine 200 lines of regex would be about the same as 200 lines of any other type of production code, be it APL, Haskell, or conjunctive normal form.
I realise this is a bit of a circular argument: if you write the code such that 200 lines is about n units of review in a specific language, then 200 lines will be about n units of review in any language. But that's also kind of the point. There's a very specific logic density that you can accomplish for production code to be readable, and that results in about that density being reviewed.
[1]: In Python you can do this with the re.VERBOSE flag, in other languages you can do it by breaking the regex apart into separate variables that are then combined together.
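As a sketch of what footnote [1] describes, here is a hypothetical version-string pattern written with `re.VERBOSE`; the pattern itself is invented for illustration, with whitespace and comments making each piece reviewable like ordinary code.

```python
import re

# Parse a semantic-version-style string like "1.2.3-beta.1".
VERSION = re.compile(
    r"""
    ^
    (?P<major>\d+) \.        # major version
    (?P<minor>\d+) \.        # minor version
    (?P<patch>\d+)           # patch version
    (?:
        - (?P<prerelease>[0-9A-Za-z.-]+)   # optional pre-release tag
    )?
    $
    """,
    re.VERBOSE,
)

m = VERSION.match("1.2.3-beta.1")
print(m.group("major"), m.group("prerelease"))  # 1 beta.1
```

In `re.VERBOSE` mode, unescaped whitespace is ignored and `#` starts a comment, which is what makes the line-by-line layout possible.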
Here’s another strategy: submit with your PR a sufficient argument as to its correctness. This could be unit tests at a bare minimum, constructive types, property tests, documentation, formal proofs, etc. Whatever “sufficient” means for what you’re doing.
Reviewers: don’t read every line of code like you’re the lone avenger standing against all the “bad code” in the world. Stop being a gatekeeper. Check that the author did their homework: is the argument sufficient and convincing? Review their tests, documentation, types, etc. It’s a lot easier to debug their proof than it is to piece together every line of code and verify it by hand.
Handle formatting and linting with automation. Write style guides so that style is consistent across the code base. Build empathy for your operators, packagers, and maintainers by ensuring the release process has affordances to test code in a production environment before going live to all users. Write specs and designs beforehand, as you’re building the software, and in retrospect.
The other kind of fatigue is author fatigue. Here’s 4 hours of work, plus another hour pre-reviewing everything and wrapping it all up. Fingers crossed you don’t find any nitpicks that will keep my code from merging for another day because you couldn’t be bothered to review anything until the last hour before EOD.
It turns out that the few empirical studies that exist on code review effectiveness seem to agree that reviewers are noticeably ineffective at catching errors if they read more than 200-ish LOC per hour, including in their own code. So do consider making smaller changes. A 1200-line PR is difficult to review unless you’ve provided an iron-clad proof.
Exactly. IMO it is the task of the developer to provide some means of showing the correctness of their code changes. Most of the time that would be unit or integration tests.
When I receive a PR where tests could have been added but they are absent and the code is not trivial, I will not approve it, simple as that.
> Reviewers: don’t read every line of code like you’re the lone avenger standing against all the “bad code” in the world.
I think what degree of review is relevant depends heavily on the area of code. In a core piece of something complicated it's imo necessary to look a lot more closely than what you describe.
Am I the only one that thinks this is acceptable? Sometimes you have a large change, and you can’t reasonably split that up into something smaller. We have tests to give us confidence that whatever the change did, it didn’t irrevocably damage the system.
To be fair, after things get into the master branch there’s still about 2 months of manual QA review for all releases, but if this weren’t the case we’d simply have to increase test coverage.
Am I the only one that thinks this is acceptable? Sometimes you have a large change, and you can’t reasonably split that up into something smaller.
No, you’re not, and yes, sometimes you have and you can’t. (Assuming you meant reviewing 500 lines at once, not summarily passing a huge change without really reviewing it at all.)
If someone is implementing something that is inherently complicated, I much prefer to see the complete picture and do one thorough review than be drip-fed little pieces for a series of smaller reviews. That way, I can take the time to understand what is really happening, and look for potential dangers like:
• misunderstandings about requirements
• missing cases
• inconsistencies between related cases
• deviation from expected protocols or data formats
• deviation from established patterns or architecture
• unnecessary duplication
• useful abstractions that have been overlooked
• sequences or combinations that can lead to an invalid state…
Code reviews of tiny partial changes tend to offer little benefit in my experience. Sure, maybe I’d spot a typo or something, but you don’t need me for that, it’s what your linter is for. The real value of a skilled human reviewer is recognising more fundamental dangers like the ones above, and it’s really hard to spot those when you can’t see the forest for the trees.
It doesn’t help that many diff/review tools emphasize code that has changed so much that almost everything else is often hidden by default. It’s not unusual for the majority of the feedback I give in a code review to depend on related code that wasn’t changed, because reading that is often when you realise something is inconsistent or has been overlooked.
This is also why I don’t mind reviewing 50 or even 500 lines of code, as long as those lines make sense together. It’s not as if I’d ever only look at 5 lines for most 5 line changes anyway. Yes, it can take time, but the whole point of code review is to be another person who has taken enough time to think about the code and maybe come to some different conclusions, which might lead to useful discussions and maybe making some changes.
> If someone is implementing something that is inherently complicated, I much prefer to see the complete picture and do one thorough review than be drip-fed little pieces for a series of smaller reviews. That way, I can take the time to understand what is really happening, and look for potential dangers
But isn't this what design review is for? So you don't end up in a position where you suddenly have a 500 line change set based on misunderstood requirements, and you have to review the code to find out?
Many of the things you mention should be caught in design review, and then even more of them would be caught in an early draft review (say 50–150 lines of code) with most of the actual implementation stubbed out.
That way, when you finally get to the big 500 line full implementation, almost all of the high-level concerns have been worked out at an earlier (cheaper!) stage so only minor comments remain, and those it is effective to be drip-fed.
Ideally, yes, those kinds of things would probably happen and they’d reduce the load on any final code review. In practice, I find a lot of development groups do code reviews prior to accepting changes into whatever deployment pipeline they use but aren’t so comprehensive or consistent with reviews at earlier stages.
Even with useful reviews at the design stage, I would still always look out for those kinds of issues as part of a code review, though. The approach might have evolved during implementation, or that implementation might have unintentionally deviated from the intended design. Also, sometimes the code is doing something inherently complicated and there just isn’t much to see until a whole algorithm is substantially implemented.
For a PR that is mostly new code, I'd much prefer to review a single 1000-line PR than five 200-line PRs. It's so much easier to understand how all the pieces fit together than the myopic view you get with tiny stacked changes.
I think whether it's acceptable or not is heavily dependent on the team and the work. In my personal experience (mostly Java) when people say they can't reasonably split something up it's more often a "not willing to" than "actually can't". Generally speaking the strategies that make smaller code reviews possible also have a whole bunch of positive side benefits, a la "make the change easy, then make the easy change".
I personally save many-hundred-line reviews for 100% automated refactorings or string find/replace or similar, though I acknowledge I'm often doing extra work just to make the PR process smoother (even if it's a throwaway intermediate step).
* Code reviewers share more credit and blame for the code they approve
* Change the word "review" -> "collaborate"
* Involve the "collaborator" earlier in the process, before it's supposedly finished. If you helped write the code then the approval process will be much smoother.
* Code reviews help towards bonuses or promotions
Regardless of the word, this should be the approach. That’s what code review is: two or more people collaborating to make a change successful.
The reasons I’m iffy on the others:
> Code reviewers share more credit and blame for the code they approve
I’m on board in spirit, but…
> Involve the "collaborator" earlier in the process, before it's supposedly finished. If you helped write the code then the approval process will be much smoother.
I’m also on board with the first part of this, but have a strong negative reaction to the second part. Helping write the code might seem like it’s more collaborative (because it is), but it weakens review because both collaborators develop shared biases/narrowed vision of the impact.
Review works because people collaborate from different perspectives. Pairing (as another comment suggests) sounds more in line with this, but I’d still want review to remain a separate process so people not involved in implementation can catch oversights by those who are involved. And I still want reviewers in a position to apply that incentive to succeed in a critical way so it increases success rates.
> Code reviews help towards bonuses or promotions
Maybe for junior devs who don’t already have the instinct to review/be reviewed? If you’re senior, part of your responsibility is to ensure that your team’s baseline expectations are being met, and to proactively try to improve the effectiveness of those expectations (whether by clarification, process improvement, coaching/mentoring, or just anticipating the need and improving it directly).
If I’m in a position to evaluate comp for a senior engineer and they’re dragging their feet on reviews, I’m more inclined to have a discussion with them about the expectations of the role than to offer monetary incentives. (And I’ll reward the positive outcomes that come with good review practices as well as all the other ways people in the role excel.)
The only exception I can really think of here is helping a team/org improve its review process overall. But that’s rewarding leadership, not personal development.
> Regardless of the word, this should be the approach. That’s what code review is: two or more people collaborating to make a change successful.
Nah, it is review. It is not collaboration; that is just newspeak for review. Collaboration works differently. It does not expect one person to do the work and the other to comment on it when it is expected to be almost done.
And involving that person sooner is just buying compliance here, so that review is smoother.
> Nah, it is review. It is not collaboration, that is just newspeak for review.
Not everything is an Orwellian conspiracy. Some people actually just think about things differently and value different things about them/in their usage. I have experienced very collaborative review processes, never more so than on my present team. This…
> expect one person to do the work and the other to comment on it when it is expected to be almost done.
… isn’t at all how our review process works! For even small changes, we’ve typically had a discussion about approach/design before any work begins; for larger changes a more thorough discussion about potential pitfalls and implications. We tend to have chats and review comments intermittently throughout the dev process. Nothing is “almost done” by the time it reaches review.
The “review” part is that typically one person leads a given effort, and one or more people validate it throughout the process. That is collaboration, and it’s grounded in a combined effort to ensure the work is both productive and successful.
> For even small changes, we’ve typically had a discussion about approach/design before any work begins; for larger changes a more thorough discussion about potential pitfalls and implications. We tend to have chats and review comments intermittently throughout the dev process.
None of that is code review. Like, literally none of that communication and activity counts as code review. Code review just does not mean "discussing design before any work begins".
> Involve the "collaborator" earlier in the process, before it's supposedly finished. If you helped write the code then the approval process will be much smoother.
This forces collaboration on everything. Not everything needs collaboration. Some PRs just need a sanity check from a second pair of eyes. That's it.
> Code reviews help towards bonuses or promotions
Classic example of perverse incentives. This will only encourage people to comment unnecessarily on PRs, wasting the author's and other reviewers' time. I know this from first-hand experience.
We include code reviews in promo/annual review materials, and actually, reviews with unnecessary/nitpicky comments on them will count AGAINST you. Submitting shitty time-wasting reviews in the packet is worse than having no reviews at all. People will actually go read all the comments you left, and judge whether they were necessary/important/improved the code.
At my employer, we do "Code reviews help towards bonuses or promotions" -- we're required to include at least a few examples of good thoughtful code reviews into promotion and annual review materials.
I think it's a good policy and it helps ensure some minimum bar of code review from everyone, but it's not sufficient by itself. A person who is only motivated by the promo/review process is going to give "just doing this to check the box"-quality reviews.
I've found that people are much more inspired by all of the intrinsic motivations for doing code review, like learning from others' code and helping your teammates improve. But putting it into promo packets is a good first step to start building that culture.
One thing I have found beneficial is for the developer making a big change like this to actually write a "human readable" explanation of the changes. Something the reviewer can read and use to understand the context for and purpose of the changes.
These are the most important cases to commit incrementally, even if they’re automated or tooling-assisted in some way. That way you can review the substantive changes separately from the chores.
I hold the (perhaps inflammatory) opinion that most shops are using code review as a crutch to paper over all the other gaps in the process.
Are your code reviews producing a bunch of design feedback? You probably should have design reviews long before a feature gets to the PR stage. Code reviews producing formatting feedback? That's the job of an opinionated auto-formatter (a la Go or Rust). Code reviews finding a lot of logic bugs? That's what automated tests and fuzzers are for...
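For that last point, a minimal property-test sketch using only the standard library; `clamp` and its invariant are invented for the example. Instead of a reviewer eyeballing the logic, random inputs check that the stated property always holds.

```python
import random

def clamp(x, lo, hi):
    # Function under test: constrain x to the closed range [lo, hi].
    return max(lo, min(x, hi))

# Property: for any valid range, the output never leaves [lo, hi].
for _ in range(1000):
    lo, hi = sorted(random.uniform(-1e6, 1e6) for _ in range(2))
    x = random.uniform(-2e6, 2e6)
    y = clamp(x, lo, hi)
    assert lo <= y <= hi

print("property held for 1000 random inputs")
```

Dedicated tools (Hypothesis, fuzzers) add shrinking and coverage guidance, but even this loop catches the kind of boundary-logic bug a tired reviewer skims past.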
How do you deal with people who don't adhere to the process? It's all fine and good to require people to write tests, but often I catch bugs in their test code too.
I implemented a policy of "a Loom video for every PR" at my last company.
Every video should include an explanation of the problem, a walkthrough of the patch, and a recording of the result.
PRs never sat long, because the willpower it took to understand the problem, the solution, and how to replicate/test manually (if you feel that's necessary) was drastically reduced.
It shifts the burden of extra work to the requestor, who has motivation to get things merged.
Additionally, git blame became infinitely more useful.
We do this as well for anything that touches the UI, and oh my god has it made it easier for me to approve things (or deny them) when it's a bunch of UI stuff
I like this. We also strongly suggest that engineers do PR reviews first thing in the day. This has the benefit of minimizing interruptions, and a daily cadence for reviews is typically pretty good. Basically, interrupting your work to do reviews is generally bad, so that leaves the beginning of the day, right after a big interruption. (We try to not be religious about this, though.)
I've started pushing back on changes that don't start with a narrated video discussing the changes against current production. I wanted them to supply the page, element, and change description, but video just works better for them.
Everything in the article rings true. We as authors/reviewers can do a lot more to prove that the code works and much of that doesn't actually involve reading diffs.
However, I think people often underrate how bad their basic code review tools are and how much that might be holding them back! GitHub's default tooling is just not that great, and you can easily miss a lot of the most important code or discussions.
So ... shameless plug. That's why I built CodeApprove (https://codeapprove.com). If you're reviewing code on GitHub, CodeApprove will make your job easier. Your reviews will be faster and more direct. If you layer the tips in this blog post on top of a good tool like CodeApprove you'll definitely notice a difference.
> Ask them for a narrower review — e.g., a design review or an architecture review
I like this technique for PRs of any size (except very trivial changes). Code reviews can be a great learning opportunity, made even better if the author has specific aspects of the code they'd like feedback on.
For example, I opened a PR the other week with a bunch of refactoring work. One chunk of the refactor changed how a complex data transformation was being done by replacing some unnecessary local variables with a chain of map, filter, and forEach functions. I highlighted this as a particular point of interest, which spawned a good discussion on testability, how small our tests should be, and if there's any value testing specific bits of the transform vs. the chain as a whole.
I really have never gotten this attitude. Like, at all. As a leader on my team, code review is simply part of my job. I don't get to put off writing code, talking to the designers, or any of the other things. Why should PRs be a singled-out point of such revulsion? It's not about catching bugs so much as collaboration (with the more senior people) and mentorship (with the more junior people). Helping to share information about the codebase.
I love getting suggestions on my PRs -- I'm not perfect, there's almost always something that I could improve or align better with the existing code. And I take pride in giving good suggestions, because it makes my team better in the long run.
Okay, it's really new to me that code reviewers are not always running the code. I think it depends on the kind of project and the ease of checking out the branch and just running it, but I'd argue it always makes sense to check it out.
For me, the main indicator of how easy a review is is simply the number of changed files and commits. We try to enforce smaller diffs, even when it isn't possible to make the features themselves that small: you can always merge incomplete work into production behind feature flags.
Also, urging developers to self-review their code makes a HUGE difference as well.
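The feature-flag technique mentioned above can be as simple as this sketch; the flag store and checkout functions here are hypothetical, invented for illustration. The new code path ships to production but stays dormant until the flag is flipped.

```python
# Made-up example of merging an unfinished feature behind a flag.
FLAGS = {"new_checkout_flow": False}  # a real system would read config/env

def checkout_total(cart):
    if FLAGS.get("new_checkout_flow"):
        return _checkout_v2(cart)  # the new, still-in-review path
    return _checkout_v1(cart)      # the existing, known-good path

def _checkout_v1(cart):
    return sum(cart)

def _checkout_v2(cart):
    # Hypothetical new behavior: a bulk discount for 3+ items.
    total = sum(cart)
    return total * 0.9 if len(cart) >= 3 else total

print(checkout_total([10, 20, 30]))  # 60 while the flag is off
```

Each small PR can extend `_checkout_v2` safely, because nothing executes it in production until the flag is turned on.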
In order to get around some of the issues here we try to have lots of micro-PRs rather than large ones. Often this means doing small refactors in their own PRs, letting them be reviewed separately, and using them as the basis for the actual change (which will, again, sit in its own PR).
My favorite is the recursive PR.
A big PR gets split into 10 small parts.
Then in PR X, someone notices that all of them should have some change, because of some eventuality or commonality that only becomes obvious in part X.
The best rule people should remember when it comes to code review is that “people want to help you deliver something better if they took the time to give you constructive feedback”.
But it’s of course up to YOU how you react to that.
More info about PR stacking:
- https://0xc0d1.com/blog/git-stack/
- https://kurtisnusbaum.medium.com/stacked-diffs-keeping-phabr...