GitHub Stacked PRs
github.github.com | 516 points by ezekg 7 hours ago
As someone who used phabricator and mercurial, using GitHub and git again feels like going back to the stone ages. Hopefully this and jujutsu can recreate stacked-diff flow of phabricator.
It’s not just nice for monorepos. It makes both reviewing and working on long-running feature projects so much nicer. It encourages smaller PRs or diffs so that reviews are quick and easy to do in between builds (whereas long pull requests take a big chunk of time).
I'm so glad git won the dvcs war. There was a solid decade where mercurial kept promoting itself as "faster than git*†‡" and every time I tried it wound up being dog slow (always) or broken (some of the time). Git is fugly but it's fast, reliable, and fugly, and I can work with that.
What is kind of funny here is that you're right locally. At the same time, the larger tech companies (Meta and Google, specifically) ended up building off of hg and not git because (at the time, especially) git cannot scale up to their use cases. So while the git CLI was super fast, and the hg CLI was slow, "performance" means more than just CLI speed.
I was never a fan of hg either, but now I can use jj, and get some of those benefits without actually using it directly.
>At the same time, the larger tech companies (Meta and Google, specifically) ended up building off of hg and not git because (at the time, especially) git cannot scale up to their use cases.
Fun story: I don't really know what Microsoft's server-side infra looked like when they migrated the OS repo to git (which, contrary to the name, contains more than just stuff related to the Windows OS), but after a few years they started to hit some object scaling limitations where the easiest solution was to just freeze the "os" repo and roll everyone over to "os2".
Right, and I'm glad there are projects serving The Cathedral, but I live in The Bazaar so I'm glad The Bazaar won.
The efforts to sell priest robes to fruit vendors were a little silly, but I'm glad they didn't catch on because if they had caught on they no longer would have been silly.
I remember using darcs, but the repos I was using it with were so small that performance really didn't matter…
This matches my experience 100%. I was about to write a similar comment before I saw yours.
Mercurial has a strictly superior API. The issue is solely that OG Mercurial was written in Python.
Git is super mid. It’s a shame that Git and GitHub are so dominant that VCS tooling has stagnated. It could be so so so much better!
> The issue is solely that OG Mercurial was written in Python.
Are we back to "programming language X is slow" assertions? I thought those had died long ago.
Better algorithms win over 'better' programming languages every single time. Git is really simple and efficient. You could reimplement it in Python and I doubt it would see any significant slowness. Heck, git was originally implemented as a handful of low level binaries stitched together with shell scripts.
Every time I've rewritten something from Python into Java, Scala, or Rust it has gotten around ~30x faster. Plus, now I can multithread too for even more speedups.
Python is absurdly slow - every method call is a string dict lookup (slots are way underused), everything is all dicts all the time, the bytecode doesn't specialize at all to observed types, it is a uniquely horrible slow language.
I love it, but python is almost uniquely a slow language.
Algorithms matter, but if you have good algorithms, or you're already linear time and just have a ton of data, I've seen 500x speedups from rewriting a single-threaded Python program as a multithreaded Rust program, where the algorithms were not improved at all.
It's the difference between a program running overnight vs. in 30 seconds. And if there are problems, the iteration speed from that is huge.
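A small sketch of the attribute-storage point above, in CPython terms: a plain instance carries a per-object `__dict__`, while a class that declares `__slots__` does not. The class names are made up for illustration, and this only shows the storage difference; it is not a benchmark.

```python
class PlainPoint:
    """Ordinary class: each instance stores attributes in its own dict."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlottedPoint:
    """Same fields, but __slots__ replaces the per-instance dict."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


plain = PlainPoint(1, 2)
slotted = SlottedPoint(1, 2)

# Attributes of a plain instance live in a string-keyed dict.
print(plain.__dict__)                      # {'x': 1, 'y': 2}

# A slotted instance has no __dict__ at all.
print(hasattr(slotted, "__dict__"))        # False

# Methods are found by name in the class namespace, which is also a mapping.
print("__init__" in PlainPoint.__dict__)   # True
```

How much this matters in practice depends on the interpreter version, since newer CPython releases cache many of these lookups.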
> [...], it is a uniquely horrible slow language.
To be fair, Python as implemented today is horribly slow. You could leave the language the same but apply all the tricks and heroic efforts they used to make JavaScript fast. The language would be the same, but the implementations would be faster.
Of course, in practice the available implementations are very much part of the language and its ecosystems; especially for a language like Python which is so defined by its dominant implementation of CPython.
Fair! I guess I didn't mean language as such, but as used.
But a lot of the monkey-patching kind of things and dynamism of python also means a lot of those sorts of things have to be re-checked often for correctness, so it does take a ton of optimizations off the table. (Of course, those are rare corner cases, so compilers like pypy have been able to optimize for the "happy case" and have a slow fall-back path - but pypy had a ton of incompatibility issues and now seems to be dying).
> every method call is a string dict lookup
Doesn't the Python VM have inline caches? [0]
I think that's a new thing from like python 3.12+ or something after I stopped using Python as much.
It didn't used to.
EDIT: python 3.11+: https://peps.python.org/pep-0659/
> Better algorithms win over 'better' programming languages every single time.
That's often true, but not "every single time".
I've rewritten a python tool in go, 1:1. And that turned something that was so slow that it was basically a toy, into something so fast that it became not just usable, but an essential asset.
Later on I also changed some of the algorithms to faster ones, but their impact was much lower than the language change.
> git was originally implemented as a handful of low level binaries stitched together with shell scripts.
A bunch of low level binaries stitched together with shell scripts is a lot faster than python, so not really sure what the point of this comparison is.
Python is an extremely versatile language, but if what you're doing is computing hashes and diffs, and generally doing entirely CPU-bound work, then it's objectively the wrong tool, unless you can delegate that to a fast, native kernel, in which case you're not actually using Python anymore.
Well, you can and people do use Python to stitch together low level C code. In that sense, you could go the early git approach, but use Python instead of shell as the glue.
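As a rough illustration of using Python as glue over native code: the standard library's `hashlib` runs the digest in C, so a Git-style blob ID can be computed without a Python-level hashing loop. The function name `git_blob_id` is made up for this sketch; the `blob <size>\0` header is the format Git uses when hashing blob objects in SHA-1 repositories.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git would assign to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    # hashlib runs the digest in native code; Python only assembles the bytes.
    return hashlib.sha1(header + content).hexdigest()


# The empty blob has a well-known ID in SHA-1 repositories.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Whether this still counts as "using Python" is exactly the disagreement in the comments above; the sketch only shows what the glue approach looks like.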
Python is by far the slowest programming language, an order of magnitude slower than other languages.
One of the reasons mercurial lost the dvcs battle is its performance. Even the mercurial folks admitted that was at least in part because of python.
> I thought those had died long ago.
No, it's always been true. It's just that at some point people got bored and tired of pointing it out.
You barely have to try to have Python be noticeably slow. It's the only language I have ever used where I was even aware that a programming language could be slow.
They died because everyone knows that Python is in fact very very slow. And that’s just totally fine for a vast number of glue operations.
It’s amusing you call Git fast. It’s notoriously problematic for large repos such that virtually every BigTech company has made a custom rewrite at some point or another!
Now that is interesting too, because git is very fast for all I have ever done. It may not scale to Google monorepo size, it would be the wrong tool for that. But if you are talking Linux kernel source scale, it absolutely is fast enough even for that.
For everything I've ever done, git was practically instant (except network IO of course). It's one of the fastest and most reliable tools I know. If it isn't fast for you, chances are you are on a slow Windows filesystem additionally impeded by a virus scanner.
The fact that Git has an extremely strong preference for storing full and complete history on every machine is a major annoyance! “Except for network IO” is not a valid excuse imho. Cloning the Linux kernel should take only a few seconds. It does not. This is slow and bad.
The mere fact that Git is unable to handle large binary files makes it an unusable tool for literally every project I have ever worked on in my entire career.
Whatever your opinion on one tool or another might be - it does seem weird that the "market" has been captured by what you are saying is a lesser product.
IOW, what do you know that nobody else does?
Network effects and marketing can easily prevent better tools from winning.
I mean, in the fickle world that is TECH, I am struggling to believe that that's what's happened.
I personally went from .latest.latest.latest.use.this (naming versions as latest) to tortoise SVN (which I struggled with) to Git (which I also was one of those "walk around with a few memorised commands" people that don't actually know how to use it) to reading the fine manual (well 2.5 chapters of it) to being an evangelist.
I've tried Mercurial, and, frankly, it was just as black magic as Git was to me.
That's network effects.
But my counter is - I've not found Mercurial to be any better, not at all.
I have made multiple attempts to use it, but it's just not doing what I want.
And that's why I'm asking, is it any better, or not.
Mercurial has a more consistent CLI, a really good default GUI (TortoiseHg), and the ability to remember what branch a commit was made on. It's a much easier tool to teach to new developers.
Hmm, that feels a bit subjective - I'm not going to say X is easier than Y when I've just finished saying that I found both tools to have a lot of black magic happening.
But what I will point out, for better or worse, people are now looking at LLMs as Git masters, which is effectively making the LLM the UI which is going to have the effect of removing any assumed advantage of whichever is the "superior" UX
I do wish to make absolutely clear that I personally am not yet ready to completely delegate VCS work to LLMs - as I have pointed out I have what I like to think of as an advanced understanding of the tools, which affords me the luxury of not having an LLM shoot me in the foot, that is solely reserved as my own doing :)
Worse products win all the time. Inertia is almost impossible to overcome. VHS vs Betamax is a classic. iPod wasn’t the best mp3 player but being a better mp3 player wasn’t enough to claw market share.
Google and Meta don’t use Git and GitHub. Sapling and Phabricator are much much better (when supported by a massive internal team).
That worse is better, and some people don't know better or care.
"better" in that sentence is very specific. Worse is also worse, and if you're one of the people for whom the "better" side of a solution doesn't apply, you're left with a mess that people celebrate.
Welcome to VHS and Betamax. The superior product does not always win the market.
Not always, but in this case the superior product (i.e. VHS) won. At initial release, Beta could only record an hour of content, while VHS could record 2 hours. Huge difference in functionality. The quality difference was there, but pretty modest.
I continue to use gerrit explicitly because I cannot stand github reviews. Yes, in theory, make changes small. But if I'm doing larger work (like updating a vendored dep, that I still review), reviewing files is... not great... in github.
Most editors have some kind of way to review github PRs in your editor. VSCode has a great one. I use octo.nvim since I use neovim.
Can these tools e.g. do per-commit review? I mean, it's not the UI that's the problem (though it's not ideal), it's the whole idea of commenting on the entire PR at once, partly ignoring the fact that the code in it changes with more commits pushed.
Phabricator and even Gerrit are significantly nicer.
Unless you have a “every commit must build” rule, why would you review commits independently? The entire PR is the change set - what’s problematic about reviewing it as such?
tangled.org supports native stacking with jujutsu, unlike github's implementation, you don't need to create a new branch per change: https://blog.tangled.org/stacking/
I miss the Phabricator review UI so much.
Same here. Don't understand why Github hasn't supported this until now. I'm tired of reviewing PRs with thousands of lines of changes, which are getting worse nowadays with vibe coding.
What does Facebook use internally these days? I'm amazed that the state of review tools is still at or behind what we had a decade ago for the most part.
It’s still phabricator
Any idea if their internal version has improved dramatically since they stopped maintaining the public version?
Me too. And I'm speaking from using it at Rdio 15 years ago.
Nothing since (Gerrit, Reviewboard, Github, Critique) has measured up...
Oh, phabricator. I hated that tool with a passion. It always destroyed my carefully curated PR branch history.
See https://stackoverflow.com/questions/20756320/how-to-prevent-...
I might be missing something, but what I need is not "stacked PR" but a proper UI and interface to manage single commit:
- merge some commits independently when partial work is ready.
- mark some commit as reviewed.
- UI to do interactive rebase and squash and edit individual commits. (I can do that well from the command line, but not when using the GitHub interface, and somehow not everyone from my team is familiar with that)
- ability to attach a comment to a specific commit, or to the commit message.
- better way to visualize what changed over time in each forced push/revision (diff of diff)
Git itself already has the concept of commit. Why put this "stacked PR" abstraction on top of it?
Or is there a difference I don't see?
It's basically trying to bring the stacked diff workflow pioneered by Phabricator to GitHub.
The idea is that it allows you to better handle working on top of stuff that's not merged yet, and makes it easier for reviewers to review pieces of a larger stack of work independently.
It's really useful in larger corporate environments.
I've used stacked PRs when doing things like upgrading react-native in a monorepo. It required a massive amount of changes, and would be really hard to review as a single pull request. It has to be landed all at once, it's all or nothing. But being able to review it as smaller independent PRs is helpful.
Stacking PRs is also useful even when you don't need to merge the entire stack at once.
> stacked diff workflow pioneered by Phabricator
Ahem, pioneered by gerrit. But actually, I'm almost certain even that wasn't original art. I think gerrit just brought it to git.
To my knowledge, stacked diffs were first done in the Linux kernel as stacks of patches sent over email. From there they spread to Google and Facebook. (Source: I worked on Facebook's source control team from 2012-2018 and did a lot of work to enable stacked diffs there.)
Right, I was thinking from a web-based UI. The "pull request" term is from git (AFAIK), but git itself was built to accommodate the earlier concept of mailing patches around. (Source: I've been using version control since RCS/SCCS days and contributed here and there to git in its infancy. Also an early user/contributor to Gerrit.)
Congrats and thank you. You helped build one of the best devex experiences I've ever had the pleasure of working with :)
At some point, a derivative idea becomes so different from the original one that it’s a novel idea in essence. Just like SMS is ultimately a derivative of cuneiform tablets, and yet it isn’t in any meaningful sense.
I don't think mailing stacks of patches is that different? As someone who built this stuff it was pretty obvious to me that web-based patch stack management was a relatively small evolution over mailing lists. Tools like patchwork bridged the gap initially, and we were quite familiar with them.
Imagine getting a cuneiform tablet by courier telling you that you have unpaid parking tickets in a state you've never driven in
I'm not in a large corporate environment, but that also means we're not always a well oiled machine, and sometimes i am writing faster than the reviewer can review for a period of time -- and i really need the stacking then too.
What if main/master moves in between reviews?
You head to the farthest branch in the chain, fetch the latest main, and run `git rebase --update-refs main` (I prefer interactive mode myself) and then force push all of the branches from start to the end.
1: https://git-scm.com/docs/git-rebase#Documentation/git-rebase...
Before this feature, when you were doing it manually, it was a huge problem. One of the points of this feature is that it automates rebasing the whole stack.
you just rebase it? what's the big deal?
I don't use Github but I do work at one of the companies that popularized this workflow and it is extremely not a big deal. Pull, rebase, resolve conflicts if necessary, resubmit.
Constantly rewriting git history with squashes, rebases, manual changes, and force pushes has always seemed like leaving a loaded gun pointed at your foot to me.
Especially since you get all of the same advantages with plain old stream of consciousness commits and merges using:
git merge --no-ff
git log --first-parent
git bisect start --first-parent
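To make the `--first-parent` idea concrete, here is a toy model (not Git itself): each commit lists its parents with the mainline parent first, and a first-parent walk follows only that one. The commit names and the `first_parent_log` helper are invented for this sketch.

```python
# Toy commit graph: commit id -> list of parent ids (first parent = mainline).
PARENTS = {
    "M0": [],
    "f1": ["M0"],          # work-in-progress commit on a feature branch
    "f2": ["f1"],          # another feature-branch commit
    "M1": ["M0", "f2"],    # --no-ff merge of the feature branch into main
    "g1": ["M1"],
    "M2": ["M1", "g1"],    # --no-ff merge of a second branch
}


def first_parent_log(tip):
    """Walk from tip following only the first parent of each commit."""
    history = []
    current = tip
    while current is not None:
        history.append(current)
        parents = PARENTS[current]
        current = parents[0] if parents else None
    return history


print(first_parent_log("M2"))  # ['M2', 'M1', 'M0']
```

The branch-internal commits (f1, f2, g1) stay in the graph but are never visited, which is why a first-parent log or bisect only sees the merge points.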
I find rebases are only a footgun because the standard git cli is so bad at representing them - things like --force being easier to write than --force-with-lease, there being no way to easily absorb quick fixes into existing commits, interdiffs not really being possible without guesswork, rebases halting the entire workflow if they don't succeed, etc.
I've switched over pretty much entirely to Jujutsu (or JJ), which is an alternative VCS that can use Git as its backend so it's still compatible with Github and other git repos. My colleagues can all use git, and I can use JJ without them noticing or needing to care. JJ has merges, and I still use them when I merge a set of changes into the main branch once I've finished working on it, but it also makes rebases really simple and eliminates most of the footguns. So while I'm working on my branch, I can iteratively make a change, and then squash it into the commit I'm working on. If I refactor something, I can split the refactor out so it's in a separate commit and therefore easiest to review and test. When I get review feedback, I can squash it directly into the relevant commit rather than create a new commit for it, which means git blame tends to be much more accurate and helpful - the commit I see in the git blame readout is always the commit that did the change I'm interested in, rather than maybe the commit that was fixing some minor review details, or the commit that had some typo in it that was fixed in a later commit after review but that relationship isn't clear any more.
And while I'm working on a branch, I still have access to the full history of each commit and how it's changed over time, so I can easily make a change and then undo it, or see how a particular commit has evolved and maybe restore a previous state. It's just that the end result that gets merged doesn't contain all those details once they're no longer relevant.
+1 on this, I also switched to jj when working with any git repo.
What's funny is how much better I understand git now, and despite using jj full time, I have been explaining concepts like rebasing, squashing, and stacked PRs to colleagues who exclusively use git tooling
The magic of the git cli is that it gives you control. Meaning whatever you want to do can be done. But it only gives you the raw tools. You'll need to craft your own workflow on top of that. Everyone's workflow is different.
> So while I'm working on my branch, I can iteratively make a[...]which means git blame tends to be much more accurate and helpful
Everything here I can do easily with Magit with a few keystrokes. And magit sits directly on top of git, just with interactivity. Which means if I wanted to I could write a few scripts with fzf (to help with selection) and they would be quite short.
> And while I'm working on a branch, I still have access to the full history of each commit...
Not sure why I would want the history for a specific commit. But there's the reflog in git which is the ultimate undo tool. My transient workspace is only a few branches (a single one in most cases). And that's the few commits I worry about. Rebase and Revert has always been all I needed to alter them.
Until someone merges master into their feature branch rather than rebasing it. (And then that branch later gets merged.)
This shouldn't be a problem if you stick to commits and merges. --first-parent will skip past commits, including merge commits, in merged branches.
I agree. PR merges for me are bisect points. That's when changes are introduced. Individual commits don't even always build.
And I don't rebase or squash because I need provenance in my job.
the best implementation i've worked with was SuperSmartLog (SSL) at Meta, which was open-sourced as Interactive Smartlog (https://sapling-scm.com/docs/addons/isl/). There are also extensions for it in VSCode, etc.
Surprisingly it never gained the adoption it deserved.
Workflows can vary, but what I like:
PR/MR is an "atomic" change (ideally the smallest change that can be landed separately - smallest makes it easier to review, bisect and revert)
Individual commits (or what "versions" are in Phabricator) are used for the evolution of the PR/MR to achieve that change.
But really I have 2 use cases for the commits:
1. the PR/MR is still too big, so I split it into individual commits (I know they will land together)
2. I keep the history of the evolution of the PR/MR in the commits ("changed foo to bar cause its a better approach")
Finally!
I never understood the PR=branch model GitHub defaulted to. Stacked commits (ala Phabricator/Gerrit) always jived more with how my brain reasons about changes.
Glad to see this option. I guess I'll have to install their CLI thing now.
My only complaint off the bat is the reliance on the GH CLI, which I don't use either. But maybe by the time it's GA they'll have added UI support.
You can in fact do this from the web UI: https://github.github.com/gh-stack/guides/ui/#creating-a-sta...
I must have missed that. Amazing! From a reviewer's POV, this will be so nice to at the very least remove diff noise for PRs built on top of another PR. I usually refrain from reviewing child PRs until the parent is merged and the child can be rebased, for the sole reason that the diffs are hard to review i.r.t. what came from where.
damn, I missed it as well
presenting only cli commands in announcement wasn't a good choice
Stacked PRs can be created via the UI, API, or CLI.
You can also run a combination of these. For ex, use another tool like jj to develop locally, push up the branches, and use the gh CLI to batch create a stack of n PRs, without touching local state.
CLI is great because now I can tell my AI agent to do it. “Fix all dependabot security issues (copy logs) and run tests to validate functionality. Create each dependency as its own stack (or commit) so that contributors may review each library update easily.”
Wait 10 minutes and you’re done.
We're shipping a skill file with the CLI: https://skills.sh/github/gh-stack/gh-stack
Everyone will have their own way of structuring stacks, but I've found it great for the agent to plan a stack structure that mirrors the work to be done.
It seems partially exposed in the UI with that dropdown. There's an 'add' and 'unstack' button.
Probably relies on some internal metadata.
Huh interesting, my mental model is unable to see any difference between them.
I mean a branch is just jamming a flag into a commit with a polite note to move the flag along if you're working on it. You make a long trail, leave several flags and merge the whole thing back.
Of course leaving multiple waypoints only makes sense if merging the earlier parts makes any sense, and if the way you continue actually depends on the previous work.
If you can split it into several small changes made to a central branch it's a lot easier to merge things. Otherwise you risk making a new feature codependent on another even if there was no need to.
Does it fix the current UX issue with Squash & Merge?
Right now I manually do "stacked PRs" like this:
main <- PR A <- PR B (PR B's merge target branch is PR A) <- PR C, etc.
If PR B merges first, PR A can merge to main no problems. If PR A merges to main first, fixing PR B is a nightmare. The GitHub UI automatically changes the "target" branch of the PR to main, but instantly conflicts spawn from nowhere. Try to rebase it and you're going to be manually looking at every non-conflicting change that ever happened on that branch, for no apparent reason (yes, the reason is that PR A merging to main created a new merge commit at the head of main, and git just can't handle that or whatever).
So I don't really need a new UI for this, I need the tool to Just Work in a way that makes sense to anyone who wasn't Linus in 1998 when the gospel of rebase was delivered from On High to us unwashed Gentry through his fingertips..
Yes, we handle this both in the CLI and server using git rebase --onto
git rebase --onto <new_commit_sha_generated_by_squash> <original_commit_sha_from_tip_of_merged_branch> <branch_name>
So for ex in this scenario:
PR1: main <- A, B (branch1)
PR2: main <- A, B, C, D (branch2)
PR3: main <- A, B, C, D, E, F (branch3)
When PR 1 and 2 are squash merged, main now looks like:
S1 (squash of A+B), S2 (squash of C+D)
Then we run the following:
git rebase --onto S2 D branch3
Which rewrites branch3 to:
S1, S2, E, F
This operation moves the unique commits from the unmerged branch and replays them on top of the newly squashed commits on the base branch, avoiding any merge conflicts.
Conflicts spawn most likely because PR A was squashed, and once you squash Git doesn't know that PR B's ancestors commits are the same thing as the squashed commit on main.
No idea if this feature fixes this.
Edit: Hopefully `gh stack sync` does the rebasing correctly (rebase --onto with the PR A's last commit as base)
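A toy sketch of what the `git rebase --onto S2 D branch3` step in the scenario above does, with histories modelled as plain lists. This is not how Git or `gh stack` is implemented; `rebase_onto` is a hypothetical helper for illustration only.

```python
def rebase_onto(new_base_history, upstream, branch):
    """Toy version of `git rebase --onto <new_base> <upstream> <branch>`.

    Histories are plain lists of commit names, oldest first. Commits on
    `branch` that come after `upstream` are replayed on top of the new base.
    """
    cut = branch.index(upstream) + 1
    unique_commits = branch[cut:]
    # Real Git creates new commit objects here; the primes mark that.
    replayed = [name + "'" for name in unique_commits]
    return new_base_history + replayed


main_after_squash = ["S1", "S2"]   # S1 = squash of A+B, S2 = squash of C+D
branch3 = ["A", "B", "C", "D", "E", "F"]

print(rebase_onto(main_after_squash, "D", branch3))
# ['S1', 'S2', "E'", "F'"]
```

The key input is the old tip of the already-merged branch (D here): everything up to and including it is dropped, so only E and F are replayed.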
> Conflicts spawn most likely because PR A was squashed, and once you squash Git doesn't know that PR B's ancestors commits are the same thing as the squashed commit on main.
Yeah, and I kind of see how git gets confused because the squashed commits essentially disappear. But I don't know why the rebase can't be smart when it sees that file content between the eventual destination commit (the squash) is the same as the tip of the branch (instead of rebasing one commit at a time).
Because at first you have this
main <- PR A <- PR B
Then you'll have
main, squashed A
  \
   \-> PR A -> PR B
The tip of B is the list of changes of both A and B, while the tip of main is now the squashed version of the changes of A. Unless a branch tracks the end of A in PR B, it looks more like you want to apply A and B on top of A again.
A quick analogy to math
main is X
A is 3
B is 5
Before you have X + 3 + 5 which was equivalent to X + 8, but then when you squash A onto X, it looks like (X + 3) + (3 + 5) from `main`'s point of view, while from B, it should be X + (3 + 5). So you need to rebase B to remove its 3 so that it can be (X + 3) + 5.
Branches only store the commit at the top. The rest is found using the parent metadata in each commit (a linked list). Squashing A does not remove its commits. It creates a new one, with the tip of `main` as its parent, and sets the new commit as the tip of `main`. But the list of commits in B still refers to the old tip of `main` as its ancestor and still includes the old commits of A. Which is why you can't merge the PR: it would apply the commits of A twice.
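The linked-list point can be sketched with a toy commit graph (again, a model and not Git itself). After the squash, main can only reach the new squash commit and the old base, so A's original commits still count as work that only B carries. All commit names and the `ancestors` helper are invented for this sketch.

```python
# Toy commit graph after PR A was squash-merged: commit id -> parent ids.
PARENTS = {
    "m1": [],        # tip of main when the stack was started
    "a1": ["m1"],    # PR A
    "a2": ["a1"],
    "b1": ["a2"],    # PR B, stacked on A
    "b2": ["b1"],
    "SA": ["m1"],    # squash of a1+a2: a new commit whose parent is m1
}


def ancestors(tip):
    """Return the set of commits reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen


main_tip, b_tip = "SA", "b2"

# Commits on B that main cannot reach: this is what the PR still carries.
only_on_b = ancestors(b_tip) - ancestors(main_tip)
print(sorted(only_on_b))  # ['a1', 'a2', 'b1', 'b2']
```

Here a1 and a2 show up as B's work even though SA already contains their changes, which is the (X + 3) + (3 + 5) situation from the analogy.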
I'm not sure I follow your workflow exactly. If PR B is merged, then I'd expect PR A to already be merged (I'd normally branch off of A to make B.)
That said, after the squash merge of A and git fetch origin, you want something like git rebase --update-refs --onto origin/main A C (or whatever the tip of the chain of branches is)
The --update-refs will make sure pr B is in the right spot. Of course, you need to (force) push the updated branches. AFAICT the gh command line tool makes this a bit smoother.
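A toy model of what `--update-refs` adds on top of `--onto`, assuming a linear stack with one branch per PR. Commit names, branch names, and the `rebase_with_update_refs` helper are all hypothetical; real Git works on commit objects and refs, not lists and dicts.

```python
def rebase_with_update_refs(commits, refs, upstream, new_base):
    """Toy model of `git rebase --update-refs --onto <new_base> <upstream> <tip>`.

    commits:  the stack's history as a list of names, oldest first.
    refs:     branch name -> commit name it points at.
    new_base: history of the new base as a list, oldest first.
    Commits after `upstream` are replayed onto `new_base`, and every branch
    that pointed at a replayed commit is moved to the rewritten copy.
    """
    cut = commits.index(upstream) + 1
    rewritten = {old: old + "'" for old in commits[cut:]}
    new_history = new_base + [rewritten[old] for old in commits[cut:]]
    new_refs = {name: rewritten.get(target, target) for name, target in refs.items()}
    return new_history, new_refs


stack = ["a1", "a2", "b1", "b2", "c1"]
refs = {"A": "a2", "B": "b2", "C": "c1"}

# A was squash-merged into main as SA; rebase the rest of the stack onto it.
history, moved = rebase_with_update_refs(stack, refs, upstream="a2", new_base=["SA"])
print(history)
print(moved)
```

In this model branch B follows its commit to the rewritten copy, while A stays on the old, already-merged commits. Without the ref update, only the tip branch C would move and B would be left pointing at the pre-rebase history.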
I agree that this is annoying and unintuitive. But I don’t see the simplest solution here, so:
All you need to do is pull main, then do an interactive rebase with the next branch in your stack with ‘git rebase -i main’, then drop all the commits that are from the branch you just merged.
I typically prefix my commit messages with the ticket number to make it easier to spot the commits to drop.
If I'm following correctly, the conflicts arise from other commits made to main already - you've implicitly caught branch A up to main, and now you need to catch branch B up to main, for a clean merge.
I don't see how there is any other way to achieve this cleanly, it's not a git thing, it's a logic thing right?
I've no issue with the logic of needing to update feature branches before merging, that's pretty bread and butter. The specific issue with this workflow is that the "update branch" button for PR B is grayed out because there are these hallucinated conflicts due to the new squash commit.
The update branch button works normally when I don't stack the PRs, so I don't know. It just feels like a half baked feature that GitHub automatically changes the PR target branch in this scenario but doesn't automatically do whatever it takes for a 'git merge origin/main' to work.
> the "update branch" button for PR B is grayed out because there are these hallucinated conflicts due to the new squash commit
Those are not hallucinated. PR B still contains all the old commits of A, which means merging would apply them twice. The changes in PR B are computed according to the oldest commits belonging to PR B and main which is the parent of squashed A. That would essentially mean applying A twice, which is not good.
As for updating PR B, PR B doesn't know where PR A's commits (which are also in PR B) end, because PR A is not in main. Squashed A is a new commit and its diff corresponds to the diff of a range of commits in PR B (the old commits of PR A), not the whole B. There's a lot of metadata you'd need to store to be able to update PR B.
No, it's a Git thing arising from squash commits. There are workflows to make it work (I've linked the cleanest one I know that works without force pushing), but ultimately they're basically all hacks. https://www.patrickstevens.co.uk/posts/2023-10-18-squash-sta...
This is actually a reasonable workflow, although it requires some preparation. I’ll try it out!
Yep that's how I do it if I have to deal with stacked PRs. I also just never use rebase once anything has happened in a PR review that incurs historical state, like reviews or other people checking out the branch (that I know of, anyways). I'll rebase while it's local to keep my branch histories tidy, but I'll merge from upstream once shared things are happening. There are a bunch of tools out there for merging/rebasing entire branch stacks, I use https://github.com/dashed/git-chain.
Oh that's annoying, seems to me there wouldn't have been an issue if you just merged B into A after merging A into main, or the other way around but that already works fine as you pointed out.
I mean if you've got a feature set to merge into dev, and it suddenly merges into main after someone merged dev into main then that's very annoying.
You "just" need to know the original merge-base of PR B to fix this. github support is not really required for that. To me that's the least valuable part of support for stacked PRs since that is already doable yourself.
The github UI may change the target to main but your local working branch doesn't, and that's where you `rebase --onto` to fix it, before push to origin.
It's appropriate for github to automatically change the target branch, because you want the diff in the ui to be representative. IIRC gitlab does a much better job of this but this is already achievable.
What is actually useful with natively supported stacks is if you can land the entire stack together and only do 1 CI/actions run. I didn't read the announcement to see if it does that. You typically can't do that even if you merge PR B,C,D first because each merge would normally trigger CI.
EDIT: i see from another comment (apparently from a github person) that the feature does in fact let you land the entire stack and only needs 1 CI run. wunderbar!
Curious how / how well it deals with conflicts in the different branches that are part of the stack. Is there some support for managing that, or what happens when two of the branches don't rebase / merge cleanly?
As a solo dev I rarely need stacked PRs, but the underlying problem, keeping PRs small and reviewable, is real even when you're your own reviewer. I've found that forcing myself to break work into small branches before I start (rather than retroactively splitting a giant branch) is the actual discipline. The tooling just makes it less painful when you don't.
Curious whether this changes anything for the AI-assisted workflow. Right now I let Claude Code work on a feature branch and it naturally produces one big diff. Stacked PRs could be interesting if agents learned to split their own work into logical chunks.
The tooling for that already exists, since a PR can consist of multiple Git commits and you can look at them separately in the UI. I don't know whether agents are any good at navigating that, but if not, they won't do any better with stacked PRs. Stacked PRs do create some new affordances for the review process, but that seems different from what you're looking for.
Looking at multiple commits is not a good workflow:
* It amounts to doing N code reviews at once rather than a few small reviews which can be done individually
* Github doesn't have any good UI to move between commits or to look at multiple at once. I have to find them, open them in separate tabs, etc.
* Github's overall UX for reviewing changes, quickly seeing a list of all comments, etc. is just awful. Gerrit is miles ahead. Microsoft's internal tooling was better 16 years ago.
* The more commits you have to read through at once the harder it is to keep track of the state of things.
It's crazy that you're getting downvoted for this take.
This isn't reddit people. You're not supposed to downvote just because you disagree. Downvotes are for people who are being assholes, spamming, etc...
If you disagree with a take, reply with a rebuttal. Don't just click downvote.
Historically, hn etiquette is that it's fine to downvote for disagreement. This came from pg himself.
That said, while he hasn't posted here for a long time, this is still in the guidelines:
> Please don't post comments saying that HN is turning into Reddit. It's a semi-noob illusion, as old as the hills.
>It amounts to doing N code reviews at once rather than a few small reviews which can be done individually
I truly do not comprehend this view. How is reviewing N commits any different from, or any more work than, reviewing N separate pull requests? It's the same constant.
Small reviews allow moving faster for both the author and reviewer.
A chain of commits:
* Does not go out for review until the author has written all of them
* Cannot be submitted even in partial form until the reviewer has read all of them
Reviewing a chain of commits, as the reviewer I have to review them all. For 10 commits, this means setting aside an hour or whatever - something I will put off until there's a gap in my schedule.
Stacked commits can go out for review as soon as each commit is ready. I can review a small CL very quickly and will generally do so almost as soon as I get the notification. The author is immediately unblocked. Any feedback I have can be addressed immediately before the author keeps building on top of it.
Let's compare 2 approaches to delivering commits A, B, C.
Single PR with commits A, B, C: You must merge all commits or no commits. If you don't approve of all the commits, then none of the commits are approved.
3 stacked PRs: I approve PR A and B, and request changes on PR C. The developer of this stack is on vacation. We can incrementally deliver value by merging PRs A and B since those particular changes are blocking some other engineer's work, and we can wait until dev is back to fix PR C.
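To make the comparison concrete, here is a minimal sketch of the rule a stack implies: whatever is approved contiguously from the bottom can land. The function name and data shapes are made up for illustration and are not anything GitHub exposes:

```python
def landable_prefix(stack, approved):
    """Return the PRs that can merge now.

    `stack` is ordered bottom to top. A PR can only land if everything
    below it lands too, so we stop at the first unapproved PR.
    """
    ready = []
    for pr in stack:
        if pr not in approved:
            break
        ready.append(pr)
    return ready


stack = ["A", "B", "C"]

# A and B approved, C needs changes: A and B can ship today.
print(landable_prefix(stack, {"A", "B"}))  # ['A', 'B']

# B and C approved but A rejected: nothing can ship.
print(landable_prefix(stack, {"B", "C"}))  # []
```

With a single PR the same rule degenerates to all or nothing, which is the difference being described.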
I have had a lot of success with Claude and jj, telling it to take the stack of work it's done and build me a new stack on top of trunk that's centered around ease of reviewing.
I once threatened Claude with having to learn jj after it did some crazy git rebase gymnastics. The problem is clearly that I don't know jj.
It sometimes will hallucinate older CLI options, because jj has changed at various times, but it's pretty decent with it at this point. The harder part is that a lot of plugins hardcode git into them.
Maybe there’s a git trick I don’t know, but I’ve found making small branches off each other painful. I run into trouble when I update an earlier branch and all the dependent branches get out of sync with it. When those earlier branches get rebased into master it becomes a pain to update my in-progress branches as well
If I understood you correctly, you want to propagate changes in a branch to other branches that depend on it? Then --update-refs is for you[1]. That way, you only need to update the "latest" branch.
[1] https://andrewlock.net/working-with-stacked-branches-in-git-...
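For anyone wondering what that option buys you, here is a toy model of the idea, not real git. Branches are just names pointing at commit labels on one linear history, and `rebase_update_refs` is a hypothetical helper that mimics what `git rebase --update-refs` does to the intermediate branches when you rebase the top one:

```python
def rebase_update_refs(new_base, old_base, tip_history, refs):
    """Toy model of rebasing the top branch of a stack with --update-refs.

    `tip_history` is the top branch's history, oldest first. `refs` maps
    branch name -> the commit label that branch points at.
    """
    old = set(old_base)
    # Every commit not already in the old base gets rewritten.
    rewritten = {c: c + "'" for c in tip_history if c not in old}
    new_history = new_base + list(rewritten.values())
    # Move each branch to the rewritten copy of its commit.
    new_refs = {name: rewritten.get(commit, commit) for name, commit in refs.items()}
    return new_history, new_refs


old_main = ["m1"]
new_main = ["m1", "m2"]
top = ["m1", "a1", "b1", "c1"]
refs = {"feature-a": "a1", "feature-b": "b1", "feature-c": "c1"}

history, moved = rebase_update_refs(new_main, old_main, top, refs)
print(history)
print(moved)
```

Without the ref updates, only `feature-c` would move and the two lower branches would keep pointing at the old commits, which is the out-of-sync state described above.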
Stacking branches for any extended period of time is definitely a poor mixing of the concepts of branches and commits. If you have a set of changes you need to keep in order, but you also need to maintain multiple silos where you can cleanly allow the code to diverge, that divergence constitutes the failure of your efforts to keep the changes in order.
Until you can make it effortless, maintaining a substantial commit structure and constantly rebasing to add changes to the proper commit quickly turns into more effort than just waiting to the end and manually editing a monster diff into multiple sensible commits. But we take the challenge and tell ourselves we can do better if we're proactive.
This is what I understood as well, but it sounded like GP had success doing it; so I was curious if there was a trick I didn’t know about
I take from GP that they try to make their branches small, and keep the cycle of development->review->merging small, so that the problem stacked PRs seeks to solve doesn't materialize in the first place.
Stacked PRs in my experience has primarily been a request to merge in a particular order. If you're the only merger, as in GP's case, there's no need to request this of yourself.
Whenever I send a big diff, I spend some time annotating it with comments first to help the reviewer. I put a good summary of the changes in the description, then I annotate the diff of the PR, explaining approaches, the design of specific changes, tricky parts of the code, boilerplate, and so on. Trying to guess the context is where the review bottleneck is, so I present it alongside the code.
There’s a startup called Graphite dedicated to stacked PRs. I have been using them for a while now, and I always wondered why GitHub doesn’t implement something similar. I will probably try switching to GitHub to see if it works flawlessly.
I think the core conceptual difference between a stacked diff and PRs as we use them in open source is the following:
A PR is basically a cyberspatial concept saying "I, as a dog on the internet, am asking you to accept my patches" like a mailing list - this encourages trying to see the truth in the whole. A complete feature. More code in one go because you haven't pre-agreed the work.
Stacks are for the opposite social model. You have already agreed what you'll all be working on but you want to add a reviewer in a harmonious way. This gives you the option to make many small changes, and merge from the bottom
Very cool that GitHub actually put stacks in the UI vs. GitLab's `glab stack`[0] (which looks just like the `gh stack` part of GitHub's thing).
One part that seems like it's going to feel a little weird is how merging is set up[1].
That is, if I merge the bottom of the stack, it'll rebase the others in the stack, which will probably trigger a CI test run. So, if I have three patches in the stack, and I want to merge the bottom two, I'd merge one, wait for tests to run on the other, merge the second vs. merge just those two in one step (though, without having used it, can't be sure about how this'd work in practice—maybe there's some way to work around this with restacking?)
[0]: <https://docs.gitlab.com/cli/stack/>
[1]: <https://github.github.com/gh-stack/guides/stacked-prs/#mergi...>
> So, if I have three patches in the stack, and I want to merge the bottom two, I'd merge one, wait for tests to run on the other, merge the second vs. merge just those two in one step
As we have it designed currently, you would have to wait for CI to pass on the bottom two and then you can merge the bottom two in one step. The top of the stack would then get rebased, which will likely trigger another CI run.
Thanks for the callout - we'll update those docs to make it clear multiple PRs can be merged at once.
I really don't get the point of stacked PRs.
Just using git, you'd send a set of patches, which can be reviewed, tested and applied individually.
The PR workflow makes a patch series an indivisible set of changes, which must be reviewed, tested and applied in unison.
And stacked PRs try to work around this issue, but the issue is how PRs are implemented in the first place.
What you really want is the ability to review individual commits/patches again, rather than work on entire bundles at once. Stacked PRs seem like a second layer of abstraction to work around issues with the first layer of abstractions.
Exactly! A stack of PRs is really the same beast as a branch of commits.
The traditional tools (mailing-lists, git branches, Phabricator) represented each change as a difference between an old version of the code and the proposed new version. I believe Phabricator literally stored the diff. They were called “diffs” and you could make a new one by copying and pasting into a <textarea> before pressing save*.
The new fangled stuff (GitHub and its clones) recorded your change as being between branches A and B, showed you the difference on the fly, and let you modify branch B. After fifteen years of this we are now seeing the option for branch A to be something other than main, or at least for this to be a well supported workflow.
In traditional git land, having your change as a first class object — an email or printout or ph/D1234 with the patch included — was the default workflow!
*Or some other verb meaning save.
The teams that I have worked with still apply the philosophy you’re describing, but they consider PRs to be the “commit”, i.e. the smallest thing that is sane to apply individually.
Then the commits in the PR are not held to the standard of being acceptable to apply, and they are squashed together when the PR is merged.
This allows for a work flow in which up until the PR is merged the “history of developing the PR” is preserved but once it is merged, the entire PR is applied as one change to the main branch.
This workflow combined with stacked PRs allows developers to think in terms of the “smallest reviewable and applicable change” without needing to ensure that during development their intermediate states are safe to apply to main.
It’s useful for large PRs in large repos with many contributors. It reduces the burden for reviewers.
this works much better in Phabricator because commits to diffs are a 1:1 relationship, diffs are updated by amending the commit, etc., the Github implementation does seem a bit like gluing on an additional feature.
Right, a PR is "just" a set of commits (all must be in the same branch) that are intended to land atomically.
Stacked PRs are not breaking up a set of commits into divisible units. Like you said, you can already do that yourself. They let you continue to work off of a PR as your new base. This lets you continue to iterate asynchronously to a review of the earlier PRs, and build on top of them.
You often, very often, need to stage your work into reviewer-consumable units. Those units are the stack.
Is this going to be part of a triage task? If so, it makes sense. Whether a human developer or an AI made a big PR, an AI reviews it and, if necessary, splits it into stacked PRs. I don’t see any human contributors using this feature, to be honest, because it’s extra work and they should have found a better way to suggest a large PR.
I thrive on stacked PRs but this sure seems like a weird way to implement support for it. Just have each branch point to their parent in the chain, the end. Just native Git. I've been longing for better GitHub support for this but the CLI is not where I need that support: just the UI.
Rebasing after merging a base branch becomes a pain though, when you do this. IMO the CLI will be nice to automate the process of rebasing each branch on its parent.
Agreed. I do have tooling for a rebase + push flow, but it simply calls native git commands.
The CLI is completely optional, you can create stacked PRs purely via the UI.
Also the rationale for having a chain of branches pointing to each other was so the diff in a PR shows just the relevant changes from the specific branch, not the entire set of changes going back to the parent/trunk.
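A small sketch of that rationale, again as a toy model with histories as lists of commit labels. `pr_commits` is a hypothetical helper that approximates "what a PR shows" as the commits on the head branch that are not on its base:

```python
def pr_commits(head, base):
    """Commits a PR would show: those on `head` that are not on `base`."""
    base_commits = set(base)
    return [c for c in head if c not in base_commits]


main = ["m1"]
branch_a = main + ["a1", "a2"]
branch_b = branch_a + ["b1"]

# PR for branch_b opened against main: reviewers see A's work too.
print(pr_commits(branch_b, main))      # ['a1', 'a2', 'b1']

# PR for branch_b opened against branch_a: only B's own change.
print(pr_commits(branch_b, branch_a))  # ['b1']
```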
Curious how you're thinking about it?
+1 this isn’t something new, it’s been possible all along in native git if you’re willing to do branch management and rebasing yourself. Just without the fancy UI / stack map.
I find this puzzling. It does not seem to allow stacking PRs on top of other people's PRs?
There is already an option to enable review comments on individual commits (see the API endpoint here: https://docs.github.com/en/rest/guides/working-with-comments...). Self-stacking PRs seem redundant.
Still feels like an alpha version right now. I'm sure they will add it later.
Graphite (which they seem to be inspired by) has frozen branches exactly for that use case:
This API leaves a comment on the commit; not quite the same thing since in GH, several operations are tied to PRs and not to commits.
Maybe this is just a skill issue, but even with several attempts I just can't figure out why I would use stacked diffs/PRs. Though maybe that's because of the way I work?
I notice a lot of examples just vaguely mention "oh, you can have others review your previous changes while you continue working", but this one doesn't make sense to me. Oftentimes, the first set of commits doesn't even make it to the end result. I'm working on a feature using lexical, and at this point I had to rewrite the damn thing 3 times. The time of other devs is quite valuable and I can't imagine wasting it by having them review something that doesn't even make it in.
Now, I have been in situations where I have some ready changes and I need to build something on top. But it's nothing that making another branch on top and rebasing once the original is merged wouldn't solve.
Is this really worth so much hype?
We use this feature extensively at $dayjob.
Imagine you have some task you are working on, and you wish to share your progress with people in bite sized chunks that they can review one at a time, but you also don’t want to wait for their reviews before you continue working on your task.
Using a stacked set of PRs you can continue producing new work, which depends on the work you’ve already completed, without waiting for the work you’ve already completed to be merged, and without putting all your work into one large PR.
in Phabricator you either abandon the original diffs entirely, or you amend them. you don't just stack more commits with meaningless messages like "WIP", "lint fix", etc. on top.
> The time of other devs is quite valuable and I can't imagine wasting it by having them review something that doesn't even make it in.
this is not what stacked diffs are for. stacked diffs don't mean putting up code that isn't ready. for example, you are updating some library that needs an API migration, or a compiler version that adds additional stricter errors. you need to touch hundreds of files around the repository to do this. rather than putting up one big diff (or PR) you stack up hundreds of them that are trivial to review on their own, they land immediately (mitigating the risk of merge conflicts as you keep going), then one final one that completes the migration.
I also branch out, and rebase. Also, keep updating and rebasing until merged. It’s tedious when PRs take ages for approval, as I keep creating new branches on top of each other.
So when I saw this announcement it seemed interesting, but I don’t see the point of it yet.
GitLab's UI around MRs (PRs) is IMO miles better than what GH's been offering. Try creating a PR from branch A to main, and then rebasing A. GitLab handles this fine and can show you changes between the two revisions; GitHub is completely lost.
> a chain of small, focused pull requests that build on each other — each one independently reviewable.
I have never understood what this even means.
Either changes are orthogonal (and can be merged independently), or they’re not. If they are, they can each be their own PR. If they’re not, why do you want to review them independently?
If you reject change A and approve change B, nothing can merge, because B needs A to proceed. If you approve change A and reject change B, then the feature is only half done.
Is it just about people wanting to separate logical chunks of a change so they can avoid getting distracted by other changes? Because that seems like something you can already do by just breaking a PR into commits and letting people look at one of those at a time.
I’ve tried my best to give stacked-diff proponents the benefit of the doubt but none of it actually makes sense to me.
we have been stacking on tangled.org for a while now, you can see a few examples of stacks we have made here: https://tangled.org/tangled.org/core/pulls?state=merged&q=st...
for example, this stack adds a search bar: https://tangled.org/tangled.org/core/pulls/1287
- the first PR in the stack creates a search index.
- the second one adds a search API handler.
- the last few do the UI.
these are all related. you are right that you can do this by breaking a change into commits, and effectively that is what i do with jujutsu. when i submit my commits to the UI, they form a PR stack. the commits are individually reviewable and updatable in this stacking model.
gh's model is inherently different in that they want you to create a new branch for every new change, which can be quite a nuisance.
have written more about the model here: https://blog.tangled.org/stacking/
> - the first PR in the stack creates a search index.
> - the second one adds a search API handler.
> - the last few do the UI.
So you're saying you're going to merge (and continuously integrate, perhaps to production) a dangling, unused search index, consuming resources with no code using it, just to make your review process easier?
It's very depressing that review UX is so abysmal that you have to merge features before they're done just to un-fuck it.
Why can't the change still be a big branch that is either all merged or not... and people can review it in chunks? Why do we require that the unit of integration equals the unit of review?
The perverse logic always goes something like this:
"This PR is too big, break it up into several"
Why?
"It's easier to review small, focused changes"
Why can't we do that in one PR?
"Because... well, you see GitHub's UI makes it really hard to ..."
And that ends up being the root-cause answer. I should be able to make a 10,000 line change in a single commit if I want, and reviewers should be able to view subsets of it however they want: A thread of discussion for the diffs within the `backend` folder. A thread of discussion for the diffs within the `frontend` folder, etc etc. Or at the very least I should be able to make a single branch with multiple commits based on topic (and under no obligation for any of them to even compile, let alone be merge-able) and it should feel natural to review each commit independently. None of this should require me to contort the change into allowing integration of partially completed work, just to allow the review UX to be manageable.
The canonical example here is a feature for a website that requires both backend and frontend work. The frontend depends on the backend, but the backend does not depend on the frontend. This means that the first commit is "independent" in the sense that it can land without the second, but the second is not, hence, a stack. The root of the stack can always be landed independently of what is on top of it, while the rest of the stack is dependent.
> If they’re not, why do you want to review them independently?
For this example, you may want review from both a backend engineer and a frontend engineer. That said, see this too though:
> that seems like something you can already do by just breaking a PR into commits and letting people look at one of those at a time.
If you do this in a PR, both get assigned to review the whole thing. Each person sees the code that they don't care about, because they're grouped together. Notifications go to all parties instead of the parties who care about each section. Both reviews can proceed independently in a stack, whereas they happen concurrently in a PR.
> If you approve change A and reject change B, then the feature is only half done.
It depends on what you mean by "the feature." Seen as one huge feature, then yes, it's true that it's not finished until both land. But seen as two separate but related features, it's fine to land the independent change before the dependent one: one feature is finished, but the other is not.
If the layers of a stack have a disjoint set of reviewers things are viewed in separation which might lead to issues if there is no one reviewing the full picture.
That is why your forge will show that these two things are related to each other, and you may have the same person assigned to review both. It can show you this particular change in the context of the rest of them. But not every reviewer will always want to see all of the full context at all times.
> If you do this in a PR, both get assigned to review the whole thing. Each person sees the code that they don't care about, because they're grouped together.
There are two separate issues you’re bringing up:
- Both groups being “assigned” the PR: fixable with code owners files. It’s more elegant than assigning diffs to people: groups of people have ownership over segments of the codebase and are responsible for approving changes to it. Solves the problem way better IMO.
- Both groups “seeing” all the changes: I already said GitHub lets you view single commits during PR review. That is already a solved problem.
And I didn’t even bring up the fact that you can just open a second PR for the frontend change that has the backend commit as the parent. Yes, the second PR is a superset of the first, but we’ve already established that (1) the second change isn’t orthogonal to the first one and can’t be merged independently anyway, and (2) reviewers can select only the commits that are in the frontend range. Generally you just mark the second PR as draft until the first one merges (or do what Gitlab does and mark it as “depends on” the first, which prevents it from merging until the first one is done.) The first PR being merged will instantly make the second PR’s diff collapse to just the unique changes once you rebase/merge in the latest main, too.
All of this is to explain how we can already do pretty much all of this. But in reality, it’s silly to have people review change B if change A hasn’t landed yet. A reviewer from A may completely throw the whole thing out and tell you to start over, or everything could otherwise go back to the drawing board. Making reviewers look at change B before this is done is a potentially huge waste of time. But then you may think reviewers from change B may opt to make the whole plan go back to the drawing board too, so what makes A so special? And the answer is that both are a bad approach: just make the whole thing in one PR, and discuss it holistically. Code owners files are for assigning ownership, and breaking things into separate commits is to help people look at a subset of the changes. (Or just, like, have them click on the folder in the source tree they care about. This is not a problem that needs a whole new code review paradigm.)
> fixable with code owners files.
Code owners automatically assigns reviewers. You still end up in the state where many groups are assigned to the same PR, rather than having independent reviews.
> I already said GitHub lets you view single commits during PR review.
Yes, you can look at them, but your review is still in the context of the full PR.
> And I didn’t even bring up the fact that you can just open a second PR for the frontend change that has the backend commit as the parent.
The feature being discussed here is making this a first-class feature of the platform, much nicer to use. The second PR is "stacked" on top of the first.
> You still end up in the state where many groups are assigned to the same PR
> Yes, you can look at them, but your review is still in the context of the full PR.
Why is this a bad thing? I don’t get it. This has literally never been a problem once in my career. Is the issue that people can’t possibly scroll past another discussion? Or… I seriously am racking my brain trying to imagine why it’s a bad thing to have more than one stakeholder in a discussion.
I can think of a lot of reasons why doing the opposite, and siloing off discussions, leads to disaster. That is something I’ve encountered constantly in my career. We start out running an idea past group A, they iterate, then once we reach a consensus we bring the conclusion to group B and they have concerns. But oh, group A already agreed to this so you need to get on board. So group B feels railroaded. Then more meetings are called and we finally bring all the stakeholders together to discuss, and suddenly hey, group A and B both only had a partial view of the big picture, and why didn’t we all discuss this together in the first place? That’s happened more times in my career than I can count. The number of times group B is mad that they have to move their finger to scroll past what group A is talking about? Exactly zero.
It's totally possible that you aren't the target audience for this sort of feature. It tends to be more useful in very large team and/or monorepo contexts.
This isn't about siloing discussions: it's about focus. You can always see the full stack if you want to go look at the other parts, the key is that you don't have to.
The goal is to get thoroughly reviewed changes. It's much easier to review five 100 line changes than one 500 line one, and it's easier to review five 500 line changes than it is a 2500 line one. Keeping commits small and tightly reviewed leads to better outcomes in the end. Massive PRs lead to rubber stamps of +1.
I agree that that scenario sounds like a nightmare. But I don't think that a PR is the right place to solve that problem: it sounds like something that should have been sorted before any of the code was written in the first place.
> It's much easier to review five 100 line changes than one 500 line one, and it's easier to review five 500 line changes than it is a 2500 line one.
This is true if the changes are orthogonal and are truly independent. One should always favor small independent changes if one can.
But when changes are all actually part of the same unit, and aren’t separable (apart from maybe the first of N of them which may be mergeable independently), proponents always seem to advocate that stacked diffs can somehow change this fact. “Oh if only we had stacked diffs we could break this into smaller changes”, ignoring the fact that no, they’d still be ordered and dependent on one another.
Stacked diffs seem like a UI convenience for reviewers… that’s fine I guess. GitHub is basically what you get when you ask the question “how can we make code review as tedious and unhelpful as possible”, and literally anything would be better than what we have (seriously I could fill a book with how bad GitHub is. I don’t think I could design a worse experience if I tried.) So, maybe I should just be happy they’re trying anything.
In stacked diffs systems, the idea is that the base of the stack (once reviewed) can always be merged independently, so you're totally right that like, if you just purely think you can split things up when they shouldn't be split up, that would be bad.
This is the model that the kernel uses, as well as tons of other projects (any Gerrit user, for example), and so it has gotten real-world use and at scale. That said, everyone is also entitled to their preferences :)
> This is the model that the kernel uses
Nah.
The kernel uses a mailing list, and a “review” means a mailing list thread. With some nice CLI tools to integrate with git when you want to actually apply the patch (or start a review thread.)
In that world, “[PATCH 2/5]” (or whatever) in the subject title, and a different CC list for each patch, is a nice way to be able to ensure different subsets of the patch series have different discussions. That’s great.
But if you’re going to compare this to a GitHub UI, you have to choose the basis for comparison, because the two are so utterly different. Choosing one aspect (can we make sure discussions are kept separate), and saying “therefore the kernel uses stacked diffs” is a huge misrepresentation of how different GitHub’s approach is.
Because the kernel approach is the platonic ideal of a code review: it’s a simple threaded discussion between stakeholders, centered around a topic (the patch, which is inlined right in the email.) I would wager close to zero kernel maintainers actually look at the diffs exclusively via their email client. They probably just check out the changes locally and look at them, and the purpose of the mailing list is to facilitate focused discussion on parts of the change (which is all we really want, in the end.)
GitHub has so thoroughly shit the bed on actually developing a good model of “threaded discussion about a change”, that you have to change the way you think about git’s model to fix how awful GitHub is at allowing review discussion to stay focused. You shouldn’t need to think about stacked diffs and multiple PR’s. You should use git branches as intended, multiple commits representing changes, and a merge meaning “this branch makes it or not.” That GitHub’s UI for discussing subsets of a change is so abysmal, does not mean the model is wrong. It means their discussion system is so abysmal that a mailing list TUI can run circles around it. Fixing this is GitHub’s problem, and doesn’t require any changes to how PR’s should be split up.
If you have a 2500-line PR with 5 500-line commits, GitHub should not require you to split things up further in any way, just to unfuck their discussion system.
Random idea I spent 10 seconds thinking about: let me start a “here’s a thread discussing the UI changes” and add folks to it, and “here’s a thread discussing the backend changes”, and add folks to that. I can then say “let’s not merge this until both threads are green”. You still see the whole change in the UI. (You can click directories to drill into the changes, that solves the “but the diff is too big” issue.) Discussion on a chunk of the diff is scoped to a discussion thread, which you select when sending the message. Thus, all discussion on any part of the diff is still scoped to a “discussion thread” of arbitrary subsets of stakeholders.
None of this needs me to change how I split up my git branches, an entire logical change is still either “merged” or “not-merged” (seriously who cares about the Pyrrhic victory of merging only change 1/N), and if we want to limit scopes of discussion to subsets of a change, we can just… do that.