Modifying other people's software

natkr.com

89 points by todsacerdoti 7 days ago


skydhash - 3 days ago

Maybe I don't understand what TFA is describing, but from what I know a patch is usually tied to a specific commit, so a very specific point in time in the upstream's lifetime. It does not make sense to have it lingering longer than that. Even when you want to maintain a set of patches (package building, etc.) you usually revise it for every new version of the software. In this case, the intent is much more important than the how (which quickly becomes history).

noirscape - 3 days ago

I had issues with similar things for a couple of years too. The reality is that there's remarkably little existing advice for maintaining a soft fork that doesn't intend to upstream patches. (For reference, probably the most notable patch fork that can't/doesn't upstream anything, GNU IceCat, uses a bash file from hell to apply all of its changes to the Firefox source code. It is not a scalable solution.)

Ultimately the solution I ended up using was git rebase; it just works the nicest out of all of them:

* Your patches are always kept on top at the git log.

* It's absolutely trivial to drop an unnecessary patch, add a new one in the chain or to merge two patches that do the same thing. (Use git rebase -i for that.) Fixing typos in patches is trivial too.

* Your history isn't so important for a patch fork; the patches are what matters, so don't fret too much about commit hashes changing. I promise you, it'll be fine.

* Git will complain if you try to do a rebase that doesn't work out of the box, and you resolve it with similar tools to those for merge conflicts. You can fetch from another upstream and rebase in one step with git pull --rebase upstream master. This does assume you've added the upstream as a second remote in git under the name upstream and that they push the code you want to patch onto the master branch.

As for drawbacks, I only wound up with two:

* CI tools and git server UIs often aren't prepared to handle a heavily rebased master branch - it leads to lots of builds that are linked to dangling commit hashes. GitHub also for some reason insists on displaying the date of the last rebase, rather than the date of when the patch was committed. Not sure why.

* Pushing your fork means heavy use of force pushes, which feels instinctively wrong.

Neither drawback is large enough for me to mind it in practice.

I opted to use rebase for this sort of fork after reading a bit about non-merge git flows and wondering what'd happen if I did a rebase-based workflow but just... never sent any patches. Turns out it works really well.
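For anyone who wants to script it, here is a minimal Python sketch of one "sync with upstream" round. It assumes the second remote is called upstream, the patched branch is master and is currently checked out, and your own fork is the origin remote. The helper names are made up for illustration. It uses git fetch followed by git rebase instead of git pull --rebase, and --force-with-lease as a slightly more careful stand-in for a plain force push.

```python
import subprocess


def rebase_commands(upstream="upstream", branch="master", fork="origin"):
    """Return the git commands for one "sync with upstream" round.

    The remote and branch names are assumptions; adjust them to your setup.
    The patched branch is expected to be the one currently checked out.
    """
    return [
        # Download the new upstream commits without touching local branches.
        ["git", "fetch", upstream],
        # Replay the local patches on top of the fetched upstream branch.
        ["git", "rebase", f"{upstream}/{branch}"],
        # Rebasing rewrites history, so publishing it needs a force push.
        # --force-with-lease refuses to overwrite remote commits that the
        # local repository has not seen yet.
        ["git", "push", "--force-with-lease", fork, branch],
    ]


def run_all(commands, cwd="."):
    """Run the commands in order; stop at the first failure (e.g. a conflict)."""
    for argv in commands:
        subprocess.run(argv, cwd=cwd, check=True)


# Dry run: print the commands instead of executing them.
for argv in rebase_commands():
    print(" ".join(argv))
```

Calling run_all(rebase_commands()) inside the fork's checkout would actually execute the three steps. If the rebase hits a conflict, the script stops there and you resolve it by hand as usual.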

praptak - 3 days ago

You may want to have a look at Quilt. It doesn't solve the problem the author described but may help you once you accept there is no easy solution in sight.

Quilt is automation for the "bag of patches" model. I used it once when I needed to upgrade the internal bag of patches at $big_corp so as to apply them to a newer version of $public_app. It was predictably complex but somehow still manageable.

If you squint a bit then the [bag of patches] + [automated application in order] is a poor man's Git. If you keep this in a git repo then you're basically versioning repos (poor man's ones) in a repo. It almost sounds like the solution to the author's problem :)
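To make the "bag of patches + automated application in order" idea concrete, here is a rough Python sketch. It is not how Quilt is implemented. It only assumes an ordered list file, loosely modelled on a series file, with one patch name per line and # comments. The patch names and helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def read_series(text):
    """Return patch file names in application order.

    Assumes one patch per line; blank lines and '#' comments are skipped.
    """
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            names.append(line.split()[0])  # ignore any trailing options
    return names


def apply_in_order(source_dir, patch_dir, names):
    """Apply each patch to source_dir with patch(1); stop on the first failure."""
    for name in names:
        patch_file = Path(patch_dir, name).resolve()
        subprocess.run(
            ["patch", "-p1", "-i", str(patch_file)],
            cwd=source_dir,
            check=True,
        )


# Hypothetical list of patches carried on top of upstream.
example = """\
# local changes carried on top of upstream
fix-build.patch
branding.patch   # rename the product

disable-telemetry.patch
"""
print(read_series(example))
```

The order of the list is the whole point: a later patch may only apply cleanly once the earlier ones are in place, which is the same property a chain of commits gives you.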

userbinator - 3 days ago

Many times I've just patched the binary even if source is available, because trying to reproduce the binary you currently have, with only the changes you want and everything else the same, can be an even more difficult exercise than simply changing a string or constant.
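As a rough illustration of "simply changing a string", here is a small Python sketch. The bytes are a made-up stand-in for a real binary and patch_bytes is a hypothetical helper. It assumes the replacement must not be longer than the original, so that nothing after it moves, and that NUL padding is acceptable, as it is for a C string.

```python
def patch_bytes(data, old, new, pad=b"\x00"):
    """Overwrite the single occurrence of `old` in `data` with `new`.

    The replacement is padded to the length of `old` so that nothing
    after it shifts; a longer replacement is rejected.
    """
    if len(new) > len(old):
        raise ValueError("replacement is longer than the original bytes")
    if data.count(old) != 1:
        raise ValueError("expected exactly one occurrence of the original bytes")
    offset = data.find(old)
    padded = new + pad * (len(old) - len(new))
    return data[:offset] + padded + data[offset + len(old):]


# Stand-in for the contents of a real binary (made-up bytes).
image = b"\x7fELF...Hello, world\x00...rest"
patched = patch_bytes(image, b"Hello, world", b"Hi, world")
print(len(patched) == len(image))
```

On a real file you would read the bytes with Path.read_bytes(), patch them, and write the result to a copy rather than over the original.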

cyberax - 3 days ago

This is supercool. One of my constant problems with self-hosting is that I often need to modify just a couple of files here and there, but then I'm stuck with a forked repo or a dirty working copy.

I'm going to try to make a frontend UI for it.

thwarted - 3 days ago

The process described reminded me of "pristine source" and RPM spec files that take the upstream pristine source and patch it during the build process. Maintaining that is always a little bit of a headache if you don't do it regularly, especially having to maintain (generate and apply) a separate set of patch files for the changes and express/apply the patches in the spec file. This looks to make light work of that.

anilakar - 3 days ago

I once wrote a small C++ wrapper for POSIX dlfcn.h. Someone sent a pull request that would have turned it into a Windows-only library.

ngcc_hk - 3 days ago

The title is very general. There are at least two kinds of modification: one minimizes the change and only alters behaviour, and the other really changes the program.

I worked for a decade in mainframe technical support, mostly installing fixes. Because of that, when I recently spent 3 months on a hobby project changing the turbo bridge to take an external bridge card, I injected code, hooking into it the way a JES2 exit does, and modified the host program's behaviour without touching much of the host program itself.

This is very different from my colleagues who are application programmers, who can totally change a CICS module, even changing the DB2 schema.

Which kind of modification is meant in this title … I wonder.

datadrivenangel - 3 days ago

Modifying source code like this is one method. For web software, bookmarklets are another great way to do that.

attila-lendvai - 3 days ago

whenever i rebase longstanding commits in my fork, i keep the previous branch by appending the date to its name.

reading the readme didn't make it clear to me how this app would make my life any easier (also considering the added complexity of a new tool).

vlovich123 - 3 days ago

Honestly, I found it a better strategy to name branches after the fork point and the date you started the fork. So you’d have main-2025-03-07 for a fork of main started 03-07 and another, main-2025-05-08, for a rebase. The patch set above that is just what you carry. I’m not sure maintaining them as literal patches is that helpful vs just keeping them as explicit commits in git. Maybe this is the right strategy once your fork gets complicated, but at that point you should be hard forking rather than soft forking IMO.
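A tiny Python sketch of that naming scheme, in case someone wants to automate it. The helper names are made up, and it assumes the branch is created with plain git branch at whatever start point you pass in.

```python
from datetime import date


def dated_branch(base, day):
    """Build a name such as 'main-2025-03-07' from a base branch and a date."""
    return f"{base}-{day.isoformat()}"


def snapshot_command(base, day, start_point="HEAD"):
    """Return the git command that creates the dated branch at start_point."""
    return ["git", "branch", dated_branch(base, day), start_point]


print(dated_branch("main", date(2025, 3, 7)))
print(" ".join(snapshot_command("main", date(2025, 5, 8))))
```

Running the returned command before each rebase keeps the previous state of the fork reachable under its own name, so a bad rebase can always be compared against or rolled back to the older branch.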