Pull request limits are cutting down the noise

101 points by ingve 8 days ago

The primary spam problem isn't that a single account opens many pull requests on a single repo, but that spammer accounts open many pull requests spread across many repositories. So limiting accounts to a couple of open PRs on my repository won't help much.

I'd rather enforce a limit based on the number of PRs that account opened across all public repositories it doesn't have write access to within the last week. And PRs that were closed without getting merged should be held against the account somehow (perhaps via a "close as unwelcome" option for the maintainer).
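
A minimal sketch of that kind of account-wide limit, assuming a 7-day rolling window and counting only PRs to repositories the account cannot write to. The cap, the extra weight for a "close as unwelcome", and all names here are hypothetical illustrations, not anything GitHub implements.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

WINDOW = timedelta(days=7)   # "within the last week"
MAX_SCORE = 10.0             # hypothetical cap, not from the thread
UNWELCOME_WEIGHT = 3.0       # hypothetical cost of a "close as unwelcome"


@dataclass
class PullRequest:
    opened_at: datetime
    author_has_write_access: bool
    closed_as_unwelcome: bool = False


def pr_score(prs, now):
    """Weighted count of recent PRs to repos the account cannot write to."""
    score = 0.0
    for pr in prs:
        if pr.author_has_write_access:
            continue  # PRs to the account's own repos are not counted
        if now - pr.opened_at > WINDOW:
            continue  # older than the rolling window
        score += UNWELCOME_WEIGHT if pr.closed_as_unwelcome else 1.0
    return score


def may_open_pr(prs, now):
    """True while the account is still under the (assumed) cap."""
    return pr_score(prs, now) < MAX_SCORE
```

Because the score is summed over every public repository, an account that spreads PRs thinly across many projects hits the cap just as fast as one that floods a single repo.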

Ferret7446 - 38 minutes ago

Is it? What is the motivation, and why would such spammers use PRs and not any other types of spam like comments or bugs?
I think it's more likely a lot of these are well-intentioned individuals or people trying to build a resume. They'd want a lot of accepted PRs on one account, not lots of accepted PRs across a lot of burner accounts.
freedomben - 2 days ago

> And PRs that were closed without getting merged should be held against the account somehow
That strikes me as a bad solution. I've sent plenty of PRs over the last two decades that were things I wasn't sure if upstream wanted or not, but I did the work and wanted to offer it to them. If you get penalized for not having a PR merged, it's going to incentivize selfishness
- CodesInChaos - 2 days ago
  
  That's why I suggested an explicit "close as unwelcome" option (label to be bikeshed). And the impact of the rejection should decay over time.
  In any case, my proposal is a rough sketch of how I'd approach the problem, not a production-ready algorithm. But I'd expect even that basic approach to work a lot better than GitHub's approach.
  - Tade0 - a day ago
    
    Ultimately what kills any effort to curb this behaviour is the fact that the perpetrator can always open another account.
    If I was a maintainer of an open-source project, I would have a two-tier system:
    -PRs from previous contributors.
    -All others, sorted by lines of code, ascending.
    Reasoning:
    -Large PRs from someone without a track record are rare.
    -It's not a huge ask to have people first solve a smaller problem.
    -Small PRs are easy to verify - it's especially easy to tell if a given one-liner is impactful or just spam. Should also be easier to summarise it in the title.
    -Don't quote me on that but I think LLMs are still bad at clear, concise, meaningful changes.
  - danuker - 2 days ago
    
    Really it should be a "report as spam" option
- dleeftink - 2 days ago
  
  Hence the cooldown period? I think the mechanism proposed here should be perfectly fine for targeted PRs, while mitigating those that sit above baseline.
- RugnirViking - 2 days ago
  
  I assume the idea is that you probably weren't doing that 20+ times a day. For me, I was raising at most one or two open source PRs per week, and I only had time to focus on one or two repos during that time. I think that's a good baseline, with a big overhead for exceptional circumstances, but I don't see a world where someone should be able to make 10 PRs a day, every day, for weeks.
jdxcode - 2 days ago

Your proposal wouldn't help me at all. I wouldn't say that the problem I'm having is even "spam" per se. (For context I receive hundreds of PRs each week across my OSS projects like mise)
In my case I sometimes get a flurry of PRs from over-exuberant contributors, not necessarily low quality even! Using this I can at least put some back-pressure on that and help keep things more fair across my contributors.
- ares623 - 2 days ago
  
  why not both?
QuantumNoodle - 2 days ago

Good point. I've (even with agents) never made more than like 5 PRs in one day internal to a company, and if I have, they typically included accompanying proto or submodule changes. Heck, give a factor of safety of 2x and cap at 10 daily PRs per account for repos where you're "untrusted".
lenkite - 2 days ago

GitHub needs a "Human Trust" score where an "AI Slop" label applied to a PR reduces the score and denies all PRs for the next X hours, with the delay increasing exponentially for every PR labelled "AI Slop". Repeated "AI Slop" labels given by a maintainer reduce the "Human Trust" score like dropping off a cliff.
Every successful merge for a PR spread across N days slowly increases the "Human Trust" score. (So a slew of fake merged PRs cannot artificially increase the Human Trust score.) Just like the real world, Human Trust should be hard to gain and easy to lose.
If your Human Trust score becomes negative due to too much "AI Slop", then you are banned from all PR submissions for a quarter. Your profile picture is also replaced with the Robot Identicon to indicate to the world that your human brain has been replaced by AI and an urgent health check is needed.
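
A rough sketch of how such a score might behave. Every constant and name below is a hypothetical placeholder: the starting score, the penalty per label, the one-hour base delay standing in for "X hours", and the rule that only one merge per calendar day earns credit (one way to read "spread across N days").

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional

BASE_DELAY_HOURS = 1      # stands in for "X hours"; hypothetical value
SLOP_PENALTY = 5.0        # hypothetical: trust is easy to lose
MERGE_REWARD = 1.0        # hypothetical: trust is hard to gain
BAN = timedelta(days=90)  # "banned ... for a quarter"


@dataclass
class Account:
    trust: float = 10.0   # hypothetical starting score
    slop_labels: int = 0
    blocked_until: Optional[datetime] = None
    merge_days: set = field(default_factory=set)

    def record_merge(self, now):
        # At most one reward per calendar day, so a burst of merges
        # on a single day cannot inflate the score.
        day = now.date()
        if day not in self.merge_days:
            self.merge_days.add(day)
            self.trust += MERGE_REWARD

    def label_slop(self, now):
        self.slop_labels += 1
        self.trust -= SLOP_PENALTY
        # Delay doubles with every label: 1h, 2h, 4h, ...
        delay = timedelta(hours=BASE_DELAY_HOURS * 2 ** (self.slop_labels - 1))
        if self.trust < 0:
            delay = max(delay, BAN)  # negative score triggers the long ban
        until = now + delay
        if self.blocked_until is None or until > self.blocked_until:
            self.blocked_until = until

    def may_open_pr(self, now):
        return self.blocked_until is None or now >= self.blocked_until
```

With these placeholder numbers, two merges on the same day count once, and three "AI Slop" labels are enough to push a lightly established account below zero and into the quarter-long ban.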
- Terr_ - 2 days ago
  
  That could be abused the other way against well-intentioned humans. Simply let someone accumulate PRs at a not-abusive rate, and then once enough exist falsely report them all in a day.
  > banned from all PR submissions
  So then the person can't even make a PR against their own repository? Or against one where they're a maintainer, known contributor, or a member of an organization that might be their workplace?
  - lenkite - 2 days ago
    
    "AI slop" reporting also comes with a budget. You can only apply N "AI slop" labels to a user per week.
    Obviously the ban is only for PRs against repositories belonging to different users/organizations, not PRs toward the same user/org.
srdjanr - 2 days ago

At the end they mentioned they're exploring global trust signals

trjordan - 2 days ago

If you didn't take the time to write it, why should I take the time to read it?

This is a band-aid. Maybe even a good band-aid, because it'll keep individual contributors from flooding the zone. But the core problem is GitHub's model that assumes code is worth reading.

I'd much rather see the agent logs stapled to PRs. Make it easy to understand if there's a brain behind the suggested changes before engaging.

cameldrv - 2 days ago

> If you didn't take the time to write it, why should I take the time to read it?
This is the fundamental problem. You have to look at the equilibrium. When you submit a PR, you're asking for some of my time. I have to figure out if it's likely to be worth it for me. If you have a track record of producing useful software that I have merged before, you're putting your reputation at risk when you submit a new PR, so it's probably good. If you start sending AI slop, I'm going to downgrade your reputation.
If you have no track record though, I'll probably at least take a glance since even if I'm not sure, at least you had to spend some time to write the code and put together the PR. Now that's not true.
My guess is we're going to have to create some new systems for reputation, maybe bond posting, maybe "sponsored" PRs, where someone trusted vouches for it, etc.
Incidentally, this doesn't just apply to PRs. It's emails, all kinds of other messages, reports, etc.

frankfrank13 - 2 days ago

I think this is a really solid move. This gives OSS contributors a lot of flexibility. You could set the limit to 0, and manually add contributors. You could set it to 1-3 to allow people to get their foot in the door. But the de facto limit today is infinite, and that gets spammed. Imagine if Gmail did this! If I don't whitelist or reply within `n` emails, you're done. I would KILL for that.

SoftTalker - 2 days ago

I get a lot of emails that I want to get but never reply to. I would not want to have to remember to whitelist all of those.
- frankfrank13 - 2 days ago
  
  Yeah fair, then you could set it higher, even 100. Or default it off.
jasonpeacock - 2 days ago

HEY email service does something like that:
https://www.hey.com/features/the-screener/
IshKebab - 2 days ago

> Imagine if GMail did this!
It would be very annoying? Spammers can still spam one message, but now your friends can't email you twice. Awesome.
This is a barely-better-than-nothing blunt tool.

Unfunkyufo - 2 days ago

I don't often give GitHub credit, because I work with it every day and I encounter something frustrating or broken nearly every day ending in "day", but kudos to them for working on addressing some of the big problems.

I also like the other features mentioned in the blog post. It won't make a difference to me and my daily work, but I'm glad that they are taking the criticisms seriously.

Though I have to admit that I'm a bit conflicted about this. Part of me also wants more people to move off of GitHub to help break their monopoly on code on the web, but I also don't want the people making and maintaining open source to give up their projects due to burnout and slop spam.

SaucyWrong - 2 days ago

Somebodies are already making “loops” <facepalm> that will add the noise back. If a PR isn’t merged in <time>, close it and open a new one, either the same one or a new one of higher priority.

If <time> is set low enough, the noise still exists.

mbaloch2136 - 8 days ago

[dead]

righthand - 2 days ago

[flagged]

MeetingsBrowser - 2 days ago

how so?
- righthand - 2 days ago
  
  How else would it appear as a popular platform without bots everywhere?
  If there’s even any minor truth to dead internet theory then it extends to GitHub most certainly.

arjie - 2 days ago

[flagged]

csiegert - 2 days ago

There is also the solution of: No merge requests, just feature wishes and bug reports. All code is written solely by the maintainers (with the help of LLMs).
- arjie - 2 days ago
  
  I believe that SQLite is like this. All code is internally written. Yep, also very reasonable. Amounts to how much external input you want, I suppose.
- parliament32 - 2 days ago
  
  Add a mechanism to donate tokens towards the maintainers' LLMs for a particular ticket and this whole class of problems will be resolved all at once.
  - wereHamster - 2 days ago
    
    > Add a mechanism to donate tokens
    Or donate money. Crazy idea, eh?
    
    toomuchtodo - 2 days ago
    
    Some people have tokens but no money. Tokens, like Amazon gift cards and Tide detergent [1], are a form of currency in a way. If people have a currency equivalent they want to spend for your benefit, or the collective benefit, it makes sense (depending on level of effort) to enable them to do so.
    [1] How Tide Detergent Became a Drug Currency - https://news.ycombinator.com/item?id=5023204 - January 2013 (124 comments)
    (edit: maybe put AI tokens on stablecoin rails as value tokens? could be fun, could move them around instantly between participants on the value rails and could consume them programmatically, if someone implements this idea, buy me a beer!)
    
    nozzlegear - 2 days ago
    
    > Some people have tokens but no money.
    This sounds like a piece of worldbuilding from a Daniel Suarez novel. Who has tokens but no money?
    
    da_grift_shift - 2 days ago
    
    Some AI folks want to bifurcate the meaning of "token" towards a spendable store of value, rather than characters processed by a tokenizer.
    "Donate tokens". "Gift tokens". Semantic drift?
    
    toomuchtodo - 2 days ago
    
    People who were issued AI token credits.
  - SoftTalker - 2 days ago
    
    And creates a new class of problems. Why not just fork the project and modify it yourself at that point, and cut out the maintainer middleman.
    
    arjie - 2 days ago
    
    I actually do that quite often these days. Keeping synced with upstream is trivial these days with a modern agent. Even just pi with DeepSeek V4 Flash can do it. It's a huge free-rider issue, but there's no way for me to contribute even human changes upstream because I'll be lost in the AI contributions so I don't bother.
    So almost everything is forked and I then just have the agent keep my changes in sync with upstream. Works like a charm. I suspect my pattern is commonplace.
    
    freedomben - 2 days ago
    
    Yep, same here. I hate it a lot, but it's the new reality. It's easier/better for me to just fork, change whatever the hell I want, and push it to my fork. If I become aware that upstream wants it I'm happy to put in the work to get a clean merge, but I'm not wasting any more time pushing things upstream without some indicator that my time is valued by them. Been burned too many times now. It wasn't this way pre-AI, but AI peed in the pool and there isn't a good way to clean it yet.
    
    skydhash - 2 days ago
    
    > If I become aware that upstream wants it I'm happy to put in the work to get a clean merge, but I'm not wasting any more time pushing things upstream without some indicator that my time is valued by them. Been burned too many times now.
    Do you realize that all the major package systems on BSD and Linux work that way? You take upstream, patch it to get it to compile on the system, and then build a package. That is what open source is about. It's not about building a community and whatnot.
    
    parliament32 - 2 days ago
    
    Why fork at all? Why not just vendor the dependency and slop the changes you want on top of it? You can even pull from upstream down the line for the latest updates.
    The problem is sloppers really, really want other people to use their code, so they feel useful for doing a bit of prompting, probably to rationalize how much they pay Anthropic et al to do the actual work for them. I just wish they'd direct that money directly to the projects they find useful instead of trying to insert themselves as middlemen.
    
    geon - 2 days ago
    
    Because it is vibecoded garbage.
  - esafak - 2 days ago
    
    That's the same as donating money, which you can already do.
    
    parliament32 - 2 days ago
    
    Well, yes, exactly. And yet nobody but the biggest corp-sponsored projects get anything more than negligible donations. So what does this tell us? These "contributors" are happy to throw money at open source projects as long as they think they're doing something by prompting the LLM?
qazxcvbnmlp - 2 days ago

Being able to submit an issue, description, test criteria along with a token budget would be pretty cool.
- arjie - 2 days ago
  
  There's an amusing attempt at this here: https://clevercrow.io/
  It was on HN here: https://news.ycombinator.com/item?id=48621645
ramraj07 - 2 days ago

That's just a coding agent that "people" use via you, with extra steps.
cyanydeez - 2 days ago

also, close all issues and open them as you plan to work on them.

esafak - 2 days ago

We should have agents to triage PRs. Their "smarter bypass signals" is already implemented by Mitchell Hashimoto's Vouch system: https://github.com/mitchellh/vouch
