After outages, Amazon to make senior engineers sign off on AI-assisted changes
arstechnica.com · 564 points by ndr42 20 hours ago
https://www.ft.com/content/7cab4ec7-4712-4137-b602-119a44f77... (https://archive.ph/wXvF3)
https://twitter.com/lukolejnik/status/2031257644724342957 (https://xcancel.com/lukolejnik/status/2031257644724342957)
This "mandatory meeting" is just the usual weekly company-wide meeting where recent operational issues are discussed. There was a big operational issue last week, so of course this week will have more attendance and discussion. This meeting happens literally every week, and has for years. Feels like the media is making a mountain out of a molehill here.

The article claims:

> He asked staff to attend the meeting, which is normally optional.

Is that false? It also discusses a new policy:

> Junior and mid-level engineers will now require more senior engineers to sign off any AI-assisted changes, Treadwell added.

Is that inaccurate? It is good context that this is a regularly scheduled meeting. But regularly scheduled meetings can have newsworthy things happen at them.

When an SVP asks you to do something in a mass email, it's very much optional. Dave Treadwell is an SVP; his org is likely in the tens of thousands, and there is no way to even have a mandatory meeting for that many people. My SVP asks me to do things all the time, indirectly. I do probably 5% of them.

> org is likely in the tens of thousands, there is no way to even have a mandatory meeting for that many people.

OK, this is pretty off-topic, but is this still true? I get that you can't have 10K people all actively participate in the meeting at the same time, but doesn't Zoom have a feature where you can broadcast to thousands and thousands? Doesn't X/Twitter have a feature like this? (Although, to be fair, the last time I heard about that it was part of a headline like "DeSantis announcement of Presidential run on X/Twitter delayed for hours as X/Twitter's tech stack collapses under 200K viewers.") But still, nowadays it seems like it should be possible to have 10K employees all tune in at the same time and call it a meeting, yes?
Yes, but at that point it's an all-hands presentation, and you are basically doing a very careful presentation, thinking about every minute, because of how many hours the "meeting" is costing you. Very different from the typical weekly/monthly outage meeting, where discussion is actually expected, instead of being a ritual.

> but doesn't Zoom have a feature where you can broadcast to thousands and thousands?

They have webinar/event support for 5,000+ participants; viewers can raise hands and use chat feedback for questions, and the meeting host can invite people to be visible.

The meeting isn't the hard part; after all, shareholder meetings have huge audiences too. Enforcing mandatory attendance for tens of thousands of employees is the hard part, so it's more likely mandatory in name only.

With tens of thousands in a meeting, cracking a 30-second stupid joke probably costs several thousand dollars.

Right, but if you say something essential in a meeting with 10 people and it has to percolate through five levels of management to reach the front lines and gets watered down, much more could be lost, even millions. Scale cuts both ways. What matters isn't how big the meeting is; it's how important the material is, and how well it is presented.

I don't think I've ever heard a top leader say anything essential in such a meeting. The stuff they work on is not related to my job at all. It's all Gartner-level strategy stuff. In our company they do take time talking about it in large calls, but it's always boring and never relevant, with a lot of political spin you have to poke through to see the real message. If I ever attend, I just put it on mute and look at the slides while I do some real work. That way my attendance gets registered and it doesn't stress me out later with too much stuff left hanging. That percolation is also translation of what they say into things that are relevant at my level.
Like what we will be working on next year, and whether there will be bonuses or job losses. I couldn't give a crap about the company's strategy as a whole, and that's not my job anyway. Why should I? I'm not here because I believe in some holy mission. I just wanna do something I like and get paid.

Most of those meetings are pretty damn fluffy. No one goes back to their desk and does anything different because they've introduced new company values and the acronym is S.M.I.L.E. But this meeting is a course correction for how they're using AI, which is a huge initiative. He'll be trying to sell the right balance of "keep using the technology, but don't fuck anything up." Too cautious, and everyone freezes and there's a slowdown[0]. Too soft, and everyone thinks it's "another empty warning not to fuck up," and they go right back to fucking everything up, because the real message was "don't you dare slow down." After the talk, people will have conversations about "what did they really mean?"

[0] If you hate AI, feel free to flip the direction of the effect.

Well, this is the main problem with AI right now, isn't it? How to use it successfully without having it fuck up. How are they expecting some juniors to do this when the industry as a whole doesn't know where to begin yet? Like that Meta AI expert who wiped her whole mailbox with openclaw. These are the people who should come up with the answers. PS: I mostly hate AI, but I do see some potential. Right now it feels like we're entering a fireworks bunker looking for a pot of gold, with only a box of matches for illumination.

What we need to know from management is exactly what you mention. Do we go all out and accept that shit will hit the fan once in a while (the old "move fast and break things"), or do we micromanage and basically work manually like before? And that they accept the risk either way. That kind of strategy is really business-leader work. Blaming it on your techs when it inevitably goes wrong is not.
Because the tech as it is right now is very non-deterministic. One day it works magic and the next day it blows up. And yes, that SMILE thing was a good example. Been in too many of those time-wasters.

Unless that 30-second stupid joke is what gets the audience to take your request seriously. Sometimes people will help you when you don't come across like a self-interested corporate tool.

I have never in my long life heard a joke from upper management during a meeting/presentation that wasn't awkward and cringe. Just get to the point: tell us how many people are getting fired, so the people who aren't fired can get back to work, and you can go back to running this company into the ground. Sorry, I got flashbacks...

If you assume everyone is making $100k, it only takes 20 people in a meeting for it to cost $1k an hour.

Wasn't it Shopify that had a system for tracking how much each meeting cost based on attendees? I may be misremembering the company, though.

I was thinking about this in recent weeks and I think I've actually changed my mind on it. It's not really possible to measure how much it would cost to not have a meeting, and I think it's pretty obvious that if there were no meetings ever, it would hurt a company a lot.

Yeah, I agree it's a silly metric. But it's also a good reminder that meetings do have a cost associated with them, so they should stay short, focused, and held only when necessary. "This could have been an e-mail" should never need to be said.

Is that because you delegate or descope? Why is an SVP doing this if it's just gonna be ignored? Are you saying SVPs' words are not important and should be ignored? This is not what I remember from back in the day when Bezos sent his email with a question mark (or maybe an exclamation mark).

That's not really what the headline attempts to communicate, though. It specifically emphasizes "mandatory" and "AI breaking things."
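As an aside on the meeting-cost arithmetic above ($100k salaries, 20 people ≈ $1k an hour): a minimal sketch of the attendee-based cost tracker described, assuming roughly 2,000 working hours per year. All figures are illustrative.

```python
# Rough salary cost of a meeting, as in the $100k-salary example above.
# Assumes ~2,000 working hours per year; all inputs are illustrative.

def meeting_cost(attendees: int, avg_salary: float, hours: float = 1.0,
                 working_hours_per_year: float = 2000.0) -> float:
    """Return the approximate salary cost of a meeting in dollars."""
    hourly_rate = avg_salary / working_hours_per_year
    return attendees * hourly_rate * hours

# 20 people at $100k for one hour, matching the comment above:
print(meeting_cost(20, 100_000))        # 1000.0
# A 10,000-person all-hands at the same average salary:
print(meeting_cost(10_000, 100_000))    # 500000.0
```

The metric is crude (it ignores opportunity cost and the cost of *not* meeting), which is exactly the objection raised in the thread.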
Nobody was going to click on "Regularly scheduled Amazon staff meeting will include discussion of operational improvement."

> He asked staff to attend the meeting, which is normally optional.

If I get a note from my boss like that, I consider it mandatory.

Yeah, I don't understand why people are pretending not to understand this:

> He asked staff to attend the meeting, which is normally optional.

clearly means that while normally the meeting would be optional, this time it's not.

But it gets less mandatory the more layers up you go. If I get an email from an SVP CC'd to the entire division saying everyone should go to a meeting, I will almost certainly be able to ascertain the contents of that meeting in 10 seconds from someone else who did attend.

Surely your boss notices your non-attendance. If it's actually really mandatory, my manager will probably also relay that directly to me. And that resets the count for "less mandatory the more layers up you go."

Starting to wonder if some people who complain about all-day meetings just don't realize they are optional.

The day is not far off when my agents will attend meetings, share my opinions, and collect a summary for me. If everyone does the same, agents run the meetings and share summaries with their parent humans. Each of us has LLMs/agents with our contextual data. It is another level of multitasking.

Then I spin up another agent to listen to the agent who went to the meeting and make any necessary adjustments to the output of my coding agents based on the new rules it heard about from the meeting agent.

>> He asked staff to attend the meeting, which is normally optional.
> Is that false?

Judging from the comment above, no: the meeting happens every week, and this week they were asked to attend. It's not false. But it's also weaselly worded. Note that the article doesn't say that he told staff they have to attend the meeting. It says he "asked" staff to attend the meeting. Which, again, it's really normal for there to be an encouragement of "hey, since we just had an operational event, it would be good to prioritize attending this meeting where we discuss how to avoid operational events."

As for the second quote: senior engineers have always been required to sign off on changes from junior engineers. There's nothing new there. And there is nothing specific to AI that was announced. This entire meeting and message is basically just saying, "hey, we've been getting a little sloppy at following our operational best practices; this is a reminder to be less sloppy." It's a massive nothingburger.

> It says he "asked" staff to attend the meeting

Being "asked" by your boss to attend an optional meeting is pretty close to being required; it's just got a little anti-friction coating on it.

That really isn't the culture at Amazon. There are all-team meetings that happen all the time, and every now and then there is a reminder that "hey, we're gonna be talking about an interesting topic, so you might want to join," but it is certainly not a mandate or expectation that everyone will join. Different companies have different cultures. Weird that people can't grok this.

"If you could just go ahead and attend that meeting, that would be greaaaaaaat..." "Did ya get the memo... about that meeting? I'll just have my secretary forward you another copy of that memo, OK? Yeaaaaaaah..."

Your characterization of the event as a simple reminder to follow established best practices is directly contradicted by the briefing note of the meeting, which specifically mentions a lack of best practices related to AI.
Which makes me skeptical of your assessment of the situation in general.

> Under "contributing factors" the note included "novel GenAI usage for which best practices and safeguards are not yet fully established".

> senior engineers have always been required to sign off on changes from junior engineers.

Definitely a team-by-team question. If it were required, it would be a crux rule that the code review isn't approved without an L6 approver.

It's part of the change-management process that all code is reviewed. This is required by several different compliance agreements. What's probably happened is that poor peer reviews from other junior engineers got missed.

That's a lot of code reviews to send upstream.

It didn't seem to make the news, but at least in NYC the entire Amazon storefront was broken all afternoon on Friday. Items weren't displaying prices, and it was impossible to add anything to your cart. It lasted from about 2pm to 5pm. It's especially strange because if a computer glitch had brought down a large retail competitor like Walmart, I probably would have seen something, even though their sales volume is lower.

Over the weekend I was trying to return a pair of shoes and get a different size, and I kept getting 500s trying to go to the store page for the shoes.

Funny, I was automatically refunded for a pair of shoes that Amazon thought I never received, even though I'm wearing them right now. I couldn't even find a way to dispute the refund, so I just took the win... That explains why it kept changing the estimated delivery date. It was doing weird things.

Sometimes you squeeze clay and it comes out the oddest places. There were other stressors last week. https://www.pcmag.com/news/amazon-cloud-services-disrupted-i...

A little birdie told me someone pushed duplicate data into one of Amazon's core NoSQL systems that runs most of e-commerce. The front end of the site broke in weird ways, but it certainly wasn't taking orders.
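The "no approval without an L6 approver" rule mentioned above can be sketched as a simple merge gate. This is illustrative only: the level numbers follow the comment, but the function and its policy details are hypothetical, not Amazon's internal review tooling.

```python
# Illustrative sketch of a "senior sign-off" merge gate for AI-assisted
# changes. The L6 threshold is from the thread; everything else is made up.

SENIOR_LEVEL = 6  # hypothetical: an L6 engineer or above counts as senior

def may_merge(author_level: int, approver_levels: list[int],
              ai_assisted: bool) -> bool:
    """AI-assisted changes from junior/mid-level authors need a senior
    approver; any other change just needs at least one approval."""
    if not approver_levels:
        return False
    if ai_assisted and author_level < SENIOR_LEVEL:
        return any(level >= SENIOR_LEVEL for level in approver_levels)
    return True

# An L4's AI-assisted change with only an L5 approval is blocked...
print(may_merge(4, [5], ai_assisted=True))       # False
# ...but passes once an L6 signs off.
print(may_merge(4, [5, 6], ai_assisted=True))    # True
```

The public analogue would be a branch-protection rule requiring review from a designated group before merge; the point of the comment is that no such hard rule existed, only process convention.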
It's always sobering to see a news story about something you have an insider perspective on. I am not in that specific meeting, but it made me chuckle that a weekly ops meeting would somehow get media attention. It's been an Amazon thing forever. Wait until the public learns about CoEs!

A weekly ops meeting where they talk about ensuring PRs with AI contributions get extra scrutiny? I think that's significant news.

Exactly. This is real-world pushback on the "software is solved" narrative from AI labs. Also, most orgs try to copy Amazon for some reason, more than other big tech firms. "At our org, we disagree and commit"... yeah, you made that one up yourself. Anyway, this is going to have a lot of impact in my view.

There was nothing mentioned in the meeting or messaging about PRs with AI contributions. There are no extra requirements for review or scrutiny of AI-generated code. The media reports have been excessively misleading about this. It's not extra scrutiny. Doing code reviews for every commit is a standard practice at Amazon and has been for a decade plus.

I'd expect CoEs to be coming up with AI code action items, though, not more thorough human checks.

There's an explicit tension: SWEs would love that as a "get out of jail free" card, but their management chain is being evaluated by ajassy on AI/ML adoption. Admitting AI code as the root cause of a CoE is gonna look really bad unless/until your peers are also copping to it.

I think it's question 2 or 3 in a why chain, but 4 and 5 need to be why the agent screwed up, and there need to be action items around giving the AI better guardrails, context, or tooling. "Get a person to look at it" is a cop-out action item, and best intentions only: nothing you could actually apply to make development better across the whole company.

> Feels like the media is making a mountain out of a molehill here.

That's been their job ever since cable news was invented.

It's been a bit longer than that.
https://en.wikipedia.org/wiki/Yellow_journalism

It probably goes back as long as they have been shouting news in the town square in Rome, or before that even. Word around the campfire is that telling stories and exaggerating them to get people's attention is as old as humanity. But good journalism is still something else.

This reply chain is confusing, but I'm guessing it got merged from another thread that had a different title? Must have, as the comments are hours older than the OP.

> This meeting happens literally every week, and has for years. Feels like the media is making a mountain out of a molehill here.

Are you completely missing the point of the submission? It's not about "Amazon has a mandatory weekly meeting" but about the contents of that specific meeting: AI-assisted tooling leading to "trends of incidents," having a "large blast radius," and "best practices and safeguards are not yet fully established." No one cares how often the meeting is held in general, or whether it's mandatory or not.

> Are you completely missing the point of the submission

No, and that's what people are noting: the headline deliberately tries to blow this up into a big deal. When did you last see an HN post about Amazon's mandatory meeting to discuss a human-caused outage, or a post-mortem? It's not because they don't happen...

Amazon has had a really bad string of outages recently. Assuming they're internally treating this as business as usual in post-mortems, then perhaps the newsworthy thing is actually that they aren't taking their outages seriously enough.

> the headline deliberately tries to blow this up into a big deal

I do not understand how "company that runs half the internet has had major recent outages and now explicitly names lax or non-existent LLM usage guidelines as a major reason" can possibly not be a big deal in the midst of an industry-wide hype wave over how the world's biggest companies now run agent teams shipping 150 pull requests an hour.
The chain of events is "AWS has been having a pretty awful time as far as outages go," and now "the result of an operational meeting is that the company will cut down on the use of autonomous AI." You don't need CoT-level reasoning to come to the natural conclusion here.

If we could, as a species, collectively, stop measuring the relevance of a piece of news by how much we like hearing it, please?

The defensiveness is almost as interesting as the meeting itself. Way too many people have tied their egos to the success of AI.

And too many people have their egos tied to its failure, too.

I'm a massive AI skeptic. If anyone were to be jumping up and down on the corpse of AI and this incessant drive to use it everywhere, it'd be me. But I also work at Amazon. I got the email. I attended the meeting. I can personally attest that there are no new requirements for AI-generated code. The articles about this meeting are extremely misleading, if not outright wrong. But instead of believing the person who was actually in the room, this thread is full of people dismissing my first-hand account of the situation because it doesn't align with the "haha, AI failed" viewpoint.

Not just their egos, but their paychecks. This place is either going to get very quiet or really weird when the hype train derails and the AI bubble bursts.

The subject of the media coverage is not AWS; it is a peer organization to AWS that runs on significant amounts of non-AWS infrastructure. They are both part of an umbrella called Amazon but are not at all the same thing. Maybe your CoT-level reasoning isn't so robust.

It's hard to take this objection seriously. The publication is literally called the Financial Times. It's not exactly crazy for them to think that their readers might care about the entity that shows up in the stock ticker rather than how the company happens to divide things up internally.
Even if it weren't a finance publication, I have trouble imagining you making this argument about a headline like "Google deals with outages in the cloud" on the grounds that it's misleading to refer to it as anything other than GCP. I think you're fundamentally not understanding how people communicate about this sort of thing if you actually think that someone saying "Amazon" is misleading in any meaningful way.

The message and meeting being discussed here have nothing to do with AWS or any outages AWS has faced recently.

I think you're missing the point of the discussion. I don't blame you, because this is just bad reporting (and potentially intentionally malicious, to make you think it's about AWS). But the meeting and discussion were with the Amazon retail teams, about Amazon retail processes and Amazon retail services. The teams and processes that handle this are entirely separate from any AWS outages you are thinking of. The outages that Amazon retail has faced also have nothing to do with AI, and there was no "explicit call out" about AI causing anything.

This is correct. We ran them on Wednesdays in Alexa. Jassy actually used to come and sit in ours once a quarter or so when he was running AWS.

The core message of the article is that Amazon has been having issues with AI slop causing operational reliability concerns, and that seems to be 100% accurate.

What has really happened is that those employees were made into "reverse centaurs": https://www.theguardian.com/us-news/ng-interactive/2026/jan/...

Who is the media you're accusing here? This is a Twitter post. As far as I can tell, the poster does not work at a media company. What is worth pointing out is how quickly people blame "the media" for how people use, consume, and spread information on social networks.

The source is not a Twitter post; it's a Financial Times article (which the poster failed to cite).
I believe it varies by group. AWS started the weekly operations meeting; effectively, every service's on-call from the last week had to attend. Then it grew massive, so they made it optional. Alexa had a similar meeting that tried to replicate what AWS did. A lot of time was spent reviewing load tests getting ready for the holiday season, Prime Day, and the Super Bowl (Super Bowl ads used to cause crazy TPS spikes for Alexa). And a lot of finger-pointing if there was an outage from one team. While it probably did help raise the operational bar, so much engineer time was wasted on busywork/paperwork documenting an error or fix instead of improving the actual service.

> Junior and mid-level engineers can no longer push AI-assisted code without a senior signing off

Review by a senior is one of the biggest "silver bullet" illusions managers suffer from. For a person (senior or otherwise) to examine code or configuration with the granularity required to verify that it even approximates the result of their own level of experience, even only in terms of security/stability/correctness, requires an amount of time approaching what they'd spend if they had just done it themselves. I.e., senior review is valuable, but it does not make bad code good. This is one major facet of probably the single biggest problem of the last couple of decades in system management: the misunderstanding by management that making something idiot-proof means you can now hire idiots (not intended as an insult, just using the terminology of the phrase "idiot-proof").

When I was really early in my career, a mentor told me that code review is not about catching bugs but spreading context (i.e., increasing the bus factor). Catching bugs is a side effect, but unless you have a lot of people review each pull request, it's basically just gambling. The more expensive and less sexy option is to actually make testing easier (both programmatically and manually), write more tests at more levels, and spend time reducing code complexity.
The problem, I think, is that people don't get promoted for preventing issues.

This depends on the industry. I work on industrial machine-control software, and we spend a huge amount of time on tests. We have to for some parts (human-safety critical), but other parts would just be expensive if they failed (loss of income for customers, and possibly damaged equipment). The key to making this scalable is to make as few parts as possible critical, and to make the potential bad outcomes as benign as possible. (This lets you go to a lower rating in whatever safety standard applies to your industry.) You still need tests for the less critical parts, though: while downtime is better than injury, if you want to sell future machines to your customers, you need a good track record. At least if you don't want to compete on cost.

> make as few parts as possible critical, and make the potential bad outcomes as benign as possible

This is a good lesson for anyone, I think. Definitely something I'm going to think more about. Thanks for sharing!

One of the major things code review does is prevent that one guy on your team who is sloppy or incompetent from messing up the codebase, without singling him out. If you told someone, "I don't trust you, run all code by me first," it wouldn't go well. If you tell them "everyone's code gets reviewed," they're OK with it. Everyone is sloppy sometimes.

I wonder if what code review does is limit velocity (act as a brake) so that things don't change too fast (which is often a good thing). You don't get paid for features or code shipped. People don't pay $200 a head for fine dining based on the number of carrot chops or garlic crushes. The chops and crushes are necessary, but not what you should be optimizing for.

> people don't get promoted for preventing issues.

They do, but only after a company has been burned hard. They can also be promoted for their area being enough better that everyone notices.
Still, the best way to a promotion is to write a major bug that you can come in at the last moment and be the hero for fixing.

That could work, but plenty of quiet heroes weren't promoted for fixing critical bugs. They fixed it too soon. You have to wait until the effect is visible on someone's dashboard somewhere.

Goodhart's law strikes again: "When a measure becomes a target, it ceases to be a good measure."

You have to make sure it doesn't arrive at you before it is on the dashboard. Otherwise you are the reason the time-to-fix-a-bug metric is blowing up. Unless you can make the problem so obscure that other smart people asked to help can't figure it out, thus making you look bad.

That is in no way guaranteed. Sometimes finding too many security issues makes you unpopular. Two years afterward, we got hit with ransomware. And obviously "I told you so" isn't a productive discussion topic at that point.

That's not preventing the issue, though. The closest you can get to this is to have a competitor be burned hard and demonstrate how your code base has the exact same issue. But even that isn't guaranteed. "That can't happen here" is a hard mindset to disrupt unless you yourself are already in the C-suite.

Code reviews are great for spreading context, but they are also very good at finding bugs. If you want to find bugs, review is one of the best ways to do it. https://entropicthoughts.com/code-reviews-do-find-bugs

I think of code review as more about ensuring understandability. When you spend hours gathering context, designing, iterating, debugging, and finally polishing a commit, your ability to judge the readability of your own change has been tainted by your intimate familiarity with it.
Getting a fresh pair of eyes to read it and leave comments like "why did you do it this way" or "please refactor to use XYZ for maintainability" leaves you with something that will be easier to navigate and maintain by the junior interns who will end up fixing your latent bugs five years later.

> The problem, I think, is people don't get promoted for preventing issues.

Cleaning up structural issues across a couple of orgs is a senior => principal promo; I've seen it a couple of times.

> When I was really early in my career, a mentor told me that code review is not about catching bugs but spreading context (i.e. increasing bus factor.) Catching bugs is a side effect

This BS is what I tell my juniors when I want them to fuck off with their reviews and focus on my actual work. Sounds very insightful, though.

Expert reviews are just about the only thing that makes AI-generated code viable, though doing them after the fact is a bit sketchy; to be efficient, you kind of need to keep an eye on what the model is doing as it works. Unchecked, AI models output code that is as buggy as it is inefficient. In smaller greenfield contexts, it's not so bad, but in a large code base it performs much worse, as it does not have access to the bigger picture. In my experience, you should be spending something like 5-15x the time the model takes to implement a feature on reviewing it and making it fix its errors and inefficiencies. If you do that (with an expert's eye), the changes will usually be high quality, correct, and good. If you do not do that due diligence, the model will produce a staggering amount of low-quality code, at a rate that is probably something like 100x what a human could output in a similar timespan. Unchecked, it's like having a small army of the most eager junior devs you can find going completely fucking ape in the codebase.

If you spend 5-15x the time reviewing what the LLM is doing, are you saving any time by using it?
No, but that's the crux of the AI problem in software. Time to write code was never the bottleneck. AI is most useful for learning, either via conversation or by seeing examples. It makes writing code faster too, but only a little once you take review into account. The cases where it shines are high-profile and exciting to managers, but not common enough to make a big difference in practice. E.g., AI can one-shot a script to fetch logs from a paginated API, convert them to ndjson, and save to files grouped by week, with minimal code review, but only if I'm already experienced enough to describe those requirements, and, most importantly, that's not what I'm doing every day anyway.

I'm finding that in some cases I'm dealing with even more code, given how much code AI outputs. So yeah, for some tasks I find myself extremely fast, but for others I find myself spending ungodly amounts of time reviewing code I never wrote to make sure it doesn't destroy the project with unforeseen, convincing slop.

A related dirty secret that's going to become clear from all this is that a very large proportion of code in the wild (yes, even in 2026; maybe not in FAANG and friends, but across all code written for pay in the entire economy) has limited or no automated test coverage, and is often written against only a limited recorded spec, usually fleshed out only to the (very partial) degree needed as a given feature is being worked on. What do the relatively hands-off "it can do whole features at a time" coding systems need to function without taking up a shitload of time in reviews? Great automated test coverage and extensive specs. I think we're going to find there's very little time savings to be had for most real-world software projects from heavy application of LLMs, because the time will just go into tests that wouldn't otherwise have been written and much more detailed specs that otherwise never would have been generated.
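The 5-15x review-multiplier question a few comments up reduces to simple arithmetic. A sketch with placeholder numbers; none of these durations are measurements, just an illustration of the break-even point:

```python
# Net wall time for an AI-assisted change vs. writing it by hand.
# The 5-15x review multiplier is from the thread; all durations are
# hypothetical placeholders.

def ai_assisted_time(model_minutes: float, review_multiplier: float) -> float:
    """Total time: model generation plus human review/fix-up."""
    return model_minutes + model_minutes * review_multiplier

def saves_time(model_minutes: float, review_multiplier: float,
               handwritten_minutes: float) -> bool:
    """Does the AI-assisted path beat the hand-written estimate?"""
    return ai_assisted_time(model_minutes, review_multiplier) < handwritten_minutes

# A 10-minute generation with 15x review time costs 160 minutes total:
print(ai_assisted_time(10, 15))      # 160
# ...a net loss vs. an estimated 120 minutes by hand at the high end,
print(saves_time(10, 15, 120))       # False
# but still a win if review only takes 5x.
print(saves_time(10, 5, 120))        # True
```

The point the thread makes is that the multiplier, not the generation speed, dominates the outcome, and it varies wildly by task.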
I guess the bright-side take on this is that we may end up with better-tested and better-specified software? Though so much of the industry is used to skipping those parts, especially the less capable (as far as software goes) orgs that really need the help, and the relative amateurs and non-software-professionals some hope will become extremely productive with these tools, that I'm not sure we'll manage to drag processes and practices to where they need to be to get the most out of LLM coding tools anyway. Especially if the benefit to companies is "you will have better tests for... about the same amount of software as you'd have written without LLMs." We may end up stuck at "it's very aggressive autocomplete" as far as LLMs' useful role goes, for most projects, indefinitely. On the plus side for "AI" companies, low-code solutions are still big business even though they usually fail to deliver the benefits the buyer hopes for, so there's likely a good deal of money to be made selling companies LLM solutions that end up not really being all that great.

> better-specified software

Code is the most precise specification we have for interfacing with computers.

Sure, but if you define the code as the only spec, then it is usually a terrible spec, since the code specifies its bugs too. And one of the benefits of having a spec (or tests) is that you have something against which to evaluate the program in order to decide whether its behavior is correct. Incidentally, I think in many scenarios LLMs are pretty good at converting code to a spec, and indeed spec to code (of quality equal to that of the input spec).

There are some cases where AI is generating binary machine code, albeit in small amounts. What do we have when we don't have the code?

Machine code is still code, even if the representation is a bit less legible than the punch cards we used to use.
You’re missing the point of a spec. The spec is as much for humans as it is for the machine, yes? A spec should be made beforehand and agreed on by stakeholders. It says what the software should do, so it’s for whoever is implementing, modifying, and/or testing the code. And unfortunately devs have a tendency toward poor documentation.

Re: productivity, if LLMs are a genuine boost for 1/3 of the work, neutral 1/3 of the time, and actually worse 1/3 of the time, it's likely we aren't really seeing performance improvements yet, because 1) people are using them for everything and 2) we're still learning how to best use them. So I expect over time we will see genuine performance improvements, but Amdahl's law dictates it won't be as much as some people and CEOs are expecting.

Bingo. Hopefully there are some business opportunities for us in that truth.

> because the time will just go into tests that wouldn't otherwise have been written

Writing tests to ensure a program is correct is the same problem as writing a correct program.

Evaluating conformance is a different category of concern from ensuring correctness. Tests are about conformance, not correctness.

Ensuring correct programs is like cleaning, in the sense that you can only push dirt around; you can't get rid of it. You can push uncertainty around, but you can't eliminate it. This is the point of Gödel's theorem. Shannon's information theory observes similar aspects for fidelity in communication. As Douglas Adams noted: ultimately you've got to know where your towel is.

A competent programmer proves the program he writes correct in his head. He can certainly make mistakes in that, but it’s very different from writing tests, because proofs abstract (or quantify) over all states and inputs, which tests cannot do.

These companies don't care about saving time or lowering operating costs; they have massive monopolies to subsidize their extremely poor engineering practices with.
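The Amdahl's-law point above can be made concrete with a back-of-envelope calculation. Assume, purely for illustration (these numbers are invented, not from the comment), that LLMs double speed on 1/3 of the work, change nothing on 1/3, and slow the remaining 1/3 to 0.8x:

```python
def overall_speedup(fractions_and_speedups):
    """Amdahl-style composition: each (fraction_of_work, local_speedup)
    pair contributes fraction/speedup to the new total time."""
    new_time = sum(frac / speed for frac, speed in fractions_and_speedups)
    return 1 / new_time


# Hypothetical mix: 1/3 doubled, 1/3 unchanged, 1/3 slowed to 0.8x.
mix = [(1 / 3, 2.0), (1 / 3, 1.0), (1 / 3, 0.8)]
print(f"{overall_speedup(mix):.2f}x")  # prints "1.09x"
```

A 2x boost on a third of the work nets out to roughly a 9% overall speedup under these assumptions, which is the gap between headline demos and measured team throughput that the comment is pointing at.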
If the mandate is to force LLM usage or lose your job, you don't care about saving time; you care about saving your job.

One thing I hope we'll all collectively learn from this is how grossly incompetent the elite managerial class has become. They're destroying society because they don't know what to do other than copy each other. It has to end.

The submitter with their name on the Jira ticket saves time; the reviewer who has to actually verify the work loses a lot of time, and likely just lets issues slip through.

To be honest, sometimes it's still beneficial. For fairly straightforward changes it's probably a wash, but ironically enough, it's often the trickier jobs where LLMs can be beneficial, as they will provide an ansatz that can be refined. They're also very good at tedious chores.

And spotting stuff in review! Sometimes it’s false positives, but on several occasions I’ve spent ~15-30 minutes teaching-reviewing a PR in person, checked afterwards, and the AI matched every one of the points.

Some, but not very much. Writing code is hard. AI will do a lot of the tedious code that you procrastinate writing. Also, when you are writing code yourself, you are implicitly checking it while, at the back of your mind, retaining some form of the entire system as a whole. People seem to gloss over this... As a CEO, if people didn't function like this I'd be awake at night sweating.

That’s the reverse-centaur issue I see: humans are not great at repetitive, nuanced, similar-seeming tasks, and putting the onus on humans to retroactively approve high volumes of critical code has them managing a critical failure mode at their weakest and worst. Automated reviews should be enhancing known good-faith code; manual review of high volumes of superficially sound but subversive code is begging for issues over time.
Which results in the software engineering issue I’m not seeing addressed by the hype: bugs cost tens to hundreds of times their coding cost to resolve if they require internal or external communication to address. Even if everyone has been 10x’ed, the math still strongly favours not making mistakes in the first place. An LLM workflow that yields a 10x engineer but psychopathically lies and sabotages client-facing processes/resources once a quarter is likely a NNPP (net negative producing programmer) once opportunity and volatility costs are factored in.

> Even if everyone has been 10x’ed, the math still strongly favours not making mistakes in the first place

The math depends on the importance of the software. A mistake in a typical CRUD enterprise app with 100 users has zero impact on anything. You will fix it when you have time; the important thing is that the app was delivered in a week a year ago and has been solving some problem ever since. It has already made enormous profit if you compare it with today’s (yesterday’s?) manual development that would take half a year and cost millions. A mistake in nuclear reactor control code would be a totally different thing: whatever time savings you made on coding are irrelevant if they allowed a critical bug to slip through. Between the two extremes you thus have a whole spectrum of tasks that either benefit or lose from coding with LLMs. And there are more axes than this low-to-high failure cost that affect the math. For example, even a non-important but large app will likely soon degrade into an unmanageable state if developed with too little human intervention, and you will be forced to start from scratch, losing a lot of time.

I have found AI extremely good at finding all those really hard bugs, though. AI is a greater force multiplier when there is a complex bug than in greenfield code.

Sort of. I work on a system too large for anyone to know the whole thing.
Often people who don't know each other do something that will break the other. (Often because of the sheer number of different people involved; most individuals go years between causing such a break.)

No, I’m keeping up with the system as a whole, because I’m always working at a system level when I’m using AI instead of worrying about the “how”.

No, you’re not. The “how” is your job to understand, and if you don’t, you’ll end up like the devs in the article. We as an industry have been able to offload a lot of “how” via deterministic systems built by humans with expert understanding. LLMs give you the illusion of this.

No, in my case the “how” is:

1. I spoke to sales to find out about the customer
2. I read every line of the contract (SOW)
3. I did the initial requirements gathering with the client over a couple of days - or maybe up to 3 weeks
4. I designed every single bit of AWS architecture and code
5. I did the design review with the client
6. I led the customer acceptance testing

> We as an industry have been able to offload a lot of “how” via deterministic systems built by humans with expert understanding.

I assure you the mid-level developers, or god forbid the foreign contractors, were not “experts”. With 30 years of coding experience and, at the time, 8 years of pre-LLM AWS experience, it’s been well over a decade - ironically starting before LLMs - since my responsibility was only for code I wrote with my own two hands.

Yes, and trusting an LLM here is not a good idea. You know it will make important mistakes. I’m not saying trusting cheap devs is a good idea either; I do think cheap devs are actually at risk here.