When I reject AI code even if it works

vinibrasil.com

61 points by vnbrs 3 hours ago


Aurornis - an hour ago

Even using Fable (while it was briefly available), having it refine a plan, and directing it to make only small incremental changes, I still found reasons to reject its first pass at a lot of work. There were a lot of “You’re right to push back” responses. A lot of incidents where it would create some giant complex set of abstractions to accomplish something that I could find ways to do much more elegantly and in a more maintainable manner.

It’s really eye-opening to work with these tools on a codebase you know deeply because these problems are everywhere.

However, if I opened an unfamiliar project in another language and I wanted to add a little feature with no intention of maintaining it, I’d happily accept the changes and loop until it worked well enough for my temporary needs.

The scary middle is when you’re dealing with coworkers who don’t care about anything other than closing tickets and collecting credit. With enough of a token budget you can now wrap loops around an LLM and have it try things until the program appears to work. Ask it to do a code review and then submit the PR without having understood what it was doing. There are a lot of workplaces where there isn’t a good mechanism to push back on this and the tech debt just keeps growing.

ecshafer - an hour ago

If we rephrased this to "When I reject my coworker's code even if it works" and gave the same reasons, there would be zero dissent. There is this weird idea that seems to come up with AI that any solution that works must be good and adequate. Software engineering is all about rejecting code that works in favor of the right code that works.

krupan - 23 minutes ago

And again this makes me wonder, is AI really helping if this much review and rework is needed for all the code it writes?

AmareshHebbar - 30 minutes ago

If I can't explain the code without rereading the diff, I probably shouldn't merge it.

summerlight - an hour ago

My personal rule of thumb: I am usually okay with agents driving e2e implementations if a failure won't make life noticeably worse. Some analytical code? Perfectly fine. Hobby projects? Fine, though I prefer doing the fun parts myself. Refactoring production code that generates 10x more revenue than my salary? You'd better at least understand what it does.

datadrivenangel - 2 hours ago

"The reality is that code that runs and makes the CI green can still be a bad solution, and engineering has always been about implementing adequate, scalable, and extensible solutions."

Adequate often means done and cheap

rvz - 43 minutes ago

> Before coding agents, when given a task, I would explore the codebase, think of different solutions, experiment, and only then implement. That could take days of consolidating all that context. When I finally submitted that PR, confidence was higher, and explaining each of my changes to my coworkers was easier.

Now we are getting to the point where we are speed-running the deskilling of engineers into comprehension debt, and they themselves are rapidly losing confidence in reviewing code they did not write.

I think this blog post [0] is the best example of what can go entirely wrong, and get even worse, when you do not know the technology.

If you cannot explain a change even when "the CI is green" or "all tests passing", I will immediately reject it.

Maybe great for vibe coding prototypes, but it all changes when that code is deployed onto mission critical systems. Just ask Amazon with Kiro. [1]

[0] https://sketch.dev/blog/our-first-outage-from-llm-written-co...

[1] https://www.reuters.com/business/retail-consumer/amazons-clo...

eranation - 32 minutes ago

LLMs diverge, not converge. They slightly increase entropy if not controlled. You can have DRY skills and use AI to organize AI (in loops(tm) like Boris does), but eventually, if you don’t understand the code, you are taking yourself out of the loop. And it’s not just job security that’s on the line, it’s the increasing cost of having AI babysit AI. If you or your “loops” (or paperclip, Hermes, gastown, or the next-in-class agents of agents that run your entire company) let slop-debt gradually sneak in, the cost to fix it later will become prohibitive. (You can always just rewrite it, but as the race for “feature complete” and “zero backlog” continues, rewriting an ever-growing set of new daily table stakes will become an economic moat.)

TLDR: Keeping your codebase human readable and reason-about-able does not just help humans stay relevant. It will also lower the cost of having LLMs maintain it.

_wire_ - 2 hours ago

"Even if it works?"

How do you verify that it works?
