Has the cost of building software dropped 90%?
martinalderson.com | 413 points by martinald 2 days ago
> Engineers need to really lean in to the change in my opinion.
I tried leaning in. I really tried. I'm not a web or game developer (more robotics and embedded systems), but I tried vibe coding web apps and games. They were pretty boring. I got frustrated that I couldn't change little things. I remember my game character kept getting stuck on imaginary walls; I kept asking Cursor to fix it, and it just made more and more of a mess. I remember making a simple front-end + back-end app with a database to analyze thousands of pull request comments, and it got massively slow and I didn't know why. Cursor wasn't very helpful in fixing it. I felt dumber after the whole process.
The next time I made a web app I just taught myself Flask and some basic JS and I found myself moving way more quickly. Not in the initial development, but later on when I had to tweak things.
The AI helped me a ton with looking things up: documentation, error messages, etc. It's essentially a supercharged Google search and Stack Overflow replacement, but I did not find it useful to let it take the wheel.
Posts like the one OP made are why I'm losing my mind.
Like, is there truly an agentic way to go 10x or is there some catch? At this point, while I'm not thrilled about the idea of just "vibe coding" all the time, I'm fine with facing reality.
But I keep having the same experience as you, or rather leaning more on that supercharged Google/SO replacement, or just a "can you quickly make this boring func here that does xyz", "also add this", or bash scripts, etc.
And that's only when I've done most of the plumbing myself.
EVERY DX survey that comes out (surveying over 20k developers) says the exact same thing.
Staff engineers get the most time savings out of AI tools, and their weekly time savings is 4.4 hours for heavy AI users. That's a little more than 10% productivity, so not anywhere close to 10x.
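(Worked out: 4.4 hours saved out of a 40-hour week is 4.4 / 40 = 11%; doing a week's work in 35.6 hours is roughly a 1.12x speedup.)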
What's more telling about the survey results is that they are also consistent between heavy and light users of AI. Staff engineers who are heavy users of AI save 4.4 hours a week, while staff engineers who are light users save 3.3 hours a week. To put it another way, the DX survey is pretty clear that the time-savings gap between heavy and light AI users is minimal.
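(That delta is 4.4 - 3.3 = 1.1 hours per week, under 3% of a 40-hour week.)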
Yes, surveys are all flawed in different ways, but an N of 20k is nothing to sneeze at. Every study with actual data points shows that code generation is not a significant source of time savings, and zero studies show that it is. All the productivity gains DX reports come from debugging and investigation/codebase-spelunking help.
> That's a little more than 10% productivity, so not anywhere close to 10x.
No Silver Bullet still alive and well in 2026
For those surveys to mean anything (for or against AI), we'd have to have an effective measure of developer productivity.
In my experience, productivity as measured in merge requests created has increased massively.
More merge requests because the same senior developers are now creating more bugs, 4x compared to 2025. Same developers, same codebase, but now with Cursor!
Do you have a link to the latest survey? My google-fu is failing me at the moment.
Unfortunately I do not.
Past survey results are hidden in some presentations I've seen, and I only have full access to the latest survey because my company pays for it, so I'm not sure it's legal for me to reproduce it.
The latest survey is here apparently: https://getdx.com/report/ai-assisted-engineering-q4-impact-r....
No idea if signing up gives the full survey for free. We get it through our company paying for DX's services
Your google-fu isn't failing. There are simply only a couple of large studies on this, and of those, zero have a useful methodology.
I think there is going to be a 2-3 year lag in understanding how LLMs actually impact developer productivity. There are way too many balls in the air, and anyone claiming specific numbers on productivity increases is likely very, very wrong.
For example, citing staff engineers introduces a bias: they have years of traditional training and are obviously not representative of software engineers in general.
FWIW, I only mentioned staff engineers because the survey found staff+ engineers reported the highest time savings. The survey's weekly time-savings averages were junior (3.9 hours), mid-level (4.3), senior (4.1), and staff (4.4).
> Like, is there truly an agentic way to go 10x or is there some catch?
Yes. I think it’s practice. I know this sounds ridiculous, but I feel like I have reached a kind of mind meld state with my AI tooling, specifically Claude Code. I am not really consciously aware of having learned anything related to these processes, but I have been all in on this since ChatGPT, and I honestly think my brain has been rewired in a way that I don’t truly perceive except in terms of the rate of software production.
There was a period of several months a while ago where I felt exhausted all the time. I was getting a lot done, but there was something about the experience that was incredibly draining. Now I am past that and I have gone to this new plateau of ridiculous productivity, and a kind of addictive joy in the work. A marvellous pleasure at the orchestration of complex tasks and seeing the results play out. It’s pure magic.
Yes, I know this sounds ridiculous and over-the-top. But I haven’t had this much fun writing software since my 20s.
> Yes, I know this sounds ridiculous and over-the-top.
In that case you should come with more data. Tell us how you measured your productivity improvement. All you've said here is that it makes you feel good.
Here's what's worked best for me with Gemini; with this approach I made a DSL that transpiles to C with CUDA support to train small models in about 3 hours (all programs must run against an image data set and must only generate embeddings).
Do not: vibe code from the top down (ex. "Make me a UI with React, with these buttons and these behaviors for each button").
Do not: chat casually with it (ex. "I think it would look better if the button was green").
Do: constrain phrasing to the next data-transform goal (ex. "You must add a function to change all words that start with lowercase to start with uppercase").
Do: vibe code bottom up (ex. "You must generate a file with a function to open a plaintext file and appropriate tests; now you must add a function to count all words that begin with 'f'"); see the sketch just after this list.
Do: stick to must/should/may (ex. "You must extend the code with this next function").
Do: constrain it to mathematical abstractions (ex. sys prompt: "You must not use loops; you must only use recursion and functional paradigms. You must not make up abstractions; stick to mathematical objects and known algorithms").
Do: constrain it to one file per type and function. This makes it quick to review and to regenerate only what needs to change.
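To make the bottom-up pattern concrete, this is roughly the shape of output those prompts produce. This is a hypothetical Python sketch, not my actual DSL-to-C output, and the file and function names are invented:

    # words.py -- from the first prompt: open a plaintext file;
    # then the follow-up: count all words that begin with "f".
    # Functional style per the system-prompt constraint (no explicit loops).

    def read_words(path):
        """Return the whitespace-separated words in a plaintext file."""
        with open(path, encoding="utf-8") as f:
            return f.read().split()

    def count_f_words(words):
        """Count words beginning with 'f' using map/sum instead of a loop."""
        return sum(map(lambda w: w.lower().startswith("f"), words))

    # test_words.py -- the "appropriate tests" requested in the same prompt
    def test_count_f_words():
        assert count_f_words(["foo", "bar", "Fizz"]) == 2

Because each function lives in its own small unit, you can regenerate one piece without the model touching the others.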
Using those patterns, Gemini 2.5 and 3 have cranked out banging code with little wandering off into the weeds or hallucinating.
Programming has been mired in the made-up semantics of individual coders, for the lulz, to create mystique and obfuscate the truth to ensure job security; at the end of the day it's matrix math and state sync between memory and display.
This is remarkably similar to the process we had to follow a couple of decades ago when offshoring to IT mills: spell out every little detail in small steps, iterate often, and you'll usually get most of what you want.
This. I find constraints to be very important. It's fairly obvious an LLM can tackle a class or function; it's still up to the human to string it all together. I'm not quite sure how long that will last, though. It seems more of an engineering problem to me. At the end of the day, you absolutely can get good outputs from these things if you provide the proper input. Everything else is orchestration.
Work that would have taken me 1-2 weeks to complete, I can now get done in 2-3 hours. That's not an exaggeration. I have another friend who is as all-in on this as me and he works in a company (I work for myself, as a solo contractor for clients), and he told me that he moved on to Q1 2026 projects because he'd completed all the work slated for 2025, weeks ahead of schedule. Meanwhile his colleagues are still wading through scrum meetings.
I realize that this all sounds kind of religious: you don't know what you're missing until you actually accept Jesus's love, or something along those lines. But you do have to kinda just go all-in to have this experience. I don't know what else to say about it.
If your work maps exceedingly well to the technology it is true, it goes much faster. Doubly so when you have enough experience and understanding of things to find its errors or suboptimal approaches and adjust it that much faster.
The second you get to a place where the mapping isn’t there though, it goes off rails quickly.
Not everyone programs in such a way that they may ever experience this but I have, as a Staff engineer at a large firm, run into this again and again.
It’s great for greenfield projects that follow CRUD patterns though.
This is just not a very interesting way to talk about technology. I'm glad it feels like a religious experience to you, but I don't care about that. I care about reality.
It seems to me that if these things were real and repeatable, there would be published traces showing the exact interactions that led to a specific output, and the cost in time and money to get there.
Do such things exist?
Assuming 40 hours a week of work time, you’re claiming a ~25x speed up, which is laughably absurd to me.
It will take you 2.5 months to accomplish what would have taken you five years, that is the kind of productivity increase you’re describing.
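(Rough math: 1.5 weeks is about 60 working hours, and 60 / 2.5 hours ≈ 24x; at that rate, five years of work, roughly 60 months, compresses into about 2.4 months.)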
It doesn’t pass the smell test. I’m not sure that going from assembly to python would even have such a ludicrous productivity enhancement.
My sympathies go out to the friend's coworkers. They are probably wading through a bunch of stuff right now, but given the context you have given us, it's probably not "scrum meetings"..
I don't even care about the LLM; I just want the confidence you have to assess that any given thing will take N weeks. You say 1-2 weeks... that's quite a big range! Something that "would" take 1 week takes ~2 hours, and something that "would" take 2 weeks also takes ~2 hours. How does that even make sense? I wonder how long something that would have taken three weeks would take.
Do you still charge your clients the same?
> They are probably wading through a bunch of stuff right now, but given the context you have given us, it's probably not "scrum meetings"..
This made me laugh. Fair enough. ;)
In terms of the time estimates: if your point is that I don't have hard data to back up my assertions, you're absolutely correct. I was always terrible at estimating how long something would take. I'm still terrible at it. But I agree with the OP. I think the labour required is down 90%.
It does feel to me that we're getting into religious believer territory. There are those who have firsthand experience and are all-in (the believers), there are those who have firsthand experience and don't get it (the faithless), and there are those who haven't tried it (the atheists). It's hard to communicate across those divides, and each group's view of the others is essentially, "I don't understand you".
Religions are about faith, faith is belief in the absence of evidence. Engineering output is tangible and measurable, objectively verifiable and readily quantifiable (both locally and in terms of profits). Full evidence, testable assertions, no faith required.
Here we have claims of objective results, but also admissions we’re not even tracking estimations and are terrible at making them when we do. People are notoriously bad at estimating actual time spent versus output, particularly when dealing with unwanted work. We’re missing the fundamental criteria of assessment, and there are known biases unaccounted for.
Output in LOC has never been the issue, copy and paste handles that just fine. TCO and holistic velocity after a few years is a separate matter. Masterful orchestration of agents could include estimation and tracking tasks with minimal overhead. That’s not what we’re seeing though…
Someone who has even a 20% better method for deck construction is gonna show me some timetables, some billed projects, and a very fancy new car. If accepting Mothra as my lord and saviour is a prerequisite to pierce an otherwise impenetrable veil of ontological obfuscation in order to see the unseeable? That deck might not be as cheap as it sounds, one way or the other.
I’m getting a nice learning and productivity bump from LLMs, there are incredible capabilities available. But premature optimization is still premature, and claims of silver bullets are yet to be demonstrated.
Here's an example from this morning. At 10:00 am, a colleague created a ticket with an idea for the music plugin I'm working on: wouldn't it be cool if we could use nod detection (head tracking) to trigger recording? That way, musicians who use our app wouldn't need a foot switch (as a musician, you often have your hands occupied).
Yes, that would be cool. An hour later, I shipped a release build with that feature fully functional, including permissions, plus a calibration UI that shows whether your face is detected, lets you adjust sensitivity, and visually indicates when a nod is detected. Most of that work got done while I was in the shower. That is the second feature in this app that got built today.
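For the curious, the trigger logic itself is conceptually simple. Here is a minimal sketch of a threshold-plus-cooldown nod detector: hypothetical Python, not the actual C++ plugin code; the real feature sits on top of platform face-tracking APIs, and every name here is invented.

    # Assumes a stream of head-pitch angles in degrees, one sample per frame,
    # where pitch grows as the head tilts down.

    class NodDetector:
        def __init__(self, sensitivity_deg=10.0, refractory_frames=15):
            self.sensitivity = sensitivity_deg   # adjustable via the calibration UI
            self.refractory = refractory_frames  # frames to ignore after a trigger
            self.baseline = None                 # resting head pitch
            self.cooldown = 0

        def update(self, pitch_deg):
            """Feed one pitch sample; return True when a downward nod is detected."""
            if self.baseline is None:
                self.baseline = pitch_deg        # calibrate on the first frame
                return False
            if self.cooldown > 0:
                self.cooldown -= 1
                return False
            if pitch_deg - self.baseline > self.sensitivity:
                self.cooldown = self.refractory  # debounce so one nod fires once
                return True
            # Let the baseline drift slowly so posture changes don't false-trigger.
            self.baseline += 0.05 * (pitch_deg - self.baseline)
            return False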
This morning I also created and deployed a bug fix release for analytics on one platform, and a brand-new report (fairly easy to put together because it followed the pattern of other reports) for a different platform.
I also worked out, argued with random people on HN and walked to work. Not bad for five hours! Do I know how long it would have taken to, for example, integrate face detection and tracking into a C++ audio plugin without assistance from AI? Especially given that I have never done that before? No, I do not. I am bad at estimating. Would it have been longer than 30 minutes? I mean...probably?