The Future of AI Software Development

martinfowler.com

154 points by nthypes 3 hours ago


chadash - 3 hours ago

> Will LLMs be cheaper than humans once the subsidies for tokens go away? At this point we have little visibility to what the true cost of tokens is now, let alone what it will be in a few years time. It could be so cheap that we don’t care how many tokens we send to LLMs, or it could be high enough that we have to be very careful.

We do have some idea. Kimi K2 is a relatively high-performing open-source model. People have it running at 24 tokens/second on a pair of Mac Studios, which cost about $20k. This setup draws less than a kW of power, so the roughly $0.08-0.15/hour spent on electricity is negligible compared to a developer. This might be the cheapest setup to run locally, but it's almost certain that the cost per token is far lower with specialized hardware at scale.
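
To make the back-of-the-envelope arithmetic concrete (the amortization period and electricity rate below are my own illustrative assumptions, not measured figures):

```python
# Rough cost-per-token estimate for the local setup described above.
# All inputs are assumptions for illustration only.
hardware_cost = 20_000                 # two Mac Studios, USD
amortization_hours = 3 * 365 * 24      # write hardware off over ~3 years of 24/7 use
power_kw = 1.0                         # upper bound on power draw
electricity_per_kwh = 0.15             # USD, a typical residential rate
tokens_per_second = 24

hourly_cost = hardware_cost / amortization_hours + power_kw * electricity_per_kwh
tokens_per_hour = tokens_per_second * 3600
cost_per_million_tokens = hourly_cost / tokens_per_hour * 1_000_000

# Under these assumptions this lands around $10 per million tokens,
# dominated by hardware amortization rather than electricity.
print(f"${hourly_cost:.2f}/hour, ~${cost_per_million_tokens:.2f} per million tokens")
```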

In other words, a near-frontier model is running at a cost that a (somewhat wealthy) hobbyist can afford. And it's hard to imagine that the hardware costs don't come down quite a bit. I don't doubt that tokens are heavily subsidized but I think this might be overblown [1].

[1] training models is still extraordinarily expensive and that is certainly being subsidized, but you can amortize that cost over a lot of inference, especially once we reach a plateau for ideas and stop running training runs as frequently.

PaulHoule - 2 hours ago

Get over your FOMO:

   I walked into that room expecting to learn from people who were 
   further ahead. People who’d cracked the code on how to adopt AI at scale, 
   how to restructure teams around it, how to make it work. Some of the 
   sharpest minds in the software industry were sitting around those tables.

   And nobody has it all figured out.
People who say they have it all figured out are trying to mess with your head.

wseqyrku - 40 minutes ago

> When I began in software in the 1980s I was dismissed as an “object guy” by database folks and as a “data modeler” by object folks. I've since been dismissed as a “patterns guy”, “agile guy”, “architecture guy”, “java guy”, “ruby guy”, and “anti-architect guy”. I'm now a past-it gray-beard surviving on drinking the intellectual blood of my younger colleagues. It's tasty.

I don't think you can find that level of ego anywhere in the software industry, or any other industry for that matter. Respect.

tcgv - 24 minutes ago

Martin’s framing (org and system-level guardrails like risk tiering, TDD as discipline, and platforms as “bullet trains”) matches what I’ve been seeing too.

A useful complement is the programmer-level shift: agents are great at narrow, reversible work when verification is cheap. Concretely, think small refactors behind golden tests, API adapters behind contract tests, and mechanical migrations with clear invariants. They fail fast in codebases with implicit coupling, fuzzy boundaries, or weak feedback loops, and they tend to amplify whatever hygiene you already have.
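
A "golden test" in the sense above can be tiny: record the current output once, then fail loudly if a refactor changes it. A minimal sketch (the `render_invoice` function and file name are hypothetical):

```python
import json
import tempfile
from pathlib import Path

def render_invoice(order: dict) -> dict:
    # Hypothetical function an agent may refactor freely,
    # as long as observable output does not change.
    return {
        "customer": order["customer"],
        "total": sum(item["price"] * item["qty"] for item in order["items"]),
    }

def check_against_golden(golden_file: Path, result: dict) -> bool:
    # First run records the snapshot; later runs fail on any behavior change.
    if not golden_file.exists():
        golden_file.write_text(json.dumps(result, sort_keys=True))
    return json.loads(golden_file.read_text()) == result

order = {"customer": "acme", "items": [{"price": 5, "qty": 3}]}
golden = Path(tempfile.mkdtemp()) / "invoice.golden.json"
assert check_against_golden(golden, render_invoice(order))  # records snapshot
assert check_against_golden(golden, render_invoice(order))  # still matches
```

The point is that verification stays cheap: the agent can rewrite the internals however it likes, and the snapshot makes any semantic drift visible immediately.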

So the job moves from typing to making constraints explicit and building fast verification, while humans stay accountable for semantics and risk.

If useful, I expanded this “delegation + constraints + verification” angle here: https://thomasvilhena.com/2026/02/craftsmanship-coding-five-...

simonw - 3 hours ago

> LLMs are eating specialty skills. There will be less use of specialist front-end and back-end developers as the LLM-driving skills become more important than the details of platform usage. Will this lead to a greater recognition of the role of Expert Generalists? Or will the ability of LLMs to write lots of code mean they code around the silos rather than eliminating them?

This is one of the most interesting questions right now I think.

I've been taking on much more significant challenges in areas like frontend development and ops and automation and even UI design now that LLMs mean I can be much more of a generalist.

Assuming this works out for more people, what does this mean for the shape of our profession?

markoa - an hour ago

One thing that I'm sure of is that the agentic future is test-driven. Tests are basically executable specs the agent can follow and verify against.

When we have solid tests, the agent output is useful and we can trust it. When tests are thin or missing, the agents still ship a lot of code, but we spend way more time debugging and fixing subtle bugs.
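
To illustrate "tests as executable specs": the test is written first, and the implementation below it stands in for what an agent would be asked to produce (`slugify` is a made-up example, not from the article):

```python
import re

def slugify(title: str) -> str:
    # One possible agent-written implementation satisfying the spec below.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def test_slugify_spec():
    # The spec the agent must satisfy and can verify against on every run.
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  spaces  everywhere ") == "spaces-everywhere"
    assert slugify("Already-Slugged") == "already-slugged"

test_slugify_spec()
```

If the agent's patch breaks any assertion, the loop rejects it automatically; no human needs to notice the subtle bug later.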

senko - 3 hours ago

What's with the editorialized title?

The text is actually about the Thoughtworks Future of Software Development retreat.

tabs_or_spaces - an hour ago

> Will this lead to a greater recognition of the role of Expert Generalists?

I've always felt that LLMs can make you average in a new area/topic/domain really quickly. But you still need expertise to make the most out of the LLM.

Personally, I'm more interested in whether software development has become more or less pay to win with LLMs?

riffraff - 3 hours ago

I think the title on HN doesn't reflect all that is in TFA, but rather the linked article[0]. Fowler's article is interesting tho.

I do like the idea that "all code is tech debt", and we shouldn't want to produce more of it than we need. But it's also worth remembering that debt is not bad per se: buying a house with a mortgage is also debt, and can be a good choice for many reasons.

[0]: https://thenewstack.io/ai-velocity-debt-accelerator/

mehagar - an hour ago

It's refreshing to hear people say "We're not really sure" in public, especially from experts.

I agree that AI tools are likely to amplify the importance of quick cycles and continuous delivery.

greymalik - 2 hours ago

The headline misrepresents the source. It’s not the title of the page, not the point of the content, and biases the quote’s context: “if traditional software delivery best practices aren’t already in place, this velocity multiplier becomes a debt accelerator”

the_harpia_io - an hour ago

the risk tiering framing is the most useful thing i've seen from this retreat content, tbh. it maps directly to how ai-generated code review actually works - you can't give equal weight to 300 lines of generated scaffolding, so you triage by risk class. auth flows, anything touching external input, config handling - slow lane. the rest gets a quick pass.

the part that's tricky is that slow lane and fast lane look identical in a PR. the framework only works if it's explicit enough to survive code review fatigue and context switching. and most teams are figuring that out as they go.
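
one way to keep the lanes explicit enough to survive review fatigue is to encode the triage in tooling rather than in reviewers' heads. a sketch (the path patterns are illustrative; every team's risk map will differ):

```python
import fnmatch

# Slow-lane patterns: auth flows, anything touching external input or money,
# config handling. Everything else gets the quick pass.
SLOW_LANE = ["*auth*", "*payment*", "config/*", "*/handlers/*"]

def review_lane(changed_path: str) -> str:
    # Tag each changed file in a PR so the lane is visible, not implicit.
    if any(fnmatch.fnmatch(changed_path, pat) for pat in SLOW_LANE):
        return "slow"   # line-by-line human review required
    return "fast"       # quick pass is acceptable

assert review_lane("src/auth/session.py") == "slow"
assert review_lane("src/ui/button_styles.py") == "fast"
```

a CI bot that labels the PR (or blocks merge until the slow-lane files have an explicit reviewer) makes the tiering survive context switching, because nobody has to remember which lane a file belongs to.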

acomjean - 2 hours ago

So do we need new abstractions / languages? It seems clear that a lot of things can be pulled together by AI because it’s tedious for humans. But it seems to indicate that better tooling is needed.

adregan - 3 hours ago

In the section on security:

> One large enterprise employee commented that they were deliberately slow with AI tech, keeping about a quarter behind the leading edge. “We’re not in the business of avoiding all risks, but we do need to manage them”.

I’m unclear how this pattern helps with security vis-à-vis LLMs. It makes sense when talking about software versions, in hoping that any critical bugs are patched, but prompt injection springs eternal.

fuzzfactor - 3 hours ago

Looks to me like the people that are filthy rich [0] can afford to move so fast that even the people who are very rich in the regular way can't keep up.

[0] Which is not even enough, these are the ones with truly excess money to burn.

anthonypasq - 2 hours ago

What is up with all this nonsense about token subsidies? Dario in his recent interview with Dwarkesh made it abundantly clear that they have substantial inference margins, and they use that to justify the financing for the next training run.

Chinese open source models are dirt cheap, you can buy $20 worth of kimi-k2.5 on opencode and spam it all week and barely make a dent.

Assuming we never got bigger models, but hardware keeps improving, we'll either be serving current models for pennies, or at insane speeds, or both.

The only actual situation where tokens are being subsidized is free tiers on chat apps, which are largely irrelevant for any sort of useful economic activity.

taeric - 2 hours ago

I really hate that we allowed "debt" to become a synonym for "liability."

This isn't a case where you have specific code/capital you have borrowed and need to pay for its use or give it back. This is flat out putting liabilities into your assets that will have to be discovered and dealt with, someday.

empath75 - an hour ago

So here are a few things I have been thinking of:

It's not 2-pizza teams, it's 2-person teams. You no longer need 4 people on a team just working on features off of a queue; you just need 2 people making technical decisions and managing agents.

Code used to be expensive to create. It was only economical to write code if it was doing high-value work or work that would be repeated many times over a long period of time.

Now producing code is _cheap_. You can write and run code in an automated way _on demand_. But if you do that, you have essentially traded upfront cost for run time cost. It's really only worth it if the work is A) high value and B) intermittent.

There is probably a formula you can write to figure out where this trade off makes sense and when it doesn't.
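
A naive version of that formula (all dollar figures are made-up parameters, not measured costs):

```python
def cheaper_to_codify(dev_cost_to_write: float,
                      agent_cost_per_run: float,
                      expected_runs: float) -> bool:
    # Codify (pay the upfront cost once) when cumulative agent run-time
    # cost would exceed the one-time cost of writing normal code.
    return expected_runs * agent_cost_per_run > dev_cost_to_write

# High value but intermittent: 10 runs at $2/run vs $2,000 to build
# -> keep the agent.
assert cheaper_to_codify(2000, 2.0, 10) is False

# Hot path: 50,000 runs at $2/run -> worth turning into normal code.
assert cheaper_to_codify(2000, 2.0, 50_000) is True
```

The real version would discount future runs and include maintenance cost on the codified version, but even this crude inequality gives the watcher a threshold to act on.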

I'm working on a system where we can just chuck out autonomous agents onto our platform with a plain text description, and one thing I have been thinking about is tracking those token costs and figuring out how to turn agentic workflows into just normal code.

I've been thinking about running an agent that watches the other agents for cost and reads their logs on a schedule to see if any of what the agents are doing can be codified and turned into a normal workflow, and possibly even _writing that workflow itself_.

It would be analogous to the JVM optimizing hot-path functions...

What I do know is that what we are doing for a living will be near unrecognizable in a year or two.

mamma_mia - 2 hours ago

mamma mia! out with the old in with the new, soon github will be like a warehouse full of old punchcards

siliconc0w - 2 hours ago

Even with the latest SOTA models - I still consistently find issues. Performance, security, memory leaks, bad assumptions/instruction following, and even levels of laziness/gaslighting/dishonesty. I spend less time authoring changes but a lot more time reviewing and validating changes. And that is using the best models (Opus 4.6/Codex 5.3), the OSS/flash models are still quite unreliable at solving problems.

Token costs are also non-trivial. Claude can exhaust a $20/month session limit with one difficult problem (didn't even write code, just planned). Each engineer needs at least the $200/mo plan - I have multiple plans from multiple providers.

deadbabe - 2 hours ago

There have been some back of the napkin estimates on what AI could cost from the major platforms once no longer subsidized. It does not look good, as there is a minimum of a 12x increase in costs.

Local or self hosted LLMs will ultimately be the future. Start learning how to build up your own AI stack and use it day to day. Hopefully hardware catches up so eventually running LLMs on device is the norm.

christkv - 2 hours ago

My bet is that the amount of compute needed per generated token will decrease over time, and that models will get smaller for the same performance as we learn to optimize them, so both cost and hardware requirements will go down.

clockworkhavoc - 2 hours ago

Martin Fowler, longtime associate of the sanctioned fugitive and CCP-backed funder of domestic terrorism, Neville Roy Singham? https://oversight.house.gov/wp-content/uploads/2025/09/Lette...