Thousands of CEOs just admitted AI had no impact on employment or productivity
fortune.com | 326 points by virgildotcodes 5 hours ago
Just to be clear, the article is NOT criticizing this. To the contrary, it's presenting it as expected, thanks to Solow's productivity paradox [1].
Which is that information technology similarly (and seemingly shockingly) didn't produce any net economic gains in the 1970s or 1980s despite all the computerization. It wasn't until the mid-to-late 1990s that information technology finally started to show a clear benefit to the economy overall.
The reason is that investing in IT was very expensive, there were lots of wasted efforts, and it took a long time for the benefits to outweigh the costs across the entire economy.
And so we should expect AI to look the same -- it's helping lots of people, but it's also costing an extraordinary amount of money, and the people it's helping are currently at least outweighed by the people wasting time with it and by its expense. But we should recognize that it's very early days, and that productivity will rise with time, and costs will come down, as we learn to integrate it with best practices.
The comparison seems flawed in terms of cost.
A Claude subscription is 20 bucks per worker if using personal accounts billed to the company, which is not very far from common office tools like Slack. Onboarding a worker to Claude or ChatGPT is ridiculously easy compared to teaching a 1970s manual office worker to use an early computer.
Larger implementations like automating customer service might be more costly, but I think there are enough short term supposed benefits that something should be showing there.
What if LLMs are optimizing the average office worker's productivity but the work itself simply has no discernible economic value? This is argued at length in Graeber's Bullshit Jobs essay and book.
This is an underrated take. If you make someone 3x faster at producing a report nobody reads, you've improved nothing. The real gains from AI show up when it changes what work gets done, not just how fast existing work happens. Most companies are still in the "do the same stuff but with AI" phase.
Not a phase, I’d argue that 90% of modern jobs are bullshit to keep cattle occupied and the economy rolling.
Your claim and the claims that all white collar jobs are going to disappear in 12-18 months cannot both be true. I guess we will see.
I find that highly unlikely; coding is AI's best-value use case by far. Right now office workers see marginal benefits but it's not like it's an order of magnitude difference. AI drafts an email, you have to check and edit it, then send it. In many cases it's a toss-up whether that actually saved time, and if it did, it's not like the pace of work is breakneck anyway, so the benefit is that some office workers have a bit more idle time at the desk, because you always hit some wall that's out of your control. Maybe AI saves you a Google search or a doc lookup here and there. You still need to check everything, and it can cause mistakes that take longer to fix too. Here's an example from today.
An assistant is dispatching a courier to get medical records. AI autocompletes to include the address. Normally they wouldn't put the address, since the courier knows who we work with, but AI added it so why not. Except it's the wrong address because it's for a different doctor with the same name. At least they knew to verify it, but mistakes like this happening at scale make the other time savings pretty close to a wash.
Not all code generates economic value. See Slack's, Jira's, etc. constant UI updates.
LLMs might not save time but they certainly increase quality for at least some office work. I frequently use it to check my work before sending to colleagues or customers and it occasionally catches gaps or errors in my writing.
But that idealized example could also be offset by another employee who doubles their own output by churning out lower-quality unreviewed workslop all day without checking anything, while wasting other people's time.
> but the work itself simply has no discernible economic value? This is argued at length in Graeber's Bullshit Jobs essay and book.
That book was very different from what I expected from all of the internet comment takes about it. The premise was really thin and didn't actually support the idea that the jobs don't generate value. It was comparing to a hypothetical world where everything is perfectly organized, everyone is perfectly behaved, everything is perfectly ordered, and therefore we don't have to have certain jobs that only exist to counter other imperfect things in society.
He couldn't even keep that straight, though. There's a part where he argues that open source work is valuable but corporate programmers are doing bullshit work that isn't socially productive because they're connecting disparate things together with glue code? It didn't make sense and you could see that he didn't really understand software, other than how he imagined it fitting into his idealized world where everything anarchist and open source is good and everything corporate and capitalist is bad. Once you see how little he understands about a topic you're familiar with, it's hard to unsee it in his discussions of everything else.
That said, he still wasn't arguing that the work didn't generate economic value. Jobs that don't provide value for a company are cut, eventually. They exist because the company gets more benefit out of the job existing than it costs to employ those people. The "bullshit jobs" idea was more about feelings and notions of societal impact than economic value.
> They exist because the company gets more benefit out of the job existing than it costs to employ those people.
Not necessarily, I’ve seen a lot of jobs that were just flying under the radar. Sort of like a cockroach that skitters when light is on but roams freely in the dark.
Hmmm, I got something different. I thought that Bullshit Jobs was based on people who self-reported that their jobs were pointless. He detailed these types of jobs, the negative psychological impact this can have on employees, and the kicker was that these jobs don't make sense economically (the bureaucratization of the health care and education sectors, for example), in contrast to so many other professions that actually are useful. Other examples were status-symbol employees, sycophants, duct-tapers, etc.
I thought he made a case for both societal and economic impact.
> It was comparing to a hypothetical world where everything is perfectly organized, everyone is perfectly behaved, everything is perfectly ordered, and therefore we don't have to have certain jobs that only exist to counter other imperfect things in society.
> Jobs that don't provide value for a company are cut, eventually.
Uhm, seems like Graeber is not the only one drawing conclusions from a hypothetical perfect world.
Graeber’s best book is his ethnography “Lost People” and it’s one of his least read works. Bullshit Jobs was never intended to be read as seriously as it is criticized.
Honestly this is how every critique of Graeber goes in my experience: As soon as his works are discussed beyond surface level, the goalposts start zooming around so fast that nothing productive can be discussed.
I tried to respond to the specific conversation about Bullshit Jobs above. In my experience, the way this book is brought up so frequently in online conversations is used as a prop for whatever the commenter wants it to mean, not what the book actually says.
I think Graeber did a fantastic job of picking "bullshit jobs" as a topic because it sounds like something that everyone implicitly understands, but how it's used in conversation and how Graeber actually wrote about the topic are basically two different things
I think it’s more likely that the same amount of work is getting done, just it’s far less taxing. And that averages are funny things, for developers it’s undeniably a huge boost, but for others it’s creating friction.
Bullshit Jobs is one of those "just so" stories that seems truthy but doesn't stand up to any critical evaluation. Companies are obviously not hesitant to lay off unproductive workers. While in large enterprises there is some level of empire building where managers hire more workers than necessary just to inflate their own importance, in the long run those businesses fall to leaner competitors.
> Companies are obviously not hesitant to lay off unproductive workers.
Companies are obviously not hesitant to lay off anyone, especially for cost saving. It is interesting how you think that people are laid off because they’re unproductive.
> in the long run those businesses fall to leaner competitors
This is not true at all. You can find plenty of examples going either way, but it’s far from being a universal reality.
It's only after decades of experience and hindsight that you realize that a lot of the important work we spend our time on has extremely limited long-term value.
Maybe you're lucky enough to be doing cutting edge research or do something that really seriously impacts human beings, but I've done plenty of "mission critical right fucking now" work that a week from now (or even hours from now, when I worked for a content marketing business) is beyond irrelevant. It's an amazing thing watching marketing types set money on fire burning super expensive developer time (but salaried, so they discount the cost to zero) just to make their campaigns like 2-3% more efficient.
I've intentionally sat on plenty of projects that somebody was pushing really hard for because they thought it was the absolute right necessary thing at the time and the stakeholder realized was pointless/worthless after a good long shit and shower. This one move has saved literally man years of work to be done and IMO is the #1 most important skill people need to learn ("when to just do nothing").
And that book sort of vaguely hints around at all these jobs that are surely bullshit but won’t identify them concretely.
Not recognizing the essential role of sales seemed to be a common mistake.
What counts as “concretely”? And I don’t recall it calling sales bullshit.
It identified advertising as part of the category that it classed as heavily-bullshit-jobs for reason of being zero-sum: your competitor spends more, so you spend more to avoid falling behind, a standard Red Queen’s race. (Another in this category was the military, which is kinda the classic case of this; see also the Missile Gap, the dreadnought arms race, etc.) But not sales, IIRC.
> And I don’t recall it calling sales bullshit.
It says stuff like why can’t a customer just order from an online form? The employee who helps them doesn’t do anything except make them feel better. Must be a bullshit job. It talks specifically about my employees filling internal roles like this.
> advertising
I understand the arms race argument, but it’s really hard to see what an alternative looks like. People can spend money to make you more aware of something. You can limit some modes, but that kind of just exists.
I don’t see how they aren’t performing an important function.
How does that make advertising a bullshit job? The only way advertising won't exist or won't be needed is when humanity becomes a hive mind and removes all competition.
Best product should be picked according to requirements by LLM without bullshit advertising.
The parts that are only done to maintain status quo with a competitor aren’t productive, and that’s quite a bit of it. Two (or more) sides spend money, nothing changes. No good is produced. The whole exercise is basically an accident.
Like when a competing country builds their tenth battleship, so you commission another one to match them. The world would have been better if neither had been built. Money changed hands (one supposes) but the aim of the whole exercise had no effect. It was similar to paying people to dig holes and fill them back in again, to the tune of serious money. This was so utterly stupid and wasteful that there was a whole treaty about it, to try to prevent so many bullshit jobs from being created again.
Or when Pepsi increases their ad spending in Brazil, so Coca Cola counters, and much of the money ends up accomplishing little except keeping things just how they were. That component or quality of the ad industry, the book claims, is bullshit, on account of not doing any good.
The book treats of several ways in which a job might be bullshit, and just kinda mentions this one as an aside: the zero-sum activity. It mostly covers other sorts, but this is the closest I can recall it coming to declaring sales “bullshit” (the book rarely, bordering on never, paints even most of an entire industry or field as bullshit, and advertising isn’t sales, but it’s as close as it got, as I recall)
Would hardly drag Graeber into this, there's a laundry list of issues with his research.
Most "Bullshit Jobs" can already be automated, but can isn't always should or will. Graeber is a capex thinker in an opex world.
The thesis of Bullshit Jobs is almost universally rejected by economists, FYI. There’s not much of value to obtain from the book.
How viable are the $20/month subscriptions for actual work and are they loss making for Anthropic? I've heard both of people needing to get higher tiers to get anything done in Claude Code and also that the subscriptions are (heavily?) subsidized by Anthropic, so the "just another $20 SaaS" argument doesn't sound too good.
I am confident that Anthropic make more revenue from that $20 than the electricity and server costs needed to serve that customer.
Claude Code has rate limits for a reason: I expect they are carefully designed to ensure that the average user doesn't end up losing Anthropic money, and that even extreme heavy users don't cause big enough losses for it to be a problem.
Everything I've heard makes me believe the margins on inference are quite high. The AI labs lose money because of the R&D and training costs, not because they're giving electricity and server operational costs away for free.
I always assumed that with inference being so cheap, my subscription fees were paying for training costs, not inference.
Anthropic and OpenAI are both well documented as losing billions of dollars a year because their revenue doesn't cover their R&D and training costs, but that doesn't mean their revenue doesn't cover their inference costs.
Does it matter if they can't ever stop training though? Like, this argument usually seems to imply that training is a one-off, not an ongoing process. I could save a lot of money if I stopped eating, but it'd be a short lived experiment.
I'll be convinced they're actually making money when they stop asking for $30 billion funding rounds. None of that money is free! Whoever is giving them that money wants a return on their investment, somehow.
It matters because as long as they are selling inference for more than it costs to serve they have a potential path to profitability.
Training costs are fixed at whatever billions of dollars per year.
If inference is profitable they might conceivably make a profit if they can build a model that's good enough to sign up vast numbers of paying customers.
If they lose even more money on each new customer they don't have any path to profitability at all.
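The argument in this comment can be put as simple unit economics. The sketch below is only an illustration: every figure in it (the $8 and $25 per-user serving costs, the $5 billion training budget) is a made-up assumption, not a number reported by Anthropic or OpenAI, and the function names are hypothetical.

```python
import math


def annual_profit(customers, price, inference_cost, training_cost):
    """Yearly profit: per-customer monthly margin times 12, minus fixed training spend."""
    return customers * (price - inference_cost) * 12 - training_cost


def break_even_customers(price, inference_cost, training_cost):
    """Customers needed to cover training, or None if each customer loses money."""
    yearly_margin = (price - inference_cost) * 12
    if yearly_margin <= 0:
        return None  # negative margin: more customers only deepen the loss
    return math.ceil(training_cost / yearly_margin)


TRAINING = 5_000_000_000  # hypothetical fixed training/R&D cost per year, USD

# Case 1 (assumed): the $20 plan costs $8/month to serve, so a break-even point exists.
profitable_case = break_even_customers(20, 8, TRAINING)

# Case 2 (assumed): the same plan costs $25/month to serve, so there is no break-even point.
unprofitable_case = break_even_customers(20, 25, TRAINING)
```

Under the first set of assumptions, enough growth eventually pays for training. Under the second, `annual_profit` gets more negative with every added customer, which is the "no path to profitability" case.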
There's an argument to be made that a "return on investment by way of eliminating all workers" is a reasonable result for the capitalists.
Nobody questions that Anthropic makes revenue from a $20 subscription. The opposite would be very strange.
A lot of people believe that Anthropic lose money selling tokens to customers because they are subsidizing it for growth.
Yeah it's the caching that's doing the work for them though honestly. So many cached queries saving the GPUs from hard hits.
>make revenue from that $20 than the electricity and server costs needed to serve that customer
Seems like a pretty dumb take. It’s like saying it only takes $X in electricity and raw materials to produce a widget that I sell for $Y. Since $Y is bigger than $X, I’m making money! Just ignore that I have to pay people to work the lines. Ignore that I had to pay huge amounts to build the factory. Ignore every other cost.
They can’t just fire everyone and stop training new models.
Merely for the viability part: I use the $20/mo plan now, but only as a part-time independent dev. I will hit rate-limits with Opus on any moderately complex app.
If I am on a roll, I will flip on Extra Usage. I prototyped a fully functional and useful niche app in ~6 total hours and $20 of extra usage, and it's solid enough and proved enough value to continue investing in and eventually ship to the App store.
Without Claude I likely wouldn't have gotten to the finished prototype version to use in the real world.
For indie devs, I think LLMs are a new source of solutions. This app is too niche to justify building and marketing without LLM assistance. It likely won't earn more than $25k/year but good enough!
I don't think the assumption that Anthropic is losing money on subscriptions holds up. I think each additional customer provides more revenue than the cost to run their inference, on average.
For people doing work with LLMs as an assistant for codebase searching, reviews, double checks, and things like that the $20/month plan is more than fine. The closer you get to vibecoding and trying to get the LLM to do all the work, the more you need the $100 and $200 plans.
On the ChatGPT side, the $20/month subscription plan for GPT Codex feels extremely generous right now. I tried getting to the end of my window usage limit one day and could not.
> so the "just another $20 SaaS" argument doesn't sound too good
Having seen several companies' SaaS bills, even $100/month or $200/month for developers would barely change anything.
I'd guess the $200 subscription is sufficient per person.
But at that point you could go for a bigger one and split it amongst headcount.
$20 is not useable, need $100 plan at least for development purposes. That is a lot of money for some countries. In my country, that can be 1/10 of their monthly salary. Hard to get approval on it. It is still too expensive right now.
Agreed.
We do have a way to see the financial impact - just add up Anthropic's and OpenAI's reported revenues -> something like $30b in annual run rate. Given the (stratospheric) growth rates, it seems reasonable to conclude informed buyers see economic and/or strategic benefit in excess of their spend. I certainly do!
That puts the benefits to the economy at just around where Mastercard's benefits are, on a dollar basis. But with a lot more growth. Add something in there for MS and GOOG, and we're probably at least another $5b up. There are only like 30 US companies with > $100bn in revenues; at current growth rates, we'll see combined revenues in this range in a year.
All this is sort of peanuts though against 29 trillion GDP, 0.3%. Well not peanuts, it's boosting the US GDP by 10% of its historical growth rate, but the bull case from singularity folks is like 10%+ GDP growth; if we start seeing that, we'll know it.
All that said, there is real value being added to the economy today by these companies. And no doubt a lot of time and effort spent figuring out what the hell to do with it as well.
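As a rough check on the arithmetic in this comment, here is a minimal sketch using only the figures quoted in it ($30b run rate, about $5b more for MS and GOOG, $29 trillion GDP). The 3% "historical growth rate" is an assumption, since the comment does not state one, and comparing revenue with GDP is itself only a loose proxy. On these inputs the share comes out nearer 0.1% of GDP than the 0.3% quoted.

```python
def percent_of_gdp(revenue_usd, gdp_usd):
    """Revenue expressed as a percentage of GDP."""
    return 100.0 * revenue_usd / gdp_usd


def share_of_growth(revenue_usd, gdp_usd, growth_rate_percent):
    """Revenue as a fraction of one year's GDP growth at the given rate."""
    return revenue_usd / (gdp_usd * growth_rate_percent / 100.0)


GDP = 29e12                  # US GDP as quoted in the comment, USD
LABS = 30e9                  # combined annual run rate quoted for the two labs
WITH_BIG_TECH = LABS + 5e9   # plus the comment's rough add-on for MS and GOOG
ASSUMED_GROWTH = 3.0         # assumption: "historical" GDP growth, percent per year

labs_pct = percent_of_gdp(LABS, GDP)
total_pct = percent_of_gdp(WITH_BIG_TECH, GDP)
growth_share = share_of_growth(WITH_BIG_TECH, GDP, ASSUMED_GROWTH)

print(f"labs only:       {labs_pct:.2f}% of GDP")
print(f"with MS/GOOG:    {total_pct:.2f}% of GDP")
print(f"share of growth: {growth_share:.1%} of one year's growth")
```

Changing `ASSUMED_GROWTH` or the revenue inputs shows how sensitive the "share of growth" framing is to the figures chosen.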
Investors are optimistic, but what will this new tech be used for? Advertising? Propaganda? Surveillance? Drone strikes?
Does profitable always equal useful? Might other cultures justifiably think differently, like the Amish?
The Amish are skilled at getting cash from the “English” as they call non-Amish. I imagine they also think that the money they receive is roughly tied to value they create. I wasn’t talking valuations, just revenue - money that CFOs and individuals spent so far, and are planning on spending.
I also didn’t talk profitable. Upshot, though, I don’t think it’s just a US thing to say that when money exchanges hands, generally both parties feel they are better off, and therefore there is value implied in a transaction.
As to what it will be used for: yes.
You did specify revenue. The original comment mentioned benefits. I was thinking that the two are different.
>I think there are enough short term supposed benefits that something should be showing there.
As measured by whom? The same managers who demanded we all return to the office 5 days a week because the only way they can measure productivity is butts in seats?
If anything, the 'scariness' of an old computer probably protected the company in many ways. AI's approachability to the average office worker, specifically how it makes it seem like it's easy to deploy/run/triage enterprise software, will continue to pwn.
I've never looked at enterprise licensing, but regular license wise, a Claude subscription is actually $200 a month. I don't count the $20 or $100 tiers because they're too limited to be useful (especially professionally!)
It’s also pretty wild to me how people still don’t really even know how to use it.
On Hacker News, a very tech literate place, I see people thinking modern AI models can’t generate working code.
The other day in real life I was talking to a friend of mine about ChatGPT. They didn’t know you needed to turn on “thinking” to get higher quality results. This is a technical person who has worked at Amazon.
You can’t expect revolutionary impact while people are still learning how to even use the thing. We’re so early.
I don't think "results don't match promises" is the same as "not knowing how to use it". I've been using Claude and OpenAI's latest models for the past two weeks now (probably moving at about 1000 lines of code a day, which is what I can comfortably review), and it makes subtle hard-to-find mistakes all over the place. Or it just misunderstands well known design patterns, or does something bone headed. I'm fine with this! But that's because I'm asking it to write code that I could write myself, and I'm actually reading it. This whole "it can build a whole company for me and I don't even look at it!" is overhype.
Prompting LLMs for code simply takes more than a couple of weeks to learn.
It takes time to get an intuition for the kinds of problems they've seen in pre-training, what environments they faced in RL, and what kind of bizarre biases and blind spots they have. Learning to google was hard, learning to use other people's libraries was hard, and it's on par with those skills at least.
If there is a well-known design pattern you know, that's a great thing to shout out. Knowing what to add to the context takes time and taste. If you are asking for pieces so large that you can't trust them, ask for smaller pieces and their composition. It's a force multiplier, and your taste for abstractions as a programmer is one of the factors.
In early usenet/forum days, the XY problem described users asking for implementation details of their X solution to Y problem, rather than asking how to solve Y. In LLM prompting, people fall into the opposite. They have an X implementation they want to see, and rather than ask for it, they describe the Y problem and expect the LLM to arrive at the same X solution. Just ask for the implementation you want.
Asking bots to ask bots seems to be another skill as well.
You are assuming that we all work on the same tasks and should have exactly the same experience with it, which is of course far from the truth. It's probably best to start with that base assumption and work on the implications from there.
As for the last example, for all the money being spent on this area, if someone is expected to perform a workflow based on the kind of question they're supposed to ask, that's a failure in the packaging and discoverability aspect of the product, the leaky abstraction only helps some of us who know why it's there.
I’ve been helping normal people at work use AI and there are two groups that are really struggling:
1. People who only think of using AI in very specific scenarios. They don’t know when to use it outside of the obvious “to write code” situations, they don’t really use AI effectively, and they get deflated when AI outputs the occasional garbage. They think “isn’t AI supposed to be good at writing code?”
2. People who let AI do all the thinking. Sometimes they’ll use AI to do everything and you have to tell them to throw it all away because it makes no sense. These people also tend to dump analyses straight from AI into Slack because they lack the tools to verify if a given analysis is correct.
To be honest, I help them by teaching them fairly rigid workflows like “you can use AI if you are in this specific situation.” I think most people will only pick up tools effectively if there is a clear template. It’s basically on-the-job training.
A neighbour of mine has a PhD and is working in research at a hospital. He is super smart.
Last time he said: "yes yes I know about ChatGPT, but I do not use it at work or home."
Therefore, most people won't even know about Gemini, Grok or even Claude.
And it will get worse once the UX people get ahold of it.
You got that right... imagine AI making more keyboard shortcuts, "helping" wayland move off X more so, new window transitions, overhauling htmx... it'll be hell+ on earth.
In a WhatsApp group full of doctors, managers, journalists and engineers (including software), aged 30-60, I asked if anyone had heard of OpenClaw; only 3 people had heard of it, from influencers, and none had used it.
But from my social feed the impression was that it is taking over the world:)
I asked because I have been building something similar for some time and I thought it was over, since they were faster than me, but it appears there’s no real adoption yet. Maybe there will be some once they release it as part of ChatGPT, but even then it looks too early, as few people are actually using the more advanced tools.
It’s definitely at a very early stage. It appears that so far the mainstream success in AI is limited to slop generation, and even that is actually a small number of people generating huge amounts of slop.
> I asked if anyone heard of twitter vaporware and only 3 people heard of it from influencers, none used it.
Shocking results, I say!
No, these people ("managers, engineers" etc.) just do not work in tech & IT but in other fields, and they do not read tech news in your country etc.
Most people are just not "that deep in there" the way most people on HN are.
> I asked because I have been building something similar for some time and I thought it was over, since they were faster than me
If you have been working on a use case similar to OpenClaw for some time now, I'd actually say you are in a great position to start raising now.
Being first to market is not a significant moat in most cases. Few people want to invest in the first company in a category - it's too risky. If there are a couple of other early players then the risk profile has been reduced.
That said, you NEED to concentrate on GTM - technology is commodified, distribution is not.
> It appears that so far the mainstream success in AI is limited to slop generation and even that is actually small number of people generating huge amounts of slop
The growth of AI slop has been exponential, but the application of agents for domain-specific use cases has been decently successful.
The biggest reason you don't hear about it on HN is because domain-specific applications are not well known on HN, and most enterprises are not publicizing the fact that they are using these tools internally.
Furthermore, almost anyone who is shipping something with actual enterprise usage is under fairly onerous NDAs right now and every company has someone monitoring HN like a hawk.
> every company has someone monitoring HN like a hawk.
Monitoring specific user accounts or keywords? Is this typically done by a social media reputation management service?
Do you think that it is a good idea to release it first on iOS, and announce on HN and Product Hunt? How would you do it?
In my app, the tech is based on running agent-generated code on JavaScriptCore to do things like OpenClaw. I’m wrapping the JS engine with the missing functionality like networking, file access and database access, so I believe I will not have a problem releasing it on the Apple App Store, as I use their native stack. Then, since this stack is also open source, I’m making a version that will run on Linux. The idea is that users develop their solution on their device (iOS and Mac currently), see it working, and then deploy it on a server with a tap of a button, so it keeps running.
Who's your persona? How are you pricing and packaging? Who is your buyer? Are you D2C? Consumer? Replacing EAs? Replacing Project Managers? ...
You need to answer these questions in order to decide whether a Show HN makes sense versus a much more targeted launch.
If you do not know how to answer these questions you need to find a cofounder asap. Technology is commodified. GTM, sales, and packaging is what turns technology into products. Building and selling and fundraising as 1 person is a one-way ticket to burnout, which only makes you and your product less attractive.
I also highly recommend chatting with your network to understand common types of problems. Once you've identified a couple classes of problems and personas for whom your story resonates, then you can decide what approach to take.
Best of luck!
The persona is someone who knows what they are doing but needs something to actually automate their work routine. E.g. maybe it’s a crypto trader who makes decisions based on interpreting signals, so they can create a trading bot that executes their method. Maybe it’s someone in compliance who needs to automate some routine, like checking details further when certain conditions arise. Or maybe a social media manager who needs to moderate their channels.
Thanks for the advice! I’m at a stage where I want to have such a tool and see who else wants it. Not sure yet about its viability as a business and what the exact market is. Maybe I will find out by putting it into the wild, and that’s why I am considering releasing it as a mobile app first.
> On hacker news, a very tech literate place
I think this is the prior you should investigate. That may be what HN used to be. But it's been a long time since it has been an active reality. You can still see actual expert opinions on HN, but they are the minority more and more.
I think one longtime HN user (Karrot_Kream I think) pinpointed the change in HN discourse to sometime in mid 2022 to early 2023 when the rate of new users spiked to 40k per month and remained at that elevated rate.
From personal experience, I've also noticed that some of the most toxic discourse and responses I've received on this platform are overwhelmingly from post-2022 users.
> I see people thinking modern AI models can’t generate working code.
Really? Can you show any examples of someone claiming AI models cannot generate working code? I haven't seen anyone make that claim in years, even from the most skeptical critics.
Scroll up a few comments where someone said Claude is generating errors over and over again and that Claude can't work according to code guidelines etc :-))
I've seen it said plenty of times that the code might work eventually (after several cycles of prompting and testing), but even then the code you get might not be something you'd want to maintain, and it might contain bugs and security issues that don't (at least initially) seem to impact its ability to do whatever it was written to do but which could cause problems later.
And really the problem isn’t that it can’t make working code, the problem is that it’ll never get the kind of context that is in your brain.
I started working today on a project I hadn’t touched in a while but now needed to, as it was involved in an incident and I needed to address some shortcomings. I knew the fix I needed to make, but I went about my usual AI-assisted workflow because of course I’m lazy, and the last thing I want to do is interrupt my normal work to fix this stupid problem.
The AI doesn’t know anything about the full scope of all the things in my head about my company’s environment and the information I need to convey to it. I can give it a lot of instructions but it’s impossible to write out everything in my head across multiple systems.
The AI did write working code, but despite writing the code way faster than me, it made small but critical mistakes that I wouldn’t have made on my first draft.
For example, it just added in a command flag that I knew that it didn’t need, and it actually probably should have known it, too. Basically it changed a line of code that it didn’t need to touch.
It also didn’t realize that the curled URL was going to redirect so we needed an -L flag. Maybe it should have but my brain knew it already.
It also misinterpreted some changes in direction that a human never would have. It confused my local repository for the remote one because I originally thought I was going to set a mirror, but I changed plans and used a manual package upload to curl from. So it put the remote URL in some places where the local one should have been.
Finally, it seems to have just created some strange text gore while editing the readme where it deleted existing content for seemingly no reason other than some kind of readline snafu.
So yes it produced very fast great code that would have taken me way longer to do, but I had to go back and consume a very similar amount of time to fix so many things that I might as well have just done it manually.
But hey I’m glad my company is paying $XX/month for my lazy workday machine.
>>The AI doesn’t know anything about the full scope of all the things in my head about my company’s environment and the information I need to convey to it.<<
This is your problem: How should it know if you do not provide it?
Use Claude - in the pro version you can submit files for each project which are setting the context: This can be files, source code, SQL scripts, screenshots whatever - then the output will be based on your context given by providing these files.
For more on this exact topic and an answer to Solow’s Paradox, see the excellent The Dynamo and the Computer by Paul David [0].
[0]: https://www.almendron.com/tribuna/wp-content/uploads/2018/03...
Stanford prof rebuts David's idea[0] that it's difficult to extract productivity from the data
https://www.nber.org/system/files/working_papers/w25148/w251...
I don't agree that real GDP measures what he thinks it measures, but he opines
>Data released this week offers a striking corrective to the narrative that AI has yet to have an impact on the US economy as a whole. While initial reports suggested a year of steady labour expansion in the US, the new figures reveal that total payroll growth was revised downward by approximately 403,000 jobs. Crucially, this downward revision occurred while real GDP remained robust, including a 3.7 per cent growth rate in the fourth quarter. This decoupling — maintaining high output with significantly lower labour input — is the hallmark of productivity growth.
https://www.ft.com/content/4b51d0b4-bbfe-4f05-b50a-1d485d419...
[0] on the basis that IT and AI are not general technologies in the mold of the dynamo, keyword "intangibles", see section 4 p21, A method to measure intangibles
Fwiw fortune had another article this week saying this J-curve of "General Technology" is showing up in the latest BLS data
https://fortune.com/2026/02/15/ai-productivity-liftoff-doubl...
Source of the Stanford-approved opinion: https://www.ft.com/content/4b51d0b4-bbfe-4f05-b50a-1d485d419...
> It wasn't until the mid-to-late 1990's that information technology finally started to show clear benefit to the economy overall.
The 1990s boom was in large part due to connectivity -- millions[1] of computers joined the Internet.
[1] In the 1990s. Today, there are billions of devices connected, most of them Android devices.
Wow I didn’t realize that. But I always thought it. I was bewildered that anyone got any real value out of any of that pre-VisiCalc (or even VisiCalc) computer tech for business. It all looked kinda clumsy.
The coding tools are not hard to pick up. Agent chat and autocomplete in IDEs are braindead simple, and even TUIs like Claude are extremely easy to pick up (I think it took me a day?) And despite what the vibers like to pretend, learning to prompt them isn't that hard either. Or, let me clarify, if you know how to code, and you know how you want something coded, prompting them isn't that hard. I can't imagine it'll take that long for an impact to be seen, if there is a major impact to be seen.
I think it's more likely that people "feel" more productive, and/or we're measuring bad things (lines of code is an awful way to measure productivity -- especially considering that these agents duplicate code all the time so bloat is a given unless you actively work to recombine things and create new abstractions)
It reminds me a lot of adderall's effect on people without ADHD. A pretty universal feeling that it's making you smarter, paired with no measurable increase in test scores.
That's a good analogy. I've never done stimulants, but from what I've heard about them they make people very active but that isn't the same as productive.
One part of the system moving fast doesn't change the speed of the system all that much.
The thing to note is, verifying if something got done is harder and takes time in the same ballpark as doing the work.
If people are serious about AI productivity, let's start by addressing how we can verify program correctness quickly. Everything else is just a Ferrari between two traffic red lights.
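For anyone who wants to put rough numbers on "one part of the system moving fast", Amdahl's law is the usual back-of-the-envelope tool: if a fraction p of the total time gets faster by a factor s, the overall speedup is 1 / ((1 - p) + p / s). A minimal sketch below; the function name and the shares in the loop are made-up illustrative values, not measurements from anyone's team.

```python
def overall_speedup(fraction_accelerated: float, local_speedup: float) -> float:
    """Amdahl's law: end-to-end speedup when only part of the work gets faster.

    fraction_accelerated: share of total time spent in the accelerated step (0..1)
    local_speedup: how many times faster that step becomes (> 0)
    """
    if not 0.0 <= fraction_accelerated <= 1.0:
        raise ValueError("fraction_accelerated must be between 0 and 1")
    if local_speedup <= 0:
        raise ValueError("local_speedup must be positive")
    # Time left over, as a fraction of the original: the untouched part
    # plus the accelerated part shrunk by the local speedup.
    remaining = (1.0 - fraction_accelerated) + fraction_accelerated / local_speedup
    return 1.0 / remaining


if __name__ == "__main__":
    # Hypothetical shares of the job that are "writing code"; purely illustrative.
    for share in (0.2, 0.3, 0.5):
        gain = overall_speedup(share, local_speedup=10.0)
        print(f"coding is {share:.0%} of the work, 10x faster coding -> {gain:.2f}x overall")
```

Even with an infinitely fast coding step, the overall gain is capped at 1 / (1 - p), which is the Ferrari-between-red-lights point: the reviews, verification and waiting still set the pace.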
Really? I disagree that verifying is as hard as doing the work yourself. It’s like P != NP.
productivity may rise with time, and costs may come down. The money is already spent
> And so we should expect AI to look the same
Is that a somewhat substantiated assumption? I recall learning at university in 2001 about the history of AI: the initial frameworks were written in the '70s, and the prediction was that we would reach human-like intelligence by 2000. Just because Sama came up with this somewhat breakthrough of an AI, it doesn't mean that equal leaps of improvement will be made on a monthly/annual basis going forward. We may well not make another huge leap, or reach what some call human-level intelligence, in 10 years or so.
> it's helping lots of people, but it's also costing an extraordinary amount of money
Is it fair to say that wall street is betting America's collective pensions on AI...
They're betting a lot more than that, but since all their chips are externalities they don't care.
Very few people have pensions anymore. People now direct their own retirement funds.
That's what he was saying. Wall Street (the stock market) are people's "pensions" now because everyone has a 401k or equivalent so their retirement is tied to the market. Thus, these companies are betting America's collective retirement on AI...
My experience has been
* If I don't know how to do something, llms can get me started really fast. Basically it distills the time taken to research something to a small amount.
* if I know something well, I find myself trying to guide the llm to make the best decisions. I haven't reached the state of completely letting go and trusting the llm yet, because the llm doesn't make good long term decisions
* when working alone, I see the biggest productivity boost in ai and where I can get things done.
* when working in a team, llms are not useful at all and can sometimes be a bottleneck. Not everyone uses llms the same, sharing context as a team is way harder than it should be. People don't want to collaborate. People can't communicate properly.
* so for me, solo engineers or really small teams benefit the most from llms. Larger teams and organizations will struggle because there's simply too much human overheads to overcome. This is currently matching what I'm seeing in posts these days
The future of work is fewer human team members and way more AI assistants.
My compsci brain suggests large orgs are a distributed system running on faulty hardware (humans) with high network latency (communication). The individual people (CPUs) are plenty fast, we just waste time in meetings, or waiting for approval, or a lot of tasks can't be parallelized, etc. Before upgrading, you need to know if you're I/O Bound vs CPU Bound.
When my company first started pushing for devs to use AI, the most senior guy on my team was pretty vocal about coding not being the bottleneck that slowed down work. It was an I/O issue, and maybe a caching issue as well from too many projects going at the same time with no focus… which also makes the I/O issues worse.
Ironically using AI on records of meetings across an org is amazing. If you can find out what everyone is talking about you can talk to them.
Privacy is non existent, every word said and message sent at the office is recorded but the benefits we saw were amazing.
Maybe experienced people are the L2 cache? And the challenge is to keep the cache fresh and not too deep. You want institutional memory available quickly (cache hit) to help with whatever your CPU people need at that instant. If you don't have a cache, you can still solve the problem, but oof, is it gonna take you a long time. OTOH, if you get bad data in the cache, that is not good, as everyone is going to be picking that out of the cache instead of really figuring out what to do.
L2? I'm hot L1 material, dude.
But I like your and OP's analogy. Also, the productivity claims are coming from the guys in main memory or even disk, far removed from where the crunching is taking place. At those latency magnitudes, even riding a turtle would appear like a huge productivity gain.
operationally, i think new startups have a big advantage in setting up to be agent-first, and they might not be as good as the old human-first stuff, but they'll be much cheaper and more nimble for model improvements
Startups mostly move fast by skipping the necessary ceremony that large corps have to do mandatorily to prevent a billion-dollar product from melting. It's possible for startups because they don't have a billion-dollar product to start with.
Once you do have a billion-dollar product, protecting it requires spending time, money and people to keep it running. Because building a new one is a lot more effort than protecting the existing one from melting.
Interesting analogy to explore a Distributed System as compared to Organizational Dynamics.
Then where are all the amazing open source programs written by individuals by themselves? Where are all the small businesses supposedly assisted by AI?
> 4% of GitHub public commits are being authored by Claude Code right now. At the current trajectory, we believe that Claude Code will be 20%+ of all daily commits by the end of 2026.
https://newsletter.semianalysis.com/p/claude-code-is-the-inf...
There’s lots of slop out there, that doesn’t mean it’s actually good or useful code.
Keep moving those goal posts.
Doesn’t look like goal-post moving to me. GP argued that AI isn’t making a difference, because if it was, we’d see amazing AI-generated open source projects. (Edit: taking a second look, that’s not exactly what GP said, but that’s what I took away from it. Obviously individuals create open source projects all the time.)
You rebutted by claiming 4% of open source contributions are AI generated.
GP countered (somewhat indirectly) by arguing that contributions don’t indicate quality, and thus wasn’t sufficient to qualify as “amazing AI-generated open source projects.”
Personally, I agree. The presence of AI contributions is not sufficient to demonstrate “amazing AI-generated open-source projects.” To demonstrate that, you’d need to point to specific projects that were largely generated by AI.
The only big AI-generated projects I’ve heard of are Steve Yegge’s GasTown and Beads, and by all accounts those are complete slop, to the point that Beads has a community dedicated to teaching people how to uninstall it. (Just hearsay. I haven’t looked into them myself.)
So at this point, I’d say the burden of proof is on you, as the original goalposts have not been met.
Edit: Or, at least, I don’t think 4% is enough to demonstrate the level of productivity GP was asking for.
It's not a great ask. Who's going to quantify what is 'amazing open source work'?
4% for a single tool used in a particular way (many are out there using AI tools in a way that doesn't make it clear the code was AI authored) is an incredible amount. Don't see how you can look at that and see 'not enough'.
The vast majority of people using these tools aren't announcing it to the world. Why would they ? They use it, it works and that's that.
It has been argued for a very long time, lines of code is largely meaningless as a metric. But now that AI is writing those lines... it seems to be meaningful again? I continue to be optimistically skeptical.
> where are all the amazing open source programs
> amazing
Nobody moved the goal posts.
They didn’t, amazing open source was asked for, meaningless stats were given. Not that GitHub public repositories were amazing before AI, but nothing has changed since, except AI slop being a new category.
I deliberately asked for amazing open source projects. I’ve yet to see a single AI coded project i would use.
Keep licking those boots.
Here are a few of mine from the past month - for all of them 90%+ of the code written by Claude Code:
- https://github.com/simonw/sqlite-history-json
- https://github.com/simonw/sqlite-ast
- https://github.com/simonw/showboat
- https://github.com/simonw/datasette-showboat
You could have easily made the same point and just not included the last sentence. Guidelines and all that
Seemingly every day on Show HN?
Also small businesses aren't going to publish blog posts saying "we saved $500 on graphic design this week!"
Is saving $500 by generating some shitty AI art the bar? I thought this was supposed to replace entire departments
Someone asked “where are all the small businesses”, this was a reply to that. Small businesses don’t have entire art departments.
Gotcha, so the impact of AI is small businesses get to save a couple hundred dollars and the cost is only 2% of your country's GDP. That’s good.
Prior to industrialization if you wanted to paint something you had to know how to mix your own paints.
And make your own brushes.
Before the printing presses came along, putting up flyers was not even imaginable.
Signs for businesses used to be hand carved.
Then printed. A store sign was still produced by a team of professionals, but small businesses could reasonably afford to print a sign. Not often updated, but it existed.
Then desktop publishing took off. Now lone graphic designers could design and send work off to a print shop. Small businesses could now afford regularly updated menus, signage, and even adverts and flyers.
Now small businesses can make their own creatives. AI can change stylesheets, write ad copy, and generate promotional photos.
Does any of this have the artistry of hand carved signs from 600 years ago? Of course not.
But the point is technology gives individuals control.
None of this is even slightly correct lol
People have been painting with red and yellow ochre and soot for at least 50K years for sure, and probably several hundred thousand years in truth. You don't need a brush, you have fingers or a twig.
The walls on the streets of Pompeii are full of advertising -- they had an election going on and people just scribbled slogans and such on walls. You don't need flyers lol.
The idea that signs or advertising was "artistry" is deeply ahistorical. The reason old stuff looks real fancy is because labor was extremely cheap and materials were expensive.
> People have been painting with red and yellow ochre and soot for at least 50K years for sure,
Compare those to the pigments used (mixed up!) by professional painters, and then to what printers could make.
If you wanted to paint fine art in the 1400s you were possibly making your own canvases, your own paint brushes, and your own paints.
And on top of that you had to be a skilled painter!
> The walls on the streets of Pompeii are full of advertising -- they had an election going on and people just scribbled slogans and such on walls. You don't need flyers lol.
The American Revolution included a lot of propaganda courtesy of printing presses and some very rich financiers who had a vested interest in a revolution occurring.
Pamphlets everywhere. It is one thing to scribble on a wall, it is another to produce messages at a mass scale.
That sense of scale has been multiplied yet again by AI.
No, that's just the impact that you're not going to hear in the news ("Small business saves a couple of hundred dollars" is not a good headline). But that's not the only "impact of AI". The bigger impacts are reflected in the news and the stock market almost on a daily basis over the last two years.
Couple hundred dollars
..a month
..multiplied by how many small businesses globally?
I think both. Most organizations lack someone like Steve Jobs to prime their product lines. Microsoft is a good example, where you see their products over the years are mostly meh. Then meetings are pervasive, and even more so in most companies due to the convenience of MS Teams. But currently they face reduced demand due to a softer market compared to 2-3 years ago. If you observe no effect while they lay off many people and revenue still holds, or at least shows no negative growth, I would surmise that AI is helping. But in corporate, it only counts if it directly contributes to sales numbers.
There was a recent post where someone said AI allows them to start and finish projects. And I find that exactly true. AI agents are helpful for starting proof of concepts. And for doing finishing fixes to an established codebase. For a lot of the work in the middle, it can still be useful, but the developer is more important there.
The thing with a lot of white collar work is that the thinking/talking is often the majority of the work… unlike coding, where thinking is (or, used to be, pre-agent) a smaller percentage of the time consumed. Writing the software, which is essentially working through how to implement the thought, used to take a much larger percentage of the overall time consumed from thought to completion.
Other white collar business/bullshit job (ala Graeber) work is meeting with people, “aligning expectations”, getting consensus, making slides/decks to communicate those thoughts, thinking about market positioning, etc.
Maybe tools like Cowork can help to find files, identify tickets, pull in information, write Excel formulas, etc.
What’s different about coding is no one actually cares about code as output from a business standpoint. The code is the end destination for decided business processes. I think, for that reason, that code is uniquely well adapted to LLM takeover.
But I’m not so sure about other white-collar jobs. If anything, AI tooling just makes everyone move faster. But an LLM automating a new feature release and drafting a press release and hopping on a sales call to sell the product is (IMO) further off than turning a detailed prompt into a fully functional codebase autonomously.
I’m confused what kind of software engineer jobs there are that don’t involve meeting with people, “aligning expectations”, getting consensus, making slides/decks to communicate that, thinking about market positioning, etc?
If you weren’t doing much of that before, I struggle to think of how you were doing much engineering at all, save some more niche extremely technical roles where many of those questions were already answered, but even still, I should expect you’re having those kinds of discussions, just more efficiently and with other engineers.
> I’m confused what kind of software engineer jobs there are that don’t involve meeting with people, “aligning expectations”, getting consensus, making slides/decks to communicate that, thinking about market positioning, etc?
The vast majority of software engineers in the world. The most widespread management culture is that where a team's manager is the interface towards the rest of the organization and the engineers themselves don't do any alignment/consensus/business thinking, which is the manager's exclusive job.
I used to work like that and I loved it. My managers were decent and they allowed me to focus on my technical skills. Then, due to those technical skills I'd acquired, I somehow got hired at Google, stayed there nearly a decade but hated all the OKR crap, perf and the continuous self-promotion I was obliged to do.
In my case
* meeting with people, yes, on calls, on chats, sometimes even on phone
* “aligning expectations”, yes, because of the next point
* getting consensus, yes, inevitably or how else do we decide what to do and how to do it?
* making slides/decks to communicate that, not anymore, but this is a specific tool of the job, like programming in Java vs in Python.
* thinking about market positioning, no, but this is what only a few people in an organization have agency on.
* etc? Yes, for example don't piss off other people, help customers using the product, identify new functionalities that could help us deliver a better product, prioritize them and then back to getting consensus.
> I’m confused what kind of software engineer jobs there are that don’t involve meeting with people, “aligning expectations”, getting consensus, making slides/decks to communicate that, thinking about market positioning, etc?
I'd suspect the kind that's going away.
That kind was already reserved for junior roles, contractors, and off shoring.
I’m not sure everyone would agree with that statement. As a more senior engineer at a big tech company, our execs still believe more code output is expected by level. Hell they even measure and rate you on lines of code deltas.
I don’t agree with it or believe it’s smart but it’s the world we live in
Ime a team or project lead does that and the rest of the engineers maybe do that on a smaller scale but mostly implement.
> The thing with a lot of white collar work is that the thinking/talking is often the majority of the work… unlike coding, where thinking is (or, used to be, pre-agent) a smaller percentage of the time consumed.
WHOAH WHOAH WHOAH WHOAH STOP. No coder I've ever met has thought that thinking was anything other than the BIGGEST allocation of time when coding. Nobody is putting their typing words-per-minute on their resume because typing has never been the problem.
I'm absolutely baffled that you think the job that requires some of the most thinking, by far, is somehow less cognitively intense than sending emails and making slide decks.
I honestly think a project manager's job is actually a lot easier to automate, if you're going to go there (not that I'm hoping for anyone's job to be automated away). It's a lot easier for an engineer to learn the industry and business than it is for a project manager to learn how to keep their vibe code from spilling private keys all over the internet.
> making slides/decks to communicate those thoughts,
That use case is definitely delegated to LLMs by many people. That said, I don't think it translates into linear productivity gains. Most white collar work isn't so fast-paced that if you save an hour making slides, you're going to reap some big productivity benefit. What are you going to do, make five more decks about the same thing? Respond to every email twice? Or just pat yourself on the back and browse Reddit for a while?
It doesn't help that these LLM-generated slides probably contain inaccuracies or other weirdness that someone else will need to fix down the line, so your gains are another person's loss.
Yeah, but this is self-correcting. Eventually it will get to a point where the data that you use to prompt the LLM will have more signal than the LLM output.
But if you get deep into an enterprise, you'll find there are so many irreducible complexities (as Stephen Wolfram might coin them), that you really need a fully agentically empowered worker — meaning a human — to make progress. AI is not there yet.
Thinking is always the hardest part and the bottleneck for me.
It doesn’t capture everyone’s experience when you say thinking is the smaller part of programming.
I don’t even believe a regular person is capable of producing good quality code without thinking 2x the amount they are coding
Agree. I remember in school in the 1980s reading that a good programmer can write about 10 lines of code a day (citing The Mythical Man-Month) and I thought "that's ridiculous, I can write hundreds of lines a day" but didn't understand that's including all the time understanding requirements, thinking about design, testing, debugging, etc. Writing the code is a small portion of what a software engineer does.
Also remember that programs were much smaller, code had to be typed in full and read accurately because compilers were slow and you didn't want to waste time for a syntax error. Anyway it's common even today to work half a day thinking, debugging, testing and eventually git diff shows only two changed lines.
Most people (and most businesses) aren’t making good quality code though. Most tools we use have horrible codebases. Therefore now the code can often be a similar quality to before, just done far faster.
> unlike coding, where thinking is (or, used to be, pre-agent) a smaller percentage of the time consumed. Writing the software, which is essentially working through how to implement the thought, used to take a much larger percentage of the overall time consumed from thought to completion.
huh? maybe im in the minority, but the thinking:coding ratio has always been 80:20. spend a ton of time thinking and drawing, then write once and debug a bit, and it works
this hasn't really changed with LLM coding either, except that for the same amount of thinking, you get more code output
Yeah, ratios vary depending on how productive you are with code. For me it was 50:50 and is now 80:20, but only because I was a relatively unproductive coder (struggled with language feature memorization, etc.) and a much more productive thinker/architect.
"Struggling with language feature memorization" is what we call "unemployed", not "relatively unproductive".
when the work involves navigating a bunch of rules with very ambiguous syntax, AI will automate it to the same extent that computers automated rules-based systems with very precise syntax in the 1990s
this software (which i am not related to or promoting) is better at investment planning and tax planning than over 90% of RIAs in the US. It will automate RIA to the point that trading software automated stock broking. This will reduce the average RIA fee from 1% per year to 0.20% or even 0.10% per year just like mutual fund fees dropped in the early 00s
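To get a feel for what a drop from 1% to 0.20% per year means for a client, here is a rough sketch of fee drag under compounding. Only the two fee levels come from the comment above; the starting balance, the 7% gross return and the 30-year horizon are assumptions picked for illustration, and the fee is modeled (as a simplification) as a flat percentage taken from the year-end balance.

```python
def final_balance(principal: float, gross_return: float, annual_fee: float, years: int) -> float:
    """Grow a balance for `years`, taking a percentage fee off the year-end balance.

    gross_return and annual_fee are fractions, e.g. 0.07 for 7% and 0.01 for 1%.
    """
    balance = principal
    for _ in range(years):
        balance *= 1.0 + gross_return   # market return for the year (assumed constant)
        balance *= 1.0 - annual_fee     # advisory fee charged on the resulting balance
    return balance


if __name__ == "__main__":
    # Hypothetical inputs, not data: $100k, 7% gross return, 30 years.
    principal, gross, years = 100_000.0, 0.07, 30
    for fee in (0.01, 0.002, 0.001):
        end = final_balance(principal, gross, fee, years)
        print(f"fee {fee:.2%}: balance after {years} years = ${end:,.0f}")
```

The gap between the fee levels widens every year because the fee compounds along with the returns, which is why a fraction of a percent matters over a retirement horizon.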
You could have beaten the returns of most financial professionals over the last several years by just parking your money in the S&P 500, and yet plenty of people are still making a lucrative career out of underperforming it. In some fields, “being better and cheaper” does not always spell victory.
you are right on beating money managers. when I said investment planning, I meant planning the size and tax structures for investments. this software automates all of the technical work that goes on inside financial planning firms, which is done by tens of thousands of white collar professionals in US/UK/EU, etc. it will then lead to price competitiveness.
more expensive silly companies will exist, but the cheap ones get the scale. S&P 500 index funds have over 1 trillion in the top 3 providers. Cathie Wood has like 6-7 billion.
BNYMellon is the custodian of $50 trillion of investment assets. robinhood has $324bn.
silly companies get the headlines though
The slow part as a senior engineer has never been actually writing the code. It has been:
- reviews for code
- asking stakeholders opinions
- SDLC latency (things taking forever to test)
- tickets
- documentations/diagrams
- presentations
Many of these require review. The review hell doesn't magically stop at open source projects. These things happen internally too.
My company’s behind the curve, just got nudged today that I should make sure my AI use numbers aren’t low enough to stand out or I may have a bad time. Reckon we’re minimum six months from “oh whoops that was a waste of money”, maybe even a year. (Unless the AI market very publicly crashes first)
So management basically have no clue and want you to figure out how to use AI?
Do they also make you write your own performance review and set your own objectives?
> So management basically have no clue and want you to figure out how to use AI?
This is basically the same story I have heard both at my own place of employment and also from a number of friends. There is a "need" for AI usage, even if the value proposition is undefined (or, as I would expect, non-existent) for most businesses.
> Do they also make you write your own performance review and set your own objectives?
Not to get off on a tangent but this has got to be a "tell" for how much a company is managed by formula and how much it's actually got thinking people running things. Every time I've had to write my own review I fill out the form with some corporatese bullshit, my supervisor approves it and adds some more bullshit, it disappears into HR and I never hear anything about it until it's time for the next review, and it starts over again. There isn't even reference to any of my "objectives" from the last review, because that review has simply disappeared.
But I'm sure some HR exec is checking boxes for following "best practices" in employee evaluation.
Management probably also wants them to figure out how to use the laptops, ide and other resources provided to them. Getting a tool for your employees that you've been told is important but have no idea what to do with is a perfectly valid management task.
Look, to make something productive out of it: a job seeker who has high level skills using LLM assistance will be much more valuable than one without the experience. Never mind your current company management's policies.
Original paper https://www.nber.org/system/files/working_papers/w34836/w348...
Figure A6 on page 45: Current and expected AI adoption by industry
Figure A11 on page 51: Realised and expected impacts of AI on employment by industry
Figure A12 on page 52: Realised and expected impacts of AI on productivity by industry
These seem to roughly line up with my expectations that the more customer facing or physical product your industry is, the lower the usage and impact of AI. (construction, retail)
A little bit surprising is "Accom & Food" being 4th highest for productivity impact in A12. I wonder how they are using it.
If you include Microsoft Copilot trials in Fortune 500s, absolutely. A lot of major listed companies are still oblivious to the functionality of AI; their senior management don't even use it, out of laziness
There's a lot of rote work in software development that's well-suited to LLM automation, but I think a lot of us overestimate the actual usefulness of a chatbot to the average white-collar worker. What's the point of making Copilot compose an email when your prompt would be longer than the email itself? You can tell ChatGPT to make you a slide deck, but slide decks are already super simple to make. You can use an LLM as a search engine, but we already have search engines. People sometimes talk about using a chatbot to brainstorm, but that seems redundant when you could simply think, free from the burden of explaining yourself to a chatbot.
LLMs are impressive and flexible tools, but people expect them to be transformative, and they're only transformative in narrow ways. The places they shine are quite low-level: transcription, translation, image recognition, search, solving clearly specified problems using well-known APIs, etc. There's value in these, but I'm not seeing the sort of universal accelerant that some people are anticipating.
it turns out it's really hard to get a man to fish with a pole when you don't teach them how to use the reel
If AGI is coming, won't there just be autofishers and no one will ever have to fish again, completely devaluing one's fishing knowledge and the effort put in to learn it?
It’s not a great analogy but...
“Autofishers” are large boats with nets that bring in fish in vast quantities that you then buy at a wholesale market, a supermarket a bit later, or they flash freeze and sell it to you over the next 6-9 months.
Yet there’s still a thriving industry selling fishing gear. Because people like to fish. And because you can rarely buy fish as fresh as what you catch yourself.
Again, it’s not a great analogy, but I dunno. I doubt AGI, if it does come, will end up working the way people think it will.
In regards to copilot, they’ve also been led on a fishing expedition to the middle of a desert
Or give them a stick with twine and a plastic fork as a hook, as is the case with Copilot.
100%. All of the people who are floored by AI capabilities right now are software engineers, and everyone who's extremely skeptical basically has any other office job. On investigating their primary AI interaction surface, it's Microsoft Copilot, which has to be the absolute shittiest implementation of any AI system so far. As a progress-driven person, it's just super disappointing to see how few people are benefiting from the productivity gains of these systems.
I'm a SWE who's been using coding agents daily for the last 6 months and I'm still skeptical.
For my team at least, the productivity boost is difficult to quantify objectively. Our products and services still have tons of issues that AI isn't going to solve magically.
It's pretty clear that AI is allowing us to move faster for some tasks, but it's also detrimental for other things. We're going to learn how to use these tools more efficiently, but right now, I'm not convinced about the productivity gain.
> I'm a SWE who's been using coding agents daily for the last 6 months and I'm still skeptical.
What improvements have you noticed over that time?
It seems like the models coming out in the last several weeks are dramatically superior to those mid-last year. Does that match your experience?
Not the grandparent, but I've used most of the OpenAI models that have been released in the last year. Out of all of them, o3 was the best at the programming tasks I do. I liked it a lot more than I like GPT 5.2 Thinking/Pro. Overall, I'm not at all convinced that models are making forward progress in general.
Is your backlog and/or your velocity increasing, decreasing, or the same? That's really the ultimate question.
In a team of one at work I see clear benefits, but having worked in many different team sizes for most of my career I can see how it quickly would go down, especially if you care about quality. And even with the latest models it’s a constant battle against legacy training data, which has gotten worse over time. “I have to spend 45 minutes explaining why a one minute AI generated PR is bad code” was how an old colleague summarized it.
I think Anthropic will succeed immensely here because when integrated with Microsoft 365 and especially Excel it basically does what Copilot said it would do.
The moment of realisation happens for a lot of normoid business people when they see Claude make a DCF spreadsheet or search emails.
Claude is also smart because it visually shows the user as it resizes the columns, changes colours, etc. Seeing the computer do things makes the normoid SEE the AI, despite it being much slower.
Hilarious lack of self-awareness. Calling others "normoids" yet you believe you can empathise with them enough to predict how they will adopt AI?
No one wants a chatbot “integrated” with Excel and Office 365 crap, it’s Clippy 2.0 bullshit.
Replace Excel and Office stuff with an AI model entirely, then people will pay attention.
that only works if you can oneshot. but nobody can oneshot.
iterating over work in excel and seeing it update correctly is exactly what people want. If they get it working in MSWord it will pick up even faster.
If the average office worker can get the benefit of AI by installing an add-on into the same office software they have been using since 2000 (the entire professional career of anyone under the age of 45), then they will do so. It's also really easy to sell to companies because they don't have to redesign their teams or software stack, or even train people that much. The board can easily agree to budget $20 a head for Claude Pro.
the other thing normies like is they can put in huge legacy spreadsheets and find all the errors
Microsoft365 has 400 million paid seats
IMO Copilot was "we need to give these people rope, but not enough for them to hang themselves". A non technical person with no patience and access to a real AI agent inside a business is a bull in a china shop. Copilot Cowork is the closest thing we have to what Copilot should have been and is only possible now because models finally got good enough to be less supervised.
FWIW Gemini inside Google apps is just as bad.
This isn't my experience. I see many non-software people using AI regularly. What you may be seeing is more: organizations with no incentive to do things better never did anything to do things better. AI is no different. They were never doing things better with pencil and paper.
It’s simple calculus for business leaders: admit they’re laying off workers because the fundamentals are bad and spook investors, admit they’re laying off workers because the economy is bad and anger the administration, or just say it’s AI making roles unnecessary and hope for the best.
I read an article in FT just a couple days ago claiming that increased productivity was becoming visible in economic data
> My own updated analysis suggests a US productivity increase of roughly 2.7 per cent for 2025. This is a near doubling from the sluggish 1.4 per cent annual average that characterised the past decade.
good for 3 clicks: https://giftarticle.ft.com/giftarticle/actions/redeem/97861f...
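As a quick sanity check on the "near doubling" wording, here is a minimal Python sketch using only the two figures quoted above. The variable names are mine, and this says nothing about how the FT author derived the numbers.

```python
# Figures quoted from the FT piece above: annual US productivity growth, in percent.
growth_2025 = 2.7         # the author's estimate for 2025
growth_past_decade = 1.4  # annual average over the past decade

# How many times larger is the 2025 figure, and by how many percentage points?
ratio = growth_2025 / growth_past_decade
gap_points = growth_2025 - growth_past_decade

print(f"ratio: {ratio:.2f}x")
print(f"gap: {gap_points:.1f} percentage points")
```

That works out to roughly 1.93x, or 1.3 percentage points above the decade average, so "near doubling" is a fair description of the quoted numbers.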
"Admitted" as the verb in a statement like this is blatant editorialization. Did they just finally "admit" what they had been reluctant to reveal? No doubt with their heads hung in shame?
Maybe this bothers me more than it should.
I’m not sure about this. I’ve been 100% AI since Jan 1 and I’m way more productive at producing code.
The non-code parts (about 90% of the work) are taking the same amount of time though.
Mentioning AI in an earnings call means fuck all when what they’re actually referring to is toggling on the permissions for borderline useless copilot features across their enterprise 365 deployments or being convinced to buy some tool that’s actually just a wrapper around API calls to a cheap/outdated OpenAI model with a hidden system prompt.
Yeah, if your Fortune 500 workplace is claiming to be leveraging AI because it has a few dozen relatively tech illiterate employees using it to write their em dash/emoji riddled emails about wellness sessions and teams invites for trivia events… there’s not going to be a noticeable uptick in productivity.
The real productivity comes from tooling that no sufficiently risk-averse pubco IS department is going to let their employees use, because when all of their incentives point to saying no to installing anything ever, the idea of giving the permissions required for agentic AI to do anything useful is a non-starter.
Workers may see the LLM as a productivity boost because they can basically cheat at their homework.
As a CEO I see it as a massive clog up of vast amounts of content that somebody will need to check. A DDoS of any text-based system.
The other day I got a document of 155 pages in Whatsapp. Thanx. Same with pull requests. Who will check all this?
> Who will check all this?
The answer to that, for some, is more AI.
I had a peer explain that the PRs created by AI are now too large and difficult to understand. They were concerned that bugs would crop up after merging the code. Their solution was to use another AI to review the code... However, this did not solve the problem of not knowing what the code does. They had a solution for that as well... ask AI to prepare a quiz and then deliver it to the engineer to check their understanding of the code.
The question was asked - does using AI mean best-practices should no longer be followed? There were some in the conversation who answered, "probably yes".
> Who will check all this?
So yeah, I think the real answer to that is... no one.
Just yesterday one of my junior devs got an 800-line code review from an AI agent. It wasn't all bad, but is this kid literally going to have to read an essay every time he submits code?
The article suggests that AI-related productivity gains could follow a J-curve. An initial decline, as initially happened with IT, followed by an exponential surge. They admit this is heavily dependent on the real value AI provides.
However, there's another factor. The J-curve for IT happened in a different era. No matter when you jumped on the bandwagon, things just kept getting faster, easier, and cheaper. Moore's law was relentless. The exponential growth phase of the J-curve for AI, if there is one, is going to be heavily damped by the enshittification phase of the winning AI companies. They are currently incurring massive debt in order to gain an edge on their competition. Whatever companies are left standing in a couple of years are going to have to raise the funds to service and pay back that debt. The investment required to compete in AI is so massive that cheaper competition may not arise, and a small number of winners (or a single one) could put anyone dependent on AI into a financial bind. Will growth really be exponential if this happens and the benefits aren't clearly worth it?
The best possible outcome may be for the bubble to pop, the current batch of AI companies to go bankrupt, and for AI capability to be built back better and cheaper as computation becomes cheaper.
I think the 'AI productivity gap' is mostly a state management problem. Even with great models, you burn so much time just manually syncing context between different agents or chat sessions.
Until the handoff tax is lower than the cost of just doing it yourself, the ROI isn't going to be there for most engineering workflows.
I was in the “AI is grossly overhyped” camp because I work on large distributed deep learning training jobs and AI is indeed worthless for those, and will likely always be worthless since the APIs change constantly and the iteration loop is too cumbersome to constantly resubmit broken jobs to a training cluster.
Then I started working on some basic grpc/fullstack crap that I absolutely do not care about, at all, but needs to be done and uses internal frameworks that are not well documented, and now Claude is my best friend at work.
The best part is everyone else’s AI code still sucks, because they ask it to do stupid crap and don’t apply any critical thinking skills to it, so I just tell AI to re-do it but don’t fuck up the error handling and use constants instead of hardcoding strings like a middle schooler, and now I’m a 100x developer fearlessly leading the charge to usher in the AI era as I play the new No Man’s Sky update on my other PC and wait for whatever agent to finish crap.
this weirdly skirts my own experience yet somehow still reads like sarcasm hehe. I think if we just return to calling it intelligent autocomplete, expectations for productivity gain would be better established.
trying to hacksmash Claude into outputting something it simply can't just produces endless mess. or getting into a fight pointing out issues with what it's doing and it just piles on extra layer upon layer of gunk. but meanwhile if you ask it to boilerplate an entire SaaS around the hard part, it's done in about 15 seconds.
of course this says nothing about the costs of long term maintainability, and I think everyone by now recognises what that's going to look like
Every technology, whether it improved existing systems and productivity or not, created new wealth by creating new services and experiences. So that is what needs to happen with this wave as well.
If nothing else AI is making great strides in surveillance. It makes mistakes, but that only matters when there's accountability, and now it can make them at scales that were unthinkable before. Most of us are not going to enjoy the new experiences AI brings us, but a small number of people are already making a lot of money selling new services to government and law enforcement.
It's not just technology, it's very hard to detect the effect of inventions in general on productivity. There was a paper pointing out that the invention of the steam engine was basically invisible in the productivity statistics:
If it’s helpful to anyone, I just wrote a short blog post on this exact topic. It goes into the “j-curve” which was the proposed solution to the productivity paradox, and discusses some of the empirical research around it.
Thousands of companies to be replaced by leaner counterparts that learned to use AI towards greater employment and productivity
As we approach the singularity, things will get noisier and make less and less sense, since rapid change can look like chaos from inside the system. I recommend folks just take a deep breath and take a look around. Regardless of your stance on whether the singularity is real, or whether AI will revolutionize everything or not, just forget all that noise. Look around you and ask yourself whether things seem more or less chaotic. Are you able to predict what is going to happen better or worse than before? How far can your predictions take you now versus, let's say, 10 or 20 years ago? Conflicting signals are exactly what all of this looks like: one account says it's the end of the world, another says nothing ever changes and everything is the same as it always was....
Yep, just a risk amplifier. We are having a global warming level event in computing and blindly walking into it.
BTW the study was from September 2024 to 2025, so it's the very earliest of adopters.
This article is mostly based on NBER working paper 34836, which was published this month, and the data was collected from September 2025 to January 2026[0]
[0]: See page 2: https://www.nber.org/system/files/working_papers/w34836/w348...
I like AI and use it daily, but this bubble can’t pop soon enough so we can all return to normally scheduled programming.
CEOs are now on the downside of the hype curve.
They went from “Get me some of that AI!” after first hearing about it, to “Why are we not seeing any savings? Shut this boondoggle down!” now that we’re a few years into the bubble, the business math isn’t working, and they only see burning piles of cash.
"return to normally scheduled programming" is probably not the exact phrasing you want to use. :)
I consume a lot of different content on a lot of different places. Every site or app has its vibe and communal beliefs. They rarely if ever agree on anything, but they all agree we're in a massive bubble.
I don't have a point, just that it's an unlikely unity.
It’s funny because at work we have paid Codex and Claude but I rarely find a use for it, yet I pay for the $200 Max plan for personal stuff and will use it for hours!
So I’m not even in the “it’s useless” camp, but it’s frankly only situationally useful outside of new greenfield stuff. Maybe that is the problem?
Why do you find it useless for legacy code? I find I have to give it plenty of context but it does pretty well on legacy code.
And Ask DeepWiki is a great shortcut for finding the right context… Granted this is open source and DW is free.
Is it the specific nature of your work?
I think the biggest problem is calling it AI to start with. It gives people a huge misrepresentation of what it is actually capable of. It is an impressive tool with many uses, but it is not AGI.
It's weird being on here and seeing so much naysaying, because I see a radical change already happening in software development. The future is here, it's just not equally distributed.
In the past 6 months, I've gone from Copilot to Cursor to Conductor. It's really the shift to Conductor that convinced me that I crossed into a new reality of software work. It is now possible to code at a scale dramatically higher than before.
This has not yet translated into shipping at far higher magnitude. There are still big friction points and bottlenecks. Some will need to be resolved with technology, others will need organizational solutions.
But this is crystal clear to me: there is a clear path to companies getting software value to the end customer much more rapidly.
I would compare the ongoing revolution to the advent of the Web for software delivery. When features didn't have to be scheduled for release in physical shipments, it unlocked radically different approaches to product development, most clearly illustrated in The Agile Manifesto. You could also do real-time experiments to optimize product outcomes.
I'm not here to say that this is all going to be OK. It won't be for a lot of people. Some companies are going to make tremendous mistakes and generate tremendous waste. Many of the concerns around GenAI are deadly serious.
But I also have zero doubt that the companies that most effectively embrace the new possibilities are going to run circles around their competition.
It's a weird feeling when people argue against me in this, because I've seen too much. It's like arguing with flat-earthers. I've never personally circumnavigated Antarctica, but me being wrong would invalidate so many facts my frame of reality depends on.
To me, the question isn't about the capabilities of the technology. It's whether we actually want the future it unlocks. That's the discussion I wish we were having. Even if it's hard for me to see what choice there is. Capitalism and geopolitical competition are incredible forces to reckon with, and AI is being driven hard by both.
Curious why you like Conductor. I’m trying it out, but since I primarily live in the CLI, I might not see much value in it.
Fair point. What it really does for me is give me a better UX for having a bunch of parallel workstreams. I could achieve a similar effect with scripting, and maybe some clever ways of getting something like the sidebar for seeing the status of everything on a single pane. But Conductor packaged it up in a way that I found much improved over multiple Cursor or VSCode windows.
The people who will be most productive with AI will be the entreprompteurs who whip up entire products and go to market faster than ever before, iterating at dangerous speeds. Lean Startup methodology on pure steroids basically.
Unfortunately I think most of the stuff they make will be shit, but they will build it very productively.
Software doesn't need to be good to be successful; it only needs to solve a problem and be better than the competition.
I predict a golden age for experienced developers! There will be an uncountable number of poorly designed apps with scaling issues. And many of them will be funded.
Meh, no. In a future where any app could be prompted, the only thing you’d get funding for is if you had managed to go viral and secure some large audience.
This is not good. When all that matters is how viral your app is, people no longer compete on features and quality of life.
Anyone read The Goal lately?
These surveys don’t make sense. Ask the forward-thinking companies and they’ll say the opposite. The flood of anti-AI productivity articles almost feels like it’s meant to lull the population into not seeing what’s about to happen to employment.
> Ask the forward thinking companies and they’ll say the opposite.
Which ones? OpenAI? Microsoft? Anthropic?
Eh, try using Microsoft Copilot in Word or PowerPoint. It is worthless. If your experience with AI was a Microsoft product, you would think it was a scam too.
It’s not just that though. When you go through AI projects in an organization, you often find that the process is manual for a reason. This isn’t the first wave of “automation” that’s come through. Most things that can be fully automated already were, long ago, and the manual parts get sold as “we can make AI do it”. Then you see the specs and noodle around on the problem some, and you realize it’s probably just going to remain manual, as the model training requires as much time and effort as just doing it by hand.
I have a dystopian future vision where humans are cheaper machines than robots, so we become the disposable task force for grunt work that robots aren’t cheap enough for. To some degree this is already happening.
It’s comical that Microsoft inserted Copilot buttons throughout all of their productivity suite, and none of them are able to do the bare minimum that you would hope for.
“Oh cool, copilot is in excel! I’m going to ask it a question about the data in the spreadsheet that it’s literally appearing beside natively in-app, or for help with a formula!”
“Wait what, it’s saying it can’t see anything or read from the currently displayed worksheet? Why is it inside the application then? Why would I want an outdated version of ChatGPT with no useful context or ability to read/do anything inside all my Office applications?”
Yeah Microsoft has consistently been bragging about how so much code is written by AI, yet their products are worse than ever. Seems to indicate “using AI” is not enough. You have to be smart about when and where.
At $dayjob GenAI has been shoved into every workflow and it's a constant source of noise and irritation, slop galore. I'm so close to walking away from the industry to resume being a mechanic, what a complete shit show.
Meanwhile in some auto shop,
"Perfect! Let's delve into the problem with the engine. Based on the symptoms you describe, the likely cause is a blown head gasket..."
There is probably a threshold effect above which the technology begins to be very useful for production (other than faking school assignments, one-off-scripts, spam, language translation, and political propaganda), but I guess we're not there yet. I'm not counting out the possibility of researchers finding a way to add long term memory or stronger reasoning abilities, which would change the game in a very disorienting way, but that would likely mean a change of architecture or a very capable hybrid tool.
the greatest step change will be when mainstream business realise they can use AI to accurately fill in PDF documents with information in any format
filling in pdf documents is effectively the job of millions of people around the world
That would require accurate validation of said documents, which is extremely hard now. Pointing 1 million PDF LLM machine guns at current validation pipelines will not end well, especially since LLMs are inherently unreliable.
Of course AI is bullshit. If you couldn't just use it yourself and figure that out, then ask yourself why people like Bezos or Altman are perfectly happy "investing" other people's money but not their own. If they actually believed their own bullshit they would personally be investing all of their money AND taking on personal debt. Instead Bezos, a guy worth ~200B, sells 5B worth of stock to invest in an "AI-adjacent" (power generation) industry, while making Amazon invest 200B in data centers. Talk about conflict of interest! WTF!
The issue with framing this as a resurrection of the productivity paradox is that AI had never even theoretically increased productivity.
I think in retrospect it's going to look very silly.