Anthropic downgraded cache TTL on March 6th
github.com | 380 points by lsdmtme 14 hours ago
Has anybody else noticed a pretty significant shift in sentiment when discussing Claude/Codex with other engineers since even just a few months ago? Specifically because of the secret/hidden nature of these changes.
I keep getting the sense that people feel like they have no idea if they are getting the product that they originally paid for, or something much weaker, and this sentiment seems to be constantly spreading. Like when I hear Anthropic mentioned in the past few weeks, it's almost always in some negative context.
Well, off the top of my head:
- Banning OpenClaw users (within their rights, of course, but bad optics)
- Banning 3rd party harnesses in general (ditto)
(claude -p still works on the sub but I get the feeling like if I actually use it, I'll get my Anthropic acct. nuked. Would be great to get some clarity on this. If I invoke it from my Telegram bot, is that an unauthorized 3rd party harness?)
- Lowering reasoning effort (and then showing up here saying "we'll try to make sure the most valuable customers get the non-gimped experience" (paraphrasing slightly xD))
- Massively reduced usage (apparently a bug?) The other day I got 21x more usage spend on the same task for Claude vs Codex.
- Noticed a very sharp drop in response length in the Claude app. Asked Claude about it and it mentioned several things in the system prompt related to reduced reasoning effort, keeping responses as brief as possible, etc.
It's all circumstantial but everything points towards "desperately trying to cut costs".
I love Claude and I won't be switching any time soon (though with the usage limits I'm increasingly using Codex for coding), but it's getting hard to recommend it to friends lately. I told a friend "it was the best option, until about two weeks ago..." Now it's up in the air.
> It's all circumstantial but everything points towards "desperately trying to cut costs".
I have been wondering if it's more geared at reducing resource usage, given that at the moment there's a known constraint on AI datacenter expansion capability. Perhaps they are struggling to meet demand?
It’s more that Anthropic knows that the models themselves are non-sticky, and the real moat is in the ecosystem around it.
It only makes sense for them to get users to use their ecosystem, rather than other tools.
See: Claude Cowork trying to establish an entire new group of people in their ecosystem.
> Perhaps Anthropic is struggling to meet demand?
Yes, definitely, they’re gracefully failing to meet demand. They could also deny new customers, but it would probably be bad for business.
I once decided to deny new customers in order to be able to service current demand at the quality we wanted. It backfired and made people want our product even more. Our phones were blowing up. That approach can have unintended consequences!
You unintentionally used a common sales tactic; by decreasing supply you increase demand.
I wish they would just rip the bandaid off to stop everybody's entitled whining.
"We're sorry, what we were able to give you for $100/mo before now needs to be $200/mo (or more). We miscalculated/we were too generous/gave too much away for too little. It's a new technology, we are seeing a ton of demand, we are trying to run a business, hope you understand. If you don't want it, don't pay for it."
Just put everyone on pay per use with the API and rip the band aid off.
Are we at the point where there are external constraints that cash can't solve?
This is my take too, although I'm not prepared for a max400 reality to replace the max200, but... I hate all of the whingeing. Piggies at the buffet line seem to be the loudest on this subject.
It is one thing to pay 100 a month to make calendar apps for your linkedin and birds on bicycles to get invited to talks, paying 200 HOWEVER
If we didn’t have the birds on bicycles, how would we know the models are getting better?
> (claude -p still works on the sub but I get the feeling like if I actually use it, I'll get my Anthropic acct. nuked. Would be great to get some clarity on this. If I invoke it from my Telegram bot, is that an unauthorized 3rd party harness?)
How often? Realistically, if you invoke it occasionally, for what's clearly an amount that's "reasonable personal use", then no you don't get nuked.
It’s the same problem people have with Google. If they ban you for some AI hallucinated reason you have no recourse other than going viral on Hacker News.
I haven't seen a single case of that happening with Anthropic yet. Every time someone has gotten banned it's because they either used third party harnesses which went to great lengths to impersonate claude code (obvious evasion), or because they set things up so it maxxed out their usage 24/7.
I'll change my mind when I see otherwise.
And this isn't being positive about Anthropic support or their treatment of users, as I too have seen lots of people here getting billed by them for stuff they never paid for, blatant fraud. That's even worse than Google. I'm only talking about getting banned for usage.
Perhaps Anthropic should put a freeze on new signups until they can increase capacity. This is the best kind of problem for a business, I'm cheering for them.
If there is one thing that is crystal clear, it's that LLM providers will always take your money, no matter how bad the service is.
They also screwed up the API token detection and also blocked a bunch of 1st party tool users for ~24h.
Support consisted of AI bots saying you did something stupid, you did something wrong, you were abusing the system, followed by (only when I asked for it explicitly) claiming to file a ticket with a human who will contact you later (and it either didn't happen or their ticket system is /dev/null).
(By the way this is the 2nd time I've been "please hold" gaslit by support LLMs this exact same way, the other being with Square)
Claude -p is allowed. They're not going to give you a feature then ban you for using it.
What they changed is that it now uses extra usage, which is charged at API rates.
It only switches to charging API rates if some part of your prompt triggers their magic string detector. Lot of examples of that floating around where swapping "is" for "are" or whatever will magically allow the request against your subscription plan again.
claude -p not working would be an instant unsubscribe, or at least a downgrade from Max to Pro, and would further drive my use of codex. I use both but overall have noticed I reach for Claude less than codex lately because claude keeps getting slower and slower (I have not noticed a drop off in quality, but I use it less and less so maybe I'm not in a good position to notice).
Generally I find codex and claude make a good team. I'm not a heavy user, but I am currently Claude Max 5x and ChatGPT Plus. Now that OpenAI has a $100 offering and I am finding myself using Claude less, I am considering switching to Claude Pro and ChatGPT Pro x5. The work hours restriction on Claude Max x5 really pisses me off.
I am not a heavy user. Historically I only break over 50% weekly one week a month and average about 30-40% of Max x5 over the entire month. I went Max because of the weekly limits and to access the better models and because I felt I was getting value. I need an occasional burst of usage, not 24/7 slow compute. But even for pay-as-you-go burst usage Anthropic's API prices are insane vs Max.
I have yet to ever hit a limit on codex so it's not on my mind. And lately it seems like Claude is likely to be having a service interruption anyway. A big part of subscribing to Claude Max was to get away from how the usage limits on Pro were causing me to architect my life around 5hr windows. And now Anthropic has brought that all back with this don't use it before 2pm bullshit. I want things ready to go when the muses strike. I'm honestly questioning whether Anthropic wants anyone who isn't employed as a software engineer to use their kit.
Anyway for the last month or so codex "just works" and Claude has been an invitation for annoyances. There was a time when codex was quite a bit behind claude-code. They have been roughly equal (different strength and weaknesses) since at least February (for me).
I might consider switching to codex from claude pro 20x but I need the post tool use, pre file write and post user message hooks. Waiting on codex to deliver.
- pre file write -> block editing code files without a task and plan of work
- post tool use -> show next open checkbox in the task to the agent, like an instruction pointer
- post user message -> log all user messages for periodic review of intent alignment
These 3 hooks + plain md files make my claude harness.
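For anyone curious what that looks like, here is a minimal sketch of the three hooks as plain Python functions over a markdown task file. To be clear, this is not Claude Code's actual hook API or my real harness: the function names, the `## Plan` heading convention, the list of code extensions, and the in-memory log are all assumptions made up for illustration.

```python
import re
from typing import List, Optional, Tuple

# Assumption: which extensions count as "code files" is up to you.
CODE_EXTENSIONS = (".py", ".ts", ".js", ".go", ".rs")

# Matches an unchecked markdown checkbox such as "- [ ] add tests".
OPEN_CHECKBOX = re.compile(r"^\s*[-*] \[ \] (.+)$")


def next_open_checkbox(task_md: str) -> Optional[str]:
    """Return the text of the first unchecked checkbox, or None if all are done."""
    for line in task_md.splitlines():
        match = OPEN_CHECKBOX.match(line)
        if match:
            return match.group(1).strip()
    return None


def pre_file_write(path: str, task_md: Optional[str]) -> Tuple[bool, str]:
    """Block edits to code files unless a task with a plan of work exists.

    Assumption: a "plan" is any task file containing a '## Plan' heading.
    """
    if not path.endswith(CODE_EXTENSIONS):
        return True, "not a code file"
    if task_md is None or "## Plan" not in task_md:
        return False, "blocked: no task and plan of work"
    return True, "task and plan found"


def post_tool_use(task_md: str) -> str:
    """Show the agent the next open checkbox, like an instruction pointer."""
    item = next_open_checkbox(task_md)
    return f"Next step: {item}" if item else "All checkboxes done."


def post_user_message(log: List[str], message: str) -> None:
    """Record the user message for later review (a list here; a file in practice)."""
    log.append(message)


if __name__ == "__main__":
    task = "# Task\n## Plan\n- [x] write parser\n- [ ] add tests\n- [ ] update docs\n"
    print(pre_file_write("src/app.py", None))
    print(pre_file_write("src/app.py", task))
    print(post_tool_use(task))
```

Wiring these into an actual agent is the harness-specific part; the logic itself is just string checks on md files, which is why it ports easily in principle.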
Why couldn’t you use Claude code harness with codex? The requests can be proxied to OpenAI.
I am cooking up an abstraction that enables these hooks on codex. Would love to have you kick the tires.
I think we are about a month away from a class action lawsuit, at their revenue they are a juicy target. And god knows they got the entirely self inflicted unholy combination going on, marketing & sales that borders on fraud (X times the usage of plan Y which has Z times of free tier which has unknowable "magic tokens") and then of course the actual fraud, reducing usage in fifteen different non obvious non public ways.
> (claude -p still works on the sub but I get the feeling like if I actually use it, I'll get my Anthropic acct. nuked. Would be great to get some clarity on this. If I invoke it from my Telegram bot, is that an unauthorized 3rd party harness?)
100% this, I’ve posted the same sentiment here on HN. I hate the chilling effect of the bans and the lack of clarity on what is and is not allowed.
In this case, they handled things pretty well. You can still use openclaw etc with your regular Anthropic subscription, it will just count towards your extra credits / usage which you can buy for a 30% discount compared to API pricing. And they gave everyone one month’s value in credits.
I don’t think they could have done that much better I’d say.
One month’s value in credits does not equal the value of one month’s subscription. They could have done better.
That does not address joshstrange's concerns.
There is very poor clarity about what is and isn't allowed with the Claude SDK/claude -p. Are we allowed to use it to automate stuff? What kind of tasks is it permitted to be used for? What if you call your script 'OrangeClaw' and release that on GitHub? What if your script gets super popular, does it suddenly become against TOS?
This is exactly my point. At what point does it become a ToS violation? Right now it's a huge grey area and the idea of getting my account banned because I crossed an invisible line with zero recourse other than to switch providers is... frustrating.
It's pretty easy to read between the lines tbh. Personal, non-automated use is fine. Using it as a means to automate depleting your 5-hour limit 24/7 ("leftover usage") is not fine. They don't want to put it in the ToS because it's almost impossible: writing what I just said will still have people going "well what's automated, where's the exact line!" when it's all pretty clear what the intended use case here is. The Anthropic peeps have said about as much.
I get that the traditional dev is allergic to the concept of reading between the lines and demands everything to be spelled out explicitly, but maybe you should just see it as something to learn because it's an incredibly useful life skill.
Ok, let's say I'm not using it to deplete leftover usage, the task just happens to run down the 5 hour window usage.
Are you willing to bet your account over whether you've read between the lines correctly? Anthropic aren't going to listen to appeals.
> the task just happens to run down the 5 hour window usage.
In a single prompt? From zero usage? That doesn't "just happen".