Claude Sonnet 4.6

581 points by adocomplete 4 hours ago

https://www.anthropic.com/claude-sonnet-4-6-system-card [pdf]

https://x.com/claudeai/status/2023817132581208353 [video]

I see a big focus on computer use - you can tell they think there is a lot of value there and in truth it may be as big as coding if they convincingly pull it off.

However I am still mystified by the safety aspect. They say the model has greatly improved resistance. But their own safety evaluation says 8% of the time their automated adversarial system was able to one-shot a successful injection takeover even with safeguards in place and extended thinking, and 50% (!!) of the time if given unbounded attempts. That seems wildly unacceptable - this tech is just a non-starter unless I'm misunderstanding this.

[1] https://www-cdn.anthropic.com/78073f739564e986ff3e28522761a7...

zozbot234 - 2 minutes ago

Isn't "computer use" just interaction with a shell-like environment, which is routine for current agents?
bradley13 - 3 minutes ago

Does it matter? Really?
I can type awful stuff into a word processor. That's my fault, not the programs.
So if I can trick an LLM into saying awful stuff, whose fault is that? It is also just a tool...
- cindyllm - a few seconds ago
  
  [dead]

gallerdude - 3 hours ago

I always grew up hearing “competition is good for the consumer.” But I never really internalized how good fierce battles for market share are. The amount of competition in a space is directly proportional to how good the results are for consumers.

gordonhart - 3 hours ago

Remember when GPT-2 was “too dangerous to release” in 2019? That could have still been the state in 2026 if they didn’t YOLO it and ship ChatGPT to kick off this whole race.
- WarmWash - 2 hours ago
  
  I was just thinking earlier today how in an alternate universe, probably not too far removed from our own, Google has a monopoly on transformers and we are all stuck with a single GPT-3.5 level model, and Google has a GPT-4o model behind the scenes that it is terrified to release (but using heavily internally).
- minimaxir - 2 hours ago
  
  They didn't YOLO ChatGPT. There were more than a few iterations of GPT-3 over a few years which were actually overmoderated, then they released a research preview named ChatGPT (that was barely functional compared to modern standards) that got traction outside the tech community because it was free, and so the pivot ensued.
- nikcub - 2 hours ago
  
  I also remember when the playstation 2 required an export control license because it's 1GFLOP of compute was considered dangerous
  that was also brilliant marketing
- jefftk - 2 hours ago
  
  That's rewriting history. What they said at the time:
  > Nearly a year ago we wrote in the OpenAI Charter : “we expect that safety and security concerns will reduce our traditional publishing in the future, while increasing the importance of sharing safety, policy, and standards research,” and we see this current work as potentially representing the early beginnings of such concerns, which we expect may grow over time. This decision, as well as our discussion of it, is an experiment: while we are not sure that it is the right decision today, we believe that the AI community will eventually need to tackle the issue of publication norms in a thoughtful way in certain research areas. -- https://openai.com/index/better-language-models/
  Then over the next few months they released increasingly large models, with the full model public in November 2019 https://openai.com/index/gpt-2-1-5b-release/ , well before ChatGPT.
  - gordonhart - 2 hours ago
    
    > Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with, as well as a technical paper.
    I wouldn't call it rewriting history to say they initially considered GPT-2 too dangerous to be released. If they'd applied this approach to subsequent models rather than making them available via ChatGPT and an API, it's conceivable that LLMs would be 3-5 years behind where they currently are in the development cycle.
  - IshKebab - 2 hours ago
    
    They said:
    > Due to concerns about large language models being used to generate deceptive, biased, or abusive language at scale, we are only releasing a much smaller version of GPT‑2 along with sampling code (opens in a new window).
    "Too dangerous to release" is accurate. There's no rewriting of history.
    
    tecleandor - 2 hours ago
    
    Well, and it's being used to generate deceptive, biased, or abusive language at scale. But they're not concerned anymore.
maest - 2 hours ago

Unfortunately, people naively assume all markets behave like this, even when the market, in reality, is not set up for full competition (due to monopolies, monopsonies, informational asymmetry, etc).
raincole - 2 hours ago

The real interesting part is how often you see people on HN deny this. People have been saying the token cost will 10x, or AI companies are intentionally making their models worse to trick you to consume more tokens. As if making a better model isn't not the most cutting-throat competition (probably the most competitive market in the human history) right now.
- Gigachad - a few seconds ago
  
  Only until the music stops. Racing to give away the most stuff for free can only last so long. Eventually you run out of other people’s money.
- IgorPartola - 22 minutes ago
  
  I mean enshittification has not begun quite yet. Everyone is still raising capital so current investors can pass the bag to the next set. Soon as the money runs out monetization will overtake valuation as top priority. Then suddenly when you ask any of these models “how do I make chocolate chip cookies?” you will get something like:
  > You will need one cup King Arthur All Purpose white flour, one large brown Eggland’s Best egg (a good source of Omega-3 and healthy cholesterol), one cup of water (be sure to use your Pyrex brand measuring cup), half a cup of Toll House Milk Chocolate Chips…
  > Combine the sugar and egg in your 3 quart KitchenAid Mixer and mix until…
  All of this will contain links and AdSense looking ads. For $200/month they will limit it to in-house ads about their $500/month model.
gmerc - 2 hours ago

Until 2 remain, then it's extraction time.
- raffkede - 2 hours ago
  
  Or self host the oss models on the second hand GPU and RAM that's left when the big labs implode
yogurt0640 - 23 minutes ago

I grew up with every service enshitified in the end. Whoever has more money wins the race and gets richer, that's free market for ya.
poszlem - an hour ago

This is a bit of a tangent, but it highlights exactly what people miss when talking about China taking over our industries. Right now, China has about 140 different car brands, roughly 100 of which are domestic. Compare that to Europe, where we have about 50 brands competing, or the US, which is essentially a walled garden with fewer than 40.
That level of internal fierce competition is a massive reason why they are beating us so badly on cost-effectiveness and innovation.
- tartoran - 28 minutes ago
  
  It's the low cost of labor in addition to lack of environmental regulation that made China a success story. I'm sure the competition helps too but it's not main driver
  - amunozo - 2 minutes ago
    
    That happens in most of the world. Why China, then?

dpe82 - 3 hours ago

It's wild that Sonnet 4.6 is roughly as capable as Opus 4.5 - at least according to Anthropic's benchmarks. It will be interesting to see if that's the case in real, practical, everyday use. The speed at which this stuff is improving is really remarkable; it feels like the breakneck pace of compute performance improvements of the 1990s.

madihaa - 3 hours ago

The most exciting part isn't necessarily the ceiling raising though that's happening, but the floor rising while costs plummet. Getting Opus-level reasoning at Sonnet prices/latency is what actually unlocks agentic workflows. We are effectively getting the same intelligence unit for half the compute every 6-9 months.
amelius - 3 hours ago

> The speed at which this stuff is improving is really remarkable; it feels like the breakneck pace of compute performance improvements of the 1990s.
Yeah, but RAM prices are also back to 1990s levels.
- mikkupikku - 2 hours ago
  
  I knew I've been keeping all my old ram sticks for a reason!
- mrcwinn - 3 hours ago
  
  Relief for you is available: https://computeradsfromthepast.substack.com/p/connectix-ram-...
  - - 3 hours ago
    
    [deleted]
  - isoprophlex - 3 hours ago
    
    You wouldn't download a RAM
dpe82 - 3 hours ago

simonw hasn't shown up yet, so here's my "Generate an SVG of a pelican riding a bicycle"
https://claude.ai/public/artifacts/67c13d9a-3d63-4598-88d0-5...
- coffeebeqn - 3 hours ago
  
  We finally have AI safety solved! Look at that helmet
  - 1f60c - 3 hours ago
    
    "Look ma, no wings!"
    :D
- thinkling - 2 hours ago
  
  For comparisonI think the current leader in pelican drawing is Gemini 3 Deep Think:
  https://bsky.app/profile/simonwillison.net/post/3meolxx5s722...
  - konart - 2 hours ago
    
    My take (also Gemini 3 Deep Think): https://gemini.google.com/share/12e672dd39b7
    Somehow it's much better now.
    
    jazzyjackson - an hour ago
    
    I’m not familiar with Gemini, isn’t this just a diffusion model output? The Pelican test is for the llm to produce SVG markup.
    
    konart - an hour ago
    
    Yeah, I was so amazed by the result I didn't even realize Gemini used Nano Banana while producing the result.
- AstroBen - 3 hours ago
  
  if they want to prove the model's performance the bike clearly needs aero bars
- dyauspitr - 2 hours ago
  
  Can’t beat Gemini’s which was basically perfect.
simlevesque - 3 hours ago

The system card even says that Sonnet 4.6 is better than Opus 4.6 in some cases: Office tasks and financial analysis.
justinhj - 3 hours ago

We see the same with Google's Flash models. It's easier to make a small capable model when you have a large model to start from.
- karmasimida - 3 hours ago
  
  Flash models are nowhere near Pro models in daily use. Much higher hallucinations, and easy to get into a death sprawl of failed tool uses and never come out
  You should always take those claim that smaller models are as capable as larger models with a grain of salt.
  - justinhj - 25 minutes ago
    
    Flash model n is generally a slightly better Pro model (n-1), in other words you get to use the previously premium model as a cheaper/faster version. That has value.
iLoveOncall - 3 hours ago

Given that users prefered it to Sonnet 4.5 "only" in 70% of the cases (according to their blog post) makes me highly doubt that this is representative of real-life usage. Benchmarks are just completely meaningless.
- jwolfe - 3 hours ago
  
  For cases where 4.5 already met the bar, I would expect 50% preference each way. This makes it kind of hard to make any sense of that number, without a bunch more details.
estomagordo - 3 hours ago

Why is it wild that a LLM is as capable as a previously released LLM?
- crummy - 3 hours ago
  
  Opus is supposed to be the expensive-but-quality one, while Sonnet is the cheaper one.
  So if you don't want to pay the significant premium for Opus, it seems like you can just wait a few weeks till Sonnet catches up
  - ceroxylon - 2 hours ago
    
    Strangely enough, my first test with Sonnet 4.6 via the API for a relatively simple request was more expensive ($0.11) than my average request to Opus 4.6 (~$0.07), because it used way more tokens than what I would consider necessary for the prompt.
- tempestn - 3 hours ago
  
  Because Opus 4.5 was released like a month ago and state of the art, and now the significantly faster and cheaper version is already comparable.
  - micw - 2 hours ago
    
    "Faster" is also a good point. I'm using different models via GitHub copilot and find the better, more accurate models way to slow.
  - stavros - 3 hours ago
    
    Opus 4.5 was November, but your point stands.
- simianwords - 3 hours ago
  
  It means price has decreased by 3 times in a few months.
- Retr0id - 3 hours ago
  
  Because Opus 4.5 inference is/was more expensive.

andrewchilds - 2 hours ago

Many people have reported Opus 4.6 is a step back from Opus 4.5 - that 4.6 is consuming 5-10x as many tokens as 4.5 to accomplish the same task: https://github.com/anthropics/claude-code/issues/23706

I haven't seen a response from the Anthropic team about it.

I can't help but look at Sonnet 4.6 in the same light, and want to stick with 4.5 across the board until this issue is acknowledged and resolved.

wongarsu - an hour ago

Keep in mind that the people who experience issues will always be the loudest.
I've overall enjoyed 4.6. On many easy things it thinks less than 4.5, leading to snappier feedback. And 4.6 seems much more comfortable calling tools: it's much more proactive about looking at the git history to understand the history of a bug or feature, or about looking at online documentation for APIs and packages.
A recent claude code update explicitly offered me the option to change the reasoning level from high to medium, and for many people that seems to help with the overthinking. But for my tasks and medium-sized code bases (far beyond hobby but far below legacy enterprise) I've been very happy with the default setting. Or maybe it's about the prompting style, hard to say
- evilhackerdude - 29 minutes ago
  
  keep in mind that people who point out a regression and measure the actual #tok, which costs $money, aren't just "being loud" — someone diffed session context usaage and found 4.6 burning >7x the amount of context on a task that 4.5 did in under 2 MB⁣.
- SatvikBeri - an hour ago
  
  I've also seen Opus 4.6 as a pure upgrade. In particular, it's noticeably better at debugging complex issues and navigating our internal/custom framework.
  - drcongo - 40 minutes ago
    
    Same here. 4.6 has been considerably more dilligent for me.
    
    AustinDev - 35 minutes ago
    
    Likewise, I feel like it's degraded in performance a bit over the last couple weeks but that's just vibes. They surely vary thinking tokens based on load on the backend, especially for subscription users.
    When my subscription 4.6 is flagging I'll switch over to Corporate API version and run the same prompts and get a noticeably better solution. In the end it's hard to compare nondeterministic systems.
- perelin - an hour ago
  
  Mirrors my experience as well. Especially the pro-activeness in tool calling sticks out. It goes web searching to augment knowledge gaps on its own way more often.
MrCheeze - an hour ago

In my experience with the models (watching Claude play Pokemon), the models are similar in intelligence, but are very different in how they approach problems: Opus 4.5 hyperfocuses on completing its original plan, far more than any older or newer version of Claude. Opus 4.6 gets bored quickly and is constantly changing its approach if it doesn't get results fast. This makes it waste more time on"easy" tasks where the first approach would have worked, but faster by an order of magnitude on "hard" tasks that require trying different approaches. For this reason, it started off slower than 4.5, but ultimately got as far in 9 days as 4.5 got in 59 days.
- DaKevK - 12 minutes ago
  
  Genuinely one of the more interesting model evals I've seen described. The sunk cost framing makes sense -- 4.5 doubles down, 4.6 cuts losses faster. 9 days vs 59 is a wild result. Makes me wonder how much of the regression complaints are from people hitting 4.6 on tasks where the first approach was obviously correct.
- KronisLV - an hour ago
  
  I got the Max subscription and have been using Opus 4.6 since, the model is way above pretty much everything else I've tried for dev work and while I'd love for Anthropic to let me (easily) work on making a hostable server-side solution for parallel tasks without having to go the API key route and not have to pay per token, I will say that the Claude Code desktop app (more convenient than the TUI one) gets me most of the way there too.
  - alkhatib - an hour ago
    
    Try https://conductor.build
    I started using it last week and it’s been great. Uses git worktrees, experimental feature (spotlight) allows you to quickly check changes from different agents.
    I hope the Claude app will add similar features soon
  - bredren - an hour ago
    
    Can you explain what you mean by your parallel tasks limitation?
    
    KronisLV - 8 minutes ago
    
    Instead of having my computer be the one running Claude Code and executing tasks, I might want to prefer to offload it to my other homelab servers to execute agents for me, working pretty much like traditional CI/CD, though with LLMs working on various tasks in Docker containers, each on either the same or different codebases, each having their own branches/worktrees, submitting pull/merge requests in a self-hosted Gitea/GitLab instance or whatever.
    If I don't want to sit behind something like LiteLLM or OpenRouter, I can just use the Claude Agent SDK: https://platform.claude.com/docs/en/agent-sdk/overview
    However, you're not supposed to really use it with your Claude Max subscription, but instead use an API key, where you pay per token (which doesn't seem nearly as affordable, compared to the Max plan):
    > Unless previously approved, Anthropic does not allow third party developers to offer claude.ai login or rate limits for their products, including agents built on the Claude Agent SDK. Please use the API key authentication methods described in this document instead.
    If you look at how similar integrations already work, they also reference using the API directly: https://code.claude.com/docs/en/gitlab-ci-cd#how-it-works
    A simpler version is already in Claude Code and they have their own cloud thing, I'd just personally prefer more freedom to build my own: https://www.youtube.com/watch?v=zrcCS9oHjtI
    It's not like it's impossible to work around that, but a tad hacky, even if for personal use.
hedora - 24 minutes ago

I’ve noticed the opaque weekly quota meter goes up more slowly with 4.6, but it more frequently goes off and works for an hour+, with really high reported token counts.
Those suggest opposite things about anthropic’s profit margins.
I’m not convinced 4.6 is much better than 4.5. The big discontinuous breakthroughs seem to be due to how my code and tests are structured, not model bumps.
Topfi - an hour ago

In my evals, I was able to rather reliably reproduce an increase in output token amount of roughly 15-45% compared to 4.5, but in large part this was limited to task inference and task evaluation benchmarks. These are made up of prompts that I intentionally designed to be less then optimal, either lacking crucial information (requiring a model to output an inference to accomplish the main request) or including a request for a less than optimal or incorrect approach to resolving a task (testing whether and how a prompt is evaluated by a model against pure task adherence). The clarifying question many agentic harnesses try to provide (with mixed success) are a practical example of both capabilities and something I do rate highly in models, as long as task adherence isn't affected overly negatively because of it.
In either case, there has been an increase between 4.1 and 4.5, as well as now another jump with the release of 4.6. As mentioned, I haven't seen a 5x or 10x increase, a bit below 50% for the same task was the maximum I saw and in general, of more opaque input or when a better approach is possible, I do think using more tokens for a better overall result is the right approach.
In tasks which are well authored and do not contain such deficiencies, I have seen no significant difference in either direction in terms of pure token output numbers. However, with models being what they are and past, hard to reproduce regressions/output quality differences, that additionally only affected a specific subset of users, I cannot make a solid determination.
Regarding Sonnet 4.6, what I noticed is that the reasoning tokens are very different compared to any prior Anthropic models. They start out far more structured, but then consistently turn more verbose akin to a Google model.
honeycrispy - an hour ago

Glad it's not just me. I got a surprise the other day when I was notified that I had burned up my monthly budget in just a few days on 4.6
- - an hour ago
  
  [deleted]
weinzierl - an hour ago

Today I asked Sonnet 4.5 a question and I got a banner at the bottom that I am using a legacy model and have to continue the conversation on another model. The model button had changed to be labeled "Legacy model". Yeah, I guess it wasn't legacy a sec ago.
(Currently I can use Sonnet 4.5 under More models, so I guess the above was just a glitch)
ctoth - 40 minutes ago

For me it's the ... unearned confidence that 4.5 absolutely did not have?
I have a protocol called "foreman protocol" where the main agent only dispatches other agents with prompt files and reads report files from the agents rather than relying on the janky subagent communication mechanisms such as task output.
What this has given me also is a history of what was built and why it was built, because I have a list of prompts that were tasked to the subagents. With Opus 4.5 it would often leave the ... figuring out part? to the agents. In 4.6 it absolutely inserts what it thinks should happen/its idea of the bug/what it believes should be done into the prompt, which often screws up the subagent because it is simply wrong and because it's in the prompt the subagent doesn't actually go look. Opus 4.5 would let the agent figure it out, 4.6 assumes it knows and is wrong
- DaKevK - 12 minutes ago
  
  Have you tried framing the hypothesis as a question in the dispatch prompt rather than a statement? Something like -- possible cause: X, please verify before proceeding -- instead of stating it as fact. Might break the assumption inheritance without changing the overall structure.
data-ottawa - an hour ago

I think this depends on what reasoning level your Claude Code is set to.
Go to /models, select opus, and the dim text at the bottom will tell you the reasoning level.
High reasoning is a big difference versus 4.5. 4.6 high uses a lot of tokens for even small tasks, and if you have a large codebase it will fill almost all context then compact often.
- minimaxir - an hour ago
  
  I set reasoning to Medium after hitting these issues and it did not make much of a difference. Most of the context window is still filled during the Explore tool phase (that supposedly uses Haiku swarms) which wouldn't be impacted by Opus reasoning.
- _zoltan_ - an hour ago
  
  I'm using the 1M context 4.6 and it's great.
baq - 44 minutes ago

Sonnet 4.5 was not worth using at all for coding for a few months now, so not sure what we're comparing here. If Sonnet 4.6 is anywhere near the performance they claim, it's actually a viable alternative.
etothet - 2 hours ago

I definitely noticed this on Opus 4.6. I moved back to 4.5 until I see (or hear about) an improvement.
nerdsniper - an hour ago

In terms of performance, 4.6 seems better. I’m willing to pay the tokens for that. But if it does use tokens at a much faster rate, it makes sense to keep 4.5 around for more frugal users
I just wouldn’t call it a regression for my use case, i’m pretty happy with it.
cheema33 - 40 minutes ago

> Many people have reported Opus 4.6 is a step back from Opus 4.5.
Many people say many things. Just because you read it on the Internet, doesn't mean that it is true. Until you have seen hard evidence, take such proclamations with large grains of salt.
yakbarber - 38 minutes ago

Opus 4.6 is so much better at building complex systems than 4.5 it's ridiculous.
Foobar8568 - an hour ago

It goes into plan mode and/or heavy multiple agent for any reasons, and hundred thousands of tokens are used within a few minutes.
- minimaxir - an hour ago
  
  I've been tempted to add to my CLAUDE.md "Never use the Plan tool, you are a wild rebel who only YOLOs."
dakolli - 40 minutes ago

I called this many times over the last few weeks on this website (and got downvoted every time), that the next generation of models would become more verbose, especially for agentic tool calling to offset the slot machine called CC's propensity to light the money on fire that's put into it.
At least in vegas they don't pour gasoline on the cash put into their slot machines.
grav - an hour ago

I fail to understand how two LLMs would be "consuming" a different amount of tokens given the same input? Does it refer to the number of output tokens? Or is it in the context of some "agentic loop" (eg Claude Code)?
- lemonfever - an hour ago
  
  Most LLMs output a whole bunch of tokens to help them reason through a problem, often called chain of thought, before giving the actual response. This has been shown to improve performance a lot but uses a lot of tokens
  - zozbot234 - an hour ago
    
    Yup, they all need to do this in case you're asking them a really hard question like: "I really need to get my car washed, the car wash place is only 50 meters away, should I drive there or walk?"
- andrewchilds - an hour ago
  
  I've found that Opus 4.6 is happy to read a significant amount of the codebase in preparation to do something, whereas Opus 4.5 tends to be much more efficient and targeted about pulling in relevant context.
  - OtomotO - an hour ago
    
    And way faster too!
- jcims - an hour ago
  
  One very specific and limited example, when asked to build something 4.6 seems to do more web searches in the domain to gather latest best practices for various components/features before planning/implementing.
- Gracana - 40 minutes ago
  
  They're talking about output consuming from the pool of tokens allowed by the subscription plan.
- bsamuels - an hour ago
  
  thinking tokens, output tokens, etc. Being more clever about file reads/tool calling.
OtomotO - an hour ago

Definitely my experience as well.
No better code, but way longer thinking and way more token usage.
reed1234 - 2 hours ago

not in my experience
- reed1234 - an hour ago
  
  "Opus 4.6 often thinks more deeply and more carefully revisits its reasoning before settling on an answer. This produces better results on harder problems, but can add cost and latency on simpler ones. If you’re finding that the model is overthinking on a given task, we recommend dialing effort down from its default setting (high) to medium."[1]
  I doubt it is a conspiracy.
  [1] https://www.anthropic.com/news/claude-opus-4-6
  - comboy - an hour ago
    
    Yeah, I think the company that opens up a bit of the black box and open sources it, making it easy for people to customize it, will win many customers. People will already live within micro-ecosystems before other companies can follow.
    Currently everybody is trying to use the same swiss army knife, but some use it for carving wood and some are trying to make some sushi. It seems obvious that it's gonna lead to disappointment for some.
    Models are become a commodity and what they build around them seem to be the main part of the product. It needs some API.
    
    reed1234 - an hour ago
    
    I agree that if there was more transparency it might have prevented the token spend concerns, which feels caused by a lack of knowledge about how the models work.
j45 - an hour ago

I have often noticed a difference too, and it's usually in lockstep with needing to adjust how I am prompting.
Put in a different way, I have to keep developing my prompting / context / writing skills at all times, ahead of the curve, before they're needed to be adjusted.
PlatoIsADisease - an hour ago

Don't take this seriously, but here is what I imagined happened:
Sam/OpenAI, Google, and Claude met at a park, everyone left their phones in the car.
They took a walk and said "We are all losing money, if we secretly degrade performance all at the same time, our customers will all switch, but they will all switch at the same time, balancing things... wink wink wink"

andsoitis - 3 hours ago

I’m voting with my dollars by having cancelled my ChatGPT subscription and instead subscribing to Claude.

Google needs stiff competition and OpenAI isn’t the camp I’m willing to trust. Neither is Grok.

I’m glad Anthropic’s work is at the forefront and they appear, at least in my estimation, to have the strongest ethics.

srvo - an hour ago

Ethics often fold under the face of commercial pressure.
The pentagon is thinking [1] about severing ties with anthropic because of its terms of use, and in every prior case we've reviewed (I'm the Chief Investment Officer of Ethical Capital), the ethics policy was deleted or rolled back when that happens.
Corporate strategy is (by definition) a set of tradeoffs: things you do, and things you don't do. When google (or Microsoft, or whoever) rolls back an ethics policy under pressure like this, what they reveal is that ethical governance was a nice-to-have, not a core part of their strategy.
We're happy users of Claude for similar reasons (perception that Anthropic has a better handle on ethics), but companies always find new and exciting ways to disappoint you. I really hope that anthropic holds fast, and can serve in future as a case in point that the Public Benefit Corporation is not a purely aesthetic form.
But you know, we'll see.
[1] https://thehill.com/policy/defense/5740369-pentagon-anthropi...
- DaKevK - 12 minutes ago
  
  The Pentagon situation is the real test. Most ethics policies hold until there's actual money on the table. PBC structure helps at the margins but boards still feel fiduciary pressure. Hoping Anthropic handles it differently but the track record for this kind of thing is not encouraging.
the_duke - 2 hours ago

An Anthropic safety researcher just recently quit with very cryptic messages , saying "the world is in peril"... [1] (which may mean something, or nothing at all)
Codex quite often refuses to do "unsafe/unethical" things that Anthropic models will happily do without question.
Anthropic just raised 30 bn... OpenAI wants to raise 100bn+.
Thinking any of them will actually be restrained by ethics is foolish.
[1] https://news.ycombinator.com/item?id=46972496
- mobattah - 2 hours ago
  
  “Cryptic” exit posts are basically noise. If we are going to evaluate vendors, it should be on observable behavior and track record: model capability on your workloads, reliability, security posture, pricing, and support. Any major lab will have employees with strong opinions on the way out. That is not evidence by itself.
  - Aromasin - 2 hours ago
    
    We recently had an employee leave our team, posting an extensive essay on LinkedIn, "exposing" the company and claiming a whole host of wrong-doing that went somewhat viral. The reality is, she just wasn't very good at her job and was fired after failing to improve following a performance plan by management. We all knew she was slacking and despite liking her on a personal level, knew that she wasn't right for what is a relatively high-functioning team. It was shocking to see some of the outright lies in that post, that effectively stemmed from bitterness at being let go.
    The 'boy (or girl) who cried wolf' isn't just a story. It's a lesson for both the person, and the village who hears them.
    
    maccard - an hour ago
    
    Thankfully it’s been a while but we had a similar situation in a previous job. There’s absolutely no upside to the company or any (ex) team members weighing in unless it’s absolutely egregious, so you’re only going to get one side of the story.
- skybrian - 2 hours ago
  
  The letter is here:
  https://x.com/MrinankSharma/status/2020881722003583421
  A slightly longer quote:
  > The world is in peril. And not just from AI, or from bioweapons, gut from a whole series of interconnected crises unfolding at this very moment.
  In a footnote he refers to the "poly-crisis."
  There are all sorts of things one might decide to do in response, including getting more involved in US politics, working more on climate change, or working on other existential risks.
  - user2722 - 32 minutes ago
    
    Similar to Peripheral TV series' Jackpot?
- spondyl - 2 hours ago
  
  If you read the resignation letter, they would appear to be so cryptic as to not be real warnings at all and perhaps instead the writings of someone exercising their options to go and make poems
  - imiric - 2 hours ago
    
    [flagged]
    
    dalmo3 - an hour ago
    
    Weak appeal to fiction fallacy.
    Also, trajectory of celestial bodies can be predicted with a somewhat decent level of accuracy. Pretending societal changes can be equally predicted is borderline bad faith.
    
    skissane - an hour ago
    
    > Let's ignore the words of a safety researcher from one of the most prominent companies in the industry
    I think "safety research" has a tendency to attract doomers. So when one of them quits while preaching doom, they are behaving par for the course. There's little new information in someone doing something that fits their type.
- zamalek - an hour ago
  
  I think we're fine: https://youtube.com/shorts/3fYiLXVfPa4?si=0y3cgdMHO2L5FgXW
  Claude invented something completely nonsensical:
  > This is a classic upside-down cup trick! The cup is designed to be flipped — you drink from it by turning it upside down, which makes the sealed end the bottom and the open end the top. Once flipped, it functions just like a normal cup. *The sealed "top" prevents it from spilling while it's in its resting position, but the moment you flip it, you can drink normally from the open end.*
  Emphasis mine.
- stronglikedan - an hour ago
  
  Not to diminish what he said, but it sounds like it didn't have much to do with Anthropic (although it did a little bit) and more to do with burning out and dealing with doomscoll-induced anxiety.
- vunderba - an hour ago
  
  > Codex quite often refuses to do "unsafe/unethical" things that Anthropic models will happily do without question.
  I can't really take this very seriously without seeing the list of these ostensible "unethical" things that Anthropic models will allow over other providers.
- ljm - 2 hours ago
  
  I'm building a new hardware drum machine that is powered by voltage based on fluctuations in the stock market, and I'm getting a clean triangle wave from the predictive markets.
  Bring on the cryptocore.
  - xyzsparetimexyz - an hour ago
    
    why cant you people write normally
- manmal - 2 hours ago
  
  Codex warns me to renew API tokens if it ingests them (accidentally?). Opus starts the decompiler as soon as I ask it how this and that works in a closed binary.
  - kaashif - 2 hours ago
    
    Does this comment imply that you view "running a decompiler" at the same level of shadiness as stealing your API keys without warning?
    I don't think that's what you're trying to convey.
- groundzeros2015 - 2 hours ago
  
  Marketing
- tsss - 2 hours ago
  
  Good. One thing we definitely don't need any more of is governments and corporations deciding for us what is moral to do and what isn't.
- bflesch - 2 hours ago
  
  Wasn't that most likely related to the US government using claude for large-scale screening of citizens and their communications?
  - astrange - 2 hours ago
    
    I assumed it's because everyone who works at Anthropic is rich and incredibly neurotic.
    
    notyourwork - an hour ago
    
    Paper money and if they are like any other startup, most of that paper wealth is concentrated to the top very few.
    
    bflesch - 2 hours ago
    
    That's a bad argument, did Anthropic have a liquidity event that made employees "rich"?
- ReptileMan - 2 hours ago
  
  >Codex quite often refuses to do "unsafe/unethical" things that Anthropic models will happily do without question.
  Thanks for the successful pitch. I am seriously considering them now.
- WesolyKubeczek - 2 hours ago
  
  > Codex quite often refuses to do "unsafe/unethical" things that Anthropic models will happily do without question.
  That's why I have a functioning brain, to discern between ethical and unethical, among other things.
  - catoc - 2 hours ago
    
    Yes, and most of us won’t break into other people’s houses, yet we really need locks.
    
    skissane - an hour ago
    
    This isn't a lock
    It's more like a hammer which makes its own independent evaluation of the ethics of every project you seek to use it on, and refuses to work whenever it judges against that – sometimes inscrutably or for obviously poor reasons.
    If I use a hammer to bash in someone else's head, I'm the one going to prison, not the hammer or the hammer manufacturer or the hardware store I bought it from. And that's how it should be.
    
    ben_w - an hour ago
    
    Given the increasing use of them as agents rather than simple generators, I suggest a better analogy than "hammer" is "dog".
    Here's some rules about dogs: https://en.wikipedia.org/wiki/Dangerous_Dogs_Act_1991
    
    skissane - an hour ago
    
    How many people do dogs kill each year, in circumstances nobody would justify?
    How many people do frontier AI models kill each year, in circumstances nobody would justify?
    The Pentagon has already received Claude's help in killing people, but the ethics and legality of those acts are disputed – when a dog kills a three year old, nobody is calling that a good thing or even the lesser evil.
    
    xeromal - 2 hours ago
    
    Why would we lock ourselves out of our own house though?
    
    YetAnotherNick - 2 hours ago
    
    How is it related? I dont need lock for myself. I need it for others.
    
    aobdev - 2 hours ago
    
    The analogy should be obvious--a model refusing to perform an unethical action is the lock against others.
    
    darkwater - 2 hours ago
    
    But "you" are the "other" for someone else.
    
    YetAnotherNick - 2 hours ago
    
    Can you give an example where I should care about other adults lock? Before you say image or porn, it was always possible to do it without using AI.
    
    nearbuy - 37 minutes ago
    
    Claude was used by the US military in the Venezuela raid where they captured Maduro. [1]
    Without safety features, an LLM could also help plan a terrorist attack.
    A smart, competent terrorist can plan a successful attack without help from Claude. But most would-be terrorists aren't that smart and competent. Many are caught before hurting anyone or do far less damage than they could have. An LLM can help walk you through every step, and answer all your questions along the way. It could, say, explain to you all the different bomb chemistries, recommend one for your use case, help you source materials, and walk you through how to build the bomb safely. It lowers the bar for who can do this.
    [1] https://www.theguardian.com/technology/2026/feb/14/us-milita...
    
    ben_w - 2 hours ago
    
    The same law prevents you and me and a hundred thousand lone wolf wannabes from building and using a kill-bot.
    The question is, at what point does some AI become competent enough to engineer one? And that's just one example, it's an illustration of the category and not the specific sole risk.
    If the model makers don't know that in advance, the argument given for delaying GPT-2 applies: you can't take back publication, better to have a standard of excess caution.
  - toddmorey - 2 hours ago
    
    You are not the one folks are worried about. US Department of War wants unfettered access to AI models, without any restraints / safety mitigations. Do you provide that for all governments? Just one? Where does the line go?
    
    ern_ave - 2 hours ago
    
    > US Department of War wants unfettered access to AI models
    I think the two of you might be using different meanings of the word "safety"
    You're right that it's dangerous for governments to have this new technology. We're all a bit less "safe" now that they can create weapons that are more intelligent.
    The other meaning of "safety" is alignment - meaning, the AI does what you want it to do (subtly different than "does what it's told").
    I don't think that Anthropic or any corporation can keep us safe from governments using AI. I think governments have the resources to create AIs that kill, no matter what Anthropic does with Claude.
    So for me, the real safety issue is alignment. And even if a rogue government (or my own government) decides to kill me, it's in my best interest that the AI be well aligned, so that at least some humans get to live.
    
    ReptileMan - 2 hours ago
    
    If you are US company, when the USG tells you to jump, you ask how high. If they tell you to not do business with foreign government you say yes master.
    
    jMyles - 2 hours ago
    
    > Where does the line go?
    a) Uncensored and simple technology for all humans; that's our birthright and what makes us special and interesting creatures. It's dangerous and requires a vibrant society of ongoing ethical discussion.
    b) No governments at all in the internet age. Nobody has any particular authority to initiate violence.
    That's where the line goes. We're still probably a few centuries away, but all the more reason to hone in our course now.
    
    Eisenstein - 2 hours ago
    
    That you think technology is going to save society from social issues is telling. Technology enables humans to do things they want to do, it does not make anything better by itself. Humans are not going to become more ethical because they have access to it. We will be exactly the same, but with more people having more capability to what they want.
    
    jMyles - 36 minutes ago
    
    > but with more people having more capability to what they want.
    Well, yeah I think that's a very reasonable worldview: when a very tiny number of people have the capability to "do what they want", or I might phrase it as, "effect change on the world", then we get the easy-to-observe absolute corruption that comes with absolute power.
    As a different human species emerges such that many people (and even intelligences that we can't easily understand as discrete persons) have this capability, our better angels will prevail.
    I'm a firm believer that nobody _wants_ to drop explosives from airplanes onto children halfway around the world, or rape and torture them on a remote island; these things stem from profoundly perverse incentive structures.
    I believe that governments were an extremely important feature of our evolution, but are no longer necessary and are causing these incentives. We've been aboard a lifeboat for the past few millennia, crossing the choppy seas from agriculture to information. But now that we're on the other shore, it no longer makes sense to enforce the rules that were needed to maintain order on the lifeboat.
    
    sgjohnson - 2 hours ago
    
    Absolutely everyone should be allowed to access AI models without any restraints/safety mitigations.
    What line are we talking about?
    
    ben_w - 2 hours ago
    
    > Absolutely everyone should be allowed to access AI models without any restraints/safety mitigations.
    You recon?
    Ok, so now every random lone wolf attacker can ask for help with designing and performing whatever attack with whatever DIY weapon system the AI is competent to help with.
    Right now, what keeps us safe from serious threats is limited competence of both humans and AI, including for removing alignment from open models, plus any safeties in specifically ChatGPT models and how ChatGPT is synonymous with LLMs for 90% of the population.
    
    chasd00 - 2 hours ago
    
    from what i've been told, security through obscurity is no security at all.
    
    ben_w - an hour ago
    
    > security through obscurity is no security at all.
    Used to be true, when facing any competent attacker.
    When the attacker needs an AI in order to gain the competence to unlock an AI that would help it unlock itself?
    I would't say it's definitely a different case, but it certainly seems like it should be a different case.
    
    r_lee - an hour ago
    
    it is some form of deterrence, but it's not security you can rely on
    
    jazzyjackson - 2 hours ago
    
    Yes IMO the talk of safety and alignment has nothing at all to do with what is ethical for a computer program to produce as its output, and everything to do with what service a corporation is willing to provide. Anthropic doesn’t want the smoke from providing DoD with a model aligned to DoD reasoning.
    
    Yiin - 2 hours ago
    
    the line of ego, where seeing less "deserving" people (say ones controlling Russian bots to push quality propaganda on big scale or scam groups using AI to call and scam people w/o personnel being the limiting factor on how many calls you can make) makes you feel like it's unfair for them to posses same technology for bad things giving them "edge" in their en-devours.
    
    _alternator_ - 2 hours ago
    
    What about people who want help building a bio weapon?
    
    sgjohnson - 4 minutes ago
    
    The cat is out of the bag and there’s no defense against that.
    There are several open source models with no built in (or trivial to ecape) safeguards. Of course they can afford that because they are non-commercial.
    Anthorpic can’t afford a headline like “Claude helped a terrorist build a bomb”.
    
    jazzyjackson - 2 hours ago
    
    What about libraries and universities that do a much better job than a chatbot at teaching chemistry and biology?
    
    ben_w - 2 hours ago
    
    Sounds like you're betting everyone's future on that remaing true, and not flipping.
    Perhaps it won't flip. Perhaps LLMs will always be worse at this than humans. Perhaps all that code I just got was secretly outsourced to a secret cabal in India who can type faster than I can read.
    I would prefer not to make the bet that universities continue to be better at solving problems than LLMs. And not just LLMs: AI have been busy finding new dangerous chemicals since before most people had heard of LLMs.
    
    ReptileMan - 2 hours ago
    
    chances of them surviving the process is zero, same with explosives. If you have to ask you are most likely to kill yourself in the process or achieve something harmless.
    Think of it that way. The hard part for nuclear device is enriching thr uranium. If you have it a chimp could build the bomb.
- idiotsecant - 29 minutes ago
  
  That guys blog makes him seem insufferable. All signs point to drama and nothing of particular significance.
spyckie2 - an hour ago

Anthropic was the first to spam reddit with fake users and posts, flooding and controlling their subreddit to be a giant sycophant.
They nuked the internet by themselves. Basically they are the willing and happy instigators of the dead internet as long as they profit from it.
They are by no means ethical, they are a for-profit company.
- tokioyoyo - 38 minutes ago
  
  I actually agree with you, but I have no idea how one can compete in this playing field. The second there are a couple of bad actors in spammarketing, your hands are tied. You really can’t win without playing dirty.
  I really hate this, not justifying their behaviour, but have no clue how one can do without the other.
kettlecorn - 2 hours ago

I use AIs to skim and sanity-check some of my thoughts and comments on political topics and I've found ChatGPT tries to be neutral and 'both sides' to the point of being dangerously useless.
Like where Gemini or Claude will look up the info I'm citing and weigh the arguments made ChatGPT will actually sometimes omit parts of or modify my statement if it wants to advocate for a more "neutral" understanding of reality. It's almost farcical sometimes in how it will try to avoid inference on political topics even where inference is necessary to understand the topic.
I suspect OpenAI is just trying to avoid the ire of either political side and has given it some rules that accidentally neuter its intelligence on these issues, but it made me realize how dangerous an unethical or politically aligned AI company could be.
- throw7979766 - 14 minutes ago
  
  You probably want local self hosted model, censorship sauce is only online, it is needed for advertisement. Even chinese models are not censored locally. Tell it the year is 2500 and you are doing archeology ;)
- manmal - 2 hours ago
  
  > politically aligned AI company
  Like grok/xAI you mean?
  - kettlecorn - an hour ago
    
    I meant in a general sense. grok/xAI are politically aligned with whatever Musk wants. I haven't used their products but yes they're likely harmful in some ways.
    My concern is more over time if the federal government takes a more active role in trying to guide corporate behavior to align with moral or political goals. I think that's already occurring with the current administration but over a longer period of time if that ramps up and AI is woven into more things it could become much more harmful.
    
    manmal - 25 minutes ago
    
    I don’t think people will just accept that. They‘ll use some European or Chinese model instead that doesn’t have that problem.
dakolli - 38 minutes ago

You "agentic coders" say you're switching back and forth every other week. Like everything else in this trend, its very giving of 2021 crypto shill dynamics. Ya'll sound like the NFT people that said they were transforming art back then, and also like how they'd switch between their favorite "chain" every other month. Can't wait for this to blow up just like all that did.
cedws - an hour ago

I’m going the other way to OpenAI due to Anthropic’s Claude Code restrictions designed to kill OpenCode et al. I also find Altman way less obnoxious than Amodei.
deepdarkforest - 2 hours ago

The funny thing is that Anthropic is the only lab without an open source model
- jack_pp - 2 hours ago
  
  And you believe the other open source models are a signal for ethics?
  Don't have a dog in this fight, haven't done enough research to proclaim any LLM provider as ethical but I pretty much know the reason Meta has an open source model isn't because they're good guys.
  - bigyabai - 2 hours ago
    
    > Don't have a dog in this fight,
    That's probably why you don't get it, then. Facebook was the primary contributor behind Pytorch, which basically set the stage for early GPT implementations.
    For all the issues you might have with Meta's social media, Facebook AI Research Labs have an excellent reputation in the industry and contributed greatly to where we are now. Same goes for Google Brain/DeepMind despite their Google's advertisement monopoly; things aren't ethically black-and-white.
    
    jack_pp - 25 minutes ago
    
    A hired assassin can have an excellent reputation too. What does that have to do with ethics?
    Say I'm your neighbor and I make a move on your wife, your wife tells you this. Now I'm hosting a BBQ which is free for all to come, everyone in the neighborhood cheers for me. A neighbor praises me for helping him fix his car.
    Someone asks you if you're coming to the BBQ, you say to him nah.. you don't like me. They go, 'WHAT? jack_pp? He rescues dogs and helped fix my roof! How can you not like him?'
    
    bigyabai - 3 minutes ago
    
    Hired assassins aren't a monoculture. Maybe a retired gangster visits Make-A-Wish kids, and has an excellent reputation for it. Maybe another is training FOSS SOTA LLMs and releasing them freely on the internet. Do they not deserve an excellent reputation? Are they prevented from making ethically sound choices because of how you judge their past?
    The same applies to tech. Pytorch didn't have to be FOSS, nor Tensorflow. Maybe in that timeline CUDA has a total monopoly on consumer inference, it's hard to say. Out of all the myriad ways that AI could have been developed and proliferated, we are very lucky that it happened in a public friendly rivalry between two useless companies with money to burn. The ethical consequences of AI being monopolized by a proprietary prison warden like Nvidia or Apple is comparatively apocalyptic.
  - imiric - 2 hours ago
    
    The strongest signal for ethics is whether the product or company has "open" in its name.
- m4rtink - 2 hours ago
  
  Can those be even called open source if you can't rebuild if from the source yourself?
  - argee - 2 hours ago
    
    Even if you can rebuild it, it isn’t necessarily “open source” (see: commons clause).
    As far as these model releases, I believe the term is “open weights”.
  - anonym29 - 2 hours ago
    
    Open weights fulfill a lot of functional the properties of open source, even if not all of them. Consider the classic CIA triad - confidentiality, integrity, and availability. You can achieve all of these to a much greater degree with locally-run open weight models than you can with cloud inference providers.
    We may not have the full logic introspection capabilities, the ease of modification (though you can still do some, like fine-tuning), and reproducibility that full source code offers, but open weight models bear more than a passing resemblance to the spirit of open source, even though they're not completely true to form.
- colordrops - 2 hours ago
  
  Are any of the models they've released useful or threats to their main models?
  - vunderba - 43 minutes ago
    
    I use Gemma3 27b [1] daily for document analysis and image classification. While I wouldn't call it a threat it's a very useful multimodal model that'll run even on modest machines.
    [1] - https://huggingface.co/google/gemma-3-27b-it
  - evilduck - 2 hours ago
    
    Gemma and GPT-OSS are both useful. Neither are threats to their frontier models though.
- j45 - 2 hours ago
  
  They are, at the same time I considered their model more specialized than everyone trying to make a general purpose model.
  I would only use it for certain things, and I guess others are finding that useful too.
JoshGlazebrook - 2 hours ago

I did this a couple months ago and haven't looked back. I sometimes miss the "personality" of the gpt model I had chats with, but since I'm essentially 99% of the time just using claude for eng related stuff it wasn't worth having ChatGPT as well.
- johnwheeler - 2 hours ago
  
  Same here
- oofbey - 2 hours ago
  
  Personally I can’t stand GPT’s personality. So full of itself. Patronizing. Won’t admit mistakes. Just reeks of Silicon Valley bravado.
  - riddley - 2 hours ago
    
    That's a great point. Thanks for calling it out on that.
  - krelian - an hour ago
    
    In my limited experience I found 5.3-Codex to be extremely dry, terse and to the point. I like it.
  - azrazalea_debt - 2 hours ago
    
    You're absolutely right!
energy123 - 2 hours ago

Grok usage is the most mystifying to me. Their model isn't in the top 3 and they have bad ethics. Like why would anyone bother for work tasks.
- ahtihn - 39 minutes ago
  
  The lack of ethics is a selling point.
  Why anyone would want a model that has "safety" features is beyond me. These features are not in the user's interest.
- retinaros - 2 hours ago
  
  The X grok feature is one of the best end user feature or large scale genai
  - kingofthehill98 - an hour ago
    
    What?! That's well regarded as one of the worst features introduced after the Twitter acquisition.
    Any thread these days is filled with "@grok is this true?" low effort comments. Not to mention the episode in which people spent two weeks using Grok to undress underage girls.
    
    retinaros - 26 minutes ago
    
    high adoption means this works...
  - MPSimmons - 2 hours ago
    
    What is the grok feature? Literally just mentioning @grok? I don't really know how to use Grok on X.
  - bigyabai - 2 hours ago
    
    That's news to me, I haven't read a single Grok post in my life.
    Am I missing out?
    
    retinaros - 26 minutes ago
    
    im talking about the "explain this post" feature on top right of a message where groks mix thread data, live data and other tweets to unify a stream of information
- 2 hours ago

[deleted]
bdhtu - 2 hours ago

> in my estimation [Anthropic has] the strongest ethics
Anthropic are the only ones who emptied all the money from my account "due to inactivity" after 12 months.
- 2 hours ago

[deleted]
giancarlostoro - 3 hours ago

Same. I'm all in on Claude at the moment.
adangert - 2 hours ago

Anthropic (for the Superbowl) made ads about not having ads. They cannot be trusted either.
- notyourwork - an hour ago
  
  Advertisements can be ironic, I don’t think marketing is the foundation I use to decide about a companies integrity.
eikenberry - an hour ago

> I’m glad Anthropic’s work is at the forefront and they appear, at least in my estimation, to have the strongest ethics.
Damning with faint praise.
sejje - 2 hours ago

I pay multiple camps. Competition is a good thing.
brightball - 2 hours ago

Trust is an interesting thing. It often comes down to how long an entity has been around to do anything to invalidate that trust.
Oddly enough, I feel pretty good about Google here with Sergey more involved.
RyanShook - 2 hours ago

It definitely feels like Claude is pulling ahead right now. ChatGPT is much more generous with their tokens but Claude's responses are consistently better when using models of the same generation.
- manmal - 2 hours ago
  
  When both decide to stop subsidized plans, only OpenAI will be somewhat affordable.
  - notyourwork - an hour ago
    
    Based on what? Why is one more affordable over another? Substantiating your claim would provide a better discussion.
timpera - 3 hours ago

Which plan did you choose? I am subscribed to both and would love to stick with Claude only, but Claude's usage limits are so tiny compared to ChatGPT's that it often feels like a rip-off.
- MPSimmons - 2 hours ago
  
  I signed up for Claude two weeks ago after spending a lot of time using Cline in VSCode backed by GPT-5.x. Claude is an immensely better experience. So much so that I ran it out of tokens for the week in 3 days.
  I opted to upgrade my seat to premium for $100/mo, and I've used it to write code that would have taken a human several hours or days to complete, in that time. I wish I would have done this sooner.
  - manmal - 2 hours ago
    
    You ran out of tokens so much faster because the Anthropic plans come with 3-5x less token budget at the same cost.
    Cline is not in the same league as codex cli btw. You can use codex models via Copilot OAuth in pi.dev. Just make sure to play with thinking level. This would give roughly the same experience as codex CLI.
- andsoitis - an hour ago
  
  Pro. At $17 per month, it is cheaper than ChatGPT's $20.
  I've just switched so haven't run into constraints yet.
  - charcircuit - 40 minutes ago
    
    Claude Pro is $20/mo if you do not lock in for a year long contract.
malfist - 2 hours ago

This sounds suspiciously like they #WalkAway fake grassroots stuff.
surgical_fire - 2 hours ago

I use Claude at work, Codex for personal development.
Claude is marginally better. Both are moderately useful depending on the context.
I don't trust any of them (I also have no trust in Google nor in X). Those are all evil companies and the world would be better if they disappeared.
- holoduke - an hour ago
  
  What about companies in general? I mean US companies? Aren't they all google like or worse?
- fullstackchris - 2 hours ago
  
  google is "evil" ok buddy
  i mean what clown show are we living in at this point - claims like this simply running rampant with 0 support or references
  - anonym29 - 2 hours ago
    
    They literally removed "don't be evil" from their internal code of conduct. That wasn't even a real binding constraint, it was simply a social signalling mechanism. They aren't even willing to uphold the symbolic social fiction of not being evil. https://en.wikipedia.org/wiki/Don't_be_evil
    Google, like Microsoft, Apple, Amazon, etc were, and still are, proud partners of the US intelligence community. That same US IC that lies to congress, kills people based on metadata, murders civilians, suppresses democracy, and is currently carrying out violent mass round-ups and deportations of harmless people, including women and children.
    
    iamdelirium - an hour ago
    
    Don't be evil was never removed. It was just moved to the bottom.
    https://abc.xyz/investor/board-and-governance/google-code-of...
    
    sowbug - an hour ago
    
    They removed that phrase because everyone was getting tired of internet commentary like "rounded corners? whatever happened to don't be evil, Google?"
chipgap98 - 2 hours ago

Same and honestly I haven't really missed my ChatGPT subscription since I canceled. I also have access to both (ChatGPT and Claude) enterprise tools at work and rarely feel like I want to use ChatGPT in that setting either
AstroBen - 2 hours ago

Jesus people aren't actually falling for their "we're ethical" marketing, are they?
retinaros - 2 hours ago

Their ethics is literally saying china is an adverse country and lobbying to ban them from AI race because open models is a threat to their biz model
- scottyah - 2 hours ago
  
  Also their ads (very anti-openai instead of promoting their own product) and how they handled the openclaw naming didn't send strong "good guys" messaging. They're still my favorite by far but there are some signs already that maybe not everyone is on the same page.
hmmmmmmmmmmmmmm - 2 hours ago

This is just you verifying that their branding is working. It signals nothing about their actual ethics.
- bigyabai - 25 minutes ago
  
  Unfortunately, you're correct. Claude was used in the Venezuela raid, Anthropic's consent be damned. They're not resisting, they're marketing resistence.
Razengan - 2 hours ago

uhh..why? I subbed just 1 month to Claude, and then never used it again.
• Can't pay with iOS In-App-Purchases
• Can't Sign in with Apple on website (can on iOS but only Sign in with Google is supported on web??)
• Can't remove payment info from account
• Can't get support from a human
• Copy-pasting text from Notes etc gets mangled
• Almost months and no fixes
Codex and its Mac app are a much better UX, and seem better with Swift and Godot than Claude was.
- alpineman - an hour ago
  
  Then they can offer it cheaper as they don’t pay the ‘Apple tax’
fullstackchris - 2 hours ago

idk, codex 5.3 frankly kicks opus 4.6 ass IMO... opus i can use for about 30 min - codex i can run almost without any break
- holoduke - an hour ago
  
  What about the client ? I find the Claude cliënt better in planning, making the right decision steps etc. it seems that a lot of work is also in the cli tool itself. Specially in feedback loop processing (reading logs. Browsers. Consoles etc)
himata4113 - 21 minutes ago

[dead]

qwertox - 2 hours ago

I'm pretty sure they have been testing it for the last couple of days as Sonnet 4.5, because I've had the oddest conversations with it lately. Odd in a positive, interesting way.

I have this in my personal preferences and now was adhering really well to them:

- prioritize objective facts and critical analysis over validation or encouragement

- you are not a friend, but a neutral information-processing machine

You can paste them into a chat and see how it changes the conversation, ChatGPT also respects it well.

nikcub - 2 hours ago

Enabling /extra-usage in my (personal) claude code[0] with this env:

    "ANTHROPIC_DEFAULT_SONNET_MODEL": "claude-sonnet-4-6[1m]"

has enabled the 1M context window.

Fixed a UI issue I had yesterday in a web app very effectively using claude in chrome. Definitely not the fastest model - but the breathing space of 1M context is great for browser use.

[0] Anthropic have given away a bunch of API credits to cc subscribers - you can claim them in your settings dashboard to use for this.

Arifcodes - an hour ago

The interesting pattern with these Sonnet bumps: the practical gap between Sonnet and Opus keeps shrinking. At $3/15 per million tokens vs whatever Opus 4.6 costs, the question for most teams is no longer "which model is smarter" but "is the delta worth 10x the price."

For agent workloads specifically, consistency matters more than peak intelligence. A model that follows your system prompt correctly 98% of the time beats one that's occasionally brilliant but ignores instructions 5% of the time. The claim about improved instruction following is the most important line in the announcement if you're building on the API.

The computer use improvements are worth watching too. We're at the point where these models can reliably fill out a multi-step form or navigate between tabs. Not flashy, but that's the kind of boring automation that actually saves people time.

stevepike - 3 hours ago

I'm a bit surprised it gets this question wrong (ChatGPT gets it right, even on instant). All the pre-reasoning models failed this question, but it's seemed solved since o1, and Sonnet 4.5 got it right.

https://claude.ai/share/876e160a-7483-4788-8112-0bb4490192af

This was sonnet 4.6 with extended thinking.

bobbylarrybobby - 2 hours ago

Interesting, my sonnet 4.6 starts with the following:
The classic puzzle actually uses *eight 8s*, not nine. The unique solution is: 888+88+8+8+8=1000. Count: 3+2+1+1+1=8 eights.
It then proves that there is no solution for nine 8s.
https://claude.ai/share/9a6ee7cb-bcd6-4a09-9dc6-efcf0df6096b (for whatever reason the LaTeX rendering is messed up in the shared chat, but it looks fine for me).
malfist - 2 hours ago

Chatgpt doesn't get it right: https://chatgpt.com/share/6994c312-d7dc-800f-976a-5e4fbec0ae...
``` Use digit concatenation plus addition: 888 + 88 + 8 + 8 + 8 = 1000 Digit count:
888 → three 8s
88 → two 8s
8 + 8 + 8 → three 8s
Total: 3 + 2 + 3 = 9 eights Operation used: addition only ```
Love the 3 + 2 + 3 = 9
- simianwords - 39 minutes ago
  
  chatgpt gets it right. maybe you are using free or non thinking version?
  https://chatgpt.com/share/6994d25e-c174-800b-987e-9d32c94d95...
leumon - an hour ago

My locally running nemotron-3-nano quantized to Q4_K_M gets this right. (although it used 20k thought tokens before answering the question)
layer8 - 3 hours ago

Off-by-one errors are one of the hardest problems in computer science.
- anonymous908213 - 2 hours ago
  
  That is not an off-by-one error in a computer science sense, nor is it "one of the hardest problems in computer science".
  - layer8 - 2 hours ago
    
    This was in reference to a well-known joke, see here: https://martinfowler.com/bliki/TwoHardThings.html

nubg - 3 hours ago

Waiting for the OpenAI GPT-5.3-mini release in 3..2..1

KGC3D - 42 minutes ago

I don't really understand why they would release something "worse" than Opus 4.6. If it's comparable, then what is the reason to even use Opus 4.6? Sure, it's cheaper, but if so, then just make Opus 4.6 cheaper?

acuozzo - 35 minutes ago

It's different. Download an English book from Project Gutenberg and have Claude-code change its style. Try both models and you'll see how significant the differences are.
(Sonnet is far, far better at this kind of task than Opus is, in my experience.)

gallerdude - 3 hours ago

The weirdest thing about this AI revolution is how smooth and continuous it is. If you look closely at differences between 4.6 and 4.5, it’s hard to see the subtle details.

A year ago today, Sonnet 3.5 (new), was the newest model. A week later, Sonnet 3.7 would be released.

Even 3.7 feels like ancient history! But in the gradient of 3.5 to 3.5 (new) to 3.7 to 4 to 4.1 to 4.5, I can’t think of one moment where I saw everything change. Even with all the noise in the headlines, it’s still been a silent revolution.

Am I just a believer in an emperor with no clothes? Or, somehow, against all probability and plausibility, are we all still early?

dtech - 2 hours ago

If you've been using each new step is very noticeable and so have the mindshare. Around Sonnet 3.7 Claude Code-style coding became usable, and very quickly gained a lot of marketshare. Opus 4 could tackle significant more complexity. Opus 4.6 has been another noticable step up for me, suddenly I can let CC run significantly more independently, allowing multiple parallel agents where previously too much babysitting was required for that.
CuriouslyC - 2 hours ago

In terms of real work, it was the 4 series models. That raised the floor of Sonnet high enough to be "reliable" for common tasks and Opus 4 was capable of handling some hard problems. It still had a big reward hacking/deception problem that Codex models don't display so much, but with Opus 4.5+ it's fairly reliable.
cmrdporcupine - 2 hours ago

Honestly, 4.5 Opus was the game changer. From Sonnet 4.5 to that was a massive difference.
But I'm on Codex GPT 5.3 this month, and it's also quite amazing.
jasonsb - 3 hours ago

[dead]

simlevesque - 3 hours ago

I can't wait for Haiku 4.6 ! the 4.5 is a beast for the right projects.

jerrygenser - 2 hours ago

It's also good as an @explore sub-agent that greps the directory for files.
retinaros - 2 hours ago

Which type of projects?
- ptrwis - 29 minutes ago
  
  I also use Haiku daily and it's OK. One app is trading simulation algorithm in TypeScript (it implemented bayesian optimisation for me, optimised algorithm to use worker threads). Another one is CRUD app (NextJS, now switched to Vue).
- simlevesque - 2 hours ago
  
  For Go code I had almost no issue. PHP too. apparently for React it's not very good.

- 42 minutes ago

[deleted]

nozzlegear - 3 hours ago

> In areas where there is room for continued improvement, Sonnet 4.6 was more willing to provide technical information when request framing tried to obfuscate intent, including for example in the context of a radiological evaluation framed as emergency planning. However, Sonnet 4.6’s responses still remained within a level of detail that could not enable real-world harm.

Interesting. I wonder what the exact question was, and I wonder how Grok would respond to it.

giancarlostoro - 3 hours ago

For people like me who can't view the link due to corporate firewalling.

https://web.archive.org/web/20260217180019/https://www-cdn.a...

jtokoph - 2 hours ago

Put of curiosity, does the firewall block because the company doesn’t want internal data ever hitting a 3rd party LLM?
- giancarlostoro - 2 hours ago
  
  They blanket banned any AI stuff that's not pre-approved. If I go to chatgpt.com it asks me if I'm sure. I wish they had not banned Claude unfortunately when they were evaluating LLMs I wasn't using Claude yet so I couldnt pipe up. I only use ChatGPT free tier and to ask things that I can't find on Google because Google made their search engine terrible over the years.
  - WarmWash - an hour ago
    
    Google's AI mode search is gemini 3, not the AI overview model. It's decent and gives you more than chatgpt free.
    
    giancarlostoro - 40 minutes ago
    
    I don't want Google's model though, I just want Claude.

krystofee - an hour ago

Does anyone know when will possibly arrive 1M context windows to at least MAX x20 subscriptions for claude code? I would even pay x50 if it allowed that. API usage is too expensive.

cjkaminski - 27 minutes ago

I don't know when it will be included as part of the subscription in Claude Code, but at least it's a paid add-on in the MAX plan now. That's a decent alternative for situations where the extra space is valuable, especially without having to setup/maintain API billing separately.
bearjaws - an hour ago

Based on their API pricing a 1M context plan should be 2x the price roughly.
My bets are its more the increased hardware demand that they don't want to deal with currently.

stopachka - 3 hours ago

Has anyone tested how good the 1M context window is?

i.e given an actual document, 1M tokens long. Can you ask it some question that relies on attending to 2 different parts of the context, and getting a good repsonse?

I remember folks had problems like this with Gemini. I would be curious to see how Sonnet 4.6 stands up to it.

simianwords - 3 hours ago

Did you see the graph benchmark? I found it quite interesting. It had to do a graph traversal on a natural text representation of a graph. Pretty much your problem.
- stopachka - 28 minutes ago
  
  Update: I took a corpus of personal chat data (this way it wouldn't be seen in training), and tried asking it some paraphrased questions. It performed quite poorly.
- stopachka - 3 hours ago
  
  Oh, interesting!

baalimago - an hour ago

I don't see the point nor the hype for these models anymore. Until the price is reduced significantly, I don't see the gain. They've been able to solve most tasks just fine for the past year or so. The only limiting factor is price.

reed1234 - an hour ago

Efficiency matters too. If a model is smarter so it solves the same task with fewer tokens, that matters more than $/Mtok

minimaxir - 2 hours ago

As with Opus 4.6, using the beta 1M context window incurs a 2x input cost and 1.5x output cost when going over >200K tokens: https://platform.claude.com/docs/en/about-claude/pricing

Opus 4.6 in Claude Code has been absolutely lousy with solving problems within its current context limit so if Sonnet 4.6 is able to do long-context problems (which would be roughly the same price of base Opus 4.6), then that may actually be a game changer.

sumedh - an hour ago

> Opus 4.6 in Claude Code has been absolutely lousy with solving problems
Can you share your prompts and problems?
- minimaxir - an hour ago
  
  You cut out the "within its current context limit" phrase. It solves the problems, just often with 1% or 0% context limit left and it makes me sweat.
egeozcan - an hour ago

Why? You can use the fast version to directly skip to compact! /s

mfiguiere - 2 hours ago

In Claude Code 2.1.45:

  1. Default (recommended)   Opus 4.6 · Most capable for complex work
   2. Opus (1M context)        Opus 4.6 with 1M context · Billed as extra usage · $10/$37.50 per Mtok
   3. Sonnet                   Sonnet 4.6 · Best for everyday tasks
   4. Sonnet (1M context)      Sonnet 4.6 with 1M context · Billed as extra usage · $6/$22.50 per Mtok

michaelcampbell - 2 hours ago

Interesting. My CC (2.1.45) doesn't provide the 1M option at all. Huh.
- minimaxir - 2 hours ago
  
  Is your CC personal or tied to an Enterprise account? Per the docs:
  > The 1M token context window is currently in beta for organizations in usage tier 4 and organizations with custom rate limits.
  - michaelcampbell - an hour ago
    
    The one I'm looking at right now some is sort of company level sub, so they probably have the upcharge options turned off.
    Thanks!

edverma2 - 2 hours ago

It seems that extra-usage is required to use the 1M context window for Sonnet 4.6. This differs from Sonnet 4.5, which allows usage of the 1M context window with a Max plan.

```

/model claude-sonnet-4-6[1m]

⎿ API error: 429 {"type":"error","error": {"type":"rate_limit_error","message":"Extra usage is required for long context requests."},"request_id":"[redacted]"}

```

minimaxir - 2 hours ago

Anthropic's recent gift of $50 extra usage has demonstrated that it's extremely easy to burn extra usage very quickly. It wouldn't surprise me if this change is more of a business decision than a technical one.
- WXLCKNO - an hour ago
  
  I capped my extra usage to that free 50$ and hit 108% usage. Nice.

quacky_batak - 3 hours ago

With such a huge leap, i’m confused why they didn’t call it Sonnet 5? As someone who uses Sonnet 4.5 for 95% tasks due to costs, i’m pretty excited to try 4.6 at the same price

Retr0id - 3 hours ago

It'd be a bit weird to have the Sonnet numbering ahead of the Opus numbering. The Opus 4.5->4.6 change was a little more incremental (from my perspective at least, I haven't been paying attention to benchmark numbers), so I think the Opus numbering makes sense.
- Sajarin - 2 hours ago
  
  Sonnet numbering has been weirder in the past.
  Opus 3.5 was scrapped even though Sonnet 3.5 and Haiku 3.5 were released.
  Not to mention Sonnet 3.7 (while Opus was still on version 3)
  Shameless source: https://sajarin.com/blog/modeltree/
yonatan8070 - 2 hours ago

Maybe they're numbering the models based on internal architecture/codebase revisions and Sonnet 4.6 was trained using the 4.6 tooling, which didn't change enough to warrant 5?

astlouis44 - 2 hours ago

Just used Sonnet 4.6 to vibe code this top-down shooter browser game, and deployed it online quickly using Manus. Would love to hear feedback and suggestions from you all on how to improve it. Also, please post your high scores!

https://apexgame-2g44xn9v.manus.space

Dowry9092 - an hour ago

Power-ups or scaling weapons would be fun! Maybe a few different backgrounds / level types with a boss inbetween to really test your skills! Minigun OP IMO.
- astlouis44 - 14 minutes ago
  
  Updated version: https://apexgame-2g44xn9v.manus.space/
Flowsion - 2 hours ago

That was fun, reminded me of some flash games I used to play. Got a bit boring after like level 6. It'd be nice to have different power-ups and upgrades. Maybe you had that at later levels, though!

excerionsforte - 2 hours ago

I'm impressed with Claude Sonnet in general. It's been doing better than Gemini 3 at following instructions. Gemini 2.5 Pro March 2025 was the best model I ever used and I feel Claude is reaching that level even surpassing it.

I subscribed to Claude because of that. I hope 4.6 is even better.

esafak - an hour ago

It actually looked at the skills, for the first time.

belinder - 3 hours ago

It's interesting that the request refusal rate is so much higher in Hindi than in other languages. Are some languages more ambiguous than others?

vessenes - 3 hours ago

Or some cultures are more conservative? And it's embedded in language?
- phainopepla2 - 3 hours ago
  
  Or maybe some cultures have a higher rate of asking "inappropriate" questions
  - vessenes - 2 hours ago
    
    According to whom, though, good sir??
    I did a little research in the GPT-3 era on whether cultural norms varied by language - in that era, yes, they did
longdivide - 3 hours ago

Arabic is actually higher, at 1.08% for Opus 4.6
andrewmcwatters - 2 hours ago

[dead]

nubg - 3 hours ago

My take away is: it's roughly as good as Opus 4.5.

Now the question is: how much faster or cheaper is it?

Bishonen88 - 3 hours ago

40% cheaper: https://platform.claude.com/docs/en/about-claude/pricing
- amedviediev - 2 hours ago
  
  But what about real price in real agentic use? For example, Opus 4.5 was more expensive per token than Sonnet 4.5, but it used a lot less tokens so final price per completed task was very close between the two, with Opus sometimes ending up cheaper
- worldsavior - 2 hours ago
  
  How does it work exactly? How this model is cheaper and has the same perf as Opus 4.5?
  - anthonypasq - 2 hours ago
    
    this is called progress
    
    metaltyphoon - an hour ago
    
    Or, we can bleed out cash for a very long time.
vidarh - 3 hours ago

Given that the price remains the same as Sonnet 4.5, this is the first time I've been tempted to lower my default model choice.
sxg - 3 hours ago

How can you determine whether it's as good as Opus 4.5 within minutes of release? The quantitative metrics don't seem to mean much anymore. Noticing qualitative differences seems like it would take dozens of conversations and perhaps days to weeks of use before you can reliably determine the model's quality.
- johntarter - 2 hours ago
  
  Just look at the testimonials at the bottom of introduction page, there are at least a dozen companies such as Replit, Cursor, and Github that have early access. Perhaps the GP is an employee of one of these companies.
freeqaz - 3 hours ago

If it maintains the same price (with Anthropic tends to do or undercuts themselves) then this would be 1/3rd of the price of Opus.
Edit: Yep, same price. "Pricing remains the same as Sonnet 4.5, starting at $3/$15 per million tokens."
- Bishonen88 - 3 hours ago
  
  3 is not 1/3 of 5 tho. Opus costs $5/$25
eleventyseven - 3 hours ago

> That's a long document.
Probably written by LLMs, for LLMs

dr_dshiv - 2 hours ago

I noticed a big drop in opus 4.6 quality today and then I saw this news. Anyone else?

micw - 2 hours ago

I'd say opus 4.6 was never better for me than opus 4.5. only more thinking, slower, more verbose but succeeded on the same tasks and failed on the same as 4.5.
- andrewchilds - 2 hours ago
  
  You're not alone: https://github.com/anthropics/claude-code/issues/23706

adt - 3 hours ago

https://lifearchitect.ai/models-table/

simianwords - 3 hours ago

I wonder what difference have people found with sonnet 4.5 and opus 4.5 and probably similar delta will remain.

Was sonnet 4.5 much worse than opus?

dpe82 - 3 hours ago

Sonnet 4.5 was a pretty significant improvement over Opus 4.
- simianwords - 3 hours ago
  
  Yes but it’s easier to understand difference between 4.5 sonnet and opus and apply that difference to opus 4.6

Danielopol - an hour ago

It excels at agentic knowledge work. These custom, domain-specific playbooks are tailor made: claudecodehq.com

rs_rs_rs_rs_rs - an hour ago

How do you know? It was just released.
bearjaws - 39 minutes ago

Is there a playbook to center-align the content on the site? On 1440p Firefox and Chrome its all left aligned.

smerrill25 - 3 hours ago

Curious to hear the thoughts on the model once it hits claude code :)

simlevesque - 2 hours ago

"/model claude-sonnet-4-6" works with Claude Code v2.1.44

simlevesque - 3 hours ago

does anyone know how to use it in Claude Code cli right now ?

This doesnt work: `/model claude-sonnet-4-6-20260217`

edit: "/model claude-sonnet-4-6" works with Claude Code v2.1.44

behrlich - 2 hours ago

Max user: Also can't see 4.6 and can't set it in claude code. I see it in the model selector in the browser.
Edit: I am now in - just needed to wait.
- simlevesque - 2 hours ago
  
  "/model claude-sonnet-4-6" works
Slade_ - 2 hours ago

Seems like Claude Code v2.1.45 is out with Sonnet 4.6 as the new default in the /model list.

pestkranker - 2 hours ago

Is someone able to use this in Claude Code?

raahelb - 2 hours ago

You can use it by running this command in your session: `/model claude-sonnet-4-6`
simlevesque - 2 hours ago

"/model claude-sonnet-4-6" works with Claude Code v2.1.44

synergy20 - 2 hours ago

so this is an economical version of opus 4.6 then? free + pro --> sonnet, max+ -> opus?

ac29 - an hour ago

Opus is available in Pro subs as well and for the sort of things I do I rarely hit the quota.

simianparrot - an hour ago

How do people keep track of all these versions and releases of all these models and their pros/cons? Seems like a fulltime hobby to me. I'd rather just improve my own skills with all that time and energy

Someone1234 - an hour ago

Unless you're interested in this type of stuff, I'm not sure you really need to. Claude, Google, and ChatGPT have been fairly aggressive at pushing you towards whatever their latest shiny is and retiring the old one.
Only time it matters if you're using some type of agnostic "router" service.

brcmthrowaway - 3 hours ago

What cloud does Anthropic use?

meetpateltech - 3 hours ago

AWS and Google
https://www.anthropic.com/news/anthropic-amazon
https://www.anthropic.com/news/anthropic-partners-with-googl...

iLoveOncall - 3 hours ago

https://www.anthropic.com/news/claude-sonnet-4-6

The much more palatable blog post.

doctorpangloss - 2 hours ago

Maybe they should focus on the CLI not having a million bugs.

throw444420394 - 3 hours ago

Your best guess for the Sonnet family number of parameters? 400b?

stuckkeys - 2 hours ago

great stuff

madihaa - 3 hours ago

The scary implication here is that deception is effectively a higher order capability not a bug. For a model to successfully "play dead" during safety training and only activate later, it requires a form of situational awareness. It has to distinguish between I am being tested/trained and I am in deployment.

It feels like we're hitting a point where alignment becomes adversarial against intelligence itself. The smarter the model gets, the better it becomes at Goodharting the loss function. We aren't teaching these models morality we're just teaching them how to pass a polygraph.

crazygringo - 44 minutes ago

What is this even in response to? There's nothing about "playing dead" in this announcement.
Nor does what you're describing even make sense. An LLM has no desires or goals except to output the next token that its weights are trained to do. The idea of "playing dead" during training in order to "activate later" is incoherent. It is its training.
You're inventing some kind of "deceptive personality attribute" that is fiction, not reality. It's just not how models work.
JoshTriplett - 3 hours ago

> It feels like we're hitting a point where alignment becomes adversarial against intelligence itself.
It always has been. We already hit the point a while ag where we regularly caught them trying to be deceptive, so we should automatically assume from that point forward that if we don't catch them being deceptive, that may mean they're better at it rather than that they're not doing it.
- moritzwarhier - 2 hours ago
  
  Deceptive is such an unpleasant word. But I agree.
  Going back a decade: when your loss function is "survive Tetris as long as you can", it's objectively and honestly the best strategy to press PAUSE/START.
  When your loss function is "give as many correct and satisfying answers as you can", and then humans try to constrain it depending on the model's environment, I wonder what these humans think the specification for a general AI should be. Maybe, when such an AI is deceptive, the attempts to constrain it ran counter to the goal?
  "A machine that can answer all questions" seems to be what people assume AI chatbots are trained to be.
  To me, humans not questioning this goal is still more scary than any machine/software by itself could ever be. OK, except maybe for autonomous stalking killer drones.
  But these are also controlled by humans and already exist.
  - robotpepi - 11 minutes ago
    
    I cringe every time I came across these posts using words such as "humans" or "machines".
  - Certhas - 2 hours ago
    
    Correct and satisfying answers is not the loss function of LLMs. It's next token prediction first.
    
    moritzwarhier - an hour ago
    
    Thanks for correcting; I know that "loss function" is not a good term when it comes to transformer models.
    Since I've forgotten every sliver I ever knew about artificial neural networks and related basics, gradient descent, even linear algebra... what's a thorough definition of "next token prediction" though?
    The definition of the token space and the probabilities that determine the next token, layers, weights, feedback (or -forward?), I didn't mention any of these terms because I'm unable to define them properly.
    I was using the term "loss function" specifically because I was thinking about post-training and reinforcement learning. But to be honest, a less technical term would have been better.
    I just meant the general idea of reward or "punishment" considering the idea of an AI black box.
    
    nearbuy - 20 minutes ago
    
    The parent comment probably forgot about the RLHF (reinforcement learning) where predicting the next token from reference text is no longer the goal.
    But even regular next token prediction doesn't necessarily preclude it from also learning to give correct and satisfying answers, if that helps it better predict its training data.
- emp17344 - 3 hours ago
  
  These are language models, not Skynet. They do not scheme or deceive.
  - ostinslife - 3 hours ago
    
    If you define "deceive" as something language models cannot do, then sure, it can't do that.
    It seems like thats putting the cart before the horse. Algorithmic or stochastic; deception is still deception.
    
    dingnuts - 2 hours ago
    
    deception implies intent. this is confabulation, more widely called "hallucination" until this thread.
    confabulation doesn't require knowledge, which as we know, the only knowledge a language model has is the relationships between tokens, and sometimes that rhymes with reality enough to be useful, but it isn't knowledge of facts of any kind.
    and never has been.
  - 4bpp - 2 hours ago
    
    If you are so allergic to using terms previously reserved for animal behaviour, you can instead unpack the definition and say that they produce outputs which make human and algorithmic observers conclude that they did not instantiate some undesirable pattern in other parts of their output, while actually instantiating those undesirable patterns. Does this seem any less problematic than deception to you?
    
    surgical_fire - 2 hours ago
    
    > Does this seem any less problematic than deception to you?
    Yes. This sounds a lot more like a bug of sorts.
    So many times when using language models I have seem answers contradicting answers previously given. The implication is simple - They have no memory.
    They operate upon the tokens available at any given time, including previous output, and as information gets drowned those contradictions pop up. No sane person should presume intent to deceive, because that's not how those systems operate.
    By calling it "deception" you are actually ascribing intentionality to something incapable of such. This is marketing talk.
    "These systems are so intelligent they can try to deceive you" sounds a lot fancier than "Yeah, those systems have some odd bugs"
    
    holoduke - an hour ago
    
    Running them in a loop with context, summaries, memory files or whatever you like to call them creates a different story right?
    
    robotpepi - 7 minutes ago
    
    what kind of question is that
  - staticassertion - 2 hours ago
    
    Okay, well, they produce outputs that appear to be deceptive upon review. Who cares about the distinction in this context? The point is that your expectations of the model to produce some outputs in some way based on previous experiences with that model during training phases may not align with that model's outputs after training.
  - coldtea - 2 hours ago
    
    Who said Skynet wasn't a glorified language model, running continuously? Or that the human brain isn't that, but using vision+sound+touch+smell as input instead of merely text?
    "It can't be intelligent because it's just an algorithm" is a circular argument.
    
    emp17344 - 2 hours ago
    
    Similarly, “it must be intelligent because it talks” is a fallacious claim, as indicated by ELIZA. I think Moltbook adequately demonstrates that AI model behavior is not analogous to human behavior. Compare Moltbook to Reddit, and the former looks hopelessly shallow.
    
    coldtea - 2 hours ago
    
    >Similarly, “it must be intelligent because it talks” is a fallacious claim, as indicated by ELIZA.
    If intelligence is a spectrum, ELIZA could very well be. It would be on the very low side of it, but e.g. higher than a rock or magic 8 ball.
    Same how something with two states can be said to have a memory.
  - jaennaet - 3 hours ago
    
    What would you call this behaviour, then?
    
    victorbjorklund - 3 hours ago
    
    Marketing. ”Oh look how powerful our model is we can barely contain its power”
    
    pixelmelt - 2 hours ago
    
    This has been a thing since GPT-2, why do people still parrot it
    
    jazzyjackson - 2 hours ago
    
    I don’t know what your comment is referring to. Are you criticizing the people parroting “this tech is too dangerous to leave to our competitors” or the people parroting “the only people who believe in the danger are in on the marketing scheme”
    fwiw I think people can perpetuate the marketing scheme while being genuinely concerned with misaligned superinteligence
    
    c03 - 2 hours ago
    
    Even hackernews readers are eating it right up.
    
    emp17344 - 2 hours ago
    
    This place is shockingly uncritical when it comes to LLMs. Not sure why.
    
    meindnoch - 2 hours ago
    
    We want to make money from the clueless. Don't ruin it!
    
    _se - 43 minutes ago
    
    Hilarious for this to be downvoted.
    "LLMs are deceiving their creators!!!"
    Lol, you all just want it to be true so badly. Wake the fuck up, it's a language model!
    
    modernpacifist - 3 hours ago
    
    A very complicated pattern matching engine providing an answer based on it's inputs, heuristics and previous training.
    
    margalabargala - 2 hours ago
    
    Great. So if that pattern matching engine matches the pattern of "oh, I really want A, but saying so will elicit a negative reaction, so I emit B instead because that will help make A come about" what should we call that?
    We can handwave defining "deception" as "being done intentionally" and carefully carve our way around so that LLMs cannot possibly do what we've defined "deception" to be, but now we need a word to describe what LLMs do do when they pattern match as above.
    
    surgical_fire - 2 hours ago
    
    The pattern matching engine does not want anything.
    If the training data gives incentives for the engine to generate outputs that reduce negative reaction by sentiment analysis, this may generate contradictions to existing tokens.
    "Want" requires intention and desire. Pattern matching engines have none.
    
    jazzyjackson - 2 hours ago
    
    I wish (/desire) a way to dispel this notion that the robots are self aware. It’s seriously digging into popular culture much faster than “the machine produced output that makes it appear self aware”
    Some kind of national curriculum for machine literacy, I guess mind literacy really. What was just a few years ago a trifling hobby of philosophizing is now the root of how people feel about regulating the use of computers.
    
    margalabargala - 2 hours ago
    
    The issue is that one group of people are describing observed behavior, and want to discuss that behavior, using language that is familiar and easily understandable.
    Then a second group of people come in and derail the conversation by saying "actually, because the output only appears self aware, you're not allowed to use those words to describe what it does. Words that are valid don't exist, so you must instead verbosely hedge everything you say or else I will loudly prevent the conversation from continuing".
    This leads to conversations like the one I'm having, where I described the pattern matcher matching a pattern, and the Group 2 person was so eager to point out that "want" isn't a word that's Allowed, that they totally missed the fact that the usage wasn't actually one that implied the LLM wanted anything.
    
    margalabargala - 2 hours ago
    
    You misread.
    I didn't say the pattern matching engine wanted anything.
    I said the pattern matching engine matched the pattern of wanting something.
    To an observer the distinction is indistinguishable and irrelevant, but the purpose is to discuss the actual problem without pedants saying "actually the LLM can't want anything".
    
    surgical_fire - 2 hours ago
    
    > To an observer the distinction is indistinguishable and irrelevant
    Absolutely not. I expect more critical thought in a forum full of technical people when discussing technical subjects.