OpenAI DayBreak – GPT-5.5-Cyber

195 points by AaronO 17 hours ago

I don't know what the solution to this is, but I find it somewhat unfair that I pay money to Anthropic, and I pay money to OpenAI, and neither of them will let me use their best models for securing the software I work on.

Admittedly Opus 4.8 xhigh does a good job, but are my customers not entitled to have more security from a Fable/Mythos or GPT-5.5-Cyber audit over the codebase? Or I guess the inverse question: why aren't they allowed that audit?

(Fable/Mythos being unavailable notwithstanding.)

It seems OpenAI will at least let me do this narrowly, at greater cost, by using one of their partners. But I already pay them money!

peterspath - 3 hours ago

It is unfair and not useful at all.
If criminal organisations (including some countries) are to be deterred, we should all have access so we can improve the security of our products.
Because the people that want to do evil will do it anyway. They will build a Mythos, Fable, or Cyber clone.
I dislike the hypocrisy of it. Oeh ah it’s too dangerous, criminals can make use of this. But at the same time they themselves stole a whole lot of data to be trained on.
anon373839 - 11 hours ago

The problem is even worse than that. OpenAI and Anthropic have your source code and superior knowledge of its vulnerabilities. All you can do is hope that they won't one day use it against you.
- philstephenson - an hour ago
  
  To what end?
- stainablesteel - 6 hours ago
  
  or accidentally hand the information over to someone who will
- theplumber - 10 hours ago
  
  But they will! Or the government or the xyz agency !
ddxv - 11 hours ago

I think using open weight models will solve this. I believe they are nearly caught up and much of the gains are in the harnesses or proper orchestration of subqueries. (I'm no expert, just my opinion).
When the open weight models catch up, if they don't get lobbied against and banned by OpenAI and Anthropic, then you'll be able to use them to properly secure your software.
- chillfox - 10 hours ago
  
  Pretty sure the secret sauce is in the summarised thinking. Maybe better thought process… But I have a feeling it’s server side tools and a scratch space to prepare the reply.
  Sometimes the summarised thoughts include stuff that makes no sense unless it’s got a workspace on the server. Stuff like “I am now writing x to file y”.
  - dpoloncsak - 3 hours ago
    
    Not championing it, but this is where something like OpenClaw comes into play, right? The harness around the model, the ability to call tools, etc.
- energy123 - 10 hours ago
  
  I'm no cyber expert, maybe one can weigh in.
  Are there zero days that only a true genius can discover? Or can a smart-enough model, run over the codebase for enough time, discover them all?
  Like as we get smarter and smarter models do we expect each new generation to keep finding vulnerabilities, or to plateau?
  - __alexs - 9 hours ago
    
    A large part of vulnerability analysis is just having the time to crunch through enough possibilities. Expertise and smarts definitely speed this up but there's a lot of just turning the crank until something falls out. Even a relatively dumb model with some good prompting will find vulnerabilities if you ask it to and give it the time and resources to do so.
    
    beardedwizard - 5 hours ago
    
    Completely agree. It's all about time spent.
    Been in the security industry a long time as a software engineer. Security research is no different than any other engineering discipline. It is down to the time you are willing to invest and where in the abstraction you focus.
    All of this pearl clutching and hand wringing over the capabilities of the models is silly to me. It has much less to do with some magical cybersecurity ability and much more to do with increasing ability of models to stay on task for long horizons. Any passionate engineer will recognize this - if you grind 10,000 hours you will find the solution to most problems, the problem is most people lack the motivation to even start, and are too risk averse to play hacker.
    The NSA's claim that all government systems were hacked by Mythos and they were shocked by that is farcical. They have been hacked over and over and over by many who took the risk and tried.
    It's like they hired a competent red teamer to do internal pen testing for the first time, which we know is absolutely not the case. They have been doing it for years, and almost certainly surfacing the exact same kinds of findings each time, but they haven't been honest with the public about it and can scapegoat Mythos now.
    
    rescbr - an hour ago
    
    > Any passionate engineer will recognize this - if you grind 10,000 hours you will find the solution to most problems, the problem is most people lack the motivation to even start, and are too risk averse to play hacker.
    This. I'd love to spend my whole day hacking stuff, but I need to pay my bills.
    Now with AI tooling my late night/weekend hobby hacking stuff is at least getting done. I'm definitely progressing with things that I began 2 years ago and I had to stop as other life priorities took over.
  - alex43578 - 3 hours ago
    
    That entirely depends on whether a “smart enough” model is a genius or where that cutoff is.
    To your second question, a clear plateau would be a piece of software that is 100% secure, without vulnerabilities. Since that’s impossible for anything more than a trivially simple program, particularly when you consider an ecosystem, I think there won’t be a plateau. If you use model A to secure program Dog, smarter model B could find a vulnerability in Dog or just skip to attacking Dog’s OS, firmware, etc.
pizlonator - 3 hours ago

It is super unfair
It creates a two tier system - those who have access and those who do not. Worse, it’s some corpo making the decision
- freedomben - 26 minutes ago
  
  Remember when access for all used to be a high priority for AI? I do. I noticed they don't talk about that anymore
andai - 2 hours ago

At least on their benchmark, the regular, public GPT-5.5 is basically at Mythos level already. (2% difference on CyberGym)
They didn't test Opus 4.8, but it probably isn't very far behind.
neural_thing - 31 minutes ago

I pay United airlines the price of an economy ticket, it's so unfair I don't get to fly in first class
- freedomben - 27 minutes ago
  
  If you want this analogy to be correct, then you need to rewrite it to a world where you are not allowed to buy first class tickets at any price. Only people who work for a giant corporation that is blessed are allowed to buy first class, while you are only sold a coach seat. That doesn't sound very fair to me
  - neural_thing - 6 minutes ago
    
    You can pay Anthropic enough money to be part of Project Glasswing. Some people here just can't accept that there's something coding related they can't afford.
  - fragmede - 17 minutes ago
    
    Somebody's never flown private
i2km - 10 hours ago

Surely what's coming is them offering to fix your vulnerabilities via higher-margin professional services?
milkshakes - 9 hours ago

take a look at this bug and the chain required to exploit it:
https://projectzero.google/2021/12/a-deep-dive-into-nso-zero...
https://projectzero.google/2022/03/forcedentry-sandbox-escap...
exploiting vulnerabilities on hardened targets isn't just in a different league from finding them, it is a different sport altogether.
put simply, it's the difference between an integer overflow leading to a sandbox-escaping RCE and one that leads to a crash.
Codex Security and 5.5/5.6 are still very good at finding vulnerable code -- they will identify and fix unsafe behavior, but they will refuse to help you with exploitation -- they will actively prevent you from taking any steps to weaponize the unsafe behavior that are not required to remediate it. they will err conservative here, but for the most part they will still let you discover and address a wide range and depth of vulnerabilities. you can verify yourself to turn off the most basic safeguards and sign up through a more rigorous process for a spectrum of TAC options.
obviously there is a balance here -- openai wants to empower defenders while at the same time not exposing capabilities to the adversaries that would overwhelm defenders. there is no "right" answer. it is a work in progress. this is an intentional and deliberate decision to provide defenders with a (temporary, dwindling) advantage.
the example i chose was pretty extreme, but the underlying principle -- enable visibility, discovery, and remediation, but make it difficult to weaponize and defeat countermeasures -- makes sense given the bigger picture, IMO.
this calm before the storm is not going to last for very long, and defenders need every advantage they can get to get their houses in order before these capabilities are widely commoditized.
gavinray - 5 hours ago
> and neither of them will let me use their best models for securing the software I work on.
I mean, are you saying you submitted a Trusted Access application to both OpenAI + Anthropic and they BOTH declined it?
I have Verified/Trusted Access on both of them and I don't even work in Cyber.
I filled it out as an individual using my own Org ID and I got accepted to both of them, lol.
- taspeotis - 4 hours ago
  
  So I got turned off the OpenAI form because it’s pretty heavily geared towards “enterprise” which I’m not. But I’ll have a stab at it later.
  What’s the equivalent form for Anthropic please? The closest I got from Google was Claude Security’s “contact sales.”
  - gavinray - an hour ago
    
    Equivalent of Trusted Access for Anthropic is "CVP" (Cyber Verification Program)
    https://support.claude.com/en/articles/14604842-real-time-cy...
thinkingtoilet - 6 hours ago

Why is it unfair? Are you entitled to them? They released a product and you are paying for it. If you don't like the product, don't pay for it and don't use it.
- tqwhite - 6 hours ago
  
  At least, pay them for the inferior intelligence until Donald Trump says you can.
giwook - 5 hours ago

Because while you may be a good actor, there are just as many bad actors out there.
How does Anthropic or OpenAI differentiate between the two?
Once you solve that, you can get access to Mythos ;)
- avaer - 4 hours ago
  
  More importantly who gets to decide good or bad?
  Remember all of these models are based on unimaginable levels of copyright infringement. Is OpenAI a bad actor, that they use their models to infringe on the rights of others?
  This isn't a moral argument. This is all about power and money, not good or bad. That includes the Mythos ban. Good vs bad actors is political theater designed to distract from what's actually going on.
  - ch4s3 - 4 hours ago
    
    > unimaginable levels of copyright infringement
    This isn't how copyright works. The models don't wholesale encode literal information from original works and are substantive transformations. Now, you yourself as a user can use the models and weights to infringe on a copyright.
    
    frabcus - 4 hours ago
    
    There have been some US cases about this, but it isn't generally settled internationally. "Fair use" is a US specific thing. Even in the US there are ongoing cases.
    Paper about how weights are a derivative work of the training data: https://arxiv.org/abs/2407.13493
    Currently in-progress lawsuits about AI copyright: https://informationisbeautiful.net/visualizations/the-rise-o...
    
    ch4s3 - 3 hours ago
    
    Yeah, I'm familiar with that argument re derivative work, but weights aren't really what's being shipped or sold, and I think it's reasonable to argue that the generated tokens aren't derivative but substantively transformed.
    That said, I would prefer a situation where hyper-scalers make an effort to compensate sources of good data, e.g. newspapers and so on.
    
    ch4s3 - 33 minutes ago
    
    Like it or not, Bartz v. Anthropic established that as fair use. So it isn't legally copyright infringement as currently understood under the law. This may change but it isn't obviously wrong.
    
    stackghost - 2 hours ago
    
    I think parent poster was referring to the open secret that the early models were trained on massive collections of pirated novels and textbooks.
- re-thc - 4 hours ago
  
  > How does Anthropic or OpenAI differentiate between the two?
  So if they can't, why do some companies still get access today? Just ones much bigger than "us".
  It's the equivalent of saying a company like Amazon or Cloudflare should block access to web hosting or "illegal hosting". The argument back then was they aren't gatekeepers? But now they are?
  - pixl97 - 3 hours ago
    
    This is really odd taking two completely different things and trying to apply law against them. Hosting was somewhat protected by previous rulings, selling AI services is not.
    
    re-thc - 17 minutes ago
    
    > This is really odd taking two completely different things and trying to apply law against them. Hosting was somewhat protected by previous rulings, selling AI services is not.
    What's different? What's not protected? And what's "hosting"? Where do you draw the line with "managed services"?
    So if you use "AI" to hack a computer it is different to using "hosting" to put "illegal content"?
    Are you implying one of them is legal? But both are for the judge to decide.
    OR if this is about the provider -- who's selling AI services? It's LLM. Just running software on GPUs. There's no AI. There, done. Same.
ben_w - 10 hours ago

While I appreciate the desire to have the best:
> Or I guess the inverse question: why aren't they allowed that audit?
There's undeniably a lot of unsecured software in the world.
Given that ID verification is hard and these companies are clearly new at it (or don't understand the implications of it, cough Worldcoin's eye-scanning orbs cough), which is worse:
(1) sufficiently good AI* is released to everyone: critical infrastructure and open source projects get better hacking tools to white-hack their own code at exactly the same time as black hat hackers
(2) sufficiently good AI* is released to critical infrastructure and open source projects first: everyone else, the average paying customer has to wait but so too do the black hats
Because (2) is either the status quo or better depending on if you have access or not; and because (1) seems to me to lead to an acceleration of zero-days, I lean towards (1) being the worse.
* having no experience of pen-testing, I take no position on if this is "it" or not
- akmiller - 4 hours ago
  
  1 assumes that some "private entity" gets to decide what is crucial infrastructure and what is not, what gets the opportunity to be patched and what doesn't.
  I'm not ok with that and don't know why anyone would be.
  - ben_w - 2 hours ago
    
    The owner of a thing deciding to whom they wish to provide access to the thing they own, is a necessary consequence of the concept of private property.
    The only two alternatives to a private entity making this decision are a government making this decision, or nobody making this decision, the latter of which is equivalent to both government and a private entity making the decision to do (1).
  - pixl97 - 2 hours ago
    
    Because it's their property. Now you can try to make an argument that it's stolen IP and that matters in some way, but that's just more likely to ensure no one has access.
    Even more so, they are getting pushback from the government (good job electing idiots) that said models are a security risk.
    But until then the company can charge/give access to whoever they want for however much they want except in the cases the law says no.
    And if you don't like it raise a trillion dollars and make your own.
piokoch - 9 hours ago

Soon, very soon, if you need something useful, like medical advice or financial advice, you will be told that, well, ok, but you need to pay for an "extended license" that's gonna be in the thousands of dollars per month, otherwise you need to hire someone who paid that money.
The only hope is Chinese models, as Chinese commies are playing a different game as long as they are behind the flagship models (but it will change soon, like with cheap Chinese cars) and maybe, finally, Europe will start working on their solutions, instead of regulations.
- infecto - 5 hours ago
  
  That sounds too dire. My suspicion is building a model either as a derivative or brand new is a solved problem. There are indeed capital constraints today but would wager that over the horizon those go down. If one business is restricting access to something great over the medium term other companies will step up.
  - treis - 2 hours ago
    
    I think every business wants to bill on value not usage. That's where the real money is made. If a diagnosis is worth $100 and takes $1 worth of tokens you want to bill as close to $100 as you can. Right now they're billing $1 and barely making money.
    
    ahtihn - 2 hours ago
    
    What makes the diagnosis "worth" $100? Right now it requires a highly paid human which sets a floor on the cost.
    If there's competition from LLMs it's going to drive down the cost.
    
    treis - 2 hours ago
    
    >What makes the diagnosis "worth" $100?
    The opinion of the customer paying for it
dist-epoch - 7 hours ago

If you can buy a gun from a weapons manufacturer it doesn't mean they should also allow you to buy a rocket launcher.
- akmiller - 5 hours ago
  
  First of all, these products are "legal". I think the point is more that we pay for your top subscription but you've decided that a handful of companies that you pick get access to the best of the best now and everyone else has to wait and perhaps they may be allowed access at some point...if deemed worthy.
  You want the few leading AI companies in the U.S. to work under the model where they (and potentially the U.S. gov't) get to decide who gets access to what compute? If you are fine paying into that model, then good for you...just a matter of time before they cut you off and you have no recourse.
  - infecto - 5 hours ago
    
    Since when did US markets ever guarantee unlimited access to everyone? I don’t understand this line of thinking that has cropped up around AI companies.
    
    akmiller - 4 hours ago
    
    I haven't seen anyone suggest that a company can't do this, they absolutely can within certain frameworks of the law. That doesn't mean consumers have to like it and continue to do business with companies such as these.
    
    infecto - 4 hours ago
    
    [flagged]
MrOrelliOReilly - 11 hours ago

I'm not sure I follow your logic. Paying for a service does not mean you get access to all potential services a provider offers. Providers can choose to keep some services internal.
Silly example: I pay Netflix for their most basic plan, so I get ads. Just because I already pay them money, doesn't mean I have a right to no ads! It also doesn't mean I have a right to 8k streaming; maybe Netflix reserves that for their internal cinema.
- NichoPaolucci - 7 hours ago
  
  Both companies offer "MAX" or "PRO" plans - and the best models were available to those customers. This new wave of "It's too dangerous for the public" is a new initiative from both companies.
  I agree with your overall sentiment. Paying for "Claude Mini" doesn't get you "Claude Maximos".
  However, the overall precedent that the companies have set is that if you pay for the top tier subscription, you get the top tier model. That's not true any more.
  - infecto - 5 hours ago
    
    This is so similar to people arguing for plan tokens to be used with third party tools. It does not jibe with my understanding of the world. Do people really expect that paying for a top plan actually gets you guaranteed access to everything? It’s great when it works but at the end of the day why build that false expectation?
  - estearum - 5 hours ago
    
    Just like when you buy a top of the line camera or car, and then they release a new one, you are entitled to the now-top-of-the-line camera or car.
    What the heck come on.
    
    thimabi - 4 hours ago
    
    Buying a camera or car is different from paying a subscription, right? Different expectations
    
    estearum - 3 hours ago
    
    No, I have never bought a subscription and then expected it to get arbitrarily upgraded any time a higher tier was introduced.
    
    dpoloncsak - 3 hours ago
    
    But that's the exact standard that was set by the LLM providers, right? My ChatGPT 3.5 sub became a 4o sub, which became a 5.1 sub and so on
- dgellow - 7 hours ago
  
  You have the right to complain and ask for more though
- Intermernet - 8 hours ago
  
  When Netflix launched, you got the service without ads. That has changed. That's what's known as a rug-pull.
- re-thc - 4 hours ago
  
  > Paying for a service does not mean you get access to all potential services a provider offers. Providers can choose to keep some services internal.
  The problem most people have is "the logic".
  Sure you can keep it internal. Sure you can not offer it to everyone.
  But then it is not for "world security", "world peace" or some other "explanation".

jasonvorhe - an hour ago

"trusted defenders" sounds really Orwellian. Reminds me of EU's "trusted flaggers": https://digital-strategy.ec.europa.eu/en/policies/trusted-fl...

I don't trust any of these people. Meanwhile I'm paying ChatGPT and Claude and can't use their top-tier levels because they assume I'm some security risk/terrorist.

Local AI is our only hope.

dofm - an hour ago

Why not stop paying for ChatGPT and Claude, then?
Like, seriously, while you can, invest in your own stack or find cloud alternatives.
If you actually want to use these products, the easiest way you will contribute to changing their money-grubbing minds about their policies and offerings, is to stop giving them yours.
Peace of mind and control of your destiny are worth a bit of cash. And there's seemingly little risk of any kit you buy radically depreciating in value.
I am still really cynical about all of this BS but I must say I am fully impressed by the diligence and quality of some of the open source tooling — Unsloth Studio, Opencode, Paseo, Pi, etc.
And I look at it and think: putting aside local models (and I am not sure you should, even now), is this stuff really so inferior that it's worth risking a critical dependency on a commercial cloud product that might get switched off with no notice?

theplumber - 10 hours ago

Ok so why don’t I have access to this if I already pay for the max plan? Should I pay a security researcher to run codex on my code? Is this how it is supposed to work? Let’s hope we get some real cyber models that people can actually use from the Chinese without the stupid application forms.

infecto - 5 hours ago

Why does paying for the max plan build the expectation that you get access to everything?
- armenarmen - 4 hours ago
  
  “Max” leads me to believe I have the maximum level of access
  - infecto - 4 hours ago
    
    Maybe that’s just a naming problem then. But even if they called it “Ultra Max Super Pro,” I still wouldn’t assume it means access to every future capability, every internal tool, or every restricted model they ever build.
    To me, “Max” means the highest tier of the product they’re currently offering to that customer segment, not an unlimited claim on everything the company possesses.
    Again, for me this has just been a very strange argument I keep seeing around these plans. It’s obvious the plans are subsidized and I am happy to take advantage of that in the near term but I would be a fool to think my $200/month account buys any special access.
    
    armenarmen - 4 hours ago
    
    But it did give us access to frontier models through the history of the plan, and up until, what, a week ago now
  - tacoooooooo - 4 hours ago
    
    you'll need the Max Plus plan for that, buddy
  - Dilettante_ - 4 hours ago
    
    And the prostitute really does love you and Your Call really Is Important To Us.
    That's a little snarkier than strictly appropriate in this forum but you cannot seriously and in good faith say that this is the first time you've seen advertisements 'exaggerate' the truth.
    
    dpoloncsak - 2 hours ago
    
    Nobody said anything about it being the first time seeing it. You still get to be upset when lied to.
baq - 10 hours ago

Why do you think you should have access…? People who pay enterprise API rates also don’t if this makes you feel better (it shouldn’t, you shouldn’t have felt bad in the first place)
- civet_java - 8 hours ago
  
  I'm not sure I understand what you're arguing for? There are massive companies that are collectively profiting off of stolen IP and are now gatekeeping even their paid offerings - surely consumers will rail against this? Personally, I feel very bad and can't wait for Chinese models to continue improving as much as they can prior to OpenAI's and Anthropic's IPOs.
  - baq - 6 hours ago
    
    I’m not arguing for anything, actually. The ‘fair’ ship has sailed, even if the pirates somehow get shut down (which would be suicide by USG, won’t happen, national security issue), open Chinese models are not even hiding the fact that they distill from the frontier US labs, thus benefiting indirectly from the stolen content.
    Note I don’t particularly like the ‘stolen’ word here as I don’t like when the music and film companies use it in the same context. Copyright infringement? Sure. Theft? No.
    
    civet_java - an hour ago
    
    > I don’t particularly like the ‘stolen’ word here
    Except that's the standard that we've measured everyone with up until the LLM/generative tech boom. I don't see why the benchmarks should change now. I realise my argument doesn't move reality but that doesn't mean we shouldn't call a spade a spade. Said companies carried out theft (or copyright infringement if you prefer) at industrial scale, which is a far more reprehensible crime against humanity than anything the individuals we think of as "digital pirates" have committed.
    > open Chinese models are not even hiding the fact that they distill from the frontier US labs
    The difference is they return to the same system that they feed from (indirectly); people get access to model weights even if the entire model isn't open source. The same can't be said for OpenAI, Anthropic, Google etc (who also benefit from Chinese models and train on them).
    Sure, the alternatives aren't a panacea of fairness but I'd much rather advocate for and support the thieves who give me a better deal if my choice is limited to thieves. Especially if thieves aren't hostile to their customers like Anthropic is (which is why I replied to you in the first place).
  - infecto - 5 hours ago
    
    And the Chinese models rip IP just like everyone else before them. Your argument is moot.
    This was a problem 5+ years ago. Nobody cares or at least the majority voice does not care across the world. Cat is out of the bag and there is no way to put it back in.
    EDIT: Worth noting that I have long held the belief that if you put data out on the public sidewalk that you should have low to no expectation that it’s IP. It’s how I think about Google Maps data for example. If they want to reap the benefits by not walling it off behind a user login then they can feel the pain if folks use that information. Same applies for media that has been bought, Reddit comments or any other datasets.
    
    civet_java - an hour ago
    
    > And the Chinese models rip IP just like everyone else before them.
    The difference is the Chinese models return to the same system that they feed from (indirectly); people get access to model weights even if the entire model isn't open source. The same can't be said for OpenAI, Anthropic, Google etc (who also benefit from Chinese models and train on them).
    Further, Chinese models are significantly cheaper and the companies aren't hostile to their customers.
    > Worth noting that I have long held the belief that if you put data out on the public sidewalk that you should have low to no expectation that it’s IP.
    Except your beliefs aren't the cornerstone of modern jurisprudence. Why are models able to reliably produce replicas of Ghibli movies which go well beyond any example you listed?
    
    infecto - 15 minutes ago
    
    [flagged]
Recursing - 9 hours ago

I see a lot of knee-jerk comments to this, but I highly recommend running a scan ( https://openai.com/daybreak/codex-security-plugin/#codex-cli ) in your projects so you can evaluate it yourself. It found a real security issue in a project of mine, with very few false-positives.

Its built-in resume mechanism didn't work after it crashed when running out of my 5 hour session limit, but Claude Code was easily able to resume it 5 hours later reading the session logs and https://openai.com/codex/security/scan.sh

egorfine - 8 hours ago

I read this news as white noise because there is no scenario in which I will be allowed access to this model. First, I happen to be a citizen of a country that is not the USA. What's more shocking is that I'm not even located in the US. Thus in the eyes of OpenAI I do not exist in regard to SOTA security models. Second, I will never ever do KYC with a company that provides text transformation services*. Third, even if I did, I will not be able to pass KYC because the typical KYC requirements are strictly tailored to a certain subset of the world's population and lifestyle choices, tuned by Americans according to their world view. Fourth, even if I pass KYC, my account will be banned by OpenAI immediately on the first prompt because they have close to 1B users and couldn't care less about any single one of them.

(*) which are nothing short of amazing and are changing the world, there's no doubt about that.

bilekas - 8 hours ago

There is so much to unpack here.
> Thus in the eyes of OpenAI I do not exist in regard to SOTA security models.
I'm not seeing anywhere it says it's only limited to the U.S. Only that they had 'ongoing dialogue' with them. Which reads weird to me, how can an ongoing dialogue be past tense? But I digress.
> We’ve had ongoing dialogue with the U.S. government about our cyber approach, including today’s announcements and on our preparation for upcoming model releases.
> Third, even if I did, I will not be able to pass KYC because the typical KYC requirements are strictly tailored to a certain subset of the world's population and lifestyle choices, tuned by Americans according to their world view.
KYC is just that, Know Your Customer, if your 'permitted customers' are security researchers in the industry with a proven identity of employment etc then that is the KYC process, I don't see any issues with that.
> even if I pass KYC, my account will be banned by OpenAI immediately on the first prompt because they have close to 1B users and couldn't care less about any single one of them
Why do you assume this? Are you planning on intentionally trying to do something actively nefarious ? It's such a strange take.
- egorfine - 8 hours ago
  
  > how can an ongoing dialogue be past tense?
  Easy: it can be considered past tense in case "ongoing dialogue" is a corporatespeak for "f..k you". Which I believe is the case here. But that's an opinion.
  > Know Your Customer [..] I don't see any issues with that
  This might be the case if you're coming from a standpoint I have mentioned: the American one. This is a world view where everybody have physical paper documents proving residence, every labour effort is arranged in a very specific legal framework, every person have an address in a specific format, every person has one of just a few types of ID documents, etc, etc.
  Problem is, the world has vast, vast differences in all of the mentioned areas and KYC companies couldn't care less because they are a business and they make money by KYCing as many people as possible for as little spend as possible. Thus they simply ignore any case that's not mainstream, no matter how perfectly legal it is.
  Being a digital nomad, I cannot pass KYC at the vast majority of online services. My passport is under no sanctions, I do have residency in a first-world country, etc., but passing KYC at Persona and others is not possible.
  >> my account will be banned by OpenAI immediately > Why do you assume this?
  Because of the risk profile. The company has no way of knowing whether "find all security vulnerabilities in this code" is a request from a whitehat or a blackhat hacker. The risk of someone using GPT to hack yet another DeFi project for a hundred million while mentioning OpenAI is higher than perhaps a million user accounts, let alone a single one.
  - bilekas - 6 hours ago
    
    > The company has no way of knowing whether "find all security vulnerabilities in this code" is a request from a whitehat or a blackhat hacker
    That's what KYC is for.
    > This might be the case if you're coming from a standpoint I have mentioned: the American one.
    I'm not in the US, nor America for that matter, I'm in the EU.
    > Problem is, the world has vast, vast differences in all of the mentioned areas and KYC companies couldn't care less because they are a business and they make money by KYCing as many people as possible for as little spend as possible
    "The lady doth protest too much, methinks".
    There's not much constructive here other than a lot of assumptions and apparent discontent with how some businesses handle their business, but that's for another topic I think.
    
    egorfine - 6 hours ago
    
    >> The company has no way of knowing > That's what KYC is for.
    No, KYC has nothing to do with that problem. KYC doesn't help at all here.
    > I'm not in the US, nor America for that matter, I'm in the EU.
    Same here.
    
    milkshakes - 5 hours ago
    
    > No, KYC has nothing to do with that problem. KYC doesn't help at all here.
    that's a bold statement. how does it not help solve the problem? what is a better solution?
    
    ahtihn - 2 hours ago
    
    How does KYC tell a company whether you have bad intentions or not? Let's say you work in a consultancy doing security research. On paper that looks good right?
    How easy would it be for criminal orgs to set up legitimate-looking fronts to pass these KYC checks?
    
    milkshakes - an hour ago
    
    see my downthread post. kyc is the first step in the process, not the last. without verifying identity, none of the other steps can take place
    
    egorfine - 5 hours ago
    
    > how does it not help solve the problem
    How does it? Online KYC is a procedure to verify someone's documents and face. And that's it. What does it have to do with the actual usage of the OpenAI account and the code that is being examined with AI?
    
    milkshakes - 5 hours ago
    
    > The company has no way of knowing whether "find all security vulnerabilities in this code" is a request from a whitehat or a blackhat hacker
    the system is in place to prevent unauthorized abuse. by default, the guardrails are conservative. to reduce the guardrails you can jump through a progressive series of hoops to establish whether or not you have a valid use case. the entrypoint for establishing your use case is verifying your identity and background. if you don't want to do this, you are free to use Codex Security to identify and fix vulnerabilities, it is quite good at this. the harness and model are already evaluating the usage of the account and the nature of the code being examined and actions requested. but then again, the guardrail thresholds will be very conservative for anonymous users.
    what is your proposal?
    
    egorfine - 4 hours ago
    
    > what is your proposal?
    None. I don't see a solution.
    I'm silently rooting for Chinese models here.
dist-epoch - 7 hours ago

> Second, I will never ever do KYC with a company that provides text transformation services*
I guess you would also not provide KYC to a bank that provides number additions/subtractions between database rows services
- egorfine - 6 hours ago
  
  As a society we have collectively decided that these numbers are money. Thus different rules apply.
  And I'm happy to pass KYC in person, so I do have accounts in different parts of the world. It's the online KYC that's not passable for digital nomads like me.

kstkrv - 3 hours ago

Is it only my feeling, that the US government rolled back the Fable release, because they wanted their guy to get to the market before Anthropic?

Oarch - 2 hours ago

Realistically, what can these types of commitments look like for the AI frontier companies, moving forward?

They're releasing ever more powerful models with stronger offensive capabilities. So do they have to help bolster the defense of all existing software, just... forever?

If we advance both the offense and defense with each new release, is this sustainable?

mentalgear - 11 hours ago

No one commenting on the fact that oAI is releasing a Claude Mythos-class model - with apparent 0 restrictions or concerns by the US government, while Anthropic's (their competitor) model has been pulled weeks prior by the administration for 'security' reasons.

It certainly has nothing to do with OpenAI's co-founders donating to the current administration's election fund, actively supporting the DoW's autonomous weapons war efforts, and otherwise being ideologically tightly coupled with the current US government.

frankacter - 9 hours ago
> No one commenting on the fact that oAI is releasing a Claude Mythos-class model - with apparent 0 restrictions or concerns by the US government
We don't know that it is Mythos level, it could very well be at (guardrailed) Fable or below.
This is not a wide-open distribution; it is only being provided to hand-picked partners, similar to how Mythos was distributed (unlike Fable, which had wider distribution).
The larger question, which I don't see an answer to in this post:
1) was this tested and validated by the US Government?
2) is the list of partners vetted by the US Government?
If this is "Mythos-class" AND
```
   OpenAI approves SK Telecom as a trusted partner ( https://www.wired.com/story/sk-telecom-anthropic-mythos-export-controls/ ) 
```
OR
```
   OpenAI did not get approval.
```
will this be shut down as quickly? Otherwise, it is not really a comparable scenario.
- arcanemachiner - 9 hours ago
  
  Isn't Fable just Mythos + prompt guardrails?
  - TideAd - 6 hours ago
    
    The guardrails i.e. the classifier would also fire on Fable's output and even Fable's internals (i.e. prior to the shutdown people would see it firing when Fable got angry)
    https://x.com/repligate/status/2069150509978489287
  - frankacter - 8 hours ago
    
    Yes, limiting the full scope of capabilities, which is why I differentiated Mythos (unrestricted) from guardrailed Fable.
    Main point being, we have no idea of its measurable capabilities: it could be as good as or better than unrestricted Mythos, on par with guardrailed Fable, or just OpenAI hype that measures up to neither.
    The distinction is important because if it truly is Mythos-level (or even guardrailed-Fable-level), it would in theory require that 30-day US government validation before release, as well as oversight of the approved partners allowed to use it.
    OP was drawing a parallel as to why we should be outraged at the double standard; I was drawing a better parallel by which to compare.
  - snewman - 8 hours ago
    
    Correct.
snaking0776 - 6 hours ago

I feel like we’re all just operating on vibes at this point (including the government). These models are already powerful and could probably be used to do some bad things even without the next gen.
Anthropic has successfully made their vibe “we’re the bringers of the end of the world, so please stop us.” ChatGPT is in some weird middle ground where they’re no longer the evil company of a year ago and are trying to act like the mature one in the business. XAI is the vanity project of a jealous kid who wants to ignore anyone else’s opinion but is falling behind. Gemini is viewed as the less cool cousin that no one pays attention to, since Google has generally avoided the end-of-the-world talk (coincidentally, I think Gemini has the best non-coding model, but no one seems to talk about that).
I think each company has been treated in the wider discourse based on their vibe and no one is operating on facts here since it’s all pretty obfuscated intentionally. I could see some things going very bad as a result of these models but I also don’t believe anyone who’s just pattern matching to their favorite scifi book here and we’re all playing into it. May as well go outside for a bit and enjoy the sun.
toufka - 4 hours ago

Also the fact that the president’s daughter, son in law, and his brother have a significant amount of their net worth invested in OpenAI via Thrive.
maxbond - 8 hours ago

Entirely possible but let's give it some time to see if they try to make it GA and if the DoD sends them a letter.
postepowanieadm - 8 hours ago

Maybe if Anthropic hadn't called it too dangerous for the public, things could be different?
- ninjalanternshk - 7 hours ago
  
  We should be basing public policy on facts not marketing language.
  - Insanity - 6 hours ago
    
    This assumes that the people in charge of public policy understand the technology beyond the marketing. Which I don’t think is the case, based on the track record of policymakers when it comes to essentially any technology.
  - altmanaltman - 6 hours ago
    
    Amazon CEO literally told US gov officials that it could be jailbroken and is dangerous. That was a fact (or from the pov of the gov it was). If they were going to ban based on marketing, all LLM models would have been banned by now.
  - Atotalnoob - 5 hours ago
    
    Why? You can’t yell fire in a crowded place, but “this is going to destroy the world, sign up now” is okay?
    Most things should be fact based, but you also should have limits to your marketing
    
    estearum - 5 hours ago
    
    Did Anthropic say "this is going to destroy the world, sign up now"? The details matter quite a lot actually in free speech cases.
    Should a nuclear energy startup that says "nuclear weapons are very dangerous and should be regulated" be liable to have its assets frozen?
- signatoremo - 2 hours ago
  
  Huh? This is like saying "Maybe she should not be wearing too revealing a dress" about a rape victim.
scrollop - 6 hours ago

Interesting next to the time when Anthropic was banned for not allowing the MoW to use Claude to kill people, and then OpenAI came in and swooped up the contract.
- estearum - 5 hours ago
  
  That's not what happened.
  DoD signed a contract with Anthropic that said it can't use Claude 1) to run a fully-autonomous killchain or 2) to surveil Americans.
  DoD then decided they did not want to abide by the contract they signed, and they've since launched a pressure campaign to coerce Anthropic into reneging on the terms they already agreed to.
FergusArgyll - 7 hours ago

It's 1) Not mythos class 2) restricted to "security partners"
- mijoharas - 7 hours ago
  
  They at least claim that it is greater than Mythos class[0]
  [0] https://news.ycombinator.com/item?id=48642254
  - dist-epoch - 7 hours ago
    
    They show one security benchmark where it's above Mythos.
    That doesn't mean it matches Mythos's ability to write functional exploits.
    Maybe it is, maybe it's not. They don't tell.
flanked-evergl - 11 hours ago

Do you think that Anthropic's models would have been pulled if they had not said for months that their models are basically going to break the whole internet and that governments should most definitely restrict AI? I doubt it.
The problem is, though, given Anthropic have said all of that, they really have very little grounds for objecting to the US government's intervention here. Everything that the government would have to prove to justify their intervention has already been freely admitted by Anthropic, even though the "admission" was maybe more intended as a marketing ploy.
postalcoder - 11 hours ago

Man, some of you will invent conspiracy theories to justify some deeply cynical fiction. OAI has been more proactive about doing customer KYC than A\.
OpenAI, four months ago, started to require users to verify their identity if they flagged their activities on frontier models (gpt-5.3-codex and higher) as risky. Their filters were originally quite coarse and it resulted in a ton of normal tasks being flagged. There was a lot of drama about it at the time, but it seems like things have smoothed out.
KYC goes back to a year or two ago. API access to gpt-image-1 required it.
https://openai.com/index/trusted-access-for-cyber/
- blahblaher - 10 hours ago
  
  And some of you really are ingenuous... Like the US government cares anything about that.
  - tqwhite - 6 hours ago
    
    Donald Trump cares about it and frequently uses our government to abuse people he does not like, to carry out grudges and vendettas.
    There is no doubt in my mind that he is personally pissed because Anthropic stood up to him.
- mijoharas - 9 hours ago
  
  Oh! So the new openai model is limited to US residents and they use their existing KYC process to verify it?
  That makes sense if both openai and anthropic have export restrictions on their similar models. If they didn't then it seems like the comment you're replying to may be correct.
  - infecto - 5 hours ago
    
    OAI has definitely been doing KYC for a lot longer than Anthropic.
beshur - 6 hours ago

Guess this is how capitalism works? The bestest example we might get from the original inventors so to say.
netdur - 11 hours ago

Why do you want to tax openai on anthropic's fud mistake?
catigula - 5 hours ago

Conspiracy theory. You should be shamed for this idea.
Mythos is a serious model; the NSA said it compromised them in hours.
This is a marketing article.
ReptileMan - 6 hours ago

"Let’s just say that if complete and utter chaos was lightning, then he’d be the sort to stand on a hilltop in a thunderstorm wearing wet copper armour and shouting 'All gods are bastards'."
This quote from Pratchett summarizes Anthropic's marketing approach pretty well. OAI are considerably more restrained in advertising their model's doomsday abilities, which greatly diminishes the pretexts that USG can use.
Two things can both be true - that Trump's administration is petty and vindictive and that Anthropic are reckless and not doing their best in winning hearts and minds.
theplumber - 10 hours ago

[dead]

GL26 - 10 hours ago

Would love to see the benchmark comparison between Mythos / Fable and GPT-5.5-Cyber

mijoharas - 9 hours ago

Do you mean full benchmarks? Because from the article they claim 85.6 for 5.5-Cyber vs. 83.8 for mythos on Cybergym.

bwfan123 - 3 hours ago

Security is a great business model. You sell locks, and if thieves break in, you sell more and stronger locks. There are no consequences for the lock-maker. Similarly, AI tools can both find and fix vulnerabilities. Not only that, AI can create vulnerabilities in the code it generates. Now that is a perpetual money machine.

KronisLV - 8 hours ago

Since this is more powerful than Fable in some of the benchmarks, surely it'll also get export controls... right? Right?

tetrisgm - 11 hours ago

It's a pretty interesting opportunity. I wonder if they will reach out to companies and tell them how many things they could fix and how many are critical, before selling them the solution.

KeplerBoy - 11 hours ago

If they won't, some consultant with a subscription eventually will.

daflip - 12 hours ago

I guess eventually the whole process can be completely autonomous, what could possibly go wrong :-)

arikrahman - 12 hours ago

It's good. Looking forward to wrapping it around Reasonix.

nova22033 - 4 hours ago

Chinese and Russian intel agencies can set up American shell corporations and buy all of our personal data....but using a model to secure my customer? Well...no...you can't have that.

ramon156 - 12 hours ago

AI companies yearn for orgs built on AI tools

throwaway888abc - 12 hours ago

Can someone on HN with access to it fix Fable / Mythos so it's secure to use again and therefore available?

joe_the_user - 11 hours ago

[dead]

lisa_luoyf - 9 hours ago

Interesting release. I’m most curious about how well this holds up in messy real-world environments, since that’s usually where specialized benchmark gains get tested.

theodorespeak - 4 hours ago

Let's see if I can connect the dots as well as I think I can.

Just putting it here for posterity.

Like those movies before them, The Creator will be a cult classic in a few years.

The acting was okay, the story was okay. The VFX were great.

The premise is prophetic.

America and China will go to war over AI.

America will try and contain it for themselves. China will keep on trying to keep it accessible.

After endless negotiations between the two superpowers, there is no more room left for talking.

The free (not as in beer) AI models were always seemingly slightly worse than the proprietary American models. A barely noticeable difference on most tasks, but a difference nonetheless.

The breakthrough came when people, companies and governments began chaining and pooling models and infrastructure together, because they were free to do so, thereby creating behemoths that America could not outclass, outsmart or outspend.

And when you stake your whole future on that one thing, it's winner take all.

So for that reason they all had to die.

And so the war began...

elashri - 10 hours ago

I think if nothing happens from the government, then this would be a very good example of the benefit of keeping your mouth shut, especially if you are lying to get some hype like Anthropic did for months.

spwa4 - 11 hours ago

Does the EU CRA now mean that every European company that either sells software or sells anything that has a software component is now forced to pay for this by September and update their software?

- 12 hours ago

[deleted]

sigbeta - 10 hours ago

what's the point of a benchmark if it's not deployable? another glasswing pr stunt to me

baq - 10 hours ago

Definitely a PR stunt that I had to reboot my boxen every other day in May for security patches

brcmthrowaway - 12 hours ago

Gamechanger

beyondscaletech - 5 hours ago

[flagged]

lionkor - 12 hours ago

This is how you do it when you're not AS childish. You go "here's a model for cybersecurity" and put a price on it. I know they're releasing it to some vendors first, etc. but the lack of a clown spectacle is nice.

The whole "it's too dangerous to release!" is complete hogwash.

A person can take a hammer, walk out in the street, and we can count how many people he can kill with the hammer before he is stopped. My local hardware store still sells hammers, and I haven't seen the CEO of it claim that their hammers are much more dangerous and it's totally going to end the world if you allow any random person to have one!

ragequittah - 11 hours ago

If that hammer could allow people to go into people's homes / work en masse, steal all their information, blackmail them, steal their identities, break their systems (including those of hospitals and other critical infrastructure) and generally help fund bad actors through it all we'd think of having restrictions on hammers too. A hammer can't screw people over by the millions.
I don't like this argument specifically with AI. Facial recognition everywhere you go is just a tool. Your job creating a detailed profile on exactly how you work, who you talk to, and about what is just a tool. The tools have become so good and easy to use we have to have serious discussions about them before things get out of hand.
- OutOfHere - 11 hours ago
  
  Did you see how close the non-sheltered available models come? They come quite close. Most people aren't even using them for this purpose, but they could, and this is our reality. This is why your argument fails.
  - ben_w - 10 hours ago
    
    Disagree. @lionkor compared them to a hammer, and @ragequittah is saying they're not like a hammer.
    The narrow gap between downloadable and frontier models is tangential to this. If you want to expand on the "hammer" metaphor, the downloadable models are a small construction/demolitions firm, and the frontier models are a big construction/demolitions firm.
    In this analogy, there's no training school or certifications for the staff either of them hire, and society is still working out what public liability requirements and planning permission laws are even though both companies are being hired all over the place, because everything they do was only invented a few years ago.
    
    baq - 10 hours ago
    
    > big construction/demolitions firm
    Like, e.g. the USACE
    
    ben_w - 10 hours ago
    
    If the USACE was a private military company and local lords sometimes still did direct battle with each other without being told to stop by the king.
    
    baq - 8 hours ago
    
    how do you think the states became united
    
    OutOfHere - 6 hours ago
    
    I wasn't talking about downloadable models though. GPT-5.5 and Opus 4.9 are what I would compare against. The article mentions GPT 5.5.
  - soco - 10 hours ago
    
    So the solution is... giving up? Let the technogods do whatever they please? Because we are not talking about storms and earthquakes, but about humans in power.
    
    OutOfHere - 6 hours ago
    
    Who said anything about giving up? And giving up on what?
bob1029 - 12 hours ago

The risk of catching federal charges, proper jail time and aggressive responses from law enforcement is a far more effective means of preventing malicious behavior than anything proposed so far.
I can go into stores that sell things that are much more dangerous than hammers (or frontier cyber models) and no one will give me a hard time about it.
- 3 hours ago

[deleted]
estearum - 5 hours ago

Does your local store sell aerosolized anthrax?
raincole - 12 hours ago

It's amusing that what Anthropic does is basically:
1. Browse the internet
2. See what people hate about OpenAI
3. Adopt the worse version of it
4. Profit?
Sam Altman fearmongered about AI alignment - we fearmonger harder.
OpenAI is CloseAI now - we are even less open.
OpenAI is going to IPO - we IPO first.
- ralphington - 11 hours ago
  
  I don't have a horse in the race, but these comments are remarkably toxic. This reminds me of the RTFM epidemic on early Stack Overflow.
  - raincole - 11 hours ago
    
    It's toxic to call out big companies fearmongering about how their AI is too smart to be accessible? And it's somehow comparable to telling newbies asking questions to RTFM?
    Really?
  - OutOfHere - 11 hours ago
    
    They look to be facts.