Meta Superintelligence's surprising first paper

paddedinputs.substack.com

386 points by skadamat a day ago

This has nothing to do with superintelligence, it's just the people that were working on the paper prior to the re-org happened to publish after the name change.

Though it is notable that contrary to many (on HN and Twitter) that Meta would stop publishing papers and be like other AI labs (e.g. OpenAI). They're continued their rapid pace of releasing papers AND open source models.

pityJuke - 10 hours ago

What model(s) have Meta released since the Lab re-org?
Also, that wasn't based on purely hearsay, Zuck explicitly said:
> We believe the benefits of superintelligence should be shared with the world as broadly as possible. That said, superintelligence will raise novel safety concerns. We'll need to be rigorous about mitigating these risks and careful about what we choose to open source. Still, we believe that building a free society requires that we aim to empower people as much as possible. [0]
[0]: https://www.meta.com/superintelligence/
- ipsum2 - 4 hours ago
  
  That has always been the policy. To answer your question, Meta has released ~100 models since the Superintelligence Lab reorg.
  https://huggingface.co/facebook/models
  The most interesting ones to me are:
  - CWM (Code world model), an LLM for coding https://github.com/facebookresearch/cwm
  - DINOv3, A vision encoder https://ai.meta.com/dinov3/
  - MAPAnything, a 3d reconstruction model https://huggingface.co/facebook/map-anything
  - VJEPA v2, Self-supervised video pre-training model https://github.com/facebookresearch/vjepa2
- parpfish - an hour ago
  
  > We believe the benefits of superintelligence should be shared with the world as broadly as possible.
  i'd interpret that as meaning "everybody is welcome to be our customer, but we're still control all of it"
- PatronBernard - an hour ago
  
  When did Zuck start caring about society?
- gessha - 3 hours ago
  
  You still believe anything that comes out of his mouth?
RataNova - 9 hours ago

Still, I think the optics matter... the fact that Meta's still putting out technical work (and open sourcing it) after the restructure says a lot about where they want to position themselves
ekianjo - 13 hours ago

Open weights models, not open source. And even their weights are under a specific license not as permissive as apache 2.
- HPsquared - 11 hours ago
  
  This is the right terminology. Model weights are literally compiled binary data; they are the output of an algorithm run on a bunch of source data. That training dataset is the "source" of the model. Training data (or the scripts used to generate it) is human-readable and modifiable, like source code. Binary weights are not.
  - carom - 10 hours ago
    
    Just to note though, source copyright extends to its compiled form. There is probably an analogue there for model weights.
    
    jeremyjh - 4 hours ago
    
    Tell me about the companies that own the copyrights to their training data.
  - phkahler - 2 hours ago
    
    Binary weights can still be "edited" with additional training.
- sdeframond - 10 hours ago
  
  I propose that from now on we call freewares "open binaries".
- hippo22 - 4 hours ago
  
  I'm not a lawyer, but I believe that the weights aren't subject to copyright. So, you can use them outside of Meta's license agreement provided you get them from somewhere else.
- drexlspivey - 12 hours ago
  
  Does an “open source” model the way you describe it exist or is it a mythical creature?
  - qcoret - 12 hours ago
    
    Unicorns also don't exist, but we don't change the definition to include horses.
    
    jakupovic - 11 hours ago
    
    Prove to me that unicorns don't exist, first level arguments only!
    
    aerhardt - 11 hours ago
    
    The first level argument is that old horse, burden of proof.
  - ayewo - 10 hours ago
    
    An open source model does exist now [1] and is multilingual. Previous discussion [2].
    [1] https://ethz.ch/en/news-and-events/eth-news/news/2025/07/a-l...
    [2] https://news.ycombinator.com/item?id=44535637
  - CaptainOfCoit - 5 hours ago
    
    It does, but does it matter? Even if every software released in 2025 was proprietary, doesn't make their published binaries "open source" because no other software could be classified as "open source".
    We name things based on what they are, not based on the lack of other things.
  - omneity - 11 hours ago
    
    There aren’t many but they do exist. OLMo for example.
  - Rattled - 11 hours ago
    
    Olmo by AllenAI and Pythia by EleutherAI.
  - kaufmann - 11 hours ago
    
    [dead]
Zacharias030 - 15 hours ago

Should be the top comment.
MSL is not only those few high profile hires.

mark_l_watson - 9 hours ago

A great idea, bypassing as much conversion as possible between vector space and natural language tokens. Reminds me of a discussion of having AI’s “talk” to each other using vector space.

There was an interesting quote “plain old BM25 from 1994 outperforms vector search on recall” and super relevant to what I did yesterday. I am trying to use small local models more often and yesterday I wrote Common Lisp code that uses a large corpus of text and a user query or prompt to construct a fairly concise one-shot prompt with select context from the text corpus. This is RAG, and I used both BM25 and vector embeddings matching. I added the code and an example as a new chapter in my CL book (link directly to new material: https://leanpub.com/lovinglisp/read#leanpub-auto-autocontext...) yesterday afternoon. BM25 is fast. This is new code, and I will certainly be experimenting more with it, but as-is it is useful when working with small local LLMs.

godelski - 20 hours ago

It's kinda funny, Meta has long had some of the best in the field, but left them untapped. I really think if they just took a step back and stop being so metric focused and let their people freely explore then they'd be winning the AI race. But with this new team, I feel like meta mostly hired the people who are really good at gaming the system. The people that care more about the money than the research.

A bit of this is true at every major lab. There's tons of untapped potential. But these organizations are very risk adverse. I mean why not continue with the strategy that got us to the point we're at in the first place. Labs used to hire researchers and give them a lot of free reign. But those times ended and AI progress also slowed down. Maybe if you want to get ahead you gotta stop thinking like everyone else

Well meta... you can "hold me hostage" for a lot cheaper than those guys. I'm sure this is true for hundreds of passionate ML researchers. I'd take a huge pay cut to have autonomy and resources. I know for a fact there's many working at Mets right now that would do the same. Do maybe if you're going to throw money at the problem, diversify a bit and look back at what made SV what it is today and what made AI take leaps forward

hamasho - 17 hours ago

My theory is that as more people compete, the top candidates become those who are best at gaming the system rather than actually being the best. Someone has probably studied this. My only evidence is job applications for GAFAM and Tinder tho.
- crystal_revenge - 16 hours ago
  
  I've spent most of my career working, chatting and hanging out with what might be best described as "passionate weirdos" in various quantitative areas of research. I say "weirdos" because they're people driven by an obsession with a topic, but don't always fit the mold by having the ideal combination of background, credentials and personality to land them on a big tech company research team.
  The other day I was spending some time with a researcher from Deep Mind and I was surprised to find that while they were sharp and curious to an extent, nearly every ounce of energy they expended on research was strategic. They didn't write about research they were fascinated by, they wrote and researched on topics they strategically felt had the highest probability getting into a major conference in a short period of time to earn them a promotion. While I was a bit disappointed, I certainly didn't judge them because they are just playing the game. This person probably earns more than many rooms of smart, passionate people I've been in, and that money isn't for smarts alone; it's for appealing to the interests of people with the money.
  You can see this very clearly by comparing the work being done in the LLM space to that being done in the Image/Video diffusion model space. There's much more money in LLMs right now, and the field is flooded with papers on any random topic. If you dive in, most of them are not reproducible or make very questionable conclusions based on the data they present, but that's not of very much concern so long as the paper can be added to a CV.
  In the stable diffusion world it's mostly people driven by personal interest (usually very non-commericial personal interests) and you see tons of innovation in that field but almost no papers. In fact, if you really want to understand a lot of the most novel work coming out of the image generation world you often need to dig into PRs made by an anonymous users with anime themed profile pic.
  The bummer of course is that there are very hard limits on what any researcher can do with a home GPU training setup. It does lead to creative solutions to problems, but I can't help but wonder what the world would look like if more of these people had even a fraction of the resources available exclusively to people playing the game.
  - kcexn - 12 hours ago
    
    This is such a nuanced problem. Like any creative endeavour, the most powerful and significant research is driven by an innate joy of learning, creating, and sharing ideas with others. How far the research can be taken is then shaped by resource constraints. The more money you throw at the researchers, the more results they can get. But there seems to be a diminishing returns kind of effect as individual contributors become less able to produce results independently. The research narrative also gets distorted by who has the most money and influence, and not always for the better (as recent events in Alzheimer's research has shown).
    The problem is once people's livelihoods depend on their research output rather than the research process, the whole research process becomes steadily distorted to optimise for being able to reliably produce outputs.
    Anyone who has invested a great deal of time and effort into solving a hard problem knows that the 'eureka' moment is not really something that you can force. So people end up spending less time working on problems that would contribute to 'breakthroughs' and more time working on problems that will publish.
  - RataNova - 8 hours ago
    
    The tragedy is exactly what you said: all that energy, creativity, and deep domain obsession locked out of impact because it’s not institutionally “strategic.”
  - smokel - 13 hours ago
    
    > I certainly didn't judge them because they are just playing the game.
    Please do judge them for being parasitical. They might seem successful by certain measures, like the amount of money they make, but I for one simply dislike it when people only think about themselves.
    As a society, we should be more cautious about narcissism and similar behaviors. Also, in the long run, this kind of behaviour makes them an annoying person at parties.
    
    what-the-grump - 12 hours ago
    
    But this is in itself selfish right?
    You dislike them because they don’t benefit you indirectly by benefiting society at large.
    The incentive structure is wrong, incentivizing things that benefit society would be the solution not judging those that exist in the current system by pretending altruism is somehow not part of the same game.
    
    smokel - 12 hours ago
    
    I agree that the system itself is dysfunctional, and I understand the argument that individuals are shaped or even constrained by it. However, in this case, we are talking about people who are both exceptionally intelligent and materially secure. I think it's reasonable to expect such individuals to feel some moral responsibility to use their abilities for broader good.
    As for whether that expectation is "selfish" on my part, I think that question has been debated for centuries in ethics, and I'm quite comfortable landing on the side that says not all disapproval is self-interest. In my own case, I'm not benefiting much either :)
    
    Eisenstein - 12 hours ago
    
    There is a difference between being selfish in the sense that you want others to contribute back to the society that we are all part of, and being selfish in the sense that you want to compete for exclusive rewards.
    You can call this difference whatever you want, don't pretend that they are morally or effectively equivalent.
    
    kakacik - 8 hours ago
    
    Selfish for the long term future and prosperity of mankind? Thats some good selfishness all right.
    
    idiotsecant - 5 hours ago
    
    This take is simply wrong in a way that I would normally just sigh and move on, but it's such a privileged HN typical pov that I feel like I need to address it. If a plumber did plumbing specifically because someone needed it and he would be paid, would you call them a narcissist? If a gardener built a garden how their customer wanted would you call them a narcissist? Most of the world doesn't get to float around in a sea of VC money doing whatever feels good. They find a need, address it, and get to live another day. Productively addressing what other people need and making money from it isn't narcissism, it's productivity.
    
    lkey - 5 hours ago
    
    You are comparing a skilled trade that commands ~100k annual compensation to positions that have recently commanded 100 million dollars in compensation upon signing, no immediate productivity required, as this talent denial is considered strategic.
    You consider the person who expects eventual ethical behavior from people that have 'won' capitalism (never have to labour again) to be privileged.
    
    szundi - 12 hours ago
    
    [dead]
    
    bradleyjg - 11 hours ago
    
    but I for one simply dislike it when people only think about themselves
    The key word there is only. Nothing in the post you suggested only. You have one vignette about one facet of this guy’s life.
    I really dislike the resurgence in Puritanism.
    
    smokel - 10 hours ago
    
    Please don't read too much into this single word. The comment above mentioned "nearly every ounce of energy they expended on research was strategic", and I was keeping that in mind while writing my remark.
    Please read my sibling comment where I expand a bit on what I meant to say.
    
    - 10 hours ago
    
    [deleted]
- b00ty4breakfast - 18 minutes ago
  
  that's what happens at the top of most competitive domains. Just take a look at pro sports; guys are looking for millimeters to shave off and they turn to "playing the game" rather than merely improving athletic performance. Watching a football game (either kind) and a not-small portion of the action is guys trying to draw penalties or exploit the rules to get an edge.
- godelski - 17 hours ago
  > Someone has probably studied this
  There's even a name for it
  https://en.wikipedia.org/wiki/Goodhart%27s_law
  - ivanbelenky - 14 hours ago
    
    Thanks for sharing. I did not know this law existed and had a name. I know nothing about nothing but it appears to be the case that the interpretation of metrics for policies assume implicitly the "shape" of the domain. E.g. in RL for games we see a bunch of outlier behavior for policies just gaming the signal.
    There seems to be 2 types
    - Specification failure: signal is bad-ish, a completely broken behavior --> local optimal points achieved for policies that phenomenologically do not represent what was expected/desired to cover --> signaling an improvable reward signal definition
    - Domain constraint failure: signal is still good and optimization is "legitimate", but you are prompted with the question "do I need to constraint my domain of solutions?"
    - finding a bug that reduces time to completion of a game in a speedrun setting would be a new acceptable baseline, because there are no rules to finishing the game earlier - shooting amphetamines on a 100m run would probably minimize time, but other factors will make people consider disallowing such practices.
    
    Eisenstein - 11 hours ago
    
    I view Goodhart's law more as a lesson for why we can never achieve a goal by offering specific incentives if we are measuring success by the outcome of the incentives and not by the achievement of the goal.
    This is of course inevitable if the goal cannot be directly measured but is composed of many constantly moving variables such as education or public health.
    This doesn't mean we shouldn't bother having such goals, it just means we have to be diligent at pivoting the incentives when it becomes evident that secondary effects are being produced at the expense of the desired effect.
    
    godelski - an hour ago
    
    > This is of course inevitable if the goal cannot be directly measured
    It's worth noting that no goal can be directly measured[0].
    I agree with you, this doesn't mean we shouldn't bother with goals. They are fantastic tools. But they are guides. The better aligned our proxy measurement is with the intended measurement then the less we have to interpret our results. We have to think less, spending less energy. But even poorly defined goals can be helpful, as they get refined as we progress in them. We've all done this since we were kids and we do this to this day. All long term goals are updated as we progress in them. It's not like we just state a goal and then hop on the railroad to success.
    It's like writing tests for code. Tests don't prove that your code is bug free (can't write a test for a bug you don't know about: unknown unknown). But tests are still helpful because they help evidence the code is bug free and constrain the domain in which bugs can live. It's also why TDD is naive, because tests aren't proof and you have to continue to think beyond the tests.
    [0] https://news.ycombinator.com/item?id=45555551
  - julienreszka - 17 hours ago
    
    It’s a false law tho. Collapses under scrutiny
    
    NBJack - 15 hours ago
    
    If I hadn't seen it in action countless times, I would belive you. Changelists, line counts, documents made, collaborator counts, teams lead, reference counts in peer reviewed journals...the list goes on.
    You are welcome to prove me wrong though. You might even restore some faith in humanity, too!
    
    godelski - 17 hours ago
    
    Sorry, remind me; how many cobras are there in India?
    
    bandrami - 12 hours ago
    
    The Zoological Survey of India would like to know but hasn't figured out a good way to do a full census. If you have any ideas they would love to hear them.
    Naja naja has Least Concern conservation status, so there isn't much funding in doing a full count, but there are concerns as encroachment both reduces their livable habitat and puts them into more frequent contact with humans and livestock.
    
    oblio - 12 hours ago
    
    The comment was a joke.
    https://en.wikipedia.org/wiki/Perverse_incentive
    
    epwr - 17 hours ago
    
    Could you elaborate or link something here? I think about this pretty frequently, so would love to read something!
    
    vasco - 16 hours ago
    
    Metric: time to run 100m
    Context: track athlete
    Does it cease to be a good metric? No. After this you can likely come up with many examples of target metrics which never turn bad.
    
    noosphr - 15 hours ago
    
    If it were a good metric there wouldn't be a few phone books worth of regulations on what you can do before and during running 100 meters. From banning rocket shoes, to steroids, to robot legs the 100 meter run is a perfect example of a terrible metric both intrinsically as a measure of running speed and extrinsically as a measure of fitness.
    
    AnthonyMouse - 15 hours ago
    
    > Metric: time to run 100m
    > Context: track athlete
    > Does it cease to be a good metric? No.
    What do you mean? People start doping or showing up with creatively designed shoes and you need to layer on a complicated system to decide if that's cheating, but some of the methods are harder to detect and then some people cheat anyway, or you ban steroids or stimulants but allow them if they're by prescription to treat an unrelated medical condition and then people start getting prescriptions under false pretexts in order to get better times. Or worse, someone notices that the competition can't set a good time with a broken leg.
    
    godelski - 16 hours ago
    
    So what is your argument, that it doesn't apply everywhere therefore it applies nowhere?
    You're misunderstanding the root cause. Your example works as the the metric is well aligned. I'm sure you can also think of many examples where the metric is not well aligned and maximizing it becomes harmful. How do you think we ended up with clickbait titles? Why was everyone so focused on clicks? Let's think about engagement metrics. Is that what we really want to measure? Do we have no preference over users being happy vs users being angry or sad? Or are those things much harder to measure, if not impossible to, and thus we focus on our proxies instead? So what happens when someone doesn't realize it is a proxy and becomes hyper fixated on it? What happens if someone does realize it is a proxy but is rewarded via the metric so they don't really care?
    Your example works in the simple case, but a lot of things look trivial when you only approach them from a first order approximation. You left out all the hard stuff. It's kinda like...
    Edit: Looks like some people are bringing up metric limits that I couldn't come up with. Thanks!
    
    vasco - 16 hours ago
    
    > So what is your argument, that it doesn't apply everywhere therefore it applies nowhere?
    I never said that. Someone said the law collapses, someone asked for a link, I gave an example to prove it does break down in some cases at least, but many cases once you think more about it. I never said all cases.
    If it works sometimes and not others, it's not a law. It's just an observation of something that can happen or not.
    
    godelski - 16 hours ago
    
    > I never said all cases.
    You're right. My bad. I inferred that through the context of the conversation.
    > If it works sometimes and not others, it's not a law.
    I think you are misreading and that is likely what lead to the aforementioned misunderstanding. You're right that it isn't a scientific law, but the term "law" gets thrown around a lot in a more colloquial manner. Unfortunately words are overloaded and have multiple meanings. We do the same thing to "hypothesis", "paradox", and lots of other things. I hope this clarifies the context. (even many of the physics laws aren't as strong as you might think)
    But there are many "laws" used in the same form. They're eponymous laws[0], not scientific ones. Read "adage". You'll also find that word used in the opening sentence on the Wiki article I linked as well as most (if not all) of them in [0]
    [0] https://en.wikipedia.org/wiki/List_of_eponymous_laws
    
    exe34 - 15 hours ago
    
    it doesn't break down - see comments about rules above. it was the perfect example to prove yourself wrong.