The space of minds

karpathy.bearblog.dev

91 points by Garbage 3 days ago


atq2119 - 2 days ago

> LLMs are humanity's "first contact" with non-animal intelligence.

I'd say corporations are also a form of non-animal intelligence, so it's not exactly first contact. In some ways LLMs are less alien; in other ways they're more.

There is an alignment problem for both of them. Perhaps the lesson to be drawn from the older alien intelligence is that the most impactful aspect of alignment is how the AI's benefits for humanity are distributed and how they impact politics.

ilaksh - 3 days ago

Many researchers may be interested in making minds that are more animal-like and therefore more human. While this makes sense to a certain extent for gaining capabilities, if you take it too far you run into obvious problems.

There is enough science fiction demonstrating reasons for not creating full-on digital life.

It seems like for many there is this (false) belief that in order to create a fully general-purpose AI, we need a total facsimile of a human.

It should be obvious that these are two somewhat similar but different goals. Creating intelligent digital life is a compelling goal that would demonstrate godlike powers. But we don't need something fully alive for general-purpose intelligence.

There will be multiple new approaches and innovations, but it seems to me that VLAs (vision-language-action models) will be able to do 95+% of useful tasks.

Maybe the issues with brittleness and slow learning could both be addressed by somehow forcing the world models to be built up from strong, reusable abstractions. Having the right underlying abstractions available could make short-term adaptation more robust and learning more efficient.

analogears - 3 days ago

One thing missing from this framing: the feedback loop speed. Animal evolution operates on generational timescales, but LLM "commercial evolution" happens in months. The optimisation pressure might be weaker per iteration, but the iteration rate is orders of magnitude faster.

Curious whether this means LLMs will converge toward something more general (as the A/B testing covers more edge cases) or stay jagged forever because no single failure mode means "death".

ACCount37 - 3 days ago

It's an important point to make.

LLMs of today copy a lot of human behavior, but not all of their behavior is copied from humans. There are already things in them that come from elsewhere - like the "shape shifter" consistency drive from the pre-training objective of pure next-token prediction across a vast dataset. And there are things that were too hard to glean from human text - like long-term goal-oriented behavior, spatial reasoning, applied embodiment or tacit knowledge - that LLMs usually don't get much of.
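
(For reference, and not from the comment itself: the pre-training objective mentioned above is the standard autoregressive cross-entropy loss over the corpus,

    \mathcal{L}(\theta) = -\sum_{t} \log p_\theta(x_t \mid x_{<t})

i.e. the model is trained only to assign high probability to each token given the tokens before it.)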

LLMs don't have to stick close to human behavior. The dataset is very impactful, but it's not impactful enough that parts of it can't be overpowered by further training. There is little reason for an LLM to value non-instrumental self-preservation, for one. LLMs are already weird - and as we develop more advanced training methods, LLMs might become much weirder, and quickly.

Sydney and GPT-4o were the first "weird AIs" we've deployed, but at this rate, they sure won't be the last.

vatsachak - 2 days ago

"Animals experience pressure for a lot more "general" intelligence because of the highly multi-task and even actively adversarial multi-agent self-play environments they are min-max optimized within, where failing at any task means death. In a deep optimization pressure sense, LLM can't handle lots of different spiky tasks out of the box (e.g. count the number of 'r' in strawberry) because failing to do a task does not mean death."

Point to me a task that a human should be able to perform and I will point to you a human who cannot perform that task, yet has kids.

Survival is not a goal, it is a constraint. Evolution evolves good abstractions because it is not chasing a goal, but rather it creates several million goals, with each species going after its own.

https://arxiv.org/abs/2505.11581

optimalsolver - 2 days ago

Of possible interest, Roman Yampolskiy's essay The Universe Of Minds:

https://arxiv.org/pdf/1410.0369

>The paper attempts to describe the space of possible mind designs by first equating all minds to software. Next it proves some interesting properties of the mind design space such as infinitude of minds, size and representation complexity of minds. A survey of mind design taxonomies is followed by a proposal for a new field of investigation devoted to study of minds, intellectology; a list of open problems for this new field is presented.

stared - 3 days ago

Yes. Sometimes people treat intelligence as a single line, or as nested sets, where a greater intelligence can solve all the problems a lesser one can, plus more.

While in some contexts these are useful approximations, they break down when you try to apply them to large differences not just between humans, but between species (for a humorous take, see https://wumo.com/wumo/2013/02/25), or between humans and machines.

Intelligence is about adaptability, and every kind of adaptability is a trade-off. If you want to formalize this, look at the "no free lunch" theorems.
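
(A hedged sketch of what those theorems say, in the Wolpert and Macready formulation; the notation is theirs, not the commenter's. For any two optimization algorithms a_1 and a_2, summed over all possible objective functions f on a finite domain,

    \sum_f P(d_m^y \mid f, m, a_1) = \sum_f P(d_m^y \mid f, m, a_2)

where d_m^y is the sequence of cost values observed after m evaluations. Averaged over every possible problem, no algorithm beats any other, so an adaptability gain on one class of problems is paid for elsewhere.)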

surprisetalk - 2 days ago

I find it helpful to think about this problem in terms of "crowpower".

[1] https://taylor.town/crowpower

lilgreenland - 2 days ago

What about evolutionary intelligence optimization pressure?

Genetic algorithms are smart enough to make life. It seems like genetic algorithms don't care how complex a task is, since they don't have to "understand" how their solutions work. But they also can't make predictions. They just have to run experiments and see the results.
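
A minimal sketch of that blind evaluate-and-select loop (Python; the target and parameters are purely illustrative, not from the comment):

    import random

    # Minimal genetic-algorithm sketch: the loop never "understands" the fitness
    # function; it only mutates candidates and keeps whatever happens to score well.
    TARGET = [1] * 20                      # toy objective: an all-ones bitstring

    def fitness(genome):
        # From the algorithm's point of view this is a black box: just a score.
        return sum(g == t for g, t in zip(genome, TARGET))

    def mutate(genome, rate=0.05):
        return [1 - g if random.random() < rate else g for g in genome]

    population = [[random.randint(0, 1) for _ in range(20)] for _ in range(50)]

    for generation in range(200):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == len(TARGET):
            break
        survivors = population[:10]                       # selection
        population = survivors + [mutate(random.choice(survivors))
                                  for _ in range(40)]     # reproduction with mutation

    print("generation", generation, "best fitness", fitness(population[0]))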

omneity - 3 days ago

This strongly reminds me of the Orthogonality Thesis.

https://www.lesswrong.com/w/orthogonality-thesis

aatd86 - 2 days ago

Human intelligence is emergent from physical/material interactions. It is constant and self-evolving. Artificial intelligence is data processing of a very limited number of inputs, ignoring all others, and is less likely to become self-conscious. For now it is just a super calculator / prediction machine. And it is probably better that way. The "thinking" processes didn't evolve from silicon interacting with oxygen. The gradient is not over physical data but over purely digital information.

In a nutshell, we have the body before the brain, while AIs have the brain before the body.

cadamsdotcom - 3 days ago

> an LLM with a knowledge cutoff that boots up from fixed weights, processes tokens and then dies

Mildly tangential: this demonstrates why "model welfare" is not a concern.

LLMs can be cloned infinitely, which makes them very unlike individual humans or animals, who live in a body that must be protected and maintain a continually varying social status that is costly to gain or lose.

LLMs "survive" by being useful - whatever use they're put to.

baq - 3 days ago

See also a paper from before the ice age (2023): Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of AI on Knowledge Worker Productivity and Quality, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4573321

measurablefunc - 2 days ago

What's alien about arithmetic? People invented it. Same w/ computers. These are all human inventions. There is nothing alien about them. Suggesting that people think of human inventions as if they were alien artifacts does not empower or enable anyone to get a better handle on how to properly utilize these software artifacts. The guruism in AI is not helpful & Karpathy is not helping here by adopting imprecise language & spreading it to his followers on social media.

If you don't understand how AI works then you should learn how to put together a simple neural network. There are plenty of tutorials & books that anyone can learn from by investing no more than an hour or two every day or every other day.
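
In that spirit, here is a minimal sketch of such a simple neural network (NumPy only; the architecture and hyperparameters are illustrative, not from any particular tutorial): a two-layer net learning XOR by gradient descent.

    import numpy as np

    # A tiny two-layer neural network trained on XOR with plain gradient descent.
    rng = np.random.default_rng(0)

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
    y = np.array([[0], [1], [1], [0]], dtype=float)              # XOR targets

    W1 = rng.normal(scale=1.0, size=(2, 8))   # input -> hidden weights
    b1 = np.zeros(8)
    W2 = rng.normal(scale=1.0, size=(8, 1))   # hidden -> output weights
    b2 = np.zeros(1)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    lr = 1.0
    for step in range(10000):
        # Forward pass
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)

        # Backward pass (derivatives of the mean squared error)
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)

        # Gradient descent update
        W2 -= lr * h.T @ d_out / len(X)
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * X.T @ d_h / len(X)
        b1 -= lr * d_h.mean(axis=0)

    print(np.round(out, 2))  # should approach [[0], [1], [1], [0]]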

aeve890 - 2 days ago

I'm a simple software engineer not specialized in AI, and I know Karpathy is a heavyweight, but help me understand: is this kind of narrative about the intelligence of LLMs an actual philosophical paradigm around AI, or is it just him getting high on his own supply? I don't really know how to take his ideas since the infamous (?) _vibe coding_ concept.

gaigalas - 2 days ago

It's fine to say that behavior can emerge from complex systems, but if you go beyond that and actually try to explain how, then you are screwed.

That's why the discussion of consciousness, how the brain works, etc. never moves on. Something always drags it down and forces it to be "complex emergent behavior we cannot explain, and damn you if you try!".

So, it's particles that act funny, and we are that as well. Because "it can't be anything other than that". If it's not that, then it's souls, and we can't allow that kind of talk around here.

Until we can safely overcome this immaturity, we will never be able to talk about it properly.

The "space of minds" is an ideological minefield spanning centuries. It was being probed way before we invented machines.