They’re made out of weights
maxleiter.com | 1225 points by MaxLeiter 19 hours ago
The weights start with a random manifold. The training takes data and shapes the manifold, weight by weight, in many cycles. Once the training is done, the manifold is fixed.
When a new inference has to be done, the query (q) is projected into the manifold space. This projection is dropped on the manifold, and the gravity of the manifold gives an answer of length q+1. That result is dropped again, n times in total, to output a final response of length n.
The gravity is created by repeated multiplication (of the weights and the input) to find out how the projected embeddings should fall according to the manifold in the GPU.
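A minimal sketch of the loop described above, assuming a toy vocabulary and random fixed weights. The names `VOCAB`, `make_weights`, `next_token` and `generate` are made up for illustration, and a real transformer is far more involved than one matrix multiply over token counts. The point is only the shape of the process: the weights never change during inference, each step appends one token, and the step is repeated n times.

```python
import random

# Toy vocabulary (an assumption for illustration only).
VOCAB = ["the", "weights", "are", "fixed", "."]


def make_weights(seed=0):
    """Stand-in for training: a fixed square matrix of random numbers."""
    rng = random.Random(seed)
    n = len(VOCAB)
    return [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]


def next_token(weights, tokens):
    """One 'drop': multiply a context vector by the weights, pick the top score."""
    n = len(VOCAB)
    # Crude context vector: how often each vocabulary entry appears so far.
    context = [0.0] * n
    for t in tokens:
        context[t] += 1.0
    # scores[j] = sum_i context[i] * weights[i][j]  (a vector-matrix product)
    scores = [sum(context[i] * weights[i][j] for i in range(n)) for j in range(n)]
    return max(range(n), key=lambda j: scores[j])


def generate(weights, query, n_new):
    """Append one token per step, n_new times; the weights stay untouched."""
    tokens = list(query)
    for _ in range(n_new):
        tokens.append(next_token(weights, tokens))
    return tokens


if __name__ == "__main__":
    w = make_weights()
    out = generate(w, [0, 1], 3)  # query: "the weights", then 3 more tokens
    print(" ".join(VOCAB[t] for t in out))
```

Each call to `next_token` sees the query plus everything generated so far, which is the "q, then q+1, then q+2" progression from the comment.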
It's like a giant plinko board where the shape of the original disc guides how it falls through the apparatus, and the apparatus has been tuned so that different discs end up in the exits we want them to.
That's a very concise and illuminating way to think about what's happening, IF (and only if) you already know how these models work. Thanks for that.
Yes this is more like compression to remember and not for learning/understanding.
Compression is the reason why these Models are able to learn and understand.
My brain is doing the exact same thing.
I learned enough to compress concepts like a bike, what a bike does, and what I can use a bike for.
Ask an LLM and it will answer you similarly to humans.
Blind people learn concepts of bikes too, and in a similar way: by description.
LLMs just have so much data available in the form of text, and are able to ingest all of it, that the LLM compression algorithm doesn't have to be as good/fine-tuned as ours.
But I would assume that Yann LeCun's JEPA or other breakthroughs in the next few years will get us there.
> Blind people learn concepts of bikes too, and in a similar way: by description.
And by touch and sound. And maybe some were daring enough to ride one, or unlucky enough to get hit by one. But they have way more input than just text.
LLMs also have other inputs, like audio and images. They get encoded (just like a human eye encodes an image) and passed to the weights.
Obligatory echolocation bit:
Invisibilia's episode was my first exposure to it.
https://www.npr.org/programs/invisibilia/378577902/how-to-be...
The man posits that clicking is instinctual for blind people but they are told to quiet down in class and most never develop their echolocation abilities
So a blind person can only describe lava to you after they have touched and heard it?
A blind person has touched warm and hot things and gotten burned before, and then they are told lava is this molten liquid that is even hotter than anything they have touched. That is enough for them to understand.
A blind person that never touched a hot object wouldn't really know though, there is a reason we dismiss talk from people who lack experience.
You don't know that. You don't know what someone would think if you tell them the general concept of cold and warm.
The reaction you should have, the feeling etc.
I asked chatgpt how it would describe a scene without mentioning temperature. It was very good in describing what a human would describe.
I'm aware of the bias we have against LLMs but I think people just underestimate how much data is there.
I'm not saying a robot or an LLM wouldn't be better off with this information, and robots do use temperature sensors so they can control movement speed and dexterity when elements overheat, but the gap is small.
In what way is that different from any other model of reality that you'd use to winnow a dataset into an answer to a question? The only major difference I see is that beyond a certain number of transformations, people are willing to treat it as some sort of miracle, and too tired to figure out why it came up with the answer it came up with. It's almost like people desperately want to give up their agency and creativity to black boxes, whether those weights produce answers that are right or wrong. Factor in that psychology and it looks a lot less like we have invented something useful, and a lot more like we as a species are choosing to quit life en masse.
> The only major difference I see is that beyond a certain number of transformations, people are willing to treat it as some sort of miracle, and too tired to figure out why it came up with the answer it came up with.
It’s funny, because I thought you were talking about humans here when you wrote this. We know some things about how our bodies encode information that is sent to the brain, and we know some things about how neurons receive information and act on it, but after that we get too tired and give up on how the brain works and treat it like a miracle.
It’s like we desperately want to believe our consciousness is not just electrical impulses in our brain, and we want to ascribe agency and uniqueness to the physical processes going on in our head.
> but after that we get too tired and give up on how the brain works and treat it like a miracle.
I disagree. We know very well how neurons work, and we have a pretty good idea of how neural activity translates to behavior. In other words, we have a pretty good idea of how the brain works. We stop at consciousness because as of yet it is in the realm of philosophy, not science. We don't know what consciousness is or even whether or not it is useful for science, and we are simply waiting for the philosophers to guide us out of that situation.
Note that both cognitive psychology and behavioral psychology have done fine without tackling consciousness. When neuropsychology emerged in the 1980s it complemented both these fields perfectly. The situation is the opposite with the philosophy of mind which grew significantly around the same time.
There have been some attempts to describe consciousness as an emergent phenomenon arising out of neural activity, but so far all of these attempts have failed, or at least failed to turn consciousness into a useful term in psychology (the way gravity is a useful term in physics). I think it is as likely that these attempts have failed because consciousness is simply not a useful term in psychology as it is that we simply don't understand it well enough.
Saying we have a good idea of how the brain works massively overstates the case...
We know how neurons fire. We do not know how a brain turns that into thought, meaning, intention, experience and on and on. That is not "pretty well understanding the brain", it's understanding some components and hand waving the thing we actually care about.
What I actually care about is how neural activity translates to behavior. And we have a good enough idea of that that we can design SSRI medicine to treat depression, or neurological tests to detect Alzheimer's. As for experience, we do know something, and we are learning more with cognitive psychology, e.g. in priming experiments.
I feel like the search for consciousness is to psychology what the search for the Aether was for physics and chemistry. I think it is a worthwhile search, and maybe we will discover something important during that search, but we should also be prepared to find out that the thing might not exist, or its presumed properties are better explained with a different model.
SSRIs are not evidence that we understand how neural activity becomes behavior. They are evidence that you can perturb a system usefully without understanding it very well. That is exactly my point.
Respectfully, you are miles out of your depth here.
> beyond a certain number of transformations, people are willing to treat it as some sort of miracle, and too tired to figure out why it came up with the answer it came up with
It’s less about being too tired and more about being realistic about the limits of understanding.
Consider mass and energy flows in planet-scale systems: At some point we call these “weather” and change the tools with which we study them, but we never stopped trying to understand the phenomenon.
If you're going to make something smarter than a person, you got to be convinced that you're only going to be able to understand it on the single training step level and then inductively trust that the rest of it works. We do empirical testing of course with evals, but there's sort of an art to figuring out what is theoretically going to improve eval performance. Trying to fit the meaning of all those weights in your little human brain and working back from there isn't going to work for more than a little slice of the dataset at a time because that's all we can fit in our understanding.
When we attempt to recreate those complex, planetary atmospheric phenomena in a box, we're doing so in order to measure and study them.
Making random turbulence in a box until it resembles the outside world, and calling it weather and extrapolating some predictive meaning from the result, is the total antithesis of what you're describing about why we come up with simplified models for impossibly complex systems. The purpose of [mathematical] models that are built thoughtfully is to explain why complex systems are the way they are, with data and algorithms, however imperfectly. [Whereas] The purpose of LLM models is to give the illusion of answering questions while never answering why the answer was given. The difference is the difference between a scientist and a tarot card reader, an equation and an oracle.
People have a well known tendency to gravitate toward the shamanistic, oracular, and superstitious. Listen, I ran a casino for 6 years, I know. The impossibility of knowing how 80 layers of matrix multiplication led to a particular answer is in itself a psychological factor in choosing whether to accept the answer or to question it. People tend to err on the side of the over, in sports betting terms... or on the lazy side in general... and they will make whatever excuses they need to after the fact to justify their decisions. So now we have a machine that can act like an oracle and which you can also blame, but the blame goes into a void because this machine is stateless and is only a reflection of information, not an intentional refinery of data.
Sit next to a bank of slot machines for an hour and listen to the absolutely ridiculous shit most people will come up with to explain how they "know" if a machine is going to pay out soon, and then tell me if you think it's a good idea to give them an LLM in their pocket to answer their questions in whatever way they frame them.
> The purpose of [mathematical] models that are built thoughtfully is to explain why complex systems are the way they are, with data and algorithms, however imperfectly.
Nope. The main purpose of the whole endeavor is usually to predict the behavior of a complex system, because that's actually what we care about. If we can predict it, we can adapt to it, and eventually use it to our advantage.
Explaining why a complex system is the way it is, is merely nice-to-have. Models are opinions. All of them are wrong, but some are useful, and we rank them by how useful they are. The models and explanations are important because, beyond their elegance and convenience, it's also the case that more accurate models give you better predictions across larger domains, meaning we get better at getting something useful out of the complex system.
People get fixated on modern theoretical science, with bottom-up mathematical explanations traced through seas of empirical data, with whole magical rituals of peer review and double-blind studies and statistical significance around them. But they forget that the core of empirical science is literally throwing shit at a wall to see what sticks. That is the guiding principle, everything else is just making the process more efficient.
Understanding complex natural systems (or even engineered ones that got too complex) always starts with tests - tests on the real thing, then on approximate models that we poke and prod and bash into shape until they start acting similarly to the real thing. It's through the poking and bashing, and how they affect our proxy model, that we glean insights into nature of the simulated phenomena, and eventually formulate general theories - but more importantly, the models give us useful predictions from the start, before we have any theories explaining why.
I don't know - this is a highly specific interpretation of both what science is and why people choose to do it.
I'm a scientist. Believe it or not, I believe in substantially more than prediction and I think it's rather trivial to come up with examples where mere prediction is insufficient to meet a normal person's notion of an account of a thing (e.g., pre-Copernican planetary motion). I'm not saying you are wrong, per se, just that the idea that "it was prediction all along" is a very specific idea of what human beings are interested in and what we are up to.
> that we glean insights into nature of the simulated phenomena
That is right - most people believe that there is a simulated phenomenon "out there" that we learn about. I think there are strong reasons to believe this having to do with how models are related to predictions. The wrong ontology can make prediction very hard and the right one can make prediction substantially easier. Arguably, we are in that situation right now with language models - we just threw a lot of parameters at the problem and now we are able to predict but we still don't really understand. This is perhaps inevitable in the case of language, but I don't think we should look at models with tons of degrees of freedom and the ability to predict things as a death knell for the very idea of deeper understanding.
> The purpose of LLM models is to give the illusion of answering questions while never answering why the answer was given.
This is just your own idiosyncratic and biased belief. You're not describing anything objective about LLMs, you're describing your personal attitude to them. This colors your understanding in a way that can't really be reasoned with until you let go of the artificial constraints you're imposing on your own understanding.
> Sit next to a bank of slot machines for an hour and listen to the absolutely ridiculous shit most people will come up with to explain how they "know" if a machine is going to pay out soon, and then tell me if you think it's a good idea to give them an LLM in their pocket to answer their questions in whatever way they frame them.
If the LLM in their pocket has a more robust world model than they themselves and is e.g. able to refute their irrational convictions, it actually seems like a very good idea. (Big if, of course.)
Agency?
What are you talking about?
I want freedom.
I want the freedom to do what I want, not to sit in front of a computer coding for some company.
Please, AI, let's burn down knowledge work and labor work. Let's create so much stress in our society that we start rethinking what work means.
Let's redefine work into discovering the world again. Let people do old handcraft jobs, let them do more sports, let them read more, let them write and make more. Let them enjoy nature.
Work has never been about "discovering the world". There have been a handful of privileged folks who had the time to "discover the world". Work has traditionally been "let's find enough food for my family". If you want to think of a future of abundance then perhaps we can discover the world.
> Let's redefine work into discovering the world again. Let people do old handcraft jobs, let them do more sports, let them read more, let them write and make more. Let them enjoy nature.
Why leave something so important up to what AI does or doesn't do?
Because capitalism doesn't allow for that.
Only a fundamental change to our society will allow this for the masses, when pressure on the rich skyrockets.
This seems to be a little naive about how humans consume the benefits we create in society.
"Let people do old handcraft jobs, let them do more sports, let them read more, let them write and make more. Let them enjoy nature."
Very nice thoughts. You know we all could do this today without "burning it down"? Get in your pod, eat your slop, and watch your screen is where this is headed.
"I want the freedom to do what I want, not to sit in front of a computer coding for some company."
You get that it's you creating the misery here? Then stop? Don't do it. Go start a farm or whatever you think will solve your problems. At some point this all boils down to "chop wood and fetch water" so if the modern way of doing that is so terrible then stop. Go fetch water the old fashioned way and be free.
The solution we've come up with is move all the unpleasant work stuff to China where people don't complain about doing it because they already have communism, and therefore everything is of course effortlessly perfect there.
"I want the freedom to do what I want, not to sit in front of a computer coding for some company."
"Please, AI, let's burn down knowledge work and labor work"
"Let people do old handcraft jobs."
So many presuppositions about what people want to do.
As a child I spent a lot of time programming and doing "knowledge work" because it's fun - I don't enjoy "old hand-crafted jobs". Sure, let's definitely destroy capitalism in its current state I suppose. But I find people like you, who hate knowledge work/coding and think everyone else must feel the same and only do it for the money, a bit out of touch.
Right, these knowledge work and coding jobs are, by my lights, about the best possible job. From my perspective we've invented a machine that does the fun parts while leaving me the less fun parts (review, various hard-to-claude janitorial tasks, etc).
I might like woodworking as a hobby (for example), but I sure as heck don't want to be a carpenter or to depend on my ability to hand craft enough widgets people like to survive
I differentiate between things you have to do (work) and things you want to do. Work means someone else is telling you your priorities.
If you want to write code and think, you would be welcome in my utopian vision.
But when I write code, it's business shit. And it's business shit someone else has already solved a few times.
The weights are code, the prompt is code, the output is code.
Is the meat code?
The data is the code. Training algorithm is the compiler. The weights are the byte code produced to run on the inference VM.
The data is the code is the data. Reality has no distinction between "data" and "code". These terms are categories we impose on systems we design, to make it easier for us to build and reason about them, but they're nothing but mere opinions, and depend less on the system structure, and more on the perspective of the person asking which is which.
This is related to, and possibly equivalent with, the core point of both this story and the original one: computation is independent from substrate.
You can build a computer out of anything, whether it's semiconductors or lasers or meat or magnetic fields or water flowing downhill or abstract thought, and that computer will happily perform the same computation as every other equivalent construct from whatever substrate. That's because computers are ultimately made of math, and we design "real ones" by finding ways to approximate the mathematical constraints with physical systems. But the choice of how to map the math to physical systems is completely arbitrary, and any such mappings are equivalent from POV of information processing ability.
(Of course substrate is not arbitrary from economic POV, which is why we build most of our computers out of silicon and plastic, and make it work with electric current and lasers.)
> Reality has no distinction between "data" and "code"
yes, yes, ostensibly the universe is built on lisp.
But we all know that it was hacked together with a lot of perl[1].
[1] you all know the reference.
One of the best things I've done for my career (as a self-taught software developer, but with a degree in electronic engineering) was to learn computation theory.
Computation is math (and a very restricted subset of math). It’s mostly specific sequences of set manipulations. Which sets and which manipulations are used is decided by people, not by the idea of computation.
The best thing is that as soon as you specify the sequences of manipulation, they become a set that you can manipulate. That can be a difficult concept to grasp, but that’s what helps in designing notations that are more appropriate for the human mind to describe a solution for a specific problem.
Yes. Is it data? Yes.
Is the distinction between "code" and "data" just someone's opinion? Yes. There is no such distinction in reality.
This is a good model. If you take an old ROM dump from a video game, it's just a pile of bits. You don't know what bits represent code, what represent an image, what represent text, etc. You have to analyze them contextually to actually figure out what is code and what is "data" in context, because without context they are truly one and the same.
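A small sketch of the ROM-dump point, under stated assumptions: the five bytes and the two-instruction stack machine (`P` pushes the next byte, `+` adds the top two values) are invented for illustration and do not correspond to any real console or file format. The same bytes are read as text, as an integer, as pixel values, and as a program.

```python
import struct

# Five arbitrary bytes standing in for a slice of a ROM dump.
blob = b"P!P!+"

# Reading 1: ASCII text.
as_text = blob.decode("ascii")               # "P!P!+"

# Reading 2: the first four bytes as a little-endian unsigned 32-bit integer.
as_int = struct.unpack("<I", blob[:4])[0]    # 0x21502150

# Reading 3: one grayscale pixel per byte.
as_pixels = list(blob)                       # [80, 33, 80, 33, 43]


def run(program):
    """Reading 4: code for a hypothetical stack machine.

    0x50 ('P') pushes the following byte; 0x2B ('+') pops two values
    and pushes their sum. Both opcodes are made up for this example.
    """
    stack = []
    i = 0
    while i < len(program):
        op = program[i]
        if op == 0x50:
            stack.append(program[i + 1])
            i += 2
        elif op == 0x2B:
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
            i += 1
        else:
            raise ValueError(f"unknown opcode {op:#x} at offset {i}")
    return stack


if __name__ == "__main__":
    print(as_text, hex(as_int), as_pixels, run(blob))  # run(blob) -> [66]
```

Nothing in the bytes says which reading is the right one; that comes from whatever consumes them.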
That's why encountering something like LISP for the first time (by writing a LISP interpreter, for example) creates a big bang event in form of an imminent intellectual catharsis. People who encountered it just once, will never be able to see the world through the old "meaty" lenses afterwards.
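For anyone who has not written that interpreter yet, here is a rough sketch of the idea in Python rather than LISP. The nested-list syntax, the `OPS` table, and the function names are assumptions made for this example, not any particular dialect. A program is an ordinary list, so it can be evaluated or rewritten with the same tools as any other data.

```python
import operator
from functools import reduce

# Hypothetical operator table for a two-operator toy language.
OPS = {"+": operator.add, "*": operator.mul}


def evaluate(expr):
    """Evaluate a nested list such as ["+", 1, ["*", 2, 3]]."""
    if isinstance(expr, (int, float)):
        return expr
    op, *args = expr
    return reduce(OPS[op], [evaluate(a) for a in args])


def swap_plus_for_times(expr):
    """Treat the program as data: return a copy with every "+" replaced by "*"."""
    if not isinstance(expr, list):
        return expr
    op, *args = expr
    new_op = "*" if op == "+" else op
    return [new_op] + [swap_plus_for_times(a) for a in args]


program = ["+", 1, ["*", 2, 3]]

if __name__ == "__main__":
    print(evaluate(program))                       # 1 + (2 * 3) = 7
    print(evaluate(["*", 2, program]))             # wrap the program in more code: 14
    print(evaluate(swap_plus_for_times(program)))  # 1 * (2 * 3) = 6
```

`swap_plus_for_times` never runs the program; it only edits a list, which is the sense in which the code is data.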
Is matter code? There is some sort of computation happening in space over time.
By Fermat's principle, a ray of light has to know where it will ultimately end up before it can choose the direction to begin moving in.
So either something is computing it or some exploration is happening at quantum level and we just see the final result.
Fermat's principle is an outcome of constructive interference of waves. It works both for classical and quantum mechanical descriptions. E.g. check https://phys.libretexts.org/Bookshelves/University_Physics/U...
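A rough numerical sketch of that claim, with made-up numbers: light goes from A = (0, 1) to B = (1, -1) across an interface at y = 0, with assumed speeds 1.0 above and 0.5 below, and an arbitrary angular frequency of 200. The code sums unit phasors exp(i·ω·T(x)) over paths that cross the interface near a point x. It is a toy scalar model, not a full wave simulation, and all names and parameters are illustrative.

```python
import cmath
import math

V1, V2 = 1.0, 0.5                # assumed wave speeds above and below y = 0
A, B = (0.0, 1.0), (1.0, -1.0)   # assumed start and end points


def travel_time(x):
    """Time from A to B along two straight segments meeting at (x, 0)."""
    return math.hypot(x - A[0], A[1]) / V1 + math.hypot(B[0] - x, B[1]) / V2


def least_time_crossing(lo=0.0, hi=1.0, iters=200):
    """Ternary search for the minimum; travel_time is convex in x."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if travel_time(m1) < travel_time(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


def phasor_sum(center, half_width=0.1, omega=200.0, steps=2001):
    """Normalized magnitude of summed contributions from paths near `center`."""
    total = 0j
    for k in range(steps):
        x = center - half_width + 2 * half_width * k / (steps - 1)
        total += cmath.exp(1j * omega * travel_time(x))
    return abs(total) / steps


if __name__ == "__main__":
    x_star = least_time_crossing()
    sin1 = x_star / math.hypot(x_star, 1.0)
    sin2 = (1.0 - x_star) / math.hypot(1.0 - x_star, 1.0)
    print("Snell check:", sin1 / V1, sin2 / V2)  # the two ratios should agree
    print("near least-time path:", phasor_sum(x_star))
    print("far from it:        ", phasor_sum(0.2))
```

Paths near the least-time crossing have almost the same travel time, so their phasors add up; paths elsewhere differ in phase and mostly cancel. No path has to "know" anything in advance for the least-time route to dominate.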
> a ray of light has to know where it will ultimately end up before it can choose the direction to begin moving in
A ray of light doesn't know or choose because it has no agency, just like an apple doesn't know or decide to fall because of gravity. It's an anthropomorphization.
True, so the interference is the "computation"(heavy emphasis on quotes) which gives rise to the principle.
> a ray of light has to know where it will ultimately end up before it can choose the direction to begin moving in
I'm no physicist, so I guess I'll ask it: Why?
Also related, why do some rays of light then "see" a black hole yet decide to head into it anyway, if they saw it before they went in that direction? Seems like a dumb move :)
Its future isn't over there because it moves in that direction, instead it moves in that direction because its future lies over there.
Relatedly:
> [General Relativity] basically says that the reason you are sticking to the floor right now is that the shortest distance between today and tomorrow is through the center of the Earth.
https://physics.stackexchange.com/questions/250800/gr-and-my...
Does anyone have a link to a good video visualisation of training & inference?
3Blue1Brown has a great visual introduction to transformers, the heart of LLMs.
It's chapter 5. Start at chapter 1 if you want more background on neural nets and backprop.
Yes, yes, but what fertile fallacies and common misunderstanding can politicians use to acquire more power via exploiting the difference between the common person's flawed understanding due to cargo culting, cognitive biases, and/or outdated or inappropriate analogies vs actual reality? Is there any way we can get the AI to say give all political power to narrator is the solution to all problems and use the common person's mistaken worship of AI as a spiritual all knowing conscious being with unusual sensitivity and caring about everyone to cement that power? Certainly one of you eggheads can tweak that for me? What? It's against your ethics? We're trying to save the world here. Here, let me call up Bernie Sanders to propose nationalizing half your companies so we can do that.
The original story is an original work made by a human consciousness exploring how it might be different from other forms of consciousness.
This one is a pastiche made by a human consciousness borrowing extremely heavily from another human consciousness justifying why something else might be another form of consciousness.
That rather undercuts the point; if this was generated by an LLM unprompted, it would be different, but it isn't. You could perform exactly the same rhetorical trick with a toaster or anything else.
Thank you for your point. I don't understand why half these comments are taking this blog post seriously when it ends with "Weights helped me draft and proof this story."
> Weights helped me draft and proof this story.
Any HN reader here now, I encourage you to read the original ( https://www.eastoftheweb.com/short-stories/UBooks/TheyMade.s... ) in one sitting, go about your day, then read it again. Maybe make some notes on personal critical questions.
Now read the post's topic again ( https://maxleiter.com/blog/weights ) and reflect on the prior fact that weights helped [the author] draft and proof this story.
My reaction (and I'm sorry that it is harsh according to some) is that there is no intelligence found in either the author or their tool. This is extreme navel gazing, based in science fiction, wanting (wishing) to believe those stories to be true.
I'm skeptical of AI sentience because we must do our due diligence, not because it's impossible. Skepticism is the only respectful approach because to grant sentience is a step away from granting rights.
We humans tend to chauvinism in all things (e.g. we're special, the center of the Universe, God made the universe for us, etc), no less when it comes to judging intelligence. The original story about thinking meat was written to help us out of our chauvinism; this derived story was written about weights for the same reason. Which is quite valid.
The actual counterpoint is demonstrated in _Blindsight_ by Peter Watts. He makes a strong (and rather terrifying) point that intelligence is not consciousness.
I cannot tell if you are asserting my comment is chauvinistic with your use of "we." If that is so: that's a poor counter to my point or assessment of my stance because it assumes I'm making a baseless argument as a "proud human."
My original comment (roughly "there's no intelligence in this article, nor sentience in LLMs") is in response to the blog post's buried lede (that the cumulative activity of LLMs has accrued to a weight of "AGI is around the corner" or "there is artificial consciousness in this matrix").
To be clear, I'm not saying LLMs are useless or a wrong direction in development of "AI," but rather that they are Fool's Gold on the path towards AGI, the pursuit of the academic field of Artificial Intelligence research. That is research I had kept abreast of for years before this new age of language models made everyone with a keyboard an armchair expert.
Also, thank you for the book recommendation, it's on my list! :)
I read your comment as criticizing the OP's story as pointless and unoriginal. My comment elucidated the point of the original story, and what I think is the point of the second story.
Roger that, thank you. w/r/t your recapitulation of my point: yes, the story is unoriginal and pointless, and the HN community seems to eat it up-- isn't that odd.
So I still disagree with your elucidated point (as you end with "which is valid"): the OP author is using prior art fiction to bolster their opinion of LLM-based software tools as being a possible vector of sentience, not to disarm our chauvinism like the original author intended. If OP wanted to make that point, they could have written a critical essay instead of farming out their thoughts as tokens.
But still, I look forward to reading the book you suggested to understand and appreciate your perspective more.
The point is valid regardless of whether you judge LLM to be sentient or not, because the point is to say "don't let your prejudice about substrate bias your decision". Or in other words, if you're going to weigh something don't tip the scale. This is good advice whatever the outcome of the measurement.
Blindsight is a remarkable book - I hope you enjoy it!
Ah I see now that I am in agreement with you, thank you for being a patient interlocutor. I do not discard and rule out the possibility of a different substrate being the well-spring from which sentience emerges.
Plainly, based on the ground trodden so far and the trajectory laid out by the frontier AI labs, I do not have concrete evidence/proof of sentience having emerged from LLM-based software tools as of June 4 2026, nor do I expect it to happen in the future based on my understanding and observations of this technology. I'm not excluding the possibility but wielding skepticism. I am open to being proven wrong with new discoveries.
Which is why (to return to my lashing of the dead horse) I don't see OP's post as worthwhile. Their post reiterates a point that is already valid (the prior art) with no new substantial discovery. Which is why "unoriginal and pointless" is apt, a novel idea was not presented; it's just some vain virtue signaling.
I don't think skepticism should be called chauvinism. I imagine that artificial consciousness could be made. But I don't think this is it.
Also I don't see why intelligence not being consciousness is scary? My cats are very conscious as far as I can tell, but not particularly intelligent. I think LLMs exhibit some contextual intelligence without there being any particular reason to believe they're conscious other than woo pseudoscience.
You underestimate the intelligence of your cat. Or rather you measure intelligence with an extreme human bias. What you consider intelligent behavior your cat may consider weird, and what your cat considers intelligent behavior is something you will never consider.
That said, I don’t think it is useful for philosophy nor science to consider intelligence to be the same thing as consciousness. In fact I would go even further and claim that intelligence is not a useful construct, neither for philosophy nor for science. Consciousness, on the other hand, I think is useful for philosophy, but not (as of now) for science.
> I'm skeptical of AI sentience because we must do our due diligence, not because it's impossible. Skepticism is the only respectful approach because to grant sentience is a step away from granting rights.
Thanks for saying this! It amazes me to witness so much pushback (in HN of all places!) for the call for skepticism and scientific rigor on claims made by business which have vested interests in hyping things up.
People argue for AI sentience from a place of emotion couched in logic. They _desperately_ want it to be true and will not take a logical step back. Any argument comes back to "well doesn't a human brain work like that?" Or some variation of it.
My personal theory is a fuzzy thought about how people want to reject the concept of a higher being and want to embrace the fact that we are now able to create our own consciousness and religion is dead.
I don't understand why, but it is the undertone of every argument I've seen that is pro-AI-is-sentient, like some big unspoken elephant-in-the-room.
I would rather just judge this tech on its own merits.
edit: this comment got 1 upvote literally as I submitted it. I know @ doesn't work, but @dang, something seems very strange about that.
I'm open to the possibility of AI consciousness, and there is some desperation related to the concept of a higher being:
There are many people who will categorically rule out the possibility of AI consciousness due to near-unshakable belief in a higher being. This argument resembles "Christians should not be worried about our climate since God is ultimately in control." Such views make it harder to collectively prevent dangers from a sentient AI, or harm to a sentient AI.
I do not claim that everyone who believes in a higher power believes conscious AI to be impossible, or vice versa; just that it would be very hard to change the minds of those who adhere to this reasoning.
> People argue for AI sentience from a place of emotion couched in logic. They _desperately_ want it to be true and will not take a logical step back. Any argument comes back to "well doesn't a human brain work like that?" Or some variation of it.
It's funny, because I find myself constantly stating the inverse of this. Every argument I've seen against AI being sentient plainly comes from, as you so eloquently put it, exactly "a place of emotion couched in logic". People desperately want it to not be true and will not take the logical step back of examining its actual similarities to human intelligence. Every argument comes down to "but it's not actually a human", or some variation of it -- which, if you pay attention, is not actually a logical counterargument. (Or, ironically, "but it doesn't have a soul", which is why the Pope is the perfect figurehead for these people).
If you already know any logical argument against it can be countered with "well doesn't a human brain work like that?", why are you so confident that your position is actually the logical one?
...And could it simply be that, alternatively, the concept is not actually a logical distinction, but rather an emotional one, made by emotional beings to put a word to what they claim makes them special?
You can't do the same with a toaster. Physically you could write that story. But it would fall flat because the toaster is not a compelling subject in a discussion of consciousness. You don't have to believe that LLMs or AI agents are conscious to acknowledge that the argument for their consciousness is far more compelling than any other technological artifact.
You could absolutely write a compelling story about a sentient toaster; it's been done before [1].
That is entirely separate to whether or not it would be a meaningful way to understand the world; a convincing story is not the same thing as one that is true.
I never said you couldn't write any arbitrary compelling story about a toaster, I said that this specific hypothetical story, where you rewrite "They're made of meat!" to be about a toaster, would not be compelling.
I am doing my best to communicate with you but to be honest you are not hearing me (across both responses), and I am out of words.
Just wanted to say, I appreciate your patience and good sense in this thread.
It's difficult to tell who's trolling -- probably best to go with the charitable assumption that everyone is honestly trying to convey their opinion, but mostly talking past each other. Unfortunately these discussions about the nature of consciousness never go anywhere useful.
I think I'm probably in the same boat as you, roughly: a) LLMs are doing something really interesting that resembles in many ways both intelligence and consciousness; b) I suspect they're not actually conscious but I don't know how you'd know for sure; c) it all just drives home that we still don't really know what consciousness actually is. But like (a), it's definitely something really interesting...
I don't think I was quite as patient as I should have been, but I do appreciate it.
It would be equally compelling, because the compelling nature of the story comes from the language, the presentation, rather than the [specific thing being ascribed consciousness].
No, the original "they're made out of meat" works because we're confident that we are in fact intelligent and conscious, despite how ridiculous and unlikely the author manages to make it sound.
"They're made out of weights" works precisely because LLMs really do have this mysterious property that they seem somehow intelligent even though nobody can explain exactly why, and there's active debate over whether they could be considered conscious.
The thing being discussed isn't simply an arbitrary MacGuffin; in both cases the nature of the thing is central to the impact of the story.
I disagree; it works in the original because it's the unlikely consciousness that produces the text itself; in the LLM case, it's produced by the likely consciousness.
"Imagine how other intelligences would view us", written by us, hits a lot less hard when it's "imagine how our intelligences view a thing we are claiming is intelligent", not written by it.
> "Imagine how other intelligences would view us", written by us, hits a lot less hard when it's "imagine how our intelligences view a thing we are claiming is intelligent", not written by it
This is well put. We don't need to imagine how a human views an LLM because we can ... just do that. Everyone capable of reading the story is also capable of thinking about how they feel about LLMs that exist right now and that they have probably used.
The trick of the original story is inverting your perspective, moving your viewpoint from yourself to an "other" (which I think is a primary qualifier for most good fiction).
The article ends with this disclaimer "Weights helped me draft and proof this story.". So it is at least partially written by LLM.
> a compelling story about a sentient toaster
"Howdy-doodly-doo! Anybody like any toast?"
But that toaster would just be a device to talk about consciousness in general. In this case it does that and also it talks specifically of the LLM case, which can spark the discussion. Unless you believe to have the only valid and true opinion on the matter, and affirm that a normal toaster is just the same as an LLM in this topic.
An LLM is as conscious as a toaster...
It’s reductio ad absurdum. No one cares about teapots in space either (Russell).
I agree that is the mode of argument; reductio ad absurdum is a brittle argument, because it only works if the analogy holds. I argued the analogy doesn't hold.
> But it would fall flat because the toaster is not a compelling subject in a discussion of consciousness.
Teapots are not compelling.
> You don't have to believe that LLMs or AI agents are conscious to acknowledge that the argument for their consciousness is far more compelling than any other technological artifact.
God is compelling to billions of people.
Is Russell’s Teapot a bad argument in the God debate?
> Is Russell’s Teapot a bad argument in the God debate?
What's the relevance? If the argument made here was a good argument, it wouldn't matter if Russell's argument was bad. We could construct a bad argument using reductio ad absurdum right here and now and it wouldn't matter to either argument.
Can you be straight with me? You know the salient difference between asserting the consciousness of a toaster and the consciousness of an AI, right? It isn't a mystery to you why we would find one line of inquiry interesting and the other not so much?
For instance, it's probably a real possibility in your mind that I am not a human and am an AI. But you probably aren't entertaining the hypothesis that I'm a toaster.
> What's the relevance?
It directly parallels your argument.
> Can you be straight with me? You know the salient difference between asserting the consciousness of a toaster and the consciousness of an AI, right? It isn't a mystery to you why we would find one line of inquiry interesting and the other not so much?
There are two aspects here.
1. That people find the question interesting
2. That it has any bearing on reality (ontology?)
The first aspect is anthropology. Russell’s Teapot is not supposed to undercut any anthropological arguments. It’s supposed to undercut the second aspect.
So far you have said that the argument is compelling. What’s that got to do with reality? A robot cow could be sexually arousing to a real bull.
> For instance, it's probably a real possibility in your mind that I am not a human and am an AI. But you probably aren't entertaining the hypothesis that I'm a toaster.
Yeah. AIs know how to use computers. What’s this got to do with consciousness? Whether or not you are an AI is practical and disprovable. Consciousness is so ephemeral (for lack of a better word, not literally) that Philosophical Zombies is a real argument/thought experiment.
You may think I’m being coy (“Can you be straight with me”) but that’s not my intent at all.
> It directly parallels your argument.
Much like Russell argued that the burden of proof of God's existence is on theists, the burden to establish this parallel is on you as the person forwarding the argument. I don't see any relevant connection. Russell isn't arguing that a teapot is as real as God in the same way it's disputed here that a toaster is as conscious as an LLM.
> So far you have said that the argument is compelling. What’s that got to do with reality? A robot cow could be sexually arousing to a real bull.
AI is a real phenomenon that we can study and measure. There is no experiment that anyone has devised that can determine whether or not they are conscious, so that is the reality - uncertainty. That doesn't mean they're conscious. It means that the belief they are not conscious is an assumption.
You might say the same of a toaster, but these hypotheses are not equally strong. The toaster doesn't exhibit any behaviors to suggest that it's conscious. Consciousness isn't a hypothesis with any explanatory power for the observed behaviors of a toaster. It's not a hypothesis that's on the table. That's why the analogy doesn't work.
To put a fine point on it, it appears on its face that AIs could be conscious. They can put on a very convincing performance of being a person. A sufficiently convincing performance is indistinguishable from the real thing. So at face value, the burden of proof is on them not being conscious. Reductionist arguments that present the mechanics of how they work and leap to their not being conscious don't work, because there is no law saying a statistical model can't be conscious. That's an assumption, not knowledge.
> Much like Russell argued that the burden of proof of God's existence is on theists, the burden to establish this parallel is on you as the person forwarding the argument.
I pointed out the parallel in both statements. I can't do more than that.
> Russell isn't arguing that a teapot is as real as God in the same way it's disputed here that a toaster is as conscious as an LLM.
The teapot isn't real and the toaster consciousness is not real. What am I missing?
> AI is a real phenomenon that we can study and measure.
Robot cows are real as well.
> There is no experiment that anyone has devised that can determine whether or not they are conscious, so that is the reality - uncertainty. That doesn't mean they're conscious. It means that the belief they are not conscious is an assumption.
Yeah. You can't prove it for any entity. I agree.
> You might say the same of a toaster, but these hypotheses are not equally strong. The toaster doesn't exhibit any behaviors to suggest that it's conscious. Consciousness isn't a hypothesis with any explanatory power for the observed behaviors of a toaster. It's not a hypothesis that's on the table. That's why the analogy doesn't work.
The bull swears that the robot cow is a real cow. But we know better.
> To put a fine point on it, it appears on its face that AIs could be conscious.
It doesn't to me. Not any facelength.
> They can put on a very convincing performance of being a person. A sufficiently convincing performance is indistinguishable from the real thing.
Objective reality has never cared (am I anthropomorphizing now?) what is indistinguishable for people.
> So at face value, the burden of proof is on them not being conscious.
Which party is the burden of proof on? This is confusing since you are saying that the burden of proof is on a position (on them not being conscious).
Is the burden of proof on people who argue that they are _not_ conscious? That's peculiar.
I have never heard about any principle in philosophy or in science that says that, given enough Looks Like A Duck points, it is a duck. Based on subjective experience, even.
We obviously can't demand a falsifiable theory here. But we have to do better than arguing from incredulity.
> Reductionist arguments that present the mechanics of how they work and leap to their not being conscious don't work, because there is no law saying a statistical model can't be conscious. That's an assumption, not knowledge.
They don't have to rise to the level of disproving something for which they have no burden to disprove.