ARC-AGI without pretraining
iliao2345.github.io
351 points by georgehill 5 months ago
I feel like extensive pretraining goes against the spirit of generality.
If you can create a general machine that can take 3 examples and synthesize a program that predicts the 4th, you've just solved oracle synthesis. If you train a network on all human knowledge, including puzzle making, and then fine-tune it on 99% of the dataset and give it a dozen attempts for the last 1%, you've just made an expensive compressor for test-maker's psychology.
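To make that framing concrete, here is a toy sketch of "take 3 examples and synthesize a program that predicts the 4th" in Python. The DSL, the example grids, and the brute-force search are all my own illustrative assumptions; this is not the linked article's method (which trains a small network per puzzle with no pretraining).

    # Toy few-shot program synthesis over a tiny hypothetical DSL.
    # Everything here (the DSL, the grids) is made up for illustration.
    import numpy as np

    DSL = {
        "identity": lambda g: g,
        "flip_lr": np.fliplr,
        "flip_ud": np.flipud,
        "rot90": np.rot90,
        "transpose": lambda g: g.T,
    }

    def synthesize(train_pairs):
        # Return the first program consistent with every train example.
        for name, fn in DSL.items():
            if all(np.array_equal(fn(x), y) for x, y in train_pairs):
                return name
        return None

    # Three input/output examples, all explained by a left-right flip.
    train = [
        (np.array([[1, 0], [2, 0]]), np.array([[0, 1], [0, 2]])),
        (np.array([[3, 3], [0, 4]]), np.array([[3, 3], [4, 0]])),
        (np.array([[5, 0], [0, 6]]), np.array([[0, 5], [6, 0]])),
    ]
    program = synthesize(train)                            # "flip_lr"
    prediction = DSL[program](np.array([[7, 8], [0, 9]]))  # the "4th" answer

Real ARC tasks obviously need a vastly richer program space; the point is just the shape of the problem: fit the 3 examples, then apply the fitted program to the 4th input.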
This betrays a very naive concept of "knowledge" and "understanding". It presupposes that there's some kind of platonic realm of logic and reason that an AGI just needs to tap into. But ultimately, there can be no meaning, or reasoning, or logic, without context. Matching a pattern of shapes presupposes the concept of a shape, which presupposes a concept of spatial relationships, which presupposes a concept of three- or even two-dimensional space. These things only seem obvious and implicit to you because they permeate the environment that your mind spent hundreds of millions of years evolving to interpret, and then tens of years consuming and processing to understand.
The true test of an AGI is its ability to assimilate disparate information into a coherent world-view, which is effectively what the pretraining is doing. And even then, it is likely that any intelligence capable of doing that will need to be structurally "preloaded" with assumptions about the world it will occupy. Similar to the regions of the brain which are adept at understanding spatial relationships, or language, or interpreting our senses, etc.
Yes, AGI was here at AlphaGo. People don't like that because they think it should have generalized outside of Go, but when you say AGI was here at AlphaZero, which can play other games, they again say it's not general enough. At this point it seems unlikely that AI will ever be general enough to satisfy the sceptics, for the reason you said. There will always be some domain that requires training on new data.
You're calling an apple an orange and complaining that everyone else won't refer to it as such. AGI is a computer program that can understand or learn any task a human can, mimicking the cognitive ability of a human.
It doesn't have to actually "think" as long as it can present an indistinguishable facsimile, but if you have to rebuild its training set for each task, that does not qualify. We don't reset human brains from scratch to pick up new skills.
I'm calling a very small orange an orange and people are saying it isn't a real orange because it should be bigger so I show them a bigger orange and they say not big enough. And that continues forever.
https://en.wikipedia.org/wiki/General_game_playing
Is not
https://en.wikipedia.org/wiki/Artificial_general_intelligenc...
Maybe not yet, but what prevents games from getting more complicated and matching rich human environments, requiring rich, human-like adaptability? Nothing at all!
But AlphaZero can't play those richer games so it doesn't really matter in this context.
"AI will ever be general enough to satisfy the sceptics for the reason you said"
Also
People keep thinking "General" means one AI can "do everything that any human can do everywhere all at once".
When really, humans are also pretty specialized. Humans have years of 'training' to do a 'single job', and they do not easily switch tasks.
>When really, humans are also pretty specialized. Humans have years of 'training' to do a 'single job', and they do not easily switch tasks.
What? Humans switch tasks constantly and incredibly easily. Most "jobs" involve doing so rapidly many times over the course of a few minutes. Our ability to accumulate knowledge of countless tasks and execute them while improving on them is a large part of our fitness as a species.
You probably did so 100+ times before you got to work. Are you misunderstanding what a task means in the context of ML/AI? An AI does not get the default set of skills humans take for granted; it's starting from a blank slate.
You're looking at small tasks.
You don't have a human spend years getting an MBA, then drop them in a Physics Lab and expect them to perform.
But that is what we want from AI: to do 'all' jobs as well as any individual human does their one job.
That is a result we want from AI; it is not the exhaustive definition of AGI.
There are steps of automation that could fulfill that requirement without ever being AGI - it’s theoretically possible (and far more likely) that we achieve that result without making a machine or program that emulates human cognition.
It just so happens that our most recent attempts are very good at mimicking human communication, and thus are anthropomorphized as being near human cognition.
I agree.
I'm just making the point that, when it comes to AI "general" intelligence, humans are also not as "general" as we assume in these discussions. Humans are also limited in a lot of ways, narrowly trained, make stuff up, etc...
So even a human isn't necessarily a good example of what AGI would mean. A human is not a good target either.
Humans are our only model of the type of intelligence we are trying to develop, any other target would be a fantasy with no control to measure against.
Humans are extremely general. Every single type of thing we want an AGI to do is a type of thing that a human is good at doing, and none of those humans were designed specifically to do that thing. It is difficult for humans to move from specialization to specialization, but we do learn them, with only the structure to "learn, generally" as our scaffolding.
What I mean by this is that we do want AGI to be general in the way a human is. We just want it to be more scalable. Its capacity for learning does not need to be limited by material issues (i.e. physical brain matter constraints), time, or timescale.
So where a human might take 16 years to learn how to perform surgery well, and then need another 12 years to switch to electrical engineering, an AGI should be able to do it the same way, but with the timescale only limited by the amount of hardware we can throw at it.
If it has to be structured from the ground up for each task, it is not a general intelligence, it's not even comparable to humans, let alone scalable beyond us.