AI won't kill us any time soon
In my previous article, I explained why I think that AI will probably become a serious risk for humanity at some point.
I wrote that article because I think it's true and important, and because a sizable number of people don't agree with it. However, I also disagree with many of the people who are concerned about AI risk. I think their arguments are faulty and do a disservice to the position.
The Extreme AI Risk Position
There are all sorts of "AI doomsday" predictions I don't find compelling. The one I am focusing on in this article is the fast take-off AGI scenario that you hear about in rationalist circles.
On this view, we'll achieve Artificial General Intelligence in the next 20 years: as hardware improves and these systems get more data, they could suddenly do everything humans can do.
We'll make these superior machines long before we understand how to do so safely, unless we take serious action now.
What bothers me about this
Before I address why I think this isn't going to happen, I want to explain why I don't like that this idea has taken hold of so many people.
Suppose everyone acts on this idea and they're wrong: it would actually take at least a century to get to AGI. The opportunity costs of that are huge. Thinking we had good reason to believe we'll be wiped out in the next few decades, we would pour resources into an imaginary problem rather than the real problems we have to deal with today. We would also stifle AI research, forgoing its tremendous potential for improving human well-being - from improvements in medicine to agriculture and more.
My other problem with this is that it's a "boy who cried wolf" situation. If this is the argument that gets advanced, then when people find holes in it they will dismiss the idea of AI risk altogether, even though there are good reasons to be concerned about AI risk without thinking it's the biggest threat we have to face in our lifetimes.
Even if it took 100 years, we would still have to make sure we address it. It only needs to go wrong in a handful of ways for humanity to be screwed. We don't need to advance a variety of fanciful scenarios and extrapolations that spell doomsday. If we thought that some important resource would run out in 100 years (oil, perhaps), or that we wouldn't be able to feed everyone, finding ways to circumvent that would be on our current agenda. You could argue that people aren't good at thinking about long-term risks, so making them think it will happen soon motivates them to act on it. But concerns about lying aside, people also aren't very good at thinking rationally in the face of impending doom.
All you need to accept AI risk
You don't actually need to accept that many things to think that AGI will be a major risk for humanity. I'll summarise the argument from my previous article here.
- General intelligence is compatible with the laws of physics.
- Nothing supernatural is needed to explain our cognitive capacities. It's all a certain kind of information processing.
- We can make machines that function with similar principles. It doesn't have to be made of carbon.
- We can make them "better" than us. We're limited by slow biological evolution. We have embryological constraints. We have biases that are potentially avoidable. We use a small amount of energy. We have many slow chemical connections. We communicate with these inefficient flaps of meat. If we make something that functions according to similar principles, we can fix many of the flaws we have and make a machine that is far more intelligent than we are.
- We probably will get there. Unless progress in understanding the brain halts entirely, we will get there some day.
- If we have machines that can do everything we can do, but better, then unless their "values" are perfectly aligned with our own, actions they take may be incompatible with our well-being. They don't need to be evil. They don't need to seek domination or whatever. We don't even think of ants when we step on them to get what we want. We don't need to be part of their calculus at all, we just have to be harmed somewhere along the causal chain. If they are as smart as us but don't care about us, they won't filter out those actions. The scenarios don't have to be far-fetched, just look at what humans are doing by accident to the ecosystem and to each other.
- If they're so smart, we will have a hard time predicting what they'll do or stopping them.
- Therefore we can't wait until we get there before figuring out how to respond; we need to consider this in advance and be prepared.
There's nothing here about a timeline. There's nothing here about machines getting rid of rules we give them to meet objectives we give them. There's no need for them to start modifying themselves or plotting against us or decide that they must protect themselves at all costs. We don't need to consider science fiction scenarios.
Why I think this view is wrong
Why do we think that machines can be generally intelligent in the first place? Because we are. We had evidence that heavier-than-air flight is possible because birds do it, and we have evidence that intelligence is possible because we can do it.
If we synthesised a human, we should expect it to be able to do what humans can do. And if we modified it a bit - say, removed the toes - we should still expect the brain to work the same.
But the more changes we make, the less confident we are that we can even make something that does similar stuff.
Therefore, the extent to which we're almost at "human-like" AI depends on how similar the internals of AI are to human brains (and other animals).
You might argue that planes were invented and don't work the same way that birds do. And that's true. But we have theories of aerodynamics. And to some extent it's a straightforward mechanical constraint-solving problem. Even then it took a while - around two centuries before heavier-than-air flight was achieved. But with AI, we don't have a good theory of cognition. We don't know which constraints and equations to satisfy to build something intelligent at the things we can do. So in this case I think that the road to AGI will probably go through understanding brains.
What about recent AI progress?
What about the progress with AI systems? Doesn't that indicate that maybe we've found a convergent path? Every year neural networks are doing better at many different tasks, from language to image recognition and more.
There are a few problems with this. For one, this has been the case since the dawn of history. We've been automating things that humans can do for a long time, and each new paradigm brought bursts of advancement. But it seems strange to argue that the invention of the steam engine meant we would make generally intelligent machines. (Although you might get a different impression from Isaac Asimov's work.) With each new technology, people had new metaphors for human cognition, and neural networks are just the latest version.
I expect progress with neural networks. And I even expect it to get better than people at certain things. But I don't expect this paradigm to get all the way there, because it doesn't have the right building blocks.
The second problem with the argument from AI progress is that if you zoom in and look at the progress itself, treating it as part of the pathway towards AGI looks like a trick. A lot of it is smoke and mirrors.
Consider AI and chess. Chess players like Garry Kasparov are mocked for having said that computers will never be able to play chess as well as a human. And now the best human chess players stand no chance against the best computer chess players. But I think that humans are still better at chess in key ways.
For a start, consider the number of moves and positions that need to be examined to find a good move. Stockfish, the symbolic program, examines around 70 million positions per second. AlphaZero, the deep reinforcement learning model, examines around 80,000 per second. A human grandmaster examines a few dozen at most.
AlphaZero also played around 20 million games to train itself. It took a huge number of games for it just to reach human level.
Whilst AlphaZero was created to be more general than AlphaGo, with fewer pre-programmed rules and the ability to play games such as Go, chess, and shogi, it's clearly more specialised towards chess than humans are. Games like chess are their reason for being. They aren't concerned with vision, language, motor control, or anything else. Humans did not evolve with the intention of being good at chess; that's just a side effect of the other things humans have evolved to be good at. And yet it takes less work for a human to get pretty good at chess, whilst it takes specialist programs far more work to reach human level, even though they can then go far beyond human chess ability. In many positions, humans actually evaluate things better than the best engines, and you either have to increase the engine's search depth or wait a few moves for it to suddenly agree with the human.
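To make the scale of that gap concrete, here's a rough back-of-the-envelope sketch in Python using the figures above. The one-minute think time and the human estimates are my own illustrative assumptions, not measurements.

```python
# Rough orders-of-magnitude comparison using the figures quoted in the text.
# The think time and the human numbers are illustrative guesses, not data.

stockfish_positions_per_sec = 70_000_000   # classical alpha-beta search engine
alphazero_positions_per_sec = 80_000       # neural-network-guided search
human_positions_per_move = 50              # "a few dozen at most"

think_time_sec = 60                        # assume one minute of thought per move

alphazero_training_games = 20_000_000      # self-play games to train itself
human_training_games = 10_000              # rough guess for a lifetime of serious play

print("Positions examined per move (approx.):")
print(f"  Stockfish: {stockfish_positions_per_sec * think_time_sec:,}")
print(f"  AlphaZero: {alphazero_positions_per_sec * think_time_sec:,}")
print(f"  Human GM:  {human_positions_per_move:,}")

print(f"\nTraining games: AlphaZero needed roughly "
      f"{alphazero_training_games // human_training_games:,}x more than a human.")
```

However you tweak the guesses, the conclusion is the same: the engines buy their strength with orders of magnitude more search and more training data than any human ever uses.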
How is this possible? I think it's because humans are repurposing useful representational schemes and abilities for chess, enabling them to learn it very quickly. For example, humans can see motion in a static chess position, the same way they can see motion in a still image of a lion pouncing towards a zebra, and can easily extrapolate what will happen. A chess master can see that a position is weak if it looks like there are more attackers than defenders, and see that a piece is "supporting" many other pieces. They can see that a king is "drafty", or that a bishop is "claustrophobic". And they don't need to calculate to know that this is bad. They can see what happens. They don't just talk about positions and ideas, but also "resources". They can see how having a piece in a certain position rather than not having it there makes a big difference for many of the lines that follow, and they don't need to recalculate it. They're repurposing their ability to perform intuitive physics, causal reasoning with counterfactuals, and intuitive psychology in a new domain. These metaphors aren't just words used for poetic purposes; they're reusing circuits developed as part of vision processing and the other forms of processing that we've evolved. And there's no clear pathway towards this way of doing things by giving a chess engine more data or more computing power.
As impressive as modern-day chess engines are (I highly recommend watching their games - check out Stockfish vs Leela), their ability to play chess is like a symbolic program's ability to do accounting. It is not enough to extrapolate towards general intelligence.
What about language? GPT-3 is eerily good. But in long pieces of text the cracks in the facade begin to appear. The problem with AI language models is that human language is about things. We have a world model. Language has only evolved in humans, and it evolved on top of a vast amount of structure for interpreting the world in terms of physical objects that work in certain ways, agents with intentions, and so on. But language models aren't built on top of a world model. So whilst they can look like they conform to one, thanks to the statistical relations left by other people talking about the world in the corpora they're trained on, they deal with seemingly novel sentences far worse than humans do, because their words aren't about anything. They're trying to run before they can walk. It's easy to trip them up with the right questions.
What if we're actually like AI?
OK, if I'm right that having functionally similar internals is the key to getting AGI any time soon, then what if we actually are similar under the hood?
People like Yoshua Bengio and Yann LeCun seem to think so. They agree that world models are necessary for AGI, but think that they will be achieved with the right learning algorithm, and argue that this is what humans do. The idea that everything can be learned is the central dogma of deep learning.
Yann is fond of citing how babies react to peekaboo as evidence that they don't understand object permanence and have to learn it. The problem is that this is exactly backwards. The reason they are surprised isn't that they don't understand object permanence; it's that they don't understand tricks. The fact that they're surprised means it's not what they expected - they expect object permanence. This is actually a standard way of testing what babies and animals understand: looking at what they pay attention to and what they ignore. Many experiments show that babies expect object permanence from a very young age.
In fact, this whole debate occurred within psychology a century ago, between the nativists and the empiricists. And deep learning researchers are bordering on Skinnerian behaviourists. They've found a hammer and everything looks like a nail to them. But to cognitive psychologists, whilst some aspects of deep learning are inspired by the brain, and some other aspects converge (e.g. some of the features convolutional neural networks recognise), current neural networks are very different from brains.
Conclusion
- To get to AGI, we need something that works like the brain.
- Recent progress isn't an indication that it will get there. There's no reason to think that this time we'll get there when all the previous technological advancements since the conception of Pygmalion didn't.
- Current AI architectures only very weakly resemble the brain.
- If deep learning researchers carry on as they are, it's unlikely that they'll ever achieve AGI; that will require significant paradigm shifts. And even with those shifts, we are many Nobel Prizes and Turing Awards away from getting there.
- We don't have any good reason to think that it will happen any time soon.