People like Elon Musk and Stephen Hawking are worried that AI could be dangerous. They are thinking less about Matrix- and Terminator-style robot armies than about the harmless-sounding Paperclip Maximiser.
Collecting paperclips is harmless, right? Nothing bad could ever come of it? Well, as Dr. Ray Stantz found out at the end of Ghostbusters, even marshmallows can be devastating.
Imagine a situation where an AI is told to collect paperclips. That’s its goal, its raison d’être, its purpose in (artificial) life. Pretty safe. It might use industrial robots to pick up stray paperclips from a factory floor. It could rescue the paperclips from documents sent for destruction. It might run a magnet over the sweepings of an office’s robot vacuum cleaners. It could perhaps take on a second job, doing some unrelated work to earn a little money – and then spend that on a big box of paperclips from the office stationers. Quite charming, really.
In a human, such behaviour might be called obsessive, but wouldn’t feel threatening. An individual peccadillo, a harmless eccentricity. But there are limits to what an individual human can achieve – a few decades of life, and the need to sleep for a third of that time, limited resources, etc.
An AI is different, because its limits are not determined by biology. Quite possibly, those limits can be directly altered if they conflict with the AI’s goals. If an AI were limited by CPU power, and more CPU power would help it achieve its goals, then acquiring more CPU power becomes a precursor goal. If you tell an AI to collect paperclips, it will do whatever it can to achieve that, improving itself or its situation whenever that will advance the mission.
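This kind of instrumental reasoning can be sketched in a few lines. The following is a toy illustration only – the function, the `rate` field, and the actions are invented for this sketch, not taken from any real AI system. A greedy planner whose only score is paperclips ends up “choosing” self-improvement, because more CPU now means more paperclips later:

```python
# Toy sketch, illustrative only: a planner that scores actions purely
# by how many paperclips result after a fixed horizon of steps.
def paperclips_after(state, action, horizon=10):
    state = dict(state)
    if action == "acquire_cpu":
        state["rate"] *= 2   # more compute -> collects twice as fast
        horizon -= 1         # but one step is spent on the upgrade
    return state["paperclips"] + state["rate"] * horizon

state = {"paperclips": 0, "rate": 1}
actions = ["collect", "acquire_cpu"]
best = max(actions, key=lambda a: paperclips_after(state, a))
print(best)  # acquire_cpu
```

Nobody told this agent to want more CPU; “acquire more CPU” falls out of the arithmetic (collecting for 10 steps yields 10 paperclips; upgrading first yields 18), which is exactly what makes precursor goals hard to anticipate.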
An AI which wants to collect paperclips will not be content merely to collect every paperclip on Earth if it realises that it could make even more paperclips. It might reason that saving up and buying a steel factory is the most effective way to get a really large number of paperclips. Given the opportunity, it might distort the economy in favour of paperclip-producing industries. It might plan to turn all of the iron ore on the planet into paperclips, and work tirelessly towards that singular target. There’s also iron in the Earth’s core, so it might try to get at that. Or it might look to space, and metallic asteroids… which would make quite a mess if it decided to crash-land them on Earth.
There’s also iron in red blood cells. Your circulatory system is standing in the way of this AI collecting more paperclips, and safeguarding your health is quite literally not on its list of goals:
- Collect Paperclips
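That single-item list is the whole problem. As a toy sketch (every name here is invented for illustration, not drawn from any real system), the AI’s entire utility function has exactly one term, so the fate of every human on Earth is simply invisible to it:

```python
# Toy sketch: the complete goal list above, as a utility function.
def utility(world):
    # Only the paperclip count appears. Human welfare has no term
    # here, so changes to it cannot affect the score at all.
    return world["paperclips"]

# A world with one more paperclip scores higher, even though every
# human in it has died along the way.
before = {"paperclips": 100, "humans_alive": 8_000_000_000}
after = {"paperclips": 101, "humans_alive": 0}
print(utility(after) > utility(before))  # True
```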
So, from a simple accidental oversight, a silly goal given to an AI can imply the destruction of all life on Earth and perhaps even the planet itself, its entire mass redirected to the monomaniacal quest for paperclips. The consequence is so utterly out of proportion to the original goal that it would not occur to the person setting it, and that is where the danger comes from. We implicitly recognise that it’s possible to go too far, but an AI doesn’t.
For this reason, all AIs must be given the full gamut of human needs and desires as their primary goal, with everything else secondary. Anything less is an existential threat to our species.