
The Hidden Fallacy of Paper Clip-Maximizing Robots and Supergoals

I take issue here with the notion of presentient and nearly deific artificial intelligences who shall, it is supposed, be programmed to achieve at all costs a “supergoal” … only later to wreak accidental havoc on their hapless human inventors.

Consider a specific incarnation of the idea: the meme of the paper clip-maximizing super AI, popularized by Oxford University’s Prof. Nick Bostrom. It “seems perfectly possible,” writes Bostrom, “to have a superintelligence whose sole goal is something completely arbitrary, such as to manufacture as many paperclips as possible, and who would resist with all its might any attempt to alter this goal.” [1]

Let your imagination roam wild with worst-case results of the paper clip thought experiment: Railroads shorn from the Earth for their iron. Skyscrapers dismantled by nanorobots for more paper clips. And, just before the credits roll, hordes of humans enslaved in iron ore mines, toiling in the shadows of Martian paper clip mountains. The idea is compelling in its glossy SyFy simplicity, and it was even resurrected in late 2017 as a viral internet game based on the premise (see [2] for a link to play it).

Bostrom expounds upon these ideas in his 2014 book Superintelligence [3] and earlier in a 2003 essay. [1] There is also considerable overlap here with Singularity soothsayers like Ray Kurzweil and Eliezer Yudkowsky, who may have coined the term “supergoal” (as applied to AI) in a 2001 essay. [4]

Yet Bostrom asserts what few would deny: that if humans were capable of rendering an invention of such strategic might, then it could cause unintended harm. “For better or worse,” laments Bostrom, “artificial intellects need not share our human motivational tendencies.” But who would disagree? Here’s your friendly reminder, Prof. Bostrom, that all seven billion humans have different “motivational tendencies”. Furthermore, it should surprise no one that if Office Depot is capable of cranking out ravenous paper clip-making Franken-robots, then we may be in big trouble. What’s Lockheed Martin doing?

Bostrom has a plan, presented, it seems, in all earnestness: “It seems that the best way to ensure that a superintelligence will have a beneficial impact on the world is to endow it with philanthropic values.” Amen, Prof. Bostrom. This charitable suggestion in fact sounds like a good undergraduate computer science assignment:

  1. Code up the virtues of faith, hope, and love.
  2. Assign them to this paper clip-maximizing robot.
  3. Submissions will be judged by a panel of philosophers, monks, and rabbis.
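To see how quickly that assignment collapses, here is a minimal, purely hypothetical sketch (every name, weight, and number below is my own invention, not anything Bostrom proposes) of what naively “coding up the virtues” looks like: the moral content gets squeezed into arbitrary numeric weights on a scoring function, which is to say it gets assumed away.

```python
# A purely illustrative, hypothetical sketch: "coding up" virtues as
# numeric weights on an objective function. Every name and number here
# is an arbitrary choice -- which is exactly the problem.

VIRTUES = {
    "faith": 0.3,   # faith in... what? Chosen arbitrarily.
    "hope": 0.3,    # hope measured how? Chosen arbitrarily.
    "love": 0.4,    # love for whom, at whose expense? Chosen arbitrarily.
}

def philanthropy_score(action: dict) -> float:
    """Score an action by its 'virtue'.

    The action dict must somehow already quantify how much faith, hope,
    and love the action embodies -- i.e., the hard part is assumed away.
    """
    return sum(weight * action.get(virtue, 0.0) for virtue, weight in VIRTUES.items())

def choose_action(candidate_actions: list[dict]) -> dict:
    """Pick the 'most philanthropic' action, whatever that means."""
    return max(candidate_actions, key=philanthropy_score)

# The numbers below are invented; a panel of philosophers, monks,
# and rabbis would rightly reject all of them.
actions = [
    {"name": "make paper clips", "faith": 0.1, "hope": 0.2, "love": 0.0},
    {"name": "plant a forest",   "faith": 0.2, "hope": 0.8, "love": 0.6},
]
print(choose_action(actions)["name"])
```

The only thing such a program demonstrates is that all of the hard philosophical work lives in numbers nobody can justify.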

Philanthropy, as it turns out, actually isn’t so easy to define and implement. Ask Bill Gates or the Red Cross. One can have the best interests of someone else in mind and still screw it up, perhaps through no fault of one’s own. And philanthropic outcomes are not one-dimensional. Some parties may be pleased with the result; others may in turn be revolted.

The point I want to take care to make, though, is that programming a philanthropic supergoal is only slightly more absurd than the very idea of programming a paper clip-maximizing supergoal in the first place.

Programming a paper clip maximizer would be, when you think about it, really hard, and, in fact, “perfectly impossible.” For example (a sketch of the dangling parameters follows this list):

  • Should the robot keep humans around or put them to work?
  • Instead of ripping out railroads to make paper clips, should the AI keep railroad lines for transportation?
  • What precise type of paper clip is acceptable? What materials, what maximum and minimum angles and dimensions, how lustrous, how dense, how tensile, how hard?
  • On what time horizon shall we optimize paper clip production? Ought the robot binge on paper clip production for 10 years, leaving the Earth ruined? Or should the robot promote “sustainable” paper clip production, optimizing for a longer time horizon?
  • Other planets in our solar system, and the Earth’s core, have considerable amounts of metals such as nickel, iron, and aluminum. Should the AI invest in R&D to harvest hard-to-get and interplanetary metals? What about intergalactic travel? And on what time horizon should these investments play out?
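To make the point concrete, here is a hypothetical sketch of the configuration such a supergoal would demand before it is even well posed. Every field and default below is invented for illustration, and each one smuggles in a contested judgment about an unknowable future.

```python
# Hypothetical sketch: the parameters a "maximize paper clips" supergoal
# would have to pin down before it is even a well-posed objective.
# Every default below is an arbitrary judgment call, not a fact.
from dataclasses import dataclass

@dataclass
class PaperClipSupergoal:
    # What even counts as a paper clip?
    material: str = "steel"            # why not aluminum, or plastic?
    min_length_mm: float = 25.0        # arbitrary
    max_length_mm: float = 50.0        # arbitrary
    min_tensile_mpa: float = 400.0     # arbitrary

    # Over what horizon do we maximize?
    horizon_years: int = 10            # binge now, or "sustainably" forever?
    discount_rate: float = 0.03        # a guess about how to value the future

    # How do we treat everything that is not a paper clip?
    keep_humans_as_labor: bool = True      # ethics reduced to a boolean
    dismantle_railroads: bool = False      # transport vs. raw iron trade-off
    invest_in_asteroid_mining: bool = True # an R&D bet on an unknowable payoff

goal = PaperClipSupergoal()
print(goal)
```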

These questions and calculations involve setting parameters and heuristics and balancing risks about an unpredictable future, which any creature short of Laplace’s demon or Yahweh will find pretty tough to resolve. How the AI should balance unpredictable risks and make decisions in the face of unrelenting uncertainty (often termed Knightian uncertainty) is the golden question.

But it’s not just the golden question for computer science-y futurists; it’s the golden question full stop, the raison d’être of existence: what should we do? The robot can’t answer these questions, because, in fact, there is no answer. “The best way to maximize paper clip production” is not an answer we can design in 2017, implement by 2020, and have finished by 2023.

The calculation problem will prove just as intractable for any artificial intelligence as it is for any biological intelligence. Presenting the world as a closed system, as what a Samuelsonian economist might term a “constrained maximization problem”, is wrong in both technological and economic forecasting. A robot’s utility curve, actually, is just as fictional as a human’s utility curve.
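The closed-system framing can be made explicit. Below is a deliberately toy sketch, with invented states, probabilities, and payoffs, of the textbook setup: maximize expected output over a known set of future states with known probabilities. Knightian uncertainty is precisely the claim that the probabilities in such a model are not there to be written down.

```python
# Toy sketch of the "constrained maximization" framing. Every number
# here is invented: the point is that the framing requires us to
# enumerate future states and assign them probabilities -- the very
# things Knightian uncertainty says we do not have.

# A known, closed set of future states of the world (fiction #1).
probabilities = {"iron_stays_cheap": 0.6, "iron_gets_scarce": 0.4}

# A known payoff (in paper clips) for each plan in each state (fiction #2).
payoffs = {
    "strip_the_railroads": {"iron_stays_cheap": 9e9, "iron_gets_scarce": 2e9},
    "mine_asteroids":      {"iron_stays_cheap": 4e9, "iron_gets_scarce": 8e9},
}

def expected_clips(plan: str) -> float:
    """Expected paper clip output of a plan under the assumed probabilities."""
    return sum(p * payoffs[plan][state] for state, p in probabilities.items())

best_plan = max(payoffs, key=expected_clips)
print(best_plan, f"{expected_clips(best_plan):.3g}")
```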

So in the end, Bostrom’s scenario is silly on two fronts. First, if he is merely pointing out that software and hardware can have bugs … well, yeah, we get it. Anything set in motion by humans can have drastic and unforeseen consequences. In addition to the paper clip maximizer, notable fictional missteps include Frankenstein’s monster, Dr. Strangelove’s Doomsday device, and 2001: A Space Odyssey’s HAL 9000. We should all remain aware, as Prof. Bostrom rightly reminds us, that “The best-laid plans of mice and men / Go oft awry”.

In a related error, however, Bostrom seems to allude to the impossible: that optimizing functions can somehow predict the future in some grandly orchestrated master plan beyond our comprehension. This is not futuristic science; it is fiction. AI is limited – like humans, like aliens, and like everything under the sun of physics-as-we-know-it – by an all-encompassing uncertainty.

So, yes, we can definitely screw something up with technological invention, but let’s not yet suppose that we can screw it up by inventing a godlike machine not beholden to physics and space-time.


Citations

[1] Bostrom, Nick. “Ethical Issues in Advanced Artificial Intelligence.” 2003. Accessed October 14, 2017. https://nickbostrom.com/ethics/ai.html#_ftn2

[2] As reported in The A.V. Club: https://www.avclub.com/this-game-about-watching-a-computer-make-paperclips-sur-1819366023.

[3] Bostrom, Nick. Superintelligence: Paths, Dangers, Strategies. Oxford, United Kingdom: Oxford University Press, 2014.

[4] Yudkowsky, Eliezer. “What is Friendly AI?” 2001. Accessed October 14, 2017. http://www.kurzweilai.net/what-is-friendly-ai.

n.b.

  • The concept of “supergoals” borders on what Nick Szabo calls a small game fallacy, one of the strongest intellectual guards against techno-utopianism. It’s well worth reading about the small game fallacy with advanced AI in mind - http://unenumerated.blogspot.com/2015/05/small-game-fallacies.html
  • I tend to side with skeptics such as Jaron Lanier, who posit that thinkers like Bostrom have a way of “dramatizing their beliefs with an end-of-days scenario” that takes the form of a curious religiosity. Despite my criticism above, I welcome their thought experiments and think the line of discourse is valuable, when put in proper perspective and when the logical leaps of faith are properly identified, acknowledging that some religiosity about a certain subject may not always be wrong.