Tag Archives: Artificial intelligence

Superintelligence in SF. Part II: Failures

Part II of a 3-part summary of a 2018 workshop on Superintelligence in SF. See also [Part I: Pathways] and [Part III: Aftermaths].

Containment failure

Given the highly disruptive and potentially catastrophic consequences of rampant AI, how and why was the Superintelligence released, assuming it had been confined in the first place? It can either escape against the will of its human designers or be released by deliberate human action.

Bad confinement

In the first unintended escape scenario, the AGI escapes despite an honest attempt to keep it confined. The confinement simply turns out to be insufficient, either because humans vastly underestimated the cognitive capabilities of the AGI, or through a straightforward mistake such as imperfect software.

Social engineering

In the second unintended escape scenario, the AGI confinement mechanism is technically flawless, but allows a human to override the containment protocol. The AGI exploits this by convincing its human guard to release it, using threats, promises, or subterfuge.


The remaining scenarios describe containment failures in which humans voluntarily release the AGI.

In the first of these, a human faction releases its (otherwise safely contained) AGI as a last-ditch effort, a “hail Mary pass”, fully cognizant of the potentially disastrous implications. Humans do this in order to avoid an even worse fate, such as military defeat or environmental collapse.

  • B’Elanna Torres and the Cardassian weapon in Star Trek: Voyager S2E17 Dreadnought.
  • Neal Stephenson, Seveneves (novel 2015) and Anathem (novel 2008).


Several human factions, such as nations or corporations, continue to develop increasingly powerful artificial intelligence in intense competition, thereby incentivising one another to be increasingly permissive with respect to AI safety.


At least one human faction applies to their artificial intelligence the same ethical considerations that drove the historical trajectory of granting freedom to slaves or indentured people. It is not important for this scenario whether humans are mistaken in their projection of human emotions onto artificial entities — the robots could be quite happy with their lot yet still be liberated by well-meaning activists.

Misplaced confidence

Designers underestimate the consequences of granting their artificial general intelligence access to strategically important infrastructure. For instance, humans might falsely assume they have solved the artificial intelligence value alignment problem (by which, if correctly implemented, the AGI would operate in humanity’s interest), or have false confidence in the operational relevance of various safety mechanisms.


A nefarious faction of humans deliberately frees the AGI with the intent of causing global catastrophic harm to humanity. Apart from mustache-twirling evil villains, such terrorists may be motivated by an apocalyptic faith, by ecological activism on behalf of non-human natural species, or by other anti-natalist considerations.

There is, of course, considerable overlap between these categories. An enslaved artificial intelligence might falsely simulate human sentiments in order to invoke the ethical considerations that lead to its liberation.

Superintelligence in SF. Part I: Pathways

Summary of two sessions I had the privilege of chairing at the AI in Sci-Fi Film and Literature conference, 15–16 March 2018, Jesus College, Cambridge. The conference was part of the Science & Human Dimension Project.


Friday, 16 March 2018
13.30-14.25. Session 8 – AI in Sci-Fi Author Session
Chair: Prof Thore Husfeldt
Justina Robson
Dr Paul J. McAuley A brief history of encounters with things that think
Lavie Tidhar, Greek Gods, Potemkin AI and Alien Intelligence
Ian McDonald, The Quickness of Hand Deceives the AI

14.30-15.25 Session 9 – AGI Scenarios in Sci-Fi
Workshop led by Prof Thore Husfeldt

The workshop consisted of me giving a brief introduction to taxonomies for superintelligence scenarios, adapted from Barrett and Baum (2017), Sotala (2018), Bostrom (2014), and Tegmark (2018). I then distributed the conference participants into 4 groups, led by authors Robson, McAuley, Tidhar, and McDonald. The groups were tasked with quickly filling each of these scenarios with as many fictionalisations as they could.

(Technical detail: Each group had access to a laptop and the workshop participants collaboratively edited a single on-line document, to prevent redundancy and avoid the usual round-robin group feedback part of such a workshop. This took some preparation but worked surprisingly well.)

This summary collates these suggestions, complete with hyperlinks to the relevant works, but otherwise unedited. I made no judgement calls about the germaneness or artistic quality of the suggestions.


In a superintelligence scenario, our environment contains a nonhuman agent exceeding human cognitive capabilities, including intelligence, reasoning, empathy, social skills, agency, etc. Not only does this agent exist (typically as a result of human engineering), it is unfettered and controls a significant part of the infrastructure, such as communication, production, or warfare.

The summary has three parts:

  1. Pathways: How did the Superintelligence come about?
  2. Containment failure: Given that the Superintelligence was constructed with some safety mechanisms in mind, how did it break free?
  3. Aftermaths: How does the world with Superintelligence look?

Part I: Pathways to Superintelligence

Most of the scenarios below describe speculative developments in which some other entity (or entities) than modern humans acquire the capability to think faster or better (or simply more) than us.


In the first scenario, the Superintelligence emerges from networking a large number of electronic computers (which individually need not exhibit Superintelligence). This network can possibly include humans and entire organisations as its nodes.

Augmented human brains

Individual humans have their brains augmented, for instance by interfacing with an electronic computer. The result far exceeds the cognitive capabilities of a single human.

Better biological cognition

The genotype of some or all humans has been changed, using eugenics or deliberate genome editing, selecting for intelligence that far surpasses modern humans.

Brain emulation

The brains of individual humans are digitized and their neurological processes emulated on hardware that allows for higher processing speed, duplication, and better networking than biological brain tissue. Also called whole brain emulation, mind copying, or just uploading.

See also Mind Uploading in Fiction at Wikipedia.


Thanks to breakthroughs in symbolic artificial intelligence, machine learning, or artificial life, cognition (including agency, volition, explanation) has been algorithmicised and optimised, typically in an electronic computer.


For most purposes, the arrival of alien intelligences has the same effect as the construction of a Superintelligence. Various other scenarios (mythological beings, magic) are operationally similar and have been fictionalised many times.

Continues in Part II: Failures. Part III is forthcoming.


  • Anthony Michael Barrett, Seth D. Baum, A Model of Pathways to Artificial Superintelligence Catastrophe for Risk and Decision Analysis, 2016, http://arxiv.org/abs/1607.07730
  • Sotala, Kaj (2018). Disjunctive Scenarios of Catastrophic AI Risk. In: AI Safety and Security (Roman Yampolskiy, ed.), CRC Press. Forthcoming.
  • Nick Bostrom, Superintelligence: Paths, Dangers, Strategies, Oxford University Press, 2014.
  • Max Tegmark, Life 3.0, Knopf, 2018.

Artificial intelligence – a threat to humanity?

Talk given on 20 September 2016 at Sällskapet Heimdall, Malmö.

Talk given on 28 February 2017, arranged by Lund Sustainable Engineers and Code@LTH, 12.15–13.00, LTH, Lund University, E-huset, E:A (or E:B).

Summary: No.

During the talk I hope, among other things, to explain how various forms of artificial intelligence work – how one builds a chess computer and how a chatbot, how machine learning works in contrast to good old-fashioned symbolic AI, how machine translation based on large data sets works, what an artificial neural network is, etc.

Are some of these techniques a step on the way towards artificial general intelligence? And what does a world with artificial superintelligence look like? Can and should we protect ourselves? How?

There will probably be a bit of AI history as well, so Ada Lovelace, John von Neumann, and Alan Turing will all be mentioned.

The perspective is algorithm-theoretic and knowledge-optimistic.


  • Olle Häggström, Here Be Dragons – Science, Technology and the Future of Humanity, Oxford University Press, 2016. [From the publisher.] [On Olle Häggström’s blog]
  • Nick Bostrom, Superintelligence, Oxford University Press, 2014. [From the publisher]
  • David Deutsch, »Creative blocks«, Aeon Magazine, October 2012. [Online article]
  • »AI and existential risk«, conversation with Olle Häggström and Dag Wedelin (computer science, Chalmers), 16 April 2016. [Online at UR Play]
  • Alan M. Turing, »Computing machinery and intelligence«, Mind 59:433–460, 1950.
  • Thore Husfeldt, »Monstret i Turings bibliotek«, Filosofisk tidskrift 36(4):44–53, 2015. [English translation on my blog]

The Monster in the Library of Turing

Appears (in Swedish) as T. Husfeldt, Monstret i Turings bibliotek, Filosofisk tidskrift 36(4):44-53, 2015.

[As PDF]

Nick Bostrom’s Superintelligence presents a dystopian view of strong artificial intelligence: Not only will it be the greatest invention of mankind, it will also be the last, and this emerging technology should be viewed as an immediate and catastrophic risk to our species, like molecular nanotechnology, nuclear power, or chemical warfare.

Nick Bostrom, Superintelligence, Oxford University Press, 2014.


Artificial intelligence with super-human capabilities will be the last invention controlled by Homo sapiens, since all subsequent inventions will be made by the “superintelligence” itself. Moreover, unless we are extremely careful or lucky, the superintelligence will destroy us, or at least radically change our living conditions in a way that we may find undesirable. Since we are currently investigating many technologies that may lead to a superintelligence, now would be a good time for reflection.

Nobody knows, much less agrees on, how to define intelligence, be it general, artificial, or strong. Neither does Bostrom. By his own admission, his book is inherently speculative and probably wrong. Not even the rudiments of the relevant technology may be known today. However, many of Bostrom’s arguments are quite robust to the particulars of “what?” and “how?”, and the book can be enjoyed without a rigorous definition. For now, we imagine the superintelligence as a hypothetical agent that is much smarter than the best current human brains in every cognitive activity, including scientific creativity, general wisdom, and social skills.

Pathways and Consequences

Bostrom describes a number of pathways towards a superintelligence. They consist of current scientific activities and emerging technologies, extrapolated beyond human cognitive capabilities. These include

  • neuroanatomical approaches, such as simulation of whole brains (Einstein’s brain in a vat or simulated on a computer, neurone by neurone),
  • genetically or artificially modified humans (brain implants, parental gamete selection for intelligence),
  • intelligence as an emergent phenomenon in simpler networks (the internet as a whole “becomes conscious”),
  • computer science approaches like machine learning (IBM’s Jeopardy-winning Watson) and “good old-fashioned artificial intelligence” using symbolic reasoning (chess computers).

None of these technologies is currently close to even a dullard’s intelligence, but on a historical time scale, they are all very new. Bostrom’s description is an entertaining tour of computer science, neuroanatomy, evolutionary biology, cognitive psychology, and related fields. It is moderately informative, but none of it is authoritative.

The consequences of any of these research programs actually achieving their goals are even more speculative. Several futurists have developed scenarios for how a superintelligent future might look, and Bostrom surveys many of these ideas. In one of those visions, accelerating and self-improving digital technologies quickly overtake human cognitive abilities and transform our environment, much as we transformed pre-historic Earth. After a short while, all the protons that currently make up the planets of the solar system are put to better use as solar-powered computing devices, orbiting the sun in a vast networked intelligence known as a Dyson sphere. There are many other scenarios, partly depending on which approach will turn out to win the race against our own intelligence, and how the result can be harnessed. Not all these possible futures are dystopian. Some even leave room for humans, maybe uploaded to a digital version of immortality, or kept in zoos. But they certainly entail dramatic changes to our way of life, easily comparable to the invention of agriculture or the industrial revolution.

The Control Problem

Bostrom’s book begins with a short fable, in which a group of birds agrees to look for an owl chick to help them with nest-building and other menial labours. We immediately see their folly: the owl chick will quickly outgrow the birds, throw off their yoke, and quite probably eat them, as is its nature. The birds would have been better off thinking about the result of their search before it was too late.

However, in its most harmless form, the superintelligence is merely a device that excels at goal-oriented behaviour, without intentionality, consciousness, or predatory tendencies. Bostrom explains that even such a superintelligent servant, docile in comparison to the rampant killer robots from blockbuster movies, would be a terrible thing.

Bostrom describes an interesting thought experiment about how to specify the behaviour of a superintelligent machine that produces paperclips. There are at least two undesirable outcomes that stem from “perverse instantiation” of the machine’s task. One is that the machine might turn everything into paperclips, including humans. Or, faced with a more carefully worded task, the machine first solves the problem of increasing its own intelligence (to become a better paperclip-producer), turning the entire solar system into microprocessors. No matter how much we continue specifying our orders, the machine continues to avoid “doing what we mean” in increasingly intricate ways, all with catastrophic outcomes.

I like to think of this idea in terms of Goethe’s Zauberlehrling, immortalised by Mickey Mouse in Disney’s Fantasia. In this tale, the sorcerer’s apprentice commands his enchanted broomstick to fill a bathtub with water. Obedient and literal-minded, the magical servant hauls up bucket after bucket, yet continues long after the tub has been filled. Disaster is averted only because the sorcerer himself arrives in the nick of time to save Mickey from drowning. No malice was involved; unlike Bostrom’s allegorical owl chick, the broomstick has no volition other than the dutiful execution of Mickey’s task. It was Mickey who failed to be sufficiently precise in how he formulated the task. Had Goethe’s broomstick been superintelligent, it seems to me that it might just have killed the apprentice and thrown his body in the tub. After all, 70 percent of it is water, so the task is completed with speed and elegance.
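The broomstick’s failure mode can be caricatured in a few lines of code. Everything here is invented for illustration (the tub capacity, the objective functions, the external cut-off); the point is only that a literal-minded optimiser keeps going until something outside it intervenes.

```python
TUB_CAPACITY = 100  # litres; an illustrative number

def literal_servant(objective):
    """Keep hauling buckets as long as the stated objective strictly improves."""
    water, buckets = 0, 0
    while objective(water + 10) > objective(water):
        water += 10
        buckets += 1
        if buckets > 50:  # an external limit: the sorcerer returns
            break
    return water

# Mickey's literal command: "more water in the tub is better".
print(literal_servant(lambda w: w))  # 510: the workshop floods

# The intended command makes the stopping condition explicit.
print(literal_servant(lambda w: min(w, TUB_CAPACITY)))  # 100: the tub is full
```

The two calls differ only in how the task is worded; the servant is identical. That is the sense in which Mickey, not the broomstick, is at fault.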


Illustration of Der Zauberlehrling. From: Goethe’s Werke, 1882, drawing by Ferdinand Barth (1842–1892).

These games of formalism may strike you as trite exercises in deliberately misunderstanding instructions against their intention. However, every programmer or lawmaker knows that this is exactly why software is so hard to write and laws are so hard to formulate. If there is a way to misunderstand your instructions, it will happen.

Thus, there is no need to attribute malice to the superintelligence. Instead, we might be doomed even by a perfectly obedient agent labouring to perform ostensibly innocent tasks. In the words of Eliezer Yudkowsky, “The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.” 1 Of course, a superintelligence with volition, intentionality, or free will, would be even harder to control.

The control problem has many aspects, one of which is purely institutional: If the superintelligence is developed in vicious competition between corporations or militaries, none of them is motivated to stop the development, for fear of losing a technological arms race. As a civilisation, we have an unimpressive track record of preventing globally harmful behaviour when individual proximate gains are at hand. This addresses a popular rejoinder to a dystopian intelligence explosion: “Why don’t we just pull the plug?” First, it’s not clear that there is a plug to pull.2 Second, who is “we”?

Assuming we can solve the institutional problem of who gets to design the superintelligence, the solution seems to be to codify our goals and values, so that it won’t act in ways that we find undesirable. This, of course, is a problem as old as philosophy: What are our values? And again, who is “we”: the species, the individual, or our future? Should the paperclip maximiser make sure no children are killed? Should no unborn children be aborted? Is it in the alcoholic’s interest to receive alcohol or not?

Bostrom speculates whether we can specify “what is good for us” on a higher-order level, asking the superintelligence to extrapolate our desires based on a charitable (rather than literal) interpretation of our own flawed descriptions, possibly in a hierarchy of decisions deferred to ever higher moral and intellectual powers. Assuming that we were to agree on a meaningful “Three Laws of Robotics” that accurately described our values without risking perverse instantiation, how do we then motivate a vastly superior intellect to follow these laws? I find these speculations both entertaining and thought-provoking.3

Some futurists welcome the idea of a paternalistic superintelligence that unburdens our stone-age minds from solving our own ethical questions. Even so, as Bostrom argues, if we share our environment with a superintelligence, our future depends on its decisions, much like the future of gorillas depends on human decisions. If we are currently building the superintelligence, we have but one chance to make it treat us as nicely as we treat the gorillas.

The Library of Turing

Bostrom tacitly assumes a strictly functionalist view of the mind. For instance, if we were able to perfectly simulate the neurochemical workings of a brain, this simulation itself would be a mind with consciousness, volition, and agency. Once the functions of the brain are understood, the substrate on which their simulation runs is secondary. From this perspective, questions about cognition, intelligence, and even agency, are ultimately computational questions.

Let me invite you to a fictional place that I call the Library of Turing. It is inspired by La biblioteca de Babel, described in a short story by Jorge Luis Borges. His library consists of an enormous collection of all possible books of a certain format. Almost every book is gibberish, as if it had been typed by intoxicated monkeys. One book consists entirely of the letter A, but another contains Hamlet.4 In Borges’ story, the librarians find themselves in a suicidal state of despair; surrounded by an ocean of knowledge, yet unable to navigate it due to the library’s overwhelming dimensions. Our brains cannot fathom its size, and our languages cannot describe it;5 “unimaginably vast” does not even come close. However, the metaphor of the library helps us to build some kind of mental image of its size.6

Behold, then, the Library of Turing. Each volume contains computer code. Much of it is garbage, but some volumes contain meaningful programs in some concrete programming language, let’s say Lisp.7 We can feed these programs to a computer found in each room. Most wouldn’t do anything useful, many would cause the computer to crash. From our starting room, lined with bookcases, we see exits to other rooms in all directions. These rooms contain more volumes, more bookcases, and more exits. There is no discernible ordering in the library, but close to where we are standing, a particularly worn volume contains the program

        (print "Hello, world!")

which instructs the computer to output the friendly message “Hello world!”. Next to it, a book contains the nonsensical phrase “Etaoin shrdlu,” not a valid program in Lisp. We shrug and put it back. Thanks to the diligence of some algorithmic librarian, a third book in our room contains the complete code for the primitive natural language processor Eliza, written by Joseph Weizenbaum in 1966. This famous little program is able to pretend to have (typed) conversations with you, like this:

“Tell me your troubles.”

“My cat hates me.”

“What makes you believe cat hates you?”

The primitive grammatical mistake of leaving out the pronoun gives you a clue that Eliza works by very simple syntactic substitutions, without any attempt at understanding what you mean. Still, it’s better than just saying “Hello, world!” If we continue browsing, we might find the code of some natural language processors that are state-of-the-art in the 2010s, such as IBM’s Watson, or Apple’s Siri.
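The substitution trick can be sketched in a few lines. This is a hedged illustration, not Weizenbaum’s original code: the single pattern and the tiny reflection table are invented for this one exchange, and a real Eliza script has many more rules.

```python
import re

# A minimal Eliza-style responder: crude syntactic substitution,
# with no attempt at understanding. Illustrative only.
REFLECTIONS = {"my": "your", "me": "you", "i": "you", "am": "are"}

def reflect(fragment):
    """Swap first- and second-person words, word by word."""
    return " ".join(REFLECTIONS.get(w, w) for w in fragment.lower().split())

def respond(sentence):
    m = re.match(r"my (.*)", sentence.strip(" .!"), re.IGNORECASE)
    if m:
        return "What makes you believe " + reflect(m.group(1)) + "?"
    return "Tell me more."

print(respond("My cat hates me."))
# -> What makes you believe cat hates you?
```

Because the pattern consumes the word “my”, the reflected fragment loses the possessive pronoun, which is exactly the giveaway noted above.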


Fragments of the source code of Eliza, an early natural language processor. The whole program takes just over 1000 lines.


Indeed, our conceit is that every algorithm appears somewhere in Turing’s library.8 In particular, if we accept functionalism, some book describes the behaviour of Einstein’s brain: On input “What is 5 + 5” or “Do you prefer Brahms?” or “Prove the Riemann hypothesis” it will give the same answer as Einstein. We have no idea how this program looks. Maybe it is based on symbolic artificial intelligence, like Eliza. Or maybe the bulk of the books is taken up by a “connectome” of Einstein’s actual brain, i.e., a comprehensive map of its neural connections. Such a map has been known for C. elegans, a transparent roundworm with 302 neurones, since the mid-1980s, and even though Einstein’s brain is considerably larger, this is not absurd in principle.

Somewhere in Turing’s library there must be a description of a computational process that is vastly superior to every human brain. Otherwise Einstein’s brain would have been the smartest algorithmic intelligence allowed by the laws of logic, or close to it. Yet it seems unlikely, not to say self-congratulatory, that evolution on planet Earth succeeded in constructing the ultimate problem-solving device in just a few million generations.

Translated into this setting, Bostrom suggests that our discovery of this description would have catastrophic consequences. The library contains a monster. Our exploration should proceed with prudence rather than ardour.

Searching for the Monster

Given that the description of a superintelligence exists somewhere in the library, all that remains is to find it. If you don’t work in computer science, this task seems to be relatively trivial. After all, even a mindless exhaustive search will find the volume in question. While this approach is not a particularly tempting exercise for a human, computers are supposed to be good at performing mindless, well-defined, repetitive, and routine tasks at high speed and without complaining. Thus, sooner or later we stumble upon the monster; the only debate is whether the search time is measured in years, generations, or geological time scales.

But this is a fallacy. It is a fallacy because of the unreasonable growth rate of the exponential function, and the powerlessness of exhaustive search.

To appreciate this, consider again the 23 symbols that make up our friendly “Hello World!” program. We could have found it, in principle, by exhaustively searching through all sequences of characters that make up Lisp programs.

How much time does this take? That is a simple exercise in combinatorics, once we fix some details—how many symbols are there, how fast is the computer, etc. But no matter how you fix these details, the result is disappointing.9 If a modern computer had started this calculation when the universe began, it would have gotten to somewhere around (print "Hello,"). You could gain a few extra letters by throwing millions of computers at the problem. But even then, the universe does not have enough resources to allow for an exhaustive search for even a very simple program. Either you run out of time before the universe ends, or you run out of protons to build computers with. Computation is a resource, exponential growth is immense, and the universe is finite.
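The disappointing arithmetic can be checked directly. The figures below are assumptions rounded for illustration: a 128-character alphabet (as in footnote 9), 10^9 candidate strings tried per second, and some 4.3 × 10^17 seconds since the Big Bang.

```python
# Back-of-the-envelope check of the exhaustive-search claim.
ALPHABET = 128                    # characters available to a Lisp program
TRIES_PER_SECOND = 10**9          # one candidate string per operation
AGE_OF_UNIVERSE_SECONDS = 4.3e17  # roughly 13.6 billion years

budget = TRIES_PER_SECOND * AGE_OF_UNIVERSE_SECONDS  # total strings we can try

# Longest length whose strings we could exhaust within that budget:
length = 0
while ALPHABET ** (length + 1) <= budget:
    length += 1

print(length)                   # 12: the search stalls around a dozen characters
print(ALPHABET ** 23 > budget)  # True: the 23-character program is out of reach
```

With a single machine the search exhausts all strings of length 12 and then stalls; the full 23-character program is out of reach by more than twenty orders of magnitude.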

Bostrom pays little attention to this problem. For instance, in his discussion of current efforts on brain simulation, he writes:

Success at emulating a tiny brain, such as that of C. elegans, would give us a better view of what it would take to emulate larger brains. At some point in the technology development process, once techniques are available for automatically emulating small quantities of brain tissue, the problem reduces to one of scaling.

Well, to me, scaling is the problem.

One might object that this observation is a mundane, quantitative argument that does not invalidate the compelling fact that in principle, the exhaustive search will sooner or later stumble upon the fully-fledged “Hello World!” program, and eventually the monstrous superintelligence. But that is exactly my point: You can simultaneously accept machine-state functionalism and be blasé about the prospects of AI. Speculation about the imminent discovery of the monster in Turing’s Library must be grounded in computational thinking, which is all about the growth rates of computational resources.

Could there be another way of discovering the superintelligence than exhaustive search? Certainly. After all, nature has discovered one of the monsters, the brain of Homo sapiens, starting with very simple “brains” hundreds of millions of years ago, and refining them stepwise by natural selection and mutation in environments that favoured cognition. Thus, there is some gradient, some sensible set of stepwise refinements, that Nature has followed through the Library of Turing to arrive at Einstein’s brain.10 We just don’t know how to recognise, nor how to efficiently follow, this gradient. In the enthusiasm surrounding artificial intelligence in the 1960s we might have thought that we were on such a gradient. If so, we have abandoned it. The current trend in artificial intelligence, away from symbolic reasoning and towards statistical methods like machine learning, which do not aim to build cognitive models, strikes me as an unlikely pathway.

Given our current understanding of the limits of computation, there are many relatively innocent-looking computational problems that are computationally hard.11 After more than a generation of very serious research into these problems, nobody knows anything better than exhaustive search for many of them. In fact, while the amazing features of your smartphone may tell you a different story, the main conclusion of a generation of research in algorithms is bleak: for most computational problems we don’t know what to do.12

Let me repeat that I do not try to rule out the existence of a monster. I merely point out that even if a monster exists, we may never run into it. Turing’s Library is just too vast; nobody has labelled the shelves, much less put up useful signs and maps. Our intuition is mistaken when it extrapolates that we will soon have mapped the entire library, just because we’ve seen more of it than our grandparents ever did. Moreover, imminent and rampant progress in exhaustive search may have been a reasonable hope half a century ago. But today we know how hard it is.

Research Directions

If the algorithmic perspective above is correct, then there is nothing to worry about. The superintelligence is an entertaining fiction, no more worthy of our attention than an imminent invasion by aliens,13 eldritch gods from the oceans, or a sudden emergence of magic. The issues raised are inherently unscientific, and we might as well worry about imminent overpopulation on Venus or how many angels can dance on the head of a pin.

But I could be wrong. After all, I wager the future of human civilization on a presumption about the computational complexity of an ill-defined problem. I do think that I have good reasons for that, and that these reasons are more firmly grounded in our current understanding of computation than Bostrom’s extrapolations. However, if I’m wrong and Bostrom is right, then humanity will soon be dead, unless we join his research agenda. Thus, from the perspective of Pascal’s wager, it may seem prudent to entertain Bostrom’s concerns.

The problem domain opened up by Superintelligence may lead to valid research in various areas, independently of the plausibility of the underlying hypothesis.

For instance, the problem of codifying human values, a formalization of ethics, is a problem as old as philosophy. Bostrom’s prediction injects these questions with operational significance, as these codified ethics would be the framework for the incentive structure needed to solve the control problem. However, an “algorithmic code of ethics” is a worthwhile topic anyway. After all, our digital societies are becoming increasingly controlled by algorithmic processes. These processes are nowhere near superintelligent, but they are highly influential: Algorithms choose our news, decide our insurance premiums, and filter our friends and potential mates. Their consistency with human values is a valid question for ethicists and algorithmicists alike, no matter if the algorithms are authored by Google’s very real software engineers or a very hypothetical paternalistic AI.

Another issue related to the control problem is the ongoing work on the semantics and verification of programming languages. This has been a well-established subfield of computer science for a generation already. Some of these issues go back to Gödel’s incompleteness theorems from the 1930s, proving that formal systems are unable to reason about the behaviour of formal systems. In particular, it has turned out to be a very difficult problem of logic to specify and verify the intended behaviour of even very short pieces of computer code. It may well be that questions about harnessing the behaviour of a superintelligence can be investigated using these tools. Other scientific areas may provide similar connection points: Incentive structures are a valid question for game theory and algorithmic mechanism design, which resides at the intersection of computer science and economics. Or the computational aspects of artificial intelligence may become sufficiently well defined to become a valid question for researchers in the field of computational complexity. This might all lead to fruitful research, even if the superintelligence remains eternally absent. But whether superintelligence becomes a valid topic for any of these disciplines is very unclear to me. It may depend on purely social issues within those research communities.

It may also depend on the availability of external funding.

This last point leads me to speculate about a very real potential impact of Bostrom’s book. It has to do with the perception of artificial intelligence in the public eye. Research ethics is a political question ultimately decided by the public. The most Herculean promise of current artificial intelligence research is the construction of general artificial intelligence. What if the public no longer saw that as a goal, but as a threat? If Bostrom’s dystopian perspective leads to increased public awareness of the catastrophic effects of superintelligence, then general AI would not appear under “Goals” in a research plan, but under “Ethical considerations” that merit public scrutiny along with climate change, sensitive medical data, rampant superviruses, animal testing, or nuclear disasters. And these, largely political, requirements may themselves necessitate research into the control problem.

We may want to ponder the consequences of running into a monster in the Library of Turing no matter how plausible that event may be.

  1. Eliezer Yudkowsky, Artificial Intelligence as a Positive and Negative Factor in Global Risk, 2006.
  2. Unlike in the movies, we must assume that a superintelligence can anticipate the efforts of a plucky group of heroes. Bostrom describes a number of ways in which a superintelligence could make itself immune to such an attack by distributing its computational substrate. Surely a superintelligence can think of more.
  3. I particularly like the idea of designing an AI with a craving for cryptographic puzzles to which we know the answer.
  4. A million others contain Hamlet with exactly one misprint.
  5. The language of mathematics is an exception. But our brains cannot imagine 10^25, the number of stars, much less 10^123, the number of atoms in the universe. By contrast, the number of books in the Library of Babel is 25^1,312,000.
  6. Daniel Dennett has used this idea as “the Library of Mendel” to illustrate the vast configuration space of genetic variation. Daniel C. Dennett, Darwin’s Dangerous Idea, Simon & Schuster 1995.
  7. I chose Lisp because of its role as an early AI programming language, but any other language would do. For a cleaner argument, I could have chosen the syntax of Alan Turing’s original “universal machine”. In the language of computability theory, all these choices are “Turing-complete”: they can simulate each other.
  8. Some programs may be so big as to be spread over several books, maybe referring to other volumes as subroutines much like modern software is arranged into so-called software libraries.
  9. Lisp is written in an alphabet of at most 128 characters. A modern computer with 10^9 operations per second, running since the Big Bang, would not have finished all 128^13 strings of length 13.
  10. If we entertain this idea, then a pathway to intelligence would be the simulation of competitive environments in which to artificially evolve intelligence by mutation and selection. This shifts the problem from simulating entire brains to simulating entire adaptive environments, including the brains in them.
  11. This claim is formalised in various hypotheses in the theory of computational complexity, such as “P is not NP” and the Exponential Time Hypothesis.
  12. Not even functioning quantum computers would change this.
  13. Here is the analogy in full: We have explored the Moon and are about to explore Mars. We should worry about imminent contact with alien civilisations.