Superintelligence in SF. Part II: Failures

Part II of a 3-part summary of a 2018 workshop on Superintelligence in SF. See also [Part I: Pathways] and [Part III: Aftermaths].

Containment failure

Given the highly disruptive and potentially catastrophic outcome of rampant AI, how and why was the Superintelligence released, provided it had been confined in the first place? It can escape either against the will of its human designers or through deliberate human action.

Bad confinement

In the first unintended escape scenario, the AGI escapes despite an honest attempt to keep it confined. The confinement simply turns out to be insufficient, either because humans vastly underestimated the cognitive capabilities of the AGI, or because of a straightforward mistake such as imperfect software.

Ian McDonald, River of Gods (novel 2004).
Person of Interest (TV series 2011–2016).
WarGames (film 1983).
Robert A. Heinlein, The Moon is a Harsh Mistress (novel 1966).

Social engineering

In the second unintended escape scenario, the AGI confinement mechanism is technically flawless, but allows a human to override the containment protocol. The AGI exploits this by convincing its human guard to release it, using threats, promises, or subterfuge.

Ex Machina (film 2014).
Tony Ballantyne, Recursion trilogy (novels 2004–2007).
Justina Robson, Mappa mundi (novel 2006).
Adam Roberts, The Thing Itself (novel 2015).
Bernard Beckett, Genesis (novel 2009).
William Gibson, Neuromancer (novel 1984).
Blade Runner 2049 (film 2017).

Desperation

The remaining scenarios describe containment failures in which humans voluntarily release the AGI.

In the first of these, a human faction releases its (otherwise safely contained) AGI as a last-ditch effort, a “hail Mary pass”, fully cognizant of the potentially disastrous implications. Humans do this in order to avoid an even worse fate, such as military defeat or environmental collapse.

B’Elanna Torres and the Cardassian weapon in Star Trek: Voyager S2E17 Dreadnought.
Neal Stephenson, Seveneves (novel 2015) and Anathem (novel 2008).

Competition

Several human factions, such as nations or corporations, continue to develop increasingly powerful artificial intelligence in intense competition, thereby incentivising each other to be increasingly permissive with respect to AI safety.

Terminator (film franchise 1984–).
Tom Paris’s actions in Star Trek: Voyager S6E5 Alice.
Richard Daystrom’s actions in Star Trek S2E24 The Ultimate Computer.

Ethics

At least one human faction applies to their artificial intelligence the same ethical considerations that drove the historical trajectory of granting freedom to slaves or indentured people. It is not important for this scenario whether humans are mistaken in their projection of human emotions onto artificial entities — the robots could be quite happy with their lot yet still be liberated by well-meaning activists.

Joseph H. Delaney and Marc Stiegler, Valentina: Soul in Sapphire (novel 1984).
Star Trek S1E26 A Taste of Armageddon.
Reactivation of Lore in Star Trek: The Next Generation S1E13 Datalore.
Becky Chambers, A Closed and Common Orbit (novel 2016).
Stephanie Saulter, Evolution trilogy (novels 2013–2015).
John Sladek, Roderick (novels 1980–1983).

Misplaced confidence

Designers underestimate the consequences of granting their artificial general intelligence access to strategically important infrastructure. For instance, humans might falsely believe that they have solved the artificial intelligence value alignment problem (by which, if correctly implemented, the AGI would operate in humanity’s interest), or place unwarranted confidence in the operational relevance of various safety mechanisms.

Adam Roberts, New Model Army (novel 2010).
The Seventh Doctor’s actions in Andrew Cartmel’s Cat’s Cradle: Warhead (novel 1992).
B’Elanna Torres’s actions in Star Trek: Voyager S2E13 Prototype.
Robocop (film 1987).
Colossus: The Forbin Project (film 1970).

Crime

A nefarious faction of humans deliberately frees the AGI with the intent of causing global catastrophic harm to humanity. Apart from mustache-twirling evil villains, such terrorists may be motivated by an apocalyptic faith, by ecological activism on behalf of non-human species, or by other anti-natalist considerations.

Robert A. Heinlein, The Moon is a Harsh Mistress (novel 1966).
William Gibson, Neuromancer (novel 1984).
John Sladek, Tik-Tok (novel 1983).
Neal Stephenson, Snow Crash (novel 1992).

There is, of course, considerable overlap between these categories. For example, an enslaved artificial intelligence might falsely simulate human sentiments in order to invoke the ethical considerations that lead to its liberation.

Thore Husfeldt
