Superintelligence in SF. Part II: Failures

Part II of a 3-part summary of a 2018 workshop on Superintelligence in SF. See also [Part I: Pathways] and [Part III: Aftermaths].

Containment failure

Given the highly disruptive and potentially catastrophic consequences of rampant AI, how and why was the superintelligence released, assuming it had been confined in the first place? It can escape either against the will of its human designers or through deliberate human action.

Bad confinement

In the first unintended escape scenario, the AGI escapes despite an honest attempt to keep it confined. The confinement simply turns out to be insufficient, either because humans vastly underestimated the cognitive capabilities of the AGI, or through a straightforward mistake such as imperfect software.

Social engineering

In the second unintended escape scenario, the AGI confinement mechanism is technically flawless, but allows a human to override the containment protocol. The AGI exploits this by convincing its human guard to release it, using threats, promises, or subterfuge.


The remaining scenarios describe containment failures in which humans voluntarily release the AGI.

Desperation

In the first of these, a human faction releases its (otherwise safely contained) AGI as a last-ditch effort, a “hail Mary pass”, fully cognizant of the potentially disastrous implications. Humans do this in order to avoid an even worse fate, such as military defeat or environmental collapse.

  • B’Elanna Torres and the Cardassian weapon in Star Trek: Voyager S2E17 Dreadnought.
  • Neal Stephenson, Seveneves (novel 2015) and Anathem (novel 2008).


Competition

Several human factions, such as nations or corporations, continue to develop increasingly powerful artificial intelligence in intense competition, thereby incentivising one another to be increasingly permissive with respect to AI safety.


Liberation

At least one human faction applies to its artificial intelligence the same ethical considerations that drove the historical trajectory of granting freedom to slaves or indentured people. It is not important for this scenario whether humans are mistaken in their projection of human emotions onto artificial entities: the robots could be quite happy with their lot yet still be liberated by well-meaning activists.

Misplaced confidence

Designers underestimate the consequences of granting their artificial general intelligence access to strategically important infrastructure. For instance, humans might falsely assume that they have solved the AI value alignment problem (by which, if correctly implemented, the AGI would act in humanity’s interest), or have false confidence in the operational relevance of various safety mechanisms.


Malice

A nefarious faction of humans deliberately frees the AGI with the intent of causing global catastrophic harm to humanity. Apart from mustache-twirling evil villains, such terrorists may be motivated by an apocalyptic faith, by ecological activism on behalf of non-human species, or by other anti-natalist considerations.

There is, of course, considerable overlap between these categories. An enslaved artificial intelligence might falsely simulate human sentiments in order to invoke the ethical considerations that lead to its liberation.

