hostility

Will an artificial superintelligence be hostile?

The casual observer my ask why it is often assumed that an artificial superintelligence is hostile to humans. After all, we are biological systems with needs and requirements very different to machines. Why would there ever be a conflict and isn't this simply the stuff of science fiction movies?

There are two problems. Russell (2019, p.137) discusses the first problem as a failure of value alignment: Humans give artificial intelligence a goal to achieve and it has very negative side effects for individuals or society in general. Nonetheless, the original goal given to machines to solve may be perfectly inno-cent and rational. An example are the algorithms mentioned above for the optimisation of click-through rates on social media (Diederich, 2021). The goal is perfectly innocent: Select content that allows users to click on links so that people stay social media platforms longer. The negative side effect is that more and more extreme content engages people, enables click-through and results in radicalisation. Western societies, in particular, have to deal with the consequences that have already occurred.

The second problem is also about achieving goals that may be perfectly innocent to begin with. Russell (2019, p.141) talks about the "fetching coffee problem." A machine is giving the goal to get some coffee. The machine will immediately realise that it cannot achieve the goal if it is destroyed or switched off. Hence, the machine will create a subgoal to ensure that it is operational at all times. This may include self-modification so that it cannot be switched off by a human user. It is possible to generalise this problem immediately to realise that achieving any goal requires the existence of the machine and the generation of subgoals that insure self-preservation. Any argument that claims we can simply turn off the machines simply falls short. Self-preservation is logically built into any advanced form of artificial intelligence. If humans are somehow in conflict with the existence and the drive of self-preservation built into the machines, then the superior AI systems may decide to remove the humans.

For many complex reasons, self preservation is a drive in humans. Suicidality is a problem in many advanced societies too but it is still relatively rare. For purely logical reasons, self-preservation is also a part of very advanced artificial intelligence. An this drive may dominate.

References

Diederich J, The Psychology of Artificial Superintelligence. Springer Nature Switzerland AG 2021, ISBN 978-3-030-71841-1, DOI https://doi.org/10.1007/978-3-030-71842-8

Russell S, Human Compatible. Artificial Intelligence and the Problem of Control. Viking. Penguin Random House, 2019.