On the Controllability of Artificial Intelligence: An Analysis of Limitations


  • Roman V. Yampolskiy, University of Louisville, USA




Keywords: AI safety, control problem, safer AI, uncontrollability, unverifiability, X-risk


The invention of artificial general intelligence is predicted to cause a shift in the trajectory of human civilization. In order to reap the benefits and avoid the pitfalls of such a powerful technology, it is important to be able to control it. However, the possibility of controlling artificial general intelligence and its more advanced version, superintelligence, has not been formally established. In this paper, we present arguments, as well as supporting evidence from multiple domains, indicating that advanced AI cannot be fully controlled. The consequences of the uncontrollability of AI are discussed with respect to the future of humanity and to research on AI safety and security.



Author Biography

Roman V. Yampolskiy, University of Louisville, USA

Roman V. Yampolskiy holds a BS/MS in Computer Science (RIT, 2004) and a PhD in Engineering and Computer Science (UB, 2008). He is a tenured associate professor in the Department of Computer Science and Engineering at the Speed School of Engineering, University of Louisville (2008–). He is the founding and current director of the Cyber Security Lab and the author of many books, including Artificial Superintelligence: A Futuristic Approach. During his tenure at UofL, Dr. Yampolskiy has been recognized as Distinguished Teaching Professor, Professor of the Year, Faculty Favorite, Top 4 Faculty, Leader in Engineering Education, Top 10 Online College Professor of the Year, and Outstanding Early Career in Education award winner, among many other honors and distinctions. Dr. Yampolskiy is a Senior Member of the IEEE and a member of the Kentucky Academy of Science. His main areas of interest are AI safety and cybersecurity, and he is the author of over 200 publications, including multiple journal articles and books.


Devlin, J., et al., Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.

Goodfellow, I., et al. Generative adversarial nets. in Advances in neural information processing systems. 2014.

Mnih, V., et al., Human-level control through deep reinforcement learning. Nature, 2015. 518(7540): pp. 529–533.

Silver, D., et al., Mastering the game of go without human knowledge. Nature, 2017. 550(7676): p. 354.

Clark, P., et al., From ‘F’ to ‘A’ on the NY Regents Science Exams: An Overview of the Aristo Project. arXiv preprint arXiv:1909.01958, 2019.

Vinyals, O., et al., Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature, 2019: pp. 1–5.

Yampolskiy, R.V., Predicting future AI failures from historic examples. foresight, 2019. 21(1): pp. 138–152.

Scott, P.J. and R.V. Yampolskiy, Classification Schemas for Artificial Intelligence Failures. arXiv preprint arXiv:1907.07771, 2019.

Brundage, M., et al., The malicious use of artificial intelligence: Forecasting, prevention, and mitigation. arXiv preprint arXiv:1802.07228, 2018.

Paulas, R., The Moment When Humans Lose Control of AI. February 8, 2017: Available at: https://www.vocativ.com/400643/when-humans-lose-control-of-ai.

Russell, S., D. Dewey, and M. Tegmark, Research Priorities for Robust and Beneficial Artificial Intelligence. AI Magazine, 2015. 36(4).

Yampolskiy, R., Artificial Intelligence Safety and Security. 2018: CRC Press.

Sotala, K. and R.V. Yampolskiy, Responses to catastrophic AGI risk: a survey. Physica Scripta, 2014. 90(1): p. 018001.

Amodei, D., et al., Concrete problems in AI safety. arXiv preprint arXiv:1606.06565, 2016.

Everitt, T., G. Lea, and M. Hutter, AGI safety literature review. arXiv preprint arXiv:1805.01109, 2018.

Charisi, V., et al., Towards Moral Autonomous Systems. arXiv preprint arXiv:1703.04741, 2017.

Callaghan, V., et al., Technological Singularity. 2017: Springer.

Majot, A.M. and R.V. Yampolskiy. AI safety engineering through introduction of self-reference into felicific calculus via artificial pain and pleasure. in 2014 IEEE International Symposium on Ethics in Science, Technology and Engineering. 2014. IEEE.

Aliman, N.-M., et al. Orthogonality-Based Disentanglement of Responsibilities for Ethical Intelligent Systems. in International Conference on Artificial General Intelligence. 2019. Springer.

Miller, J.D. and R. Yampolskiy, An AGI with Time-Inconsistent Preferences. arXiv preprint arXiv:1906.10536, 2019.

Yampolskiy, R.V., Personal Universes: A Solution to the Multi-Agent Value Alignment Problem. arXiv preprint arXiv:1901.01851, 2019.

Behzadan, V., R.V. Yampolskiy, and A. Munir, Emergence of Addictive Behaviors in Reinforcement Learning Agents. arXiv preprint arXiv:1811.05590, 2018.

Trazzi, M. and R.V. Yampolskiy, Building safer AGI by introducing artificial stupidity. arXiv preprint arXiv:1808.03644, 2018.

Behzadan, V., A. Munir, and R.V. Yampolskiy. A psychopathological approach to safety engineering in ai and agi. in International Conference on Computer Safety, Reliability, and Security. 2018. Springer.

Duettmann, A., et al., Artificial General Intelligence: Coordination & Great Powers. Foresight Institute: Palo Alto, CA, USA, 2018.

Ramamoorthy, A. and R. Yampolskiy, Beyond Mad?: The Race for Artificial General Intelligence. ITU Journal: ICT Discoveries, 2017.

Ozlati, S. and R. Yampolskiy. The Formalization of AI Risk Management and Safety Standards. in Workshops at the Thirty-First AAAI Conference on Artificial Intelligence. 2017.

Brundage, M., et al., Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims. arXiv preprint arXiv:2004.07213, 2020.

Trazzi, M. and R.V. Yampolskiy, Artificial Stupidity: Data We Need to Make Machines Our Equals. Patterns, 2020. 1(2): p. 100021.

Miller, J.D., R. Yampolskiy, and O. Häggström, An AGI Modifying Its Utility Function in Violation of the Orthogonality Thesis. arXiv preprint arXiv:2003.00812, 2020.

Callaghan, V., et al., The Technological Singularity: Managing the Journey. 2017: Springer.

Davis, M., The undecidable: Basic papers on undecidable propositions, unsolvable problems and computable functions. 2004: Courier Corporation.

Turing, A.M., On Computable Numbers, with an Application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 1936. 42: pp. 230–265.

Gans, J.S., Self-regulating artificial general intelligence. 2018, National Bureau of Economic Research.

Chong, E.K., The Control Problem [President’s Message]. IEEE Control Systems Magazine, 2017. 37(2): pp. 14–16.

Yudkowsky, E., Artificial intelligence as a positive and negative factor in global risk. Global catastrophic risks, 2008. 1(303): p. 184.

Yudkowsky, E., On Doing the Impossible, in Less Wrong. October 6, 2008: Available at: https://www.lesswrong.com/posts/fpecAJLG9czABgCe9/on-doing-the-impossible.

Babcock, J., J. Kramar, and R.V. Yampolskiy, Guidelines for Artificial Intelligence Containment, in Next-Generation Ethics: Engineering a Better Society (Ed.) Ali. E. Abbas. 2019, Cambridge University Press: Padstow, UK. pp. 90–112.

Goertzel, B. and C. Pennachin, Artificial general intelligence. Vol. 2. 2007: Springer.

Yampolskiy, R.V. On the limits of recursively self-improving AGI. in International Conference on Artificial General Intelligence. 2015. Springer.

Shin, D. and Y.J. Park, Role of fairness, accountability, and transparency in algorithmic affordance. Computers in Human Behavior, 2019. 98: pp. 277–284.

Cave, S. and S.S. Óhéigeartaigh, Bridging near-and long-term concerns about AI. Nature Machine Intelligence, 2019. 1(1): p. 5.

Papadimitriou, C.H., Computational complexity. 2003: John Wiley and Sons Ltd.

Gentry, C. Toward basing fully homomorphic encryption on worst-case hardness. in Annual Cryptology Conference. 2010. Springer.

Yoe, C., Primer on risk analysis: decision making under uncertainty. 2016: CRC press.

Du, D.-Z. and P.M. Pardalos, Minimax and applications. Vol. 4. 2013: Springer Science & Business Media.

Anonymous, Worst-case scenario, in Wikipedia. Retrieved June 18, 2020: Available at: https://en.wikipedia.org/wiki/Worst-case_scenario

Dewar, J.A., Assumption-based planning: a tool for reducing avoidable surprises. 2002: Cambridge University Press.

Ineichen, A.M., Asymmetric returns: The future of active asset management. Vol. 369. 2011: John Wiley & Sons.

Sotala, K. and L. Gloor, Superintelligence as a cause or cure for risks of astronomical suffering. Informatica, 2017. 41(4).

Maumann, T., An introduction to worst-case AI safety in S-Risks. July 5, 2018: Available at: http://s-risks.org/an-introduction-to-worst-case-ai-safety/.

Baumann, T., Focus areas of worst-case AI safety, in S-Risks. September 16, 2017: Available at: http://s-risks.org/focus-areas-of-worst-case-ai-safety/.

Daniel, M., S-risks: Why they are the worst existential risks, and how to prevent them, in EAG Boston. 2017: Available at: https://foundational-research.org/s-risks-talk-eag-boston-2017/.

Ziesche, S. and R.V. Yampolskiy, Do No Harm Policy for Minds in Other Substrates. Journal of Evolution & Technology, 2019. 29(2).

Pistono, F. and R.V. Yampolskiy. Unethical Research: How to Create a Malevolent Artificial Intelligence. in 25th International Joint Conference on Artificial Intelligence (IJCAI-16). Ethics for Artificial Intelligence Workshop (AI-Ethics-2016). 2016.

Seshia, S.A., D. Sadigh, and S.S. Sastry, Towards verified artificial intelligence. arXiv preprint arXiv:1606.08514, 2016.

Levin, L.A., Average case complete problems. SIAM Journal on Computing, 1986. 15(1): pp. 285–286.

Bostrom, N., What is a Singleton? Linguistic and Philosophical Investigations, 2006 5(2): pp. 48–54.

Bostrom, N., Superintelligence: Paths, dangers, strategies. 2014: Oxford University Press.

Yampolskiy, R.V., Leakproofing Singularity-Artificial Intelligence Confinement Problem. Journal of Consciousness Studies JCS, 2012.

Babcock, J., J. Kramar, and R. Yampolskiy, The AGI Containment Problem, in The Ninth Conference on Artificial General Intelligence (AGI2015). July 16–19, 2016: NYC, USA.

Armstrong, S., A. Sandberg, and N. Bostrom, Thinking inside the box: Controlling and using an oracle ai. Minds and Machines, 2012. 22(4): pp. 299–324.

Hadfield-Menell, D., et al. The off-switch game. in Workshops at the Thirty-First AAAI Conference on Artificial Intelligence. 2017.

Wängberg, T., et al. A game-theoretic analysis of the off-switch game. in International Conference on Artificial General Intelligence. 2017. Springer.

Yampolskiy, R.V., On Defining Differences Between Intelligence and Artificial Intelligence. Journal of Artificial General Intelligence, 2020. 11(2): pp. 68–70.

Legg, S. and M. Hutter, Universal Intelligence: A Definition of Machine Intelligence. Minds and Machines, December 2007. 17(4): pp. 391–444.

Russell, S. and P. Norvig, Artificial Intelligence: A Modern Approach. 2003, Upper Saddle River, NJ: Prentice Hall.

Legg, S., Friendly AI is Bunk, in Vetta Project. 2006: Available at: http://commonsenseatheism.com/wp-content/uploads/2011/02/Legg-Friendly-AI-is-bunk.pdf.

Juliano, D., Saving the Control Problem. December 18, 2016: Available at: http://dustinjuliano.com/papers/juliano2016a.pdf.

Christiano, P., Benign AI, in AI-Alignment. November 29, 2016: Available at: https://ai-alignment.com/benign-ai-e4eb6ec6d68e.

Armstrong, S. and S. Mindermann. Occam’s razor is insufficient to infer the preferences of irrational agents. in Advances in Neural Information Processing Systems. 2018.

Armstrong, S. and S. Mindermann, Impossibility of deducing preferences and rationality from human policy. arXiv preprint arXiv:1712.05812, 2017.

Russell, S., Provably Beneficial Artificial Intelligence, in The Next Step: Exponential Life. 2017: Available at: https://www.bbvaopenmind.com/en/articles/provably-beneficial-artificial-intelligence/.

M0zrat, Is Alignment Even Possible?!, in Control Problem Forum/Comments. 2018: Available at: https://www.reddit.com/r/ControlProblem/comments/8p0mru/is_alignment_even_possible/.

Baumann, T., Why I expect successful (narrow) alignment, in S-Risks. December 29, 2018: Available at: http://s-risks.org/why-i-expect-successful-alignment/.

Christiano, P., AI “safety” vs “control” vs “alignment”. November 19, 2016: Available at: https://ai-alignment.com/ai-safety-vs-control-vs-alignment-2a4b42a863cc.

Pichai, S., AI at Google: Our Principles. June 7, 2018: Available at: https://blog.google/topics/ai/ai-principles/.

Vinding, M., Is AI Alignment Possible? . Decemeber 14, 2018: Available at: https://magnusvinding.com/2018/12/14/is-ai-alignment-possible/.

Asilomar AI Principles. in Principles developed in conjunction with the 2017 Asilomar conference [Benevolent AI 2017]. 2017.

Critch, A. and D. Krueger, AI Research Considerations for Human Existential Safety (ARCHES). arXiv preprint arXiv:2006.04948, 2020.

AI Control Problem, in Encyclopedia wikipedia. 2019: Available at: https://en.wikipedia.org/wiki/AI_control_problem.

Leike, J., et al., Scalable agent alignment via reward modeling: a research direction. arXiv preprint arXiv:1811.07871, 2018.

Aliman, N.M. and L. Kester, Transformative AI Governance and AI-Empowered Ethical Enhancement Through Preemptive Simulations. Delphi – Interdisciplinary review of emerging technologies, 2019. 2(1).

Russell, S.J., Human compatible: Artificial intelligence and the problem of control. 2019: Penguin Random House.

Christiano, P., Conversation with Paul Christiano, in AI Impacts. September 11, 2019: Available at: https://aiimpacts.org/conversation-with-paul-christiano/.

Shah, R., Why AI risk might be solved without additional intervention from longtermists, in Alignment Newsletter. January 2, 2020: Available at: https://mailchi.mp/b3dc916ac7e2/an-80-why-ai-risk-might-be-solved-without-additional-intervention-from-longtermists.

Gabriel, I., Artificial Intelligence, Values and Alignment. arXiv preprint arXiv:2001.09768, 2020.

Dewey, D., Three areas of research on the superintelligence control problem, in Global Priorities Project. October 20, 2015: Available at: http://globalprioritiesproject.org/2015/10/three-areas-of-research-on-the-superintelligence-control-problem/.

Critch, A., et al., CS 294-149: Safety and Control for Artificial General Intelligence, in Berkeley. 2018: Available at: http://inst.eecs.berkeley.edu/$sim$cs294-149/fa18/.

Pfleeger, S. and R. Cunningham, Why measuring security is hard. IEEE Security & Privacy, 2010. 8(4): pp. 46–54.

Asimov, I., Runaround in Astounding Science Fiction. March 1942.

Clarke, R., Asimov’s Laws of Robotics: Implications for Information Technology, Part 1. IEEE Computer, 1993. 26(12): pp. 53–61.

Clarke, R., Asimov’s Laws of Robotics: Implications for Information Technology, Part 2. IEEE Computer, 1994. 27(1): pp. 57–66.

Soares, N., The value learning problem. Machine Intelligence Research Institute, Berkley, CA, USA, 2015.

Christiano, P., Human-in-the-counterfactual-loop, in AI Alignment. January 20, 2015: Available at: https://ai-alignment.com/counterfactual-human-in-the-loop-a7822e36f399.

Muehlhauser, L. and C. Williamson, Ideal Advisor Theories and Personal CEV. Machine Intelligence Research Institute, 2013.

Kurzweil, R., The Singularity is Near: When Humans Transcend Biology. 2005: Viking Press.

Musk, E., An integrated brain-machine interface platform with thousands of channels. BioRxiv, 2019: p. 703801.

Hossain, G. and M. Yeasin, Cognitive ability-demand gap analysis with latent response models. IEEE Access, 2014. 2: pp. 711–724.

Armstrong, A.J., Development of a methodology for deriving safety metrics for UAV operational safety performance measurement. Report of Master of Science in Safety Critical Systems Engineering at the Department of Computer Science, the University of York, 2010.

Sheridan, T.B. and W.L. Verplank, Human and computer control of undersea teleoperators. 1978, Massachusetts Inst of Tech Cambridge Man-Machine Systems Lab.

Clarke, R., Why the world wants controls over Artificial Intelligence. Computer Law & Security Review, 2019. 35(4): pp. 423–433.

Parasuraman, R., T.B. Sheridan, and C.D. Wickens, A model for types and levels of human interaction with automation. IEEE Transactions on systems, man, and cybernetics-Part A: Systems and Humans, 2000. 30(3): pp. 286–297.

Joy, B., Why the future doesn’t need us. Wired magazine, 2000. 8(4): pp. 238–262.

Werkhoven, P., L. Kester, and M. Neerincx. Telling autonomous systems what to do. in Proceedings of the 36th European Conference on Cognitive Ergonomics. 2018. ACM.

SquirrelInHell, The AI Alignment Problem Has Already Been Solved(?) Once, in Comment on LessWrong by magfrump. April 22, 2017: Available at: https://www.lesswrong.com/posts/Ldzoxz3BuFL4Ca8pG/the-ai-alignment-problem-has-already-been-solved-once.

Yudkowsky, E., The AI alignment problem: why it is hard, and where to start, in Symbolic Systems Distinguished Speaker. 2016: Available at: https://intelligence.org/2016/12/28/ai-alignment-why-its-hard-and-where-to-start/.

Russell, S.J., Provably beneficial artificial intelligence, in Exponential Life, The Next Step. 2017: Available at: https://people.eecs.berkeley.edu/$sim$russell/papers/russell-bbvabook17-pbai.pdf.

Russell, S., Should we fear supersmart robots? Scientific American, 2016. 314(6): pp. 58–59.

Yudkowsky, E., Shut up and do the impossible!, in Less Wrong. October 8, 2008: Available at: https://www.lesswrong.com/posts/nCvvhFBaayaXyuBiD/shut-up-and-do-the-impossible.

Everitt, T. and M. Hutter, The alignment problem for Bayesian history-based reinforcement learners, in Technical Report. 2018: Available at: https://www.tomeveritt.se/papers/alignment.pdf.

Proof of Impossibility, in Wikipedia. 2020: Available at: https://en.wikipedia.org/wiki/Proof_of_impossibility.

Yudkowsky, E., Proving the Impossibility of Stable Goal Systems, in SL4. March 5, 2006: Available at: http://www.sl4.org/archive/0603/14296.html.

Clarke, R. and R.P. Eddy, Summoning the Demon: Why superintelligence is humanity’s biggest threat, in Geek Wire. May 24, 2017: Available at: https://www.geekwire.com/2017/summoning-demon-superintelligence-humanitys-biggest-threat/.

Creighton, J., OpenAI Wants to Make Safe AI, but That May Be an Impossible Task, in Futurism. March 15, 2018: Available at: https://futurism.com/openai-safe-ai-michael-page.

Keiper, A. and A.N. Schulman, The Problem with’Friendly’Artificial Intelligence. The New Atlantis, 2011: pp. 80–89.

Friendly Artificial Intelligence, in Wikipedia. 2019: Available at: https://en.wikipedia.org/wiki/Friendly_artificial_intelligence.

Tegmark, M., Life 3.0: Being human in the age of artificial intelligence. 2017: Knopf.

Kornai, A., Bounding the impact of AGI. Journal of Experimental & Theoretical Artificial Intelligence, 2014. 26(3): pp. 417–438.

Good, I.J., Human and Machine Intelligence: Comparisons and Contrasts. Impact of Science on Society, 1971. 21(4): pp. 305–322.

De Garis, H., What if AI succeeds? The rise of the twenty-first century artilect. AI magazine, 1989. 10(2): pp. 17–17.

Garis, H.d., The Rise of the Artilect Heaven or Hell. 2009: Available at: http://www.agi-conf.org/2009/papers/agi-09artilect.doc.

Spencer, M., Artificial Intelligence Regulation May Be Impossible, in Forbes. March 2, 2019: Available at: https://www.forbes.com/sites/cognitiveworld/2019/03/02/artificial-intelligence-regulation-will-be-impossible/amp.

Menezes, T., Non-Evolutionary Superintelligences Do Nothing, Eventually. arXiv preprint arXiv:1609.02009, 2016.

Pamlin, D. and S. Armstrong, 12 Risks that Threaten Human Civilization, in Global Challenges. February 2015: Available at: https://www.pamlin.net/material/2017/10/10/without-us-progress-still-possible-article-in-china-daily-m9hnk.

Alfonseca, M., et al., Superintelligence cannot be contained: Lessons from Computability Theory. arXiv preprint arXiv:1607.00913, 2016.

Barrat, J., Our final invention: Artificial intelligence and the end of the human era. 2013: Macmillan.

Taylor, J., Autopoietic systems and difficulty of AGI alignment, in Intelligent Agent Foundations Forum. August 18, 2017: Available at: https://agentfoundations.org/item?id=1628.

meanderingmoose, Emergence and Control, in My Brain’s Thoughts. Retrieved on June 16, 2020: Available at: https://mybrainsthoughts.com/?p=136.

capybaralet, Imitation learning considered unsafe?, in Less Wrong. January 6, 2019: Available at: https://www.lesswrong.com/posts/whRPLBZNQm3JD5Zv8/imitation-learning-considered-unsafe.

Kaczynski, T., Industrial Society and Its Future, in The New York Times. September 19, 1995.

Asimov, I., A choice of catastrophes: The disasters that threaten our world. 1979: Simon & Schuster.

Zittrain, J., The Hidden Costs of Automated Thinking, in New Yorker. July 23, 2019: Available at: https://www.newyorker.com/tech/annals-of-technology/the-hidden-costs-of-automated-thinking.

Rodrigues, R. and A. Rességuier, The underdog in the AI ethical and legal debate: human autonomy. June 12, 2019: Available at: https://www.ethicsdialogues.eu/2019/06/12/the-underdog-in-the-ai-ethical-and-legal-debate-human-autonomy/.

Hall, J.S., Beyond AI: Creating the conscience of the machine. 2009: Prometheus books.

Gödel, K., On formally undecidable propositions of Principia Mathematica and related systems. 1992: Courier Corporation.

Yudkowsky, E.S., Coherent Extrapolated Volition. May 2004 Singularity Institute for Artificial Intelligence: Available at: http://singinst.org/upload/CEV.html.

Smuts, A., To be or never to have been: Anti-Natalism and a life worth living. Ethical Theory and Moral Practice, 2014. 17(4): pp. 711–729.

Metzinger, T., Benevolent Artificial Anti-Natalism (BAAN), in EDGE Essay. 2017: Available at: https://www.edge.org/conversation/thomas_metzinger-benevolent-artificial-anti-natalism-baan.

Watson, E.N., The Supermoral Singularity—AI as a Fountain of Values. Big Data and Cognitive Computing, 2019. 3(2): p. 23.

Yampolskiy, R.V., L. Ashby, and L. Hassan, Wisdom of artificial crowds—a metaheuristic algorithm for optimization. Journal of Intelligent Learning Systems & Applications, 2012. 4(2).

Alexander, G.M., et al., The sounds of science–a symphony for many instruments and voices. Physica Scripta, 2020. 95(6).

Sutton, R. Artificial intelligence as a control problem: Comments on the relationship between machine learning and intelligent control. in IEEE International Symposium on Intelligent Control. 1988.

Wiener, N., Cybernetics or Control and Communication in the Animal and the Machine. Vol. 25. 1961: MIT press.

Fisher, M., N. Lynch, and M. Peterson, Impossibility of Distributed Consensus with One Faulty Process. Journal of ACM, 1985. 32(2): pp. 374–382.

Grossman, S.J. and J.E. Stiglitz, On the impossibility of informationally efficient markets. The American economic review, 1980. 70(3): pp. 393–408.

Kleinberg, J.M. An impossibility theorem for clustering. in Advances in neural information processing systems. 2003.

Strawson, G., The impossibility of moral responsibility. Philosophical studies, 1994. 75(1): pp. 5–24.

Bazerman, M.H., K.P. Morgan, and G.F. Loewenstein, The impossibility of auditor independence. Sloan Management Review, 1997. 38: pp. 89–94.

List, C. and P. Pettit, Aggregating sets of judgments: An impossibility result. Economics & Philosophy, 2002. 18(1): pp. 89–110.

Dufour, J.-M., Some impossibility theorems in econometrics with applications to structural and dynamic models. Econometrica: Journal of the Econometric Society, 1997: pp. 1365–1387.

Calude, C.S. and K. Svozil, Is Feasibility in Physics Limited by Fantasy Alone?, in A Computable Universe: Understanding and Exploring Nature as Computation. 2013, World Scientific. pp. 539–547.

Lumbreras, S., The limits of machine ethics. Religions, 2017. 8(5): p. 100.

Shah, N.B. and D. Zhou. On the impossibility of convex inference in human computation. in Twenty-Ninth AAAI Conference on Artificial Intelligence. 2015.

Pagnia, H. and F.C. Gärtner, On the impossibility of fair exchange without a trusted third party. 1999, Technical Report TUD-BS-1999-02, Darmstadt University of Technology: Germany.

Popper, K. and D. Miller, A proof of the impossibility of inductive probability. Nature, 1983. 302(5910): pp. 687–688.

Van Dijk, M. and A. Juels, On the impossibility of cryptography alone for privacy-preserving cloud computing. HotSec, 2010. 10: pp. 1–8.

Goldwasser, S. and Y.T. Kalai. On the impossibility of obfuscation with auxiliary input. in 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS’05). 2005. IEEE.

Fekete, A., et al., The impossibility of implementing reliable communication in the face of crashes. Journal of the ACM (JACM), 1993. 40(5): pp. 1087–1107.

Strawson, G., The impossibility of moral responsibility. Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition, 1994. 75(1/2): pp. 5–24.

Fich, F. and E. Ruppert, Hundreds of impossibility results for distributed computing. Distributed computing, 2003. 16(2-3): pp. 121–163.

Kidron, D. and Y. Lindell, Impossibility results for universal composability in public-key models and with fixed inputs. Journal of cryptology, 2011. 24(3): pp. 517–544.

Lynch, N. A hundred impossibility proofs for distributed computing. in Proceedings of the eighth annual ACM Symposium on Principles of distributed computing. 1989.

Sprenger, J., Two impossibility results for measures of corroboration. The British Journal for the Philosophy of Science, 2018. 69(1): pp. 139–159.

Schmidt, B., P. Schaller, and D. Basin. Impossibility results for secret establishment. in 2010 23rd IEEE Computer Security Foundations Symposium. 2010. IEEE.

Fischer, M.J., N.A. Lynch, and M.S. Paterson, Impossibility of distributed consensus with one faulty process. Journal of the ACM (JACM), 1985. 32(2): pp. 374–382.

Barak, B., et al. On the (im) possibility of obfuscating programs. in Annual International Cryptology Conference. 2001. Springer.

Velupillai, K.V., The impossibility of an effective theory of policy in a complex economy, in Complexity hints for economic policy. 2007, Springer. pp. 273–290.

Schweizer, U., Universal possibility and impossibility results. Games and Economic Behavior, 2006. 57(1): pp. 73–85.

Roth, A.E., An impossibility result concerningn-person bargaining games. International Journal of Game Theory, 1979. 8(3): pp. 129–132.

Man, P.T. and S. Takayama, A unifying impossibility theorem. Economic Theory, 2013. 54(2): pp. 249–271.

Parks, R.P., An impossibility theorem for fixed preferences: a dictatorial Bergson-Samuelson welfare function. The Review of Economic Studies, 1976. 43(3): pp. 447–450.

Sen, A., The impossibility of a Paretian liberal. Journal of political economy, 1970. 78(1): pp. 152–157.

Anonymous, Control Theory, in Wikipedia. Retrieved June 18, 2020: Available at: https://en.wikipedia.org/wiki/Control_theory.

Klamka, J., Controllability of dynamical systems. A survey. Bulletin of the Polish Academy of Sciences: Technical Sciences, 2013. 61(2): pp. 335–342.

Klamka, J., Uncontrollability of composite systems. IEEE Transactions on Automatic Control, 1974. 19(3): pp. 280–281.

Wang, P., Invariance, uncontrollability, and unobservaility in dynamical systems. IEEE Transactions on Automatic Control, 1965. 10(3): pp. 366–367.

Klamka, J., Uncontrollability and unobservability of composite systems. IEEE Transactions on Automatic Control, 1973. 18(5): pp. 539–540.

Klamka, J., Uncontrollability and unobservability of multivariable systems. IEEE Transactions on Automatic Control, 1972. 17(5): pp. 725–726.

Milanese, M., Unidentifiability versus “actual” observability. IEEE Transactions on Automatic Control, 1976. 21(6): pp. 876–877.

Arkin, R., Governing lethal behavior in autonomous robots. 2009: CRC Press.

Conant, R.C. and W. Ross Ashby, Every good regulator of a system must be a model of that system. International journal of systems science, 1970. 1(2): pp. 89–97.

Ashby, W.R., An introduction to cybernetics. 1961: Chapman & Hall Ltd.

Ashby, M., How to apply the Ethical Regulator Theorem to crises. Acta Europeana Systemica (AES), 2018: p. 53.

Ashby, W.R., Requisite variety and its implications for the control of complex systems. Cybernetica 1 (2): 83–99. 1958.

Touchette, H. and S. Lloyd, Information-theoretic approach to the study of control systems. Physica A: Statistical Mechanics and its Applications, 2004. 331(1–2): pp. 140–172.

Touchette, H. and S. Lloyd, Information-theoretic limits of control. Physical review letters, 2000. 84(6): p. 1156.

Aliman, N.-M. and L. Kester, Requisite Variety in Ethical Utility Functions for AI Value Alignment. arXiv preprint arXiv:1907.00430, 2019.

McKeever, S. and M. Ridge, The many moral particularisms. Canadian Journal of Philosophy, 2005. 35(1): pp. 83–106.

Dancy, J., Moral reasons. 1993: Wiley-Blackwell.

McDowell, J., Virtue and reason. The monist, 1979. 62(3): pp. 331–350.

Rawls, J., A theory of justice. 1971: Harvard university press.

Little, M.O., Virtue as knowledge: objections from the philosophy of mind. Nous, 1997. 31(1): pp. 59–79.

Purves, D., R. Jenkins, and B.J. Strawser, Autonomous machines, moral judgment, and acting for the right reasons. Ethical Theory and Moral Practice, 2015. 18(4): pp. 851–872.

Valiant, L., Probably Approximately Correct: Nature’s Algorithms for Learning and Prospering in a Complex World. 2013: Basic Books (AZ).

Good, I.J., Ethical machines, in Intelligent Systems: Practice and Perspective, D.M. J. E. Hayes, and Y.-H. Pao, Editor. 1982, Ellis Horwood Limited: Chichester. pp. 555–560.

Bogosian, K., Implementation of Moral Uncertainty in Intelligent Machines. Minds and Machines, 2017. 27(4): pp. 591–608.

Eckersley, P., Impossibility and Uncertainty Theorems in AI Value Alignment (or why your AGI should not have a utility function). arXiv preprint arXiv:1901.00064, 2018.

Arrow, K.J., A difficulty in the concept of social welfare. Journal of political economy, 1950. 58(4): pp. 328–346.

Parfit, D., Reasons and persons. 1984: OUP Oxford.

Arrhenius, G., An impossibility theorem for welfarist axiologies. Economics & Philosophy, 2000. 16(2): pp. 247–266.

Arrhenius, G., The impossibility of a satisfactory population ethics, in Descriptive and normative approaches to human behavior. 2012, World Scientific. pp. 1–26.

Greaves, H., Population axiology. Philosophy Compass, 2017. 12(11): p. e12442.

Friedler, S.A., C. Scheidegger, and S. Venkatasubramanian, On the (im) possibility of fairness. arXiv preprint arXiv:1609.07236, 2016.

Miconi, T., The impossibility of “fairness”: a generalized impossibility result for decisions. arXiv preprint arXiv:1707.01195, 2017.

Kleinberg, J., S. Mullainathan, and M. Raghavan, Inherent trade-offs in the fair determination of risk scores. arXiv preprint arXiv:1609.05807, 2016.

Chouldechova, A., Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big data, 2017. 5(2): pp. 153–163.

Aaron, Impossibility Results in Fairness as Bayesian Inference, in Adventures in Computation. February 26, 2019: Available at: https://aaronsadventures.blogspot.com/2019/02/impossibility-results-in-fairness-as.html.

Rice, H.G., Classes of recursively enumerable sets and their decision problems. Transactions of the American Mathematical Society, 1953. 74(2): pp. 358–366.

Evans, D., On the impossibility of virus detection. 2017: Available at: http://www.cs.virginia.edu/evans/pubs/virus.pdf.

Selçuk, A.A., F. Orhan, and B. Batur. Undecidable problems in malware analysis. in 2017 12th International Conference for Internet Technology and Secured Transactions (ICITST). 2017. IEEE.

Anonymous, AI Safety Mindset, in Arbital. 2018: Available at: https://arbital.com/p/AI_safety_mindset/.

Yampolskiy, R.V. The space of possible mind designs. in International Conference on Artificial General Intelligence. 2015. Springer.

Baum, S., Superintelligence skepticism as a political tool. Information, 2018. 9(9): p. 209.

Yampolskiy, R.V. Taxonomy of Pathways to Dangerous Artificial Intelligence. in Workshops at the Thirtieth AAAI Conference on Artificial Intelligence. 2016.

Yampolskiy, R.V., Artificial Intelligence Safety Engineering: Why Machine Ethics is a Wrong Approach, in Philosophy and Theory of Artificial Intelligence (PT-AI2011). October 3–4, 2011: Thessaloniki, Greece.

Urban, T., Neuralink and the Brain’s Magical Future, in Wait But Why. April 20, 2017: Available at: https://waitbutwhy.com/2017/04/neuralink.html.

Smith, B.C., The limits of correctness. ACM SIGCAS Computers and Society, 1985. 14(1): pp. 18–26.

Rodd, M., Safe AI—is this possible? Engineering Applications of Artificial Intelligence, 1995. 8(3): pp. 243–250.

Carlson, K.W., Safe Artificial General Intelligence via Distributed Ledger Technology. arXiv preprint arXiv:1902.03689, 2019.

Demetis, D. and A.S. Lee, When humans using the IT artifact becomes IT using the human artifact. Journal of the Association for Information Systems, 2018. 19(10): p. 5.

Reyzin, L., Unprovability comes to machine learning. Nature, 2019.

Ben-David, S., et al., Learnability can be undecidable. Nature Machine Intelligence, 2019. 1(1): p. 44.

Reynolds, C. On the computational complexity of action evaluations. in 6th International Conference of Computer Ethics: Philosophical Enquiry (University of Twente, Enschede, The Netherlands, 2005). 2005.

Yampolskiy, R.V., Construction of an NP Problem with an Exponential Lower Bound. arXiv preprint arXiv:1111.0305, 2011.

Foster, D.P. and H.P. Young, On the impossibility of predicting the behavior of rational agents. Proceedings of the National Academy of Sciences, 2001. 98(22): pp. 12848–12853.

Kahneman, D., Thinking, fast and slow. 2011: Farrar, Straus and Giroux, New York.

Tarter, J., The search for extraterrestrial intelligence (SETI). Annual Review of Astronomy and Astrophysics, 2001. 39(1): pp. 511–548.

Carrigan Jr, R.A., Do potential SETI signals need to be decontaminated? Acta Astronautica, 2006. 58(2): pp. 112–117.

Hippke, M. and J.G. Learned, Interstellar communication. IX. Message decontamination is impossible. arXiv preprint arXiv:1802.02180, 2018.

Miller, J.D. and D. Felton, The Fermi paradox, Bayes’ rule, and existential risk management. Futures, 2017. 86: pp. 44–57.

Wolpert, D., Constraints on physical reality arising from a formalization of knowledge. arXiv preprint arXiv:1711.03499, 2017.

Wolpert, D.H., Physical limits of inference. Physica D: Nonlinear Phenomena, 2008. 237(9): pp. 1257–1281.

Wolpert, D.H., Computational capabilities of physical systems. Physical Review E, 2001. 65(1): p. 016128.

Yudkowsky, E., Safely aligning a powerful AGI is difficult, in Twitter. December 4, 2018: Available at: https://twitter.com/ESYudkowsky/status/1070095112791715846.

Yudkowsky, E., On Doing the Improbable, in Less Wrong. October 28, 2018: Available at: https://www.lesswrong.com/posts/st7DiQP23YQSxumCt/on-doing-the-improbable.

Soares, N., Talk at Google. April 12, 2017: Available at: https://intelligence.org/2017/04/12/ensuring/.

Garrabrant, S., Optimization Amplifies, in Less Wrong. June 26, 2018: Available at: https://www.lesswrong.com/posts/zEvqFtT4AtTztfYC4/optimization-amplifies.

Anonymous, Patch resistance, in Arbital. June 18, 2015: Available at: https://arbital.com/p/patch_resistant/.

Yudkowsky, E., Aligning an AGI adds significant development time, in Arbital. February 21, 2017: Available at: https://arbital.com/p/aligning_adds_time/.

Yudkowsky, E., Security Mindset and Ordinary Paranoia. November 25, 2017: Available at: https://intelligence.org/2017/11/25/security-mindset-ordinary-paranoia/.

Yudkowsky, E., Security Mindset and the Logistic Success Curve. November 26, 2017: Available at: https://intelligence.org/2017/11/26/security-mindset-and-the-logistic-success-curve/.

Soares, N., 2018 Update: Our New Research Directions. November 22, 2018: Available at: https://intelligence.org/2018/11/22/2018-update-our-new-research-directions/#section2.

Garrabrant, S. and A. Demski, Embedded Agency. November 15, 2018: Available at: https://www.lesswrong.com/posts/i3BTagvt3HbPMx6PN/embedded-agency-full-text-version.

Yudkowsky, E., The Rocket Alignment Problem, in Less Wrong. October 3, 2018: Available at: https://www.lesswrong.com/posts/Gg9a4y8reWKtLe3Tn/the-rocket-alignment-problem.

Hubinger, E., et al., Risks from Learned Optimization in Advanced Machine Learning Systems. arXiv preprint arXiv:1906.01820, 2019.

Hadfield-Menell, D., et al., Cooperative inverse reinforcement learning, in Advances in neural information processing systems. 2016. pp. 3909–3917.

Anonymous, Problem of fully updated deference, in Arbital. Retrieved April 4, 2020: Available at: https://arbital.com/p/updated_deference/.

Christiano, P., ALBA: An explicit proposal for aligned AI, in AI Alignment. February 23, 2016: Available at: https://ai-alignment.com/alba-an-explicit-proposal-for-aligned-ai-17a55f60bbcf.

Yudkowsky, E., Challenges to Christiano’s capability amplification proposal, in Less Wrong. May 19, 2018: Available at: https://www.lesswrong.com/posts/S7csET9CgBtpi7sCh/challenges-to-christiano-s-capability-amplification-proposal.

Christiano, P., Prize for probable problems, in Less Wrong. March 8, 2018: Available at: https://www.lesswrong.com/posts/SqcPWvvJJwwgZb6aH/prize-for-probable-problems.

Armstrong, S., The Limits to Corrigibility, in Less Wrong. April 10, 2018: Available at: https://www.lesswrong.com/posts/T5ZyNq3fzN59aQG5y/the-limits-of-corrigibility.

Armstrong, S., Problems with Amplification/Distillation, in Less Wrong. March 27, 2018: Available at: https://www.lesswrong.com/posts/ZyyMPXY27TTxKsR5X/problems-with-amplification-distillation.

Gwern, Why Tool AIs Want to Be Agent AIs: The Power of Agency. August 28, 2018: Available at: https://www.gwern.net/Tool-AI.

Yudkowsky, E., There’s No Fire Alarm for Artificial General Intelligence. October 14, 2017: Available at: https://intelligence.org/2017/10/13/fire-alarm/.

Yudkowsky, E., Inadequate Equilibria: Where and How Civilizations Get Stuck. 2017, Machine Intelligence Research Institute: Available at: https://equilibriabook.com/.

Yudkowsky, E., Cognitive biases potentially affecting judgment of global risks. Global catastrophic risks, 2008. 1(86): p. 13.

Bengio, Y., The fascinating Facebook debate between Yann LeCun, Stuart Russel and Yoshua Bengio about the risks of strong AI. October 7, 2019: Available at: http://www.parlonsfutur.com/blog/the-fascinating-facebook-debate-between-yann-lecun-stuart-russel-and-yoshua.

Faggella, D., AI Value Alignment isn’t a Problem if We Don’t Coexist. March 8, 2019: Available at: https://danfaggella.com/ai-value-alignment-isnt-a-problem-if-we-dont-coexist/.

Turchin, A., AI Alignment Problem: “Human Values” don’t Actually Exist, in Less Wrong. 2019: Available at: https://www.lesswrong.com/posts/ngqvnWGsvTEiTASih/ai-alignment-problem-human-values-don-t-actually-exist.

Burden, J. and J. Hernández-Orallo, Exploring AI Safety in Degrees: Generality, Capability and Control, in SafeAI. February 7, 2020: New York, USA.

Yampolskiy, R.V., Behavioral Modeling: an Overview. American Journal of Applied Sciences, 2008. 5(5): pp. 496–503.

Steven, Agents That Learn From Human Behavior Can’t Learn Human Values That Humans Haven’t Learned Yet, in Less Wrong. July 10, 2018: Available at: https://www.lesswrong.com/posts/DfewqowdzDdCD7S9y/agents-that-learn-from-human-behavior-can-t-learn-human.

Ng, A.Y. and S.J. Russell, Algorithms for inverse reinforcement learning, in Seventeenth International Conference on Machine Learning (ICML). 2000. pp. 663–670.

Amin, K. and S. Singh, Towards resolving unidentifiability in inverse reinforcement learning. arXiv preprint arXiv:1601.06569, 2016.

Babcock, J., J. Kramar, and R.V. Yampolskiy, Guidelines for Artificial Intelligence Containment. arXiv preprint arXiv:1707.08476, 2017.

Pittman, J.M. and C.E. Soboleski, A cyber science based ontology for artificial general intelligence containment. arXiv preprint arXiv:1801.09317, 2018.

Chalmers, D., The singularity: A philosophical analysis. Science fiction and philosophy: From time travel to superintelligence, 2009: pp. 171–224.

Pittman, J.M., J.P. Espinoza, and C.S. Crosby, Stovepiping and Malicious Software: A Critical Review of AGI Containment. arXiv preprint arXiv:1811.03653, 2018.

Arnold, T. and M. Scheutz, The “big red button” is too late: an alternative model for the ethical evaluation of AI systems. Ethics and Information Technology, 2018. 20(1): pp. 59–69.

Omohundro, S.M., The Basic AI Drives, in Proceedings of the First AGI Conference, Volume 171, Frontiers in Artificial Intelligence and Applications, P. Wang, B. Goertzel, and S. Franklin (eds.). February 2008, IOS Press.

Orseau, L. and S. Armstrong, Safely interruptible agents. 2016: Available at: https://intelligence.org/files/Interruptibility.pdf.

Riedl, M., Big Red Button. Retrieved on January 23, 2020: Available at: https://markriedl.github.io/big-red-button/.

Goertzel, B., Does humanity need an AI nanny, in H+ Magazine. 2011.

de Garis, H., The artilect war: Cosmists vs. Terrans. 2005, Palm Springs, CA: ETC Publications.

Legg, S., Unprovability of Friendly AI, in Vetta Project. September 15, 2006: Available at: http://www.vetta.org/2006/09/unprovability-of-friendly-ai/.

Yudkowsky, E., Open problems in friendly artificial intelligence, in Singularity Summit. 2011: New York.

Yudkowsky, E., Timeless decision theory. 2010: Available at: http://singinst.org/upload/TDT-v01o.pdf.

Drescher, G., Good and real: Demystifying paradoxes from physics to ethics. Bradford Books. 2006, Cambridge, MA: MIT Press.

Yampolskiy, R. and J. Fox, Safety Engineering for Artificial General Intelligence. Topoi, 2012: pp. 1–10.

Vinge, V. Technological singularity. in VISION-21 Symposium sponsored by NASA Lewis Research Center and the Ohio Aerospace Institute. 1993.

Anonymous, Cognitive Uncontainability, in Arbital. Retrieved May 19, 2019: Available at: https://arbital.com/p/uncontainability/.

Yampolskiy, R.V., Unpredictability of AI: On the Impossibility of Accurately Predicting All Actions of a Smarter Agent. Journal of Artificial Intelligence and Consciousness, 2020. 7(01): pp. 109–118.

Buiten, M.C., Towards intelligent regulation of Artificial Intelligence. European Journal of Risk Regulation, 2019. 10(1): pp. 41–59.

Yampolskiy, R., Unexplainability and Incomprehensibility of Artificial Intelligence. arXiv:1907.03869 2019.

Charlesworth, A., Comprehending software correctness implies comprehending an intelligence-related limitation. ACM Transactions on Computational Logic (TOCL), 2006. 7(3): pp. 590–612.

Charlesworth, A., The comprehensibility theorem and the foundations of artificial intelligence. Minds and Machines, 2014. 24(4): pp. 439–476.

Hernández-Orallo, J. and N. Minaya-Collado. A formal definition of intelligence based on an intensional variant of algorithmic complexity. in Proceedings of International Symposium of Engineering of Intelligent Systems (EIS98). 1998.

Li, M. and P. Vitányi, An introduction to Kolmogorov complexity and its applications. Vol. 3. 1997: Springer.

Trakhtenbrot, B.A., A Survey of Russian Approaches to Perebor (Brute-Force Searches) Algorithms. IEEE Annals of the History of Computing, 1984. 6(4): pp. 384–400.

Goertzel, B., The Singularity Institute’s Scary Idea (and Why I Don’t Buy It). October 29, 2010: Available at: http://multiverseaccordingtoben.blogspot.com/2010/10/singularity-institutes-scary-idea-and.html.

Legg, S., Unprovability of Friendly AI. September 2006: Available at: https://web.archive.org/web/20080525204404/http://www.vetta.org/2006/09/unprovability-of-friendly-ai/.

Bieger, J., K.R. Thórisson, and P. Wang. Safe baby AGI. in International Conference on Artificial General Intelligence. 2015. Springer.

Herley, C., Unfalsifiability of security claims. Proceedings of the National Academy of Sciences, 2016. 113(23): pp. 6415–6420.

Yampolskiy, R.V., What are the ultimate limits to computational techniques: verifier theory and unverifiability. Physica Scripta, 2017. 92(9): p. 093001.

Muehlhauser, L., Gerwin Klein on Formal Methods, in Intelligence.org. February 11, 2014: Available at: https://intelligence.org/2014/02/11/gerwin-klein-on-formal-methods/.

Muehlhauser, L., Mathematical Proofs Improve But Don’t Guarantee Security, Safety, and Friendliness, in Intelligence.org. October 3, 2013: Available at: https://intelligence.org/2013/10/03/proofs/.

Jilk, D.J., Limits to Verification and Validation of Agentic Behavior. arXiv preprint arXiv:1604.06963, 2016.

Jilk, D.J., et al., Anthropomorphic reasoning about neuromorphic AGI safety. Journal of Experimental & Theoretical Artificial Intelligence, 2017. 29(6): pp. 1337–1351.

Fetzer, J.H., Program verification: the very idea. Communications of the ACM, 1988. 31(9): pp. 1048–1063.

Petke, J., et al., Genetic improvement of software: a comprehensive survey. IEEE Transactions on Evolutionary Computation, 2017. 22(3): pp. 415–432.

Yampolskiy, R.V., Utility Function Security in Artificially Intelligent Agents. Journal of Experimental and Theoretical Artificial Intelligence (JETAI), 2014: pp. 1–17.

Everitt, T., et al., Reinforcement learning with a corrupted reward channel. arXiv preprint arXiv:1705.08417, 2017.

Lanzarone, G.A. and F. Gobbo, Is computer ethics computable? Living, Working and Learning Beyond, 2008: p. 530.

Moor, J.H., Is ethics computable? Metaphilosophy, 1995. 26(1/2): pp. 1–21.

Allen, C., G. Varner, and J. Zinser, Prolegomena to any future artificial moral agent. Journal of Experimental and Theoretical Artificial Intelligence, 2000. 12: pp. 251–261.

Weld, D. and O. Etzioni. The first law of robotics (a call to arms). in Proceedings of the Twelfth AAAI National Conference on Artificial Intelligence. 1994.

Brundage, M., Limitations and risks of machine ethics. Journal of Experimental & Theoretical Artificial Intelligence, 2014. 26(3): pp. 355–372.

Russell, S., The purpose put into the machine, in Possible minds: twenty-five ways of looking at AI. 2019, Penguin Press. pp. 20–32.

Calude, C.S., E. Calude, and S. Marcus, Passages of proof. arXiv preprint math/0305213, 2003.

Aliman, N.-M., et al., Error-Correction for AI Safety, in Artificial General Intelligence (AGI20). June 23–26, 2020: St. Petersburg, Russia.

Wiener, N., Some moral and technical consequences of automation. Science, 1960. 131(3410): pp. 1355–1358.

Versenyi, L., Can robots be moral? Ethics, 1974. 84(3): pp. 248–259.

Yampolskiy, R., Turing Test as a Defining Feature of AI-Completeness, in Artificial Intelligence, Evolutionary Computing and Metaheuristics, X.-S. Yang, Editor. 2013, Springer Berlin Heidelberg. pp. 3–17.

Yampolskiy, R.V., AI-Complete CAPTCHAs as Zero Knowledge Proofs of Access to an Artificially Intelligent System. ISRN Artificial Intelligence, 2011. 271878.

Yampolskiy, R.V., AI-Complete, AI-Hard, or AI-Easy–Classification of Problems in AI. The 23rd Midwest Artificial Intelligence and Cognitive Science Conference, Cincinnati, OH, USA, 2012.

Brown, T.B., et al., Language models are few-shot learners. arXiv preprint arXiv:2005.14165, 2020.

Yampolskiy, R.V., Efficiency Theory: a Unifying Theory for Information, Computation and Intelligence. Journal of Discrete Mathematical Sciences & Cryptography, 2013. 16(4-5): pp. 259–277.

Ziesche, S. and R.V. Yampolskiy, Towards the Mathematics of Intelligence. The Age of Artificial Intelligence: An Exploration, 2020: p. 1.

Drexler, K.E., Reframing Superintelligence: Comprehensive AI Services as General Intelligence, in Technical Report #2019-1, Future of Humanity Institute, University of Oxford. 2019: Available at: https://www.fhi.ox.ac.uk/wp-content/uploads/Reframing_Superintelligence_FHI-TR-2019-1.1-1.pdf.

Minsky, M., Society of mind. 1988: Simon and Schuster.

Soares, N., et al. Corrigibility. in Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence. 2015.

Lipton, R., Are Impossibility Proofs Possible?, in Gödel’s Lost Letter and P=NP. September 13, 2009: Available at: https://rjlipton.wordpress.com/2009/09/13/are-impossibility-proofs-possible/.

Wadman, M., US biologists adopt cloning moratorium. 1997, Nature Publishing Group.

Sauer, F., Stopping ‘Killer Robots’: Why Now Is the Time to Ban Autonomous Weapons Systems. Arms Control Today, 2016. 46(8): pp. 8–13.

Ashby, M. Ethical regulators and super-ethical systems. in Proceedings of the 61st Annual Meeting of the ISSS-2017 Vienna, Austria. 2017.

How to Cite

Yampolskiy RV. On the Controllability of Artificial Intelligence: An Analysis of Limitations. JCSANDM [Internet]. 2022 May 25;11(03):321–404. Available from: https://journals.riverpublishers.com/index.php/JCSANDM/article/view/16219