An Attempt at Friendly Ai

An Attempt at Friendly AI
By Kevin Mordovanec
Existential Risk
● Anything that threatens humanity’s survival
● Climate change, overpopulation, nuclear war
● Most important problem ever faced
● You know about the ones mentioned
● Not as talked about: AI
Introduction to the Singularity
● What is AI?
● Moore’s Law
● AI will get smarter than us
● It’ll hold godlike power
● 42% experts agree (by 2030)
● 90% probability by 2075
Introduction to Terms
● Artificial Intelligence (AI)
○ A human-made program made to a complete a task
● Artificial General Intelligence (AGI)
○ AI with problem-solving capabilities equal to or superior to a single human
● Artificial Superintelligence (ASI)
○ AI with problem-solving capabilities several orders of magnitude beyond a single human
● Singularity
○ The set of societal changes that will happen due to the emergence of ASI
● Friendly Artificial Intelligence (FAI)
○ AI that has goals which align with those of humans
● Unfriendly Artificial Intelligence (UFAI)
○ AI that has goals which conflict with those of humans and are most likely actively harmful
Paperclip Maximizers
● AI acts only toward preprogrammed goals
● Imagine AI that makes paperclips
● Kills all life, turns into paperclips
● 31% experts agree on bad outcome
● Building with 31% chance of fire
Other Possible Disasters
● Maximize happiness: convert everything to hedonium
● Solve problems, but not implement: computronium
Fictional Scenarios
● Avengers: Age of Ultron
● I, Robot
● Failed love utopia
● Friendship is Optimal
(I promise this image is relevant.)

What does this tell us?
● Human morality is complicated
● No single principle
● Gives us an idea of difficulty level
● Use neuroscience/psychology to find roots
Some basic rules
● Prevent/monitor additional superintelligences
● Libertarian
● Open to change
● Secular
Reflective Equilibrium
● We want consistent moral beliefs
● AI should represent most people possible
● How do people resolve moral inconsistencies?
● Test them against other beliefs
● Add new ones, discard inconsistent ones
● Use this process to reach consensus
Fun Theory
● Yudkowsky’s articles on ideal life
● Challenge makes life worth living
● Use caution about changing brain
processes
● Better to make scientific discoveries
yourself
● People like having control
● Should amount to more than games
A Closer Look
● A few rough sketches
● Not fully developed
● We take closer look at mind
● Closer look at AI workings
Five Ws
● What: A basic design for an AI that adheres as closely as possible to our sense of
morality
● Who: This innovation would impact literally everyone on the planet
● Why: The entire nature of our future depends on this innovation
● When: We do not yet have the technology to form a full design, but we can do
much research in one year which will lay the groundwork for that design
● Where: Every single geographic region not only on Earth, but possibly in this region
of the Universe will be impacted
Q&A
● Q: AI is only in the beginning stage compared to what you’re talking about. How
would you program this?
○ A: We can always lay down some basic guidelines which we can update on as we gain knowledge in
the future.
● Q: How do you know the Singularity will happen?
○ A: We don’t, but many professionals in the field have confidence that it will. Considering how high
the stakes are, should we not take some precautions?
● Q: How do you know your values are the right ones?
○ A: Almost everyone agrees that extinction would be a bad thing. Designing an AI with wrong, but
non-existence-threatening values would be better than programming no moral values at all.
● Q: Don’t these values represent a very liberal, western mindset?
○ A: These values are the best way to preserve cultural diversity, which is by letting preserving the
rights of the individual, which makes them appear to be western and liberal. This is a defense of
other value systems, not an attack on them.
Q&A
● Q: Won’t an AI of superior intelligence realize that our morality is illogical?
○ A: You can’t use logic to choose your end goals. No moral code is “logical” or “illogical” aside from
how internally consistent it is, because there is no outside standard or method (aside from reflective
equilibrium) to determine which ethical rules are better than others, and even if there was, an AI
would have no reason to follow it.
● Q: But won’t the AI have access to its own programming, and therefore be able to
modify its own goals?
○ An AI and its preprogrammed goals are not separate entities, no matter how intelligent the AI is, and
every action that the AI takes would be toward those goals. The AI would thus have no reason to
modify its own goals, aside from making them more consistent with each other so that they are
more achievable.
Recap
● Goals depend on programming
● We want AI to use morality
● We’re facing extinction or revolution
● This happens soon
● Think carefully
Works Cited
● Bostrom, Nick, and Vincent Müller. "Future Progress in Artificial Intelligence: A
Survey of Expert Opinion." Springer (2014): 4+. Nick Bostrom. Future of Humanity
Institute, 2014. Web.
● Bostrom, Nick. "Existential Risks: Analyzing Human Extinction Scenarios and
Related Hazards." Journal of Evolution and Technology 9.1 (2002): 1-26. Web.
● Burkhardt, Casey. "THE TRAJECTORY TO THE "TECHNOLOGICAL SINGULARITY"."
The Research Center on Computing and Society. Southern Connecticut State
University, 2014. Web.
● Barrat, James. Our Final Invention: Artificial Intelligence and the End of the Human
Era. Thomas Dunne Books, 2015.
Works Cited
● CelestAI. Digital image. Fimfiction. Fimfiction, 25 Dec. 2012. Web.
● Daniels, Norman. "Reflective Equilibrium." Stanford Encyclopedia of Philosophy.
Stanford University, 28 Apr. 2003. Web.
● Human Brain. Digital image. BBC. British Broadcasting Corporation, n.d. Web.
● Li, Michael Siyang. "Keeping Up with Moore’s Law." DUJS Online. Dartmouth
College, 29 May 2013. Web.
● Yudkowsky, Eliezer. "31 Laws of Fun." Less Wrong. N.p., 26 Jan. 2009. Web.
● Yudkowsky, Eliezer. "Coherent Extrapolated Volition." (2004): 1-6. Machine
Intelligence Research Institute. 2004. Web.
● “Prolegomena to a Theory of Fun.” Less Wrong, 17 Dec. 2008,
lesswrong.com/lw/wv/prolegomena_to_a_theory_of_fun/.
● “Computronium.” Less Wrong, 4 Sept. 2012,
wiki.lesswrong.com/wiki/Computronium.
●
Works Cited
● Bostrom, Nick. “Ethical Issues in Advanced Artificial Intelligence.” Ethical Issues In
Advanced Artificial Intelligence, 2003, nickbostrom.com/ethics/ai.html.
● Crofford, Lori. “Pile of Paperclips.” Mix 94.1, 25 Jan. 2012,
mix941kmxj.com/dentist-pleads-guilty-to-medicaid-fraud-for-using-paper-clips-fo
r-root-canals/.
● Maksimov, Aleksandr. “Matrioshka Brain.” Youtube, Google, 24 May 2013,
www.youtube.com/watch?v=Pixs_IdvMPg.
● Zambetta, Fabio. “William Riker (Jonathan Frakes) Entering a Holodeck
Simulation.” Perth Now, 30 Mar. 2017,
www.perthnow.com.au/news/star-treks-holodeck-from-science-fiction-to-a-new-r
eality/news-story/17f3721a6b4a4c05b8078bf851b92fa4.
Works Cited
● Kille, Leighton Walter. “Smoke Stack.” Journalist's Resource, 24 Jan. 2016,
journalistsresource.org/studies/environment/climate-change/research-global-war
ming-meaning-use-terms.
● “PT-AI Impact Assessment.” Philosophy & Theory of Artificial Intelligence,
www.pt-ai.org/polls/experts.

An Attempt at Friendly Ai

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

An Attempt at Friendly Ai

Uploaded by

Copyright:

Available Formats

An Attempt at Friendly AI

(I promise this image is relevant.)

You might also like