Artificial Intelligence
A Modern Approach
Fourth Edition
PEARSON SERIES
IN ARTIFICIAL INTELLIGENCE
Stuart Russell and Peter Norvig, Editors
Artificial Intelligence
A Modern Approach
Fourth Edition
Stuart J. Russell and Peter Norvig
Contributing writers:
Ming-Wei Chang
Jacob Devlin
Anca Dragan
David Forsyth
Ian Goodfellow
Jitendra M. Malik
Vikash Mansinghka
Judea Pearl
Michael Wooldridge
Copyright © 2021, 2010, 2003 by Pearson Education, Inc. or its affiliates, 221 River Street,
Hoboken, NJ 07030. All Rights Reserved. Manufactured in the United States of America.
This publication is protected by copyright, and permission should be obtained from the
publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission
in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise.
For information regarding permissions, request forms, and the appropriate contacts within
the Pearson Education Global Rights and Permissions department, please visit
www.pearsoned.com/permissions/.
Acknowledgments of third-party content appear on the appropriate page within the text.
Cover Images:
Alan Turing – Science History Images/Alamy Stock Photo
Statue of Aristotle – Panos Karas/Shutterstock
Ada Lovelace – Pictorial Press Ltd/Alamy Stock Photo
Autonomous cars – Andrey Suslov/Shutterstock
Atlas Robot – Boston Dynamics, Inc.
Berkeley Campanile and Golden Gate Bridge – Ben Chu/Shutterstock
Background ghosted nodes – Eugene Sergeev/Alamy Stock Photo
Chess board with chess figure – Titania/Shutterstock
Mars Rover – Stocktrek Images, Inc./Alamy Stock Photo
Kasparov – KATHY WILLENS/AP Images
PEARSON, ALWAYS LEARNING is an exclusive trademark owned by Pearson Education,
Inc. or its affiliates in the U.S. and/or other countries.
Unless otherwise indicated herein, any third-party trademarks, logos, or icons that may
appear in this work are the property of their respective owners, and any references to third-
party trademarks, logos, icons, or other trade dress are for demonstrative or descriptive
purposes only. Such references are not intended to imply any sponsorship, endorsement,
authorization, or promotion of Pearson’s products by the owners of such marks, or any
relationship between the owner and Pearson Education, Inc., or its affiliates, authors,
licensees, or distributors.
Library of Congress Cataloging-in-Publication Data
Names: Russell, Stuart J. (Stuart Jonathan), author. | Norvig, Peter, author.
Title: Artificial intelligence : a modern approach / Stuart J. Russell and Peter Norvig.
Description: Fourth edition. | Hoboken : Pearson, [2021] | Series: Pearson series in artificial
intelligence | Includes bibliographical references and index. | Summary: “Updated edition
of popular textbook on Artificial Intelligence.”— Provided by publisher.
Identifiers: LCCN 2019047498 | ISBN 9780134610993 (hardcover)
Subjects: LCSH: Artificial intelligence.
Classification: LCC Q335 .R86 2021 | DDC 006.3–dc23
LC record available at https://lccn.loc.gov/2019047498
ISBN-10: 0-13-461099-7
ISBN-13: 978-0-13-461099-3
For Loy, Gordon, Lucy, George, and Isaac — S.J.R.
For Kris, Isabella, and Juliet — P.N.
Preface
Artificial Intelligence (AI) is a big field, and this is a big book. We have tried to explore the
full breadth of the field, which encompasses logic, probability, and continuous mathematics;
perception, reasoning, learning, and action; fairness, trust, social good, and safety; and
applications that range from microelectronic devices to robotic planetary explorers to online
services with billions of users.
The subtitle of this book is “A Modern Approach.” That means we have chosen to tell the
story from a current perspective. We synthesize what is now known into a common
framework, recasting early work using the ideas and terminology that are prevalent today.
We apologize to those whose subfields are, as a result, less recognizable.
New to this edition
This edition reflects the changes in AI since the last edition in 2010:
We focus more on machine learning rather than hand-crafted knowledge engineering,
due to the increased availability of data, computing resources, and new algorithms.
Deep learning, probabilistic programming, and multiagent systems receive expanded
coverage, each with their own chapter.
The coverage of natural language understanding, robotics, and computer vision has
been revised to reflect the impact of deep learning.
The robotics chapter now includes robots that interact with humans and the application
of reinforcement learning to robotics.
Previously we defined the goal of AI as creating systems that try to maximize expected
utility, where the specific utility information—the objective—is supplied by the human
designers of the system. Now we no longer assume that the objective is fixed and
known by the AI system; instead, the system may be uncertain about the true objectives
of the humans on whose behalf it operates. It must learn what to maximize and must
function appropriately even while uncertain about the objective.
We increase coverage of the impact of AI on society, including the vital issues of ethics,
fairness, trust, and safety.
We have moved the exercises from the end of each chapter to an online site. This allows
us to continuously add to, update, and improve the exercises, to meet the needs of
instructors and to reflect advances in the field and in AI-related software tools.
Overall, about 25% of the material in the book is brand new. The remaining 75% has
been largely rewritten to present a more unified picture of the field. 22% of the citations
in this edition are to works published after 2010.
Overview of the book
The main unifying theme is the idea of an intelligent agent. We define AI as the study of
agents that receive percepts from the environment and perform actions. Each such agent
implements a function that maps percept sequences to actions, and we cover different ways
to represent these functions, such as reactive agents, real-time planners, decision-theoretic
systems, and deep learning systems. We emphasize learning both as a construction method
for competent systems and as a way of extending the reach of the designer into unknown
environments. We treat robotics and vision not as independently defined problems, but as
occurring in the service of achieving goals. We stress the importance of the task
environment in determining the appropriate agent design.
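To make the agent-function idea concrete, here is a minimal Python sketch (with illustrative names, not code from the book's repository): a table-driven agent implements the mapping from percept sequences to actions as an explicit lookup table. This works in principle, but the table needs an entry for every possible percept sequence, which is why more compact representations such as those listed above are needed in practice.

def make_table_driven_agent(table):
    # Return an agent function backed by an explicit percept-sequence table.
    percepts = []  # the percept sequence observed so far
    def agent(percept):
        percepts.append(percept)
        # Look up the action for the entire percept sequence to date.
        return table.get(tuple(percepts))
    return agent

# Toy usage: a two-percept environment.
agent = make_table_driven_agent({
    ("hot",): "recoil",
    ("hot", "cool"): "approach",
})
assert agent("hot") == "recoil"
assert agent("cool") == "approach"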
Our primary aim is to convey the ideas that have emerged over the past seventy years of AI
research and the past two millennia of related work. We have tried to avoid excessive
formality in the presentation of these ideas, while retaining precision. We have included
mathematical formulas and pseudocode algorithms to make the key ideas concrete;
mathematical concepts and notation are described in Appendix A and our pseudocode is
described in Appendix B .
This book is primarily intended for use in an undergraduate course or course sequence. The
book has 28 chapters, each requiring about a week’s worth of lectures, so working through
the whole book requires a two-semester sequence. A one-semester course can use selected
chapters to suit the interests of the instructor and students. The book can also be used in a
graduate-level course (perhaps with the addition of some of the primary sources suggested
in the bibliographical notes), or for self-study or as a reference.
Throughout the book, important points are marked with a triangle icon in the margin.
Wherever a new term is defined, it is also noted in the margin.
1.1.4 Acting rationally: The rational agent approach
Agent
An agent is just something that acts (agent comes from the Latin agere, to do). Of course, all
computer programs do something, but computer agents are expected to do more: operate
autonomously, perceive their environment, persist over a prolonged time period, adapt to
change, and create and pursue goals. A rational agent is one that acts so as to achieve the
best outcome or, when there is uncertainty, the best expected outcome.
Rational agent
In the “laws of thought” approach to AI, the emphasis was on correct inferences. Making
correct inferences is sometimes part of being a rational agent, because one way to act
rationally is to deduce that a given action is best and then to act on that conclusion. On the
other hand, there are ways of acting rationally that cannot be said to involve inference. For
example, recoiling from a hot stove is a reflex action that is usually more successful than a
slower action taken after careful deliberation.
All the skills needed for the Turing test also allow an agent to act rationally. Knowledge
representation and reasoning enable agents to reach good decisions. We need to be able to
generate comprehensible sentences in natural language to get by in a complex society. We
need learning not only for erudition, but also because it improves our ability to generate
effective behavior, especially in circ*mstances that are new.
The rational-agent approach to AI has two advantages over the other approaches. First, it is
more general than the “laws of thought” approach because correct inference is just one of
several possible mechanisms for achieving rationality. Second, it is more amenable to
scientific development. The standard of rationality is mathematically well defined and
completely general. We can often work back from this specification to derive agent designs
that provably achieve it—something that is largely impossible if the goal is to imitate human
behavior or thought processes.
For these reasons, the rational-agent approach to AI has prevailed throughout most of the
field’s history. In the early decades, rational agents were built on logical foundations and
formed definite plans to achieve specific goals. Later, methods based on probability theory
and machine learning allowed the creation of agents that could make decisions under
uncertainty to attain the best expected outcome. In a nutshell, AI has focused on the study and
construction of agents that do the right thing. What counts as the right thing is defined by the
objective that we provide to the agent. This general paradigm is so pervasive that we might
call it the standard model. It prevails not only in AI, but also in control theory, where a
controller minimizes a cost function; in operations research, where a policy maximizes a
sum of rewards; in statistics, where a decision rule minimizes a loss function; and in
economics, where a decision maker maximizes utility or some measure of social welfare.
Do the right thing
Standard model
We need to make one important refinement to the standard model to account for the fact
that perfect rationality—always taking the exactly optimal action—is not feasible in complex
environments. The computational demands are just too high. Chapters 5 and 17 deal
with the issue of limited rationality—acting appropriately when there is not enough time to
do all the computations one might like. However, perfect rationality often remains a good
starting point for theoretical analysis.
Limited rationality
1.1.5 Beneficial machines
The standard model has been a useful guide for AI research since its inception, but it is
probably not the right model in the long run. The reason is that the standard model assumes
that we will supply a fully specified objective to the machine.
For an artificially defined task such as chess or shortest-path computation, the task comes
with an objective built in—so the standard model is applicable. As we move into the real
world, however, it becomes more and more difficult to specify the objective completely and
correctly. For example, in designing a self-driving car, one might think that the objective is
to reach the destination safely. But driving along any road incurs a risk of injury due to other
errant drivers, equipment failure, and so on; thus, a strict goal of safety requires staying in
the garage. There is a tradeoff between making progress towards the destination and
incurring a risk of injury. How should this tradeoff be made? Furthermore, to what extent
can we allow the car to take actions that would annoy other drivers? How much should the
car moderate its acceleration, steering, and braking to avoid shaking up the passenger?
These kinds of questions are difficult to answer a priori. They are particularly problematic in
the general area of human–robot interaction, of which the self-driving car is one example.
The problem of achieving agreement between our true preferences and the objective we put
into the machine is called the value alignment problem: the values or objectives put into
the machine must be aligned with those of the human. If we are developing an AI system in
the lab or in a simulator—as has been the case for most of the field’s history—there is an
easy fix for an incorrectly specified objective: reset the system, fix the objective, and try
again. As the field progresses towards increasingly capable intelligent systems that are
deployed in the real world, this approach is no longer viable. A system deployed with an
incorrect objective will have negative consequences. Moreover, the more intelligent the
system, the more negative the consequences.
Value alignment problem
Returning to the apparently unproblematic example of chess, consider what happens if the
machine is intelligent enough to reason and act beyond the confines of the chessboard. In
that case, it might attempt to increase its chances of winning by such ruses as hypnotizing or
blackmailing its opponent or bribing the audience to make rustling noises during its
opponent’s thinking time. It might also attempt to hijack additional computing power for
itself. These behaviors are not “unintelligent” or “insane”; they are a logical consequence of defining
winning as the sole objective for the machine.
3 In one of the first books on chess, Ruy Lopez (1561) wrote, “Always place the board so the sun is in your opponent’s eyes.”
It is impossible to anticipate all the ways in which a machine pursuing a fixed objective
might misbehave. There is good reason, then, to think that the standard model is
inadequate. We don’t want machines that are intelligent in the sense of pursuing their
objectives; we want them to pursue our objectives. If we cannot transfer those objectives
perfectly to the machine, then we need a new formulation—one in which the machine is
pursuing our objectives, but is necessarily uncertain as to what they are. When a machine
knows that it doesn’t know the complete objective, it has an incentive to act cautiously, to
ask permission, to learn more about our preferences through observation, and to defer to
human control. Ultimately, we want agents that are provably beneficial to humans. We will
return to this topic in Section 1.5.
Provably beneficial
1.2 The Foundations of Artificial Intelligence
In this section, we provide a brief history of the disciplines that contributed ideas,
viewpoints, and techniques to AI. Like any history, this one concentrates on a small number
of people, events, and ideas and ignores others that also were important. We organize the
history around a series of questions. We certainly would not wish to give the impression
that these questions are the only ones the disciplines address or that the disciplines have all
been working toward AI as their ultimate fruition.
1.2.1 Philosophy
Can formal rules be used to draw valid conclusions?
How does the mind arise from a physical brain?
Where does knowledge come from?
How does knowledge lead to action?
Aristotle (384–322 BCE) was the first to formulate a precise set of laws governing the rational
part of the mind. He developed an informal system of syllogisms for proper reasoning,
which in principle allowed one to generate conclusions mechanically, given initial premises.
Ramon Llull (c. 1232–1315) devised a system of reasoning published as Ars Magna or The
Great Art (1305). Llull tried to implement his system using an actual mechanical device: a set
of paper wheels that could be rotated into different permutations.
Around 1500, Leonardo da Vinci (1452–1519) designed but did not build a mechanical
calculator; recent reconstructions have shown the design to be functional. The first known
calculating machine was constructed around 1623 by the German scientist Wilhelm
Schickard (1592–1635). Blaise Pascal (1623–1662) built the Pascaline in 1642 and wrote that
it “produces effects which appear nearer to thought than all the actions of animals.”
Gottfried Wilhelm Leibniz (1646–1716) built a mechanical device intended to carry out
operations on concepts rather than numbers, but its scope was rather limited. In his 1651
book Leviathan, Thomas Hobbes (1588–1679) suggested the idea of a thinking machine, an
“artificial animal” in his words, arguing “For what is the heart but a spring; and the nerves,
but so many strings; and the joints, but so many wheels.” He also suggested that reasoning
was like numerical computation: “For ‘reason’ ... is nothing but ‘reckoning,’ that is adding
and subtracting.”
It’s one thing to say that the mind operates, at least in part, according to logical or numerical
rules, and to build physical systems that emulate some of those rules. It’s another to say that
the mind itself is such a physical system. René Descartes (1596–1650) gave the first clear
discussion of the distinction between mind and matter. He noted that a purely physical
conception of the mind seems to leave little room for free will. If the mind is governed
entirely by physical laws, then it has no more free will than a rock “deciding” to fall
downward. Descartes was a proponent of dualism. He held that there is a part of the human
mind (or soul or spirit) that is outside of nature, exempt from physical laws. Animals, on the
other hand, did not possess this dual quality; they could be treated as machines.
Dualism
An alternative to dualism is materialism, which holds that the brain’s operation according
to the laws of physics constitutes the mind. Free will is simply the way that the perception of
available choices appears to the choosing entity. The terms physicalism and naturalism are
also used to describe this view that stands in contrast to the supernatural.
Given a physical mind that manipulates knowledge, the next problem is to establish the
source of knowledge. The empiricism movement, starting with Francis Bacon’s (1561–1626)
Novum Organum, is characterized by a dictum of John Locke (1632–1704): “Nothing is in
the understanding, which was not first in the senses.”
4 The Novum Organum is an update of Aristotle’s Organon, or instrument of thought.
Empiricism
Induction
David Hume’s (1711–1776) A Treatise of Human Nature (Hume, 1739) proposed what is now
known as the principle of induction: that general rules are acquired by exposure to repeated
associations between their elements.
Building on the work of Ludwig Wittgenstein (1889–1951) and Bertrand Russell (1872–
1970), the famous Vienna Circle (Sigmund, 2017), a group of philosophers and mathematicians
meeting in Vienna in the 1920s and 1930s, developed the doctrine of logical positivism.
This doctrine holds that all knowledge can be characterized by logical theories connected,
ultimately, to observation sentences that correspond to sensory inputs; thus logical
positivism combines rationalism and empiricism.
Logical positivism
Observation sentence
The confirmation theory of Rudolf Carnap (1891–1970) and Carl Hempel (1905–1997)
attempted to analyze the acquisition of knowledge from experience by quantifying the
degree of belief that should be assigned to logical sentences based on their connection to
observations that confirm or disconfirm them. Carnap’s book The Logical Structure of the
World (1928) was perhaps the first theory of mind as a computational process.
Confirmation theory
The final element in the philosophical picture of the mind is the connection between
knowledge and action. This question is vital to AI because intelligence requires action as
well as reasoning. Moreover, only by understanding how actions are justified can we
understand how to build an agent whose actions are justifiable (or rational).
Aristotle argued (in De Motu Animalium) that actions are justified by a logical connection
between goals and knowledge of the action’s outcome:
But how does it happen that thinking is sometimes accompanied by action and sometimes not,
sometimes by motion, and sometimes not? It looks as if almost the same thing happens as in the case
of reasoning and making inferences about unchanging objects. But in that case the end is a
speculative proposition … whereas here the conclusion which results from the two premises is an
action. … I need covering; a cloak is a covering. I need a cloak. What I need, I have to make; I need a
cloak. I have to make a cloak. And the conclusion, the “I have to make a cloak,” is an action.
In the Nicomachean Ethics (Book III. 3, 1112b), Aristotle further elaborates on this topic,
suggesting an algorithm:
We deliberate not about ends, but about means. For a doctor does not deliberate whether he shall
heal, nor an orator whether he shall persuade, … They assume the end and consider how and by
what means it is attained, and if it seems easily and best produced thereby; while if it is achieved by
one means only they consider how it will be achieved by this and by what means this will be
achieved, till they come to the first cause, … and what is last in the order of analysis seems to be first
in the order of becoming. And if we come on an impossibility, we give up the search, e.g., if we need
money and this cannot be got; but if a thing appears possible we try to do it.
Aristotle’s algorithm was implemented 2300 years later by Newell and Simon in their
General Problem Solver program. We would now call it a greedy regression planning
system (see Chapter 11). Methods based on logical planning to achieve definite goals
dominated the first few decades of theoretical research in AI.
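As a rough sketch (ours, not the book's pseudocode, and assuming each condition is achieved by at most one known action), greedy regression planning works backward from the goal: choose an action that achieves the current condition, adopt that action's precondition as the new condition, and give up if no action applies.

def greedy_regression_plan(goal, initial, actions):
    # Regress from the goal through preconditions until reaching a
    # condition that already holds, then return the actions in
    # executable (forward) order. 'actions' maps each achievable
    # condition to an (action, precondition) pair.
    plan = []
    condition = goal
    while condition != initial:
        if condition not in actions:
            return None          # "we give up the search"
        action, precondition = actions[condition]
        plan.append(action)
        condition = precondition
    return list(reversed(plan))

# Aristotle's cloak example: to be covered, wear a cloak; to have a
# cloak, make one from materials already at hand.
actions = {"covered": ("wear cloak", "have cloak"),
           "have cloak": ("make cloak", "have materials")}
print(greedy_regression_plan("covered", "have materials", actions))
# -> ['make cloak', 'wear cloak']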
Utility
Thinking purely in terms of actions achieving goals is often useful but sometimes
inapplicable. For example, if there are several different ways to achieve a goal, there needs
to be some way to choose among them. More importantly, it may not be possible to achieve
a goal with certainty, but some action must still be taken. How then should one decide?
Antoine Arnauld (1662), analyzing the notion of rational decisions in gambling, proposed a
quantitative formula for maximizing the expected monetary value of the outcome. Later,
Daniel Bernoulli (1738) introduced the more general notion of utility to capture the internal,
subjective value of an outcome. The modern notion of rational decision making under
uncertainty involves maximizing expected utility, as explained in Chapter 16.
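In the notation developed in Chapter 16, the principle can be stated in one line: choose the action whose outcomes, weighted by their probabilities, have the greatest average utility,

a^{*} = \operatorname{argmax}_{a} \sum_{s'} P(s' \mid a)\, U(s')

Arnauld's expected monetary value is the special case in which U measures money directly; Bernoulli's insight was that U need not be linear in money.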
In matters of ethics and public policy, a decision maker must consider the interests of
multiple individuals. Jeremy Bentham (1823) and John Stuart Mill (1863) promoted the idea
of utilitarianism: that rational decision making based on maximizing utility should apply to
all spheres of human activity, including public policy decisions made on behalf of many
individuals. Utilitarianism is a specific kind of consequentialism: the idea that what is right
and wrong is determined by the expected outcomes of an action.
Utilitarianism
In contrast, Immanuel Kant, in 1785, proposed a theory of rule-based or deontological
ethics, in which “doing the right thing” is determined not by outcomes but by universal
social laws that govern allowable actions, such as “don’t lie” or “don’t kill.” Thus, a utilitarian
could tell a white lie if the expected good outweighs the bad, but a Kantian would be bound
not to, because lying is inherently wrong. Mill acknowledged the value of rules, but
understood them as efficient decision procedures compiled from first-principles reasoning
about consequences. Many modern AI systems adopt exactly this approach.
Deontological ethics
1.2.2 Mathematics
What are the formal rules to draw valid conclusions?
What can be computed?
How do we reason with uncertain information?
Philosophers staked out some of the fundamental ideas of AI, but the leap to a formal
science required the mathematization of logic and probability and the introduction of a new
branch of mathematics: computation.
The idea of formal logic can be traced back to the philosophers of ancient Greece, India,
and China, but its mathematical development really began with the work of George Boole
(1815–1864), who worked out the details of propositional, or Boolean, logic (Boole, 1847).
In 1879, Gottlob Frege (1848–1925) extended Boole’s logic to include objects and relations,
creating the first-order logic that is used today. In addition to its central role in the early
period of AI research, first-order logic motivated the work of Gödel and Turing that
underpinned computation itself, as we explain below.
5 Frege’s proposed notation for first-order logic—an arcane combination of textual and geometric features—never became popular.
Formal logic
The theory of probability can be seen as generalizing logic to situations with uncertain
information—a consideration of great importance for AI. Gerolamo Cardano (1501–1576)
first framed the idea of probability, describing it in terms of the possible outcomes of
gambling events. In 1654, Blaise Pascal (1623–1662), in a letter to Pierre Fermat (1601–
1665), showed how to predict the future of an unfinished gambling game and assign average
payoffs to the gamblers. Probability quickly became an invaluable part of the quantitative
sciences, helping to deal with uncertain measurements and incomplete theories. Jacob
Bernoulli (1654–1705, uncle of Daniel), Pierre Laplace (1749–1827), and others advanced
the theory and introduced new statistical methods. Thomas Bayes (1702–1761) proposed a
rule for updating probabilities in the light of new evidence; Bayes’ rule is a crucial tool for AI
systems.
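In modern notation, Bayes' rule gives the updated probability of a hypothesis H in the light of evidence E:

P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}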
Probability
Statistics
The formalization of probability, combined with the availability of data, led to the
emergence of statistics as a field. One of the first uses was John Graunt’s analysis of London
census data in 1662. Ronald Fisher is considered the first modern statistician (Fisher, 1922).
He brought together the ideas of probability, experiment design, analysis of data, and
computing—in 1919, he insisted that he couldn’t do his work without a mechanical
calculator called the MILLIONAIRE (the first calculator that could do multiplication), even
though the cost of the calculator was more than his annual salary (Ross, 2012).
The history of computation is as old as the history of numbers, but the first nontrivial
algorithm is thought to be Euclid’s algorithm for computing greatest common divisors. The
word algorithm comes from Muhammad ibn Musa al-Khwarizmi, a 9th century
mathematician, whose writings also introduced Arabic numerals and algebra to Europe.
Boole and others discussed algorithms for logical deduction, and, by the late 19th century,
efforts were under way to formalize general mathematical reasoning as logical deduction.
Algorithm
Kurt Gödel (1906–1978) showed that there exists an effective procedure to prove any true
statement in the first-order logic of Frege and Russell, but that first-order logic could not
capture the principle of mathematical induction needed to characterize the natural numbers.
In 1931, Gödel showed that limits on deduction do exist. His incompleteness theorem
showed that in any formal theory as strong as Peano arithmetic (the elementary theory of
natural numbers), there are necessarily true statements that have no proof within the
theory.
Incompleteness theorem
This fundamental result can also be interpreted as showing that some functions on the
integers cannot be represented by an algorithm—that is, they cannot be computed. This
motivated Alan Turing (1912–1954) to try to characterize exactly which functions are
computable—capable of being computed by an effective procedure. The Church–Turing
thesis proposes to identify the general notion of computability with functions computed by a
Turing machine (Turing, 1936). Turing also showed that there were some functions that no
Turing machine can compute. For example, no machine can tell in general whether a given
program will return an answer on a given input or run forever.
Computability
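Turing's argument can be paraphrased as a short, deliberately paradoxical program sketch (the function names are illustrative): if a correct halting tester existed, the program below would halt on its own source exactly when it does not, so no such tester can exist.

def halts(program, argument):
    # Suppose this were a total, correct halting tester.
    raise NotImplementedError("no correct implementation can exist")

def contrary(program):
    if halts(program, program):
        while True:     # loop forever if 'program' would halt on itself
            pass
    return "halted"

# contrary(contrary) halts if and only if halts(contrary, contrary) is
# False -- contradicting the assumed correctness of 'halts'.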
Although computability is important to an understanding of computation, the notion of
tractability has had an even greater impact on AI. Roughly speaking, a problem is called
intractable if the time required to solve instances of the problem grows exponentially with
the size of the instances. The distinction between polynomial and exponential growth in
complexity was first emphasized in the mid-1960s (Cobham, 1964; Edmonds, 1965). It is
important because exponential growth means that even moderately large instances cannot
be solved in any reasonable time.
Tractability
The theory of NP-completeness, pioneered by Cook (1971) and Karp (1972), provides a
basis for analyzing the tractability of problems: any problem class to which the class of NP-
complete problems can be reduced is likely to be intractable. (Although it has not been
proved that NP-complete problems are necessarily intractable, most theoreticians believe
it.) These results contrast with the optimism with which the popular press greeted the first
computers—“Electronic Super-Brains” that were “Faster than Einstein!” Despite the
increasing speed of computers, careful use of resources and necessary imperfection will
characterize intelligent systems. Put crudely, the world is an extremely large problem
instance!
NP-completeness
1.2.3 Economics
How should we make decisions in accordance with our preferences?
How should we do this when others may not go along?
How should we do this when the payoff may be far in the future?
The science of economics originated in 1776, when Adam Smith (1723–1790) published An
Inquiry into the Nature and Causes of the Wealth of Nations. Smith proposed to analyze
economies as consisting of many individual agents attending to their own interests. Smith
was not, however, advocating financial greed as a moral position: his earlier (1759) book
The Theory of Moral Sentiments begins by pointing out that concern for the well-being of
others is an essential component of the interests of every individual.
Most people think of economics as being about money, and indeed the first mathematical
analysis of decisions under uncertainty, the maximum-expected-value formula of Arnauld
(1662), dealt with the monetary value of bets. Daniel Bernoulli (1738) noticed that this
formula didn’t seem to work well for larger amounts of money, such as investments in
maritime trading expeditions. He proposed instead a principle based on maximization of
expected utility, and explained human investment choices by proposing that the marginal
utility of an additional quantity of money diminished as one acquired more money.
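Bernoulli's specific proposal was a logarithmic utility for money, under which equal proportional gains produce equal gains in utility:

EU = \sum_{i} p_{i}\, U(m_{i}), \qquad U(m) = \log m

Going from 1,000 to 2,000 ducats then adds exactly as much utility as going from 10,000 to 20,000, capturing the diminishing marginal utility of money.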
Léon Walras (pronounced “Valrasse”) (1834–1910) gave utility theory a more general
foundation in terms of preferences between gambles on any outcomes (not just monetary
outcomes). The theory was improved by Ramsey (1931) and later by John von Neumann
and Oskar Morgenstern in their book The Theory of Games and Economic Behavior (1944).
Economics is no longer the study of money; rather it is the study of desires and preferences.
Decision theory, which combines probability theory with utility theory, provides a formal
and complete framework for individual decisions (economic or otherwise) made under
uncertainty—that is, in cases where probabilistic descriptions appropriately capture the
decision maker’s environment. This is suitable for “large” economies where each agent need
pay no attention to the actions of other agents as individuals. For “small” economies, the
situation is much more like a game: the actions of one player can significantly affect the
utility of another (either positively or negatively). Von Neumann and Morgenstern’s
development of game theory (see also Luce and Raiffa, 1957) included the surprising result
that, for some games, a rational agent should adopt policies that are (or least appear to be)
randomized. Unlike decision theory, game theory does not offer an unambiguous
prescription for selecting actions. In AI, decisions involving multiple agents are studied
under the heading of multiagent systems (Chapter 18).
Decision theory
Economists, with some exceptions, did not address the third question listed above: how to
make rational decisions when payoffs from actions are not immediate but instead result
from several actions taken in sequence. This topic was pursued in the field of operations
research, which emerged in World War II from efforts in Britain to optimize radar
installations, and later found innumerable civilian applications. The work of Richard
Bellman (1957) formalized a class of sequential decision problems called Markov decision
processes, which we study in Chapter 17 and, under the heading of reinforcement
learning, in Chapter 22.
Operations research
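At the heart of Bellman's formulation is a recursive definition of the value of a state, now known as the Bellman equation (one common form, with transition probabilities P, rewards R, and discount factor \gamma):

V(s) = \max_{a} \sum_{s'} P(s' \mid s, a) \left[ R(s, a, s') + \gamma\, V(s') \right]

Chapter 17 studies algorithms for solving such equations.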
Work in economics and operations research has contributed much to our notion of rational
agents, yet for many years AI research developed along entirely separate paths. One reason
was the apparent complexity of making rational decisions. The pioneering AI researcher
Herbert Simon (1916–2001) won the Nobel Prize in economics in 1978 for his early work
showing that models based on satisficing—making decisions that are “good enough,” rather
than laboriously calculating an optimal decision—gave a better description of actual human
behavior (Simon, 1947). Since the 1990s, there has been a resurgence of interest in decision-
theoretic techniques for AI.
Satisficing
1.2.4 Neuroscience
How do brains process information?
Neuroscience is the study of the nervous system, particularly the brain. Although the exact
way in which the brain enables thought is one of the great mysteries of science, the fact that
it does enable thought has been appreciated for thousands of years because of the evidence
that strong blows to the head can lead to mental incapacitation. It has also long been known
that human brains are somehow different; in about 335 BCE Aristotle wrote, “Of all the
animals, man has the largest brain in proportion to his size.” Still, it was not until the
middle of the 18th century that the brain was widely recognized as the seat of
consciousness. Before then, candidate locations included the heart and the spleen.
6 It has since been discovered that the tree shrew and some bird species exceed the human brain/body ratio.
Neuroscience
Paul Broca’s (1824–1880) investigation of aphasia (speech deficit) in brain-damaged patients
in 1861 initiated the study of the brain’s functional organization by identifying a localized
area in the left hemisphere—now called Broca’s area—that is responsible for speech
production. By that time, it was known that the brain consisted largely of nerve cells, or
neurons, but it was not until 1873 that Camillo Golgi (1843–1926) developed a staining
technique allowing the observation of individual neurons (see Figure 1.1). This technique
was used by Santiago Ramon y Cajal (1852–1934) in his pioneering studies of neuronal
organization. It is now widely accepted that cognitive functions result from the
electrochemical operation of these structures. That is, a collection of simple cells can lead to
thought, action, and consciousness. In the pithy words of John Searle (1992), brains cause
minds.
7 Many cite Alexander Hood (1824) as a possible prior source.
8 Golgi persisted in his belief that the brain’s functions were carried out primarily in a continuous medium in which neurons were
embedded, whereas Cajal propounded the “neuronal doctrine.” The two shared the Nobel Prize in 1906 but gave mutually
antagonistic acceptance speeches.
Figure 1.1
The parts of a nerve cell or neuron. Each neuron consists of a cell body, or soma, that contains a cell
nucleus. Branching out from the cell body are a number of fibers called dendrites and a single long fiber
called the axon. The axon stretches out for a long distance, much longer than the scale in this diagram
indicates. Typically, an axon is 1 cm long (100 times the diameter of the cell body), but can reach up to 1
meter. A neuron makes connections with 10 to 100,000 other neurons at junctions called synapses.
Signals are propagated from neuron to neuron by a complicated electrochemical reaction. The signals
control brain activity in the short term and also enable long-term changes in the connectivity of neurons.
These mechanisms are thought to form the basis for learning in the brain. Most information processing
goes on in the cerebral cortex, the outer layer of the brain. The basic organizational unit appears to be a
column of tissue about 0.5 mm in diameter, containing about 20,000 neurons and extending the full
depth of the cortex (about 4 mm in humans).
Neuron
We now have some data on the mapping between areas of the brain and the parts of the
body that they control or from which they receive sensory input. Such mappings are able to
change radically over the course of a few weeks, and some animals seem to have multiple
maps. Moreover, we do not fully understand how other areas can take over functions when
one area is damaged. There is almost no theory on how an individual memory is stored or
on how higher-level cognitive functions operate.
The measurement of intact brain activity began in 1929 with the invention by Hans Berger
of the electroencephalograph (EEG). The development of functional magnetic resonance
imaging (fMRI) (Ogawa et al., 1990; Cabeza and Nyberg, 2001) is giving neuroscientists
unprecedentedly detailed images of brain activity, enabling measurements that correspond
in interesting ways to ongoing cognitive processes. These are augmented by advances in
single-cell electrical recording of neuron activity and by the methods of optogenetics (Crick,
1999; Zemelman et al., 2002; Han and Boyden, 2007), which allow both
measurement and control of individual neurons modified to be light-sensitive.
Optogenetics
The development of brain–machine interfaces (Lebedev and Nicolelis, 2006) for both
sensing and motor control not only promises to restore function to disabled individuals, but
also sheds light on many aspects of neural systems. A remarkable finding from this work is
that the brain is able to adjust itself to interface successfully with an external device, treating
it in effect like another sensory organ or limb.
Brain–machine interface
Brains and digital computers have somewhat different properties. Figure 1.2 shows that
computers have a cycle time that is a million times faster than a brain. The brain makes up
for that with far more storage and interconnection than even a high-end personal computer,
although the largest supercomputers match the brain on some metrics. Futurists make much
of these numbers, pointing to an approaching singularity at which computers reach a
superhuman level of performance (Vinge, 1993; Kurzweil, 2005; Doctorow and Stross,
2012), and then rapidly improve themselves even further. But the comparisons of raw
numbers are not especially informative. Even with a computer of virtually unlimited
capacity, we still require further conceptual breakthroughs in our understanding of
intelligence (see Chapter 28). Crudely put, without the right theory, faster machines just
give you the wrong answer faster.
Singularity
Figure 1.2
A crude comparison of a leading supercomputer, Summit (Feldman, 2017); a typical personal computer of
2019; and the human brain. Human brain power has not changed much in thousands of years, whereas
supercomputers have improved from megaFLOPs in the 1960s to gigaFLOPs in the 1980s, teraFLOPs in
the 1990s, petaFLOPs in 2008, and exaFLOPs in 2018 (1 exaFLOP = 10^18 floating point
operations per second).
1.2.5 Psychology
How do humans and animals think and act?
The origins of scientific psychology are usually traced to the work of the German physicist
Hermann von Helmholtz (1821–1894) and his student Wilhelm Wundt (1832–1920).
Helmholtz applied the scientific method to the study of human vision, and his Handbook of
Physiological Optics has been described as “the single most important treatise on the physics
and physiology of human vision” (Nalwa, 1993, p.15). In 1879, Wundt opened the first
laboratory of experimental psychology, at the University of Leipzig. Wundt insisted on
carefully controlled experiments in which his workers would perform a perceptual or
associative task while introspecting on their thought processes. The careful controls went a
long way toward making psychology a science, but the subjective nature of the data made it
unlikely that experimenters would ever disconfirm their own theories.
Biologists studying animal behavior, on the other hand, lacked introspective data and
developed an objective methodology, as described by H. S. Jennings (1906) in his influential
work Behavior of the Lower Organisms. Applying this viewpoint to humans, the behaviorism
movement, led by John Watson (1878–1958), rejected any theory involving mental
processes on the grounds that introspection could not provide reliable evidence.
Behaviorists insisted on studying only objective
measures of the percepts (or stimulus) given
to an animal and its resulting actions (or response). Behaviorism discovered a lot about rats
and pigeons but had less success at understanding humans.
Behaviorism
Cognitive psychology, which views the brain as an information-processing device, can be
traced back at least to the works of William James (1842–1910). Helmholtz also insisted that
perception involved a form of unconscious logical inference. The cognitive viewpoint was
largely eclipsed by behaviorism in the United States, but at Cambridge’s Applied Psychology
Unit, directed by Frederic Bartlett (1886–1969), cognitive modeling was able to flourish. The
Nature of Explanation, by Bartlett’s student and successor Kenneth Craik (1943), forcefully
reestablished the legitimacy of such “mental” terms as beliefs and goals, arguing that they
are just as scientific as, say, using pressure and temperature to talk about gases, despite
gases being made of molecules that have neither.
Cognitive psychology
Craik specified the three key steps of a knowledge-based agent: (1) the stimulus must be
translated into an internal representation, (2) the representation is manipulated by cognitive
processes to derive new internal representations, and (3) these are in turn retranslated back
into action. He clearly explained why this was a good design for an agent:
If the organism carries a “small-scale model” of external reality and of its own possible actions within
its head, it is able to try out various alternatives, conclude which is the best of them, react to future
situations before they arise, utilize the knowledge of past events in dealing with the present and
future, and in every way to react in a much fuller, safer, and more competent manner to the
emergencies which face it. (Craik, 1943)
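Craik's three steps translate almost directly into the skeleton of a model-based agent. Here is a minimal Python sketch of that correspondence (all names are ours, purely illustrative):

def make_craik_agent(to_representation, derive, to_action):
    # Compose Craik's three steps into a single agent function.
    def agent(stimulus):
        representation = to_representation(stimulus)  # step 1: translate stimulus
        conclusion = derive(representation)           # step 2: manipulate representation
        return to_action(conclusion)                  # step 3: retranslate into action
    return agent

# Trivial usage: pass the stimulus through identity steps.
agent = make_craik_agent(lambda s: s, lambda r: r, lambda c: c)
assert agent("ping") == "ping"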
After Craik’s death in a bicycle accident in 1945, his work was continued by Donald
Broadbent, whose book Perception and Communication (1958) was one of the first works to
model psychological phenomena as information processing. Meanwhile, in the United
States, the development of computer modeling led to the creation of the field of cognitive
science. The field can be said to have started at a workshop in September 1956 at MIT—just
two months after the conference at which AI itself was “born.”
At the workshop, George Miller presented The Magic Number Seven, Noam Chomsky
presented Three Models of Language, and Allen Newell and Herbert Simon presented The
Logic Theory Machine. These three influential papers showed how computer models could be
used to address the psychology of memory, language, and logical thinking, respectively. It is
now a common (although far from universal) view among psychologists that “a cognitive
theory should be like a computer program” (Anderson, 1980); that is, it should describe the
operation of a cognitive function in terms of the processing of information.
For purposes of this review, we will count the field of human–computer interaction (HCI)
under psychology. Doug Engelbart, one of the pioneers of HCI, championed the idea of
intelligence augmentation—IA rather than AI. He believed that computers should augment
human abilities rather than automate away human tasks. In 1968, Engelbart’s “mother of all
demos” showed off for the first time the computer mouse, a windowing system, hypertext,
and video conferencing—all in an effort to demonstrate what human knowledge workers
could collectively accomplish with some intelligence augmentation.
Intelligence augmentation
Today we are more likely to see IA and AI as two sides of the same coin, with the former
emphasizing human control and the latter emphasizing intelligent behavior on the part of
the machine. Both are needed for machines to be useful to humans.
1.2.6 Computer engineering
How can we build an efficient computer?
The modern digital electronic computer was invented independently and almost
simultaneously by scientists in three countries embattled in World War II. The first
operational computer was the electromechanical Heath Robinson, built in 1943 by Alan
Turing’s team for a single purpose: deciphering German messages. In 1943, the same group
developed the Colossus, a powerful general-purpose machine based on vacuum tubes.
The first operational programmable computer was the Z-3, the invention of Konrad Zuse in
Germany in 1941. Zuse also invented floating-point numbers and the first high-level
programming language, Plankalkül. The first electronic computer, the ABC, was assembled
by John Atanasoff and his student Clifford Berry between 1940 and 1942 at Iowa State
University. Atanasoff’s research received little support or recognition; it was the ENIAC,
developed as part of a secret military project at the University of Pennsylvania by a team
including John Mauchly and J. Presper Eckert, that proved to be the most influential
forerunner of modern computers.
9 A complex machine named after a British cartoonist who depicted whimsical and absurdly complicated contraptions for everyday
tasks such as buttering toast.
10 In the postwar period, Turing wanted to use these computers for AI research—for example, he created an outline of the first chess
program (Turing et al., 1953) —but the British government blocked this research.
Moore’s law
Since that time, each generation of computer hardware has brought an increase in speed
and capacity and a decrease in price—a trend captured in Moore’s law. Performance
doubled every 18 months or so until around 2005, when power dissipation problems led
manufacturers to start multiplying the number of CPU cores rather than the clock speed.
Current expectations are that future increases in functionality will come from massive
parallelism—a curious convergence with the properties of the brain. We also see new
hardware designs based on the idea that in dealing with an uncertain world, we don’t need
64 bits of precision in our numbers; just 16 bits (as in the bfloat16 format) or even 8 bits will
be enough, and will enable faster processing.
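To see how little precision is lost, consider rounding a number to 16 bits, shown here with NumPy's IEEE float16 (a sketch of the general point; bfloat16 differs in keeping float32's 8-bit exponent range with a shorter 7-bit mantissa):

import numpy as np

# A 16-bit float carries only about three decimal digits of precision --
# noise-level error when the quantity itself, such as a learned weight,
# is uncertain anyway.
x = np.float32(0.123456789)
y = np.float16(x)
print(y)                        # approximately 0.1235
print(abs(np.float32(y) - x))   # rounding error on the order of 1e-5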
We are just beginning to see hardware tuned for AI applications, such as the graphics
processing unit (GPU), tensor processing unit (TPU), and wafer scale engine (WSE). From
the 1960s to about 2012, the amount of computing power used to train top machine learning
applications followed Moore’s law. Beginning in 2012, things changed: from 2012 to 2018
there was a 300,000-fold increase, which works out to a doubling every 100 days or so
(Amodei and Hernandez, 2018). A machine learning model that took a full day to train in
2014 takes only two minutes in 2018 (Ying et al., 2018). Although it is not yet practical,
quantum computing holds out the promise of far greater accelerations for some important
subclasses of AI algorithms.
Quantum computing
Of course, there were calculating
devices before the electronic computer. The earliest
automated machines, dating from the 17th century, were discussed on page 6. The first
programmable machine was a loom, devised in 1805 by Joseph Marie Jacquard (1752–1834),
that used punched cards to store instructions for the pattern to be woven.
In the mid-19th century, Charles Babbage (1792–1871) designed two computing machines,
neither of which he completed. The Difference Engine was intended to compute
mathematical tables for engineering and scientific projects. It was finally built and shown to
work in 1991 (Swade, 2000). Babbage’s Analytical Engine was far more ambitious: it
included addressable memory, stored programs based on Jacquard’s punched cards, and
conditional jumps. It was the first machine capable of universal computation.
Babbage’s colleague Ada Lovelace, daughter of the poet Lord Byron, understood its
potential, describing it as “a thinking or ... a reasoning machine,” one capable of reasoning
about “all subjects in the universe” (Lovelace, 1843). She also anticipated AI’s hype cycles,
writing, “It is desirable to guard against the possibility of exaggerated ideas that might arise
as to the powers of the Analytical Engine.” Unfortunately, Babbage’s machines and
Lovelace’s ideas were largely forgotten.
AI also owes a debt to the software side of computer science, which has supplied the
operating systems, programming languages, and tools needed to write modern programs
(and papers about them). But this is one area where the debt has been repaid: work in AI
has pioneered many ideas that have made their way back to mainstream computer science,
including time sharing, interactive interpreters, personal computers with windows and mice,
rapid development environments, the linked-list data type, automatic storage management,
and key concepts of symbolic, functional, declarative, and object-oriented programming.
1.2.7 Control theory and cybernetics
How can artifacts operate under their own control?
Ktesibios of Alexandria (c. 250 BCE) built the first self-controlling machine: a water clock
with a regulator that maintained a constant flow rate. This invention changed the definition
of what an artifact could do. Previously, only living things could modify their behavior in
response to changes in the environment. Other examples of self-regulating feedback control
systems include the steam engine governor, created by James Watt (1736–1819), and the
thermostat, invented by Cornelis Drebbel (1572–1633), who also invented the submarine.
James Clerk Maxwell (1868) initiated the mathematical theory of control systems.
A central figure in the post-war development of control theory was Norbert Wiener (1894–
1964). Wiener was a brilliant mathematician who worked with Bertrand Russell, among
others, before developing an interest in biological and mechanical control systems and their
connection to cognition. Like Craik (who also used control systems as psychological
models), Wiener and his colleagues Arturo Rosenblueth and Julian Bigelow challenged the
behaviorist orthodoxy (Rosenblueth et al., 1943). They viewed purposive behavior as arising
from a regulatory mechanism trying to minimize “error”—the difference between current
state and goal state. In the late 1940s, Wiener, along with Warren McCulloch, Walter Pitts,
and John von Neumann, organized a series of influential conferences that explored the new
mathematical and computational models of cognition. Wiener’s book Cybernetics (1948)
became a bestseller and awoke the public to the possibility of artificially intelligent
machines.
Control theory
Cybernetics
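The error-minimizing view of purposive behavior can be sketched in a few lines; the proportional gain below is an invented illustration, not a model drawn from Wiener's work:

```python
# An error-driven regulator in miniature: each step acts to reduce the
# "error", the difference between the goal state and the current state.
def control_step(state, goal, gain=0.5):
    error = goal - state           # the regulated quantity
    return state + gain * error   # act in proportion to the error

state = 0.0
for _ in range(10):
    state = control_step(state, goal=10.0)
print(round(state, 3))  # 9.99: the state has homed in on the goal
```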
Meanwhile, in Britain, W. Ross Ashby pioneered similar ideas (Ashby, 1940). Ashby, Alan
Turing, Grey Walter, and others formed the Ratio Club for “those who had Wiener’s ideas
before Wiener’s book appeared.” Ashby’s Design for a Brain (1948, 1952) elaborated on his
idea that intelligence could be created by the use of homeostatic devices containing
appropriate feedback loops to achieve stable adaptive behavior.
Homeostatic
Modern control theory, especially the branch known as stochastic optimal control, has as its
goal the design of systems that maximize a cost function over time. This roughly matches
the standard model of AI: designing systems that behave optimally. Why, then, are AI and
control theory two different fields, despite the close connections among their founders? The
answer lies in the close coupling between the mathematical techniques that were familiar to
the participants and the corresponding sets of problems that were encompassed in each
world view. Calculus and matrix algebra, the tools of control theory, lend themselves to
systems that are describable by fixed sets of continuous variables, whereas AI was founded
in part as a way to escape from these perceived limitations. The tools of logical inference
and computation allowed AI researchers to consider problems such as language, vision, and
symbolic planning that fell completely outside the control theorist’s purview.
Cost function
1.2.8 Linguistics
How does language relate to thought?
In 1957, B. F. Skinner published Verbal Behavior. This was a comprehensive, detailed
account of the behaviorist approach to language learning, written by the foremost expert in
the field. But curiously, a review of the book became as well known as the book itself, and
served to almost kill off interest in behaviorism. The author of the review was the linguist
Noam Chomsky, who had just published a book on his own theory, Syntactic Structures.
Chomsky pointed out that the behaviorist theory did not address the notion of creativity in
language—it did not explain how children could understand and make up sentences that
they had never heard before. Chomsky’s theory—based on syntactic models going back to
the Indian linguist Panini (c. 350 BCE)—could explain this, and unlike previous theories, it
was formal enough that it could in principle be programmed.
Computational linguistics
Modern linguistics and AI, then, were “born” at about the same time, and grew up together,
intersecting in a hybrid field called computational linguistics or natural language
processing. The problem of understanding language turned out to be considerably more
complex than it seemed in 1957. Understanding language requires an understanding of the
subject matter and context, not just an understanding of the structure of sentences. This
might seem obvious, but it was not widely appreciated until the 1960s. Much of the early
work in knowledge representation (the study of how to put knowledge into a form that a
computer can reason with) was tied to language and informed by research in linguistics,
which was connected in turn to decades of work on the philosophical analysis of language.
1.3 The History of Artificial Intelligence
One quick way to summarize the milestones in AI history is to list the Turing Award
winners: Marvin Minsky (1969) and John McCarthy (1971) for defining the foundations of
the field based on representation and reasoning; Ed Feigenbaum and Raj Reddy (1994) for
developing expert systems that encode human knowledge to solve real-world problems;
Judea Pearl (2011) for developing probabilistic reasoning techniques that deal with
uncertainty in a principled manner; and finally Yoshua Bengio, Geoffrey Hinton, and Yann
LeCun (2019) for making “deep learning” (multilayer neural networks) a critical part of
modern computing. The rest of this section goes into more detail on each phase of AI
history.
1.3.1 The inception of artificial intelligence (1943–1956)
The first work that is now generally recognized as AI was done by Warren McCulloch and
Walter Pitts (1943). Inspired by the mathematical modeling work of Pitts’s advisor Nicolas
Rashevsky (1936, 1938), they drew on three sources: knowledge of the basic physiology and function of
neurons in the brain; a formal analysis of propositional logic due to Russell and Whitehead;
and Turing’s theory of computation. They proposed a model of artificial neurons in which
each neuron is characterized as being “on” or “off,” with a switch to “on” occurring in
response to stimulation by a sufficient number of neighboring neurons. The state of a
neuron was conceived of as “factually equivalent to a proposition which proposed its
adequate stimulus.” They showed, for example, that any computable function could be
computed by some network of connected neurons, and that all the logical connectives (AND,
OR, NOT, etc.) could be implemented by simple network structures. McCulloch and Pitts also
suggested that suitably defined networks could learn. Donald Hebb (1949) demonstrated a
simple updating rule for modifying the connection strengths between neurons. His rule,
now called Hebbian learning, remains an influential model to this day.
Hebbian learning
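To make the model concrete, here is a sketch of such a threshold unit, with weights and thresholds hand-picked (our illustration, not McCulloch and Pitts's own notation) to implement the connectives:

```python
# A McCulloch–Pitts-style unit: "on" (1) exactly when the weighted sum
# of its inputs reaches the threshold; otherwise "off" (0).
def unit(inputs, weights, threshold):
    return int(sum(w * x for w, x in zip(weights, inputs)) >= threshold)

AND = lambda a, b: unit([a, b], [1, 1], threshold=2)
OR  = lambda a, b: unit([a, b], [1, 1], threshold=1)
NOT = lambda a:    unit([a],    [-1],   threshold=0)

assert AND(1, 1) == 1 and AND(1, 0) == 0
assert OR(0, 0) == 0 and OR(0, 1) == 1
assert NOT(0) == 1 and NOT(1) == 0
```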
Two undergraduate students at Harvard, Marvin Minsky (1927–2016) and Dean Edmonds,
built the first neural network computer in 1950. The SNARC, as it was called, used 3000
vacuum tubes and a surplus automatic pilot mechanism from a B-24 bomber to simulate a
network of 40 neurons. Later, at Princeton, Minsky studied universal computation in neural
networks. His Ph.D. committee was skeptical about whether this kind of work should be
considered mathematics, but von Neumann reportedly said, “If it isn’t now, it will be
someday.”
There were a number of other examples of early work that can be characterized as AI,
including two checkers-playing programs developed independently in 1952 by Christopher
Strachey at the University of Manchester and by Arthur Samuel at IBM. However, Alan
Turing’s vision was the most influential. He gave lectures on the topic as early as 1947 at the
London Mathematical Society and articulated a persuasive agenda in his 1950 article
“Computing Machinery and Intelligence.” Therein, he introduced the Turing test, machine
learning, genetic algorithms, and reinforcement learning. He dealt with many of the
objections raised to the possibility of AI, as described in Chapter 27. He also suggested
that it would be easier to create human-level AI by developing learning algorithms and then
teaching the machine rather than by programming its intelligence by hand. In subsequent
lectures he warned that achieving this goal might not be the best thing for the human race.
In 1955, John McCarthy of Dartmouth College convinced Minsky, Claude Shannon, and
Nathaniel Rochester to help him bring together U.S. researchers interested in automata
theory, neural nets, and the study of intelligence. They organized a two-month workshop at
Dartmouth in the summer of 1956. There were 10 attendees in all, including Allen Newell
and Herbert Simon from Carnegie Tech, Trenchard More from Princeton, Arthur Samuel
from IBM, and Ray Solomonoff and Oliver Selfridge from MIT. The proposal states:
11 Now Carnegie Mellon University (CMU).
12 This was the first official usage of McCarthy’s term artificial intelligence. Perhaps “computational rationality” would have been more
precise and less threatening, but “AI” has stuck. At the 50th anniversary of the Dartmouth conference, McCarthy stated that he
resisted the terms “computer” or “computational” in deference to Norbert Wiener, who was promoting analog cybernetic devices
rather than digital computers.
We propose that a 2 month, 10 man study of artificial intelligence be carried out during the summer
of 1956 at Dartmouth College in Hanover, New Hampshire. The study is to proceed on the basis of
the conjecture that every aspect of learning or any other feature of intelligence can in principle be so
precisely described that a machine can be made to simulate it. An attempt will be made to find how
to make machines use language, form abstractions and concepts, solve kinds of problems now
reserved for humans, and improve themselves. We think that a significant advance can be made in
one or more of these problems if a carefully selected group of scientists work on it together for a
summer.
Despite this optimistic prediction, the Dartmouth workshop did not lead to any
breakthroughs. Newell and Simon presented perhaps the most mature work, a mathematical
theorem-proving system called the Logic Theorist (LT). Simon claimed, “We have invented
a computer program capable of thinking non-numerically, and thereby solved the venerable
mind–body problem.” Soon after the workshop, the program was able to prove most of
the theorems in Chapter 2 of Russell and Whitehead’s Principia Mathematica. Russell was
reportedly delighted when told that LT had come up with a proof for one theorem that was
shorter than the one in Principia. The editors of the Journal of Symbolic Logic were less
impressed; they rejected a paper coauthored by Newell, Simon, and Logic Theorist.
13 Newell and Simon also invented a list-processing language, IPL, to write LT. They had no compiler and translated it into machine
code by hand. To avoid errors, they worked in parallel, calling out binary numbers to each other as they wrote each instruction to
make sure they agreed.
1.3.2 Early enthusiasm, great expectations (1952–1969)
The intellectual establishment of the 1950s, by and large, preferred to believe that “a
machine can never do X.” (See Chapter 27 for a long list of X’s gathered by Turing.) AI
researchers naturally responded by demonstrating one X after another. They focused in
particular on tasks considered indicative of intelligence in humans, including games,
puzzles, mathematics, and IQ tests. John McCarthy referred to this period as the “Look, Ma,
no hands!” era.
Newell and Simon followed up their success with LT with the General Problem Solver, or
GPS. Unlike LT, this program was designed from the start to imitate human problem-solving
protocols. Within the limited class of puzzles it could handle, it turned out that the order in
which the program considered subgoals and possible actions was similar to that in which
humans approached the same problems. Thus, GPS was probably the first program to
embody the “thinking humanly” approach. The success of GPS and subsequent programs as
models of cognition led Newell and Simon (1976) to formulate the famous physical symbol
system hypothesis, which states that “a physical symbol system has the necessary and
sufficient means for general intelligent action.” What they meant is that any system (human
or machine) exhibiting intelligence must operate by manipulating data structures composed
of symbols. We will see later that this hypothesis has been challenged from many directions.
Physical symbol system
At IBM, Nathaniel Rochester and his colleagues produced some of the first AI programs.
Herbert Gelernter (1959) constructed the Geometry Theorem Prover, which was able to
prove theorems that many students of mathematics would find quite tricky. This work was a
precursor of modern mathematical theorem provers.
Of all the exploratory work done during this period, perhaps the most influential in the long
run was that of Arthur Samuel on checkers (draughts). Using methods that we now call
reinforcement learning (see Chapter 22), Samuel’s programs learned to play at a strong
amateur level. He thereby disproved the idea that computers can do only what they are told
to: his program quickly learned to play a better game than its creator. The program was
demonstrated on television in 1956, creating a strong impression. Like Turing, Samuel had
trouble finding computer time. Working at night, he used machines that were still on the
testing floor at IBM’s manufacturing plant. Samuel’s program was the precursor of later
systems such as TD-GAMMON (Tesauro, 1992), which was among the world’s best
backgammon players, and ALPHAGO (Silver et al., 2016), which shocked the world by
defeating the human world champion at Go (see Chapter 5).
In 1958, John McCarthy made two important contributions to AI. In MIT AI Lab Memo No.
1, he defined the high-level language Lisp, which was to become the dominant AI
programming language for the next 30 years. In a paper entitled Programs with Common
Sense, he advanced a conceptual proposal for AI systems based on knowledge and
reasoning. The paper describes the Advice Taker, a hypothetical program that would
embody general knowledge of the world and could use it to derive plans of action. The
concept was illustrated with simple logical axioms that suffice to generate a plan to drive to
the airport. The program was also designed to accept new axioms in the normal course of
operation, thereby allowing it to achieve competence in new areas without being
reprogrammed. The Advice Taker thus embodied the central principles of knowledge
representation and reasoning: that it is useful to have a formal, explicit representation of the
world and its workings and to be able to manipulate that representation with deductive
processes. The paper influenced the course of AI and remains relevant today.
Lisp
1958 also marked the year that Marvin Minsky moved to MIT. His initial collaboration with
McCarthy did not last, however. McCarthy stressed representation and reasoning in formal
logic, whereas Minsky was more interested in getting programs to work and eventually
developed an anti-logic outlook. In 1963, McCarthy started the AI lab at Stanford. His plan
to use logic to build the ultimate Advice Taker was advanced by J. A. Robinson’s discovery
in 1965 of the resolution method (a complete theorem-proving algorithm for first-order
logic; see Chapter 9). Work at Stanford emphasized general-purpose methods for logical
reasoning. Applications of logic included Cordell Green’s question-answering and planning
systems (Green, 1969b) and the Shakey robotics project at the Stanford Research Institute
(SRI). The latter project, discussed further in Chapter 26, was the first to demonstrate the
complete integration of logical reasoning and physical activity.
Microworld
At MIT, Minsky supervised a series of students who chose limited problems that appeared to
require intelligence to solve. These limited domains became known as microworlds. James
Slagle’s SAINT program (1963) was able to solve closed-form calculus integration problems
typical of first-year college courses. Tom Evans’s ANALOGY program (1968) solved geometric
analogy problems that appear in IQ tests. Daniel Bobrow’s STUDENT program (1967) solved
algebra story problems, such as the following:
If the number of customers Tom gets is twice the square of 20 percent of the number of
advertisements he runs, and the number of advertisements he runs is 45, what is the number of
customers Tom gets?
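Once the story is parsed, the arithmetic STUDENT had to perform is a one-liner:

```python
# The relationship extracted from the story: customers = 2 * (20% of ads)^2.
ads = 45
customers = 2 * (0.20 * ads) ** 2
print(customers)  # 162.0
```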
The most famous microworld is the blocks world, which consists of a set of solid blocks
placed on a tabletop (or more often, a simulation of a tabletop), as shown in Figure 1.3. A
typical task in this world is to rearrange the blocks in a certain way, using a robot hand that
can pick up one block at a time. The blocks world was home to the vision project of David
Huffman (1971), the vision and constraint-propagation work of David Waltz (1975), the
learning theory of Patrick Winston (1970), the natural-language-understanding program of
Terry Winograd (1972), and the planner of Scott Fahlman (1974).
Figure 1.3
A scene from the blocks world. SHRDLU (Winograd, 1972) has just completed the command “Find a block
which is taller than the one you are holding and put it in the box.”
Blocks world
Early work building on the neural networks of McCulloch and Pitts also flourished. The
work of Shmuel Winograd and Jack Cowan (1963) showed how a large number of elements
could collectively represent an individual concept, with a corresponding increase in
robustness and parallelism. Hebb’s learning methods were enhanced by Bernie Widrow
(Widrow and Hoff, 1960; Widrow, 1962), who called his networks adalines, and by Frank
Rosenblatt (1962) with his perceptrons. The perceptron convergence theorem (Block et al.,
1962) says that the learning algorithm can adjust the connection strengths of a perceptron to
match any input data, provided such a match exists.
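A sketch of the learning rule in question (simplified, not Rosenblatt's exact formulation): adjust the connection strengths whenever the output is wrong.

```python
# Perceptron learning: nudge the weights on every mistake. The convergence
# theorem guarantees this terminates with a correct separator whenever the
# examples are linearly separable.
def train_perceptron(examples, epochs=25):
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), label in examples:       # labels are 0 or 1
            prediction = int(w1 * x1 + w2 * x2 + b > 0)
            error = label - prediction         # -1, 0, or +1
            w1 += error * x1
            w2 += error * x2
            b += error
    return w1, w2, b

# OR is linearly separable, so training converges; XOR would not.
examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
print(train_perceptron(examples))  # (1.0, 1.0, 0.0)
```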
1.3.3 A dose of reality (1966–1973)
From the beginning, AI researchers were not shy about making predictions of their coming
successes. The following statement by Herbert Simon in 1957 is often quoted:
It is not my aim to surprise or shock you—but the simplest way I can summarize is to say that there
are now in the world machines that think, that learn and that create. Moreover, their ability to do
these things is going to increase rapidly until—in a visible future—the range of problems they can
handle will be coextensive with the range to which the human mind has been applied.
The term “visible future” is vague, but Simon also made more concrete predictions: that
within 10 years a computer would be chess champion and a significant mathematical
theorem would be proved by machine. These predictions came true (or approximately true)
within 40 years rather than 10. Simon’s overconfidence was due to the promising
performance of early AI systems on simple examples. In almost all cases, however, these
early systems failed on more difficult problems.
There were two main reasons for this failure. The first was that many early AI systems were
based primarily on “informed introspection” as to how humans perform a task, rather than
on a careful analysis of the task, what it means to be a solution, and what an algorithm
would need to do to reliably produce such solutions.
The second reason for failure was a lack of appreciation of the intractability of many of the
problems that AI was attempting to solve. Most of the early problem-solving systems
worked by trying out different combinations of steps until the solution was found. This
strategy worked initially because microworlds contained very few objects and hence very
few possible actions and very short solution sequences. Before the theory of computational
complexity was developed, it was widely thought that “scaling up” to larger problems was
simply a matter of faster hardware and larger memories. The optimism that accompanied
the development of resolution theorem proving, for example, was soon dampened when
researchers failed to prove theorems involving more than a few dozen facts. The fact that a
program can find a solution in principle does not mean that the program contains any of the
mechanisms needed to find it in practice.
The illusion of unlimited computational power was not confined to problem-solving
programs. Early experiments in machine evolution (now called genetic programming)
(Friedberg, 1958; Friedberg et al., 1959) were based on the undoubtedly
correct belief that by making an appropriate series of small mutations to a machine-code
program, one can generate a program with good performance for any particular task. The
idea, then, was to try random mutations with a selection process to preserve mutations that
seemed useful. Despite thousands of hours of CPU time, almost no progress was
demonstrated.
Machine evolution
Failure to come to grips with the “combinatorial explosion” was one of the main criticisms of
AI contained in the Lighthill report (Lighthill, 1973), which formed the basis for the decision
by the British government to end support for AI research in all but two universities. (Oral
tradition paints a somewhat different and more colorful picture, with political ambitions and
personal animosities whose description is beside the point.)
A third difficulty arose because of some fundamental limitations on the basic structures
being used to generate intelligent behavior. For example, Minsky and Papert’s book
Perceptrons (1969) proved that, although perceptrons (a simple form of neural network)
could be shown to learn anything they were capable of representing, they could represent
very little. In particular, a two-input perceptron could not be trained to recognize when its
two inputs were different. Although their results did not apply to more complex, multilayer
networks, research funding for neural-net research soon dwindled to almost nothing.
Ironically, the new back-propagation learning algorithms that were to cause an enormous
resurgence in neural-net research in the late 1980s and again in the 2010s had already been
developed in other contexts in the early 1960s (Kelley, 1960; Bryson, 1962).
1.3.4 Expert systems (1969–1986)
The picture of problem solving that had arisen during the first decade of AI research was of
a general-purpose search mechanism trying to string together elementary reasoning steps to
find complete solutions. Such approaches have been called weak methods because,
although general, they do not scale up to large or difficult problem instances. The
alternative to weak methods is to use more powerful, domain-specific knowledge that
allows larger reasoning steps and can more easily handle typically occurring cases in narrow
areas of expertise. One might say that to solve a hard problem, you have to almost know the
answer already.
Weak method
The DENDRAL program (Buchanan et al., 1969) was an early example of this approach. It was
developed at Stanford, where Ed Feigenbaum (a former student of Herbert Simon), Bruce
Buchanan (a philosopher turned computer scientist), and Joshua Lederberg (a Nobel
laureate geneticist) teamed up to solve the problem of inferring molecular structure from the
information provided by a mass spectrometer. The input to the program consists of the
elementary formula of the molecule (e.g., C₆H₁₃NO₂) and the mass spectrum giving the
masses of the various fragments of the molecule generated when it is bombarded by an
electron beam. For example, the mass spectrum might contain a peak at m = 15,
corresponding to the mass of a methyl (CH₃) fragment.
The naive version of the program generated all possible structures consistent with the
formula, and then predicted what mass spectrum would be observed for each, comparing
this with the actual spectrum. As one might expect, this is intractable for even moderate-
sized molecules. The DENDRAL researchers consulted analytical chemists and found that they
worked by looking for well-known patterns of peaks in the spectrum that suggested
common substructures in the molecule. For example, the following rule is used to recognize
a ketone (C=O) subgroup (which weighs 28):

if M is the mass of the whole molecule and there are two peaks at x₁ and x₂ such that
(a) x₁ + x₂ = M + 28; (b) x₁ − 28 is a high peak; (c) x₂ − 28 is a high peak; and (d) at least
one of x₁ and x₂ is high, then there is a ketone subgroup.
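Rendered as code, the rule is a simple pattern check over the spectrum; the peaks set and the high predicate are hypothetical stand-ins for DENDRAL's internal representation:

```python
# A sketch of the ketone rule: `peaks` is the set of fragment masses in
# the spectrum, and `high(m)` is a hypothetical predicate asking whether
# there is a high peak at mass m (returning False for absent masses).
def ketone_subgroup(M, peaks, high):
    for x1 in peaks:
        x2 = M + 28 - x1                        # condition (a): x1 + x2 = M + 28
        if (x2 in peaks
                and high(x1 - 28)               # condition (b)
                and high(x2 - 28)               # condition (c)
                and (high(x1) or high(x2))):    # condition (d)
            return True
    return False
```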
it is also noted in the margin. Subsequent significant uses
of the term are in bold, but not in the margin. We have included a comprehensive index and
an extensive bibliography.
Term
The only prerequisite is familiarity with basic concepts of computer science (algorithms,
data structures, complexity) at a sophom*ore level. Freshman calculus and linear algebra are
useful for some of the topics.
Online resources
Online resources are available through pearsonhighered.com/cs-resources or at the
book’s Web site, aima.cs.berkeley.edu. There you will find:
Exercises, programming projects, and research projects. These are no longer at the end
of each chapter; they are online only. Within the book, we refer to an online exercise
with a name like “Exercise 6.NARY.” Instructions on the Web site allow you to find
exercises by name or by topic.
Implementations of the algorithms in the book in Python, Java, and other programming
languages (currently hosted at github.com/aimacode).
A list of over 1400 schools that have used the book, many with links to online course
materials and syllabi.
Supplementary material and links for students and instructors.
Instructions on how to report errors in the book, in the likely event that some exist.
Book cover
The cover depicts the final position from the decisive game 6 of the 1997 chess match in
which the program Deep Blue defeated Garry Kasparov (playing Black), making this the first
time a computer had beaten a world champion in a chess match. Kasparov is shown at the
top. To his right is a pivotal position from the second game of the historic Go match
between former world champion Lee Sedol and DeepMind’s ALPHAGO program. Move 37 by
ALPHAGO violated centuries of Go orthodoxy and was immediately seen by human experts as
an embarrassing mistake, but it turned out to be a winning move. At top left is an Atlas
humanoid robot built by Boston Dynamics. A depiction of a self-driving car sensing its
environment appears between Ada Lovelace, the world’s first computer programmer, and
Alan Turing, whose fundamental work defined artificial intelligence. At the bottom of the
chess board are a Mars Exploration Rover robot and a statue of Aristotle, who pioneered the
study of logic; his planning algorithm from De Motu Animalium appears behind the authors’
names. Behind the chess board is a probabilistic programming model used by the UN
Comprehensive Nuclear-Test-Ban Treaty Organization for detecting nuclear explosions
from seismic signals.
Acknowledgments
It takes a global village to make a book. Over 600 people read parts of the book and made
suggestions for improvement. The complete list is at aima.cs.berkeley.edu/ack.html;
we are grateful to all of them. We have space here to mention only a few especially
important contributors. First the contributing writers:
Judea Pearl (Section 13.5, Causal Networks);
Vikash Mansinghka (Section 15.3, Programs as Probability Models);
Michael Wooldridge (Chapter 18, Multiagent Decision Making);
Ian Goodfellow (Chapter 21, Deep Learning);
Jacob Devlin and Ming-Wei Chang (Chapter 24, Deep Learning for Natural
Language);
Jitendra Malik and David Forsyth (Chapter 25, Computer Vision);
Anca Dragan (Chapter 26, Robotics).
Then some key roles:
Cynthia Yeung and Malika Cantor (project management);
Julie Sussman and Tom Galloway (copyediting and writing suggestions);
Omari Stephens (illustrations);
Tracy Johnson (editor);
Erin Ault and Rose Kernan (cover and color conversion);
Nalin Chhibber, Sam Goto, Raymond de Lacaze, Ravi Mohan, Ciaran O’Reilly, Amit
Patel, Dragomir Radev, and Samagra Sharma (online code development and mentoring);
Google Summer of Code students (online code development).
Stuart would like to thank his wife, Loy Sheflott, for her endless patience and boundless
wisdom. He hopes that Gordon, Lucy, George, and Isaac will soon be reading this book after
they have forgiven him for working so long on it. RUGS (Russell’s Unusual Group of
Students) have been unusually helpful, as always.
Peter would like to thank his parents (Torsten and Gerda) for getting him started, and his
wife (Kris), children (Bella and Juliet), colleagues, boss, and friends for encouraging and
tolerating him through the long hours of writing and rewriting.
About the Authors
STUART RUSSELL was born in 1962 in Portsmouth, England. He received his B.A. with
first-class honours in physics from Oxford University in 1982, and his Ph.D. in computer
science from Stanford in 1986. He then joined the faculty of the University of California at
Berkeley, where he is a professor and former chair of computer science, director of the
Center for Human-Compatible AI, and holder of the Smith–Zadeh Chair in Engineering. In
1990, he received the Presidential Young Investigator Award of the National Science
Foundation, and in 1995 he was cowinner of the Computers and Thought Award. He is a
Fellow of the American Association for Artificial Intelligence, the Association for Computing
Machinery, and the American Association for the Advancement of Science, an Honorary
Fellow of Wadham College, Oxford, and an Andrew Carnegie Fellow. He held the Chaire
Blaise Pascal in Paris from 2012 to 2014. He has published over 300 papers on a wide range
of topics in artificial intelligence. His other books include The Use of Knowledge in Analogy and
Induction, Do the Right Thing: Studies in Limited Rationality (with Eric Wefald), and Human
Compatible: Artificial Intelligence and the Problem of Control.
PETER NORVIG is currently a Director of Research at Google, Inc., and was previously the
director responsible for the core Web search algorithms. He co-taught an online AI class
that signed up 160,000 students, helping to kick off the current round of massive open
online classes. He was head of the Computational Sciences Division at NASA Ames
Research Center, overseeing research and development in artificial intelligence and
robotics. He received a B.S. in applied mathematics from Brown University and a Ph.D. in
computer science from Berkeley. He has been a professor at the University of Southern
California and a faculty member at Berkeley and Stanford. He is a Fellow of the American
Association for Artificial Intelligence, the Association for Computing Machinery, the
American Academy of Arts and Sciences, and the California Academy of Science. His other
books are Paradigms of AI Programming: Case Studies in Common Lisp, Verbmobil: A Translation
System for Face-to-Face Dialog, and Intelligent Help Systems for UNIX.
The two authors shared the inaugural AAAI/EAAI Outstanding Educator award in 2016.
Contents
I Artificial Intelligence
1 Introduction
1.1 What Is AI?
1.2 The Foundations of Artificial Intelligence
1.3 The History of Artificial Intelligence
1.4 The State of the Art
1.5 Risks and Benefits of AI
Summary
Bibliographical and Historical Notes
Recognizing that the molecule contains a particular substructure reduces the number of
possible candidates enormously. According to its authors, DENDRAL was powerful because it
embodied the relevant knowledge of mass spectroscopy not in the form of first principles
but in efficient “cookbook recipes” (Feigenbaum et al., 1971). The significance of DENDRAL
was that it was the first successful knowledge-intensive system: its expertise derived from
large numbers of special-purpose rules. In 1971, Feigenbaum and others at Stanford began
the Heuristic Programming Project (HPP) to investigate the extent to which the new
methodology of expert systems could be applied to other areas.
Expert systems
The next major effort was the MYCIN system for diagnosing blood infections. With about 450
rules, MYCIN was able to perform as well as some experts, and considerably better than
junior doctors. It also contained two major differences from DENDRAL. First, unlike the
DENDRAL rules, no general theoretical model existed from which the MYCIN rules could be
deduced. They had to be acquired from extensive interviewing of experts. Second, the rules
had to reflect the uncertainty associated with medical knowledge. MYCIN incorporated a
calculus of uncertainty called certainty factors (see Chapter 13), which seemed (at the
time) to fit well with how doctors assessed the impact of evidence on the diagnosis.
Certainty factor
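As an illustration of the flavor of such a calculus (one commonly described combination rule, not necessarily MYCIN's exact bookkeeping), two rules that independently support a conclusion reinforce each other without ever producing certainty greater than 1:

```python
# Combining two positive certainty factors: confidence grows with each
# supporting rule but stays below 1.
def combine(cf1, cf2):
    return cf1 + cf2 * (1 - cf1)

print(combine(0.6, 0.4))  # 0.76: stronger than either rule alone
```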
The first successful commercial expert system, R1, began operation at the Digital Equipment
Corporation (McDermott, 1982). The program helped configure orders for new computer
systems; by 1986, it was saving the company an estimated $40 million a year. By 1988,
DEC’s AI group had 40 expert systems deployed, with more on the way. DuPont had 100 in
use and 500 in development. Nearly every major U.S. corporation had its own AI group and
was either using or investigating expert systems.
The importance of domain knowledge was also apparent in the area of natural language
understanding. Despite the success of Winograd’s SHRDLU system, its methods did not extend
to more general tasks: for problems such as ambiguity resolution it used simple rules that
relied on the tiny scope of the blocks world.
Several researchers, including Eugene Charniak at MIT and Roger Schank at Yale, suggested
that robust language understanding would require general knowledge about the world and
a general method for using that knowledge. (Schank went further, claiming, “There is no
such thing as syntax,” which upset a lot of linguists but did serve to start a useful
discussion.) Schank and his students built a series of programs (Schank and Abelson, 1977;
Wilensky, 1978; Schank and Riesbeck, 1981) that all had the task of understanding natural
language. The emphasis, however, was less on language per se and more on the problems of
representing and reasoning with the knowledge required for language understanding.
The widespread growth of applications to real-world problems led to the development of a
wide range of representation and reasoning tools. Some were based on logic—for example,
the Prolog language became popular in Europe and Japan, and the PLANNER family in the
United States. Others, following Minsky’s idea of frames (1975), adopted a more structured
approach, assembling facts about particular object and event types and arranging the types
into a large taxonomic hierarchy analogous to a biological taxonomy.
Frames
In 1981, the Japanese government announced the “Fifth Generation” project, a 10-year plan
to build massively parallel, intelligent computers running Prolog. The budget was to exceed
$1.3 billion in today’s money. In response, the United States formed the Microelectronics
and Computer Technology Corporation (MCC), a consortium designed to assure national
competitiveness. In both cases, AI was part of a broad effort, including chip design and
human-interface research. In Britain, the Alvey report reinstated the funding removed by
the Lighthill report. However, none of these projects ever met its ambitious goals in terms of
new AI capabilities or economic impact.
Overall, the AI industry boomed from a few million dollars in 1980 to billions of dollars in
1988, including hundreds of companies building expert systems, vision systems, robots, and
software and hardware specialized for these purposes.
Soon after that came a period called the “AI winter,” in which many companies fell by the
wayside as they failed to deliver on extravagant promises. It turned out to be difficult to
build and maintain expert systems for complex domains, in part because the reasoning
methods used by the systems broke down in the face of uncertainty and in part because the
systems could not learn from experience.
1.3.5 The return of neural networks (1986–present)
In the mid-1980s at least four different groups reinvented the back-propagation learning
algorithm first developed in the early 1960s. The algorithm was applied to many learning
problems in computer science and psychology, and the widespread dissemination of the
results in the collection Parallel Distributed Processing (Rumelhart and McClelland, 1986)
caused great excitement.
These so-called connectionist models were seen by some as direct competitors both to the
symbolic models promoted by Newell and Simon and to the logicist approach of McCarthy
and others. It might seem obvious that at some level humans manipulate symbols—in fact,
the anthropologist Terrence Deacon’s book The Symbolic Species (1997) suggests that this is
the defining characteristic of humans. Against this, Geoff Hinton, a leading figure in the
resurgence of neural networks in the 1980s and 2010s, has described symbols as the
“luminiferous aether of AI”—a reference to the non-existent medium through which many
19th-century physicists believed that electromagnetic waves propagated. Certainly, many
concepts that we name in language fail, on closer inspection, to have the kind of logically
defined necessary and sufficient conditions that early AI researchers hoped to capture in
axiomatic form. It may be that connectionist models form internal concepts in a more fluid
and imprecise way that is better suited to the messiness of the real world. They also have
the capability to learn from examples—they can compare their predicted output value to the
true value on a problem and modify their parameters to decrease the difference, making
them more likely to perform well on future examples.
Connectionist
1.3.6 Probabilistic reasoning and machine learning (1987–
present)
The brittleness of expert systems led to a new, more scientific approach incorporating
probability rather than Boolean logic, machine learning rather than hand-coding, and
experimental results rather than philosophical claims. It became more common to build
on existing theories than to propose brand-new ones, to base claims on rigorous theorems
or solid experimental methodology (Cohen, 1995) rather than on intuition, and to show
relevance to real-world applications rather than toy examples.
14 Some have characterized this change as a victory of the neats—those who think that AI theories should be grounded in
mathematical rigor—over the scruffies—those who would rather try out lots of ideas, write some programs, and then assess what
seems to be working. Both approaches are important. A shift toward neatness implies that the field has reached a level of stability and
maturity. The present emphasis on deep learning may represent a resurgence of the scruffies.
Shared benchmark problem sets became the norm for demonstrating progress, including the
UC Irvine repository for machine learning data sets, the International Planning Competition
for planning algorithms, the LibriSpeech corpus for speech recognition, the MNIST data set
for handwritten digit recognition, ImageNet and COCO for image object recognition,
SQUAD for natural language question answering, the WMT competition for machine
translation, and the International SAT Competitions for Boolean satisfiability solvers.
AI was founded in part as a rebellion against the limitations of existing fields like control
theory and statistics, but in this period it embraced the positive results of those fields. As
David McAllester (1998) put it:
In the early period of AI it seemed plausible that new forms of symbolic computation, e.g., frames
and semantic networks, made much of classical theory obsolete. This led to a form of isolationism in
which AI became largely separated from the rest of computer science. This isolationism is currently
being abandoned. There is a recognition that machine learning should not be isolated from
information theory, that uncertain reasoning should not be isolated from stochastic modeling, that
search should not be isolated from classical optimization and control, and that automated reasoning
should not be isolated from formal methods and static analysis.
The field of speech recognition illustrates the pattern. In the 1970s, a wide variety of
different architectures and approaches were tried. Many of these were rather ad hoc and
fragile, and worked on only a few carefully selected examples. In the 1980s, approaches
using hidden Markov models (HMMs) came to dominate the area. Two aspects of HMMs
are relevant. First, they are based on a rigorous mathematical theory. This allowed speech
researchers to build on several decades of mathematical results developed in other fields.
Second, they are generated by a process of training on a large corpus of real speech data.
This ensures that the performance is robust, and in rigorous blind tests HMMs improved
their scores steadily. As a result, speech technology and the related field of handwritten
character recognition made the transition to widespread industrial and consumer
applications. Note that there was no scientific claim that humans use HMMs to recognize
speech; rather, HMMs provided a mathematical framework for understanding and solving
the problem. We will see in Section 1.3.8, however, that deep learning has rather upset
this comfortable narrative.
Hidden Markov models
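At the core of that rigorous theory is a simple recursion; the sketch below runs the HMM forward algorithm on a toy two-state model with invented transition and emission probabilities:

```python
# Forward recursion for a toy HMM: alpha[s] is the probability of the
# observations seen so far, ending in hidden state s.
states = (0, 1)
prior = (0.5, 0.5)
trans = ((0.7, 0.3), (0.4, 0.6))  # trans[r][s] = P(next = s | current = r)
emit  = ((0.9, 0.1), (0.2, 0.8))  # emit[s][o]  = P(observe o | state = s)

def likelihood(observations):
    alpha = [prior[s] * emit[s][observations[0]] for s in states]
    for o in observations[1:]:
        alpha = [sum(alpha[r] * trans[r][s] for r in states) * emit[s][o]
                 for s in states]
    return sum(alpha)             # P(observation sequence | model)

print(likelihood([0, 1, 1]))      # about 0.092
```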
1988 was an important year for the connection between AI and other fields, including
statistics, operations research, decision theory, and control theory. Judea Pearl’s (1988)
Probabilistic Reasoning in Intelligent Systems led to a new acceptance of probability and
decision theory in AI. Pearl’s development of Bayesian networks yielded a rigorous and
efficient formalism for representing uncertain knowledge as well as practical algorithms for
probabilistic reasoning. Chapters 12 to 16 cover this area, in addition to more recent
developments that have greatly increased the expressive power of probabilistic formalisms;
Chapter 20 describes methods for learning Bayesian networks and related models from
data.
Bayesian network
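In miniature, a Bayesian network is a factored joint distribution, one conditional per node; the two-node network below (invented numbers) answers a query by enumerating the joint:

```python
# A two-node network Rain -> WetGrass. The joint factors as
# P(rain, wet) = P(rain) * P(wet | rain).
P_rain = {True: 0.2, False: 0.8}
P_wet_given_rain = {True: 0.9, False: 0.1}

def joint(rain, wet):
    p = P_wet_given_rain[rain]
    return P_rain[rain] * (p if wet else 1 - p)

# Query P(rain | wet) by enumerating the joint:
numerator = joint(True, True)
print(numerator / (numerator + joint(False, True)))  # about 0.69
```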
A second major contribution in 1988 was Rich Sutton’s work connecting reinforcement
learning—which had been used in Arthur Samuel’s checker-playing program in the 1950s—
to the theory of Markov decision processes (MDPs) developed in the field of operations
research. A flood of work followed connecting AI planning research to MDPs, and the field
of reinforcement learning found applications in robotics and process control as well as
acquiring deep theoretical foundations.
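The connection gives reinforcement learning its central recursion: a state's value is its reward plus the discounted value of its successor. A toy deterministic chain (invented numbers) shows the fixed point emerging under repeated Bellman backups:

```python
# Bellman backups on a toy deterministic MDP: state 1 loops on itself
# with reward 1, while state 0 leads into state 1 with reward 0.
gamma = 0.9
reward = {0: 0.0, 1: 1.0}
successor = {0: 1, 1: 1}

V = {0: 0.0, 1: 0.0}
for _ in range(100):
    V = {s: reward[s] + gamma * V[successor[s]] for s in V}
print(V)  # V[1] -> 10 (= 1 / (1 - 0.9)), V[0] -> 9
```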
One consequence of AI’s newfound appreciation for data, statistical modeling, optimization,
and machine learning was the gradual reunification of subfields such as computer vision,
robotics, speech recognition, multiagent systems, and natural language processing that had
become somewhat separate from core AI. The process of reintegration has yielded
significant benefits both in terms of applications—for example, the deployment of practical
robots expanded greatly during this period—and in a better theoretical understanding of the
core problems of AI.
1.3.7 Big data (2001–present)
Remarkable advances in computing power and the creation of the World Wide Web have
facilitated the creation of very large data sets—a phenomenon sometimes known as big
data. These data sets include trillions of words of text, billions of images, and billions of
hours of speech and video, as well as vast amounts of genomic data, vehicle tracking data,
clickstream data, social network data, and so on.
Big data
This has led to the development of learning algorithms specially designed to take advantage
of very large data sets. Often, the vast majority of examples in such data sets are unlabeled;
for example, in Yarowsky’s (1995) influential work on word-sense disambiguation,
occurrences of a word such as “plant” are not labeled in the data set to indicate whether they
refer to flora or factory. With large enough data sets, however, suitable learning algorithms
can achieve an accuracy of over 96% on the task of identifying which sense was intended in
a sentence. Moreover, Banko and Brill (2001) argued that the improvement in performance
obtained from increasing the size of the data set by two or three orders of magnitude
outweighs any improvement that can be obtained from tweaking the algorithm.
A similar phenomenon seems to occur in computer vision tasks such as filling in holes in
photographs—holes caused either by damage or by the removal of ex-friends. Hays and
Efros (2007) developed a clever method for doing this by blending in pixels from similar
images; they found that the technique worked poorly with a database of only thousands of
images but crossed a threshold of quality with millions of images. Soon after, the availability
of tens of millions of images in the ImageNet database (Deng et al., 2009) sparked a
revolution in the field of computer vision.
The availability of big data and the shift towards machine learning helped AI recover
commercial attractiveness (Havenstein, 2005; Halevy et al., 2009). Big data was a crucial
factor in the 2011 victory of IBM’s Watson system over human champions in the Jeopardy!
quiz game, an event that had a major impact on the public’s perception of AI.
1.3.8 Deep learning (2011–present)
The term deep learning refers to machine learning using multiple layers of simple,
adjustable computing elements. Experiments were carried out with such networks as far
back as the 1970s, and in the form of convolutional neural networks they found some
success in handwritten digit recognition in the 1990s (LeCun et al., 1995). It was not until
2011, however, that deep learning methods really took off. This occurred first in speech
recognition and then in visual object recognition.
Deep learning
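A network with just one hidden layer of such adjustable elements already computes the XOR function that, as noted in Section 1.3.3, no single-layer perceptron can represent; the weights below are hand-picked for illustration rather than learned:

```python
# A two-layer network computing XOR: impossible for a single perceptron,
# easy once there is a hidden layer. Weights are hand-picked, not learned.
def relu(x):
    return max(0.0, x)

def xor_net(x1, x2):
    h1 = relu(x1 + x2)        # hidden element 1
    h2 = relu(x1 + x2 - 1)    # hidden element 2
    return h1 - 2 * h2        # output element

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor_net(a, b))  # 0, 1, 1, 0
```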
In the 2012 ImageNet competition, which required classifying images into one of a thousand
categories (armadillo, bookshelf, corkscrew, etc.), a deep learning system created in
Geoffrey Hinton’s group at the University of Toronto (Krizhevsky et al., 2013) demonstrated
a dramatic improvement over previous systems, which were based largely on handcrafted
features. Since then, deep learning systems have exceeded human performance on some
vision tasks (and lag behind in some other tasks). Similar gains have been reported in
speech recognition, machine translation, medical diagnosis, and game playing. The use of a
deep network to represent the evaluation function contributed to ALPHAGO’s victories over
the leading human Go players (Silver et al., 2016, 2017, 2018).
These remarkable successes have led to a resurgence of interest in AI among students,
companies, investors, governments, the media, and the general public. It seems that every
week there is news of a new AI application approaching or exceeding human performance,
often accompanied by speculation of either accelerated success or a new AI winter.
Deep learning relies heavily on powerful hardware. Whereas a standard computer CPU can
do 10^9 or 10^10 operations per second, a deep learning algorithm running on specialized
hardware (e.g., GPU, TPU, or FPGA) might consume between 10^14 and 10^17 operations per
second, mostly in the form of highly parallelized matrix and vector operations. Of course,
deep learning also depends on the availability of large amounts of training data, and on a
few algorithmic tricks (see Chapter 21).
1.4 The State of the Art
AI Index
Stanford University’s One Hundred Year Study on AI (also known as AI100) convenes
panels of experts to provide reports on the state of the art in AI. Their 2016 report (Stone et
al., 2016; Grosz and Stone, 2018) concludes that “Substantial increases in the future uses of
AI applications, including more self-driving cars, healthcare diagnostics and targeted
treatment, and physical assistance for elder care can be expected” and that “Society is now at
a crucial juncture in determining how to deploy AI-based technologies in ways that promote
rather than hinder democratic values such as freedom, equality, and transparency.” AI100
also produces an AI Index at aiindex.org to help track progress. Some highlights from the
2018 and 2019 reports (comparing to a year 2000 baseline unless otherwise stated):
Publications: AI papers increased 20-fold between 2010 and 2019 to about 20,000 a year.
The most popular category was machine learning. (Machine learning papers in arXiv.org
doubled every year from 2009 to 2017.) Computer vision and natural language
processing were the next most popular.
Sentiment: About 70% of news articles on AI are neutral, but articles with positive tone
increased from 12% in 2016 to 30% in 2018. The most common issues are ethical: data
privacy and algorithm bias.
Students: Course enrollment increased 5-fold in the U.S. and 16-fold internationally
from a 2010 baseline. AI is the most popular specialization in Computer Science.
Diversity: AI Professors worldwide are about 80% male, 20% female. Similar numbers
hold for Ph.D. students and industry hires.
Conferences: Attendance at NeurIPS increased 800% since 2012 to 13,500 attendees.
Other conferences are seeing annual growth of about 30%.
Industry: AI startups in the U.S. increased 20-fold to over 800.
Internationalization: China publishes more papers per year than the U.S. and about as
many as all of Europe. However, in citation-weighted impact, U.S. authors are 50%
ahead of Chinese authors. Singapore, Brazil, Australia, Canada, and India are the fastest
growing countries in terms of the number of AI hires.
Vision: Error rates for object detection (as achieved in LSVRC, the Large-Scale Visual
Recognition Challenge) improved from 28% in 2010 to 2% in 2017, exceeding human
performance. Accuracy on open-ended visual question answering (VQA) improved from
55% to 68% since 2015, but lags behind human performance at 83%.
Speed: Training time for the image recognition task dropped by a factor of 100 in just
the past two years. The amount of computing power used in top AI applications is
doubling every 3.4 months.
Language: Accuracy on question answering, as measured by F1 score on the Stanford
Question Answering Dataset (SQUAD), increased from 60 to 95 from 2015 to 2019; on
the SQUAD 2 variant, progress was faster, going from 62 to 90 in just one year. Both
scores exceed human-level performance.
Human benchmarks: By 2019, AI systems had reportedly met or exceeded human-level
performance in chess, Go, poker, Pac-Man, Jeopardy!, ImageNet object detection,
speech recognition in a limited domain, Chinese-to-English translation in a restricted
domain, Quake III, Dota 2, StarCraft II, various Atari games, skin cancer detection,
prostate cancer
detection, protein folding, and diabetic retinopathy diagnosis.
When (if ever) will AI systems achieve human-level performance across a broad variety of
tasks? Ford (2018) interviews AI experts and finds a wide range of target years, from 2029 to
2200, with a mean of 2099. In a similar survey (Grace et al., 2017) 50% of respondents
thought this could happen by 2066, although 10% thought it could happen as early as 2025,
and a few said “never.” The experts were also split on whether we need fundamental new
breakthroughs or just refinements on current approaches. But don’t take their predictions
too seriously; as Philip Tetlock (2017) demonstrates in the area of predicting world events,
experts are no better than amateurs.
How will future AI systems operate? We can’t yet say. As detailed in this section, the field
has adopted several stories about itself—first the bold idea that intelligence by a machine
was even possible, then that it could be achieved by encoding expert knowledge into logic,
then that probabilistic models of the world would be the main tool, and most recently that
machine learning would induce models that might not be based on any well-understood
theory at all. The future will reveal what model comes next.
What can AI do today? Perhaps not as much as some of the more optimistic media articles
might lead one to believe, but still a great deal. Here are some examples:
ROBOTIC VEHICLES: The history of robotic vehicles stretches back to radio-controlled
cars of the 1920s, but the first demonstrations of autonomous road driving without special
guides occurred in the 1980s (Kanade et al., 1986; Dickmanns and Zapp, 1987). After
successful demonstrations of driving on dirt roads in the 132-mile DARPA Grand Challenge
in 2005 (Thrun, 2006) and on streets with traffic in the 2007 Urban Challenge, the race to
develop self-driving cars began in earnest. In 2018, Waymo test vehicles passed the
landmark of 10 million miles driven on public roads without a serious accident, with the
human driver stepping in to take over control only once every 6,000 miles. Soon after, the
company began offering a commercial robotic taxi service.
In the air, autonomous fixed-wing drones have been providing cross-country blood
deliveries in Rwanda since 2016. Quadcopters perform remarkable aerobatic maneuvers,
explore buildings while constructing 3-D maps, and self-assemble into autonomous
formations.
LEGGED LOCOMOTION: BigDog, a quadruped robot by Raibert et al. (2008), upended our
notions of how robots move—no longer the slow, stiff-legged, side-to-side gait of Hollywood
movie robots, but something closely resembling an animal and able to recover when shoved
or when slipping on an icy puddle. Atlas, a humanoid robot, not only walks on uneven
terrain but jumps onto boxes and does backflips (Ackerman and Guizzo, 2016).
AUTONOMOUS PLANNING AND SCHEDULING: A hundred million miles from Earth,
NASA’s Remote Agent program became the first on-board autonomous planning program to
control the scheduling of operations for a spacecraft (Jonsson et al., 2000). Remote Agent
generated plans from high-level goals specified from the ground and monitored the
execution of those plans—detecting, diagnosing, and recovering from problems as they
occurred. Today, the EUROPA planning toolkit (Barreiro et al., 2012) is used for daily
operations of NASA’s Mars rovers and the SEXTANT system (Winternitz, 2017) allows
autonomous navigation in deep space, beyond the global GPS system.
During the Persian Gulf crisis of 1991, U.S. forces deployed a Dynamic Analysis and
Replanning Tool, DART (Cross and Walker, 1994), to do automated logistics planning and
scheduling for transportation. This involved up to 50,000 vehicles, cargo, and people at a
time, and had to account for starting points, destinations, routes, transport capacities, port
and airfield capacities, and conflict resolution among all parameters. The Defense Advanced
Research Project Agency (DARPA) stated that this single application more than paid back
DARPA’s 30-year investment in AI.
Every day, ride hailing companies such as Uber and mapping services such as Google Maps
provide driving directions for hundreds of millions of users, quickly plotting an optimal
route taking into account current and predicted future traffic conditions.
MACHINE TRANSLATION: Online machine translation systems now enable the reading of
documents in over 100 languages, including the native languages of over 99% of humans,
and render hundreds of billions of words per day for hundreds of millions of users. While
not perfect, they are generally adequate for understanding. For closely related languages
with a great deal of training data (such as French and English) translations within a narrow
domain are close to the level of a human (Wu et al., 2016b).
SPEECH RECOGNITION: In 2017, Microsoft showed that its Conversational Speech
Recognition System had reached a word error rate of 5.1%, matching human performance
on the Switchboard task, which involves transcribing telephone conversations (Xiong et al.,
2017). About a third of computer interaction worldwide is now done by voice rather than
keyboard; Skype provides real-time speech-to-speech translation in ten languages. Alexa,
Siri, Cortana, and Google offer assistants that can answer questions and carry out tasks for
the user; for example the Google Duplex service uses speech recognition and speech
synthesis to make restaurant reservations for users, carrying out a fluent conversation on
their behalf.
RECOMMENDATIONS: Companies such as Amazon, Facebook, Netflix, Spotify, YouTube,
Walmart, and others use machine learning to recommend what you might like based on
your past experiences and those of others like you. The field of recommender systems has a
long history (Resnick and Varian, 1997) but is changing rapidly due to new deep learning
methods that analyze content (text, music, video) as well as history and metadata (van den
Oord et al., 2014; Zhang et al., 2017). Spam filtering can also be considered a form of
recommendation (or dis-recommendation); current AI techniques filter out over 99.9% of
spam, and email services can also recommend potential recipients, as well as possible
response text.
GAME PLAYING: When Deep Blue defeated world chess champion Garry Kasparov in 1997,
defenders of human supremacy placed their hopes on Go. Piet Hut, an astrophysicist and
Go enthusiast, predicted that it would take “a hundred years before a computer beats
humans at Go—maybe even longer.” But just 20 years later, ALPHAGO surpassed all human
players (Silver et al., 2017). Ke Jie, the world champion, said, “Last year, it was still quite
human-like when it played. But this year, it became like a god of Go.” ALPHAGO benefited
from studying hundreds of thousands of past games by human Go players, and from the
distilled knowledge of expert Go players that worked on the team.
A followup program, ALPHAZERO, used no input from humans (except for the rules of the
game), and was able to learn through self-play alone to defeat all opponents, human and
machine, at Go, chess, and shogi (Silver et al., 2018). Meanwhile, human champions have
been beaten by AI systems at games as diverse as Jeopardy! (Ferrucci et al., 2010), poker
(Bowling et al., 2015; Moravčík et al., 2017; Brown and Sandholm, 2019), and the video
games Dota 2 (Fernandez and Mahlmann, 2018), StarCraft II (Vinyals et al., 2019), and
Quake III (Jaderberg et al., 2019).
IMAGE UNDERSTANDING: Not content with exceeding human accuracy on the
challenging ImageNet object recognition task, computer vision researchers have taken on
the more difficult problem of image captioning. Some impressive examples include “A
person riding a motorcycle on a dirt road,” “Two pizzas sitting on top of a stove top oven,”
and “A group of young people playing a game of frisbee” (Vinyals et al., 2017b). Current
systems are far from perfect, however: a “refrigerator filled with lots of food and drinks”
turns out to be a no-parking sign partially obscured by lots of small stickers.
MEDICINE: AI algorithms now equal or exceed expert doctors at diagnosing many
conditions, particularly when the diagnosis is based on images. Examples include
Alzheimer’s disease (Ding et al., 2018), metastatic cancer (Liu et al., 2017; Esteva et al., 2017),
ophthalmic disease (Gulshan et al., 2016), and skin diseases (Liu et al., 2019c). A systematic
review and meta-analysis (Liu et al., 2019a) found that the performance of AI programs, on
average, was equivalent to health care professionals. One current emphasis in medical AI is
in facilitating human–machine partnerships. For example, the LYNA system achieves 99.6%
overall accuracy in diagnosing metastatic breast cancer—better than an unaided human
expert—but the combination does better still (Liu et al., 2018; Steiner et al., 2018).
The widespread adoption of these techniques is now limited not by diagnostic accuracy but
by the need to demonstrate improvement in clinical outcomes and to ensure transparency,
lack of bias, and data privacy (Topol, 2019). In 2017, only two medical AI applications were
approved by the FDA, but that increased to 12 in 2018, and continues to rise.
CLIMATE SCIENCE: A team of scientists won the 2018 Gordon Bell Prize for a deep
learning model that discovers detailed information about extreme weather events that were
previously buried in climate data. They used a supercomputer with specialized GPU
hardware to exceed the exaop level (10^18 operations per second), the first machine learning
program to do so (Kurth et al., 2018). Rolnick et al. (2019) present a 60-page catalog of ways
in which machine learning can be used to tackle climate change.
These are just a few examples of artificial intelligence systems that exist today. Not magic or
science fiction—but rather science, engineering, and mathematics, to which this book
provides an introduction.
1.5 Risks and Benefits of AI
Francis Bacon, a philosopher credited with creating the scientific method, noted in The
Wisdom of the Ancients (1609) that the “mechanical arts are of ambiguous use, serving as well
for hurt as for remedy.” As AI plays an increasingly important role in the economic, social,
scientific, medical, financial, and military spheres, we would do well to consider the hurts
and remedies—in modern parlance, the risks and benefits—that it can bring. The topics
summarized here are covered in greater depth in Chapters 27 and 28.
To begin with the benefits: put simply, our entire civilization is the product of our human
intelligence. If we have access to substantially greater machine intelligence, the ceiling on
our ambitions is raised substantially. The potential for AI and robotics to free humanity from
menial repetitive work and to dramatically increase the production of goods and services
could presage an era of peace and plenty. The capacity to accelerate scientific research could
result in cures for disease and solutions for climate change and resource shortages. As
Demis Hassabis, CEO of Google DeepMind, has suggested: “First solve AI, then use AI to
solve everything else.”
Long before we have an opportunity to “solve AI,” however, we will incur risks from the
misuse of AI, inadvertent or otherwise. Some of these are already apparent, while others
seem likely based on current trends:
LETHAL AUTONOMOUS WEAPONS: These are defined by the United Nations as
weapons that can locate, select, and eliminate human targets without human
intervention. A primary concern with such weapons is their scalability: the absence of a
requirement for human supervision means that a small group can deploy an arbitrarily
large number of weapons against human targets defined by any feasible recognition
criterion. The technologies needed for autonomous weapons are similar to those needed
for self-driving cars. Informal expert discussions on the potential risks of lethal
autonomous weapons began at the UN in 2014, moving to the formal pre-treaty stage of
a Group of Governmental Experts in 2017.
SURVEILLANCE AND PERSUASION: While it is expensive, tedious, and sometimes
legally questionable for security personnel to monitor phone lines, video camera feeds,
emails, and other messaging channels, AI (speech recognition, computer vision, and
natural language understanding) can be used in a scalable fashion to perform mass
surveillance of individuals and detect activities of interest. By tailoring information flows
to individuals through social media, based on machine learning techniques, political
behavior can be modified and controlled to some extent—a concern that became
apparent in elections beginning in 2016.
BIASED DECISION MAKING: Careless or deliberate misuse of machine learning
algorithms for tasks such as evaluating parole and loan applications can result in
decisions that are biased by race, gender, or other protected categories. Often, the data
themselves reflect pervasive bias in society.
IMPACT ON EMPLOYMENT: Concerns about machines eliminating jobs are centuries
old. The story is never simple: machines do some of the tasks that humans might
otherwise do, but they also make humans more productive and therefore more
employable, and make companies more profitable and therefore able to pay higher
wages. They may render some activities economically viable that would otherwise be
impractical. Their use generally results in increasing wealth but tends to have the effect
of shifting wealth from labor to capital, further exacerbating increases in inequality.
Previous advances in technology—such as the invention of mechanical looms—have
resulted in serious disruptions to employment, but eventually people find new kinds of
work to do. On the other hand, it is possible that AI will be doing those new kinds of
work too. This topic is rapidly becoming a major focus for economists and governments
around the world.
SAFETY-CRITICAL APPLICATIONS: As AI techniques advance, they are increasingly
used in high-stakes, safety-critical applications such as driving cars and managing the
water supplies of cities. Fatal accidents have already occurred and highlight the difficulty
of formal verification and statistical risk analysis for systems developed using machine
learning techniques. The field of AI will need to develop technical and ethical standards
at least comparable to those prevalent in other engineering and healthcare disciplines
where people’s lives are at stake.
CYBERSECURITY: AI techniques are useful in defending against cyberattack, for
example by detecting unusual patterns of behavior, but they will also contribute to the
potency, survivability, and proliferation capability of malware. For example,
reinforcement learning methods have been used to create highly effective tools for
automated, personalized blackmail and phishing attacks.
We will revisit these topics in more depth in Section 27.3. As AI systems become more
capable, they will take on more of the societal roles previously played by humans. Just as
humans have used these roles in the past to perpetrate mischief, we can expect that humans
may misuse AI systems in these roles to perpetrate even more mischief. All of the examples
given above point to the importance of governance and, eventually, regulation. At present,
the research community and the major corporations involved in AI research have developed
voluntary self-governance principles for AI-related activities (see Section 27.3).
Governments and international organizations are setting up advisory bodies to devise
appropriate regulations for each specific use case, to prepare for the economic and social
impacts, and to take advantage of AI capabilities to address major societal problems.
What of the longer term? Will we achieve the long-standing goal: the creation of intelligence
comparable to or more capable than human intelligence? And, if we do, what then?
Human-level AI
Artificial general intelligence (AGI)
For much of AI’s history, these questions have been overshadowed by the daily grind of
getting AI systems to do anything even remotely intelligent. As with any broad discipline,
the great majority of AI researchers have specialized in a specific subfield such as game-
playing, knowledge representation, vision, or natural language understanding—often on the
assumption that progress in these subfields would contribute to the broader goals of AI. Nils
Nilsson (1995), one of the original leaders of the Shakey project at SRI, reminded the field of
those broader goals and warned that the subfields were in danger of becoming ends in
themselves. Later, some influential founders of AI, including John McCarthy (2007), Marvin
Minsky (2007), and Patrick Winston (Beal and Winston, 2009), concurred with Nilsson’s
warnings, suggesting that instead of focusing on measurable performance in specific
applications, AI should return to its roots of striving for, in Herb Simon’s words, “machines
that think, that learn and that create.” They called the effort human-level AI or HLAI—a
machine should be able to learn to do anything a human can do. Their first symposium was
in 2004 (Minsky et al., 2004). Another effort with similar goals, the artificial general
intelligence (AGI) movement
(Goertzel and Pennachin, 2007), held its first conference and
organized the Journal of Artificial General Intelligence in 2008.
Artificial superintelligence (ASI)
At around the same time, concerns were raised that creating artificial superintelligence or
ASI—intelligence that far surpasses human ability—might be a bad idea (Yudkowsky, 2008;
Omohundro, 2008). Turing (1996) himself made the same point in a lecture given in
Manchester in 1951, drawing on earlier ideas from Samuel Butler (1863):
15 Even earlier, in 1847, Richard Thornton, editor of the Primitive Expounder, railed against mechanical calculators: “Mind ... outruns
itself and does away with the necessity of its own existence by inventing machines to do its own thinking. ... But who knows that such
machines when brought to greater perfection, may not think of a plan to remedy all their own defects and then grind out ideas beyond
the ken of mortal mind!”
It seems probable that once the machine thinking method had started, it would not take long to
outstrip our feeble powers. ... At some stage therefore we should have to expect the machines to take
control, in the way that is mentioned in Samuel Butler’s Erewhon.
These concerns have only become more widespread with recent advances in deep learning,
the publication of books such as Superintelligence by Nick Bostrom (2014), and public
pronouncements from Stephen Hawking, Bill Gates, Martin Rees, and Elon Musk.
Experiencing a general sense of unease with the idea of creating superintelligent machines is
only natural. We might call this the gorilla problem: about seven million years ago, a now-
extinct primate evolved, with one branch leading to gorillas and one to humans. Today, the
gorillas are not too happy about the human branch; they have essentially no control over
their future. If this is the result of success in creating superhuman AI—that humans cede
control over their future—then perhaps we should stop work on AI, and, as a corollary, give
up the benefits it might bring. This is the essence of Turing’s warning: it is not obvious that
we can control machines that are more intelligent than us.
Gorilla problem
If superhuman AI were a black box that arrived from outer space, then indeed it would be
wise to exercise caution in opening the box. But it is not: we design the AI systems, so if
they do end up “taking control,” as Turing suggests, it would be the result of a design failure.
To avoid such an outcome, we need to understand the source of potential failure. Norbert
Wiener (1960), who was motivated to consider the long-term future of AI after seeing
Arthur Samuel’s checker-playing program learn to beat its creator, had this to say:
If we use, to achieve our purposes, a mechanical agency with whose operation we cannot interfere
effectively ... we had better be quite sure that the purpose put into the machine is the purpose which
we really desire.
Many cultures have myths of humans who ask gods, genies, magicians, or devils for
something. Invariably, in these stories, they get what they literally ask for, and then regret it.
The third wish, if there is one, is to undo the first two. We will call this the King Midas
problem: Midas, a legendary King in Greek mythology, asked that everything he touched
should turn to gold, but then regretted it after touching his food, drink, and family
members.
16 Midas would have done better if he had followed basic principles of safety and included an “undo” button and a “pause” button in
his wish.
King Midas problem
We touched on this issue in Section 1.1.5, where we pointed out the need for a significant
modification to the standard model of putting fixed objectives into the machine. The
solution to Wiener’s predicament is not to have a definite “purpose put into the machine” at
all. Instead, we want machines that strive to achieve human objectives but know that they
don’t know for certain exactly what those objectives are.
It is perhaps unfortunate that almost all AI research to date has been carried out within the
standard model, which means that almost all of the technical material in this edition reflects
that intellectual framework. There are, however, some early results within the new
framework. In Chapter 16, we show that a machine has a positive incentive to allow itself
to be switched off if and only if it is uncertain about the human objective. In Chapter 18,
we formulate and study assistance games, which describe mathematically the situation in
which a human has an objective and a machine tries to achieve it, but is initially uncertain
about what it is. In Chapter 22, we explain the methods of inverse reinforcement learning
that allow machines to learn more about human preferences from observations of the
choices that humans make. In Chapter 27, we explore two of the principal difficulties: first,
that our choices depend on our preferences through a very complex cognitive architecture
that is hard to invert; and, second, that we humans may not have consistent preferences in
the first place—either individually or as a group—so it may not be clear what AI systems
should be doing for us.
Assistance game
Inverse reinforcement learning
Summary
This chapter defines AI and establishes the cultural background against which it has
developed. Some of the important points are as follows:
Different people approach AI with different goals in mind. Two important questions to
ask are: Are you concerned with thinking, or behavior? Do you want to model humans,
or try to achieve the optimal results?
According to what we have called the standard model, AI is concerned mainly with
rational action. An ideal intelligent agent takes the best possible action in a situation.
We study the problem of building agents that are intelligent in this sense.
Two refinements to this simple idea are needed: first, the ability of any agent, human or
otherwise, to choose rational actions is limited by the computational intractability of
doing so; second, the concept of a machine that pursues a definite objective needs to be
replaced with that of a machine
pursuing objectives to benefit humans, but uncertain as
to what those objectives are.
Philosophers (going back to 400 BCE) made AI conceivable by suggesting that the mind is
in some ways like a machine, that it operates on knowledge encoded in some internal
language, and that thought can be used to choose what actions to take.
Mathematicians provided the tools to manipulate statements of logical certainty as well
as uncertain, probabilistic statements. They also set the groundwork for understanding
computation and reasoning about algorithms.
Economists formalized the problem of making decisions that maximize the expected
utility to the decision maker.
Neuroscientists discovered some facts about how the brain works and the ways in which
it is similar to and different from computers.
Psychologists adopted the idea that humans and animals can be considered
information-processing machines. Linguists showed that language use fits into this
model.
Computer engineers provided the ever-more-powerful machines that make AI
applications possible, and software engineers made them more usable.
Control theory deals with designing devices that act optimally on the basis of feedback
from the environment. Initially, the mathematical tools of control theory were quite
different from those used in AI, but the fields are coming closer together.
The history of AI has had cycles of success, misplaced optimism, and resulting cutbacks
in enthusiasm and funding. There have also been cycles of introducing new, creative
approaches and systematically refining the best ones.
AI has matured considerably compared to its early decades, both theoretically and
methodologically. As the problems that AI deals with became more complex, the field
moved from Boolean logic to probabilistic reasoning, and from hand-crafted knowledge
to machine learning from data. This has led to improvements in the capabilities of real
systems and greater integration with other disciplines.
As AI systems find application in the real world, it has become necessary to consider a
wide range of risks and ethical consequences.
In the longer term, we face the difficult problem of controlling superintelligent AI
systems that may evolve in unpredictable ways. Solving this problem seems to
necessitate a change in our conception of AI.
Bibliographical and Historical Notes
A comprehensive history of AI is given by Nils Nilsson (2009), one of the early pioneers of
the field. Pedro Domingos (2015) and Melanie Mitchell (2019) give overviews of machine
learning for a general audience, and Kai-Fu Lee (2018) describes the race for international
leadership in AI. Martin Ford (2018) interviews 23 leading AI researchers.
The main professional societies for AI are the Association for the Advancement of Artificial
Intelligence (AAAI), the ACM Special Interest Group in Artificial Intelligence (SIGAI,
formerly SIGART), the European Association for AI, and the Society for Artificial
Intelligence and Simulation of Behaviour (AISB). The Partnership on AI brings together
many commercial and nonprofit organizations concerned with the ethical and social impacts
of AI. AAAI’s AI Magazine contains many topical and tutorial articles, and its Web site,
aaai.org, contains news, tutorials, and background information.
The most recent work appears in the proceedings of the major AI conferences: the
International Joint Conference on AI (IJCAI), the annual European Conference on AI
(ECAI), and the AAAI Conference. Machine learning is covered by the International
Conference on Machine Learning and the Neural Information Processing Systems (NeurIPS)
meeting. The major journals for general AI are Artificial Intelligence, Computational
Intelligence, the IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Intelligent
Systems, and the Journal of Artificial Intelligence Research. There are also many conferences
and journals devoted to specific areas, which we cover in the appropriate chapters.
Chapter 2
Intelligent Agents
In which we discuss the nature of agents, perfect or otherwise, the diversity of environments, and the
resulting menagerie of agent types.
Chapter 1 identified the concept of rational agents as central to our approach to artificial
intelligence. In this chapter, we make this notion more concrete. We will see that the
concept of rationality can be applied to a wide variety of agents operating in any imaginable
environment. Our plan in this book is to use this concept to develop a small set of design
principles for building successful agents—systems that can reasonably be called intelligent.
We begin by examining agents, environments, and the coupling between them. The
observation that some agents behave better than others leads naturally to the idea of a
rational agent—one that behaves as well as possible. How well an agent can behave
depends on the nature of the environment; some environments are more difficult than
others. We give a crude categorization of environments and show how properties of an
environment influence the design of suitable agents for that environment. We describe a
number of basic “skeleton” agent designs, which we flesh out in the rest of the book.
2.1 Agents and Environments
An agent is anything that can be viewed as perceiving its environment through sensors and
acting upon that environment through actuators. This simple idea is illustrated in Figure
2.1. A human agent has eyes, ears, and other organs for sensors and hands, legs, vocal
tract, and so on for actuators. A robotic agent might have cameras and infrared range finders
for sensors and various motors for actuators. A software agent receives file contents,
network packets, and human input (keyboard/mouse/touchscreen/voice) as sensory inputs
and acts on the environment by writing files, sending network packets, and displaying
information or generating sounds. The environment could be everything—the entire
universe! In practice it is just that part of the universe whose state we care about when
designing this agent—the part that affects what the agent perceives and that is affected by
the agent’s actions.
Figure 2.1
Agents interact with environments through sensors and actuators.
Environment
Sensor
Actuator
We use the term percept to refer to the content an agent’s sensors are perceiving. An agent’s
percept sequence is the complete history of everything the agent has ever perceived. In
general, an agent’s choice of action at any given instant can depend on its built-in knowledge and
on the entire percept sequence observed to date, but not on anything it hasn’t perceived. By
specifying the agent’s choice of action for every possible percept sequence, we have said
more or less everything there is to say about the agent. Mathematically speaking, we say
that an agent’s behavior is described by the agent function that maps any given percept
sequence to an action.
Percept
Percept sequence
Agent function
We can imagine tabulating the agent function that describes any given agent; for most
agents, this would be a very large table—infinite, in
fact, unless we place a bound on the
length of percept sequences we want to consider. Given an agent to experiment with, we
can, in principle, construct this table by trying out all possible percept sequences and
recording which actions the agent does in response. The table is, of course, an external
characterization of the agent. Internally, the agent function for an artificial agent will be
implemented by an agent program. It is important to keep these two ideas distinct. The
agent function is an abstract mathematical description; the agent program is a concrete
implementation, running within some physical system.
1 If the agent uses some randomization to choose its actions, then we would have to try each sequence many times to identify the
probability of each action. One might imagine that acting randomly is rather silly, but we show later in this chapter that it can be very
intelligent.
Agent program
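To make the distinction concrete, here is a minimal Python sketch (ours, not the book's code) of an agent program that implements a given agent function by table lookup; the table maps percept sequences, represented as tuples, to actions:

    def table_driven_agent_program(table):
        """Return an agent program realizing the agent function in `table`."""
        percepts = []  # the percept sequence observed so far

        def program(percept):
            percepts.append(percept)
            return table.get(tuple(percepts))  # look up the entire sequence

        return program

As the preceding paragraph notes, the table is unboundedly large for most agents of interest; the sketch only makes the function/program distinction explicit.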
To illustrate these ideas, we use a simple example—the vacuum-cleaner world, which
consists of a robotic vacuum-cleaning agent in a world consisting of squares that can be
either dirty or clean. Figure 2.2 shows a configuration with just two squares, A and B. The
vacuum agent perceives which square it is in and whether there is dirt in the square. The
agent starts in square A. The available actions are to move to the right, move to the left,
suck up the dirt, or do nothing. One very simple agent function is the following: if the
current square is dirty, then suck; otherwise, move to the other square. A partial tabulation
of this agent function is shown in Figure 2.3 and an agent program that implements it
appears in Figure 2.8 on page 49.
2 In a real robot, it would be unlikely to have actions like “move right” and “move left.” Instead the actions would be “spin wheels
forward” and “spin wheels backward.” We have chosen the actions to be easier to follow on the page, not for ease of implementation
in an actual robot.
Figure 2.2
A vacuum-cleaner world with just two locations. Each location can be clean or dirty, and the agent can
move left or right and can clean the square that it occupies. Different versions of the vacuum world allow
for different rules about what the agent can perceive, whether its actions always succeed, and so on.
Figure 2.3
Partial tabulation of a simple agent function for the vacuum-cleaner world shown in Figure 2.2. The
agent cleans the current square if it is dirty, otherwise it moves to the other square. Note that the table is
of unbounded size unless there is a restriction on the length of possible percept sequences.
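The tabulated function can also be written directly as a short agent program. Here is a minimal Python sketch (ours; the book's own pseudocode version appears in Figure 2.8), with the location and action names as illustrative string constants:

    def reflex_vacuum_agent(percept):
        """Percept is a (location, status) pair, as in Figure 2.3."""
        location, status = percept
        if status == "Dirty":
            return "Suck"
        return "Right" if location == "A" else "Left"  # move to the other square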
Looking at Figure 2.3, we see that various vacuum-world agents can be defined simply by
filling in the right-hand column in various ways. The obvious question, then, is this: What is
the right way to fill out the table? In other words, what makes an agent good or bad, intelligent
or stupid? We answer these questions in the next section.
Before closing this section, we should emphasize that the notion of an agent is meant to be a
tool for analyzing systems, not an absolute characterization that divides the world into
agents and non-agents. One could view a hand-held calculator as an agent that chooses the
action of displaying “4” when given the percept sequence “2 + 2 =,” but such an analysis
would hardly aid our understanding of the calculator. In a sense, all areas of engineering can
be seen as designing artifacts that interact with the world; AI operates at (what the authors
consider to be) the most interesting end of the spectrum, where the artifacts have significant
computational resources and the task environment requires nontrivial decision making.
2.2 Good Behavior: The Concept of Rationality
A rational agent is one that does the right thing. Obviously, doing the right thing is better
than doing the wrong thing, but what does it mean to do the right thing?
Rational agent
2.2.1 Performance measures
Moral philosophy has developed several different notions of the “right thing,” but AI has
generally stuck to one notion called consequentialism: we evaluate an agent’s behavior by
its consequences. When an agent is plunked down in an environment, it generates a
sequence of actions according to the percepts it receives. This sequence of actions causes
the environment to go through a sequence of states. If the sequence is desirable, then the
agent has performed well. This notion of desirability is captured by a performance measure
that evaluates any given sequence of environment states.
Consequentialism
Performance measure
Humans have desires and preferences of their own, so the notion of rationality as applied to
humans has to do with their success in choosing actions that produce sequences of
environment states that are desirable from their point of view. Machines, on the other hand,
do not have desires and preferences of their own; the performance measure is, initially at
least, in the mind of the designer of the machine, or in the mind of the users the machine is
designed for. We will see that some agent designs have an explicit representation of (a
version of) the performance measure, while in other designs the performance measure is
entirely implicit—the agent may do the right thing, but it doesn’t know why.
Recalling Norbert Wiener’s warning to ensure that “the purpose put into the machine is the
purpose which we really desire” (page 33), notice that it can be quite hard to formulate a
performance measure correctly. Consider, for example, the vacuum-cleaner agent from the
preceding section. We might propose to measure performance by the amount of dirt cleaned
up in a single eight-hour shift. With a rational agent, of course, what you ask for is what you
get. A rational agent can maximize this performance measure by cleaning up the dirt, then
dumping it all on the floor, then cleaning it up again, and so on. A more suitable
performance measure would reward the agent for having a clean floor. For example, one
point could be awarded for each clean square at each time step (perhaps with a penalty for
electricity consumed and noise generated). As a general rule, it is better to design performance
measures according to what one actually wants to be achieved in the environment, rather than
according to how one thinks the agent should behave.
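As an illustrative sketch of the suggested measure (ours, not the book's code), with each environment state represented as a mapping from square names to a cleanliness flag:

    def performance(state_history):
        """Award one point for each clean square at each time step."""
        return sum(sum(1 for clean in state.values() if clean)
                   for state in state_history)

    # Square A stays clean and B is cleaned at the second step: 1 + 2 = 3 points.
    print(performance([{"A": True, "B": False}, {"A": True, "B": True}]))

A penalty for electricity or noise, as suggested above, would subtract terms computed from the actions taken between states.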
Even when the obvious pitfalls are avoided, some knotty problems remain. For example, the
notion of “clean floor” in the preceding paragraph is based on average cleanliness over time.
Yet the same average cleanliness can be achieved by two different agents, one of which does
a mediocre job all the time while the other cleans energetically but takes long breaks. Which
is preferable might seem to be a fine point of janitorial science, but in fact it is a deep
philosophical question with far-reaching implications. Which is better—a reckless life of
highs and lows, or a safe but humdrum existence? Which is better—an economy where
everyone lives in moderate poverty, or one in which some live in plenty while others are
very poor? We leave these questions as an exercise for the diligent reader.
For most of the book, we will assume that the performance measure can be specified
correctly. For the reasons given above, however, we must accept the possibility that we
might put the wrong purpose into the machine—precisely the King Midas problem described
on page 33. Moreover, when designing one piece of software, copies of which will belong to
different users, we cannot anticipate the exact preferences of each individual user. Thus, we
may need to build agents that reflect initial uncertainty about the true performance measure
and learn more about it as time goes by; such agents are described in Chapters 16, 18,
and 22.
2.2.2 Rationality
What is rational at any given time depends on four things:
The performance measure that defines the criterion of success.
The agent’s prior knowledge of the environment.
The actions that the agent can perform.
The agent’s percept sequence to date.
This leads to a definition of a rational agent:
For each possible percept sequence, a rational agent should select an action that is expected to
maximize its performance measure, given the evidence provided by the percept sequence and
whatever built-in knowledge the agent has.
Definition of a rational agent
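Written symbolically (a sketch; the notation is ours, not the text's): if e_{1:t} is the percept sequence to date, K the agent's built-in knowledge, and M the performance measure applied to the resulting sequence of environment states S_{0:T}, the rational choice is

    a* = argmax_a  E[ M(S_{0:T}) | e_{1:t}, K, a ],

where the expectation captures the agent's uncertainty about how the environment will respond to the candidate action a.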
Consider the simple vacuum-cleaner agent that cleans a square if it is dirty and moves to the
other square if not; this is the agent function tabulated in Figure 2.3. Is this a rational
agent? That depends! First, we need to say what the performance measure is, what is known
about the environment, and what sensors and actuators the agent has. Let us assume the
following:
The performance measure awards one point for each clean square at each time step,
over a “lifetime” of 1000 time steps.
The “geography” of the environment is known a priori (Figure 2.2) but the dirt
distribution and the initial location of the agent are not. Clean squares stay clean and
sucking cleans the current square. The Right and Left actions move the agent one
square except when this would take the agent outside the environment, in which case
the agent remains where it is.
The only available actions are Right, Left, and Suck.
The agent correctly perceives its location and whether that location contains dirt.
Under these circumstances the agent is indeed rational; its expected performance is at least
as good as any other agent’s.
One can see easily that the same agent would be irrational under different circumstances.
For example, once all the dirt is cleaned up, the agent will oscillate needlessly back and
forth; if the performance measure includes a penalty of one point for each movement, the
agent will fare poorly. A better agent for this case would do nothing once it is sure that all
the squares are clean. If clean squares can become dirty again, the agent should occasionally
check and re-clean them if needed. If the geography of the environment is unknown, the
agent will need to explore it. Exercise 2.VACR asks you to design agents for these cases.
2.2.3 Omniscience, learning, and autonomy
Omniscience
We need to be careful to distinguish between rationality and omniscience. An omniscient
agent knows the actual outcome of its actions and can act accordingly; but omniscience is
impossible in reality. Consider the following example: I am walking along the Champs
Elysées one day and I see an old friend across the street. There is no traffic nearby and I’m
not otherwise engaged, so, being rational, I start to cross the street. Meanwhile, at 33,000
feet, a cargo door falls off a passing airliner, and before I make it to the other side of the
street I am flattened. Was I irrational to cross the street? It is unlikely that my obituary
would read “Idiot attempts to cross street.”
3 See N. Henderson, “New door latches urged for Boeing 747 jumbo jets,” Washington Post, August 24, 1989.
This example shows that rationality is not the same as perfection. Rationality maximizes
expected performance, while perfection maximizes actual performance. Retreating from a
requirement of perfection is not just a question of being fair to agents. The point is that if we
expect an agent to do what turns out after the fact to be the best action, it will be impossible
to design an agent to fulfill this specification—unless we improve the performance of crystal
balls or time machines.
Our definition of rationality does not require omniscience, then, because the rational choice
depends only on the percept sequence to date. We must also ensure that we haven’t
inadvertently allowed the agent to engage in decidedly underintelligent activities. For
example, if an agent does not look both ways before crossing a busy road, then its percept
sequence will not tell it that there is a large truck approaching at high speed. Does our
definition of rationality say that it’s now OK to cross the road? Far from it!
First, it would not be rational to cross the road given this uninformative percept sequence:
the risk of accident from crossing without looking is too great. Second, a rational agent
should choose the “looking” action before stepping into the street, because looking helps
maximize the expected performance. Doing actions in order to modify future percepts—
sometimes called information gathering—is an important part of rationality and is covered
in depth in Chapter 16 . A second example of information gathering is provided by the
exploration that must be undertaken by a vacuum-cleaning agent in an initially unknown
environment.
Information gathering
Our definition requires a rational agent not only to gather information but also to learn as
much as possible from what it perceives. The agent’s initial configuration could reflect some
prior knowledge of the environment, but as the agent gains experience this may be modified
and augmented. There are extreme cases in which the environment is completely known a
priori and completely predictable. In such cases, the agent need not perceive or learn; it
simply acts correctly.
Learning
Of course, such agents are fragile. Consider the lowly dung beetle. After digging its nest and
laying its eggs, it fetches a ball of dung from a nearby heap to plug the entrance. If the ball
of dung is removed from its grasp en route, the beetle continues its task and pantomimes
plugging the nest with the nonexistent dung ball, never noticing that it is missing. Evolution
has built an assumption into the beetle’s behavior, and when it is violated, unsuccessful
behavior results.
Slightly more intelligent is the sphex wasp. The female sphex will dig a burrow, go out and
sting a caterpillar and drag it to the burrow, enter the burrow again to check all is well, drag
the caterpillar inside, and lay its eggs. The caterpillar serves as a food source when the eggs
hatch. So far so good, but if an entomologist moves the caterpillar a few inches away while
the sphex is doing the check, it will revert to the “drag the caterpillar” step of its plan and
will continue the plan without modification, re-checking the burrow, even after dozens of
caterpillar-moving interventions. The sphex is unable to learn that its innate plan is failing,
and thus will not change it.
To the extent that an agent relies on the prior knowledge of its designer rather than on its
own percepts and learning processes, we say that the agent lacks autonomy. A rational
agent
should be autonomous—it should learn what it can to compensate for partial or
incorrect prior knowledge. For example, a vacuum-cleaning agent that learns to predict
where and when additional dirt will appear will do better than one that does not.
Autonomy
As a practical matter, one seldom requires complete autonomy from the start: when the
agent has had little or no experience, it would have to act randomly unless the designer
gave some assistance. Just as evolution provides animals with enough built-in reflexes to
survive long enough to learn for themselves, it would be reasonable to provide an artificial
intelligent agent with some initial knowledge as well as an ability to learn. After sufficient
experience of its environment, the behavior of a rational agent can become effectively
independent of its prior knowledge. Hence, the incorporation of learning allows one to
design a single rational agent that will succeed in a vast variety of environments.
2.3 The Nature of Environments
Now that we have a definition of rationality, we are almost ready to think about building
rational agents. First, however, we must think about task environments, which are
essentially the “problems” to which rational agents are the “solutions.” We begin by showing
how to specify a task environment, illustrating the process with a number of examples. We
then show that task environments come in a variety of flavors. The nature of the task
environment directly affects the appropriate design for the agent program.
Task environment
2.3.1 Specifying the task environment
In our discussion of the rationality of the simple vacuum-cleaner agent, we had to specify
the performance measure, the environment, and the agent’s actuators and sensors. We
group all these under the heading of the task environment. For the acronymically minded,
we call this the PEAS (Performance, Environment, Actuators, Sensors) description. In
designing an agent, the first step must always be to specify the task environment as fully as
possible.
PEAS
The vacuum world was a simple example; let us consider a more complex problem: an
automated taxi driver. Figure 2.4 summarizes the PEAS description for the taxi’s task
environment. We discuss each element in more detail in the following paragraphs.
Figure 2.4
PEAS description of the task environment for an automated taxi driver.
First, what is the performance measure to which we would like our automated driver to
aspire? Desirable qualities include getting to the correct destination; minimizing fuel
consumption and wear and tear; minimizing the trip time or cost; minimizing violations of
traffic laws and disturbances to other drivers; maximizing safety and passenger comfort;
maximizing profits. Obviously, some of these goals conflict, so tradeoffs will be required.
Next, what is the driving environment that the taxi will face? Any taxi driver must deal with
a variety of roads, ranging from rural lanes and urban alleys to 12-lane freeways. The roads
contain other traffic, pedestrians, stray animals, road works, police cars, puddles, and
potholes. The taxi must also interact with potential and actual passengers. There are also
some optional choices. The taxi might need to operate in Southern California, where snow
is seldom a problem, or in Alaska, where it seldom is not. It could always be driving on the
right, or we might want it to be flexible enough to drive on the left when in Britain or Japan.
Obviously, the more restricted the environment, the easier the design problem.
The actuators for an automated taxi include those available to a human driver: control over
the engine through the accelerator and control over steering and braking. In addition, it will
need output to a display screen or voice synthesizer to talk back to the passengers, and
perhaps some way to communicate with other vehicles, politely or otherwise.
The basic sensors for the taxi will include one or more video cameras so that it can see, as
well as lidar and ultrasound sensors to detect distances to other cars and obstacles. To avoid
speeding tickets, the taxi should have a speedometer, and to control the vehicle properly,
especially on curves, it should have an accelerometer. To determine the mechanical state of
the vehicle, it will need the usual array of engine, fuel, and electrical system sensors. Like
many human drivers, it might want to access GPS signals so that it doesn’t get lost. Finally, it
will need touchscreen or voice input for the passenger to request a destination.
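Since a PEAS description is just structured data, it can be written down directly. The sketch below (our own illustrative encoding, with entries paraphrased from the taxi discussion above, not copied from Figure 2.4) records the taxi's task environment:

from dataclasses import dataclass

@dataclass
class PEAS:
    performance_measure: list
    environment: list
    actuators: list
    sensors: list

taxi = PEAS(
    performance_measure=['correct destination', 'minimal fuel and wear',
                         'minimal trip time/cost', 'few traffic violations',
                         'safety', 'passenger comfort', 'profits'],
    environment=['roads', 'other traffic', 'pedestrians', 'stray animals',
                 'road works', 'police cars', 'weather', 'passengers'],
    actuators=['steering', 'accelerator', 'brake', 'display/voice synthesizer',
               'vehicle-to-vehicle communication'],
    sensors=['video cameras', 'lidar', 'ultrasound', 'speedometer',
             'accelerometer', 'engine/fuel/electrical sensors', 'GPS',
             'touchscreen or voice input'],
)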
In Figure 2.5 , we have sketched the basic PEAS elements for a number of additional agent
types. Further examples appear in Exercise 2.PEAS. The examples include physical as well as
virtual environments. Note that virtual task environments can be just as complex as the
“real” world: for example, a software agent (or software robot or softbot) that trades on
auction and reselling Web sites deals with millions of other users and billions of objects,
many with real images.
Figure 2.5
Examples of agent types and their PEAS descriptions.
Software agent
Softbot
2.3.2 Properties of task environments
The range of task environments that might arise in AI is obviously vast. We can, however,
identify a fairly small number of dimensions along which task environments can be
categorized. These dimensions determine, to a large extent, the appropriate agent design
and the applicability of each of the principal families of techniques for agent
implementation. First we list the dimensions, then we analyze several task environments to
illustrate the ideas. The definitions here are informal; later chapters provide more precise
statements and examples of each kind of environment.
Fully observable
Partially observable
FULLY OBSERVABLE VS. PARTIALLY OBSERVABLE: If an agent’s sensors give it access
to the complete state of the environment at each point in time, then we say that the task
environment is fully observable. A task environment is effectively fully observable if the
sensors detect all aspects that are relevant to the choice of action; relevance, in turn,
depends on the performance measure. Fully observable environments are convenient
because the agent need not maintain any internal state to keep track of the world. An
environment might be partially observable because of noisy and inaccurate sensors or
because parts of the state are simply missing from the sensor data—for example, a vacuum
agent with only a local dirt sensor cannot tell whether there is dirt in other squares, and an
automated taxi cannot see what other drivers are thinking. If the agent has no sensors at all
then the environment is unobservable. One might think that in such cases the agent’s plight
is hopeless, but, as we discuss in Chapter 4 , the agent’s goals may still be achievable,
sometimes with certainty.
Unobservable
SINGLE-AGENT VS. MULTIAGENT: The distinction between single-agent and multiagent
environments may seem simple enough. For example, an agent solving a crossword puzzle
by itself is clearly in a single-agent environment, whereas an agent playing chess is in a two-
agent environment. However, there are some subtle issues. First, we have described how an
entity may be viewed as an agent, but we have not explained which entities must be viewed
as agents. Does an agent A (the taxi driver for example) have to treat an object B (another
vehicle) as an agent, or can it be treated merely as an object behaving according to the laws
of physics, analogous to waves at the beach or leaves blowing in the wind? The key
distinction is whether B’s behavior is best described as maximizing a performance measure
whose value depends on agent A’s behavior.
Single-agent
Multiagent
For example, in chess, the opponent entity B is trying to maximize its performance measure,
which, by the rules of chess, minimizes agent A’s performance measure. Thus, chess is a
competitive multiagent environment. On the other hand, in the taxi-driving environment,
avoiding collisions maximizes the performance measure of all agents, so it is a partially
cooperative multiagent environment. It is also partially competitive because, for example,
only one car can occupy a parking space.
Competitive
Cooperative
The agent-design problems in multiagent environments are often quite different from those
in single-agent environments; for example, communication often emerges as a rational
behavior in multiagent environments; in some competitive environments, randomized
behavior is rational because it avoids the pitfalls of predictability.
DETERMINISTIC VS. NONDETERMINISTIC: If the next state of the environment is completely
determined by the current state and the action executed by the agent(s), then we say the
environment is deterministic; otherwise, it is nondeterministic. In principle, an agent need
not worry about uncertainty in a fully observable, deterministic environment. If the
environment is partially observable, however, then it could appear to be nondeterministic.
Deterministic
Nondeterministic
Most real situations are so complex that it is impossible to keep track of all the unobserved
aspects; for practical purposes, they must be treated as nondeterministic. Taxi driving is
clearly nondeterministic in this sense, because one can never predict the behavior of traffic
exactly; moreover, one’s tires may blow out unexpectedly and one’s engine may seize up
without warning. The vacuum world as we described it is deterministic, but variations can
include nondeterministic elements such as randomly appearing dirt and an unreliable
suction mechanism (Exercise 2.VFIN).
One final note: the word stochastic is used by some as a synonym for “nondeterministic,”
but we make a distinction between the two terms; we say that a model of the environment
is stochastic if it explicitly deals with probabilities (e.g., “there’s a 25% chance of rain
tomorrow”) and “nondeterministic” if the possibilities are listed without being quantified
(e.g., “there’s a chance of rain tomorrow”).
Stochastic
EPISODIC VS. SEQUENTIAL: In an episodic task environment, the agent’s experience is
divided into atomic episodes. In each episode the agent receives a percept and then
performs a single action. Crucially, the next episode does not depend on the actions taken in
previous episodes. Many classification tasks are episodic. For example, an agent that has to
spot defective parts on an assembly line bases each decision on the current part, regardless
of previous decisions; moreover, the current decision doesn’t affect whether the next part is
defective. In sequential environments, on the other hand, the current decision could affect
all future decisions. Chess and taxi driving are sequential: in both cases, short-term actions
can have long-term consequences. Episodic environments are much simpler than sequential
environments because the agent does not need to think ahead.
4 The word “sequential” is also used in computer science as the antonym of “parallel.” The two meanings are largely unrelated.
Episodic
Sequential
Static
Dynamic
STATIC VS. DYNAMIC: If the environment can change while an agent is deliberating, then
we say the environment is dynamic for that agent; otherwise, it is static. Static environments
are easy to deal with because the agent need not keep looking at the world while it is
deciding on an action, nor need it worry about the passage of time. Dynamic environments,
on the other hand, are continuously asking the agent what it wants to do; if it hasn’t decided
yet, that counts as deciding to do nothing. If the environment itself does not change with the
passage of time but the agent’s performance score does, then we say the environment is
semidynamic. Taxi driving is clearly dynamic: the other cars and the taxi itself keep moving
while the driving algorithm dithers about what to do next. Chess, when played with a clock,
is semidynamic. Crossword puzzles are static.
Semidynamic
DISCRETE VS. CONTINUOUS: The discrete/continuous distinction applies to the state of
the environment, to the way time is handled, and to the percepts and actions of the agent. For
example, the chess environment has a finite number of distinct states (excluding the clock).
Chess also has a discrete set of percepts and actions. Taxi driving is a continuous-state and
continuous-time problem: the speed and location of the taxi and of the other vehicles sweep
through a range of continuous values and do so smoothly over time. Taxi-driving actions are
also continuous (steering angles, etc.). Input from digital cameras is discrete, strictly
speaking, but is typically treated as representing continuously varying intensities and
locations.
Discrete
Continuous
KNOWN VS. UNKNOWN: Strictly speaking, this distinction refers not to the environment
itself but to the agent’s (or designer’s) state of knowledge about the “laws of physics” of the
environment. In a known environment, the outcomes (or outcome probabilities if the
environment is nondeterministic) for all actions are given. Obviously, if the environment is
unknown, the agent will have to learn how it works in order to make good decisions.
Known
Unknown
The distinction between known and unknown environments is not the same as the one
between fully and partially observable environments. It is quite possible for a known
environment to be partially observable—for example, in solitaire card games, I know the
rules but am still unable to see the cards that have not yet been turned over. Conversely, an
unknown environment can be fully observable—in a new video game, the screen may show
the entire game state but I still don’t know what the buttons do until I try them.
As noted on page 39, the performance measure itself may be unknown, either because the
designer is not sure how to write it down correctly or because the ultimate user—whose
preferences matter—is not known. For example, a taxi driver usually won’t know whether a
new passenger prefers a leisurely or speedy journey, a cautious or aggressive driving style. A
virtual personal assistant starts out knowing nothing about the personal preferences of its
new owner. In such cases, the agent may learn more about the performance measure based
on further interactions with the designer or user. This, in turn, suggests that the task
environment is necessarily viewed as a multiagent environment.
The hardest case is partially observable, multiagent, nondeterministic, sequential, dynamic,
continuous, and unknown. Taxi driving is hard in all these senses, except that the driver’s
environment is mostly known. Driving a rented car in a new country with unfamiliar
geography, different traffic laws, and nervous passengers is a lot more exciting.
Figure 2.6 lists the properties of a number of familiar environments. Note that the
properties are not always cut and dried. For example, we have listed the medical-diagnosis
task as single-agent because the disease process in a patient is not profitably modeled as an
agent; but a medical-diagnosis system might also have to deal with recalcitrant patients and
skeptical staff, so the environment could have a multiagent aspect. Furthermore, medical
diagnosis is episodic if one conceives of the task as selecting a diagnosis given a list of
symptoms; the problem is sequential if the task can include proposing a series of tests,
evaluating progress over the course of treatment, handling multiple patients, and so on.
Figure 2.6
Examples of task environments and their characteristics.
We have not included a “known/unknown” column because, as explained earlier, this is not
strictly a property of the environment. For some environments, such as chess and poker, it is
quite easy to supply the agent with full knowledge of the rules, but it is nonetheless
interesting to consider how an agent might learn to play these games without such
knowledge.
The code repository associated with this book (aima.cs.berkeley.edu) includes multiple
environment implementations, together with a general-purpose environment simulator for
evaluating an agent’s performance. Experiments are often carried out not for a single
environment but for many environments drawn from an environment class. For example, to
evaluate a taxi driver in simulated traffic, we would want to run many simulations with
different traffic, lighting, and weather conditions. We are then interested in the agent’s
average performance over the environment class.
Environment class
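A sketch of that experimental setup, under the assumption that an environment factory and a run method exist (the repository's general-purpose simulator is more elaborate): sample many environments from the class, run the agent in each, and average the scores.

def average_score(agent, make_environment, trials=100):
    """Mean performance of `agent` over environments drawn from a class.
    `make_environment()` is assumed to return a fresh randomized environment
    (e.g., new traffic, lighting, and weather conditions) whose run(agent)
    method returns a performance score."""
    return sum(make_environment().run(agent) for _ in range(trials)) / trials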
2.4 The Structure of Agents
So far we have talked about agents by describing behavior—the action that is performed after
any given sequence of percepts. Now we must bite the bullet and talk about how the insides
work. The job of AI is to design an agent program that implements the agent function—the
mapping from percepts to actions. We assume this program will run on some sort of
computing device with physical sensors and actuators—we call this the agent architecture:
agent = architecture + program.
Agent program
Agent architecture
Obviously, the program we choose has to be one that is appropriate for the architecture. If
the program is going to recommend actions like Walk, the architecture had better have legs.
The architecture might be just an ordinary PC, or it might be a robotic car with several
onboard computers, cameras, and other sensors. In general, the architecture makes the
percepts from the sensors available to the program, runs the program, and feeds the
program’s action choices to the actuators as they are generated. Most of this book is about
designing agent programs, although Chapters 25 and 26 deal directly with the sensors
and actuators.
2.4.1 Agent programs
The agent programs that we design in this book all have the same skeleton: they take the
current percept as input from the sensors and return an action to the actuators. Notice the
difference between the agent program, which takes the current percept as input, and the
agent function, which may depend on the entire percept history. The agent program has no
choice but to take just the current percept as input because nothing more is available from
the environment; if the agent’s actions need to depend on the entire percept sequence, the
agent will have to remember the percepts.
5 There are other choices for the agent program skeleton; for example, we could have the agent programs be coroutines that run
asynchronously with the environment. Each such coroutine has an input and output port and consists of a loop that reads the input
port for percepts and writes actions to the output port.
We describe the agent programs in the simple pseudocode language that is defined in
Appendix B . (The online code repository contains implementations in real programming
languages.) For example, Figure 2.7 shows a rather trivial agent program that keeps track
of the percept sequence and then uses it to index into a table of actions to decide what to
do. The table—an example of which is given for the vacuum world in Figure 2.3 —
represents explicitly the agent function that the agent program embodies. To build a rational
agent in this way, we as designers must construct a table that contains the appropriate
action for every possible percept sequence.
Figure 2.7
The TABLE-DRIVEN-AGENT program is invoked for each new percept and returns an action each time. It
retains the complete percept sequence in memory.
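The pseudocode of Figure 2.7 translates almost line for line into Python. The sketch below is illustrative, not the repository version; it represents the table as a dictionary keyed by percept tuples and shows a small fragment of the vacuum-world table of Figure 2.3.

percepts = []  # the complete percept sequence, retained in memory

def table_driven_agent(percept, table):
    """Append the percept, then use the whole sequence to index the table."""
    percepts.append(percept)
    return table[tuple(percepts)]

# A fragment of the vacuum-world table (single- and two-percept sequences):
table = {
    (('A', 'Clean'),): 'Right',
    (('A', 'Dirty'),): 'Suck',
    (('B', 'Clean'),): 'Left',
    (('B', 'Dirty'),): 'Suck',
    (('A', 'Clean'), ('A', 'Clean')): 'Right',
    (('A', 'Clean'), ('A', 'Dirty')): 'Suck',
}
print(table_driven_agent(('A', 'Clean'), table))   # -> 'Right'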
It is instructive to consider why the table-driven approach to agent construction is doomed
to failure. Let $P$ be the set of possible percepts and let $T$ be the lifetime of the agent
(the total number of percepts it will receive). The lookup table will contain
$\sum_{t=1}^{T} |P|^{t}$ entries. Consider the automated taxi: the visual input from a
single camera (eight cameras is typical) comes in at the rate of roughly 70 megabytes per
second (30 frames per second, $1080 \times 720$ pixels with 24 bits of color information).
This gives a lookup table with over $10^{600{,}000{,}000{,}000}$ entries for an hour’s
driving. Even the lookup table for chess—a tiny, well-behaved fragment of the real
world—has (it turns out) at least $10^{150}$ entries. In comparison, the number of
atoms in the observable universe is less than $10^{80}$. The daunting size of these tables
means that (a) no physical agent in this universe will have the space to store the table; (b) the
designer would not have time to create the table; and (c) no agent could ever learn all the
right table entries from its experience.
Despite all this, TABLE-DRIVEN-AGENT does do what we want, assuming the table is filled in
correctly: it implements the desired agent function.
The key challenge for AI is to find out how to write programs that, to the extent possible, produce
rational behavior from a smallish program rather than from a vast table.
We have many examples showing that this can be done successfully in other areas: for
example, the huge tables of square roots used by engineers and schoolchildren prior to the
1970s have now been replaced by a five-line program for Newton’s method running on
electronic calculators. The question is, can AI do for general intelligent behavior what
Newton did for square roots? We believe the answer is yes.
In the remainder of this section, we outline four basic kinds of agent programs that embody
the principles underlying almost all intelligent systems:
Simple reflex agents;
Model-based reflex agents;
Goal-based agents; and
Utility-based agents.
Each kind of agent program combines particular components in particular ways to generate
actions. Section 2.4.6 explains in general terms how to convert all these agents into
learning agents that can improve the performance of their components so as to generate
better actions. Finally, Section 2.4.7 describes the variety of ways in which the components
themselves can be represented within the agent. This variety provides a major organizing
principle for the field and for the book itself.
2.4.2 Simple reflex agents
The simplest kind of agent is the simple reflex agent. These agents select actions on the
basis of the current percept, ignoring the rest of the percept history. For example, the
vacuum agent whose agent function is tabulated in Figure 2.3 is a simple reflex agent,
because its decision is based only on the current location and on whether that location
contains dirt. An agent program for this agent is shown in Figure 2.8 .
Figure 2.8
The agent program for a simple reflex agent in the two-location vacuum environment. This program
implements the agent function tabulated in Figure 2.3 .
Simple reflex agent
Notice that the vacuum agent program is very small indeed compared to the corresponding
table. The most obvious reduction comes from ignoring the percept history, which cuts
down the number of relevant percept sequences from $4^{T}$ to just 4. A further, small reduction
comes from the fact that when the current square is dirty, the action does not depend on the
location. Although we have written the agent program using if-then-else statements, it is
simple enough that it can also be implemented as a Boolean circuit.
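To illustrate that last point, the agent's choice can be computed by tiny Boolean functions of the percept bits; the bit encoding here is our own, not the book's.

def vacuum_circuit(at_A, is_dirty):
    """The reflex vacuum agent as Boolean logic: two input bits, with each
    output 'wire' a simple gate over them."""
    suck = is_dirty                      # Suck whenever the square is dirty
    right = (not is_dirty) and at_A      # from a clean A, go Right
    left = (not is_dirty) and not at_A   # from a clean B, go Left
    if suck:
        return 'Suck'
    return 'Right' if right else 'Left'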
Simple reflex behaviors occur even in more complex environments. Imagine yourself as the
driver of the automated taxi. If the car in front brakes and its brake lights come on, then you
should notice this and initiate braking. In other words, some processing is done on the
visual input to establish the condition we call “The car in front is braking.” Then, this
triggers some established connection in the agent program to the action “initiate braking.”
We call such a connection a condition–action rule, written as
if car-in-front-is-braking then initiate-braking.
6 Also called situation–action rules, productions, or if–then rules.
condition–action rule
Humans also have many such connections, some of which are learned responses (as for
driving) and some of which are innate reflexes (such as blinking when something
approaches the eye). In the course of the book, we show several different ways in which
such connections can be learned and implemented.
The program in Figure 2.8 is specific to one particular vacuum environment. A more
general and flexible approach is first to build a general-purpose interpreter for condition–
action rules and then to create rule sets for specific task environments. Figure 2.9 gives the
structure of this general program in schematic form, showing how the condition–action
rules allow the agent to make the connection from percept to action. Do not worry if this
seems trivial; it gets more interesting shortly.
Figure 2.9
Schematic diagram of a simple reflex agent. We use rectangles to denote the current internal state of the
agent’s decision process, and ovals to represent the background information used in the process.
An agent program for Figure 2.9 is shown in Figure 2.10 . The INTERPRET-INPUT function
generates an abstracted description of the current state from the percept, and the RULE-
MATCH function returns the first rule in the set of rules that matches the given state
description. Note that the description in terms of “rules” and “matching” is purely
conceptual; as noted above, actual implementations can be as simple as a collection of logic
gates implementing a Boolean circuit. Alternatively, a “neural” circuit can be used, where
the logic gates are replaced by the nonlinear units of artificial neural networks (see Chapter
21 ).
Figure 2.10
A simple reflex agent. It acts according to a rule whose condition matches the current state, as defined by
the percept.
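A hedged Python rendering of Figures 2.9 and 2.10 follows; the rule representation and helper names are our assumptions, not the repository's API.

from collections import namedtuple

Rule = namedtuple('Rule', ['condition', 'action'])  # a condition–action rule

def rule_match(state, rules):
    """Return the first rule whose condition holds in the state description."""
    return next(rule for rule in rules if rule.condition(state))

def simple_reflex_agent(percept, rules, interpret_input):
    state = interpret_input(percept)   # abstracted description of current state
    return rule_match(state, rules).action

# Example: the braking rule from the text, plus a default rule.
rules = [Rule(lambda s: s['car_in_front_is_braking'], 'initiate-braking'),
         Rule(lambda s: True, 'drive-on')]
percept_to_state = lambda p: {'car_in_front_is_braking': p.get('brake_lights', False)}
print(simple_reflex_agent({'brake_lights': True}, rules, percept_to_state))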
Simple reflex agents have the admirable property of being simple, but they are of limited
intelligence. The agent in Figure 2.10 will work only if the correct decision can be made on the
basis of just the current percept—that is, only if the environment is fully observable.
Even a little bit of unobservability can cause serious trouble. For example, the braking rule
given earlier assumes that the condition car-in-front-is-braking can be determined from the
current percept—a single frame of video. This works if the car in front has a centrally
mounted (and hence uniquely identifiable) brake light. Unfortunately, older models have
different configurations of taillights, brake lights, and turn-signal lights, and it is not always
possible to tell from a single image whether the car is braking or simply has its taillights on.
A simple reflex agent driving behind such a car would either brake continuously and
unnecessarily, or, worse, never brake at all.
We can see a similar problem arising in the vacuum world. Suppose that a simple reflex
vacuum agent is deprived of its location sensor and has only a dirt sensor. Such an agent has
just two possible percepts: [Dirty] and [Clean]. It can Suck in response to [Dirty]; what
should it do in response to [Clean]? Moving Left fails (forever) if it happens to start in
square A, and moving Right fails (forever) if it happens to start in square B. Infinite loops
are often unavoidable for simple reflex agents operating in partially observable
environments.
Escape from infinite loops is possible if the agent can randomize its actions. For example, if
the vacuum agent perceives [Clean], it might flip a coin to choose between Right and Left.
It is easy to show that the agent will reach the other square in an average of two steps.
Then, if that square is dirty, the agent will clean it and the task will be complete. Hence, a
randomized simple reflex agent might outperform a deterministic simple reflex agent.
Randomization
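A sketch of that coin flip (illustrative names): each flip reaches the other square with probability 1/2, so the number of moves needed is geometric with mean two, matching the claim above.

import random

def randomized_reflex_vacuum_agent(status):
    """Location-blind reflex agent: suck when dirty, otherwise move at random."""
    if status == 'Dirty':
        return 'Suck'
    return random.choice(['Right', 'Left'])   # coin flip escapes the infinite loop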
We mentioned in Section 2.3 that randomized behavior of the right kind can be rational in
some multiagent environments. In single-agent environments, randomization is usually not
rational. It is a useful trick that helps a simple reflex agent in some situations, but in most
cases we can do much better with more sophisticated deterministic agents.
2.4.3 Model-based reflex agents
The most effective way to handle partial observability is for the agent to keep track of the part
of the world it can’t see now. That is, the agent should maintain some sort of internal state that
depends on the percept history and thereby reflects at least some of the unobserved aspects
of the current state. For the braking problem, the internal state is not too extensive—just the
previous frame from the camera, allowing the agent to detect when two red lights at the
edge of the vehicle go on or off simultaneously. For other driving tasks such as changing
lanes, the agent needs to keep track of where the other cars are if it can’t see them all at
once. And for any driving to be possible at all, the agent needs to keep track of where its
keys are.
Internal state
Updating this internal state information as time goes by requires two kinds of knowledge to
be encoded in the agent program in some form. First, we need some information about how
the world changes over time, which can be divided roughly into two parts: the effects of the
agent’s actions and how the world evolves independently of the agent. For example, when
the agent turns the steering wheel clockwise, the car turns to the right, and when it’s raining
the car’s cameras can get wet. This knowledge about “how the world works”—whether
implemented in simple Boolean circuits or in complete scientific theories—is called a
transition model of the world.
Transition model
Second, we need some information about how the state of the world is reflected in the
agent’s percepts. For example, when the car in front initiates braking, one or more
illuminated red regions appear in the forward-facing camera image, and, when the camera
gets wet, droplet-shaped objects appear in the image partially obscuring the road. This kind
of knowledge is called a sensor model.
Sensor model
Together, the transition model and sensor model allow an agent to keep track of the state of
the world—to the extent possible given the limitations of the agent’s sensors. An agent that
uses such models is called a model-based agent.
Model-based agent
Figure 2.11 gives the structure of the model-based reflex agent with internal state,
showing how the current percept is combined with the old internal state to generate the
updated description of the current state, based on the agent’s model of how the world
works. The agent program is shown in Figure 2.12. The interesting part is the function
UPDATE-STATE, which is responsible for creating the new internal state description. The
details of how models and states are represented vary widely depending on the type of
environment and the particular technology used in the agent design.
Figure 2.11
A model-based reflex agent.
Figure 2.12
A model-based reflex agent. It keeps track of the current state of the world, using an internal model. It
then chooses an action in the same way as the reflex agent.
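A minimal Python counterpart to Figure 2.12 might look like the sketch below; the representation of states, models, and rules (here, plain callables and (condition, action) pairs) is an illustrative assumption, since, as the text notes, the real details vary widely.

class ModelBasedReflexAgent:
    """Keeps an internal best-guess state, updated from the last action and
    the new percept, then acts like a reflex agent on that state."""

    def __init__(self, transition_model, sensor_model, rules, initial_state):
        self.transition_model = transition_model  # effects of actions, world evolution
        self.sensor_model = sensor_model          # how percepts reflect world state
        self.rules = rules                        # (condition, action) pairs
        self.state = initial_state
        self.action = None                        # most recent action, initially none

    def __call__(self, percept):
        # UPDATE-STATE: predict with the transition model, correct with the sensor model.
        predicted = self.transition_model(self.state, self.action)
        self.state = self.sensor_model(predicted, percept)
        for condition, action in self.rules:
            if condition(self.state):
                self.action = action
                return action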
Regardless of the kind of representation used, it is seldom possible for the agent to
determine the current state of a partially observable environment exactly. Instead, the box
labeled “what the world is like now” (Figure 2.11 ) represents the agent’s “best guess” (or
sometimes best guesses, if the agent entertains multiple possibilities). For example, an
automated taxi may not be able to see around the large truck that has stopped in front of it
and can only guess about what may be causing the hold-up. Thus, uncertainty about the
current state may be unavoidable, but the agent still has to make a decision.
2.4.4 Goal-based agents
Knowing something about the current state of the environment is not always enough to
decide what to do. For example, at a road junction, the taxi can turn left, turn right, or go
straight on. The correct decision depends on where the taxi is trying to get to. In other
words, as well as a current state description, the agent needs some sort of goal information
that describes situations that are desirable—for example, being at a particular destination.
The agent program can combine this with the model (the same information as was used in
the model-based reflex agent) to choose actions that achieve the goal. Figure 2.13 shows
the goal-based agent’s structure.
Figure 2.13
A model-based, goal-based agent. It keeps track of the world state as well as a set of goals it is trying to
achieve, and chooses an action that will (eventually) lead to the achievement of its goals.
Goal
Sometimes goal-based action selection is straightforward—for example, when goal
satisfaction results immediately from a single action. Sometimes it will be more tricky—for
example, when the agent has to consider long sequences of twists and turns in order to find
a way to achieve the goal. Search (Chapters 3 to 5 ) and planning (Chapter 11 ) are the
subfields of AI devoted to finding action sequences that achieve the agent’s goals.
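For the straightforward single-action case, goal-based selection reduces to a one-step lookahead over the model; a sketch with illustrative names:

def goal_based_agent(state, actions, transition_model, goal_test):
    """Pick any action whose predicted outcome satisfies the goal.
    When no single action suffices, the search and planning methods of
    Chapters 3-5 and 11 are needed to find multi-step sequences."""
    for action in actions:
        if goal_test(transition_model(state, action)):
            return action
    return None   # no single action achieves the goal; search is required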
Notice that decision making of this kind is fundamentally different from the condition–
action rules described earlier, in that it involves consideration of the future—both “What will
happen if I do such-and-such?” and “Will that make me happy?” In the reflex agent designs,
this information is not explicitly represented, because the built-in rules map directly from
percepts to actions. The reflex agent brakes when it sees brake lights, period. It has no idea
why. A goal-based agent brakes when it sees brake lights because that’s the only action that
it predicts will achieve its goal of not hitting other cars.
Although the goal-based agent appears less efficient, it is more flexible because the
knowledge that supports its decisions is represented explicitly and can be modified. For
example, a goal-based agent’s behavior can easily be changed to go to a different
destination, simply by specifying that destination as the goal. The reflex agent’s rules for
when to turn and when to go straight will work only for a single destination; they must all
be replaced to go somewhere new.
2.4.5 Utility-based agents
Goals alone are not enough to generate high-quality behavior in most environments. For
example, many action sequences will get the taxi to its destination (thereby achieving the
goal), but some are quicker, safer, more reliable, or cheaper than others. Goals just provide
a crude binary distinction between “happy” and “unhappy” states. A more general
performance measure should allow a comparison of different world states according to
exactly how happy they would make the agent. Because “happy” does not sound very
scientific, economists and computer scientists use the term utility instead.
7 The word “utility” here refers to “the quality of being useful,” not to the electric company or waterworks.
Utility
We have already seen that a performance measure assigns a score to any given sequence of
environment states, so it can easily distinguish between more and less desirable ways of
getting to the taxi’s destination. An agent’s utility function is essentially an internalization
of the performance measure. Provided that the internal utility function and the external
performance measure are in agreement, an agent that chooses actions to maximize its utility
will be rational according to the external performance measure.
Utility function
Let us emphasize again that this is not the only way to be rational—we have already seen a
rational agent program for the vacuum world (Figure 2.8 ) that has no idea what its utility
function is—but, like goal-based agents, a utility-based agent has many advantages in terms
of flexibility and learning. Furthermore, in two kinds of cases, goals are inadequate but a
utility-based agent can still make rational decisions. First, when there are conflicting goals,
only some of which can be achieved (for example, speed and safety), the utility function
specifies the appropriate tradeoff. Second, when there are several goals that the agent can
aim for, none of which can be achieved with certainty, utility provides a way in which the
likelihood of success can be weighed against the importance of the goals.
Partial observability and nondeterminism are ubiquitous in the real world, and so, therefore,
is decision making under uncertainty. Technically speaking, a rational utility-based agent
chooses the action that maximizes the expected utility of the action outcomes—that is, the
utility the agent expects to derive, on average, given the probabilities and utilities of each
outcome. (Appendix A defines expectation more precisely.) In Chapter 16 , we show that
any rational agent must behave as if it possesses a utility function whose expected value it
tries to maximize. An agent that possesses an explicit utility function can make rational
decisions with a general-purpose algorithm that does not depend on the specific utility
function being maximized. In this way, the “global” definition of rationality—designating as
rational those agent functions that have the highest performance—is turned into a “local”
constraint on rational-agent designs that can be expressed in a simple program.
Expected utility
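In code, maximizing expected utility is compact once the agent's model supplies outcome probabilities; a sketch under that assumption (all names are our own):

def expected_utility(state, action, outcomes, utility):
    """Average utility over possible outcome states, weighted by probability.
    `outcomes(state, action)` is assumed to yield (probability, result) pairs."""
    return sum(p * utility(s) for p, s in outcomes(state, action))

def rational_action(state, actions, outcomes, utility):
    """Choose the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(state, a, outcomes, utility))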
The utility-based agent structure appears in Figure 2.14 . Utility-based agent programs
appear in Chapters 16 and 17 , where we design decision-making agents that must
handle the uncertainty inherent in nondeterministic or partially observable environments.
Decision making in multiagent environments is also studied in the framework of utility
theory, as explained in Chapter 18 .
Figure 2.14
A model-based, utility-based agent. It uses a model of the world, along with a utility function that
measures its preferences among states of the world. Then it chooses the action that leads to the best
expected utility, where expected utility is computed by averaging over all possible outcome states,
weighted by the probability of the outcome.
At this point, the reader may be wondering, “Is it that simple? We just build agents that
maximize expected utility, and we’re done?” It’s true that such agents would be intelligent,
but it’s not simple. A utility-based agent has to model and keep track of its environment,
tasks that have involved a great deal of research on perception, representation, reasoning,
and learning. The results of this research fill many of the chapters of this book. Choosing the
utility-maximizing course of action is also a difficult task, requiring ingenious algorithms
that fill several more chapters. Even with these algorithms, perfect rationality is usually
unachievable in practice because of computational complexity, as we noted in Chapter 1 .
We also note that not all utility-based agents are model-based; we will see in Chapters 22
and 26 that a model-free agent can learn what action is best in a particular situation
without ever learning exactly how that action changes the environment.
Model-free agent
Finally, all of this assumes that the designer can specify the utility function correctly;
Chapters 17 , 18 , and 22 consider the issue of unknown utility functions in more depth.
2.4.6 Learning agents
We have described agent programs with various methods for selecting actions. We have not,
so far, explained how the agent programs come into being. In his famous early paper, Turing
(1950) considers the idea of actually programming his intelligent machines by hand. He
estimates how much work this might take and concludes, “Some more expeditious method
seems desirable.” The method he proposes is to build learning machines and then to teach
them. In many areas of AI, this is now the preferred method for creating state-of-the-art
systems. Any type of agent (model-based, goal-based, utility-based, etc.) can be built as a
learning agent (or not).
Learning has another advantage, as we noted earlier: it allows the agent to operate in
initially unknown environments and to become more competent than its initial knowledge
alone might allow. In this section, we briefly introduce the main ideas of learning agents.
Throughout the book, we comment on opportunities and methods for learning in particular
kinds of agents. Chapters 19 –22 go into much more depth on the learning algorithms
themselves.
Learning element
Performance element
A learning agent can be divided into four conceptual components, as shown in Figure
2.15 . The most important distinction is between the learning element, which is
responsible for making improvements, and the performance element, which is responsible
for selecting external actions. The performance element is what we have previously
considered to be the entire agent: it takes in percepts and decides on actions. The learning
element uses feedback from the critic on how the agent is doing and determines how the
performance element should be modified to do better in the future.
Figure 2.15
A general learning agent. The “performance element” box represents what we have previously considered
to be the whole agent program. Now, the “learning element” box gets to modify that program to improve
its performance.
Critic
The design of the learning element depends very much on the design of the performance
element. When trying to design an agent that learns a certain capability, the first question is
not “How am I going to get it to learn this?” but “What kind of performance element will my
agent use to do this once it has learned how?” Given a design for the performance element,
learning mechanisms can be constructed to improve every part of the agent.
The critic tells the learning element how well the agent is doing with respect to a fixed
performance standard. The critic is necessary because the percepts themselves provide no
indication of the agent’s success. For example, a chess program could receive a percept
indicating that it has checkmated its opponent, but it needs a performance standard to know
that this is a good thing; the percept itself does not say so. It is important that the
performance standard be fixed. Conceptually, one should think of it as being outside the
agent altogether because the agent must not modify it to fit its own behavior.
The last component of the learning agent is the problem generator. It is responsible for
suggesting actions that will lead to new and informative experiences. If the performance
element had its way, it would keep doing the actions that are best, given what it knows, but
if the agent is willing to explore a little and do some perhaps suboptimal actions in the short
run, it might discover much better actions for the long run. The problem generator’s job is
to suggest these exploratory actions. This is what scientists do when they carry out
experiments. Galileo did not think that dropping rocks from the top of a tower in Pisa was
valuable in itself. He was not trying to break the rocks or to modify the brains of unfortunate
pedestrians. His aim was to modify his own brain by identifying a better theory of the
motion of objects.
Problem generator
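To make the division concrete, the following Python sketch shows how the four components might fit together in an agent program. The class and method names here are our own illustrative assumptions, not a fixed interface, and no particular learning algorithm is implied:

class LearningAgent:
    # A schematic four-component learning agent in the spirit of Figure 2.15.
    def __init__(self, performance_element, learning_element, critic, problem_generator):
        self.performance_element = performance_element  # selects external actions
        self.learning_element = learning_element        # improves the performance element
        self.critic = critic                            # judges percepts against a fixed standard
        self.problem_generator = problem_generator      # proposes exploratory actions

    def step(self, percept):
        feedback = self.critic.evaluate(percept)
        self.learning_element.update(self.performance_element, feedback)
        exploratory = self.problem_generator.suggest(percept)
        if exploratory is not None:                     # occasionally explore
            return exploratory
        return self.performance_element.choose_action(percept)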
The learning element can make changes to any of the “knowledge” components shown in
the agent diagrams (Figures 2.9 , 2.11 , 2.13 , and 2.14 ). The simplest cases involve
learning directly from the percept sequence. Observation of pairs of successive states of the
environment can allow the agent to learn “What my actions do” and “How the world
evolves” in response to its actions. For example, if the automated taxi exerts a certain
braking pressure when driving on a wet road, then it will soon find out how much
deceleration is actually achieved, and whether it skids off the road. The problem generator
might identify certain parts of the model that are in need of improvement and suggest
experiments, such as trying out the brakes on different road surfaces under different
conditions.
Improving the model components of a model-based agent so that they conform better with
reality is almost always a good idea, regardless of
the external performance standard. (In
some cases, it is better from a computational point of view to have a simple but slightly
inaccurate model rather than a perfect but fiendishly complex model.) Information from the
external standard is needed when trying to learn a reflex component or a utility function.
For example, suppose the taxi-driving agent receives no tips from passengers who have
been thoroughly shaken up during the trip. The external performance standard must inform
the agent that the loss of tips is a negative contribution to its overall performance; then the
agent might be able to learn that violent maneuvers do not contribute to its own utility. In a
sense, the performance standard distinguishes part of the incoming percept as a reward (or
penalty) that provides direct feedback on the quality of the agent’s behavior. Hard-wired
performance standards such as pain and hunger in animals can be understood in this way.
Reward
Penalty
More generally, human choices can provide information about human preferences. For
example, suppose the taxi does not know that people generally don’t like loud noises, and
settles on the idea of blowing its horn continuously as a way of ensuring that pedestrians
know it’s coming. The consequent human behavior—covering ears, using bad language, and
possibly cutting the wires to the horn—would provide evidence to the agent with which to
update its utility function. This issue is discussed further in Chapter 22 .
In summary, agents have a variety of components, and those components can be
represented in many ways within the agent program, so there appears to be great variety
among learning methods. There is, however, a single unifying theme. Learning in intelligent
agents can be summarized as a process of modification of each component of the agent to
bring the components into closer agreement with the available feedback information,
thereby improving the overall performance of the agent.
2.4.7 How the components of agent programs work
We have described agent programs (in very high-level terms) as consisting of various
components, whose function it is to answer questions such as: “What is the world like now?”
“What action should I do now?” “What do my actions do?” The next question for a student of
AI is, “How on Earth do these components work?” It takes about a thousand pages to begin
to answer that question properly, but here we want to draw the reader’s attention to some
basic distinctions among the various ways that the components can represent the
environment that the agent inhabits.
Roughly speaking, we can place the representations along an axis of increasing complexity
and expressive power—atomic, factored, and structured. To illustrate these ideas, it helps to
consider a particular agent component, such as the one that deals with “What my actions
do.” This component describes the changes that might occur in the environment as the
result of taking an action, and Figure 2.16 provides schematic depictions of how those
transitions might be represented.
Figure 2.16
Three ways to represent states and the transitions between them. (a) Atomic representation: a state (such
as B or C) is a black box with no internal structure; (b) Factored representation: a state consists of a
vector of attribute values; values can be Boolean, real-valued, or one of a fixed set of symbols. (c)
Structured representation: a state includes objects, each of which may have attributes of its own as well
as relationships to other objects.
In an atomic representation each state of the world is indivisible—it has no internal
structure. Consider the task of finding a driving route from one end of a country to the other
via some sequence of cities (we address this problem in Figure 3.1 on page 64). For the
purposes of solving this problem, it may suffice to reduce the state of the world to just the
name of the city we are in—a single atom of knowledge, a “black box” whose only
discernible property is that of being identical to or different from another black box. The
standard algorithms underlying search and game-playing (Chapters 3 –5 ), hidden
Markov models (Chapter 14 ), and Markov decision processes (Chapter 17 ) all work with
atomic representations.
Atomic representation
A factored representation splits up each state into a fixed set of variables or attributes,
each of which can have a value. Consider a higher-fidelity description for the same driving
problem, where we need to be concerned with more than just atomic location in one city or
another; we might need to pay attention to how much gas is in the tank, our current GPS
coordinates, whether or not the oil warning light is working, how much money we have for
tolls, what station is on the radio, and so on. While two different atomic states have nothing
in common—they are just different black boxes—two different factored states can share
some attributes (such as being at some particular GPS location) and not others (such as
having lots of gas or having no gas); this makes it much easier to work out how to turn one
state into another. Many important areas of AI are based on factored representations,
including constraint satisfaction algorithms (Chapter 6 ), propositional logic (Chapter 7 ),
planning (Chapter 11 ), Bayesian networks (Chapters 12 –16 ), and various machine
learning algorithms.
Factored representation
Variable
Attribute
Value
For many purposes, we need to understand the world as having things in it that are related to
each other, not just variables with values. For example, we might notice that a large truck
ahead of us is reversing into the driveway of a dairy farm, but a loose cow is blocking the
truck’s path. A factored representation is unlikely to be pre-equipped with the attribute
TruckAheadBackingIntoDairyFarmDrivewayBlockedByLooseCow with value true or false.
Instead, we would need a structured representation, in which objects such as cows and
trucks and their various and varying relationships can be described explicitly (see Figure
2.16(c) ). Structured representations underlie relational databases and first-order logic
(Chapters 8 , 9 , and 10 ), first-order probability models (Chapter 15 ), and much of
natural language understanding (Chapters 23 and 24). In fact, much of what humans
express in natural language concerns objects and their relationships.
Structured representation
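To make the three schemes concrete, here is one way a driving-domain state might be encoded under each of them in Python; the particular attribute and relation names are invented for illustration:

# Atomic: the state is an opaque label; all we can test is equality.
atomic_state = "Arad"

# Factored: a fixed set of attribute-value pairs; two states can share
# some attribute values and differ on others.
factored_state = {"city": "Arad", "fuel": 0.7, "oil_light": False, "toll_money": 12.50}

# Structured: explicit objects plus relationships among them.
structured_state = {
    "objects": {"truck1": "Truck", "cow1": "Cow", "driveway1": "Driveway"},
    "relations": [("BackingInto", "truck1", "driveway1"),
                  ("Blocking", "cow1", "truck1")],
}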
As we mentioned earlier, the axis along which atomic, factored, and structured
representations lie is the axis of increasing expressiveness. Roughly speaking, a more
expressive representation can capture, at least as concisely, everything a less expressive one
can capture, plus some more. Often, the more expressive language is much more concise; for
example, the rules of chess can be written in a page or two of a structured-representation
language such as first-order logic but require thousands of pages when written in a factored-
representation language such as propositional logic and around 10^38 pages when written in
an atomic language such as that of finite-state automata. On the other hand, reasoning and
learning become more complex as the expressive power of the representation increases. To
gain the benefits of expressive representations while avoiding their drawbacks, intelligent
systems for the real world may need to operate at all points along the axis simultaneously.
Expressiveness
Localist representation
Another axis for representation involves the mapping of concepts to locations in physical
memory, whether in a computer or in a brain. If there is a one-to-one mapping between
concepts and memory locations, we call that a localist representation. On the other hand, if
the representation of a concept is spread over many memory locations, and each memory
location is employed as part of the representation of multiple different concepts, we call that
a distributed representation. Distributed representations are more robust against noise and
information loss. With a localist representation, the mapping from concept to memory
location is arbitrary, and if a transmission error garbles a few bits, we might confuse Truck
with the unrelated concept Truce. But with a distributed representation, you can think of
each concept representing a point in multidimensional space, and if you garble a few bits
you move to a nearby point in that space, which will have similar meaning.
Distributed representation
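A toy numerical sketch of the robustness argument (the vectors are invented for illustration):

import numpy as np

# Localist: one slot per concept; garbling the index can turn Truck into
# the unrelated concept Truce.
vocab = ["Truck", "Truce", "Cow"]
truck_localist = np.array([1, 0, 0])    # slot 0 means Truck, nothing more

# Distributed: each concept is a point in a shared space, so a small
# perturbation lands near points with similar meaning.
truck = np.array([0.91, 0.10, 0.83])
lorry = np.array([0.89, 0.12, 0.80])    # deliberately near "truck"
truce = np.array([-0.70, 0.65, 0.05])   # deliberately far away
noisy = truck + 0.02 * np.random.randn(3)
# noisy stays far closer to truck and lorry than to truce.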
Summary
This chapter has been something of a whirlwind tour of AI, which we have conceived of as
the science of agent design. The major points to recall are as follows:
An agent is something that perceives and acts in an environment. The agent function
for an agent specifies the action taken by the agent in response to any percept sequence.
The performance measure evaluates the behavior of the agent in an environment. A
rational agent acts so as to maximize the expected value of the performance measure,
given the percept sequence it has seen so far.
A task environment specification includes the performance measure, the external
environment, the actuators, and the sensors. In designing an agent, the first step must
always be to specify the task environment as fully as possible.
Task environments vary along several significant dimensions. They can be fully or
partially observable, single-agent or multiagent, deterministic or nondeterministic,
episodic or sequential, static or dynamic, discrete or continuous, and known or
unknown.
In cases where the performance measure is unknown or hard to specify correctly, there
is a significant risk of the agent optimizing the wrong objective. In such cases the agent
design should reflect uncertainty about the true objective.
The agent program implements the agent function. There exists a variety of basic agent
program designs reflecting the kind of information made explicit and used in the
decision process. The designs vary in efficiency, compactness, and flexibility. The
appropriate design of the agent program depends on the nature of the environment.
Simple reflex agents respond directly to percepts, whereas model-based reflex agents
maintain internal state to track aspects of the world that are not evident in the current
percept. Goal-based agents act to achieve their goals, and utility-based agents try to
maximize their own expected “happiness.”
All agents can improve their performance through learning.
Bibliographical and Historical Notes
The central role of action in intelligence—the notion of practical reasoning—goes back at
least as far as Aristotle’s Nicomachean Ethics. Practical reasoning was also the subject of
McCarthy’s influential paper “Programs with Common Sense” (1958). The fields of robotics
and control theory are, by their very nature, concerned principally with physical agents. The
concept of a controller in control theory is identical to that of an agent in AI. Perhaps
surprisingly, AI has concentrated for most of its history on isolated components of agents—
question-answering systems, theorem-provers, vision systems, and so on—rather than on
whole agents. The discussion of agents in the text by Genesereth and Nilsson (1987) was an
influential exception. The whole-agent view is now widely accepted and is a central theme
in recent texts (Padgham and Winikoff, 2004; Jones, 2007; Poole and Mackworth, 2017).
Controller
Chapter 1 traced the roots of the concept of rationality in philosophy and economics. In
AI, the concept was of peripheral interest until the mid-1980s, when it began to suffuse
many discussions about the proper technical foundations of the field. A paper by Jon Doyle
(1983) predicted that rational agent design would come to be seen as the core mission of AI,
while other popular topics would spin off to form new disciplines.
Careful attention to the properties of the environment and their consequences for rational
agent design is most apparent in the control theory tradition—for example, classical control
systems (Dorf and Bishop, 2004; Kirk, 2004) handle fully observable, deterministic
environments; stochastic optimal control (Kumar and Varaiya, 1986; Bertsekas and Shreve,
2007) handles partially observable, stochastic environments; and hybrid control (Henzinger
and Sastry, 1998; Cassandras and Lygeros, 2006) deals with environments containing both
discrete and continuous elements. The distinction between fully and partially observable
environments is also central in the dynamic programming literature developed in the field
of operations research (Puterman, 1994), which we discuss in Chapter 17 .
Although simple reflex agents were central to behaviorist psychology (see Chapter 1 ),
most AI researchers view them as too simple to provide much leverage. (Rosenschein (1985)
and Brooks (1986) questioned this assumption; see Chapter 26 .) A great deal of work has
gone into finding efficient algorithms for keeping track of complex environments (Bar-
Shalom et al., 2001; Choset et al., 2005; Simon, 2006), most of it in the probabilistic setting.
Goal-based agents are presupposed in everything from Aristotle’s view of practical
reasoning to McCarthy’s early papers on logical AI. Shakey the Robot (Fikes and Nilsson,
1971; Nilsson, 1984) was the first robotic embodiment of a logical, goal-based agent. A full
logical analysis of goal-based agents appeared in Genesereth and Nilsson (1987), and a
goal-based programming methodology called agent-oriented programming was developed
by Shoham (1993). The agent-based approach is now extremely popular in software
engineering (Ciancarini and Wooldridge, 2001). It has also infiltrated the area of operating
systems, where autonomic computing refers to computer systems and networks that
monitor and control themselves with a perceive–act loop and machine learning methods
(Kephart and Chess, 2003). Noting that a collection of agent programs designed to work
well together in a true multiagent environment necessarily exhibits modularity—the
programs share no internal state and communicate with each other only through the
environment—it is common within the field of multiagent systems to design the agent
program of a single agent as a collection of autonomous sub-agents. In some cases, one can
even prove that the resulting system gives the same optimal solutions as a monolithic
design.
Autonomic computing
The goal-based view of agents also dominates the cognitive psychology tradition in the area
of problem solving, beginning with the enormously influential Human Problem Solving
(Newell and Simon, 1972) and running through all of Newell’s later work (Newell, 1990).
Goals, further analyzed as desires (general) and intentions (currently pursued), are central to
the influential theory of agents developed by Michael Bratman (1987).
As noted in Chapter 1 , the development of utility theory as a basis for rational behavior
goes back hundreds of years. In AI, early research eschewed utilities in favor of goals, with
some exceptions (Feldman and Sproull, 1977). The resurgence of interest in probabilistic
methods in the 1980s led to the acceptance of maximization of expected utility as the most
general framework for decision making (Horvitz et al., 1988). The text by Pearl (1988) was
the first in AI to cover probability and utility theory in depth; its exposition of practical
methods for reasoning and decision making under uncertainty was probably the single
biggest factor in the rapid shift towards utility-based agents in the 1990s (see Chapter 16 ).
The formalization of reinforcement learning within a decision-theoretic framework also
contributed to this shift (Sutton, 1988). Somewhat remarkably, almost all AI research until
very recently has assumed that the performance measure can be exactly and correctly
specified in the form of a utility function or reward function (Hadfield-Menell et al., 2017a;
Russell, 2019).
The general design for learning agents portrayed in Figure 2.15 is classic in the machine
learning literature (Buchanan et al., 1978; Mitchell, 1997). Examples of the design, as
embodied in programs, go back at least as far as Arthur Samuel’s (1959, 1967) learning
program for playing checkers. Learning agents are discussed in depth in Chapters 19 –22 .
Some early papers on agent-based approaches are collected by Huhns and Singh (1998) and
Wooldridge and Rao (1999). Texts on multiagent systems provide a good introduction to
many aspects of agent design (Weiss, 2000a; Wooldridge, 2009). Several conference series
devoted to agents began in the 1990s, including the International Workshop on Agent
Theories, Architectures, and Languages (ATAL), the International Conference on
Autonomous Agents (AGENTS), and the International Conference on Multi-Agent Systems
(ICMAS). In 2002, these three merged to form the International Joint Conference on
Autonomous Agents and Multi-Agent Systems (AAMAS). From 2000 to 2012 there were
annual workshops on Agent-Oriented Software Engineering (AOSE). The journal
Autonomous Agents and Multi-Agent
Systems was founded in 1998. Finally, Dung Beetle Ecology
(Hanski and Cambefort, 1991) provides a wealth of interesting information on the behavior
of dung beetles. YouTube has inspiring video recordings of their activities.
II Problem-solving
Chapter 3
Solving Problems by Searching
In which we see how an agent can look ahead to find a sequence of actions that will eventually
achieve its goal.
When the correct action to take is not immediately obvious, an agent may need to plan
ahead: to consider a sequence of actions that form a path to a goal state. Such an agent is
called a problem-solving agent, and the computational process it undertakes is called
search.
Problem-solving agent
Search
Problem-solving agents use atomic representations, as described in Section 2.4.7 —that is,
states of the world are considered as wholes, with no internal structure visible to the
problem-solving algorithms. Agents that use factored or structured representations of states
are called planning agents and are discussed in Chapters 7 and 11 .
We will cover several search algorithms. In this chapter, we consider only the simplest
environments: episodic, single agent, fully observable, deterministic, static, discrete, and
known. We distinguish between informed algorithms, in which the agent can estimate how
far it is from the goal, and uninformed algorithms, where no such estimate is available.
Chapter 4 relaxes the constraints on environments, and Chapter 5 considers multiple
agents.
This chapter uses the concepts of asymptotic complexity (that is, O(n) notation). Readers
unfamiliar with these concepts should consult Appendix A.
3.1 Problem-Solving Agents
Imagine an agent enjoying a touring vacation in Romania. The agent wants to take in the
sights, improve its Romanian, enjoy the nightlife, avoid hangovers, and so on. The decision
problem is a complex one. Now, suppose the agent is currently in the city of Arad and has a
nonrefundable ticket to fly out of Bucharest the following day. The agent observes street
signs and sees that there are three roads leading out of Arad: one toward Sibiu, one to
Timisoara, and one to Zerind. None of these are the goal, so unless the agent is familiar with
the geography of Romania, it will not know which road to follow.
1 We are assuming that most readers are in the same position and can easily imagine themselves to be as clueless as our agent. We
apologize to Romanian readers who are unable to take advantage of this pedagogical device.
If the agent has no additional information—that is, if the environment is unknown—then the
agent can do no better than to execute one of the actions at random. This sad situation is
discussed in Chapter 4 . In this chapter, we will assume our agents always have access to
information about the world, such as the map in Figure 3.1 . With that information, the
agent can follow this four-phase problem-solving process:
GOAL FORMULATION: The agent adopts the goal of reaching Bucharest. Goals
organize behavior by limiting the objectives and hence the actions to be considered.
Goal formulation
PROBLEM FORMULATION: The agent devises a description of the states and actions
necessary to reach the goal—an abstract model of the relevant part of the world. For our
agent, one good model is to consider the actions of traveling from one city to an
adjacent city, and therefore the only fact about the state of the world that will change
due to an action is the current city.
Problem formulation
SEARCH: Before taking any action in the real world, the agent simulates sequences of
actions in its model, searching until it finds a sequence of actions that reaches the goal.
Such a sequence is called a solution. The agent might have to simulate multiple
sequences that do not reach the goal, but eventually it will find a solution (such as going
from Arad to Sibiu to fa*garas to Bucharest), or it will find that no solution is possible.
Search
Solution
EXECUTION: The agent can now execute the actions in the solution, one at a time.
Execution
Figure 3.1
A simplified road map of part of Romania, with road distances in miles.
It is an important property that in a fully observable, deterministic, known environment, the
solution to any problem is a fixed sequence of actions: drive to Sibiu, then fa*garas, then
Bucharest. If the model is correct, then once the agent has found a solution, it can ignore its
percepts while it is executing the actions—closing its eyes, so to speak—because the solution
is guaranteed to lead to the goal. Control theorists call this an open-loop system: ignoring
the percepts breaks the loop between agent and environment. If there is a chance that the
model is incorrect, or the environment is nondeterministic, then the agent would be safer
using a closed-loop approach that monitors the percepts (see Section 4.4 ).
Open-loop
Closed-loop
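As a minimal sketch (the function names are ours), the two regimes differ only in whether percepts are consulted during execution:

def run_open_loop(solution, act):
    # Execute the precomputed solution blindly, ignoring percepts.
    for action in solution:
        act(action)

def run_closed_loop(policy, get_percept, act, at_goal):
    # Re-consult the percepts before choosing every action.
    while not at_goal():
        act(policy(get_percept()))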
In partially observable or nondeterministic environments, a solution would be a branching
strategy that recommends different future actions depending on what percepts arrive. For
example, the agent might plan to drive from Arad to Sibiu but might need a contingency
plan in case it arrives in Zerind by accident or finds a sign saying “Drum Închis” (Road
Closed).
3.1.1 Search problems and solutions
A search problem can be defined formally as follows:
Problem
A set of possible states that the environment can be in. We call this the state space.
States
State space
The initial state that the agent starts in. For example: Arad.
Initial state
A set of one or more goal states. Sometimes there is one goal state (e.g., Bucharest),
sometimes there is a small set of alternative goal states, and sometimes the goal is
defined by a property that applies to many states (potentially an infinite number). For
example, in a vacuum-cleaner world, the goal might be to have no dirt in any location,
regardless of any other facts about the state. We can account for all three of these
possibilities by specifying an IS-GOAL method for a problem. In this chapter we will
sometimes say “the goal” for simplicity, but what we say also applies to “any one of the
possible goal states.”
Goal states
The actions available to the agent. Given a state s, ACTIONS(s) returns a finite set of
actions that can be executed in s. We say that each of these actions is applicable in s.
An example: ACTIONS(Arad) = {ToSibiu, ToTimisoara, ToZerind}.
2 For problems with an infinite number of actions we would need techniques that go beyond this chapter.
Action
Applicable
A transition model, which describes what each action does. RESULT(s, a) returns the state
that results from doing action a in state s. For example, RESULT(Arad, ToZerind) = Zerind.
Transition model
An action cost function, denoted by ACTION-COST(s, a, s′) when we are programming or
c(s, a, s′) when we are doing math, that gives the numeric cost of applying action a in
state s to reach state s′. A problem-solving agent should use a cost function that reflects
its own performance measure; for example, for route-finding agents, the cost of an
action might be the length in miles (as seen in Figure 3.1), or it might be the time it
takes to complete the action.
Action cost function
A sequence of actions forms a path, and a solution is a path from the initial state to a goal
state. We assume that action costs are additive; that is, the total cost of a path is the sum of
the individual action costs. An optimal solution has the lowest path cost among all
solutions. In this chapter, we assume that all action costs will be positive, to avoid certain
complications.
3 In any problem with a cycle of net negative cost, the cost-optimal solution is to go around that cycle an infinite number of times. The
Bellman–Ford and Floyd–Warshall algorithms (not covered here) handle negative-cost actions, as long as there are no negative cycles.
It is easy to accommodate zero-cost actions, as long as the number of consecutive zero-cost actions is bounded. For example, we
might have a robot where there is a cost to move, but zero cost to rotate 90°; the algorithms in this chapter can handle this as long as
no more than three consecutive 90° turns are allowed. There is also a complication with problems that have an infinite number of
arbitrarily small action costs. Consider a version of Zeno's paradox where there is an action to move half way to the goal, at a cost of
half of the previous move. This problem has no solution with a finite number of actions, but to prevent a search from taking an
unbounded number of actions without quite reaching the goal, we can require that all action costs be at least ϵ, for some small positive
value ϵ.
Path
Optimal solution
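This formal definition maps naturally onto a small abstract class. The sketch below, loosely in the style of the book's online code repository (the exact names are our assumptions), bundles the components together, with additive path costs and a default action cost of 1:

class Problem:
    # An abstract formal search problem: initial state, goal test,
    # actions, transition model, and action costs.
    def __init__(self, initial):
        self.initial = initial

    def is_goal(self, state):
        raise NotImplementedError

    def actions(self, state):
        raise NotImplementedError        # finite set of applicable actions

    def result(self, state, action):
        raise NotImplementedError        # the transition model RESULT(s, a)

    def action_cost(self, s, a, s1):
        return 1                         # default: each action costs 1

def path_cost(problem, states, actions):
    # Action costs are additive: a path's cost is the sum of its action costs.
    return sum(problem.action_cost(s, a, s1)
               for s, a, s1 in zip(states, actions, states[1:]))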
The state space can be represented as a graph in which the vertices are states and the
directed edges between them are actions. The map of Romania shown in Figure 3.1 is such
a graph, where each road indicates two actions, one in each direction.
Graph
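Building on the Problem sketch above, a route-finding problem over this graph can be written down directly. The fragment of the road map below uses distances from Figure 3.1, and (as a simplifying design choice of ours) names each action by its destination city rather than by a ToCity label:

ROADS = {('Arad', 'Zerind'): 75, ('Arad', 'Sibiu'): 140,
         ('Arad', 'Timisoara'): 118, ('Sibiu', 'fa*garas'): 99,
         ('Sibiu', 'Rimnicu Vilcea'): 80, ('Rimnicu Vilcea', 'Pitesti'): 97,
         ('fa*garas', 'Bucharest'): 211, ('Pitesti', 'Bucharest'): 101}

class RouteProblem(Problem):
    def __init__(self, initial, goal, roads=ROADS):
        super().__init__(initial)
        self.goal = goal
        self.links = {}
        for (a, b), d in roads.items():  # each road yields one action in each direction
            self.links.setdefault(a, {})[b] = d
            self.links.setdefault(b, {})[a] = d

    def is_goal(self, state):
        return state == self.goal

    def actions(self, state):
        return set(self.links[state])    # e.g. {'Zerind', 'Sibiu', 'Timisoara'} from Arad

    def result(self, state, action):
        return action                    # driving to a city puts us in that city

    def action_cost(self, s, a, s1):
        return self.links[s][s1]         # road distance in miles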
3.1.2 Formulating problems
Our formulation of the problem of getting to Bucharest is a model—an abstract
mathematical description—and not the real thing. Compare the simple atomic state
description Arad to an actual cross-country trip, where the state of the world includes so
many things: the traveling companions, the current radio program, the scenery out of the
window, the proximity of law enforcement officers, the distance to the next rest stop, the
condition of the road, the weather, the traffic, and so on. All these considerations are left out
of our model because they are irrelevant to the problem of finding a route to Bucharest.
The process of removing detail from a representation is called abstraction. A good problem
formulation has the right level of detail. If the actions were at the level of “move the right
foot forward a centimeter” or “turn the steering wheel one degree left,” the agent would
probably never find its way out of the parking lot, let alone to Bucharest.
Abstraction
Can we be more precise about the appropriate level of abstraction? Think of the abstract
states and actions we have chosen as corresponding to large sets of detailed world states
and detailed action sequences. Now consider a solution to the abstract problem: for
example, the path from Arad to Sibiu to Rimnicu Vilcea to Pitesti to Bucharest. This abstract
solution corresponds to a large number of more detailed paths. For example, we could drive
with the radio on between Sibiu and Rimnicu Vilcea, and then switch it off for the rest of the
trip.
Level of abstraction
The abstraction is valid if we can elaborate any abstract solution into a solution in the more
detailed world; a sufficient condition is that for every detailed state that is “in Arad,” there is
a detailed path to some state that is “in Sibiu,” and so on. The abstraction is useful if
carrying out each of the actions in the solution is easier than the original problem; in our
case, the action “drive from Arad to Sibiu” can be carried out without further search or
planning by a driver with average skill. The choice of a good abstraction thus involves
removing as much detail as possible while retaining validity and ensuring that the abstract
actions are easy to carry out. Were it not for the ability to construct useful abstractions,
intelligent agents would be completely swamped by the real world.
4 See Section 11.4 .
3.2 Example Problems
The problem-solving approach has been applied to a vast array of task environments. We
list some of the best known here, distinguishing between standardized and real-world
problems. A standardized problem is intended to illustrate or exercise various problem-
solving methods. It can be given a concise, exact description and hence is suitable as a
benchmark for researchers to compare the performance of algorithms. A real-world
problem, such as robot navigation, is one whose solutions people actually use, and whose
formulation is idiosyncratic, not standardized, because, for example, each robot has different
sensors that produce different data.
Standardized problem
Real-world problem
3.2.1 Standardized problems
Grid world
A grid world problem is a two-dimensional rectangular array of square cells in which agents
can move from cell to cell. Typically the agent can move to any obstacle-free adjacent cell—
horizontally or vertically and in some problems diagonally. Cells can contain objects, which
the agent can pick up, push, or otherwise act upon; a wall or other impassible obstacle in a
cell prevents an agent from moving into that cell. The vacuum world from Section 2.1 can
be formulated as a grid world problem as follows:
STATES: A state of the world says which objects are in which cells. For the vacuum
world, the objects are the agent and any dirt. In the simple two-cell version, the agent
can be in either of the two cells, and each cell can either contain dirt or not, so there are
2 ⋅ 2 ⋅ 2 = 8 states (see Figure 3.2). In general, a vacuum environment with n cells has
n ⋅ 2^n states.
Figure 3.2
The state-space graph for the two-cell vacuum world. There are 8 states and three actions for each
state: L = Left, R = Right, S = Suck.
INITIAL STATE: Any state can be designated as the initial state.
ACTIONS: In the two-cell world we defined three actions: Suck, move Left, and move
Right. In a two-dimensional multi-cell world we need more movement actions. We
could add Upward and Downward, giving us four absolute movement actions, or we
could switch to egocentric actions, defined relative to the viewpoint of the agent—for
example, Forward, Backward, TurnRight, and TurnLeft.
TRANSITION MODEL: Suck removes any dirt from the agent's cell; Forward moves the
agent ahead one cell in the direction it is facing, unless it hits a wall, in which case the
action has no effect. Backward moves the agent in the opposite direction, while
TurnRight and TurnLeft change the direction it is facing by 90°.
GOAL STATES: The states in which every cell is clean.
ACTION COST: Each action costs 1.
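The two-cell version is small enough to write out in full. In the sketch below we assume a state is encoded as the tuple (agent location, dirt in left cell, dirt in right cell), which yields exactly the 2 ⋅ 2 ⋅ 2 = 8 states of Figure 3.2:

from itertools import product

def result(state, action):
    loc, dirt = state[0], [state[1], state[2]]
    if action == 'Suck':
        dirt[loc] = False      # remove any dirt from the agent's cell
    elif action == 'Left':
        loc = 0                # moving into a wall has no effect
    elif action == 'Right':
        loc = 1
    return (loc, dirt[0], dirt[1])

states = list(product((0, 1), (False, True), (False, True)))
assert len(states) == 8        # the 8 states of Figure 3.2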
Sokoban puzzle
Another type of grid world is the sokoban puzzle, in which the agent’s goal is to push a
number of boxes, scattered about the grid, to designated storage locations. There can be at
most one box per cell. When an agent moves forward into a cell containing a box and there
is an empty cell on the other side of the box, then both the box and the agent move forward.
The agent can’t push a box into another box or a wall. For a world with non-obstacle cells
and boxes, there are states; for example on an grid with a dozen
boxes, there are over 200 trillion states.
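The count is easy to check; taking n = 64 (an 8 × 8 grid with no obstacle cells, a simplifying assumption) and b = 12:

import math

n, b = 64, 12
print(n * math.comb(n, b))     # 210189740995584, i.e. over 200 trillion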
In a sliding-tile puzzle, a number of tiles (sometimes called blocks or pieces) are arranged
in a grid with one or more blank spaces so that some of the tiles can slide into the blank
space. One variant is the Rush Hour puzzle, in which cars and trucks slide around a 6 × 6
grid in an attempt to free a car from the traffic jam. Perhaps the best-known variant is the 8-
puzzle (see Figure 3.3), which consists of a 3 × 3 grid with eight numbered tiles and one
blank space, and the 15-puzzle on a 4 × 4 grid. The object is to reach a specified goal state,
such as the one shown on the right of the figure. The standard formulation of the 8-puzzle is
as follows:
STATES: A state description specifies the location of each of the tiles.
INITIAL STATE: Any state can be designated as the initial state. Note that a parity
property partitions the state space—any given goal can be reached from exactly half of
the possible initial states (see Exercise 3.PART).
ACTIONS: While in the physical world it is a tile that slides, the simplest way of
describing an action is to think of the blank space moving Left, Right, Up, or Down. If the
blank is at an edge or corner then not all actions will be applicable.
TRANSITION MODEL: Maps a state and action to a resulting state; for example, if we
apply Left to the start state in Figure 3.3, the resulting state has the 5 and the blank
switched.
Figure 3.3
A typical instance of the 8-puzzle.
GOAL STATE: Although any state could be the goal, we typically specify a state with
the numbers in order, as in Figure 3.3 .
ACTION COST: Each action costs 1.
Sliding-tile puzzle
8-puzzle
15-puzzle
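A compact encoding of this formulation (the representation is our own choice): store a state as a tuple of nine numbers in row-major order, with 0 standing for the blank.

MOVES = {'Left': -1, 'Right': +1, 'Up': -3, 'Down': +3}   # index offsets for the blank

def actions(state):
    i = state.index(0)                # position of the blank
    acts = []
    if i % 3 > 0: acts.append('Left')
    if i % 3 < 2: acts.append('Right')
    if i >= 3:    acts.append('Up')
    if i < 6:     acts.append('Down')
    return acts

def result(state, action):
    i = state.index(0)
    j = i + MOVES[action]             # cell the blank moves into
    s = list(state)
    s[i], s[j] = s[j], s[i]           # the displaced tile slides the other way
    return tuple(s)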
Note that every problem formulation involves abstractions. The 8-puzzle actions are
abstracted to their beginning and final states, ignoring the intermediate locations where the
tile is sliding. We have abstracted away actions such as shaking the board when tiles get
stuck and ruled out extracting the tiles with a knife and putting them back again. We are left
with a description of the rules, avoiding all the details of physical manipulations.
Our final standardized problem was devised by Donald Knuth (1964) and illustrates how
infinite state spaces can arise. Knuth conjectured that starting with the number 4, a
sequence of square root, floor, and factorial operations can reach any desired positive
integer. For example, we can reach 5 from 4 as follows:
⌊√√√√√(4!)!⌋ = 5.
The problem definition is simple:
STATES: Positive real numbers.
INITIAL STATE: 4.
ACTIONS: Apply square root, floor, or factorial operation (factorial for integers only).
TRANSITION MODEL: As given by the mathematical definitions of the operations.
GOAL STATE: The desired positive integer.
ACTION COST: Each action costs 1.
The state space for this problem is infinite: for any integer greater than 2 the factorial
operator will always yield a larger integer. The problem is interesting because it explores
very large numbers: the shortest path to 5 goes through (4!)! = 620,448,401,733,239,439,360,000.
Infinite state spaces arise frequently in tasks
involving the generation of mathematical expressions, circuits, proofs, programs, and other
recursively defined objects.
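The example above is easy to verify in code (the operation encoding is ours):

import math

def apply_ops(n, ops):
    for op in ops:
        if op == 'sqrt':
            n = math.sqrt(n)
        elif op == 'floor':
            n = math.floor(n)
        elif op == '!':
            n = math.factorial(n)     # valid for integers only
    return n

# Factorial twice, five square roots, then floor: reaches 5 from 4.
print(apply_ops(4, ['!', '!'] + ['sqrt'] * 5 + ['floor']))   # -> 5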
3.2.2 Real-world problems
We have already seen how the route-finding problem is defined in terms of specified
locations and transitions along edges between them. Route-finding algorithms are used in a
variety of applications. Some, such as Web sites and in-car systems that provide driving
directions, are relatively straightforward extensions of the Romania example. (The main
complications are varying costs due to traffic-dependent delays, and rerouting due to road
closures.) Others, such as routing video streams in computer networks, military operations
planning, and airline travel-planning systems, involve much more complex specifications.
Consider the airline travel problems that must be solved by a travel-planning Web site:
STATES: Each state obviously includes a location (e.g., an airport) and the current time.
Furthermore, because the cost of an action (a flight segment) may depend on previous
segments, their fare bases, and their status as domestic or international, the state must
record extra information about these “historical” aspects.
INITIAL STATE: The user’s home airport.
ACTIONS: Take any flight from the current location, in any seat class, leaving after the
current time, leaving enough time for within-airport transfer if needed.
TRANSITION MODEL: The state resulting from taking a flight will have the flight’s
destination as the new location and the flight’s arrival time as the new time.
GOAL STATE: A destination city. Sometimes the goal can be more complex, such as
“arrive at the destination on a nonstop flight.”
ACTION COST: A combination of monetary cost, waiting time, flight time, customs and
immigration procedures, seat quality, time of day, type of airplane, frequent-flyer reward
points, and so on.
Commercial travel advice systems use a problem formulation of this kind, with many
additional complications to handle the airlines’ byzantine fare structures. Any seasoned
traveler knows, however, that not all air travel goes according to plan. A really good system
should include contingency plans—what happens if this flight is delayed and the connection
is missed?
Touring problems describe a set of locations that must be visited, rather than a single goal
destination. The traveling salesperson problem (TSP) is a touring problem in which every
city on a map must be visited. The aim is to find a tour with cost < C (or, in the optimization
version, to find a tour with the lowest cost possible). An enormous amount of effort has
been expended to improve the capabilities of TSP algorithms. The algorithms can also be
extended to handle fleets of vehicles. For example, a search and optimization algorithm for
routing school buses in Boston saved $5 million, cut traffic and air pollution, and saved time
for drivers and students (Bertsimas et al., 2019). In addition to planning trips, search
algorithms have been used for tasks such as planning the movements of automatic circuit-
board drills and of stocking machines on shop floors.
Touring problem
Traveling salesperson problem (TSP)
A VLSI layout problem requires positioning millions of components and connections on a
chip to minimize area, minimize circuit delays, minimize stray capacitances, and maximize
manufacturing yield. The layout problem comes after the logical design phase and is usually
split into two parts: cell layout and channel routing. In cell layout, the primitive
components of the circuit are grouped into cells, each of which performs some recognized
function. Each cell has a fixed footprint (size and shape) and requires a certain number of
connections to each of the other cells. The aim is to place the cells on the chip so that they
do not overlap and so that there is room for the connecting wires to be placed between the
cells. Channel routing finds a specific route for each wire through the gaps between the
cells. These search problems are extremely complex, but definitely worth solving.
VLSI layout
Robot navigation is a generalization of the route-finding problem described earlier. Rather
than following distinct paths (such as the roads in Romania), a robot can roam around, in
effect making its own paths. For a circular robot moving on a flat surface, the space is
essentially two-dimensional. When the robot has arms and legs that must also be controlled,
the search space becomes many-dimensional—one dimension for each joint angle.
Advanced techniques are required just to make the essentially continuous search space
finite (see Chapter 26 ). In addition to the complexity of the problem, real robots must also
deal with errors in their sensor readings and motor controls, with partial observability, and
with other agents that might alter the environment.
Robot navigation
Automatic assembly sequencing of complex objects (such as electric motors) by a robot has
been standard industry practice since the 1970s. Algorithms first find a feasible assembly
sequence and then work to optimize the process. Minimizing the amount of manual human
labor on the assembly line can produce significant savings in time and cost. In assembly
problems, the aim is to find an order in which to assemble the parts of some object. If the
wrong order is chosen, there will be no way to add some part later in the sequence without
undoing some of the work already done. Checking an action in the sequence for feasibility is
a difficult geometrical search problem closely related to robot navigation. Thus, the
generation of legal actions is the expensive part of assembly sequencing. Any practical
algorithm must avoid exploring all but a tiny fraction of the state space. One important
assembly problem is protein design, in which the goal is to find a sequence of amino acids
that will fold into a three-dimensional protein with the right properties to cure some
disease.
Automatic assembly sequencing
Protein design
3.3 Search Algorithms
A search algorithm takes a search problem as input and returns a solution, or an indication
of failure. In this chapter we consider algorithms that superimpose a search tree over the
state-space graph, forming various paths from the initial state, trying to find a path that
reaches a goal state. Each node in the search tree corresponds to a state in the state space
and the edges in the search tree correspond to actions. The root of the tree corresponds to
the initial state of the problem.
Search algorithm
Node
It is important to understand the distinction between the state space and the search tree.
The state space describes the (possibly infinite) set of states in the world, and the actions
that allow transitions from one state to another. The search tree describes paths between
these states, reaching towards the goal. The search tree may have multiple paths to (and
thus multiple nodes for) any given state, but each node in the tree has a unique path back to
the root (as in all trees).
Expand
Figure 3.4 shows the first few steps in finding a path from Arad to Bucharest. The root
node of the search tree is at the initial state, Arad. We can expand the node, by considering
the available ACTIONS for that state, using the RESULT function to see where those actions lead
to, and generating a new node (called a child node or successor node) for each of the
resulting states. Each child node has Arad as its parent node.
Figure 3.4
Three partial search trees for finding a route from Arad to Bucharest. Nodes that have been expanded are
lavender with bold letters; nodes on the frontier that have been generated but not yet expanded are in
green; the set of states corresponding to these two types of nodes are said to have been reached. Nodes
that could be generated next are shown in faint dashed lines. Notice in the bottom tree there is a cycle
from Arad to Sibiu to Arad; that can’t be an optimal path, so search should not continue from there.
Generating
Child node
Successor node
Parent node
Now we must choose which of these three child nodes to consider next. This is the essence
of search—following up one option now and putting the others aside for later. Suppose we
choose to expand Sibiu first. Figure 3.4 (bottom) shows the result: a set of 6 unexpanded
nodes (outlined in bold). We call this the frontier of the search tree. We say that any state
that has had a node generated for it has been reached (whether or not that node has been
expanded). Figure 3.5 shows the search tree superimposed on the state-space graph.
5 Some authors call the frontier the open list, which is both geographically less evocative and computationally less appropriate,
because a queue is more efficient than a list here. Those authors use the term closed list to refer to the set of previously expanded
nodes, which in our terminology would be the reached nodes minus the frontier.
Figure 3.5
A sequence of search trees generated by a graph search on the Romania problem of Figure 3.1 . At each
stage, we have expanded every node on the frontier, extending every path with all applicable actions that
don’t result in a state that has already been reached. Notice that at the third stage, the topmost city
(Oradea) has two successors, both of which have already been reached by other paths, so no
paths are extended from Oradea.
Frontier
Reached
Note that the frontier separates two regions of the state-space graph: an interior region
where every state has been expanded, and an exterior region of states that have not yet
been reached. This property is illustrated in Figure 3.6 .
Figure 3.6
The separation property of graph search, illustrated on a rectangular-grid problem. The frontier (green)
separates the interior (lavender) from the exterior (faint dashed). The frontier is the set of nodes (and
corresponding states) that have been reached but not yet expanded; the interior is the set of nodes (and
corresponding states) that have been expanded; and the exterior is the set of states that have not been
reached. In (a), just the root has been expanded. In (b), the top frontier node is expanded. In (c), the
remaining successors of the root are expanded in clockwise order.
Separator
3.3.1 Best-first search
How do we decide which node from the frontier to expand next? A very general approach is
called best-first search, in which we choose a node, n, with minimum value of some
evaluation function, f(n). Figure 3.7 shows the algorithm. On each iteration we choose a
node on the frontier with minimum f(n) value, return it if its state is a goal state, and
otherwise apply EXPAND to generate child nodes. Each child node is added to the frontier if it
has not been reached before, or is re-added if it is now being reached with a path that has a
lower path cost than any previous path. The algorithm returns either an indication of failure,
or a node that represents a path to a goal. By employing different f(n) functions, we get
different specific algorithms, which this chapter will cover.
Figure 3.7
The best-first search algorithm, and the function for expanding a node. The data structures used here are
described in Section 3.3.2 . See Appendix B for yield.
Best-first search
Evaluation function
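As a concrete rendering of this algorithm, here is a minimal Python sketch of best-first search. It is not the Figure 3.7 pseudocode verbatim; the problem interface assumed here (initial, is_goal, actions, result, action_cost) is our own convention for illustration. Later sketches in this chapter reuse the Node class and expand function defined here.

import heapq
import itertools
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Node:
    state: Any
    parent: Optional["Node"] = None   # the node that generated this node
    action: Any = None                # action applied to the parent's state
    path_cost: float = 0.0            # g(node): cost of the path from the root

def expand(problem, node):
    """Generate the child node for each action applicable in node.state."""
    s = node.state
    for action in problem.actions(s):
        s2 = problem.result(s, action)
        cost = node.path_cost + problem.action_cost(s, action, s2)
        yield Node(state=s2, parent=node, action=action, path_cost=cost)

def best_first_search(problem, f):
    """Expand the frontier node with minimum f(n); late goal test."""
    node = Node(problem.initial)
    counter = itertools.count()        # tie-breaker so the heap never compares Nodes
    frontier = [(f(node), next(counter), node)]
    reached = {problem.initial: node}  # best node known for each state
    while frontier:
        _, _, node = heapq.heappop(frontier)
        if problem.is_goal(node.state):
            return node
        for child in expand(problem, node):
            s = child.state
            if s not in reached or child.path_cost < reached[s].path_cost:
                reached[s] = child     # re-add on a cheaper path
                heapq.heappush(frontier, (f(child), next(counter), child))
    return None                        # failure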
3.3.2 Search data structures
Search algorithms require a data structure to keep track of the search tree. A node in the
tree is represented by a data structure with four components:
node.STATE: the state to which the node corresponds;
node.PARENT: the node in the tree that generated this node;
node.ACTION: the action that was applied to the parent’s state to generate this node;
node.PATH-COST: the total cost of the path from the initial state to this node. In
mathematical formulas, we use g(node) as a synonym for PATH-COST.
Following the PARENT pointers back from a node allows us to recover the states and actions
along the path to that node. Doing this from a goal node gives us the solution.
We need a data structure to store the frontier. The appropriate choice is a queue of some
kind, because the operations on a frontier are:
IS-EMPTY(frontier) returns true only if there are no nodes in the frontier.
POP(frontier) removes the top node from the frontier and returns it.
TOP(frontier) returns (but does not remove) the top node of the frontier.
ADD(node, frontier) inserts node into its proper place in the queue.
Queue
Three kinds of queues are used in search algorithms:
A priority queue first pops the node with the minimum cost according to some
evaluation function, f. It is used in best-first search.
Priority queue
A FIFO queue or first-in-first-out queue first pops the node that was added to the queue
first; we shall see it is used in breadth-first search.
FIFO queue
A LIFO queue or last-in-first-out queue (also known as a stack) pops first the most
recently added node; we shall see it is used in depth-first search.
LIFO queue
Stack
The reached states can be stored as a lookup table (e.g. a hash table) where each key is a
state and each value is the node for that state.
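In the Python sketch above, the four node components map directly onto the Node dataclass, and the three queue disciplines onto standard-library containers. The helper below, our own illustration rather than the book's code, recovers the solution by following PARENT pointers.

# Frontier disciplines with standard containers (illustrative):
#   priority queue -> a heapq of (f(n), tie_breaker, node) tuples
#   FIFO queue     -> collections.deque with append / popleft
#   LIFO queue     -> a plain list with append / pop

def solution_actions(node):
    """Follow PARENT pointers from a goal node back to the root,
    returning the sequence of actions along the path."""
    actions = []
    while node.parent is not None:
        actions.append(node.action)
        node = node.parent
    return list(reversed(actions))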
3.3.3 Redundant paths
The search tree shown in Figure 3.4 (bottom) includes a path from Arad to Sibiu and back
to Arad again. We say that Arad is a repeated state in the search tree, generated in this case
by a cycle (also known as a loopy path). So even though the state space has only 20 states,
the complete search tree is infinite because there is no limit to how often one can traverse a
loop.
Repeated state
Cycle
Loopy path
A cycle is a special case of a redundant path. For example, we can get to Sibiu via the path
Arad–Sibiu (140 miles long) or the path Arad–Zerind–Oradea–Sibiu (297 miles long). This
second path is redundant—it’s just a worse way to get to the same state—and need not be
considered in our quest for optimal paths.
Redundant path
Consider an agent in a 10 × 10 grid world, with the ability to move to any of 8 adjacent
squares. If there are no obstacles, the agent can reach any of the 100 squares in 9 moves or
fewer. But the number of paths of length 9 is almost 8^9 (a bit less because of the edges of the
grid), or more than 100 million. In other words, the average cell can be reached by over a
million redundant paths of length 9, and if we eliminate redundant paths, we can complete a
search roughly a million times faster. As the saying goes, algorithms that cannot remember the
past are doomed to repeat it. There are three approaches to this issue.
First, we can remember all previously reached states (as best-first search does), allowing us
to detect all redundant paths, and keep only the best path to each state. This is appropriate
for state spaces where there are many redundant paths, and is the preferred choice when
the table of reached states will fit in memory.
Graph search
Tree-like search
Second, we can not worry about repeating the past. There are some problem formulations
where it is rare or impossible for two paths to reach the same state. An example would be
an assembly problem where each action adds a part to an evolving assemblage, and there is
an ordering of parts so that it is possible to add A and then B, but not B and then A. For
those problems, we could save memory space if we don’t track reached states and we don’t
check for redundant paths. We call a search algorithm a graph search if it checks for
redundant paths and a tree-like search if it does not check. The BEST-FIRST-SEARCH algorithm
in Figure 3.7 is a graph search algorithm; if we remove all references to reached we get a
tree-like search that uses less memory but will examine redundant paths to the same state,
and thus will run slower.
6 We say “tree-like search” because the state space is still the same graph no matter how we search it; we are just choosing to treat it
as if it were a tree, with only one path from each node back to the root.
Third, we can compromise and check for cycles, but not for redundant paths in general.
Since each node has a chain of parent pointers, we can check for cycles with no need for
additional memory by following up the chain of parents to see if the state at the end of the
path has appeared earlier in the path. Some implementations follow this chain all the way
up, and thus eliminate all cycles; other implementations follow only a few links (e.g., to the
parent, grandparent, and great-grandparent), and thus take only a constant amount of time,
while eliminating all short cycles (and relying on other mechanisms to deal with long
cycles).
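A sketch of the cycle check just described, assuming the Node structure from the earlier best-first sketch; with k=None it follows the whole parent chain, while a small k catches only short cycles in constant time.

def is_cycle(node, k=None):
    """Does node.state appear among its ancestors on the path to the root?"""
    ancestor, steps = node.parent, 0
    while ancestor is not None and (k is None or steps < k):
        if ancestor.state == node.state:
            return True
        ancestor, steps = ancestor.parent, steps + 1
    return False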
3.3.4 Measuring problem-solving performance
Before we get into the design of various search algorithms, we will consider the criteria used
to choose among them. We can evaluate an algorithm’s performance in four ways:
COMPLETENESS: Is the algorithm guaranteed to find a solution when there is one, and
to correctly report failure when there is not?
Completeness
COST OPTIMALITY: Does it find a solution with the lowest path cost of all solutions?
7 Some authors use the term “admissibility” for the property of finding the lowest-cost solution, and some use just “optimality,”
but that can be confused with other types of optimality.
Cost optimality
TIME COMPLEXITY: How long does it take to find a solution? This can be measured in
seconds, or more abstractly by the number of states and actions considered.
Time complexity
SPACE COMPLEXITY: How much memory is needed to perform the search?
Space complexity
To understand completeness, consider a search problem with a single goal. That goal could
be anywhere in the state space; therefore a complete algorithm must be capable of
systematically exploring every state that is reachable from the initial state. In finite state
spaces that is straightforward to achieve: as long as we keep track of paths and cut off ones
that are cycles (e.g. Arad to Sibiu to Arad), eventually we will reach every reachable state.
In infinite state spaces, more care is necessary. For example, an algorithm that repeatedly
applied the “factorial” operator in Knuth’s “4” problem would follow an infinite path from 4
to 4! to (4!)!, and so on. Similarly, on an infinite grid with no obstacles, repeatedly moving
forward in a straight line also follows an infinite path of new states. In both cases the
algorithm never returns to a state it has reached before, but is incomplete because wide
expanses of the state space are never reached.
To be complete, a search algorithm must be systematic in the way it explores an infinite
state space, making sure it can eventually reach any state that is connected to the initial
state. For example, on the infinite grid, one kind of systematic search is a spiral path that
covers all the cells that are s steps from the origin before moving out to cells that are s + 1
steps away. Unfortunately, in an infinite state space with no solution, a sound algorithm
needs to keep searching forever; it can’t terminate because it can’t know if the next state will
be a goal.
Systematic
Time and space complexity are considered with respect to some measure of the problem
difficulty. In theoretical computer science, the typical measure is the size of the state-space
graph, |V| + |E|, where |V| is the number of vertices (state nodes) of the graph and |E| is the
number of edges (distinct state/action pairs). This is appropriate when the graph is an
explicit data structure, such as the map of Romania. But in many AI problems, the graph is
represented only implicitly by the initial state, actions, and transition model. For an implicit
state space, complexity can be measured in terms of d, the depth or number of actions in an
optimal solution; m, the maximum number of actions in any path; and b, the branching
factor or number of successors of a node that need to be considered.
Depth
Branching factor
3.4 Uninformed Search Strategies
An uninformed search algorithm is given no clue about how close a state is to the goal(s).
For example, consider our agent in Arad with the goal of reaching Bucharest. An
uninformed agent with no knowledge of Romanian geography has no clue whether going to
Zerind or Sibiu is a better first step. In contrast, an informed agent (Section 3.5 ) who
knows the location of each city knows that Sibiu is much closer to Bucharest and thus more
likely to be on the shortest path.
3.4.1 Breadth-first search
When all actions have the same cost, an appropriate strategy is breadth-first search, in
which the root node is expanded first, then all the successors of the root node are expanded
next, then their successors, and so on. This is a systematic search strategy that is therefore
complete even on infinite state spaces. We could implement breadth-first search as a call to
BEST-FIRST-SEARCH where the evaluation function f(n) is the depth of the node—that is, the
number of actions it takes to reach the node.
Breadth-first search
However, we can get additional efficiency with a couple of tricks. A first-in-first-out queue
will be faster than a priority queue, and will give us the correct order of nodes: new nodes
(which are always deeper than their parents) go to the back of the queue, and old nodes,
which are shallower than the new nodes, get expanded first. In addition, reached can be a set
of states rather than a mapping from states to nodes, because once we’ve reached a state, we
can never find a better path to the state. That also means we can do an early goal test,
checking whether a node is a solution as soon as it is generated, rather than the late goal test
that best-first search uses, waiting until a node is popped off the queue. Figure 3.8 shows
the progress of a breadth-first search on a binary tree, and Figure 3.9 shows the algorithm
with the early-goal efficiency enhancements.
Figure 3.8
Breadth-first search on a simple binary tree. At each stage, the node to be expanded next is indicated by
the triangular marker.
Figure 3.9
Breadth-first search and uniform-cost search algorithms.
Early goal test
Late goal test
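Here is a corresponding Python sketch with both tricks: a FIFO queue, a reached set of states, and the early goal test applied when a node is generated. It reuses the Node class and expand function from the best-first sketch above.

from collections import deque

def breadth_first_search(problem):
    node = Node(problem.initial)
    if problem.is_goal(node.state):          # the root might already be a goal
        return node
    frontier = deque([node])                 # FIFO queue
    reached = {problem.initial}              # a set of states is enough here
    while frontier:
        node = frontier.popleft()
        for child in expand(problem, node):
            if problem.is_goal(child.state): # early goal test, on generation
                return child
            if child.state not in reached:
                reached.add(child.state)
                frontier.append(child)
    return None                              # failure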
Breadth-first search always finds a solution with a minimal number of actions, because
when it is generating nodes at depth d, it has already generated all the nodes at depth d − 1,
so if one of them were a solution, it would have been found. That means it is cost-optimal
for problems where all actions have the same cost, but not for problems that don’t have that
property. It is complete in either case. In terms of time and space, imagine searching a
uniform tree where every state has b successors. The root of the search tree generates b
nodes, each of which generates b more nodes, for a total of b^2 at the second level. Each of
these generates b more nodes, yielding b^3 nodes at the third level, and so on. Now suppose
that the solution is at depth d. Then the total number of nodes generated is

1 + b + b^2 + b^3 + ⋯ + b^d = O(b^d).

All the nodes remain in memory, so both time and space complexity are O(b^d). Exponential
bounds like that are scary. As a typical real-world example, consider a problem with
branching factor b = 10, processing speed 1 million nodes/second, and memory
requirements of 1 Kbyte/node. A search to depth d = 10 would take less than 3 hours, but
would require 10 terabytes of memory. The memory requirements are a bigger problem for
breadth-first search than the execution time. But time is still an important factor. At depth
d = 14, even with infinite memory, the search would take 3.5 years. In general, exponential-
complexity search problems cannot be solved by uninformed search for any but the smallest
instances.
3.4.2 Dijkstra’s algorithm or uniform-cost search
When actions have different costs, an obvious choice is to use best-first search where the
evaluation function is the cost of the path from the root to the current node. This is called
Dijkstra’s algorithm by the theoretical computer science community, and uniform-cost
search by the AI community. The idea is that while breadth-first search spreads out in
waves of uniform depth—first depth 1, then depth 2, and so on—uniform-cost search spreads
out in waves of uniform path-cost. The algorithm can be implemented as a call to BEST-FIRST-
SEARCH with PATH-COST as the evaluation function, as shown in Figure 3.9 .
Uniform-cost search
Consider Figure 3.10, where the problem is to get from Sibiu to Bucharest. The successors
of Sibiu are Rimnicu Vilcea and fa*garas, with costs 80 and 99, respectively. The least-cost
node, Rimnicu Vilcea, is expanded next, adding Pitesti with cost 80 + 97 = 177. The least-
cost node is now fa*garas, so it is expanded, adding Bucharest with cost 99 + 211 = 310.
Bucharest is the goal, but the algorithm tests for goals only when it expands a node, not
when it generates a node, so it has not yet detected that this is a path to the goal.
Figure 3.10
Part of the Romania state space, selected to illustrate uniform-cost search.
The algorithm continues on, choosing Pitesti for expansion next and adding a second path
to Bucharest with cost 80 + 97 + 101 = 278. It has a lower cost, so it replaces the previous
path in reached and is added to the frontier. It turns out this node now has the lowest cost, so
it is considered next, found to be a goal, and returned. Note that if we had checked for a
goal upon generating a node rather than when expanding the lowest-cost node, then we
would have returned a higher-cost path (the one through fa*garas).

The complexity of uniform-cost search is characterized in terms of C*, the cost of the
optimal solution, and ϵ, a lower bound on the cost of each action, with ϵ > 0. Then the
algorithm’s worst-case time and space complexity is O(b^(1+⌊C*/ϵ⌋)), which can be much greater
than b^d. This is because uniform-cost search can explore large trees of actions with low costs
before exploring paths involving a high-cost and perhaps useful action. When all action
costs are equal, b^(1+⌊C*/ϵ⌋) is just b^(d+1), and uniform-cost search is similar to breadth-first
search.
8 Here, and throughout the book, the “star” in C* means an optimal value for C.
Uniform-cost search is complete and is cost-optimal, because the first solution it finds will
have a cost that is at least as low as the cost of any other node in the frontier. Uniform-cost
search considers all paths systematically in order of increasing cost, never getting caught
going down a single infinite path (assuming that all action costs are > ϵ > 0).
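In terms of the best-first sketch given earlier, the whole algorithm is one line; only the evaluation function changes:

def uniform_cost_search(problem):
    """Dijkstra's algorithm / uniform-cost search: f(n) = g(n)."""
    return best_first_search(problem, f=lambda node: node.path_cost)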
3.4.3 Depth-first search and the problem of memory
Depth-first search
Depth-first search always expands the deepest node in the frontier first. It could be
implemented as a call to BEST-FIRST-SEARCH where the evaluation function f is the negative of
the depth. However, it is usually implemented not as a graph search but as a tree-like search
that does not keep a table of reached states. The progress of the search is illustrated in
Figure 3.11 ; search proceeds immediately to the deepest level of the search tree, where the
nodes have no successors. The search then “backs up” to the next deepest node that still has
unexpanded successors. Depth-first search is not cost-optimal; it returns the first solution it
finds, even if it is not cheapest.
Figure 3.11
A dozen steps (left to right, top to bottom) in the progress of a depth-first search on a binary tree from
start state A to goal M. The frontier is in green, with a triangle marking the node to be expanded next.
Previously expanded nodes are lavender, and potential future nodes have faint dashed lines. Expanded
nodes with no descendants in the frontier (very faint lines) can be discarded.
For finite state spaces that are trees it is efficient and complete; for acyclic state spaces it
may end up expanding the same state many times via different paths, but will (eventually)
systematically explore the entire space.
In cyclic state spaces it can get stuck in an infinite loop; therefore some implementations of
depth-first search check each new node for cycles. Finally, in infinite state spaces, depth-first
search is not systematic: it can get stuck going down an infinite path, even if there are no
cycles. Thus, depth-first search is incomplete.
With all this bad news, why would anyone consider using depth-first search rather than
breadth-first or best-first? The answer is that for problems where a tree-like search is
feasible, depth-first search has much smaller needs for memory. We don’t keep a reached
table at all, and the frontier is very small: think of the frontier in breadth-first search as the
surface of an ever-expanding sphere, while the frontier in depth-first search is just a radius
of the sphere.
For a finite tree-shaped state space like the one in Figure 3.11, a depth-first tree-like
search takes time proportional to the number of states, and has memory complexity of only
O(bm), where b is the branching factor and m is the maximum depth of the tree. Some
problems that would require exabytes of memory with breadth-first search can be handled
with only kilobytes using depth-first search. Because of its parsimonious use of memory,
depth-first tree-like search has been adopted as the basic workhorse of many areas of AI,
including constraint satisfaction (Chapter 6 ), propositional satisfiability (Chapter 7 ), and
logic programming (Chapter 9 ).
A variant of depth-first search called backtracking search uses even less memory. (See
Chapter 6 for more details.) In backtracking, only one successor is generated at a time
rather than all successors; each partially expanded node remembers which successor to
generate next. In addition, successors are generated by modifying the current state
description directly rather than allocating memory for a brand-new state. This reduces the
memory requirements to just one state description and a path of O(m) actions; a significant
savings over O(bm) states for depth-first search. With backtracking we also have the option
of maintaining an efficient set data structure for the states on the current path, allowing us
to check for a cyclic path in O(1) time rather than O(m). For backtracking to work, we must
be able to undo each action when we backtrack. Backtracking is critical to the success of
many problems with large state descriptions, such as robotic assembly.
Backtracking search
3.4.4 Depth-limited and iterative deepening search
To keep depth-first search from wandering down an infinite path, we can use depth-limited
search, a version of depth-first search in which we supply a depth limit, ℓ, and treat all
nodes at depth ℓ as if they had no successors (see Figure 3.12). The time complexity is O(b^ℓ)
and the space complexity is O(bℓ). Unfortunately, if we make a poor choice for ℓ the
algorithm will fail to reach the solution, making it incomplete again.
Figure 3.12
Iterative deepening and depth-limited tree-like search. Iterative deepening repeatedly applies depth-
limited search with increasing limits. It returns one of three different types of values: either a solution
node; or failure, when it has exhausted all nodes and proved there is no solution at any depth; or cutoff,
to mean there might be a solution at a deeper depth than ℓ. This is a tree-like search algorithm that does
not keep track of reached states, and thus uses much less memory than best-first search, but runs the risk
of visiting the same state multiple times on different paths. Also, if the IS-CYCLE check does not check all
cycles, then the algorithm may get caught in a loop.
Depth-limited search
Since depth-first search is a tree-like search, we can’t keep it from wasting time on
redundant paths in general, but we can eliminate cycles at the cost of some computation
time. If we look only a few links up in the parent chain we can catch most cycles; longer
cycles are handled by the depth limit.
Sometimes a good depth limit can be chosen based on knowledge of the problem. For
example, on the map of Romania there are 20 cities. Therefore, ℓ = 19 is a valid limit. But if
we studied the map carefully, we would discover that any city can be reached from any
other city in at most 9 actions. This number, known as the diameter of the state-space
graph, gives us a better depth limit, which leads to a more efficient depth-limited search.
However, for most problems we will not know a good depth limit until we have solved the
problem.
Diameter
Iterative deepening search solves the problem of picking a good value for ℓ by trying all
values: first 0, then 1, then 2, and so on—until either a solution is found, or the depth-
limited search returns the failure value rather than the cutoff value. The algorithm is shown
in Figure 3.12 . Iterative deepening combines many of the benefits of depth-first and
breadth-first search. Like depth-first search, its memory requirements are modest: O(bd)
when there is a solution, or O(bm) on finite state spaces with no solution. Like breadth-first
search, iterative deepening is optimal for problems where all actions have the same cost,
and is complete on finite acyclic state spaces, or on any finite state space when we check
nodes for cycles all the way up the path.
Iterative deepening search
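A compact Python sketch of depth-limited and iterative deepening search, again reusing Node, expand, and is_cycle from the earlier sketches; the "failure" and "cutoff" strings stand in for the distinguished values of Figure 3.12.

import itertools

def depth_limited_search(problem, limit):
    """Tree-like DFS that treats nodes at depth `limit` as having no successors."""
    frontier = [Node(problem.initial)]       # LIFO stack
    result = "failure"
    while frontier:
        node = frontier.pop()
        if problem.is_goal(node.state):
            return node
        if depth(node) >= limit:
            result = "cutoff"                # a solution may lie deeper
        elif not is_cycle(node):
            frontier.extend(expand(problem, node))
    return result

def iterative_deepening_search(problem):
    """Try limits 0, 1, 2, ... until a solution is found or failure is proved."""
    for limit in itertools.count():
        result = depth_limited_search(problem, limit)
        if result != "cutoff":
            return result

def depth(node):
    """Number of actions on the path from the root to this node."""
    d = 0
    while node.parent is not None:
        node, d = node.parent, d + 1
    return d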
The time complexity is O(b^d) when there is a solution, or O(b^m) when there is none. Each
iteration of iterative deepening search generates a new level, in the same way that breadth-
first search does, but breadth-first does this by storing all nodes in memory, while iterative-
deepening does it by repeating the previous levels, thereby saving memory at the cost of
more time. Figure 3.13 shows four iterations of iterative-deepening search on a binary
search tree, where the solution is found on the fourth iteration.
Figure 3.13
Four iterations of iterative deepening search for goal M on a binary tree, with the depth limit varying
from 0 to 3. Note the interior nodes form a single path. The triangle marks the node to expand next;
green nodes with dark outlines are on the frontier; the very faint nodes provably can’t be part of a
solution with this depth limit.
Iterative deepening search may seem wasteful because states near the top of the search tree
are re-generated multiple times. But for many state spaces, most of the nodes are in the
bottom level, so it does not matter much that the upper levels are repeated. In an iterative
deepening search, the nodes on the bottom level (depth d) are generated once, those on the
next-to-bottom level are generated twice, and so on, up to the children of the root, which
are generated d times. So the total number of nodes generated in the worst case is

N(IDS) = (d)b + (d − 1)b^2 + (d − 2)b^3 + ⋯ + b^d,

which gives a time complexity of O(b^d)—asymptotically the same as breadth-first search. For
example, if b = 10 and d = 5, the numbers are

N(IDS) = 50 + 400 + 3,000 + 20,000 + 100,000 = 123,450
N(BFS) = 10 + 100 + 1,000 + 10,000 + 100,000 = 111,110.
If you are really concerned about the repetition, you can use a hybrid approach that runs
breadth-first search until almost all the available memory is consumed, and then runs
iterative deepening from all the nodes in the frontier. In general, iterative deepening is the
preferred uninformed search method when the search state space is larger than can fit in memory
and the depth of the solution is not known.
3.4.5 Bidirectional search
The algorithms we have covered so far start at an initial state and can reach any one of
multiple possible goal states. An alternative approach called bidirectional search
simultaneously searches forward from the initial state and backwards from the goal state(s),
hoping that the two searches will meet. The motivation is that b^(d/2) + b^(d/2) is much less than
b^d (e.g., 50,000 times less when b = d = 10).
Bidirectional search
For this to work, we need to keep track of two frontiers and two tables of reached states, and
we need to be able to reason backwards: if state s′ is a successor of s in the forward
direction, then we need to know that s is a successor of s′ in the backward direction. We
have a solution when the two frontiers collide.
9 In our implementation, the reached data structure supports a query asking whether a given state is a member, and the frontier data
structure (a priority queue) does not, so we check for a collision using reached; but conceptually we are asking if the two frontiers have
met up. The implementation can be extended to handle multiple goal states by loading the node for each goal state into the backwards
frontier and backwards reached table.
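The collision test itself is simple. Here is a sketch, with a helper name of our own rather than the book's, that checks whether a newly generated child joins up with the other direction's search:

def collision_cost(child, reached_other):
    """If child's state has also been reached from the other direction,
    the two half-paths join into a candidate solution; return its total
    cost, or None if the frontiers have not met at this state."""
    other = reached_other.get(child.state)
    if other is None:
        return None
    return child.path_cost + other.path_cost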
There are many different versions of bidirectional search, just as there are many different
unidirectional search algorithms. In this section, we describe bidirectional best-first search.
Although there are two separate frontiers, the node to be expanded next is always one with
a minimum value of the evaluation function, across either frontier. When the evaluation
function is the path cost, we get bidirectional uniform-cost search, and if the cost of the
optimal path is C*, then no node with cost > C*/2 will be expanded. This can result in a
considerable speedup.
The general best-first bidirectional search algorithm is shown in Figure 3.14 . We pass in
two versions of the problem and the evaluation function, one in the forward direction
(subscript F) and one in the backward direction (subscript B). When the evaluation
function is the path cost, we know that the first solution found will be an optimal solution,
but with different evaluation functions that is not necessarily true. Therefore, we keep track
of the best solution found so far, and might have to update that several times before the
TERMINATED test proves that there is no possible better solution remaining.
Figure 3.14
Bidirectional best-first search keeps two frontiers and two tables of reached states. When a path in one
frontier reaches a state that was also reached in the other half of the search, the two paths are joined (by
the function JOIN-NODES) to form a solution. The first solution we get is not guaranteed to be the best; the
function TERMINATED determines when to stop looking for new solutions.
3.4.6 Comparing uninformed search algorithms
Figure 3.15 compares uninformed search algorithms in terms of the four evaluation criteria
set forth in Section 3.3.4 . This comparison is for tree-like search versions which don’t
check for repeated states. For graph searches which do check, the main differences are that
depth-first search is complete for finite state spaces, and the space and time complexities are
bounded by the size of the state space (the number of vertices and edges, |V| + |E|).
Figure 3.15
Evaluation of search algorithms. b is the branching factor; m is the maximum depth of the search tree; d is
the depth of the shallowest solution, or is m when there is no solution; ℓ is the depth limit. Superscript
caveats are as follows: 1 complete if b is finite, and the state space either has a solution or is finite;
2 complete if all action costs are ≥ ε > 0; 3 cost-optimal if action costs are all identical; 4 if both directions
are breadth-first or uniform-cost.
3.5 Informed (Heuristic) Search Strategies
This section shows how an informed search strategy—one that uses domain-specific hints
about the location of goals—can find solutions more efficiently than an uninformed strategy.
The hints come in the form of a heuristic function, denoted h(n):

h(n) = estimated cost of the cheapest path from the state at node n to a goal state.
10 It may seem odd that the heuristic function operates on a node, when all it really needs is the node’s state. It is traditional to use
h(n) rather than h(s) to be consistent with the evaluation function f(n) and the path cost g(n).
Informed search
Heuristic function
For example, in route-finding problems, we can estimate the distance from the current state
to a goal by computing the straight-line distance on the map between the two points. We
study heuristics and where they come from in more detail in Section 3.6.
3.5.1 Greedy best-first search
Greedy best-first search is a form of best-first search that expands first the node with the
lowest h(n) value—the node that appears to be closest to the goal—on the grounds that this
is likely to lead to a solution quickly. So the evaluation function f(n) = h(n).
Greedy best-first search
Let us see how this works for route-finding problems in Romania; we use the straight-line
distance heuristic, which we will call hSLD. If the goal is Bucharest, we need to know the
straight-line distances to Bucharest, which are shown in Figure 3.16. For example,
hSLD(Arad) = 366. Notice that the values of hSLD cannot be computed from the problem
description itself (that is, the ACTIONS and RESULT functions). Moreover, it takes a certain
amount of world knowledge to know that hSLD is correlated with actual road distances and
is, therefore, a useful heuristic.
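Here is a sketch of an hSLD-style heuristic built from a table of map coordinates, together with greedy search as a one-line specialization of the earlier best-first sketch. The coords argument and helper names are our own illustration; the book's Figure 3.16 simply tabulates the distances.

import math

def straight_line_distance(coords, goal):
    """Build h(n) from a table mapping each city to (x, y) coordinates."""
    gx, gy = coords[goal]
    def h(node):
        x, y = coords[node.state]
        return math.hypot(x - gx, y - gy)
    return h

def greedy_best_first_search(problem, h):
    """Greedy best-first search over the earlier sketch: f(n) = h(n)."""
    return best_first_search(problem, f=h)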
Figure 3.16
Values of hSLD—straight-line distances to Bucharest.
Straight-line distance
Figure 3.17 shows the progress of a greedy best-first search using hSLD to find a path from
Arad to Bucharest. The first node to be expanded from Arad will be Sibiu because the
heuristic says it is closer to Bucharest than is either Zerind or Timisoara. The next node to
be expanded will be fa*garas because it is now closest according to the heuristic. fa*garas in
turn generates Bucharest, which is the goal. For this particular problem, greedy best-first
search using hSLD finds a solution without ever expanding a node that is not on the solution
path. The solution it found does not have optimal cost, however: the path via Sibiu and
fa*garas to Bucharest is 32 miles longer than the path through Rimnicu Vilcea and Pitesti.
This is why the algorithm is called “greedy”—on each iteration it tries to get as close to a
goal as it can, but greediness can lead to worse results than being careful.
Figure 3.17
Stages in a greedy best-first tree-like search for Bucharest with the straight-line distance heuristic hSLD.
Nodes are labeled with their h-values.

Greedy best-first graph search is complete in finite state spaces, but not in infinite ones. The
worst-case time and space complexity is O(|V|). With a good heuristic function, however,
the complexity can be reduced substantially, on certain problems reaching O(bm).
3.5.2 A* search
The most common informed search algorithm is A* search (pronounced “A-star search”), a
best-first search that uses the evaluation function
f(n) = g(n) + h(n)
A* search
where g(n) is the path cost from the initial state to node n, and h(n) is the estimated cost of
the shortest path from n to a goal state, so we have

f(n) = estimated cost of the best path that continues from n to a goal.
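Expressed against the earlier best-first sketch, A* differs from uniform-cost search only in its evaluation function:

def astar_search(problem, h):
    """A* search: f(n) = g(n) + h(n)."""
    return best_first_search(problem, f=lambda node: node.path_cost + h(node))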
In Figure 3.18, we show the progress of an A* search with the goal of reaching Bucharest.
The values of g are computed from the action costs in Figure 3.1, and the values of hSLD
are given in Figure 3.16. Notice that Bucharest first appears on the frontier at step (e), but
it is not selected for expansion (and thus not detected as a solution) because at f = 450 it is
not the lowest-cost node on the frontier—that would be Pitesti, at f = 417. Another way to
say this is that there might be a solution through Pitesti whose cost is as low as 417, so the
algorithm will not settle for a solution that costs 450. At step (f), a different path to
Bucharest is now the lowest-cost node, at f = 418, so it is selected and detected as the
optimal solution.
Figure 3.18
Stages in an A* search for Bucharest. Nodes are labeled with f = g + h. The h values are the straight-line
distances to Bucharest taken from Figure 3.16.
Admissible heuristic
A* search is complete. Whether A* is cost-optimal depends on certain properties of the
heuristic. A key property is admissibility: an admissible heuristic is one that never
overestimates the cost to reach a goal. (An admissible heuristic is therefore optimistic.) With
an admissible heuristic, A* is cost-optimal, which we can show with a proof by
contradiction. Suppose the optimal path has cost C*, but the algorithm returns a path with
cost C > C*. Then there must be some node n which is on the optimal path and is
unexpanded (because if all the nodes on the optimal path had been expanded, then we
would have returned that optimal solution). So then, using the notation g*(n) to mean the
cost of the optimal path from the start to n, and h*(n) to mean the cost of the optimal path
from n to the nearest goal, we have:

f(n) > C*             (otherwise n would have been expanded)
f(n) = g(n) + h(n)    (by definition)
f(n) = g*(n) + h(n)   (because n is on an optimal path)
f(n) ≤ g*(n) + h*(n)  (because of admissibility, h(n) ≤ h*(n))
f(n) ≤ C*             (by definition, C* = g*(n) + h*(n))
11 Again, assuming all action costs are > ϵ > 0, and the state space either has a solution or is finite.
The first and last lines form a contradiction, so the supposition that the algorithm could
return a suboptimal path must be wrong—it must be that A* returns only cost-optimal
paths.
A slightly stronger property is called consistency. A heuristic h(n) is consistent if, for every
node n and every successor n′ of n generated by an action a, we have:

h(n) ≤ c(n, a, n′) + h(n′).
Consistency
This is a form of the triangle inequality, which stipulates that a side of a triangle cannot be
longer than the sum of the other two sides (see Figure 3.19 ). An example of a consistent
heuristic is the straight-line distance hSLD that we used in getting to Bucharest.
Figure 3.19
Triangle inequality: If the heuristic h is consistent, then the single number h(n) will be less than the sum
of the cost c(n, a, n′) of the action from n to n′ plus the heuristic estimate h(n′).
Triangle inequality
Every consistent heuristic is admissible (but not vice versa), so with a consistent heuristic,
A* is cost-optimal. In addition, with a consistent heuristic, the first time we reach a state it
will be on an optimal path, so we never have to re-add a state to the frontier, and never
have to change an entry in reached. But with an inconsistent heuristic, we may end up with
multiple paths reaching the same state, and if each new path has a lower path cost than the
previous one, then we will end up with multiple nodes for that state in the frontier, costing
us both time and space. Because of that, some implementations of A* take care to only enter
a state into the frontier once, and if a better path to the state is found, all the successors of
the state are updated (which requires that nodes have child pointers as well as parent
pointers). These complications have led many implementers to avoid inconsistent heuristics,
but Felner et al. (2011) argue that the worst effects rarely happen in practice, and one
shouldn’t be afraid of inconsistent heuristics.
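When experimenting with heuristics, it can help to test the consistency inequality directly. The checker below is our own illustration, not the book's code; for simplicity, h here is applied to states rather than nodes. Finding a violation proves inconsistency, while passing on a sample is only evidence.

def find_inconsistency(problem, h, states):
    """Look for a violation of h(s) <= c(s, a, s') + h(s') on the sample."""
    for s in states:
        for a in problem.actions(s):
            s2 = problem.result(s, a)
            if h(s) > problem.action_cost(s, a, s2) + h(s2):
                return (s, a, s2)   # a witness that h is inconsistent
    return None                     # no violation found in this sample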
With an inadmissible heuristic, A* may or may not be cost-optimal. Here are two cases
where it is: First, if there is even one cost-optimal path on which h(n) is admissible for all
nodes n on the path, then that path will be found, no matter what the heuristic says for
states off the path. Second, if the optimal solution has cost C*, and the second-best has cost
C2, and if h(n) overestimates some costs, but never by more than C2 − C*, then A* is
guaranteed to return cost-optimal solutions.
3.5.3 Search contours
A useful way to visualize a search is to draw contours in the state space, just like the
contours in a topographic map. Figure 3.20 shows an example. Inside the contour labeled
400, all nodes have f(n) = g(n) + h(n) ≤ 400, and so on. Then, because A* expands the
frontier node of lowest f-cost, we can see that an A* search fans out from the start node,
adding nodes in concentric bands of increasing f-cost.
Figure 3.20
Map of Romania showing contours at f = 380, f = 400, and f = 420, with Arad as the start state. Nodes
inside a given contour have f = g + h costs less than or equal to the contour value.
h(n)
n
C ∗,
C2, h(n) C2 − C ∗,
Contour
With uniform-cost search, we also have contours, but of g-cost, not g + h. The contours with
uniform-cost search will be “circular” around the start state, spreading out equally in all
directions with no preference towards the goal. With A* search using a good heuristic, the
bands will stretch toward a goal state (as in Figure 3.20 ) and become more narrowly
focused around an optimal path.
It should be clear that as you extend a path, the g costs are monotonic: the path cost always
increases as you go along a path, because action costs are always positive. Therefore you
get concentric contour lines that don’t cross each other, and if you choose to draw the lines
fine enough, you can put a line between any two nodes on any path.
12 Technically, we say “strictly monotonic” for costs that always increase, and “monotonic” for costs that never decrease, but might
remain the same.
Monotonic
But it is not obvious whether the f = g + h cost will monotonically increase. As you extend
a path from n to n′, the cost goes from g(n) + h(n) to g(n) + c(n, a, n′) + h(n′). Canceling
out the g(n) term, we see that the path’s cost will be monotonically increasing if and only if
h(n) ≤ c(n, a, n′) + h(n′); in other words, if and only if the heuristic is consistent. But note
that a path might contribute several nodes in a row with the same g(n) + h(n) score; this
will happen whenever the decrease in h is exactly equal to the action cost just taken (for
example, in a grid problem, when n is in the same row as the goal and you take a step
towards the goal, g is increased by 1 and h is decreased by 1). If C* is the cost of the optimal
solution path, then we can say the following:
13 In fact, the term “monotonic heuristic” is a synonym for “consistent heuristic.” The two ideas were developed independently, and
then it was proved that they are equivalent (Pearl, 1984).
A* expands all nodes that can be reached from the initial state on a path where every
node on the path has f(n) < C*. We say these are surely expanded nodes.
Surely expanded nodes
A* might then expand some of the nodes right on the “goal contour” (where f(n) = C*)
before selecting a goal node.

A* expands no nodes with f(n) > C*.
We say that A* with a consistent heuristic is optimally efficient in the sense that any
algorithm that extends search paths from the initial state, and uses the same heuristic
information, must expand all nodes that are surely expanded by A* (because any one of
them could have been part of an optimal solution). Among the nodes with f(n) = C*, one
algorithm could get lucky and choose the optimal one first while another algorithm is
unlucky; we don’t consider this difference in defining optimal efficiency.
Optimally efficient
A* is efficient because it prunes away search tree nodes that are not necessary for finding
an optimal solution. In Figure 3.18(b) we see that Timisoara has f = 447 and Zerind has
f = 449. Even though they are children of the root and would be among the first nodes
expanded by uniform-cost or breadth-first search, they are never expanded by A* search
because the solution with f = 418 is found first. The concept of pruning—eliminating
possibilities from consideration without having to examine them—is important for many
areas of AI.
Pruning
That A* search is complete, cost-optimal, and optimally efficient among all such algorithms
is rather satisfying. Unfortunately, it does not mean that A* is the answer to all our
searching needs. The catch is that for many problems, the number of nodes expanded can
be exponential in the length of the solution. For example, consider a version of the vacuum
world with a super-powerful vacuum that can clean up any one square at a cost of 1 unit,
without even having to visit the square; in that scenario, squares can be cleaned in any
order. With N initially dirty squares, there are 2^N states where some subset has been
cleaned; all of those states are on an optimal solution path, and hence satisfy f(n) < C*, so
all of them would be visited by A*.
3.5.4 Satisficing search: Inadmissible heuristics and weighted
A*
Inadmissible heuristic
A* search has many good qualities, but it expands a lot of nodes. We can explore fewer
nodes (taking less time and space) if we are willing to accept solutions that are suboptimal,
but are “good enough”—what we call satisficing solutions. If we allow A* search to use an
inadmissible heuristic—one that may overestimate—then we risk missing the optimal
solution, but the heuristic can potentially be more accurate, thereby reducing the number of
nodes expanded. For example, road engineers know the concept of a detour index, which is
a multiplier applied to the straight-line distance to account for the typical curvature of roads.
A detour index of 1.3 means that if two cities are 10 miles apart in straight-line distance, a
good estimate of the best path between them is 13 miles. For most localities, the detour
index ranges between 1.2 and 1.6.
Detour index
We can apply this idea to any problem, not just ones involving roads, with an approach
called weighted A* search where we weight the heuristic value more heavily, giving us the
evaluation function f(n) = g(n) + W × h(n), for some W > 1.
Weighted A* search
Figure 3.21 shows a search problem on a grid world. In (a), an A* search finds the optimal
solution, but has to explore a large portion of the state space to find it. In (b), a weighted A*
search finds a solution that is slightly costlier, but the search time is much faster. We see
that the weighted search focuses the contour of reached states towards a goal. That means
that fewer states are explored, but if the optimal path ever strays outside of the weighted
search’s contour (as it does in this case), then the optimal path will not be found. In general,
if the optimal solution costs C*, a weighted A* search will find a solution that costs
somewhere between C* and W × C*; but in practice we usually get results much closer to
C* than to W × C*.
Figure 3.21
Two searches on the same grid: (a) an A* search and (b) a weighted A* search with weight W = 2. The
gray bars are obstacles, the purple line is the path from the green start to red goal, and the small dots are
states that were reached by each search. On this particular problem, weighted A* explores 7 times fewer
states and finds a path that is 5% more costly.
We have considered searches that evaluate states by combining g and h in various ways;
weighted A* can be seen as a generalization of the others:

A* search:                  g(n) + h(n)       (W = 1)
Uniform-cost search:        g(n)              (W = 0)
Greedy best-first search:   h(n)              (W = ∞)
Weighted A* search:         g(n) + W × h(n)   (1 < W < ∞)
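In code, all four are the same one-line specialization of the earlier best-first sketch; only W changes:

def weighted_astar_search(problem, h, W=2):
    """Weighted A*: f(n) = g(n) + W * h(n). W = 1 gives A*, W = 0 gives
    uniform-cost search, and a very large W approaches greedy best-first."""
    return best_first_search(problem, f=lambda n: n.path_cost + W * h(n))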
You could call weighted A* “somewhat-greedy search”: like greedy best-first search, it
focuses the search towards a goal; on the other hand, it won’t ignore the path cost
completely, and will suspend a path that is making little progress at great cost.
There are a variety of suboptimal search algorithms, which can be characterized by the
criteria for what counts as “good enough.” In bounded suboptimal search, we look for a
solution that is guaranteed to be within a constant factor W of the optimal cost. Weighted
A* provides this guarantee. In bounded-cost search, we look for a solution whose cost is
less than some constant C. And in unbounded-cost search, we accept a solution of any cost,
as long as we can find it quickly.
as long as we can find it quickly.
Bounded suboptimal search
Bounded-cost search
Unbounded-cost search
An example of an unbounded-cost search algorithm is speedy search, which is a version of
greedy best-first search that uses as a heuristic the estimated number of actions required to
reach a goal, regardless of the cost of those actions. Thus, for problems where all actions
have the same cost, it is the same as greedy best-first search, but when actions have different
costs, it tends to lead the search to find a solution quickly, even if it might have a high cost.
Speedy search
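A sketch of speedy search under the same assumed Problem interface as above, where d(n) estimates the number of remaining actions to a goal:

import heapq, itertools

def speedy_search(problem, d):
    # Best-first search ordered by d(n): the estimated number of remaining
    # actions to a goal. Path cost g(n) and action costs are ignored.
    counter = itertools.count()
    frontier = [(d(problem.initial), next(counter), problem.initial)]
    reached = {problem.initial}
    while frontier:
        _, _, s = heapq.heappop(frontier)
        if problem.is_goal(s):
            return s
        for _, s2 in problem.successors(s):   # note: action cost discarded
            if s2 not in reached:
                reached.add(s2)
                heapq.heappush(frontier, (d(s2), next(counter), s2))
    return None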
3.5.5 Memory-bounded search
The main issue with A* is its use of memory. In this section we’ll cover some
implementation tricks that save space, and then some entirely new algorithms that take
better advantage of the available space.
Memory is split between the frontier and the reached states. In our implementation of best-
first search, a state that is on the frontier is stored in two places: as a node in the frontier (so
we can decide what to expand next) and as an entry in the table of reached states (so we
know if we have visited the state before). For many problems (such as exploring a grid), this
duplication is not a concern, because the size of frontier is much smaller than reached, so
duplicating the states in the frontier requires a comparatively trivial amount of memory. But
some implementations keep a state in only one of the two places, saving a bit of space at the
cost of complicating (and perhaps slowing down) the algorithm.
Another possibility is to remove states from reached when we can prove that they are no
longer needed. For some problems, we can use the separation property (Figure 3.6 on
page 72), along with the prohibition of U-turn actions, to ensure that all actions either move
outwards from the frontier or onto another frontier state. In that case, we need only check
the frontier for redundant paths, and we can eliminate the reached table.
For other problems, we can keep reference counts of the number of times a state has been
reached, and remove it from the reached table when there are no more ways to reach the
state. For example, on a grid world where each state can be reached only from its four
neighbors, once we have reached a state four times, we can remove it from the table.
Reference count
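A sketch of that bookkeeping, with illustrative names (counts stands in for the reached table, and in_degree(s) is assumed to return the number of distinct predecessors of s, e.g., 4 for an interior grid cell):

def record_reached(s, counts, in_degree):
    # Count how many of s's incoming edges have been used. In best-first
    # search each predecessor of s is expanded at most once, so after
    # in_degree(s) arrivals no new path to s can ever be generated and
    # the entry may be dropped from the table.
    counts[s] = counts.get(s, 0) + 1
    if counts[s] == in_degree(s):
        del counts[s]          # s can never be reached again by a new path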
Now let’s consider new algorithms that are designed to conserve memory usage.
Beam search limits the size of the frontier. The easiest approach is to keep only the k nodes
with the best f-scores, discarding any other expanded nodes. This of course makes the
search incomplete and suboptimal, but we can choose k to make good use of available
memory, and the algorithm executes fast because it expands fewer nodes. For many
problems it can find good near-optimal solutions. You can think of uniform-cost or A*
search as spreading out everywhere in concentric contours, and think of beam search as
exploring only a focused portion of those contours, the portion that contains the k best
candidates.
Beam search
An alternative version of beam search doesn’t keep a strict limit on the size of the frontier
but instead keeps every node whose f-score is within δ of the best f-score. That way, when
there are a few strong-scoring nodes only a few will be kept, but if there are no strong nodes
then more will be kept until a strong one emerges.
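A minimal sketch of the fixed-width version, again under the assumed Problem interface; for brevity it scores states directly with a caller-supplied f rather than tracking path costs:

def beam_search(problem, f, k=100):
    # Frontier capped at the k best-scoring states; everything else is
    # discarded, so memory is bounded but completeness and optimality
    # are lost.
    beam, seen = [problem.initial], {problem.initial}
    while beam:
        for s in beam:
            if problem.is_goal(s):
                return s
        successors = []
        for s in beam:
            for _, s2 in problem.successors(s):
                if s2 not in seen:
                    seen.add(s2)
                    successors.append(s2)
        beam = sorted(successors, key=f)[:k]   # keep only the k best scores
    return None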
Iterative-deepening A* search (IDA*) is to A* what iterative-deepening search is to depth-
first: IDA* gives us the benefits of A* without the requirement to keep all reached states in
memory, at a cost of visiting some states multiple times. It is a very important and
commonly used algorithm for problems that do not fit in memory.
Iterative-deepening A* search
In standard iterative deepening the cutoff is the depth, which is increased by one each
iteration. In IDA* the cutoff is the f-cost (g + h); at each iteration, the cutoff value is the
smallest f-cost of any node that exceeded the cutoff on the previous iteration. In other
words, each iteration exhaustively searches an f-contour, finds a node just beyond that
contour, and uses that node’s f-cost as the next contour. For problems like the 8-puzzle
where each path’s f-cost is an integer, this works very well, resulting in steady progress
towards the goal each iteration. If the optimal solution has cost C*, then there can be no
more than C* iterations (for example, no more than 31 iterations on the hardest 8-puzzle
problems). But for a problem where every node has a different f-cost, each new contour
might contain only one new node, and the number of iterations could be equal to the
number of states.
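One way the contour-by-contour iteration can be written, as a sketch under the same assumed Problem interface and heuristic h:

import math

def ida_star(problem, h):
    # Iterative deepening on f = g + h: each pass is a depth-first search
    # cut off at the current f-limit; the next limit is the smallest
    # f-value that exceeded it.
    def dfs(s, g, limit, path):
        f = g + h(s)
        if f > limit:
            return None, f                 # report the overflowing f-cost
        if problem.is_goal(s):
            return path, f
        smallest = math.inf
        for cost, s2 in problem.successors(s):
            if s2 not in path:             # avoid cycles along this path
                found, f2 = dfs(s2, g + cost, limit, path + [s2])
                if found is not None:
                    return found, f2
                smallest = min(smallest, f2)
        return None, smallest

    limit = h(problem.initial)
    while limit < math.inf:
        solution, limit = dfs(problem.initial, 0, limit, [problem.initial])
        if solution is not None:
            return solution
    return None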
Recursive best-first search (RBFS) (Figure 3.22) attempts to mimic the operation of
standard best-first search, but using only linear space. RBFS resembles a recursive depth-
first search, but rather than continuing indefinitely down the current path, it uses the f_limit
variable to keep track of the f-value of the best alternative path available from any ancestor
of the current node. If the current node exceeds this limit, the recursion unwinds back to the
alternative path. As the recursion unwinds, RBFS replaces the f-value of each node along
the path with a backed-up value—the best f-value of its children. In this way, RBFS
remembers the f-value of the best leaf in the forgotten subtree and can therefore decide
whether it’s worth reexpanding the subtree at some later time. Figure 3.23 shows how
RBFS reaches Bucharest.
Figure 3.22
The algorithm for recursive best-first search.
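As a supplement to the figure, here is a Python sketch of RBFS under the same assumed Problem interface; the book's own pseudocode in Figure 3.22 and the online code repository remain the authoritative versions.

import math

def recursive_best_first_search(problem, h):
    # f_limit is the f-value of the best alternative path available from
    # any ancestor of the current node.
    def rbfs(s, g, f, f_limit):
        if problem.is_goal(s):
            return s, f
        children = []
        for cost, s2 in problem.successors(s):
            g2 = g + cost
            # a child's f is at least its parent's backed-up f-value
            children.append([max(g2 + h(s2), f), g2, s2])
        if not children:
            return None, math.inf
        while True:
            children.sort(key=lambda c: c[0])      # best child first
            best = children[0]
            if best[0] > f_limit:
                return None, best[0]               # unwind; back up best f
            alternative = children[1][0] if len(children) > 1 else math.inf
            result, best[0] = rbfs(best[2], best[1], best[0],
                                   min(f_limit, alternative))
            if result is not None:
                return result, best[0]

    solution, _ = rbfs(problem.initial, 0, h(problem.initial), math.inf)
    return solution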
Figure 3.23
Stages in an RBFS search for the shortest route to Bucharest. The f_limit value for each recursive call is
shown on top of each current node, and every node is labeled with its f-cost. (a) The path via Rimnicu
Vilcea is followed until the current best leaf (Pitesti) has a value that is worse than the best alternative
path (Fagaras). (b) The recursion unwinds and the best leaf value of the forgotten subtree (417) is backed
up to Rimnicu Vilcea; then Fagaras is expanded, revealing a best leaf value of 450. (c) The recursion
unwinds and the best leaf value of the forgotten subtree (450) is backed up to Fagaras; then Rimnicu
Vilcea is expanded. This time, because the best alternative path (through Timisoara) costs at least 447,
the expansion continues to Bucharest.
Recursive best-first search
Backed-up value
RBFS is somewhat more efficient than IDA*, but still suffers from excessive node
regeneration. In the example in Figure 3.23, RBFS follows the path via Rimnicu Vilcea,
then “changes its mind” and tries Fagaras, and then changes its mind back again. These
mind changes occur because every time the current best path is extended, its f-value is
likely to increase—h is usually less optimistic for nodes closer to a goal. When this happens,
the second-best path might become the best path, so the search has to backtrack to follow it.
Each mind change corresponds to an iteration of IDA* and could require many
reexpansions of forgotten nodes to recreate the best path and extend it one more node.

RBFS is optimal if the heuristic function h(n) is admissible. Its space complexity is linear in
the depth of the deepest optimal solution, but its time complexity is rather difficult to
characterize:
it depends both on the accuracy of the heuristic function and on how often the
best path changes as nodes are expanded. It expands nodes in order of increasing f-score,
even if f is nonmonotonic.
IDA* and RBFS suffer from using too little memory. Between iterations, IDA* retains only a
single number: the current f-cost limit. RBFS retains more information in memory, but it
uses only linear space: even if more memory were available, RBFS has no way to make use
of it. Because they forget most of what they have done, both algorithms may end up
reexploring the same states many times over.
It seems sensible, therefore, to determine how much memory we have available, and allow
an algorithm to use all of it. Two algorithms that do this are MA* (memory-bounded A*)
and SMA* (simplified MA*). SMA* is—well—simpler, so we will describe it. SMA*
proceeds just like A*, expanding the best leaf until memory is full. At this point, it cannot
add a new node to the search tree without dropping an old one. SMA* always drops the
worst leaf node—the one with the highest f-value. Like RBFS, SMA* then backs up the value
of the forgotten node to its parent. In this way, the ancestor of a forgotten subtree knows the
quality of the best path in that subtree. With this information, SMA* regenerates the subtree
only when all other paths have been shown to look worse than the path it has forgotten.
Another way of saying this is that if all the descendants of a node n are forgotten, then we
will not know which way to go from n, but we will still have an idea of how worthwhile it is
to go anywhere from n.
MA*
SMA*
The complete algorithm is described in the online code repository accompanying this book.
There is one subtlety worth mentioning. We said that SMA* expands the best leaf and
deletes the worst leaf. What if all the leaf nodes have the same f-value? To avoid selecting
the same node for deletion and expansion, SMA* expands the newest best leaf and deletes
the oldest worst leaf. These coincide when there is only one leaf, but in that case, the current
search tree must be a single path from root to leaf that fills all of memory. If the leaf is not a
goal node, then even if it is on an optimal solution path, that solution is not reachable with the
available memory. Therefore, the node can be discarded exactly as if it had no successors.
SMA* is complete if there is any reachable solution—that is, if d, the depth of the shallowest
goal node, is less than the memory size (expressed in nodes). It is optimal if any optimal
solution is reachable; otherwise, it returns the best reachable solution. In practical terms,
SMA* is a fairly robust choice for finding optimal solutions, particularly when the state
space is a graph, action costs are not uniform, and node generation is expensive compared
to the overhead of maintaining the frontier and the reached set.
On very hard problems, however, it will often be the case that SMA* is forced to switch
back and forth continually among many candidate solution paths, only a small subset of
which can fit in memory. (This resembles the problem of thrashing in disk paging systems.)
Then the extra time required for repeated regeneration of the same nodes means that
problems that would be practically solvable by A*, given unlimited memory, become
intractable for SMA*. That is to say, memory limitations can make a problem intractable from
the point of view of computation time. Although no current theory explains the tradeoff
between time and memory, it seems that this is an inescapable problem. The only way out is
to drop the optimality requirement.
Thrashing
3.5.6 Bidirectional heuristic search
With unidirectional best-first search, we saw that using f(n) = g(n) + h(n) as the evaluation
function gives us an A* search that is guaranteed to find optimal-cost solutions (assuming
an admissible h) while being optimally efficient in the number of nodes expanded.

With bidirectional best-first search we could also try using f(n) = g(n) + h(n), but
unfortunately there is no guarantee that this would lead to an optimal-cost solution, nor that
it would be optimally efficient, even with an admissible heuristic. With bidirectional search,
it turns out that it is not individual nodes but rather pairs of nodes (one from each frontier)
that can be proved to be surely expanded, so any proof of efficiency will have to consider
pairs of nodes (Eckerle et al., 2017).
We’ll start with some new notation. We use fF(n) = gF(n) + hF(n) for nodes going in the
forward direction (with the initial state as root) and fB(n) = gB(n) + hB(n) for nodes in the
backward direction (with a goal state as root). Although both forward and backward
searches are solving the same problem, they have different evaluation functions because, for
example, the heuristics are different depending on whether you are striving for the goal or
for the initial state. We’ll assume admissible heuristics.
Consider a forward path from the initial state to a node m and a backward path from the
goal to a node n. We can define a lower bound on the cost of a solution that follows the
path from the initial state to m, then somehow gets to n, then follows the path to the goal as

lb(m, n) = max(gF(m) + gB(n), fF(m), fB(n)).

In other words, the cost of such a path must be at least as large as the sum of the path costs
of the two parts (because the remaining connection between them must have nonnegative
cost), and the cost must also be at least as much as the estimated f-cost of either part
(because the heuristic estimates are optimistic). Given that, the theorem is that for any pair
of nodes m, n with lb(m, n) less than the optimal cost C*, we must expand either m or n,
because the path that goes through both of them is a potential optimal solution. The
difficulty is that we don’t know for sure which node is best to expand, and therefore no
bidirectional search algorithm can be guaranteed to be optimally efficient—any algorithm
might expand up to twice the minimum number of nodes if it always chooses the wrong
member of a pair to expand first. Some bidirectional heuristic search algorithms explicitly
manage a queue of (m, n) pairs, but we will stick with bidirectional best-first search (Figure
3.14), which has two frontier priority queues, and give it an evaluation function that
mimics the lb criteria:

f2(n) = max(2g(n), g(n) + h(n))

The node to expand next will be the one that minimizes this f2 value; the node can come
from either frontier. This f2 function guarantees that we will never expand a node (from
either frontier) with g(n) > C*/2. We say the two halves of the search “meet in the middle” in
the sense that when the two frontiers touch, no node inside of either frontier has a path cost
greater than the bound C*/2. Figure 3.24 works through an example bidirectional search.
Figure 3.24
Bidirectional search maintains two frontiers: on the left, nodes A and B are successors of Start; on the
right, node F is an inverse successor of Goal. Each node is labeled with f = g + h values and the
f2 = max(2g, g + h) value. (The g values are the sum of the action costs as shown on each arrow; the h
values are arbitrary and cannot be derived from anything in the figure.) The optimal solution, Start-A-F-
Goal, has cost C* = 4 + 2 + 4 = 10, so that means that a meet-in-the-middle bidirectional algorithm
should not expand any node with g > C*/2 = 5; and indeed the next node to be expanded would be A or F
(each with g = 4), leading us to an optimal solution. If we expanded the node with lowest f cost first,
then B and C would come next, and D and E would be tied with A, but they all have g > 5 and thus are
never expanded when f2 is the evaluation function.
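As a small sketch of the two quantities in play, with the forward and backward cost and heuristic functions supplied by the caller (illustrative names, not the book's code):

def lb(m, n, gF, gB, fF, fB):
    # Lower bound on any solution that exits the forward frontier at m
    # and enters the backward frontier at n.
    return max(gF(m) + gB(n), fF(m), fB(n))

def f2(n, g, h):
    # Priority for meet-in-the-middle bidirectional search: never expands
    # a node with g(n) > C*/2, since for such a node 2g(n) > C*.
    return max(2 * g(n), g(n) + h(n))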
I Artificial Intelligence
Chapter 1
Introduction
In which we try to explain why we consider artificial intelligence to be a subject most worthy of
study, and in which we try to decide what exactly it is, this being a good thing to decide before
embarking.
We call ourselves Homo sapiens—man the wise—because our intelligence is so important to
us. For thousands of years, we have tried to understand how we think and act—that is, how
our brain, a mere handful of matter, can perceive, understand, predict, and manipulate a
world far larger and more complicated than itself. The field of artificial intelligence, or AI, is
concerned with not just understanding but also building intelligent entities—machines that
can compute how to act effectively and safely in a wide variety of novel situations.
Intelligence
Artificial Intelligence
Surveys regularly rank AI as one of the most interesting and fastest-growing fields, and it is
already generating over a trillion dollars a year in revenue. AI expert Kai-Fu Lee predicts
that its impact will be “more than anything in the history of mankind.” Moreover, the
intellectual frontiers of AI are wide open. Whereas a student of an older science such as
physics might feel that the best ideas have already been discovered by Galileo, Newton,
Curie, Einstein, and the rest, AI still has many openings for full-time masterminds.
AI currently encompasses a huge variety of subfields, ranging from the general (learning,
reasoning, perception, and so on) to the specific, such as playing chess, proving
mathematical theorems, writing poetry, driving a car, or diagnosing diseases. AI is relevant
to any intellectual task; it is truly a universal field.
1.1 What Is AI?
We have claimed that AI is interesting, but we have not said what it is. Historically,
researchers have pursued several different versions of AI. Some have defined intelligence in
terms of fidelity to human performance, while others prefer an abstract, formal definition of
intelligence called rationality—loosely speaking, doing the “right thing.” The subject matter
itself also varies: some consider intelligence to be a property of internal thought processes and
reasoning, while others focus on intelligent behavior, an external characterization.
1 In the public eye, there is sometimes confusion between the terms “artificial intelligence” and “machine learning.” Machine
learning is a subfield of AI that studies the ability to improve performance based on experience. Some AI systems use machine learning
methods to achieve competence, but some do not.
Rationality
From these two dimensions—human vs. rational and thought vs. behavior—there are four
possible combinations, and there have been adherents and research programs for all four.
The methods used are necessarily different: the pursuit of human-like intelligence must be
in part an empirical science related to psychology, involving observations and hypotheses
about actual human behavior and thought processes; a rationalist approach, on the other
hand, involves a combination of mathematics and engineering, and connects to statistics,
control theory, and economics. The various groups have both disparaged and helped each
other. Let us look at the four approaches in more detail.
2 We are not suggesting that humans are “irrational” in the dictionary sense of “deprived of normal mental clarity.” We are merely
conceding that human decisions are not always mathematically perfect.
1.1.1 Acting humanly: The Turing test approach
Turing test
The Turing test, proposed by Alan Turing (1950), was designed as a thought experiment
that would sidestep the philosophical vagueness of the question “Can a machine think?” A
computer passes the test if a human interrogator, after posing some written questions,
cannot tell whether the written responses come from a person or from a computer. Chapter
27 discusses the details of the test and whether a computer would really be intelligent if it
passed. For now, we note that programming a computer to pass a rigorously applied test
provides plenty to work on. The computer would need the following capabilities:
natural language processing to communicate successfully in a human language;
knowledge representation to store what it knows or hears;
automated reasoning to answer questions and to draw new conclusions;
machine learning to adapt to new circumstances and to detect and extrapolate patterns.
Natural language processing
Knowledge representation
Automated reasoning
Machine learning
Total Turing test
Turing viewed the physical simulation of a person as unnecessary to demonstrate
intelligence. However, other researchers have proposed a total Turing test, which requires
interaction with objects and people in the real world. To pass the total Turing test, a robot
will need
computer vision and speech recognition to perceive the world;
robotics to manipulate objects and move about.
Computer vision
Robotics
These six disciplines compose most of AI. Yet AI researchers have devoted little effort to
passing the Turing test, believing that it is more important to study the underlying principles
of intelligence. The quest for “artificial flight” succeeded when engineers and inventors
stopped imitating birds and started using wind tunnels and learning about aerodynamics.
Aeronautical engineering texts do not define the goal of their field as making “machines that
fly so exactly like pigeons that they can fool even other pigeons.”
1.1.2 Thinking humanly: The cognitive modeling approach
To say that a program thinks like a human, we must know how humans think. We can learn
about human thought in three ways:
introspection—trying to catch our own thoughts as they go by;
psychological experiments—observing a person in action;
brain imaging—observing the brain in action.
Introspection
Psychological experiments
Brain imaging
Once we have a sufficiently precise theory of the mind, it becomes possible to express the
theory as a computer program. If the program’s input–output behavior matches
corresponding human behavior, that is evidence that some of the program’s mechanisms
could also be operating in humans.
For example, Allen Newell and Herbert Simon, who developed GPS, the “General Problem
Solver” (Newell and Simon 1961), were not content merely to have their program solve
problems correctly. They were more concerned with comparing the sequence and timing of
its reasoning steps to those of human subjects solving the same problems. The
interdisciplinary field of cognitive science brings together computer models from AI and
experimental techniques from psychology to construct precise and testable theories of the
human mind.
Cognitive science
Cognitive science is a fascinating field in itself, worthy of several textbooks and at least one
encyclopedia (Wilson and Keil 1999). We will occasionally comment on similarities or
differences between AI techniques and human cognition. Real cognitive science, however, is
necessarily based on experimental investigation of actual humans or animals. We will leave
that for other books, as we assume the reader has only a computer for experimentation.
In the early days of AI there was often confusion between the approaches. An author would
argue that an algorithm performs well on a task and that it is therefore a good model of
human performance, or vice versa. Modern authors separate the two kinds of claims; this
distinction has allowed both AI and cognitive science to develop more rapidly. The two
fields fertilize each other, most notably in computer vision, which incorporates
neurophysiological evidence into computational models. Recently, the combination of
neuroimaging methods and machine learning techniques for analyzing such data
has led to the beginnings of a capability to “read minds”—that is, to ascertain the semantic
content of a person’s inner thoughts. This capability could, in turn, shed further light on
how human cognition works.
1.1.3 Thinking rationally: The “laws of thought” approach
The Greek philosopher Aristotle was one of the first to attempt to codify “right thinking”—
that is, irrefutable reasoning processes. His syllogisms provided patterns for argument
structures that always yielded correct conclusions when given correct premises. The
canonical example starts with Socrates is a man and all men are mortal and concludes that
Socrates is mortal. (This example is probably due to Sextus Empiricus rather than Aristotle.)
These laws of thought were supposed to govern the operation of the mind; their study
initiated the field called logic.
Syllogisms
Logicians in the 19th century developed a precise notation for statements about objects in
the world and the relations among them. (Contrast this with ordinary arithmetic notation,
which provides only for statements about numbers.) By 1965, programs could, in principle,
solve any solvable problem described in logical notation. The so-called logicist tradition
within artificial intelligence hopes to build on such programs to create intelligent systems.
Logicist
Logic as conventionally understood requires knowledge of the world that is certain—a
condition that, in reality, is seldom achieved. We simply don’t know the rules of, say,
politics or warfare in the same way that we know the rules of chess or arithmetic. The
theory of probability fills this gap, allowing rigorous reasoning with uncertain information.
In principle, it allows the construction of a comprehensive model of rational thought,
leading from raw perceptual information to an understanding of how the world works to
predictions about the future. What it does not do is generate intelligent behavior. For that,
we need a theory of rational action. Rational thought, by