Artificial Intelligence

A Modern Approach

Fourth Edition

PEARSON SERIES

IN ARTIFICIAL INTELLIGENCE

Stuart Russell and Peter Norvig, Editors

Stuart J. Russell and Peter Norvig

Contributing writers:

Ming-Wei Chang

Jacob Devlin

Anca Dragan

David Forsyth

Ian Goodfellow

Jitendra M. Malik

Vikash Mansinghka

Judea Pearl

Michael Wooldridge

Copyright © 2021, 2010, 2003 by Pearson Education, Inc. or its affiliates, 221 River Street,

Hoboken, NJ 07030. All Rights Reserved. Manufactured in the United States of America.

This publication is protected by copyright, and permission should be obtained from the

publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission

in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise.

For information regarding permissions, request forms, and the appropriate contacts within

the Pearson Education Global Rights and Permissions department, please visit

www.pearsoned.com/permissions/.

Acknowledgments of third-party content appear on the appropriate page within the text.

Cover Images:

Alan Turing – Science History Images/Alamy Stock Photo

Statue of Aristotle – Panos Karas/Shutterstock

Ada Lovelace – Pictorial Press Ltd/Alamy Stock Photo

Autonomous cars – Andrey Suslov/Shutterstock

Atlas Robot – Boston Dynamics, Inc.

Berkeley Campanile and Golden Gate Bridge – Ben Chu/Shutterstock

Background ghosted nodes – Eugene Sergeev/Alamy Stock Photo

Chess board with chess figure – Titania/Shutterstock

Mars Rover – Stocktrek Images, Inc./Alamy Stock Photo

Kasparov – KATHY WILLENS/AP Images

PEARSON, ALWAYS LEARNING is an exclusive trademark owned by Pearson Education,

Inc. or its affiliates in the U.S. and/or other countries.

Unless otherwise indicated herein, any third-party trademarks, logos, or icons that may

appear in this work are the property of their respective owners, and any references to third-

party trademarks, logos, icons, or other trade dress are for demonstrative or descriptive

purposes only. Such references are not intended to imply any sponsorship, endorsement,

authorization, or promotion of Pearson’s products by the owners of such marks, or any

relationship between the owner and Pearson Education, Inc., or its affiliates, authors,

licensees, or distributors.

Library of Congress Cataloging-in-Publication Data

Names: Russell, Stuart J. (Stuart Jonathan), author. | Norvig, Peter, author.

Title: Artificial intelligence : a modern approach / Stuart J. Russell and Peter Norvig.

Description: Fourth edition. | Hoboken : Pearson, [2021] | Series: Pearson series in artificial

intelligence | Includes bibliographical references and index. | Summary: “Updated edition

of popular textbook on Artificial Intelligence.”— Provided by publisher.

Identifiers: LCCN 2019047498 | ISBN 9780134610993 (hardcover)

Subjects: LCSH: Artificial intelligence.

Classification: LCC Q335 .R86 2021 | DDC 006.3–dc23

LC record available at https://lccn.loc.gov/2019047498

ISBN-10: 0-13-461099-7

ISBN-13: 978-0-13-461099-3

For Loy, Gordon, Lucy, George, and Isaac — S.J.R.

For Kris, Isabella, and Juliet — P.N.

Preface

Artificial Intelligence (AI) is a big field, and this is a big book. We have tried to explore the

full breadth of the field, which encompasses logic, probability, and continuous mathematics;

perception, reasoning, learning, and action; fairness, trust, social good, and safety; and

applications that range from microelectronic devices to robotic planetary explorers to online

services with billions of users.

The subtitle of this book is “A Modern Approach.” That means we have chosen to tell the

story from a current perspective. We synthesize what is now known into a common

framework, recasting early work using the ideas and terminology that are prevalent today.

We apologize to those whose subfields are, as a result, less recognizable.

New to this edition

This edition reflects the changes in AI since the last edition in 2010:

We focus more on machine learning rather than hand-crafted knowledge engineering,

due to the increased availability of data, computing resources, and new algorithms.

Deep learning, probabilistic programming, and multiagent systems receive expanded

coverage, each with their own chapter.

The coverage of natural language understanding, robotics, and computer vision has

been revised to reflect the impact of deep learning.

The robotics chapter now includes robots that interact with humans and the application

of reinforcement learning to robotics.

Previously we defined the goal of AI as creating systems that try to maximize expected

utility, where the specific utility information—the objective—is supplied by the human

designers of the system. Now we no longer assume that the objective is fixed and

known by the AI system; instead, the system may be uncertain about the true objectives

of the humans on whose behalf it operates. It must learn what to maximize and must

function appropriately even while uncertain about the objective.

We increase coverage of the impact of AI on society, including the vital issues of ethics,

fairness, trust, and safety.

We have moved the exercises from the end of each chapter to an online site. This allows

us to continuously add to, update, and improve the exercises, to meet the needs of

instructors and to reflect advances in the field and in AI-related software tools.

Overall, about 25% of the material in the book is brand new. The remaining 75% has

been largely rewritten to present a more unified picture of the field. 22% of the citations

in this edition are to works published after 2010.

Overview of the book

The main unifying theme is the idea of an intelligent agent. We define AI as the study of

agents that receive percepts from the environment and perform actions. Each such agent

implements a function that maps percept sequences to actions, and we cover different ways

to represent these functions, such as reactive agents, real-time planners, decision-theoretic

systems, and deep learning systems. We emphasize learning both as a construction method

for competent systems and as a way of extending the reach of the designer into unknown

environments. We treat robotics and vision not as independently defined problems, but as

occurring in the service of achieving goals. We stress the importance of the task

environment in determining the appropriate agent design.

Our primary aim is to convey the ideas that have emerged over the past seventy years of AI

research and the past two millennia of related work. We have tried to avoid excessive

formality in the presentation of these ideas, while retaining precision. We have included

mathematical formulas and pseudocode algorithms to make the key ideas concrete;

mathematical concepts and notation are described in Appendix A and our pseudocode is

described in Appendix B.

This book is primarily intended for use in an undergraduate course or course sequence. The

book has 28 chapters, each requiring about a week’s worth of lectures, so working through

the whole book requires a two-semester sequence. A one-semester course can use selected

chapters to suit the interests of the instructor and students. The book can also be used in a

graduate-level course (perhaps with the addition of some of the primary sources suggested

in the bibliographical notes), or for self-study or as a reference.

Throughout the book, important points are marked with a triangle icon in the margin.

Wherever a new term is defined,

itself, is not enough.

Probability

1.1.4 Acting rationally: The rational agent approach

Agent

An agent is just something that acts (agent comes from the Latin agere, to do). Of course, all

computer programs do something, but computer agents are expected to do more: operate

autonomously, perceive their environment, persist over a prolonged time period, adapt to

change, and create and pursue goals. A rational agent is one that acts so as to achieve the

best outcome or, when there is uncertainty, the best expected outcome.

Rational agent

In the “laws of thought” approach to AI, the emphasis was on correct inferences. Making

correct inferences is sometimes part of being a rational agent, because one way to act

rationally is to deduce that a given action is best and then to act on that conclusion. On the

other hand, there are ways of acting rationally that cannot be said to involve inference. For

example, recoiling from a hot stove is a reflex action that is usually more successful than a

slower action taken after careful deliberation.

All the skills needed for the Turing test also allow an agent to act rationally. Knowledge

representation and reasoning enable agents to reach good decisions. We need to be able to

generate comprehensible sentences in natural language to get by in a complex society. We

need learning not only for erudition, but also because it improves our ability to generate

effective behavior, especially in circumstances that are new.

The rational-agent approach to AI has two advantages over the other approaches. First, it is

more general than the “laws of thought” approach because correct inference is just one of

several possible mechanisms for achieving rationality. Second, it is more amenable to

scientific development. The standard of rationality is mathematically well defined and

completely general. We can often work back from this specification to derive agent designs

that provably achieve it—something that is largely impossible if the goal is to imitate human

behavior or thought processes.

For these reasons, the rational-agent approach to AI has prevailed throughout most of the

field’s history. In the early decades, rational agents were built on logical foundations and

formed definite plans to achieve specific goals. Later, methods based on probability theory

and machine learning allowed the creation of agents that could make decisions under

uncertainty to attain the best expected outcome. In a nutshell, AI has focused on the study and

construction of agents that do the right thing. What counts as the right thing is defined by the

objective that we provide to the agent. This general paradigm is so pervasive that we might

call it the standard model. It prevails not only in AI, but also in control theory, where a

controller minimizes a cost function; in operations research, where a policy maximizes a

sum of rewards; in statistics, where a decision rule minimizes a loss function; and in

economics, where a decision maker maximizes utility or some measure of social welfare.

Do the right thing

Standard model

We need to make one important refinement to the standard model to account for the fact

that perfect rationality—always taking the exactly optimal action—is not feasible in complex

environments. The computational demands are just too high. Chapters 5 and 17 deal

with the issue of limited rationality—acting appropriately when there is not enough time to

do all the computations one might like. However, perfect rationality often remains a good

starting point for theoretical analysis.

Limited rationality

1.1.5 Beneficial machines

The standard model has been a useful guide for AI research since its inception, but it is

probably not the right model in the long run. The reason is that the standard model assumes

that we will supply a fully specified objective to the machine.

For an artificially defined task such as chess or shortest-path computation, the task comes

with an objective built in—so the standard model is applicable. As we move into the real

world, however, it becomes more and more difficult to specify the objective completely and

correctly. For example, in designing a self-driving car, one might think that the objective is

to reach the destination safely. But driving along any road incurs a risk of injury due to other

errant drivers, equipment failure, and so on; thus, a strict goal of safety requires staying in

the garage. There is a tradeoff between making progress towards the destination and

incurring a risk of injury. How should this tradeoff be made? Furthermore, to what extent

can we allow the car to take actions that would annoy other drivers? How much should the

car moderate its acceleration, steering, and braking to avoid shaking up the passenger?

These kinds of questions are difficult to answer a priori. They are particularly problematic in

the general area of human–robot interaction, of which the self-driving car is one example.

The problem of achieving agreement between our true preferences and the objective we put

into the machine is called the value alignment problem: the values or objectives put into

the machine must be aligned with those of the human. If we are developing an AI system in

the lab or in a simulator—as has been the case for most of the field’s history—there is an

easy fix for an incorrectly specified objective: reset the system, fix the objective, and try

again. As the field progresses towards increasingly capable intelligent systems that are

deployed in the real world, this approach is no longer viable. A system deployed with an

incorrect objective will have negative consequences. Moreover, the more intelligent the

system, the more negative the consequences.

Value alignment problem

Returning to the apparently unproblematic example of chess, consider what happens if the

machine is intelligent enough to reason and act beyond the confines of the chessboard. In

that case, it might attempt to increase its chances of winning by such ruses as hypnotizing or

blackmailing its opponent or bribing the audience to make rustling noises during its

opponent’s thinking time. It might also attempt to hijack additional computing power for

itself. These behaviors are not “unintelligent” or “insane”; they are a logical consequence of defining

winning as the sole objective for the machine.

3 In one of the first books on chess, Ruy Lopez (1561) wrote, “Always place the board so the sun is in your opponent’s eyes.”

It is impossible to anticipate all the ways in which a machine pursuing a fixed objective

might misbehave. There is good reason, then, to think that the standard model is

inadequate. We don’t want machines that are intelligent in the sense of pursuing their

objectives; we want them to pursue our objectives. If we cannot transfer those objectives

perfectly to the machine, then we need a new formulation—one in which the machine is

pursuing our objectives, but is necessarily uncertain as to what they are. When a machine

knows that it doesn’t know the complete objective, it has an incentive to act cautiously, to

ask permission, to learn more about our preferences through observation, and to defer to

human control. Ultimately, we want agents that are provably beneficial to humans. We will

return to this topic in Section 1.5.

Provably beneficial

1.2 The Foundations of Artificial Intelligence

In this section, we provide a brief history of the disciplines that contributed ideas,

viewpoints, and techniques to AI. Like any history, this one concentrates on a small number

of people, events, and ideas and ignores others that also were important. We organize the

history around a series of questions. We certainly would not wish to give the impression

that these questions are the only ones the disciplines address or that the disciplines have all

been working toward AI as their ultimate fruition.

1.2.1 Philosophy

Can formal rules be used to draw valid conclusions?

How does the mind arise from a physical brain?

Where does knowledge come from?

How does knowledge lead to action?

Aristotle (384–322 BCE) was the first to formulate a precise set of laws governing the rational

part of the mind. He developed an informal system of syllogisms for proper reasoning,

which in principle allowed one to generate conclusions mechanically, given initial premises.

Ramon Llull (c. 1232–1315) devised a system of reasoning published as Ars Magna or The

Great Art (1305). Llull tried to implement his system using an actual mechanical device: a set

of paper wheels that could be rotated into different permutations.

Around 1500, Leonardo da Vinci (1452–1519) designed but did not build a mechanical

calculator; recent reconstructions have shown the design to be functional. The first known

calculating machine was constructed around 1623 by the German scientist Wilhelm

Schickard (1592–1635). Blaise Pascal (1623–1662) built the Pascaline in 1642 and wrote that

it “produces effects which appear nearer to thought than all the actions of animals.”

Gottfried Wilhelm Leibniz (1646–1716) built a mechanical device intended to carry out

operations on concepts rather than numbers, but its scope was rather limited. In his 1651

book Leviathan, Thomas Hobbes (1588–1679) suggested the idea of a thinking machine, an

“artificial animal” in his words, arguing “For what is the heart but a spring; and the nerves,

but so many strings; and the joints, but so many wheels.” He also suggested that reasoning

was like numerical computation: “For ‘reason’ ... is nothing but ‘reckoning,’ that is adding

and subtracting.”

It’s one thing to say that the mind operates, at least in part, according to logical or numerical

rules, and to build physical systems that emulate some of those rules. It’s another to say that

the mind itself is such a physical system. René Descartes (1596–1650) gave the first clear

discussion of the distinction between mind and matter. He noted that a purely physical

conception of the mind seems to leave little room for free will. If the mind is governed

entirely by physical laws, then it has no more free will than a rock “deciding” to fall

downward. Descartes was a proponent of dualism. He held that there is a part of the human

mind (or soul or spirit) that is outside of nature, exempt from physical laws. Animals, on the

other hand, did not possess this dual quality; they could be treated as machines.

Dualism

An alternative to dualism is materialism, which holds that the brain’s operation according

to the laws of physics constitutes the mind. Free will is simply the way that the perception of

available choices appears to the choosing entity. The terms physicalism and naturalism are

also used to describe this view that stands in contrast to the supernatural.

Given a physical mind that manipulates knowledge, the next problem is to establish the

source of knowledge. The empiricism movement, starting with Francis Bacon’s (1561–1626)

Novum Organum, is characterized by a dictum of John Locke (1632–1704): “Nothing is in

the understanding, which was not first in the senses.”

4 The Novum Organum is an update of Aristotle’s Organon, or instrument of thought.

Empiricism

Induction

David Hume’s (1711–1776) A Treatise of Human Nature (Hume, 1739) proposed what is now

known as the principle of induction: that general rules are acquired by exposure to repeated

associations between their elements.

Building on the work of Ludwig Wittgenstein (1889–1951) and Bertrand Russell (1872–

1970), the famous Vienna Circle (Sigmund, 2017), a group of philosophers and mathematicians

meeting in Vienna in the 1920s and 1930s, developed the doctrine of logical positivism.

This doctrine holds that all knowledge can be characterized by logical theories connected,

ultimately, to observation sentences that correspond to sensory inputs; thus logical

positivism combines rationalism and empiricism.

Logical positivism

Observation sentence

The confirmation theory of Rudolf Carnap (1891–1970) and Carl Hempel (1905–1997)

attempted to analyze the acquisition of knowledge from experience by quantifying the

degree of belief that should be assigned to logical sentences based on their connection to

observations that confirm or disconfirm them. Carnap’s book The Logical Structure of the

World (1928) was perhaps the first theory of mind as a computational process.

Confirmation theory

The final element in the philosophical picture of the mind is the connection between

knowledge and action. This question is vital to AI because intelligence requires action as

well as reasoning. Moreover, only by understanding how actions are justified can we

understand how to build an agent whose actions are justifiable (or rational).

Aristotle argued (in De Motu Animalium) that actions are justified by a logical connection

between goals and knowledge of the action’s outcome:

But how does it happen that thinking is sometimes accompanied by action and sometimes not,

sometimes by motion, and sometimes not? It looks as if almost the same thing happens as in the case

of reasoning and making inferences about unchanging objects. But in that case the end is a

speculative proposition … whereas here the conclusion which results from the two premises is an

action. … I need covering; a cloak is a covering. I need a cloak. What I need, I have to make; I need a

cloak. I have to make a cloak. And the conclusion, the “I have to make a cloak,” is an action.

In the Nicomachean Ethics (Book III. 3, 1112b), Aristotle further elaborates on this topic,

suggesting an algorithm:

We deliberate not about ends, but about means. For a doctor does not deliberate whether he shall

heal, nor an orator whether he shall persuade, … They assume the end and consider how and by

what means it is attained, and if it seems easily and best produced thereby; while if it is achieved by

one means only they consider how it will be achieved by this and by what means this will be

achieved, till they come to the first cause, … and what is last in the order of analysis seems to be first

in the order of becoming. And if we come on an impossibility, we give up the search, e.g., if we need

money and this cannot be got; but if a thing appears possible we try to do it.

Aristotle’s algorithm was implemented 2300 years later by Newell and Simon in their

General Problem Solver program. We would now call it a greedy regression planning

system (see Chapter 11). Methods based on logical planning to achieve definite goals

dominated the first few decades of theoretical research in AI.
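
To make the idea concrete, the following is a minimal sketch of the backward, means-to-ends search that Aristotle describes, written in Python rather than in the pseudocode of Appendix B. It is not the General Problem Solver itself. The table `MEANS`, the set `DOABLE_NOW`, and the function `regress` are hypothetical names introduced only for illustration, and we assume each goal has exactly one known means, so the search is greedy and never backtracks.

```python
from typing import Dict, List, Optional, Set

# Hypothetical knowledge base (an assumption for this sketch): each goal maps
# to the single means assumed to achieve it.
MEANS: Dict[str, str] = {
    "be covered": "have a cloak",
    "have a cloak": "make a cloak",
}

# Hypothetical set of things the agent can simply do right now.
DOABLE_NOW: Set[str] = {"make a cloak"}


def regress(goal: str, means: Dict[str, str], doable: Set[str]) -> Optional[List[str]]:
    """Work backward from the goal to something doable.

    Returns the chain in execution order, or None if the search hits an
    impossibility (no known means, or a cycle).
    """
    chain: List[str] = []
    seen: Set[str] = set()
    current = goal
    while current not in doable:
        if current in seen or current not in means:
            return None  # "if we come on an impossibility, we give up the search"
        seen.add(current)
        chain.append(current)
        current = means[current]  # assume the end, consider the means
    chain.append(current)
    # What is last in the order of analysis is first in the order of becoming.
    return list(reversed(chain))


print(regress("be covered", MEANS, DOABLE_NOW))
print(regress("have money", MEANS, DOABLE_NOW))
```

The first call returns the plan starting with the immediately doable action; the second returns `None` because no means of obtaining money is known. A practical planner would consider several alternative means per goal and backtrack among them.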

Utility

Thinking purely in terms of actions achieving goals is often useful but sometimes

inapplicable. For example, if there are several different ways to achieve a goal, there needs

to be some way to choose among them. More importantly, it may not be possible to achieve

a goal with certainty, but some action must still be taken. How then should one decide?

Antoine Arnauld (1662), analyzing the notion of rational decisions in gambling, proposed a

quantitative formula for maximizing the expected monetary value of the outcome. Later,

Daniel Bernoulli (1738) introduced the more general notion of utility to capture the internal,

subjective value of an outcome. The modern notion of rational decision making under

uncertainty involves maximizing expected utility, as explained in Chapter 16.
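
The difference between maximizing expected monetary value and maximizing expected utility can be seen in a small sketch. The numbers, the action names, and the helper functions `expected_value` and `best_action` are assumptions made for illustration; the logarithmic utility of wealth is the form usually associated with Bernoulli, used here only as one plausible choice.

```python
import math
from typing import Callable, Dict, List, Tuple

# A lottery is a list of (probability, resulting wealth) pairs.
Lottery = List[Tuple[float, float]]

# Hypothetical choice between a sure thing and a fair coin flip.
ACTIONS: Dict[str, Lottery] = {
    "safe": [(1.0, 1400.0)],
    "gamble": [(0.5, 500.0), (0.5, 2500.0)],
}


def expected_value(lottery: Lottery, f: Callable[[float], float]) -> float:
    """Probability-weighted average of f applied to each outcome."""
    return sum(p * f(x) for p, x in lottery)


def best_action(actions: Dict[str, Lottery], f: Callable[[float], float]) -> str:
    """Return the name of the action whose lottery maximizes the expectation of f."""
    return max(actions, key=lambda name: expected_value(actions[name], f))


# Expected monetary value: the identity function on wealth.
print(best_action(ACTIONS, lambda x: x))
# Expected utility with a logarithmic utility of wealth.
print(best_action(ACTIONS, math.log))
```

With these numbers the gamble has the higher expected monetary value (1500 against 1400), but the sure outcome has the higher expected logarithmic utility, so the two criteria recommend different actions.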

In matters of ethics and public policy, a decision maker must consider the interests of

multiple individuals. Jeremy Bentham (1823) and John Stuart Mill (1863) promoted the idea

of utilitarianism: that rational decision making based on maximizing utility should apply to

all spheres of human activity, including public policy decisions made on behalf of many

individuals. Utilitarianism is a specific kind of consequentialism: the idea that what is right

and wrong is determined by the expected outcomes of an action.

Utilitarianism

In contrast, Immanuel Kant, in 1785, proposed a theory of rule-based or deontological

ethics, in which “doing the right thing” is determined not by outcomes but by universal

social laws that govern allowable actions, such as “don’t lie” or “don’t kill.” Thus, a utilitarian

could tell a white lie if the expected good outweighs the bad, but a Kantian would be bound

not to, because lying is inherently wrong. Mill acknowledged the value of rules, but

understood them as efficient decision procedures compiled from first-principles reasoning

about consequences. Many modern AI systems adopt exactly this approach.

Deontological ethics

1.2.2 Mathematics

What are the formal rules to draw valid conclusions?

What can be computed?

How do we reason with uncertain information?

Philosophers staked out some of the fundamental ideas of AI, but the leap to a formal

science required the mathematization of logic and probability and the introduction of a new

branch of mathematics: computation.

The idea of formal logic can be traced back to the philosophers of ancient Greece, India,

and China, but its mathematical development really began with the work of George Boole

(1815–1864), who worked out the details of propositional, or Boolean, logic (Boole, 1847).

In 1879, Gottlob Frege (1848–1925) extended Boole’s logic to include objects and relations,

creating the first-order logic that is used today. In addition to its central role in the early

period of AI research, first-order logic motivated the work of Gödel and Turing that

underpinned computation itself, as we explain below.

5 Frege’s proposed notation for first-order logic—an arcane combination of textual and geometric features—never became popular.

Formal logic

The theory of probability can be seen as generalizing logic to situations with uncertain

information—a consideration of great importance for AI. Gerolamo Cardano (1501–1576)

first framed the idea of probability, describing it in terms of the possible outcomes of

gambling events. In 1654, Blaise Pascal (1623–1662), in a letter to Pierre Fermat (1601–

1665), showed how to predict the future of an unfinished gambling game and assign average

payoffs to the gamblers. Probability quickly became an invaluable part of the quantitative

sciences, helping to deal with uncertain measurements and incomplete theories. Jacob

Bernoulli (1654–1705, uncle of Daniel), Pierre Laplace (1749–1827), and others advanced

the theory and introduced new statistical methods. Thomas Bayes (1702–1761) proposed a

rule for updating probabilities in the light of new evidence; Bayes’ rule is a crucial tool for AI

systems.
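
As a brief illustration of updating a probability in the light of new evidence, the sketch below applies Bayes' rule to a single binary hypothesis. The function name `bayes_update`, its parameter names, and the numbers in the example are assumptions chosen for illustration and do not come from any particular dataset.

```python
def bayes_update(prior: float, p_evidence_if_true: float, p_evidence_if_false: float) -> float:
    """Return P(H | e) from P(H), P(e | H), and P(e | not H) using Bayes' rule."""
    # Total probability of the evidence, summed over both cases of H.
    p_evidence = p_evidence_if_true * prior + p_evidence_if_false * (1.0 - prior)
    if p_evidence == 0.0:
        raise ValueError("the evidence has probability zero under this model")
    return p_evidence_if_true * prior / p_evidence


# Hypothetical numbers: a condition with 1% prevalence, a test that detects it
# 90% of the time and gives a false alarm 5% of the time.
posterior = bayes_update(prior=0.01, p_evidence_if_true=0.9, p_evidence_if_false=0.05)
print(round(posterior, 3))  # about 0.154
```

Even after a positive test the posterior stays well below one half, because the low prior outweighs the strength of the evidence.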

Probability

Statistics

The formalization of probability, combined with the availability of data, led to the

emergence of statistics as a field. One of the first uses was John Graunt’s analysis of London

census data in 1662. Ronald Fisher is considered the first modern statistician (Fisher, 1922).

He brought together the ideas of probability, experiment design, analysis of data, and

computing—in 1919, he insisted that he couldn’t do his work without a mechanical

calculator called the MILLIONAIRE (the first calculator that could do multiplication), even

though the cost of the calculator was more than his annual salary (Ross, 2012).

The history of computation is as old as the history of numbers, but the first nontrivial

algorithm is thought to be Euclid’s algorithm for computing greatest common divisors. The

word algorithm comes from Muhammad ibn Musa al-Khwarizmi, a 9th century

mathematician, whose writings also introduced Arabic numerals and algebra to Europe.

Boole and others discussed algorithms for logical deduction, and, by the late 19th century,

efforts were under way to formalize general mathematical reasoning as logical deduction.
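
Euclid's algorithm, mentioned above, is short enough to state in full. The following is a sketch of the usual remainder-based formulation in Python; the function name `gcd` is our own choice, and the handling of negative inputs by taking absolute values is a convenience we assume rather than part of the classical statement.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor of a and b by repeated remainders."""
    a, b = abs(a), abs(b)
    while b:
        # gcd(a, b) equals gcd(b, a mod b); the second argument shrinks
        # at every step, so the loop terminates.
        a, b = b, a % b
    return a


print(gcd(1071, 462))  # 21
```

The loop replaces the pair (1071, 462) by (462, 147), then (147, 21), then (21, 0), at which point the first element is the answer.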

Algorithm

Kurt Gödel (1906–1978) showed that there exists an effective procedure to prove any true

statement in the first-order logic of Frege and Russell, but that first-order logic could not

capture the principle of mathematical induction needed to characterize the natural numbers.

In 1931, Gödel showed that limits on deduction do exist. His incompleteness theorem

showed that in any formal theory as strong as Peano arithmetic (the elementary theory of

natural numbers), there are necessarily true statements that have no proof within the

theory.

Incompleteness theorem

This fundamental result can also be interpreted as showing that some functions on the

integers cannot be represented by an algorithm—that is, they cannot be computed. This

motivated Alan Turing (1912–1954) to try to characterize exactly which functions are

computable—capable of being computed by an effective procedure. The Church–Turing

thesis proposes to identify the general notion of computability with functions computed by a

Turing machine (Turing, 1936). Turing also showed that there were some functions that no

Turing machine can compute. For example, no machine can tell in general whether a given

program will return an answer on a given input or run forever.

Computability

Although computability is important to an understanding of computation, the notion of

tractability has had an even greater impact on AI. Roughly speaking, a problem is called

intractable if the time required to solve instances of the problem grows exponentially with

the size of the instances. The distinction between polynomial and exponential growth in

complexity was first emphasized in the mid-1960s (Cobham, 1964; Edmonds, 1965). It is

important because exponential growth means that even moderately large instances cannot

be solved in any reasonable time.
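
A rough calculation shows why the distinction matters. The sketch below compares a quadratic and an exponential step count under the assumption of a machine that performs one billion steps per second; that rate, the choice of n squared and 2 to the n as representatives, and the helper names are all illustrative assumptions.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def steps_polynomial(n: int, degree: int = 2) -> int:
    """Step count for a hypothetical algorithm that takes n**degree steps."""
    return n ** degree


def steps_exponential(n: int, base: int = 2) -> int:
    """Step count for a hypothetical algorithm that takes base**n steps."""
    return base ** n


def years(steps: int, steps_per_second: float = 1e9) -> float:
    """Running time in years at the assumed machine speed."""
    return steps / steps_per_second / SECONDS_PER_YEAR


for n in (10, 50, 100):
    print(n, years(steps_polynomial(n)), years(steps_exponential(n)))
```

At n = 100 the quadratic algorithm finishes in a small fraction of a second, whereas the exponential one would need on the order of ten trillion years at the assumed speed.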

Tractability

The theory of NP-completeness, pioneered by Cook (1971) and Karp (1972), provides a

basis for analyzing the tractability of problems: any problem class to which the class of NP-

complete problems can be reduced is likely to be intractable. (Although it has not been

proved that NP-complete problems are necessarily intractable, most theoreticians believe

it.) These results contrast with the optimism with which the popular press greeted the first

computers—“Electronic Super-Brains” that were “Faster than Einstein!” Despite the

increasing speed of computers, careful use of resources and necessary imperfection will

characterize intelligent systems. Put crudely, the world is an extremely large problem

instance!

NP-completeness

1.2.3 Economics

How should we make decisions in accordance with our preferences?

How should we do this when others may not go along?

How should we do this when the payoff may be far in the future?

The science of economics originated in 1776, when Adam Smith (1723–1790) published An

Inquiry into the Nature and Causes of the Wealth of Nations. Smith proposed to analyze

economies as consisting of many individual agents attending to their own interests. Smith

was not, however, advocating financial greed as a moral position: his earlier (1759) book

The Theory of Moral Sentiments begins by pointing out that concern for the well-being of

others is an essential component of the interests of every individual.

Most people think of economics as being about money, and indeed the first mathematical

analysis of decisions under uncertainty, the maximum-expected-value formula of Arnauld

(1662), dealt with the monetary value of bets. Daniel Bernoulli (1738) noticed that this

formula didn’t seem to work well for larger amounts of money, such as investments in

maritime trading expeditions. He proposed instead a principle based on maximization of

expected utility, and explained human investment choices by proposing that the marginal

utility of an additional quantity of money diminished as one acquired more money.
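
The difference between the two principles can be illustrated with a small calculation. The sketch below assumes a logarithmic utility of wealth, which is one simple function with diminishing marginal utility; the merchant, the wealth figures, and the probabilities are hypothetical.

```python
import math

def expected_value(lottery):
    """Expected monetary value of a list of (probability, final_wealth) pairs."""
    return sum(p * w for p, w in lottery)

def expected_utility(lottery, utility=math.log):
    """Expected utility; the default log utility has diminishing marginal value."""
    return sum(p * utility(w) for p, w in lottery)

# Hypothetical merchant with 10,000 in wealth.
safe = [(1.0, 10_000)]                    # keep the money
voyage = [(0.5, 1_000), (0.5, 20_000)]    # risky expedition: lose 9,000 or gain 10,000

print("expected value:  ", expected_value(safe), expected_value(voyage))
print("expected utility:", expected_utility(safe), expected_utility(voyage))
```

The voyage has the higher expected monetary value, yet the safe option has the higher expected utility, because the loss of 9,000 costs more utility than the gain of 10,000 adds.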

Léon Walras (pronounced “Valrasse”) (1834–1910) gave utility theory a more general

foundation in terms of preferences between gambles on any outcomes (not just monetary

outcomes). The theory was improved by Ramsey (1931) and later by John von Neumann

and Oskar Morgenstern in their book The Theory of Games and Economic Behavior (1944).

Economics is no longer the study of money; rather it is the study of desires and preferences.

Decision theory, which combines probability theory with utility theory, provides a formal

and complete framework for individual decisions (economic or otherwise) made under

uncertainty—that is, in cases where probabilistic descriptions appropriately capture the

decision maker’s environment. This is suitable for “large” economies where each agent need

pay no attention to the actions of other agents as individuals. For “small” economies, the

situation is much more like a game: the actions of one player can significantly affect the

utility of another (either positively or negatively). Von Neumann and Morgenstern’s

development of game theory (see also Luce and Raiffa, 1957) included the surprising result

that, for some games, a rational agent should adopt policies that are (or at least appear to be)

randomized. Unlike decision theory, game theory does not offer an unambiguous

prescription for selecting actions. In AI, decisions involving multiple agents are studied

under the heading of multiagent systems (Chapter 18 ).
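
The value of randomization can be seen in matching pennies, a standard two-player zero-sum game that is introduced here purely for illustration. The sketch computes the row player's expected payoff against an opponent who always picks the most damaging reply; the payoff table and function name are assumptions of this example.

```python
# Row player's payoff in matching pennies: +1 if the coins match, -1 otherwise.
PAYOFF = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}

def worst_case_payoff(p_heads: float) -> float:
    """Row player's expected payoff against the opponent's best reply,
    when the row player shows heads with probability p_heads."""
    def expected(col_move):
        return (p_heads * PAYOFF[("H", col_move)]
                + (1 - p_heads) * PAYOFF[("T", col_move)])
    # The opponent picks whichever reply hurts the row player most.
    return min(expected("H"), expected("T"))

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, worst_case_payoff(p))
```

Any deterministic choice (probability 0 or 1) can be fully exploited, while the even mix guarantees that the opponent cannot gain an advantage.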

Decision theory

Economists, with some exceptions, did not address the third question listed above: how to

make rational decisions when payoffs from actions are not immediate but instead result

from several actions taken in sequence. This topic was pursued in the field of operations

research, which emerged in World War II from efforts in Britain to optimize radar

installations, and later found innumerable civilian applications. The work of Richard

Bellman (1957) formalized a class of sequential decision problems called Markov decision

processes, which we study in Chapter 17 and, under the heading of reinforcement

learning, in Chapter 22 .
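
A tiny example shows why sequential problems need more than a comparison of immediate payoffs. The sketch below uses value iteration, one standard way to solve such problems, on a hypothetical deterministic three-state process; the states, rewards, and discount factor are all assumptions made for illustration.

```python
GAMMA = 0.9  # discount factor (assumed)

# Hypothetical deterministic process: TRANSITIONS[state][action] = (next_state, reward).
TRANSITIONS = {
    0: {"rest": (0, 0.0), "advance": (1, -1.0)},
    1: {"rest": (1, 0.0), "advance": (2, 10.0)},
    2: {},  # terminal state: no actions available
}

def value_iteration(sweeps: int = 100):
    """Repeatedly apply the Bellman update V(s) = max_a [r + GAMMA * V(s')]."""
    values = {s: 0.0 for s in TRANSITIONS}
    for _ in range(sweeps):
        for s, actions in TRANSITIONS.items():
            if actions:
                values[s] = max(r + GAMMA * values[s2] for s2, r in actions.values())
    return values

def greedy_policy(values):
    """Pick, in each non-terminal state, the action with the best long-run value."""
    return {
        s: max(actions, key=lambda a: actions[a][1] + GAMMA * values[actions[a][0]])
        for s, actions in TRANSITIONS.items() if actions
    }

values = value_iteration()
policy = greedy_policy(values)
print(values, policy)
```

In state 0 the immediate reward favors resting (0 versus -1), but the computed policy advances, because the later reward of 10 outweighs the short-term cost.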

Operations research

Work in economics and operations research has contributed much to our notion of rational

agents, yet for many years AI research developed along entirely separate paths. One reason

was the apparent complexity of making rational decisions. The pioneering AI researcher

Herbert Simon (1916–2001) won the Nobel Prize in economics in 1978 for his early work

showing that models based on satisficing—making decisions that are “good enough,” rather

than laboriously calculating an optimal decision—gave a better description of actual human

behavior (Simon, 1947). Since the 1990s, there has been a resurgence of interest in decision-

theoretic techniques for AI.

Satisficing

1.2.4 Neuroscience

How do brains process information?

Neuroscience is the study of the nervous system, particularly the brain. Although the exact

way in which the brain enables thought is one of the great mysteries of science, the fact that

it does enable thought has been appreciated for thousands of years because of the evidence

that strong blows to the head can lead to mental incapacitation. It has also long been known

that human brains are somehow different; in about 335 BCE Aristotle wrote, “Of all the

animals, man has the largest brain in proportion to his size.” Still, it was not until the

middle of the 18th century that the brain was widely recognized as the seat of

consciousness. Before then, candidate locations included the heart and the spleen.

6 It has since been discovered that the tree shrew and some bird species exceed the human brain/body ratio.

Neuroscience

Paul Broca’s (1824–1880) investigation of aphasia (speech deficit) in brain-damaged patients

in 1861 initiated the study of the brain’s functional organization by identifying a localized

area in the left hemisphere—now called Broca’s area—that is responsible for speech

production. By that time, it was known that the brain consisted largely of nerve cells, or

neurons, but it was not until 1873 that Camillo Golgi (1843–1926) developed a staining

technique allowing the observation of individual neurons (see Figure 1.1 ). This technique

was used by Santiago Ramon y Cajal (1852–1934) in his pioneering studies of neuronal

organization. It is now widely accepted that cognitive functions result from the

electrochemical operation of these structures. That is, a collection of simple cells can lead to

thought, action, and consciousness. In the pithy words of John Searle (1992), brains cause

minds.

7 Many cite Alexander Hood (1824) as a possible prior source.

8 Golgi persisted in his belief that the brain’s functions were carried out primarily in a continuous medium in which neurons were

embedded, whereas Cajal propounded the “neuronal doctrine.” The two shared the Nobel Prize in 1906 but gave mutually

antagonistic acceptance speeches.

Figure 1.1

The parts of a nerve cell or neuron. Each neuron consists of a cell body, or soma, that contains a cell

nucleus. Branching out from the cell body are a number of fibers called dendrites and a single long fiber

called the axon. The axon stretches out for a long distance, much longer than the scale in this diagram

indicates. Typically, an axon is 1 cm long (100 times the diameter of the cell body), but can reach up to 1

meter. A neuron makes connections with 10 to 100,000 other neurons at junctions called synapses.

Signals are propagated from neuron to neuron by a complicated electrochemical reaction. The signals

control brain activity in the short term and also enable long-term changes in the connectivity of neurons.

These mechanisms are thought to form the basis for learning in the brain. Most information processing

goes on in the cerebral cortex, the outer layer of the brain. The basic organizational unit appears to be a

column of tissue about 0.5 mm in diameter, containing about 20,000 neurons and extending the full

depth of the cortex (about 4 mm in humans).

Neuron

We now have some data on the mapping between areas of the brain and the parts of the

body that they control or from which they receive sensory input. Such mappings are able to

change radically over the course of a few weeks, and some animals seem to have multiple

maps. Moreover, we do not fully understand how other areas can take over functions when

one area is damaged. There is almost no theory on how an individual memory is stored or

on how higher-level cognitive functions operate.

The measurement of intact brain activity began in 1929 with the invention by Hans Berger

of the electroencephalograph (EEG). The development of functional magnetic resonance

imaging (fMRI) (Ogawa et al., 1990; Cabeza and Nyberg, 2001) is giving neuroscientists

unprecedentedly detailed images of brain activity, enabling measurements that correspond

in interesting ways to ongoing cognitive processes. These are augmented by advances in

single-cell electrical recording of neuron activity and by the methods of optogenetics (Crick,

1999; Zemelman et al., 2002; Han and Boyden, 2007), which allow both

measurement and control of individual neurons modified to be light-sensitive.

Optogenetics

The development of brain–machine interfaces (Lebedev and Nicolelis, 2006) for both

sensing and motor control not only promises to restore function to disabled individuals, but

also sheds light on many aspects of neural systems. A remarkable finding from this work is

that the brain is able to adjust itself to interface successfully with an external device, treating

it in effect like another sensory organ or limb.

Brain–machine interface

Brains and digital computers have somewhat different properties. Figure 1.2 shows that

computers have a cycle time that is a million times faster than a brain. The brain makes up

for that with far more storage and interconnection than even a high-end personal computer,

although the largest supercomputers match the brain on some metrics. Futurists make much

of these numbers, pointing to an approaching singularity at which computers reach a

superhuman level of performance (Vinge, 1993; Kurzweil, 2005; Doctorow and Stross,

2012), and then rapidly improve themselves even further. But the comparisons of raw

numbers are not especially informative. Even with a computer of virtually unlimited

capacity, we still require further conceptual breakthroughs in our understanding of

intelligence (see Chapter 28 ). Crudely put, without the right theory, faster machines just

give you the wrong answer faster.

Singularity

Figure 1.2

A crude comparison of a leading supercomputer, Summit (Feldman, 2017); a typical personal computer of

2019; and the human brain. Human brain power has not changed much in thousands of years, whereas

supercomputers have improved from megaFLOPs in the 1960s to gigaFLOPs in the 1980s, teraFLOPs in

the 1990s, petaFLOPs in 2008, and exaFLOPs in 2018 (1 exaFLOP = $10^{18}$ floating point operations per

second).

1.2.5 Psychology

How do humans and animals think and act?

The origins of scientific psychology are usually traced to the work of the German physicist

Hermann von Helmholtz (1821–1894) and his student Wilhelm Wundt (1832–1920).

Helmholtz applied the scientific method to the study of human vision, and his Handbook of

Physiological Optics has been described as “the single most important treatise on the physics

and physiology of human vision” (Nalwa, 1993, p.15). In 1879, Wundt opened the first

laboratory of experimental psychology, at the University of Leipzig. Wundt insisted on

carefully controlled experiments in which his workers would perform a perceptual or

associative task while introspecting on their thought processes. The careful controls went a

long way toward making psychology a science, but the subjective nature of the data made it

unlikely that experimenters would ever disconfirm their own theories.

Biologists studying animal behavior, on the other hand, lacked introspective data and

developed an objective methodology, as described by H. S. Jennings (1906) in his influential

work Behavior of the Lower Organisms. Applying this viewpoint to humans, the behaviorism

movement, led by John Watson (1878–1958), rejected any theory involving mental

processes on the grounds that introspection could not provide reliable evidence.

Behaviorists insisted on studying only objective measures of the percepts (or stimulus) given

to an animal and its resulting actions (or response). Behaviorism discovered a lot about rats

and pigeons but had less success at understanding humans.

Behaviorism

Cognitive psychology, which views the brain as an information-processing device, can be

traced back at least to the works of William James (1842–1910). Helmholtz also insisted that

perception involved a form of unconscious logical inference. The cognitive viewpoint was

largely eclipsed by behaviorism in the United States, but at Cambridge’s Applied Psychology

Unit, directed by Frederic Bartlett (1886–1969), cognitive modeling was able to flourish. The

Nature of Explanation, by Bartlett’s student and successor Kenneth Craik (1943), forcefully

reestablished the legitimacy of such “mental” terms as beliefs and goals, arguing that they

are just as scientific as, say, using pressure and temperature to talk about gases, despite

gases being made of molecules that have neither.

Cognitive psychology

Craik specified the three key steps of a knowledge-based agent: (1) the stimulus must be

translated into an internal representation, (2) the representation is manipulated by cognitive

processes to derive new internal representations, and (3) these are in turn retranslated back

into action. He clearly explained why this was a good design for an agent:

If the organism carries a “small-scale model” of external reality and of its own possible actions within

its head, it is able to try out various alternatives, conclude which is the best of them, react to future

situations before they arise, utilize the knowledge of past events in dealing with the present and

future, and in every way to react in a much fuller, safer, and more competent manner to the

emergencies which face it. (Craik, 1943)

After Craik’s death in a bicycle accident in 1945, his work was continued by Donald

Broadbent, whose book Perception and Communication (1958) was one of the first works to

model psychological phenomena as information processing. Meanwhile, in the United

States, the development of computer modeling led to the creation of the field of cognitive

science. The field can be said to have started at a workshop in September 1956 at MIT—just

two months after the conference at which AI itself was “born.”

At the workshop, George Miller presented The Magic Number Seven, Noam Chomsky

presented Three Models of Language, and Allen Newell and Herbert Simon presented The

Logic Theory Machine. These three influential papers showed how computer models could be

used to address the psychology of memory, language, and logical thinking, respectively. It is

now a common (although far from universal) view among psychologists that “a cognitive

theory should be like a computer program” (Anderson, 1980); that is, it should describe the

operation of a cognitive function in terms of the processing of information.

For purposes of this review, we will count the field of human–computer interaction (HCI)

under psychology. Doug Engelbart, one of the pioneers of HCI, championed the idea of

intelligence augmentation—IA rather than AI. He believed that computers should augment

human abilities rather than automate away human tasks. In 1968, Engelbart’s “mother of all

demos” showed off for the first time the computer mouse, a windowing system, hypertext,

and video conferencing—all in an effort to demonstrate what human knowledge workers

could collectively accomplish with some intelligence augmentation.

Intelligence augmentation

Today we are more likely to see IA and AI as two sides of the same coin, with the former

emphasizing human control and the latter emphasizing intelligent behavior on the part of

the machine. Both are needed for machines to be useful to humans.

1.2.6 Computer engineering

How can we build an efficient computer?

The modern digital electronic computer was invented independently and almost

simultaneously by scientists in three countries embattled in World War II. The first

operational computer was the electromechanical Heath Robinson, built in 1943 by Alan

Turing’s team for a single purpose: deciphering German messages. In 1943, the same group

developed the Colossus, a powerful general-purpose machine based on vacuum tubes.

The first operational programmable computer was the Z-3, the invention of Konrad Zuse in

Germany in 1941. Zuse also invented floating-point numbers and the first high-level

programming language, Plankalkül. The first electronic computer, the ABC, was assembled

by John Atanasoff and his student Clifford Berry between 1940 and 1942 at Iowa State

University. Atanasoff’s research received little support or recognition; it was the ENIAC,

developed as part of a secret military project at the University of Pennsylvania by a team

including John Mauchly and J. Presper Eckert, that proved to be the most influential

forerunner of modern computers.

9 A complex machine named after a British cartoonist who depicted whimsical and absurdly complicated contraptions for everyday

tasks such as buttering toast.

10 In the postwar period, Turing wanted to use these computers for AI research—for example, he created an outline of the first chess

program (Turing et al., 1953) —but the British government blocked this research.

Moore’s law

Since that time, each generation of computer hardware has brought an increase in speed

and capacity and a decrease in price—a trend captured in Moore’s law. Performance

doubled every 18 months or so until around 2005, when power dissipation problems led

manufacturers to start multiplying the number of CPU cores rather than the clock speed.

Current expectations are that future increases in functionality will come from massive

parallelism—a curious convergence with the properties of the brain. We also see new

hardware designs based on the idea that in dealing with an uncertain world, we don’t need

64 bits of precision in our numbers; just 16 bits (as in the bfloat16 format) or even 8 bits will

be enough, and will enable faster processing.

We are just beginning to see hardware tuned for AI applications, such as the graphics

processing unit (GPU), tensor processing unit (TPU), and wafer scale engine (WSE). From

the 1960s to about 2012, the amount of computing power used to train top machine learning

applications followed Moore’s law. Beginning in 2012, things changed: from 2012 to 2018

there was a 300,000-fold increase, which works out to a doubling every 100 days or so

(Amodei and Hernandez, 2018). A machine learning model that took a full day to train in

2014 takes only two minutes in 2018 (Ying et al., 2018). Although it is not yet practical,

quantum computing holds out the promise of far greater accelerations for some important

subclasses of AI algorithms.
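
The doubling-time arithmetic can be checked directly. The sketch below is a back-of-the-envelope calculation; the function names and the choice of a five-year versus six-year measurement window are assumptions of this example, since the exact window behind the quoted figure is not stated in the text.

```python
import math

def doubling_time_days(growth_factor: float, years: float) -> float:
    """Days per doubling if a quantity grows by growth_factor over the given years."""
    doublings = math.log2(growth_factor)
    return years * 365 / doublings

def moore_growth(years: float, months_per_doubling: float = 18) -> float:
    """Total growth over a period at one doubling every months_per_doubling months."""
    return 2 ** (years * 12 / months_per_doubling)

print(doubling_time_days(300_000, 5))   # window assumed to be five years
print(doubling_time_days(300_000, 6))   # window assumed to be six full years
print(moore_growth(6))                  # growth at an 18-month doubling pace
```

A 300,000-fold increase is about 18 doublings, so the result is roughly 100 days per doubling over five years and roughly 120 days over six. At an 18-month doubling pace, six years would yield only a 16-fold increase.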

Quantum computing

Of course, there were calculating devices before the electronic computer. The earliest

automated machines, dating from the 17th century, were discussed on page 6. The first

programmable machine was a loom, devised in 1805 by Joseph Marie Jacquard (1752–1834),

that used punched cards to store instructions for the pattern to be woven.

In the mid-19th century, Charles Babbage (1792–1871) designed two computing machines,

neither of which he completed. The Difference Engine was intended to compute

mathematical tables for engineering and scientific projects. It was finally built and shown to

work in 1991 (Swade, 2000). Babbage’s Analytical Engine was far more ambitious: it

included addressable memory, stored programs based on Jacquard’s punched cards, and

conditional jumps. It was the first machine capable of universal computation.

Babbage’s colleague Ada Lovelace, daughter of the poet Lord Byron, understood its

potential, describing it as “a thinking or ... a reasoning machine,” one capable of reasoning

about “all subjects in the universe” (Lovelace, 1843). She also anticipated AI’s hype cycles,

writing, “It is desirable to guard against the possibility of exaggerated ideas that might arise

as to the powers of the Analytical Engine.” Unfortunately, Babbage’s machines and

Lovelace’s ideas were largely forgotten.

AI also owes a debt to the software side of computer science, which has supplied the

operating systems, programming languages, and tools needed to write modern programs

(and papers about them). But this is one area where the debt has been repaid: work in AI

has pioneered many ideas that have made their way back to mainstream computer science,

including time sharing, interactive interpreters, personal computers with windows and mice,

rapid development environments, the linked-list data type, automatic storage management,

and key concepts of symbolic, functional, declarative, and object-oriented programming.

1.2.7 Control theory and cybernetics

How can artifacts operate under their own control?

Ktesibios of Alexandria (c. 250 BCE) built the first self-controlling machine: a water clock

with a regulator that maintained a constant flow rate. This invention changed the definition

of what an artifact could do. Previously, only living things could modify their behavior in

response to changes in the environment. Other examples of self-regulating feedback control

systems include the steam engine governor, created by James Watt (1736–1819), and the

thermostat, invented by Cornelis Drebbel (1572–1633), who also invented the submarine.

James Clerk Maxwell (1868) initiated the mathematical theory of control systems.

A central figure in the post-war development of control theory was Norbert Wiener (1894–

1964). Wiener was a brilliant mathematician who worked with Bertrand Russell, among

others, before developing an interest in biological and mechanical control systems and their

connection to cognition. Like Craik (who also used control systems as psychological

models), Wiener and his colleagues Arturo Rosenblueth and Julian Bigelow challenged the

behaviorist orthodoxy (Rosenblueth et al., 1943). They viewed purposive behavior as arising

from a regulatory mechanism trying to minimize “error”—the difference between current

state and goal state. In the late 1940s, Wiener, along with Warren McCulloch, Walter Pitts,

and John von Neumann, organized a series of influential conferences that explored the new

mathematical and computational models of cognition. Wiener’s book Cybernetics (1948)

became a bestseller and awoke the public to the possibility of artificially intelligent

machines.
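
The idea of a regulatory mechanism that minimizes error can be sketched as a simple proportional feedback loop. This is a minimal illustration and not a model taken from Wiener's work; the gain, the step count, and the temperature values are hypothetical.

```python
def regulate(current: float, goal: float, gain: float = 0.5, steps: int = 20):
    """Simple proportional feedback: each step corrects a fraction of the error."""
    history = [current]
    for _ in range(steps):
        error = goal - current      # difference between goal state and current state
        current += gain * error     # act so as to reduce the error
        history.append(current)
    return history

# Hypothetical thermostat-like example: start at 15 degrees with a goal of 20 degrees.
trajectory = regulate(current=15.0, goal=20.0)
print([round(t, 3) for t in trajectory[:5]], "...", round(trajectory[-1], 5))
```

With a gain of 0.5 the error halves at every step, so the state approaches the goal without the controller ever needing a model of why the error arose.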

Control theory

Cybernetics

Meanwhile, in Britain, W. Ross Ashby pioneered similar ideas (Ashby, 1940). Ashby, Alan

Turing, Grey Walter, and others formed the Ratio Club for “those who had Wiener’s ideas

before Wiener’s book appeared.” Ashby’s Design for a Brain (1948, 1952) elaborated on his

idea that intelligence could be created by the use of homeostatic devices containing

appropriate feedback loops to achieve stable adaptive behavior.

Homeostatic

Modern control theory, especially the branch known as stochastic optimal control, has as its

goal the design of systems that minimize a cost function over time. This roughly matches

the standard model of AI: designing systems that behave optimally. Why, then, are AI and

control theory two different fields, despite the close connections among their founders? The

answer lies in the close coupling between the mathematical techniques that were familiar to

the participants and the corresponding sets of problems that were encompassed in each

world view. Calculus and matrix algebra, the tools of control theory, lend themselves to

systems that are describable by fixed sets of continuous variables, whereas AI was founded

in part as a way to escape from these perceived limitations. The tools of logical inference

and computation allowed AI researchers to consider problems such as language, vision, and

symbolic planning that fell completely outside the control theorist’s purview.

Cost function

1.2.8 Linguistics

How does language relate to thought?

In 1957, B. F. Skinner published Verbal Behavior. This was a comprehensive, detailed

account of the behaviorist approach to language learning, written by the foremost expert in

the field. But curiously, a review of the book became as well known as the book itself, and

served to almost kill off interest in behaviorism. The author of the review was the linguist

Noam Chomsky, who had just published a book on his own theory, Syntactic Structures.

Chomsky pointed out that the behaviorist theory did not address the notion of creativity in

language—it did not explain how children could understand and make up sentences that

they had never heard before. Chomsky’s theory—based on syntactic models going back to

the Indian linguist Panini (c. 350 BCE)—could explain this, and unlike previous theories, it

was formal enough that it could in principle be programmed.
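
To illustrate what it means for a syntactic theory to be programmable, the sketch below enumerates sentences from a toy phrase-structure grammar. The rules and vocabulary are hypothetical and are not taken from Syntactic Structures; the point is only that a handful of finite rules, one of them recursive, yields many sentences that were never listed explicitly.

```python
# Toy phrase-structure rules (hypothetical, for illustration only).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], ["the", "N", "that", "VP"]],
    "VP": [["V"], ["V", "NP"]],
    "N": [["child"], ["dog"]],
    "V": [["sees"], ["sleeps"]],
}

def expand(symbols, depth):
    """Yield every word sequence derivable from symbols within a depth budget."""
    if not symbols:
        yield []
        return
    first, rest = symbols[0], symbols[1:]
    if first not in GRAMMAR:                  # terminal word: keep it as is
        for tail in expand(rest, depth):
            yield [first] + tail
    elif depth > 0:                           # nonterminal: try each rewrite rule
        for rule in GRAMMAR[first]:
            for head in expand(rule, depth - 1):
                for tail in expand(rest, depth):
                    yield head + tail

sentences = [" ".join(words) for words in expand(["S"], depth=5)]
print(len(sentences))
print(sentences[:3])
```

Raising the depth budget lets the recursive noun-phrase rule nest further, so the set of generated sentences grows without any new rules being added.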

Computational linguistics

Modern linguistics and AI, then, were “born” at about the same time, and grew up together,

intersecting in a hybrid field called computational linguistics or natural language

processing. The problem of understanding language turned out to be considerably more

complex than it seemed in 1957. Understanding language requires an understanding of the

subject matter and context, not just an understanding of the structure of sentences. This

might seem obvious, but it was not widely appreciated until the 1960s. Much of the early

work in knowledge representation (the study of how to put knowledge into a form that a

computer can reason with) was tied to language and informed by research in linguistics,

which was connected in turn to decades of work on the philosophical analysis of language.

1.3 The History of Artificial Intelligence

One quick way to summarize the milestones in AI history is to list the Turing Award

winners: Marvin Minsky (1969) and John McCarthy (1971) for defining the foundations of

the field based on representation and reasoning; Ed Feigenbaum and Raj Reddy (1994) for

developing expert systems that encode human knowledge to solve real-world problems;

Judea Pearl (2011) for developing probabilistic reasoning techniques that deal with

uncertainty in a principled manner; and finally Yoshua Bengio, Geoffrey Hinton, and Yann

LeCun (2019) for making “deep learning” (multilayer neural networks) a critical part of

modern computing. The rest of this section goes into more detail on each phase of AI

history.

1.3.1 The inception of artificial intelligence (1943–1956)

The first work that is now generally recognized as AI was done by Warren McCulloch and

Walter Pitts (1943). Inspired by the mathematical modeling work of Pitts’s advisor Nicolas Rashevsky

(1936, 1938), they drew on three sources: knowledge of the basic physiology and function of

neurons in the brain; a formal analysis of propositional logic due to Russell and Whitehead;

and Turing’s theory of computation. They proposed a model of artificial neurons in which

each neuron is characterized as being “on” or “off,” with a switch to “on” occurring in

response to stimulation by a sufficient number of neighboring neurons. The state of a

neuron was conceived of as “factually equivalent to a proposition which proposed its

adequate stimulus.” They showed, for example, that any computable function could be

computed by some network of connected neurons, and that all the logical connectives (AND,

OR, NOT, etc.) could be implemented by simple network structures. McCulloch and Pitts also

suggested that suitably defined networks could learn. Donald Hebb (1949) demonstrated a

simple updating rule for modifying the connection strengths between neurons. His rule,

now called Hebbian learning, remains an influential model to this day.
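
The claim that simple networks of on/off units can implement the logical connectives can be illustrated with a minimal sketch. The function names, weights, and thresholds below are illustrative assumptions, not McCulloch and Pitts's original notation, and `hebbian_update` is only the simplest textbook form of a Hebb-style rule (strengthen a connection when both neurons are active together).

```python
def mp_unit(inputs, weights, threshold):
    """Return 1 ("on") when the weighted sum of binary inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def AND(a, b):
    # Both excitatory inputs must be on to reach the threshold of 2.
    return mp_unit([a, b], [1, 1], threshold=2)


def OR(a, b):
    # A single active input is enough to reach the threshold of 1.
    return mp_unit([a, b], [1, 1], threshold=1)


def NOT(a):
    # One inhibitory (negative-weight) input; the unit fires only when it is off.
    return mp_unit([a], [-1], threshold=0)


def XOR(a, b):
    # A small network of units: (a OR b) AND NOT (a AND b).
    return AND(OR(a, b), NOT(AND(a, b)))


def hebbian_update(weight, pre, post, rate=0.1):
    """Simplified Hebb-style rule (an assumption): grow the weight when
    the presynaptic and postsynaptic units are active at the same time."""
    return weight + rate * pre * post


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, AND(a, b), OR(a, b), XOR(a, b))
```

Composing units, as in `XOR`, is what gives these networks their expressive power; a single unit of this kind is far more limited.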

Hebbian learning

Two undergraduate students at Harvard, Marvin Minsky (1927–2016) and Dean Edmonds,

built the first neural network computer in 1950. The SNARC, as it was called, used 3000

vacuum tubes and a surplus automatic pilot mechanism from a B-24 bomber to simulate a

network of 40 neurons. Later, at Princeton, Minsky studied universal computation in neural

networks. His Ph.D. committee was skeptical about whether this kind of work should be

considered mathematics, but von Neumann reportedly said, “If it isn’t now, it will be

someday.”

There were a number of other examples of early work that can be characterized as AI,

including two checkers-playing programs developed independently in 1952 by Christopher

Strachey at the University of Manchester and by Arthur Samuel at IBM. However, Alan

Turing’s vision was the most influential. He gave lectures on the topic as early as 1947 at the

London Mathematical Society and articulated a persuasive agenda in his 1950 article

“Computing Machinery and Intelligence.” Therein, he introduced the Turing test, machine

learning, genetic algorithms, and reinforcement learning. He dealt with many of the

objections raised to the possibility of AI, as described in Chapter 27. He also suggested

that it would be easier to create human-level AI by developing learning algorithms and then

teaching the machine rather than by programming its intelligence by hand. In subsequent

lectures he warned that achieving this goal might not be the best thing for the human race.

In 1955, John McCarthy of Dartmouth College convinced Minsky, Claude Shannon, and

Nathaniel Rochester to help him bring together U.S. researchers interested in automata

theory, neural nets, and the study of intelligence. They organized a two-month workshop at

Dartmouth in the summer of 1956. There were 10 attendees in all, including Allen Newell

and Herbert Simon from Carnegie Tech, Trenchard More from Princeton, Arthur Samuel

from IBM, and Ray Solomonoff and Oliver Selfridge from MIT. The proposal states:

We propose that a 2 month, 10 man study of artificial intelligence be carried out during the summer

of 1956 at Dartmouth College in Hanover, New Hampshire. The study is to proceed on the basis of

the conjecture that every aspect of learning or any other feature of intelligence can in principle be so

precisely described that a machine can be made to simulate it. An attempt will be made to find how

to make machines use language, form abstractions and concepts, solve kinds of problems now

reserved for humans, and improve themselves. We think that a significant advance can be made in

one or more of these problems if a carefully selected group of scientists work on it together for a

summer.

11 Now Carnegie Mellon University (CMU).

12 This was the first official usage of McCarthy’s term artificial intelligence. Perhaps “computational rationality” would have been more

precise and less threatening, but “AI” has stuck. At the 50th anniversary of the Dartmouth conference, McCarthy stated that he

resisted the terms “computer” or “computational” in deference to Norbert Wiener, who was promoting analog cybernetic devices

rather than digital computers.

Despite this optimistic prediction, the Dartmouth workshop did not lead to any

breakthroughs. Newell and Simon presented perhaps the most mature work, a mathematical

theorem-proving system called the Logic Theorist (LT). Simon claimed, “We have invented

a computer program capable of thinking non-numerically, and thereby solved the venerable

mind–body problem.” Soon after the workshop, the program was able to prove most of

the theorems in Chapter 2 of Russell and Whitehead’s Principia Mathematica. Russell was

reportedly delighted when told that LT had come up with a proof for one theorem that was

shorter than the one in Principia. The editors of the Journal of Symbolic Logic were less

impressed; they rejected a paper coauthored by Newell, Simon, and Logic Theorist.

13 Newell and Simon also invented a list-processing language, IPL, to write LT. They had no compiler and translated it into machine

code by hand. To avoid errors, they worked in parallel, calling out binary numbers to each other as they wrote each instruction to

make sure they agreed.

1.3.2 Early enthusiasm, great expectations (1952–1969)

The intellectual establishment of the 1950s, by and large, preferred to believe that “a

machine can never do $X$.” (See Chapter 27 for a long list of $X$’s gathered by Turing.) AI

researchers naturally responded by demonstrating one $X$ after another. They focused in

particular on tasks considered indicative of intelligence in humans, including games,

puzzles, mathematics, and IQ tests. John McCarthy referred to this period as the “Look, Ma,

no hands!” era.

Newell and Simon followed up their success with LT with the General Problem Solver, or

GPS. Unlike LT, this program was designed from the start to imitate human problem-solving

protocols. Within the limited class of puzzles it could handle, it turned out that the order in

which the program considered subgoals and possible actions was similar to that in which

humans approached the same problems. Thus, GPS was probably the first program to

embody the “thinking humanly” approach. The success of GPS and subsequent programs as

models of cognition led Newell and Simon (1976) to formulate the famous physical symbol

system hypothesis, which states that “a physical symbol system has the necessary and

sufficient means for general intelligent action.” What they meant is that any system (human

or machine) exhibiting intelligence must operate by manipulating data structures composed

of symbols. We will see later that this hypothesis has been challenged from many directions.

Physical symbol system

At IBM, Nathaniel Rochester and his colleagues produced some of the first AI programs.

Herbert Gelernter (1959) constructed the Geometry Theorem Prover, which was able to

prove theorems that many students of mathematics would find quite tricky. This work was a

precursor of modern mathematical theorem provers.

Of all the exploratory work done during this period, perhaps the most influential in the long

run was that of Arthur Samuel on checkers (draughts). Using methods that we now call

reinforcement learning (see Chapter 22), Samuel’s programs learned to play at a strong

amateur level. He thereby disproved the idea that computers can do only what they are told

to: his program quickly learned to play a better game than its creator. The program was

demonstrated on television in 1956, creating a strong impression. Like Turing, Samuel had

trouble finding computer time. Working at night, he used machines that were still on the

testing floor at IBM’s manufacturing plant. Samuel’s program was the precursor of later

systems such as TD-GAMMON (Tesauro, 1992), which was among the world’s best

backgammon players, and ALPHAGO (Silver et al., 2016), which shocked the world by

defeating the human world champion at Go (see Chapter 5).

In 1958, John McCarthy made two important contributions to AI. In MIT AI Lab Memo No.

1, he defined the high-level language Lisp, which was to become the dominant AI

programming language for the next 30 years. In a paper entitled Programs with Common

Sense, he advanced a conceptual proposal for AI systems based on knowledge and

reasoning. The paper describes the Advice Taker, a hypothetical program that would

embody general knowledge of the world and could use it to derive plans of action. The

concept was illustrated with simple logical axioms that suffice to generate a plan to drive to

the airport. The program was also designed to accept new axioms in the normal course of

operation, thereby allowing it to achieve competence in new areas without being

reprogrammed. The Advice Taker thus embodied the central principles of knowledge

representation and reasoning: that it is useful to have a formal, explicit representation of the

world and its workings and to be able to manipulate that representation with deductive

processes. The paper influenced the course of AI and remains relevant today.

Lisp

1958 also marked the year that Marvin Minsky moved to MIT. His initial collaboration with

McCarthy did not last, however. McCarthy stressed representation and reasoning in formal

logic, whereas Minsky was more interested in getting programs to work and eventually

developed an anti-logic outlook. In 1963, McCarthy started the AI lab at Stanford. His plan

to use logic to build the ultimate Advice Taker was advanced by J. A. Robinson’s discovery

in 1965 of the resolution method (a complete theorem-proving algorithm for first-order

logic; see Chapter 9). Work at Stanford emphasized general-purpose methods for logical

reasoning. Applications of logic included Cordell Green’s question-answering and planning

systems (Green, 1969b) and the Shakey robotics project at the Stanford Research Institute

(SRI). The latter project, discussed further in Chapter 26, was the first to demonstrate the

complete integration of logical reasoning and physical activity.

Microworld

At MIT, Minsky supervised a series of students who chose limited problems that appeared to

require intelligence to solve. These limited domains became known as microworlds. James

Slagle’s SAINT program (1963) was able to solve closed-form calculus integration problems

typical of first-year college courses. Tom Evans’s ANALOGY program (1968) solved geometric

analogy problems that appear in IQ tests. Daniel Bobrow’s STUDENT program (1967) solved

algebra story problems, such as the following:

If the number of customers Tom gets is twice the square of 20 percent of the number of

advertisements he runs, and the number of advertisements he runs is 45, what is the number of

customers Tom gets?
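
Worked through by hand, the story problem reduces to one line of arithmetic. The sketch below only evaluates the equation the sentence describes; it does not attempt the hard part of STUDENT, which was translating the English text into that equation. The function name is a hypothetical choice for illustration.

```python
def customers_from_ads(advertisements):
    """Evaluate 'twice the square of 20 percent of the number of advertisements'."""
    twenty_percent = advertisements * 20 // 100  # integer arithmetic; exact for 45
    return 2 * twenty_percent ** 2


if __name__ == "__main__":
    # 20 percent of 45 is 9, the square of 9 is 81, and twice 81 is 162.
    print(customers_from_ads(45))
```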

The most famous microworld is the blocks world, which consists of a set of solid blocks

placed on a tabletop (or more often, a simulation of a tabletop), as shown in Figure 1.3. A

typical task in this world is to rearrange the blocks in a certain way, using a robot hand that

can pick up one block at a time. The blocks world was home to the vision project of David

Huffman (1971), the vision and constraint-propagation work of David Waltz (1975), the

learning theory of Patrick Winston (1970), the natural-language-understanding program of

Terry Winograd (1972), and the planner of Scott Fahlman (1974).

Figure 1.3

A scene from the blocks world. SHRDLU (Winograd, 1972) has just completed the command “Find a block

which is taller than the one you are holding and put it in the box.”

Blocks world

Early work building on the neural networks of McCulloch and Pitts also flourished. The

work of Shmuel Winograd and Jack Cowan (1963) showed how a large number of elements

could collectively represent an individual concept, with a corresponding increase in

robustness and parallelism. Hebb’s learning methods were enhanced by Bernie Widrow

(Widrow and Hoff, 1960; Widrow, 1962), who called his networks adalines, and by Frank

Rosenblatt (1962) with his perceptrons. The perceptron convergence theorem (Block et al.,

1962) says that the learning algorithm can adjust the connection strengths of a perceptron to

match any input data, provided such a match exists.
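
A minimal sketch of a perceptron learning rule may make the convergence claim concrete. It assumes a single threshold unit with a bias term, labels in {0, 1}, and a fixed learning rate; the function names and the stopping criterion (one full pass with no mistakes) are illustrative choices, not taken from the cited papers.

```python
def predict(weights, bias, x):
    """Threshold unit: output 1 when the weighted sum plus bias is positive."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(examples, epochs=100, rate=1.0):
    """Adjust weights after each mistake; stop once a full pass is error-free.

    examples: list of (inputs, label) pairs with label in {0, 1}.
    Returns (weights, bias, converged).
    """
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in examples:
            delta = target - predict(weights, bias, x)
            if delta != 0:
                mistakes += 1
                # Move the decision boundary toward the misclassified example.
                weights = [w + rate * delta * xi for w, xi in zip(weights, x)]
                bias += rate * delta
        if mistakes == 0:
            return weights, bias, True
    return weights, bias, False


if __name__ == "__main__":
    and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    weights, bias, converged = train_perceptron(and_data)
    print(converged, [predict(weights, bias, x) for x, _ in and_data])
```

For linearly separable data such as AND, the loop terminates with a weight vector that classifies every example correctly; when no such weights exist, it simply runs out of epochs.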

1.3.3 A dose of reality (1966–1973)

From the beginning, AI researchers were not shy about making predictions of their coming

successes. The following statement by Herbert Simon in 1957 is often quoted:

It is not my aim to surprise or shock you—but the simplest way I can summarize is to say that there

are now in the world machines that think, that learn and that create. Moreover, their ability to do

these things is going to increase rapidly until—in a visible future—the range of problems they can

handle will be coextensive with the range to which the human mind has been applied.

The term “visible future” is vague, but Simon also made more concrete predictions: that

within 10 years a computer would be chess champion and a significant mathematical

theorem would be proved by machine. These predictions came true (or approximately true)

within 40 years rather than 10. Simon’s overconfidence was due to the promising

performance of early AI systems on simple examples. In almost all cases, however, these

early systems failed on more difficult problems.

There were two main reasons for this failure. The first was that many early AI systems were

based primarily on “informed introspection” as to how humans perform a task, rather than

on a careful analysis of the task, what it means to be a solution, and what an algorithm

would need to do to reliably produce such solutions.

The second reason for failure was a lack of appreciation of the intractability of many of the

problems that AI was attempting to solve. Most of the early problem-solving systems

worked by trying out different combinations of steps until the solution was found. This

strategy worked initially because microworlds contained very few objects and hence very

few possible actions and very short solution sequences. Before the theory of computational

complexity was developed, it was widely thought that “scaling up” to larger problems was

simply a matter of faster hardware and larger memories. The optimism that accompanied

the development of resolution theorem proving, for example, was soon dampened when

researchers failed to prove theorems involving more than a few dozen facts. The fact that a

program can find a solution in principle does not mean that the program contains any of the

mechanisms needed to find it in practice.
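
A rough calculation shows why trying out combinations of steps stops working as problems grow. The sketch assumes a uniform number of possible actions per step (the branching factor) and a hypothetical machine that checks one billion sequences per second; both figures are illustrative assumptions, not measurements.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def sequences_to_try(branching_factor, depth):
    """Number of distinct action sequences of the given length."""
    return branching_factor ** depth


def years_to_enumerate(branching_factor, depth, sequences_per_second=1e9):
    """Time to try every sequence at an assumed, hypothetical checking rate."""
    seconds = sequences_to_try(branching_factor, depth) / sequences_per_second
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    # A microworld-sized problem: 3 actions per step, solutions 5 steps long.
    print(sequences_to_try(3, 5))
    # A modestly larger problem: 10 actions per step, solutions 20 steps long.
    print(sequences_to_try(10, 20), years_to_enumerate(10, 20))
```

With 3 actions and 5 steps there are only 243 sequences, but with 10 actions and 20 steps there are 10^20, which would take thousands of years at the assumed rate. Hardware that is a thousand times faster buys only three more steps of depth at this branching factor.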

The illusion of unlimited computational power was not confined to problem-solving

programs. Early experiments in machine evolution (now called genetic programming)

(Friedberg, 1958; Friedberg et al., 1959) were based on the undoubtedly

correct belief that by making an appropriate series of small mutations to a machine-code

program, one can generate a program with good performance for any particular task. The

idea, then, was to try random mutations with a selection process to preserve mutations that

seemed useful. Despite thousands of hours of CPU time, almost no progress was

demonstrated.

Machine evolution

Failure to come to grips with the “combinatorial explosion” was one of the main criticisms of

AI contained in the Lighthill report (Lighthill, 1973), which formed the basis for the decision

by the British government to end support for AI research in all but two universities. (Oral

tradition paints a somewhat different and more colorful picture, with political ambitions and

personal animosities whose description is beside the point.)

A third difficulty arose because of some fundamental limitations on the basic structures

being used to generate intelligent behavior. For example, Minsky and Papert’s book

Perceptrons (1969) proved that, although perceptrons (a simple form of neural network)

could be shown to learn anything they were capable of representing, they could represent

very little. In particular, a two-input perceptron could not be trained to recognize when its

two inputs were different. Although their results did not apply to more complex, multilayer

networks, research funding for neural-net research soon dwindled to almost nothing.

Ironically, the new back-propagation learning algorithms that were to cause an enormous

resurgence in neural-net research in the late 1980s and again in the 2010s had already been

developed in other contexts in the early 1960s (Kelley, 1960; Bryson, 1962).
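
The limitation on two-input perceptrons can be explored with a small experiment. The sketch below searches a coarse grid of weights for a single threshold unit that reproduces a given truth table; the grid, the names, and the hand-picked two-layer weights are all illustrative assumptions. An empty search result is not a proof by itself, but here it agrees with the known fact that "inputs differ" (XOR) is not linearly separable.

```python
from itertools import product


def threshold_unit(w1, w2, bias, x1, x2):
    """Single perceptron-style unit with two inputs."""
    return 1 if w1 * x1 + w2 * x2 + bias > 0 else 0


XOR_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
OR_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}

# Candidate weights from -5.0 to 5.0 in steps of 0.5 (an arbitrary choice).
GRID = [i / 2 for i in range(-10, 11)]


def find_single_unit(truth_table, grid=GRID):
    """Return weights (w1, w2, bias) matching the table, or None if the grid has none."""
    for w1, w2, bias in product(grid, repeat=3):
        if all(threshold_unit(w1, w2, bias, *x) == y for x, y in truth_table.items()):
            return (w1, w2, bias)
    return None


def xor_two_layer(x1, x2):
    """Two hidden units (OR and AND) feeding one output unit: OR and not AND."""
    h_or = threshold_unit(1, 1, -0.5, x1, x2)
    h_and = threshold_unit(1, 1, -1.5, x1, x2)
    return threshold_unit(1, -1, -0.5, h_or, h_and)


if __name__ == "__main__":
    print(find_single_unit(OR_TABLE) is not None)
    print(find_single_unit(XOR_TABLE))
    print([xor_two_layer(*x) for x in XOR_TABLE])
```

The underlying argument is short: a single unit would need bias ≤ 0, w1 + bias > 0, and w2 + bias > 0, which together force w1 + w2 + bias > 0, contradicting the requirement that the unit stay off when both inputs are on. Adding one hidden layer, as in `xor_two_layer`, removes the obstacle.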

1.3.4 Expert systems (1969–1986)

The picture of problem solving that had arisen during the first decade of AI research was of

a general-purpose search mechanism trying to string together elementary reasoning steps to

find complete solutions. Such approaches have been called weak methods because,

although general, they do not scale up to large or difficult problem instances. The

alternative to weak methods is to use more powerful, domain-specific knowledge that

allows larger reasoning steps and can more easily handle typically occurring cases in narrow

areas of expertise. One might say that to solve a hard problem, you have to almost know the

answer already.

Weak method

The DENDRAL program (Buchanan et al., 1969) was an early example of this approach. It was

developed at Stanford, where Ed Feigenbaum (a former student of Herbert Simon), Bruce

Buchanan (a philosopher turned computer scientist), and Joshua Lederberg (a Nobel

laureate geneticist) teamed up to solve the problem of inferring molecular structure from the

information provided by a mass spectrometer. The input to the program consists of the

elementary formula of the molecule (e.g., $\mathrm{C_6H_{13}NO_2}$) and the mass spectrum giving the

masses of the various fragments of the molecule generated when it is bombarded by an

electron beam. For example, the mass spectrum might contain a peak at $m = 15$,

corresponding to the mass of a methyl ($\mathrm{CH_3}$) fragment.

The naive version of the program generated all possible structures consistent with the

formula, and then predicted what mass spectrum would be observed for each, comparing

this with the actual spectrum. As one might expect, this is intractable for even moderate-

sized molecules. The DENDRAL researchers consulted analytical chemists and found that they

worked by looking for well-known patterns of peaks in the spectrum that suggested

common substructures in the molecule. For example, the following rule is used to recognize

a ketone ($\mathrm{C{=}O}$) subgroup (which weighs 28):

it is also noted in the margin. Subsequent significant uses

of the term are in bold, but not in the margin. We have included a comprehensive index and

an extensive bibliography.

Term

The only prerequisite is familiarity with basic concepts of computer science (algorithms,

data structures, complexity) at a sophomore level. Freshman calculus and linear algebra are

useful for some of the topics.

Online resources

Online resources are available through pearsonhighered.com/cs-resources or at the

book’s Web site, aima.cs.berkeley.edu. There you will find:

Exercises, programming projects, and research projects. These are no longer at the end

of each chapter; they are online only. Within the book, we refer to an online exercise

with a name like “Exercise 6.NARY.” Instructions on the Web site allow you to find

exercises by name or by topic.

Implementations of the algorithms in the book in Python, Java, and other programming

languages (currently hosted at github.com/aimacode).

A list of over 1400 schools that have used the book, many with links to online course

materials and syllabi.

Supplementary material and links for students and instructors.

Instructions on how to report errors in the book, in the likely event that some exist.

Book cover

The cover depicts the final position from the decisive game 6 of the 1997 chess match in

which the program Deep Blue defeated Garry Kasparov (playing Black), making this the first

time a computer had beaten a world champion in a chess match. Kasparov is shown at the

top. To his right is a pivotal position from the second game of the historic Go match

between former world champion Lee Sedol and DeepMind’s ALPHAGO program. Move 37 by

ALPHAGO violated centuries of Go orthodoxy and was immediately seen by human experts as

an embarrassing mistake, but it turned out to be a winning move. At top left is an Atlas

humanoid robot built by Boston Dynamics. A depiction of a self-driving car sensing its

environment appears between Ada Lovelace, the world’s first computer programmer, and

Alan Turing, whose fundamental work defined artificial intelligence. At the bottom of the

chess board are a Mars Exploration Rover robot and a statue of Aristotle, who pioneered the

study of logic; his planning algorithm from De Motu Animalium appears behind the authors’

names. Behind the chess board is a probabilistic programming model used by the UN

Comprehensive Nuclear-Test-Ban Treaty Organization for detecting nuclear explosions

from seismic signals.

Acknowledgments

It takes a global village to make a book. Over 600 people read parts of the book and made

suggestions for improvement. The complete list is at aima.cs.berkeley.edu/ack.html;

we are grateful to all of them. We have space here to mention only a few especially

important contributors. First the contributing writers:

Judea Pearl (Section 13.5, Causal Networks);

Vikash Mansinghka (Section 15.3, Programs as Probability Models);

Michael Wooldridge (Chapter 18, Multiagent Decision Making);

Ian Goodfellow (Chapter 21, Deep Learning);

Jacob Devlin and Ming-Wei Chang (Chapter 24, Deep Learning for Natural

Language);

Jitendra Malik and David Forsyth (Chapter 25, Computer Vision);

Anca Dragan (Chapter 26, Robotics).

Then some key roles:

Cynthia Yeung and Malika Cantor (project management);

Julie Sussman and Tom Galloway (copyediting and writing suggestions);

Omari Stephens (illustrations);

Tracy Johnson (editor);

Erin Ault and Rose Kernan (cover and color conversion);

Nalin Chhibber, Sam Goto, Raymond de Lacaze, Ravi Mohan, Ciaran O’Reilly, Amit

Patel, Dragomir Radiv, and Samagra Sharma (online code development and mentoring);

Google Summer of Code students (online code development).

Stuart would like to thank his wife, Loy Sheflott, for her endless patience and boundless

wisdom. He hopes that Gordon, Lucy, George, and Isaac will soon be reading this book after

they have forgiven him for working so long on it. RUGS (Russell’s Unusual Group of

Students) have been unusually helpful, as always.

Peter would like to thank his parents (Torsten and Gerda) for getting him started, and his

wife (Kris), children (Bella and Juliet), colleagues, boss, and friends for encouraging and

tolerating him through the long hours of writing and rewriting.

About the Authors

STUART RUSSELL was born in 1962 in Portsmouth, England. He received his B.A. with

first-class honours in physics from Oxford University in 1982, and his Ph.D. in computer

science from Stanford in 1986. He then joined the faculty of the University of California at

Berkeley, where he is a professor and former chair of computer science, director of the

Center for Human-Compatible AI, and holder of the Smith–Zadeh Chair in Engineering. In

1990, he received the Presidential Young Investigator Award of the National Science

Foundation, and in 1995 he was cowinner of the Computers and Thought Award. He is a

Fellow of the American Association for Artificial Intelligence, the Association for Computing

Machinery, and the American Association for the Advancement of Science, an Honorary

Fellow of Wadham College, Oxford, and an Andrew Carnegie Fellow. He held the Chaire

Blaise Pascal in Paris from 2012 to 2014. He has published over 300 papers on a wide range

of topics in artificial intelligence. His other books include The Use of Knowledge in Analogy and

Induction, Do the Right Thing: Studies in Limited Rationality (with Eric Wefald), and Human

Compatible: Artificial Intelligence and the Problem of Control.

PETER NORVIG is currently a Director of Research at Google, Inc., and was previously the

director responsible for the core Web search algorithms. He co-taught an online AI class

that signed up 160,000 students, helping to kick off the current round of massive open

online classes. He was head of the Computational Sciences Division at NASA Ames

Research Center, overseeing research and development in artificial intelligence and

robotics. He received a B.S. in applied mathematics from Brown University and a Ph.D. in

computer science from Berkeley. He has been a professor at the University of Southern

California and a faculty member at Berkeley and Stanford. He is a Fellow of the American

Association for Artificial Intelligence, the Association for Computing Machinery, the

American Academy of Arts and Sciences, and the California Academy of Science. His other

books are Paradigms of AI Programming: Case Studies in Common Lisp, Verbmobil: A Translation

System for Face-to-Face Dialog, and Intelligent Help Systems for UNIX.

The two authors shared the inaugural AAAI/EAAI Outstanding Educator award in 2016.

Contents

I Artificial Intelligence

1 Introduction

1.1 What Is AI?

1.2 The Foundations of Artificial Intelligence

1.3 The History of Artificial Intelligence

1.4 The State of the Art

1.5 Risks and Benefits of AI

Summary

Bibliographical and Historical Notes

if $M$ is the mass of the whole molecule and there are two peaks at $x_1$ and $x_2$ such that (a)

$x_1 + x_2 = M + 28$; (b) $x_1 - 28$ is a high peak; (c) $x_2 - 28$ is a high peak; and (d) At least one of $x_1$ and

$x_2$ is high then there is a ketone subgroup.

Recognizing that the molecule contains a particular substructure reduces the number of

possible candidates enormously. According to its authors, DENDRAL was powerful because it

embodied the relevant knowledge of mass spectroscopy not in the form of first principles

but in efficient “cookbook recipes” (Feigenbaum et al., 1971). The significance of DENDRAL

was that it was the first successful knowledge-intensive system: its expertise derived from

large numbers of special-purpose rules. In 1971, Feigenbaum and others at Stanford began

the Heuristic Programming Project (HPP) to investigate the extent to which the new

methodology of expert systems could be applied to other areas.
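To make the flavor of such a "cookbook recipe" concrete, the ketone rule can be sketched as a small Python function. This is only an illustration, not DENDRAL's actual implementation: the dictionary representation of the spectrum, the function name, and the numeric cutoff for a "high" peak are all assumptions introduced here.

```python
def has_ketone_subgroup(molecule_mass, spectrum, high=0.5):
    """Return True if the spectrum matches the ketone rule.

    spectrum maps a peak's mass to its relative intensity (0.0 to 1.0).
    The `high` cutoff is a hypothetical stand-in for "a high peak".
    """
    def is_high(mass):
        # A mass with no recorded peak counts as intensity 0.
        return spectrum.get(mass, 0.0) >= high

    masses = sorted(spectrum)
    for i, x1 in enumerate(masses):
        for x2 in masses[i:]:
            if (x1 + x2 == molecule_mass + 28              # condition (a)
                    and is_high(x1 - 28)                   # condition (b)
                    and is_high(x2 - 28)                   # condition (c)
                    and (is_high(x1) or is_high(x2))):     # condition (d)
                return True
    return False


# Hypothetical spectrum for a molecule of mass 100.
example_spectrum = {29: 0.8, 43: 0.9, 57: 0.7, 71: 0.2}
print(has_ketone_subgroup(100, example_spectrum))
```

A knowledge-intensive system in this style is essentially a large collection of such narrow tests, each of which prunes the set of candidate structures.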

Expert systems

The next major effort was the MYCIN system for diagnosing blood infections. With about 450

rules, MYCIN was able to perform as well as some experts, and considerably better than

junior doctors. It also differed from DENDRAL in two major ways. First, unlike the

DENDRAL rules, no general theoretical model existed from which the MYCIN rules could be

deduced. They had to be acquired from extensive interviewing of experts. Second, the rules

had to reflect the uncertainty associated with medical knowledge. MYCIN incorporated a

calculus of uncertainty called certainty factors (see Chapter 13), which seemed (at the

time) to fit well with how doctors assessed the impact of evidence on the diagnosis.
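As a rough illustration of what such a calculus looks like, the sketch below implements the combination rule usually quoted in later descriptions of MYCIN-style systems for merging two certainty factors, each in the range -1 to 1, that bear on the same hypothesis. The formula is not taken from this text, and the function name is an assumption made for the example.

```python
def combine_cf(cf1, cf2):
    """Combine two certainty factors in [-1, 1] for the same hypothesis."""
    if cf1 >= 0 and cf2 >= 0:
        # Two pieces of supporting evidence reinforce each other.
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        # Two pieces of opposing evidence reinforce each other.
        return cf1 + cf2 * (1 + cf1)
    # Mixed evidence: the weaker factor discounts the stronger one.
    denom = 1 - min(abs(cf1), abs(cf2))
    if denom == 0:
        raise ValueError("total conflict: +1 combined with -1 is undefined")
    return (cf1 + cf2) / denom


print(combine_cf(0.6, 0.5))   # two supporting rules
print(combine_cf(0.6, -0.5))  # one supporting, one opposing
```

Note that the result never leaves the range -1 to 1, and that the order in which evidence arrives does not matter when all factors have the same sign.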

Certainty factor

The first successful commercial expert system, R1, began operation at the Digital Equipment

Corporation (McDermott, 1982). The program helped configure orders for new computer

systems; by 1986, it was saving the company an estimated $40 million a year. By 1988,

DEC’s AI group had 40 expert systems deployed, with more on the way. DuPont had 100 in

use and 500 in development. Nearly every major U.S. corporation had its own AI group and

was either using or investigating expert systems.

The importance of domain knowledge was also apparent in the area of natural language

understanding. Despite the success of Winograd’s SHRDLU system, its methods did not extend

to more general tasks: for problems such as ambiguity resolution it used simple rules that

relied on the tiny scope of the blocks world.

Several researchers, including Eugene Charniak at MIT and Roger Schank at Yale, suggested

that robust language understanding would require general knowledge about the world and

a general method for using that knowledge. (Schank went further, claiming, “There is no

such thing as syntax,” which upset a lot of linguists but did serve to start a useful

discussion.) Schank and his students built a series of programs (Schank and Abelson, 1977;

Wilensky, 1978; Schank and Riesbeck, 1981) that all had the task of understanding natural

language. The emphasis, however, was less on language per se and more on the problems of

representing and reasoning with the knowledge required for language understanding.

The widespread growth of applications to real-world problems led to the development of a

wide range of representation and reasoning tools. Some were based on logic—for example,

the Prolog language became popular in Europe and Japan, and the PLANNER family in the

United States. Others, following Minsky’s idea of frames (1975), adopted a more structured

approach, assembling facts about particular object and event types and arranging the types

into a large taxonomic hierarchy analogous to a biological taxonomy.

Frames

In 1981, the Japanese government announced the “Fifth Generation” project, a 10-year plan

to build massively parallel, intelligent computers running Prolog. The budget was to exceed

$1.3 billion in today’s money. In response, the United States formed the Microelectronics

and Computer Technology Corporation (MCC), a consortium designed to assure national

competitiveness. In both cases, AI was part of a broad effort, including chip design and

human-interface research. In Britain, the Alvey report reinstated the funding removed by

the Lighthill report. However, none of these projects ever met its ambitious goals in terms of

new AI capabilities or economic impact.

Overall, the AI industry boomed from a few million dollars in 1980 to billions of dollars in

1988, including hundreds of companies building expert systems, vision systems, robots, and

software and hardware specialized for these purposes.

Soon after that came a period called the “AI winter,” in which many companies fell by the

wayside as they failed to deliver on extravagant promises. It turned out to be difficult to

build and maintain expert systems for complex domains, in part because the reasoning

methods used by the systems broke down in the face of uncertainty and in part because the

systems could not learn from experience.

1.3.5 The return of neural networks (1986–present)

In the mid-1980s at least four different groups reinvented the back-propagation learning

algorithm first developed in the early 1960s. The algorithm was applied to many learning

problems in computer science and psychology, and the widespread dissemination of the

results in the collection Parallel Distributed Processing (Rumelhart and McClelland, 1986)

caused great excitement.

These so-called connectionist models were seen by some as direct competitors both to the

symbolic models promoted by Newell and Simon and to the logicist approach of McCarthy

and others. It might seem obvious that at some level humans manipulate symbols—in fact,

the anthropologist Terrence Deacon’s book The Symbolic Species (1997) suggests that this is

the defining characteristic of humans. Against this, Geoff Hinton, a leading figure in the

resurgence of neural networks in the 1980s and 2010s, has described symbols as the

“luminiferous aether of AI”—a reference to the non-existent medium through which many

19th-century physicists believed that electromagnetic waves propagated. Certainly, many

concepts that we name in language fail, on closer inspection, to have the kind of logically

defined necessary and sufficient conditions that early AI researchers hoped to capture in

axiomatic form. It may be that connectionist models form internal concepts in a more fluid

and imprecise way that is better suited to the messiness of the real world. They also have

the capability to learn from examples—they can compare their predicted output value to the

true value on a problem and modify their parameters to decrease the difference, making

them more likely to perform well on future examples.
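The idea of comparing a prediction to the true value and nudging parameters to shrink the difference can be sketched with a single linear unit. This is a deliberately minimal illustration, not back-propagation through a multilayer network; the function name, learning rate, and toy data are assumptions chosen for the example.

```python
def train_linear_unit(examples, learning_rate=0.1, epochs=200):
    """Fit y = w * x + b by repeatedly reducing the prediction error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y_true in examples:
            y_pred = w * x + b
            error = y_pred - y_true  # difference from the true value
            # Move each parameter against the gradient of the squared error.
            w -= learning_rate * error * x
            b -= learning_rate * error
    return w, b


# Toy data generated from y = 2x + 1.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = train_linear_unit(examples)
print(round(w, 3), round(b, 3))
```

Back-propagation applies the same error-driven adjustment to every layer of a network by passing the error signal backward through the layers.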

Connectionist

1.3.6 Probabilistic reasoning and machine learning (1987–present)

The brittleness of expert systems led to a new, more scientific approach incorporating

probability rather than Boolean logic, machine learning rather than hand-coding, and

experimental results rather than philosophical claims. It became more common to build

on existing theories than to propose brand-new ones, to base claims on rigorous theorems

or solid experimental methodology (Cohen, 1995) rather than on intuition, and to show

relevance to real-world applications rather than toy examples.

14 Some have characterized this change as a victory of the neats—those who think that AI theories should be grounded in

mathematical rigor—over the scruffies—those who would rather try out lots of ideas, write some programs, and then assess what

seems to be working. Both approaches are important. A shift toward neatness implies that the field has reached a level of stability and

maturity. The present emphasis on deep learning may represent a resurgence of the scruffies.

Shared benchmark problem sets became the norm for demonstrating progress, including the

UC Irvine repository for machine learning data sets, the International Planning Competition

for planning algorithms, the LibriSpeech corpus for speech recognition, the MNIST data set

for handwritten digit recognition, ImageNet and COCO for image object recognition,

SQUAD for natural language question answering, the WMT competition for machine

translation, and the International SAT Competitions for Boolean satisfiability solvers.

AI was founded in part as a rebellion against the limitations of existing fields like control

theory and statistics, but in this period it embraced the positive results of those fields. As

David McAllester (1998) put it:

In the early period of AI it seemed plausible that new forms of symbolic computation, e.g., frames

and semantic networks, made much of classical theory obsolete. This led to a form of isolationism in

which AI became largely separated from the rest of computer science. This isolationism is currently

being abandoned. There is a recognition that machine learning should not be isolated from

information theory, that uncertain reasoning should not be isolated from stochastic modeling, that

search should not be isolated from classical optimization and control, and that automated reasoning

should not be isolated from formal methods and static analysis.

The field of speech recognition illustrates the pattern. In the 1970s, a wide variety of

different architectures and approaches were tried. Many of these were rather ad hoc and

fragile, and worked on only a few carefully selected examples. In the 1980s, approaches

using hidden Markov models (HMMs) came to dominate the area. Two aspects of HMMs

are relevant. First, they are based on a rigorous mathematical theory. This allowed speech

researchers to build on several decades of mathematical results developed in other fields.

Second, they are generated by a process of training on a large corpus of real speech data.

This ensures that the performance is robust, and in rigorous blind tests HMMs improved

their scores steadily. As a result, speech technology and the related field of handwritten

character recognition made the transition to widespread industrial and consumer

applications. Note that there was no scientific claim that humans use HMMs to recognize

speech; rather, HMMs provided a mathematical framework for understanding and solving

the problem. We will see in Section 1.3.8, however, that deep learning has rather upset

this comfortable narrative.

Hidden Markov models

1988 was an important year for the connection between AI and other fields, including

statistics, operations research, decision theory, and control theory. Judea Pearl’s (1988)

Probabilistic Reasoning in Intelligent Systems led to a new acceptance of probability and

decision theory in AI. Pearl’s development of Bayesian networks yielded a rigorous and

efficient formalism for representing uncertain knowledge as well as practical algorithms for

probabilistic reasoning. Chapters 12 to 16 cover this area, in addition to more recent

developments that have greatly increased the expressive power of probabilistic formalisms;

Chapter 20 describes methods for learning Bayesian networks and related models from

data.

 

Bayesian network

A second major contribution in 1988 was Rich Sutton’s work connecting reinforcement

learning—which had been used in Arthur Samuel’s checker-playing program in the 1950s—

to the theory of Markov decision processes (MDPs) developed in the field of operations

research. A flood of work followed connecting AI planning research to MDPs, and the field

of reinforcement learning found applications in robotics and process control as well as

acquiring deep theoretical foundations.

One consequence of AI’s newfound appreciation for data, statistical modeling, optimization,

and machine learning was the gradual reunification of subfields such as computer vision,

robotics, speech recognition, multiagent systems, and natural language processing that had

become somewhat separate from core AI. The process of reintegration has yielded

significant benefits both in terms of applications—for example, the deployment of practical

robots expanded greatly during this period—and in a better theoretical understanding of the

core problems of AI.

1.3.7 Big data (2001–present)

Remarkable advances in computing power and the creation of the World Wide Web have

facilitated the creation of very large data sets—a phenomenon sometimes known as big

data. These data sets include trillions of words of text, billions of images, and billions of

hours of speech and video, as well as vast amounts of genomic data, vehicle tracking data,

clickstream data, social network data, and so on.

Big data

This has led to the development of learning algorithms specially designed to take advantage

of very large data sets. Often, the vast majority of examples in such data sets are unlabeled;

for example, in Yarowsky’s (1995) influential work on word-sense disambiguation,

occurrences of a word such as “plant” are not labeled in the data set to indicate whether they

refer to flora or factory. With large enough data sets, however, suitable learning algorithms

can achieve an accuracy of over 96% on the task of identifying which sense was intended in

a sentence. Moreover, Banko and Brill (2001) argued that the improvement in performance

obtained from increasing the size of the data set by two or three orders of magnitude

outweighs any improvement that can be obtained from tweaking the algorithm.

A similar phenomenon seems to occur in computer vision tasks such as filling in holes in

photographs—holes caused either by damage or by the removal of ex-friends. Hays and

Efros (2007) developed a clever method for doing this by blending in pixels from similar

images; they found that the technique worked poorly with a database of only thousands of

images but crossed a threshold of quality with millions of images. Soon after, the availability

of tens of millions of images in the ImageNet database (Deng et al., 2009) sparked a

revolution in the field of computer vision.

The availability of big data and the shift towards machine learning helped AI recover

commercial attractiveness (Havenstein, 2005; Halevy et al., 2009). Big data was a crucial

factor in the 2011 victory of IBM’s Watson system over human champions in the Jeopardy!

quiz game, an event that had a major impact on the public’s perception of AI.

1.3.8 Deep learning (2011–present)

The term deep learning refers to machine learning using multiple layers of simple,

adjustable computing elements. Experiments were carried out with such networks as far

back as the 1970s, and in the form of convolutional neural networks they found some

success in handwritten digit recognition in the 1990s (LeCun et al., 1995). It was not until

2011, however, that deep learning methods really took off. This occurred first in speech

recognition and then in visual object recognition.

Deep learning

In the 2012 ImageNet competition, which required classifying images into one of a thousand

categories (armadillo, bookshelf, corkscrew, etc.), a deep learning system created in

Geoffrey Hinton’s group at the University of Toronto (Krizhevsky et al., 2013) demonstrated

a dramatic improvement over previous systems, which were based largely on handcrafted

features. Since then, deep learning systems have exceeded human performance on some

vision tasks (and lag behind in some other tasks). Similar gains have been reported in

speech recognition, machine translation, medical diagnosis, and game playing. The use of a

deep network to represent the evaluation function contributed to ALPHAGO’s victories over

the leading human Go players (Silver et al., 2016, 2017, 2018).

These remarkable successes have led to a resurgence of interest in AI among students,

companies, investors, governments, the media, and the general public. It seems that every

week there is news of a new AI application approaching or exceeding human performance,

often accompanied by speculation of either accelerated success or a new AI winter.

Deep learning relies heavily on powerful hardware. Whereas a standard computer CPU can

do $10^{9}$ or $10^{10}$ operations per second, a deep learning algorithm running on specialized

hardware (e.g., GPU, TPU, or FPGA) might consume between $10^{14}$ and $10^{17}$ operations per

second, mostly in the form of highly parallelized matrix and vector operations. Of course,

deep learning also depends on the availability of large amounts of training data, and on a

few algorithmic tricks (see Chapter 21).

1.4 The State of the Art

AI Index

Stanford University’s One Hundred Year Study on AI (also known as AI100) convenes

panels of experts to provide reports on the state of the art in AI. Their 2016 report (Stone et

al., 2016; Grosz and Stone, 2018) concludes that “Substantial increases in the future uses of

AI applications, including more self-driving cars, healthcare diagnostics and targeted

treatment, and physical assistance for elder care can be expected” and that “Society is now at

a crucial juncture in determining how to deploy AI-based technologies in ways that promote

rather than hinder democratic values such as freedom, equality, and transparency.” AI100

also produces an AI Index at aiindex.org to help track progress. Some highlights from the

2018 and 2019 reports (comparing to a year 2000 baseline unless otherwise stated):

Publications: AI papers increased 20-fold between 2010 and 2019 to about 20,000 a year.

The most popular category was machine learning. (Machine learning papers in arXiv.org

doubled every year from 2009 to 2017.) Computer vision and natural language

processing were the next most popular.

Sentiment: About 70% of news articles on AI are neutral, but articles with positive tone

increased from 12% in 2016 to 30% in 2018. The most common issues are ethical: data

privacy and algorithm bias.

Students: Course enrollment increased 5-fold in the U.S. and 16-fold internationally

from a 2010 baseline. AI is the most popular specialization in Computer Science.

Diversity: AI Professors worldwide are about 80% male, 20% female. Similar numbers

hold for Ph.D. students and industry hires.

Conferences: Attendance at NeurIPS increased 800% since 2012 to 13,500 attendees.

Other conferences are seeing annual growth of about 30%.

Industry: AI startups in the U.S. increased 20-fold to over 800.

Internationalization: China publishes more papers per year than the U.S. and about as

many as all of Europe. However, in citation-weighted impact, U.S. authors are 50%

ahead of Chinese authors. Singapore, Brazil, Australia, Canada, and India are the fastest

growing countries in terms of the number of AI hires.

Vision: Error rates for object detection (as achieved in LSVRC, the Large-Scale Visual

Recognition Challenge) improved from 28% in 2010 to 2% in 2017, exceeding human

performance. Accuracy on open-ended visual question answering (VQA) improved from

55% to 68% since 2015, but lags behind human performance at 83%.

Speed: Training time for the image recognition task dropped by a factor of 100 in just

the past two years. The amount of computing power used in top AI applications is

doubling every 3.4 months.

Language: Accuracy on question answering, as measured by F1 score on the Stanford

Question Answering Dataset (SQUAD), increased from 60 to 95 from 2015 to 2019; on

the SQUAD 2 variant, progress was faster, going from 62 to 90 in just one year. Both

scores exceed human-level performance.

Human benchmarks: By 2019, AI systems had reportedly met or exceeded human-level

performance in chess, Go, poker, Pac-Man, Jeopardy!, ImageNet object detection,

speech recognition in a limited domain, Chinese-to-English translation in a restricted

domain, Quake III, Dota 2, StarCraft II, various Atari games, skin cancer detection,

prostate cancer detection, protein folding, and diabetic retinopathy diagnosis.

When (if ever) will AI systems achieve human-level performance across a broad variety of

tasks? Ford (2018) interviews AI experts and finds a wide range of target years, from 2029 to

2200, with a mean of 2099. In a similar survey (Grace et al., 2017) 50% of respondents

thought this could happen by 2066, although 10% thought it could happen as early as 2025,

and a few said “never.” The experts were also split on whether we need fundamental new

breakthroughs or just refinements on current approaches. But don’t take their predictions

too seriously; as Philip Tetlock (2017) demonstrates in the area of predicting world events,

experts are no better than amateurs.

How will future AI systems operate? We can’t yet say. As detailed in this section, the field

has adopted several stories about itself—first the bold idea that intelligence by a machine

was even possible, then that it could be achieved by encoding expert knowledge into logic,

then that probabilistic models of the world would be the main tool, and most recently that

machine learning would induce models that might not be based on any well-understood

theory at all. The future will reveal what model comes next.

What can AI do today? Perhaps not as much as some of the more optimistic media articles

might lead one to believe, but still a great deal. Here are some examples:

ROBOTIC VEHICLES: The history of robotic vehicles stretches back to radio-controlled

cars of the 1920s, but the first demonstrations of autonomous road driving without special

guides occurred in the 1980s (Kanade et al., 1986; Dickmanns and Zapp, 1987). After

successful demonstrations of driving on dirt roads in the 132-mile DARPA Grand Challenge

in 2005 (Thrun, 2006) and on streets with traffic in the 2007 Urban Challenge, the race to

develop self-driving cars began in earnest. In 2018, Waymo test vehicles passed the

landmark of 10 million miles driven on public roads without a serious accident, with the

human driver stepping in to take over control only once every 6,000 miles. Soon after, the

company began offering a commercial robotic taxi service.

In the air, autonomous fixed-wing drones have been providing cross-country blood

deliveries in Rwanda since 2016. Quadcopters perform remarkable aerobatic maneuvers,

explore buildings while constructing 3-D maps, and self-assemble into autonomous

formations.

LEGGED LOCOMOTION: BigDog, a quadruped robot by Raibert et al. (2008), upended our

notions of how robots move—no longer the slow, stiff-legged, side-to-side gait of Hollywood

movie robots, but something closely resembling an animal and able to recover when shoved

or when slipping on an icy puddle. Atlas, a humanoid robot, not only walks on uneven

terrain but jumps onto boxes and does backflips (Ackerman and Guizzo, 2016).

AUTONOMOUS PLANNING AND SCHEDULING: A hundred million miles from Earth,

NASA’s Remote Agent program became the first on-board autonomous planning program to

control the scheduling of operations for a spacecraft (Jonsson et al., 2000). Remote Agent

generated plans from high-level goals specified from the ground and monitored the

execution of those plans—detecting, diagnosing, and recovering from problems as they

occurred. Today, the EUROPA planning toolkit (Barreiro et al., 2012) is used for daily

operations of NASA’s Mars rovers and the SEXTANT system (Winternitz, 2017) allows

autonomous navigation in deep space, beyond the global GPS system.

During the Persian Gulf crisis of 1991, U.S. forces deployed a Dynamic Analysis and

Replanning Tool, DART (Cross and Walker, 1994), to do automated logistics planning and

scheduling for transportation. This involved up to 50,000 vehicles, cargo, and people at a

time, and had to account for starting points, destinations, routes, transport capacities, port

and airfield capacities, and conflict resolution among all parameters. The Defense Advanced

Research Projects Agency (DARPA) stated that this single application more than paid back

DARPA’s 30-year investment in AI.

Every day, ride-hailing companies such as Uber and mapping services such as Google Maps

provide driving directions for hundreds of millions of users, quickly plotting an optimal

route taking into account current and predicted future traffic conditions.

MACHINE TRANSLATION: Online machine translation systems now enable the reading of

documents in over 100 languages, including the native languages of over 99% of humans,

and render hundreds of billions of words per day for hundreds of millions of users. While

not perfect, they are generally adequate for understanding. For closely related languages

with a great deal of training data (such as French and English) translations within a narrow

domain are close to the level of a human (Wu et al., 2016b).

SPEECH RECOGNITION: In 2017, Microsoft showed that its Conversational Speech

Recognition System had reached a word error rate of 5.1%, matching human performance

on the Switchboard task, which involves transcribing telephone conversations (Xiong et al.,

2017). About a third of computer interaction worldwide is now done by voice rather than

keyboard; Skype provides real-time speech-to-speech translation in ten languages. Alexa,

Siri, Cortana, and Google offer assistants that can answer questions and carry out tasks for

the user; for example, the Google Duplex service uses speech recognition and speech

synthesis to make restaurant reservations for users, carrying out a fluent conversation on

their behalf.

RECOMMENDATIONS: Companies such as Amazon, Facebook, Netflix, Spotify, YouTube,

Walmart, and others use machine learning to recommend what you might like based on

your past experiences and those of others like you. The field of recommender systems has a

long history (Resnick and Varian, 1997) but is changing rapidly due to new deep learning

methods that analyze content (text, music, video) as well as history and metadata (van den

Oord et al., 2014; Zhang et al., 2017). Spam filtering can also be considered a form of

recommendation (or dis-recommendation); current AI techniques filter out over 99.9% of

spam, and email services can also recommend potential recipients, as well as possible

response text.

GAME PLAYING: When Deep Blue defeated world chess champion Garry Kasparov in 1997,

defenders of human supremacy placed their hopes on Go. Piet Hut, an astrophysicist and

Go enthusiast, predicted that it would take “a hundred years before a computer beats

humans at Go—maybe even longer.” But just 20 years later, ALPHAGO surpassed all human

players (Silver et al., 2017). Ke Jie, the world champion, said, “Last year, it was still quite

human-like when it played. But this year, it became like a god of Go.” ALPHAGO benefited

from studying hundreds of thousands of past games by human Go players, and from the

distilled knowledge of expert Go players who worked on the team.

A followup program, ALPHAZERO, used no input from humans (except for the rules of the

game), and was able to learn through self-play alone to defeat all opponents, human and

machine, at Go, chess, and shogi (Silver et al., 2018). Meanwhile, human champions have

been beaten by AI systems at games as diverse as Jeopardy! (Ferrucci et al., 2010), poker

(Bowling et al., 2015; Moravčík et al., 2017; Brown and Sandholm, 2019), and the video

games Dota 2 (Fernandez and Mahlmann, 2018), StarCraft II (Vinyals et al., 2019), and

Quake III (Jaderberg et al., 2019).

IMAGE UNDERSTANDING: Not content with exceeding human accuracy on the

challenging ImageNet object recognition task, computer vision researchers have taken on

the more difficult problem of image captioning. Some impressive examples include “A

person riding a motorcycle on a dirt road,” “Two pizzas sitting on top of a stove top oven,”

and “A group of young people playing a game of frisbee” (Vinyals et al., 2017b). Current

systems are far from perfect, however: a “refrigerator filled with lots of food and drinks”

turns out to be a no-parking sign partially obscured by lots of small stickers.

MEDICINE: AI algorithms now equal or exceed expert doctors at diagnosing many

conditions, particularly when the diagnosis is based on images. Examples include

Alzheimer’s disease (Ding et al., 2018), metastatic cancer (Liu et al., 2017; Esteva et al., 2017),

ophthalmic disease (Gulshan et al., 2016), and skin diseases (Liu et al., 2019c). A systematic

review and meta-analysis (Liu et al., 2019a) found that the performance of AI programs, on

average, was equivalent to that of health care professionals. One current emphasis in medical AI is

in facilitating human–machine partnerships. For example, the LYNA system achieves 99.6%

overall accuracy in diagnosing metastatic breast cancer, better than an unaided human

expert, but the combination does better still (Liu et al., 2018; Steiner et al., 2018).

The widespread adoption of these techniques is now limited not by diagnostic accuracy but

by the need to demonstrate improvement in clinical outcomes and to ensure transparency,

lack of bias, and data privacy (Topol, 2019). In 2017, only two medical AI applications were

approved by the FDA, but that increased to 12 in 2018, and continues to rise.

CLIMATE SCIENCE: A team of scientists won the 2018 Gordon Bell Prize for a deep

learning model that discovers detailed information about extreme weather events that were

previously buried in climate data. They used a supercomputer with specialized GPU

hardware to exceed the exaop level ($10^{18}$ operations per second), the first machine learning

program to do so (Kurth et al., 2018). Rolnick et al. (2019) present a 60-page catalog of ways

in which machine learning can be used to tackle climate change.

These are just a few examples of artificial intelligence systems that exist today. Not magic or

science fiction—but rather science, engineering, and mathematics, to which this book

provides an introduction.

1.5 Risks and Benefits of AI

Francis Bacon, a philosopher credited with creating the scientific method, noted in The

Wisdom of the Ancients (1609) that the “mechanical arts are of ambiguous use, serving as well

for hurt as for remedy.” As AI plays an increasingly important role in the economic, social,

scientific, medical, financial, and military spheres, we would do well to consider the hurts

and remedies—in modern parlance, the risks and benefits—that it can bring. The topics

summarized here are covered in greater depth in Chapters 27 and 28.

To begin with the benefits: put simply, our entire civilization is the product of our human

intelligence. If we have access to substantially greater machine intelligence, the ceiling on

our ambitions is raised substantially. The potential for AI and robotics to free humanity from

menial repetitive work and to dramatically increase the production of goods and services

could presage an era of peace and plenty. The capacity to accelerate scientific research could

result in cures for disease and solutions for climate change and resource shortages. As

Demis Hassabis, CEO of Google DeepMind, has suggested: “First solve AI, then use AI to

solve everything else.”

Long before we have an opportunity to “solve AI,” however, we will incur risks from the

misuse of AI, inadvertent or otherwise. Some of these are already apparent, while others

seem likely based on current trends:

LETHAL AUTONOMOUS WEAPONS: These are defined by the United Nations as

weapons that can locate, select, and eliminate human targets without human

intervention. A primary concern with such weapons is their scalability: the absence of a

requirement for human supervision means that a small group can deploy an arbitrarily

large number of weapons against human targets defined by any feasible recognition

criterion. The technologies needed for autonomous weapons are similar to those needed

for self-driving cars. Informal expert discussions on the potential risks of lethal

autonomous weapons began at the UN in 2014, moving to the formal pre-treaty stage of

a Group of Governmental Experts in 2017.

SURVEILLANCE AND PERSUASION: While it is expensive, tedious, and sometimes

legally questionable for security personnel to monitor phone lines, video camera feeds,

emails, and other messaging channels, AI (speech recognition, computer vision, and

 

natural language understanding) can be used in a scalable fashion to perform mass

surveillance of individuals and detect activities of interest. By tailoring information flows

to individuals through social media, based on machine learning techniques, political

behavior can be modified and controlled to some extent—a concern that became

apparent in elections beginning in 2016.

BIASED DECISION MAKING: Careless or deliberate misuse of machine learning

algorithms for tasks such as evaluating parole and loan applications can result in

decisions that are biased by race, gender, or other protected categories. Often, the data

themselves reflect pervasive bias in society.

IMPACT ON EMPLOYMENT: Concerns about machines eliminating jobs are centuries

old. The story is never simple: machines do some of the tasks that humans might

otherwise do, but they also make humans more productive and therefore more

employable, and make companies more profitable and therefore able to pay higher

wages. They may render some activities economically viable that would otherwise be

impractical. Their use generally results in increasing wealth but tends to have the effect

of shifting wealth from labor to capital, further exacerbating increases in inequality.

Previous advances in technology—such as the invention of mechanical looms—have

resulted in serious disruptions to employment, but eventually people find new kinds of

work to do. On the other hand, it is possible that AI will be doing those new kinds of

work too. This topic is rapidly becoming a major focus for economists and governments

around the world.

SAFETY-CRITICAL APPLICATIONS: As AI techniques advance, they are increasingly

used in high-stakes, safety-critical applications such as driving cars and managing the

water supplies of cities. Fatal accidents have already occurred and highlight the difficulty

of formal verification and statistical risk analysis for systems developed using machine

learning techniques. The field of AI will need to develop technical and ethical standards

at least comparable to those prevalent in other engineering and healthcare disciplines

where people’s lives are at stake.

CYBERSECURITY: AI techniques are useful in defending against cyberattack, for

example by detecting unusual patterns of behavior, but they will also contribute to the

potency, survivability, and proliferation capability of malware. For example,

reinforcement learning methods have been used to create highly effective tools for

automated, personalized blackmail and phishing attacks.

We will revisit these topics in more depth in Section 27.3. As AI systems become more

capable, they will take on more of the societal roles previously played by humans. Just as

humans have used these roles in the past to perpetrate mischief, we can expect that humans

may misuse AI systems in these roles to perpetrate even more mischief. All of the examples

given above point to the importance of governance and, eventually, regulation. At present,

the research community and the major corporations involved in AI research have developed

voluntary self-governance principles for AI-related activities (see Section 27.3).

Governments and international organizations are setting up advisory bodies to devise

appropriate regulations for each specific use case, to prepare for the economic and social

impacts, and to take advantage of AI capabilities to address major societal problems.

What of the longer term? Will we achieve the long-standing goal: the creation of intelligence

comparable to or more capable than human intelligence? And, if we do, what then?

Human-level AI

Artificial general intelligence (AGI)

For much of AI’s history, these questions have been overshadowed by the daily grind of

getting AI systems to do anything even remotely intelligent. As with any broad discipline,

the great majority of AI researchers have specialized in a specific subfield such as game-

playing, knowledge representation, vision, or natural language understanding—often on the

assumption that progress in these subfields would contribute to the broader goals of AI. Nils

Nilsson (1995), one of the original leaders of the Shakey project at SRI, reminded the field of

those broader goals and warned that the subfields were in danger of becoming ends in

themselves. Later, some influential founders of AI, including John McCarthy (2007), Marvin

Minsky (2007), and Patrick Winston (Beal and Winston, 2009), concurred with Nilsson’s

warnings, suggesting that instead of focusing on measurable performance in specific

applications, AI should return to its roots of striving for, in Herb Simon’s words, “machines

that think, that learn and that create.” They called the effort human-level AI or HLAI—a

machine should be able to learn to do anything a human can do. Their first symposium was

in 2004 (Minsky et al., 2004). Another effort with similar goals, the artificial general

intelligence (AGI) movement (Goertzel and Pennachin, 2007), held its first conference and

organized the Journal of Artificial General Intelligence in 2008.

Artificial superintelligence (ASI)

At around the same time, concerns were raised that creating artificial superintelligence or

ASI—intelligence that far surpasses human ability—might be a bad idea (Yudkowsky, 2008;

Omohundro, 2008). Turing (1996) himself made the same point in a lecture given in

Manchester in 1951, drawing on earlier ideas from Samuel Butler (1863):

It seems probable that once the machine thinking method had started, it would not take long to

outstrip our feeble powers. ... At some stage therefore we should have to expect the machines to take

control, in the way that is mentioned in Samuel Butler’s Erewhon.

15 Even earlier, in 1847, Richard Thornton, editor of the Primitive Expounder, railed against mechanical calculators: “Mind ... outruns

itself and does away with the necessity of its own existence by inventing machines to do its own thinking. ... But who knows that such

machines when brought to greater perfection, may not think of a plan to remedy all their own defects and then grind out ideas beyond

the ken of mortal mind!”

These concerns have only become more widespread with recent advances in deep learning,

the publication of books such as Superintelligence by Nick Bostrom (2014), and public

pronouncements from Stephen Hawking, Bill Gates, Martin Rees, and Elon Musk.

Experiencing a general sense of unease with the idea of creating superintelligent machines is

only natural. We might call this the gorilla problem: about seven million years ago, a now-

extinct primate evolved, with one branch leading to gorillas and one to humans. Today, the

gorillas are not too happy about the human branch; they have essentially no control over

their future. If this is the result of success in creating superhuman AI—that humans cede

control over their future—then perhaps we should stop work on AI, and, as a corollary, give

up the benefits it might bring. This is the essence of Turing’s warning: it is not obvious that

we can control machines that are more intelligent than us.

Gorilla problem

If superhuman AI were a black box that arrived from outer space, then indeed it would be

wise to exercise caution in opening the box. But it is not: we design the AI systems, so if

they do end up “taking control,” as Turing suggests, it would be the result of a design failure.

To avoid such an outcome, we need to understand the source of potential failure. Norbert

Wiener (1960), who was motivated to consider the long-term future of AI after seeing

Arthur Samuel’s checker-playing program learn to beat its creator, had this to say:

If we use, to achieve our purposes, a mechanical agency with whose operation we cannot interfere

effectively ... we had better be quite sure that the purpose put into the machine is the purpose which

we really desire.

Many cultures have myths of humans who ask gods, genies, magicians, or devils for

something. Invariably, in these stories, they get what they literally ask for, and then regret it.

The third wish, if there is one, is to undo the first two. We will call this the King Midas

problem: Midas, a legendary King in Greek mythology, asked that everything he touched

should turn to gold, but then regretted it after touching his food, drink, and family

members.

16 Midas would have done better if he had followed basic principles of safety and included an “undo” button and a “pause” button in

his wish.

King Midas problem

We touched on this issue in Section 1.1.5, where we pointed out the need for a significant

modification to the standard model of putting fixed objectives into the machine. The

solution to Wiener’s predicament is not to have a definite “purpose put into the machine” at

all. Instead, we want machines that strive to achieve human objectives but know that they

don’t know for certain exactly what those objectives are.

It is perhaps unfortunate that almost all AI research to date has been carried out within the

standard model, which means that almost all of the technical material in this edition reflects

that intellectual framework. There are, however, some early results within the new

framework. In Chapter 16, we show that a machine has a positive incentive to allow itself

to be switched off if and only if it is uncertain about the human objective. In Chapter 18,

we formulate and study assistance games, which describe mathematically the situation in

which a human has an objective and a machine tries to achieve it, but is initially uncertain

about what it is. In Chapter 22, we explain the methods of inverse reinforcement learning

that allow machines to learn more about human preferences from observations of the

choices that humans make. In Chapter 27, we explore two of the principal difficulties: first,

that our choices depend on our preferences through a very complex cognitive architecture

that is hard to invert; and, second, that we humans may not have consistent preferences in

the first place—either individually or as a group—so it may not be clear what AI systems

should be doing for us.
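The switch-off claim above can be made concrete with a toy calculation. This is only a sketch under assumptions that are not stated in this section: the machine holds a discrete belief over the utility U that a proposed action has for the human, switching off is worth 0, and a human who is asked will permit the action exactly when U is positive. The function names and the example beliefs are hypothetical, and this is not presented as the book's own formulation.

```python
# Toy switch-off calculation (illustrative assumptions, see the lead-in).
# A belief is a list of (probability, utility) pairs for the proposed action.

def expected(belief, f=lambda u: u):
    """Expected value of f(utility) under the belief."""
    return sum(p * f(u) for p, u in belief)


def incentive_to_defer(belief):
    """Value of deferring to the human minus the best value without deferring."""
    act_now = expected(belief)        # carry out the action unilaterally
    switch_off = 0.0                  # do nothing at all
    # Assumed human behavior: allow the action only when its utility is positive.
    defer = expected(belief, lambda u: max(u, 0.0))
    return defer - max(act_now, switch_off)


certain = [(1.0, 2.0)]                  # machine is sure the action is good
uncertain = [(0.5, 2.0), (0.5, -3.0)]   # machine is unsure about the sign

print(incentive_to_defer(certain))
print(incentive_to_defer(uncertain))
```

In this toy model the incentive is zero when the machine is certain and strictly positive when its belief puts weight on both good and bad outcomes, because the human's veto removes only the bad cases.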

Assistance game

Inverse reinforcement learning

Summary

This chapter defines AI and establishes the cultural background against which it has

developed. Some of the important points are as follows:

Different people approach AI with different goals in mind. Two important questions to

ask are: Are you concerned with thinking, or behavior? Do you want to model humans,

or try to achieve the optimal results?

According to what we have called the standard model, AI is concerned mainly with

rational action. An ideal intelligent agent takes the best possible action in a situation.

We study the problem of building agents that are intelligent in this sense.

Two refinements to this simple idea are needed: first, the ability of any agent, human or

otherwise, to choose rational actions is limited by the computational intractability of

doing so; second, the concept of a machine that pursues a definite objective needs to be

replaced with that of a machine pursuing objectives to benefit humans, but uncertain as

to what those objectives are.

Philosophers (going back to 400 BCE) made AI conceivable by suggesting that the mind is

in some ways like a machine, that it operates on knowledge encoded in some internal

language, and that thought can be used to choose what actions to take.

Mathematicians provided the tools to manipulate statements of logical certainty as well

as uncertain, probabilistic statements. They also set the groundwork for understanding

computation and reasoning about algorithms.

Economists formalized the problem of making decisions that maximize the expected

utility to the decision maker.

Neuroscientists discovered some facts about how the brain works and the ways in which

it is similar to and different from computers.

Psychologists adopted the idea that humans and animals can be considered

information-processing machines. Linguists showed that language use fits into this

model.

Computer engineers provided the ever-more-powerful machines that make AI

applications possible, and software engineers made them more usable.

Control theory deals with designing devices that act optimally on the basis of feedback

from the environment. Initially, the mathematical tools of control theory were quite

different from those used in AI, but the fields are coming closer together.

The history of AI has had cycles of success, misplaced optimism, and resulting cutbacks

in enthusiasm and funding. There have also been cycles of introducing new, creative

approaches and systematically refining the best ones.

AI has matured considerably compared to its early decades, both theoretically and

methodologically. As the problems that AI deals with became more complex, the field

moved from Boolean logic to probabilistic reasoning, and from hand-crafted knowledge

to machine learning from data. This has led to improvements in the capabilities of real

systems and greater integration with other disciplines.

As AI systems find application in the real world, it has become necessary to consider a

wide range of risks and ethical consequences.

In the longer term, we face the difficult problem of controlling superintelligent AI

systems that may evolve in unpredictable ways. Solving this problem seems to

necessitate a change in our conception of AI.

Bibliographical and Historical Notes

A comprehensive history of AI is given by Nils Nilsson (2009), one of the early pioneers of

the field. Pedro Domingos (2015) and Melanie Mitchell (2019) give overviews of machine

learning for a general audience, and Kai-Fu Lee (2018) describes the race for international

leadership in AI. Martin Ford (2018) interviews 23 leading AI researchers.

The main professional societies for AI are the Association for the Advancement of Artificial

Intelligence (AAAI), the ACM Special Interest Group in Artificial Intelligence (SIGAI,

formerly SIGART), the European Association for AI, and the Society for Artificial

Intelligence and Simulation of Behaviour (AISB). The Partnership on AI brings together

many commercial and nonprofit organizations concerned with the ethical and social impacts

of AI. AAAI’s AI Magazine contains many topical and tutorial articles, and its Web site,

aaai.org, contains news, tutorials, and background information.

The most recent work appears in the proceedings of the major AI conferences: the

International Joint Conference on AI (IJCAI), the annual European Conference on AI

(ECAI), and the AAAI Conference. Machine learning is covered by the International

Conference on Machine Learning and the Neural Information Processing Systems (NeurIPS)

meeting. The major journals for general AI are Artificial Intelligence, Computational

Intelligence, the IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Intelligent

Systems, and the Journal of Artificial Intelligence Research. There are also many conferences

and journals devoted to specific areas, which we cover in the appropriate chapters.

Chapter 2

Intelligent Agents

In which we discuss the nature of agents, perfect or otherwise, the diversity of environments, and the

resulting menagerie of agent types.

Chapter 1 identified the concept of rational agents as central to our approach to artificial

intelligence. In this chapter, we make this notion more concrete. We will see that the

concept of rationality can be applied to a wide variety of agents operating in any imaginable

environment. Our plan in this book is to use this concept to develop a small set of design

principles for building successful agents—systems that can reasonably be called intelligent.

We begin by examining agents, environments, and the coupling between them. The

observation that some agents behave better than others leads naturally to the idea of a

rational agent—one that behaves as well as possible. How well an agent can behave

depends on the nature of the environment; some environments are more difficult than

others. We give a crude categorization of environments and show how properties of an

environment influence the design of suitable agents for that environment. We describe a

number of basic “skeleton” agent designs, which we flesh out in the rest of the book.

2.1 Agents and Environments

An agent is anything that can be viewed as perceiving its environment through sensors and

acting upon that environment through actuators. This simple idea is illustrated in Figure

2.1. A human agent has eyes, ears, and other organs for sensors and hands, legs, vocal

tract, and so on for actuators. A robotic agent might have cameras and infrared range finders

for sensors and various motors for actuators. A software agent receives file contents,

network packets, and human input (keyboard/mouse/touchscreen/voice) as sensory inputs

and acts on the environment by writing files, sending network packets, and displaying

information or generating sounds. The environment could be everything—the entire

universe! In practice it is just that part of the universe whose state we care about when

designing this agent—the part that affects what the agent perceives and that is affected by

the agent’s actions.

Figure 2.1

Agents interact with environments through sensors and actuators.

Environment

Sensor

Actuator

We use the term percept to refer to the content an agent’s sensors are perceiving. An agent’s

percept sequence is the complete history of everything the agent has ever perceived. In

general, an agent’s choice of action at any given instant can depend on its built-in knowledge and

on the entire percept sequence observed to date, but not on anything it hasn’t perceived. By

specifying the agent’s choice of action for every possible percept sequence, we have said

more or less everything there is to say about the agent. Mathematically speaking, we say

that an agent’s behavior is described by the agent function that maps any given percept

sequence to an action.

Percept

Percept sequence

Agent function

We can imagine tabulating the agent function that describes any given agent; for most

agents, this would be a very large table, infinite in fact, unless we place a bound on the

length of percept sequences we want to consider. Given an agent to experiment with, we

can, in principle, construct this table by trying out all possible percept sequences and

recording which actions the agent does in response. The table is, of course, an external

characterization of the agent. Internally, the agent function for an artificial agent will be

implemented by an agent program. It is important to keep these two ideas distinct. The

agent function is an abstract mathematical description; the agent program is a concrete

implementation, running within some physical system.

1 If the agent uses some randomization to choose its actions, then we would have to try each sequence many times to identify the

probability of each action. One might imagine that acting randomly is rather silly, but we show later in this chapter that it can be very

intelligent.

Agent program

To illustrate these ideas, we use a simple example—the vacuum-cleaner world, which

consists of a robotic vacuum-cleaning agent in a world consisting of squares that can be

either dirty or clean. Figure 2.2 shows a configuration with just two squares, A and B. The

vacuum agent perceives which square it is in and whether there is dirt in the square. The

agent starts in square A. The available actions are to move to the right, move to the left,

suck up the dirt, or do nothing. One very simple agent function is the following: if the

current square is dirty, then suck; otherwise, move to the other square. A partial tabulation

of this agent function is shown in Figure 2.3 and an agent program that implements it

appears in Figure 2.8 on page 49.
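
The simple agent function just described can be sketched as a short program. The Python below is only an illustration and is not claimed to match the program of Figure 2.8: the function name reflex_vacuum_agent, the encoding of a percept as a (location, status) pair, and the action strings are all assumptions made for this sketch.

```python
def reflex_vacuum_agent(percept):
    """Return an action for the two-square vacuum world.

    percept is assumed to be a (location, status) pair, with location
    in {"A", "B"} and status in {"Clean", "Dirty"}.
    """
    location, status = percept
    if status == "Dirty":
        return "Suck"
    # The current square is clean, so move to the other square.
    return "Right" if location == "A" else "Left"


if __name__ == "__main__":
    for percept in [("A", "Dirty"), ("A", "Clean"), ("B", "Dirty"), ("B", "Clean")]:
        print(percept, "->", reflex_vacuum_agent(percept))
```

Note that this program looks only at the current percept, whereas an agent function in general is defined over the whole percept sequence.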

2 In a real robot, it would be unlikely to have actions like “move right” and “move left.” Instead the actions would be “spin wheels

forward” and “spin wheels backward.” We have chosen the actions to be easier to follow on the page, not for ease of implementation

in an actual robot.

Figure 2.2


A vacuum-cleaner world with just two locations. Each location can be clean or dirty, and the agent can

move left or right and can clean the square that it occupies. Different versions of the vacuum world allow

for different rules about what the agent can perceive, whether its actions always succeed, and so on.

Figure 2.3

Partial tabulation of a simple agent function for the vacuum-cleaner world shown in Figure 2.2. The

agent cleans the current square if it is dirty; otherwise it moves to the other square. Note that the table is

of unbounded size unless there is a restriction on the length of possible percept sequences.
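
To make the remark about table size concrete, the following sketch enumerates percept sequences up to a chosen length and records the action the simple agent takes for each. It is a rough illustration under stated assumptions: the names PERCEPTS, simple_agent_function, and tabulate are hypothetical, and the enumeration includes every syntactically possible sequence, even ones the environment could never actually produce. With 4 possible percepts there are 4^t sequences of length t, so a table bounded at length T has 4 + 4^2 + ... + 4^T rows.

```python
from itertools import product

# The four possible percepts in the two-square world (assumed encoding).
PERCEPTS = [(loc, status) for loc in ("A", "B") for status in ("Clean", "Dirty")]


def simple_agent_function(percept_sequence):
    """Agent function: maps a whole percept sequence to an action.

    This particular agent ignores everything except the latest percept.
    """
    location, status = percept_sequence[-1]
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"


def tabulate(max_length):
    """Return {percept_sequence: action} for all sequences up to max_length."""
    table = {}
    for length in range(1, max_length + 1):
        for seq in product(PERCEPTS, repeat=length):
            table[seq] = simple_agent_function(seq)
    return table


if __name__ == "__main__":
    for bound in (1, 2, 3):
        print(bound, len(tabulate(bound)))  # 4, 20, and 84 rows
```

The row count grows exponentially with the bound, which is why the tabulation is useful as an external description but not as a practical implementation.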

Looking at Figure 2.3, we see that various vacuum-world agents can be defined simply by

filling in the right-hand column in various ways. The obvious question, then, is this: What is

the right way to fill out the table? In other words, what makes an agent good or bad, intelligent

or stupid? We answer these questions in the next section.

Before closing this section, we should emphasize that the notion of an agent is meant to be a

tool for analyzing systems, not an absolute characterization that divides the world into

agents and non-agents. One could view a hand-held calculator as an agent that chooses the

action of displaying “4” when given the percept sequence “2 + 2 =,” but such an analysis

would hardly aid our understanding of the calculator. In a sense, all areas of engineering can

be seen as designing artifacts that interact with the world; AI operates at (what the authors

consider to be) the most interesting end of the spectrum, where the artifacts have significant

computational resources and the task environment requires nontrivial decision making.

2.2 Good Behavior: The Concept of Rationality

A rational agent is one that does the right thing. Obviously, doing the right thing is better

than doing the wrong thing, but what does it mean to do the right thing?

Rational agent

2.2.1 Performance measures

Moral philosophy has developed several different notions of the “right thing,” but AI has

generally stuck to one notion called consequentialism: we evaluate an agent’s behavior by

its consequences. When an agent is plunked down in an environment, it generates a

sequence of actions according to the percepts it receives. This sequence of actions causes

the environment to go through a sequence of states. If the sequence is desirable, then the

agent has performed well. This notion of desirability is captured by a performance measure

that evaluates any given sequence of environment states.

Consequentialism

Performance measure

Humans have desires and preferences of their own, so the notion of rationality as applied to

humans has to do with their success in choosing actions that produce sequences of

environment states that are desirable from their point of view. Machines, on the other hand,

do not have desires and preferences of their own; the performance measure is, initially at

least, in the mind of the designer of the machine, or in the mind of the users the machine is

designed for. We will see that some agent designs have an explicit representation of (a

version of) the performance measure, while in other designs the performance measure is

entirely implicit—the agent may do the right thing, but it doesn’t know why.

Recalling Norbert Wiener’s warning to ensure that “the purpose put into the machine is the

purpose which we really desire” (page 33), notice that it can be quite hard to formulate a

performance measure correctly. Consider, for example, the vacuum-cleaner agent from the

preceding section. We might propose to measure performance by the amount of dirt cleaned

up in a single eight-hour shift. With a rational agent, of course, what you ask for is what you

get. A rational agent can maximize this performance measure by cleaning up the dirt, then

dumping it all on the floor, then cleaning it up again, and so on. A more suitable

performance measure would reward the agent for having a clean floor. For example, one

point could be awarded for each clean square at each time step (perhaps with a penalty for

electricity consumed and noise generated). As a general rule, it is better to design performance

measures according to what one actually wants to be achieved in the environment, rather than

according to how one thinks the agent should behave.
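
The difference between the two performance measures can be checked on a small example. The sketch below is illustrative only: the state encoding (a pair of booleans saying whether A and B are clean), the function names, and the two hand-written state histories are assumptions, not data from the text. Both measures are functions of a sequence of environment states, as the definition requires.

```python
def dirt_cleaned(states):
    """Flawed measure: count squares that change from dirty to clean."""
    total = 0
    for before, after in zip(states, states[1:]):
        total += sum(1 for b, a in zip(before, after) if not b and a)
    return total


def clean_square_steps(states):
    """Better measure: one point per clean square at each time step."""
    return sum(sum(1 for clean in state if clean) for state in states)


# Each state is (A_is_clean, B_is_clean); both histories have 7 states.
# Honest history: clean A, move, clean B, then the floor stays clean.
honest = [(False, False), (True, False), (True, False), (True, True),
          (True, True), (True, True), (True, True)]
# Cheating history: clean A, dump the dirt back, clean it again, and so on.
cheater = [(False, False), (True, False), (False, False), (True, False),
           (False, False), (True, False), (False, False)]

if __name__ == "__main__":
    print(dirt_cleaned(honest), dirt_cleaned(cheater))              # 2 3
    print(clean_square_steps(honest), clean_square_steps(cheater))  # 10 3
```

Under the first measure the cheating history scores higher; under the second, the history that actually leaves the floor clean wins.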

Even when the obvious pitfalls are avoided, some knotty problems remain. For example, the

notion of “clean floor” in the preceding paragraph is based on average cleanliness over time.

Yet the same average cleanliness can be achieved by two different agents, one of which does

a mediocre job all the time while the other cleans energetically but takes long breaks. Which

is preferable might seem to be a fine point of janitorial science, but in fact it is a deep

philosophical question with far-reaching implications. Which is better—a reckless life of

highs and lows, or a safe but humdrum existence? Which is better—an economy where

everyone lives in moderate poverty, or one in which some live in plenty while others are

very poor? We leave these questions as an exercise for the diligent reader.

For most of the book, we will assume that the performance measure can be specified

correctly. For the reasons given above, however, we must accept the possibility that we

might put the wrong purpose into the machine—precisely the King Midas problem described

on page 33. Moreover, when designing one piece of software, copies of which will belong to

different users, we cannot anticipate the exact preferences of each individual user. Thus, we

may need to build agents that reflect initial uncertainty about the true performance measure

and learn more about it as time goes by; such agents are described in Chapters 16, 18,

and 22.

 


2.2.2 Rationality

What is rational at any given time depends on four things:

The performance measure that defines the criterion of success.

The agent’s prior knowledge of the environment.

The actions that the agent can perform.

The agent’s percept sequence to date.

This leads to a definition of a rational agent:

For each possible percept sequence, a rational agent should select an action that is expected to

maximize its performance measure, given the evidence provided by the percept sequence and

whatever built-in knowledge the agent has.

Definition of a rational agent
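
One way to read this definition is as a maximization of expected performance over the available actions. The sketch below is a minimal illustration under strong assumptions: outcome_model is a hypothetical function returning (probability, score) pairs that are taken to already reflect the percept sequence and the agent's built-in knowledge, and the numbers in toy_model are made up purely for the example.

```python
def expected_performance(action, outcome_model):
    """Expected score of an action under a probabilistic outcome model.

    outcome_model(action) is assumed to return (probability, score) pairs
    whose probabilities sum to 1.
    """
    return sum(p * score for p, score in outcome_model(action))


def rational_choice(actions, outcome_model):
    """Pick an action with the highest expected performance."""
    return max(actions, key=lambda a: expected_performance(a, outcome_model))


def toy_model(action):
    # Made-up numbers, for illustration only.
    if action == "safe":
        return [(1.0, 5.0)]            # always scores 5
    return [(0.5, 12.0), (0.5, 0.0)]   # "risky": expected score 6


if __name__ == "__main__":
    print(rational_choice(["safe", "risky"], toy_model))  # risky
```

In this toy case the chosen action can still turn out badly (a score of 0 half the time); the definition concerns the expected score, not the score actually obtained.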

Consider the simple vacuum-cleaner agent that cleans a square if it is dirty and moves to the

other square if not; this is the agent function tabulated in Figure 2.3. Is this a rational

agent? That depends! First, we need to say what the performance measure is, what is known

about the environment, and what sensors and actuators the agent has. Let us assume the

following:

The performance measure awards one point for each clean square at each time step,

over a “lifetime” of 1000 time steps.

The “geography” of the environment is known a priori (Figure 2.2) but the dirt

distribution and the initial location of the agent are not. Clean squares stay clean and

sucking cleans the current square. The Right and Left actions move the agent one

square except when this would take the agent outside the environment, in which case

the agent remains where it is.

The only available actions are Right, Left, and Suck.

The agent correctly perceives its location and whether that location contains dirt.

Under these circumstances the agent is indeed rational; its expected performance is at least

as good as any other agent’s.
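
The assumptions listed above are specific enough to simulate. The sketch below runs the simple agent from each of the eight possible starting configurations and reports its score. It is an illustration, not a proof of rationality: the function names are hypothetical, and awarding the points after each action (rather than before) is a scoring convention assumed here.

```python
from itertools import product


def reflex_vacuum_agent(percept):
    """Suck if the current square is dirty, otherwise move to the other square."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"


def run(agent, location, dirt, steps=1000):
    """Simulate the two-square world and return the total score.

    dirt maps each square to True if it is dirty. One point is awarded
    per clean square after each action (an assumed convention).
    """
    dirt = dict(dirt)  # copy so the caller's dict is not modified
    score = 0
    for _ in range(steps):
        percept = (location, "Dirty" if dirt[location] else "Clean")
        action = agent(percept)
        if action == "Suck":
            dirt[location] = False
        elif action == "Right":
            location = "B"  # if already in B, the agent stays in B
        elif action == "Left":
            location = "A"  # if already in A, the agent stays in A
        score += sum(1 for is_dirty in dirt.values() if not is_dirty)
    return score


if __name__ == "__main__":
    for loc, a_dirty, b_dirty in product("AB", (False, True), (False, True)):
        total = run(reflex_vacuum_agent, loc, {"A": a_dirty, "B": b_dirty})
        print(loc, a_dirty, b_dirty, total)
```

Under this convention the agent loses at most two points out of a possible 2000, in the case where both squares start dirty; arguing that no other agent can do better in expectation still requires the reasoning given in the text.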

One can see easily that the same agent would be irrational under different circumstances.

For example, once all the dirt is cleaned up, the agent will oscillate needlessly back and

forth; if the performance measure includes a penalty of one point for each movement, the

agent will fare poorly. A better agent for this case would do nothing once it is sure that all

the squares are clean. If clean squares can become dirty again, the agent should occasionally

check and re-clean them if needed. If the geography of the environment is unknown, the

agent will need to explore it. Exercise 2.VACR asks you to design agents for these cases.
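
The effect of a movement penalty can be sketched by comparing the simple agent with one that remembers which squares it knows to be clean and then does nothing. Everything named here is an assumption for illustration: the class StoppingVacuumAgent, the "NoOp" action string, the move_penalty parameter, and the convention that points are counted after each action. The sketch also relies on the stated assumptions that Suck always works and that clean squares stay clean.

```python
def reflex_vacuum_agent(percept):
    """Suck if the current square is dirty, otherwise move to the other square."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"


class StoppingVacuumAgent:
    """Remembers which squares are known to be clean; idles once both are."""

    def __init__(self):
        self.known_clean = set()

    def __call__(self, percept):
        location, status = percept
        # Either the square is already clean, or it will be after Suck.
        self.known_clean.add(location)
        if status == "Dirty":
            return "Suck"
        if self.known_clean == {"A", "B"}:
            return "NoOp"
        return "Right" if location == "A" else "Left"


def run(agent, location, dirt, steps=1000, move_penalty=1):
    """Score = clean squares after each action, minus a penalty per move."""
    dirt = dict(dirt)
    score = 0
    for _ in range(steps):
        percept = (location, "Dirty" if dirt[location] else "Clean")
        action = agent(percept)
        if action == "Suck":
            dirt[location] = False
        elif action in ("Left", "Right"):
            location = "B" if action == "Right" else "A"
            score -= move_penalty
        score += sum(1 for is_dirty in dirt.values() if not is_dirty)
    return score


if __name__ == "__main__":
    start = {"A": True, "B": True}
    print(run(reflex_vacuum_agent, "A", start))    # 1000
    print(run(StoppingVacuumAgent(), "A", start))  # 1997
```

With the penalty in place, the oscillating agent gives up roughly half of the available points, while the agent with a small amount of memory loses almost nothing.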

2.2.3 Omniscience, learning, and autonomy

Omniscience

We need to be careful to distinguish between rationality and omniscience. An omniscient

agent knows the actual outcome of its actions and can act accordingly; but omniscience is

impossible in reality. Consider the following example: I am walking along the Champs

Elysées one day and I see an old friend across the street. There is no traffic nearby and I’m

not otherwise engaged, so, being rational, I start to cross the street. Meanwhile, at 33,000

feet, a cargo door falls off a passing airliner, and before I make it to the other side of the

street I am flattened. Was I irrational to cross the street? It is unlikely that my obituary

would read “Idiot attempts to cross street.”

3 See N. Henderson, “New door latches urged for Boeing 747 jumbo jets,” Washington Post, August 24, 1989.

This example shows that rationality is not the same as perfection. Rationality maximizes

expected performance, while perfection maximizes actual performance. Retreating from a

requirement of perfection is not just a question of being fair to agents. The point is that if we

expect an agent to do what turns out after the fact to be the best action, it will be impossible

to design an agent to fulfill this specification—unless we improve the performance of crystal

balls or time machines.


Our definition of rationality does not require omniscience, then, because the rational choice

depends only on the percept sequence to date. We must also ensure that we haven’t

inadvertently allowed the agent to engage in decidedly underintelligent activities. For

example, if an agent does not look both ways before crossing a busy road, then its percept

sequence will not tell it that there is a large truck approaching at high speed. Does our

definition of rationality say that it’s now OK to cross the road? Far from it!

First, it would not be rational to cross the road given this uninformative percept sequence:

the risk of accident from crossing without looking is too great. Second, a rational agent

should choose the “looking” action before stepping into the street, because looking helps

maximize the expected performance. Doing actions in order to modify future percepts—

sometimes called information gathering—is an important part of rationality and is covered

in depth in Chapter 16. A second example of information gathering is provided by the

exploration that must be undertaken by a vacuum-cleaning agent in an initially unknown

environment.

Information gathering

Our definition requires a rational agent not only to gather information but also to learn as

much as possible from what it perceives. The agent’s initial configuration could reflect some

prior knowledge of the environment, but as the agent gains experience this may be modified

and augmented. There are extreme cases in which the environment is completely known a

priori and completely predictable. In such cases, the agent need not perceive or learn; it

simply acts correctly.

Learning

Of course, such agents are fragile. Consider the lowly dung beetle. After digging its nest and

laying its eggs, it fetches a ball of dung from a nearby heap to plug the entrance. If the ball


of dung is removed from its grasp en route, the beetle continues its task and pantomimes

plugging the nest with the nonexistent dung ball, never noticing that it is missing. Evolution

has built an assumption into the beetle’s behavior, and when it is violated, unsuccessful

behavior results.

Slightly more intelligent is the sphex wasp. The female sphex will dig a burrow, go out and

sting a caterpillar and drag it to the burrow, enter the burrow again to check all is well, drag

the caterpillar inside, and lay its eggs. The caterpillar serves as a food source when the eggs

hatch. So far so good, but if an entomologist moves the caterpillar a few inches away while

the sphex is doing the check, it will revert to the “drag the caterpillar” step of its plan and

will continue the plan without modification, re-checking the burrow, even after dozens of

caterpillar-moving interventions. The sphex is unable to learn that its innate plan is failing,

and thus will not change it.

To the extent that an agent relies on the prior knowledge of its designer rather than on its

own percepts and learning processes, we say that the agent lacks autonomy. A rational

agent


should be autonomous—it should learn what it can to compensate for partial or

incorrect prior knowledge. For example, a vacuum-cleaning agent that learns to predict

where and when additional dirt will appear will do better than one that does not.

Autonomy

As a practical matter, one seldom requires complete autonomy from the start: when the

agent has had little or no experience, it would have to act randomly unless the designer

gave some assistance. Just as evolution provides animals with enough built-in reflexes to

survive long enough to learn for themselves, it would be reasonable to provide an artificial

intelligent agent with some initial knowledge as well as an ability to learn. After sufficient

experience of its environment, the behavior of a rational agent can become effectively

independent of its prior knowledge. Hence, the incorporation of learning allows one to

design a single rational agent that will succeed in a vast variety of environments.

2.3 The Nature of Environments

Now that we have a definition of rationality, we are almost ready to think about building

rational agents. First, however, we must think about task environments, which are

essentially the “problems” to which rational agents are the “solutions.” We begin by showing

how to specify a task environment, illustrating the process with a number of examples. We

then show that task environments come in a variety of flavors. The nature of the task

environment directly affects the appropriate design for the agent program.

Task environment

2.3.1 Specifying the task environment

In our discussion of the rationality of the simple vacuum-cleaner agent, we had to specify

the performance measure, the environment, and the agent’s actuators and sensors. We

group all these under the heading of the task environment. For the acronymically minded,

we call this the PEAS (Performance, Environment, Actuators, Sensors) description. In

designing an agent, the first step must always be to specify the task environment as fully as

possible.

PEAS

The vacuum world was a simple example; let us consider a more complex problem: an

automated taxi driver. Figure 2.4 summarizes the PEAS description for the taxi’s task

environment. We discuss each element in more detail in the following paragraphs.

Figure 2.4

PEAS description of the task environment for an automated taxi driver.

First, what is the performance measure to which we would like our automated driver to

aspire? Desirable qualities include getting to the correct destination; minimizing fuel

consumption and wear and tear; minimizing the trip time or cost; minimizing violations of

traffic laws and disturbances to other drivers; maximizing safety and passenger comfort;

maximizing profits. Obviously, some of these goals conflict, so tradeoffs will be required.

Next, what is the driving environment that the taxi will face? Any taxi driver must deal with

a variety of roads, ranging from rural lanes and urban alleys to 12-lane freeways. The roads

contain other traffic, pedestrians, stray animals, road works, police cars, puddles, and

potholes. The taxi must also interact with potential and actual passengers. There are also

some optional choices. The taxi might need to operate in Southern California, where snow

is seldom a problem, or in Alaska, where it seldom is not. It could always be driving on the

right, or we might want it to be flexible enough to drive on the left when in Britain or Japan.

Obviously, the more restricted the environment, the easier the design problem.

The actuators for an automated taxi include those available to a human driver: control over

the engine through the accelerator and control over steering and braking. In addition, it will

need output to a display screen or voice synthesizer to talk back to the passengers, and

perhaps some way to communicate with other vehicles, politely or otherwise.

The basic sensors for the taxi will include one or more video cameras so that it can see, as

well as lidar and ultrasound sensors to detect distances to other cars and obstacles. To avoid

speeding tickets, the taxi should have a speedometer, and to control the vehicle properly,

especially on curves, it should have an accelerometer. To determine the mechanical state of

the vehicle, it will need the usual array of engine, fuel, and electrical system sensors. Like

many human drivers, it might want to access GPS signals so that it doesn’t get lost. Finally, it

will need touchscreen or voice input for the passenger to request a destination.
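
A PEAS description is essentially a small structured record, and it can help to write it down as one. The sketch below is a hypothetical representation: the PEAS class and its field names are assumptions, and the entries simply paraphrase the taxi discussion in the preceding paragraphs rather than reproducing Figure 2.4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PEAS:
    """Task-environment description: Performance, Environment, Actuators, Sensors."""
    agent_type: str
    performance: tuple
    environment: tuple
    actuators: tuple
    sensors: tuple


taxi = PEAS(
    agent_type="Automated taxi driver",
    performance=("correct destination", "low fuel use and wear",
                 "short trip time or low cost", "few traffic violations",
                 "safety", "passenger comfort", "profit"),
    environment=("roads", "other traffic", "pedestrians", "stray animals",
                 "road works", "police cars", "puddles", "potholes",
                 "passengers", "weather"),
    actuators=("accelerator", "steering", "brake", "display screen",
               "voice synthesizer", "signals to other vehicles"),
    sensors=("video cameras", "lidar", "ultrasound", "speedometer",
             "accelerometer", "engine, fuel, and electrical sensors",
             "GPS", "touchscreen", "voice input"),
)

if __name__ == "__main__":
    for name in ("performance", "environment", "actuators", "sensors"):
        print(name.capitalize() + ":", ", ".join(getattr(taxi, name)))
```

Writing the description out this way makes omissions easy to spot, for example a performance criterion that no listed sensor could ever help the agent evaluate.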

In Figure 2.5, we have sketched the basic PEAS elements for a number of additional agent

types. Further examples appear in Exercise 2.PEAS. The examples include physical as well as

virtual environments. Note that virtual task environments can be just as complex as the

“real” world: for example, a software agent (or software robot or softbot) that trades on

auction and reselling Web sites deals with millions of other users and billions of objects,

many with real images.

Figure 2.5

Examples of agent types and their PEAS descriptions.

Software agent

Softbot

2.3.2 Properties of task environments

The range of task environments that might arise in AI is obviously vast. We can, however,

identify a fairly small number of dimensions along which task environments can be

categorized. These dimensions determine, to a large extent, the appropriate agent design

and the applicability of each of the principal families of techniques for agent

implementation. First we list the dimensions, then we analyze several task environments to

illustrate the ideas. The definitions here are informal; later chapters provide more precise

statements and examples of each kind of environment.

Fully observable

Partially observable

FULLY OBSERVABLE VS. PARTIALLY OBSERVABLE: If an agent’s sensors give it access

to the complete state of the environment at each point in time, then we say that the task

environment is fully observable. A task environment is effectively fully observable if the

sensors detect all aspects that are relevant to the choice of action; relevance, in turn,

depends on the performance measure. Fully observable environments are convenient

because the agent need not maintain any internal state to keep track of the world. An

environment might be partially observable because of noisy and inaccurate sensors or

because parts of the state are simply missing from the sensor data—for example, a vacuum

agent with only a local dirt sensor cannot tell whether there is dirt in other squares, and an

automated taxi cannot see what other drivers are thinking. If the agent has no sensors at all

then the environment is unobservable. One might think that in such cases the agent’s plight

is hopeless, but, as we discuss in Chapter 4, the agent’s goals may still be achievable,

sometimes with certainty.
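
The distinction can be illustrated by treating a sensor as a function from the environment state to a percept. In the sketch below, which uses a hypothetical state encoding and hypothetical function names, a vacuum agent with only a local dirt sensor receives the same percept in two different states, so the percept alone cannot tell it whether the other square is dirty.

```python
def full_percept(state):
    """Fully observable: the sensors return the complete state."""
    return state


def local_percept(state):
    """Partially observable: only the agent's location and the local dirt."""
    location, dirt = state
    return (location, dirt[location])


# A state is (agent location, {square: is_dirty}); an assumed encoding.
s1 = ("A", {"A": False, "B": False})
s2 = ("A", {"A": False, "B": True})

if __name__ == "__main__":
    print(local_percept(s1) == local_percept(s2))  # True: indistinguishable
    print(full_percept(s1) == full_percept(s2))    # False
```

An agent that sees only the local percept has to maintain some internal state, for instance a record of squares it has already visited, to make up for what its sensors omit.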

Unobservable

SINGLE-AGENT VS. MULTIAGENT: The distinction between single-agent and multiagent

environments may seem simple enough. For example, an agent solving a crossword puzzle

by itself is clearly in a single-agent environment, whereas an agent playing chess is in a two-

agent environment. However, there are some subtle issues. First, we have described how an

entity may be viewed as an agent, but we have not explained which entities must be viewed

as agents. Does an agent A (the taxi driver, for example) have to treat an object B (another
vehicle) as an agent, or can it be treated merely as an object behaving according to the laws
of physics, analogous to waves at the beach or leaves blowing in the wind? The key
distinction is whether B’s behavior is best described as maximizing a performance measure
whose value depends on agent A’s behavior.

Single-agent

Multiagent

For example, in chess, the opponent entity B is trying to maximize its performance measure,
which, by the rules of chess, minimizes agent A’s performance measure. Thus, chess is a
competitive multiagent environment. On the other hand, in the taxi-driving environment,
avoiding collisions maximizes the performance measure of all agents, so it is a partially
cooperative multiagent environment. It is also partially competitive because, for example,
only one car can occupy a parking space.

Competitive

Cooperative

The agent-design problems in multiagent environments are often quite different from those

in single-agent environments; for example, communication often emerges as a rational

behavior in multiagent environments; in some competitive environments, randomized

behavior is rational because it avoids the pitfalls of predictability.

DETERMINISTIC VS. NONDETERMINISTIC: If the next state of the environment is completely

determined by the current state and the action executed by the agent(s), then we say the

environment is deterministic; otherwise, it is nondeterministic. In principle, an agent need

not worry about uncertainty in a fully observable, deterministic environment. If the

environment is partially observable, however, then it could appear to be nondeterministic.

Deterministic

Nondeterministic

Most real situations are so complex that it is impossible to keep track of all the unobserved

aspects; for practical purposes, they must be treated as nondeterministic. Taxi driving is

clearly nondeterministic in this sense, because one can never predict the behavior of traffic

exactly; moreover, one’s tires may blow out unexpectedly and one’s engine may seize up

without warning. The vacuum world as we described it is deterministic, but variations can

include nondeterministic elements such as randomly appearing dirt and an unreliable

suction mechanism (Exercise 2.VFIN).

One final note: the word stochastic is used by some as a synonym for “nondeterministic,”

but we make a distinction between the two terms; we say that a model of the environment

is stochastic if it explicitly deals with probabilities (e.g., “there’s a 25% chance of rain

tomorrow”) and “nondeterministic” if the possibilities are listed without being quantified

(e.g., “there’s a chance of rain tomorrow”).

Stochastic

EPISODIC VS. SEQUENTIAL: In an episodic task environment, the agent’s experience is

divided into atomic episodes. In each episode the agent receives a percept and then

performs a single action. Crucially, the next episode does not depend on the actions taken in

previous episodes. Many classification tasks are episodic. For example, an agent that has to

spot defective parts on an assembly line bases each decision on the current part, regardless

of previous decisions; moreover, the current decision doesn’t affect whether the next part is

defective. In sequential environments, on the other hand, the current decision could affect

all future decisions. Chess and taxi driving are sequential: in both cases, short-term actions

can have long-term consequences. Episodic environments are much simpler than sequential

environments because the agent does not need to think ahead.

4 The word “sequential” is also used in computer science as the antonym of “parallel.” The two meanings are largely unrelated.

Episodic

Sequential


Static

Dynamic

STATIC VS. DYNAMIC: If the environment can change while an agent is deliberating, then

we say the environment is dynamic for that agent; otherwise, it is static. Static environments

are easy to deal with because the agent need not keep looking at the world while it is

deciding on an action, nor need it worry about the passage of time. Dynamic environments,

on the other hand, are continuously asking the agent what it wants to do; if it hasn’t decided

yet, that counts as deciding to do nothing. If the environment itself does not change with the

passage of time but the agent’s performance score does, then we say the environment is

semidynamic. Taxi driving is clearly dynamic: the other cars and the taxi itself keep moving

while the driving algorithm dithers about what to do next. Chess, when played with a clock,

is semidynamic. Crossword puzzles are static.

Semidynamic

DISCRETE VS. CONTINUOUS: The discrete/continuous distinction applies to the state of

the environment, to the way time is handled, and to the percepts and actions of the agent. For

example, the chess environment has a finite number of distinct states (excluding the clock).

Chess also has a discrete set of percepts and actions. Taxi driving is a continuous-state and

continuous-time problem: the speed and location of the taxi and of the other vehicles sweep

through a range of continuous values and do so smoothly over time. Taxi-driving actions are

also continuous (steering angles, etc.). Input from digital cameras is discrete, strictly

speaking, but is typically treated as representing continuously varying intensities and

locations.

Discrete

Continuous

KNOWN VS. UNKNOWN: Strictly speaking, this distinction refers not to the environment

itself but to the agent’s (or designer’s) state of knowledge about the “laws of physics” of the

environment. In a known environment, the outcomes (or outcome probabilities if the

environment is nondeterministic) for all actions are given. Obviously, if the environment is

unknown, the agent will have to learn how it works in order to make good decisions.

Known

Unknown

The distinction between known and unknown environments is not the same as the one

between fully and partially observable environments. It is quite possible for a known

environment to be partially observable—for example, in solitaire card games, I know the

rules but am still unable to see the cards that have not yet been turned over. Conversely, an

unknown environment can be fully observable—in a new video game, the screen may show

the entire game state but I still don’t know what the buttons do until I try them.

As noted on page 39, the performance measure itself may be unknown, either because the

designer is not sure how to write it down correctly or because the ultimate user—whose

preferences matter—is not known. For example, a taxi driver usually won’t know whether a

new passenger prefers a leisurely or speedy journey, a cautious or aggressive driving style. A

virtual personal assistant starts out knowing nothing about the personal preferences of its


new owner. In such cases, the agent may learn more about the performance measure based

on further interactions with the designer or user. This, in turn, suggests that the task

environment is necessarily viewed as a multiagent environment.

The hardest case is partially observable, multiagent, nondeterministic, sequential, dynamic,

continuous, and unknown. Taxi driving is hard in all these senses, except that the driver’s

environment is mostly known. Driving a rented car in a new country with unfamiliar

geography, different traffic laws, and nervous passengers is a lot more exciting.
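The dimensions listed above can be recorded as a small data structure. The following is a minimal sketch, not a representation used by the book: the class, field, and method names are hypothetical, and the two example entries encode only what the text says about taxi driving and about chess played with a clock.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Observability(Enum):
    FULL = "fully observable"
    PARTIAL = "partially observable"
    NONE = "unobservable"


class Dynamics(Enum):
    STATIC = "static"
    SEMIDYNAMIC = "semidynamic"
    DYNAMIC = "dynamic"


@dataclass(frozen=True)
class TaskEnvironment:
    """One value per dimension discussed in this section (hypothetical layout)."""
    name: str
    observability: Observability
    multiagent: bool
    deterministic: bool
    episodic: bool
    dynamics: Dynamics
    discrete: bool
    known: bool

    def is_hardest_case(self) -> bool:
        # Partially observable, multiagent, nondeterministic, sequential,
        # dynamic, continuous, and unknown.
        return (self.observability is Observability.PARTIAL
                and self.multiagent
                and not self.deterministic
                and not self.episodic
                and self.dynamics is Dynamics.DYNAMIC
                and not self.discrete
                and not self.known)


taxi = TaskEnvironment("taxi driving", Observability.PARTIAL, True, False,
                       False, Dynamics.DYNAMIC, False, known=True)
chess_with_clock = TaskEnvironment("chess with a clock", Observability.FULL, True,
                                   True, False, Dynamics.SEMIDYNAMIC, True, known=True)
# Same environment, but the driver no longer knows the local "laws of physics".
rented_car_abroad = replace(taxi, name="rented car in a new country", known=False)
```

Under these assumptions, `taxi.is_hardest_case()` is false only because the environment is treated as known, while `rented_car_abroad.is_hardest_case()` is true.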

Figure 2.6 lists the properties of a number of familiar environments. Note that the

properties are not always cut and dried. For example, we have listed the medical-diagnosis

task as single-agent because the disease process in a patient is not profitably modeled as an

agent; but a medical-diagnosis system might also have to deal with recalcitrant patients and

skeptical staff, so the environment could have a multiagent aspect. Furthermore, medical

diagnosis is episodic if one conceives of the task as selecting a diagnosis given a list of

symptoms; the problem is sequential if the task can include proposing a series of tests,

evaluating progress over the course of treatment, handling multiple patients, and so on.

Figure 2.6

Examples of task environments and their characteristics.

We have not included a “known/unknown” column because, as explained earlier, this is not

strictly a property of the environment. For some environments, such as chess and poker, it is

quite easy to supply the agent with full knowledge of the rules, but it is nonetheless

interesting to consider how an agent might learn to play these games without such

knowledge.

The code repository associated with this book (aima.cs.berkeley.edu) includes multiple

environment implementations, together with a general-purpose environment simulator for

evaluating an agent’s performance. Experiments are often carried out not for a single

environment but for many environments drawn from an environment class. For example, to

evaluate a taxi driver in simulated traffic, we would want to run many simulations with

different traffic, lighting, and weather conditions. We are then interested in the agent’s

average performance over the environment class.
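As a rough illustration of averaging over an environment class, the sketch below samples many two-square vacuum worlds and compares two agents. It is not the simulator from the book's repository: the function names, the ten-step horizon, the uniform sampling of dirt and start square, and the scoring rule (one point per clean square per time step) are all assumptions made for this example.

```python
import random
from statistics import mean

LOCATIONS = ("A", "B")


def reflex_agent(percept):
    """Suck if the current square is dirty, otherwise move to the other square."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"


def lazy_agent(percept):
    """Baseline for comparison: cleans its own square but never moves."""
    _, status = percept
    return "Suck" if status == "Dirty" else "NoOp"


def run_episode(agent, dirt, start, steps=10):
    """Simulate one environment; score one point per clean square per step."""
    status = dict(dirt)          # copy so the caller's dictionary is not modified
    location = start
    score = 0
    for _ in range(steps):
        action = agent((location, status[location]))
        if action == "Suck":
            status[location] = "Clean"
        elif action == "Right":
            location = "B"
        elif action == "Left":
            location = "A"
        score += sum(1 for s in status.values() if s == "Clean")
    return score


def average_performance(agent, n_envs=1000, seed=0):
    """Average score over environments drawn at random from the class."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_envs):
        dirt = {loc: rng.choice(["Clean", "Dirty"]) for loc in LOCATIONS}
        start = rng.choice(LOCATIONS)
        scores.append(run_episode(agent, dirt, start))
    return mean(scores)
```

Because both agents are evaluated with the same seed, they face the same sampled environments, so the difference in averages reflects the agents rather than the draw.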

Environment class

2.4 The Structure of Agents

So far we have talked about agents by describing behavior—the action that is performed after

any given sequence of percepts. Now we must bite the bullet and talk about how the insides

work. The job of AI is to design an agent program that implements the agent function—the

mapping from percepts to actions. We assume this program will run on some sort of

computing device with physical sensors and actuators; we call this the agent architecture:

agent = architecture + program.

Agent program

Agent architecture

Obviously, the program we choose has to be one that is appropriate for the architecture. If
the program is going to recommend actions like Walk, the architecture had better have legs.
The architecture might be just an ordinary PC, or it might be a robotic car with several
onboard computers, cameras, and other sensors. In general, the architecture makes the
percepts from the sensors available to the program, runs the program, and feeds the
program’s action choices to the actuators as they are generated. Most of this book is about
designing agent programs, although Chapters 25 and 26 deal directly with the sensors
and actuators.

2.4.1 Agent programs

The agent programs that we design in this book all have the same skeleton: they take the
current percept as input from the sensors and return an action to the actuators. Notice the
difference between the agent program, which takes the current percept as input, and the
agent function, which may depend on the entire percept history. The agent program has no
choice but to take just the current percept as input because nothing more is available from
the environment; if the agent’s actions need to depend on the entire percept sequence, the
agent will have to remember the percepts.

5 There are other choices for the agent program skeleton; for example, we could have the agent programs be coroutines that run

asynchronously with the environment. Each such coroutine has an input and output port and consists of a loop that reads the input

port for percepts and writes actions to the output port.

We describe the agent programs in the simple pseudocode language that is defined in

Appendix B. (The online code repository contains implementations in real programming

languages.) For example, Figure 2.7 shows a rather trivial agent program that keeps track

of the percept sequence and then uses it to index into a table of actions to decide what to

do. The table (an example of which is given for the vacuum world in Figure 2.3)

represents explicitly the agent function that the agent program embodies. To build a rational

agent in this way, we as designers must construct a table that contains the appropriate

action for every possible percept sequence.

Figure 2.7

The TABLE-DRIVEN-AGENT program is invoked for each new percept and returns an action each time. It

retains the complete percept sequence in memory.
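A table-driven agent of this kind might look as follows in Python. This is a minimal sketch rather than the book's pseudocode: `make_table_driven_agent` is a hypothetical name, and the table shown is a deliberately incomplete fragment for the two-square vacuum world, covering only a few percept sequences of length one and two.

```python
def make_table_driven_agent(table):
    """Return an agent program that indexes a table by the full percept sequence."""
    percepts = []  # persistent percept history, initially empty

    def program(percept):
        percepts.append(percept)
        # Look up the action for the whole sequence seen so far;
        # None means the sequence is missing from the table.
        return table.get(tuple(percepts))

    return program


# Partial table: keys are percept sequences, values are actions.
table = {
    (("A", "Clean"),): "Right",
    (("A", "Dirty"),): "Suck",
    (("B", "Clean"),): "Left",
    (("B", "Dirty"),): "Suck",
    (("A", "Dirty"), ("A", "Clean")): "Right",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
}
```

Calling the program with a third percept returns `None` here, because no sequence of length three was tabulated. That gap is exactly the practical problem with the approach.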

It is instructive to consider why the table-driven approach to agent construction is doomed

to failure. Let $\mathcal{P}$ be the set of possible percepts and let $T$ be the lifetime of the agent (the
total number of percepts it will receive). The lookup table will contain $\sum_{t=1}^{T} |\mathcal{P}|^{t}$ entries.
Consider the automated taxi: the visual input from a single camera (eight cameras is typical)
comes in at the rate of roughly 70 megabytes per second (30 frames per second, $1080 \times 720$
pixels with 24 bits of color information). This gives a lookup table with over $10^{600{,}000{,}000{,}000}$
entries for an hour’s driving. Even the lookup table for chess, a tiny, well-behaved fragment
of the real world, has (it turns out) at least $10^{150}$ entries. In comparison, the number of
atoms in the observable universe is less than $10^{80}$. The daunting size of these tables means
that (a) no physical agent in this universe will have the space to store the table; (b) the

designer would not have time to create the table; and (c) no agent could ever learn all the

right table entries from its experience.
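The arithmetic behind these figures can be checked with a few lines of Python. This is one way to reproduce the orders of magnitude, under the simplifying assumption that every bit pattern from the camera counts as a distinct percept stream. The variable names are illustrative only.

```python
import math

FRAMES_PER_SECOND = 30
WIDTH, HEIGHT = 1080, 720
BITS_PER_PIXEL = 24

# Data rate of a single camera.
bits_per_second = FRAMES_PER_SECOND * WIDTH * HEIGHT * BITS_PER_PIXEL
megabytes_per_second = bits_per_second / 8 / 1e6   # roughly 70 MB/s

# One hour of video is bits_per_hour bits, so there are 2 ** bits_per_hour
# possible streams. The number is far too large to store, so we compute
# only its base-10 exponent.
bits_per_hour = bits_per_second * 3600
log10_sequences = bits_per_hour * math.log10(2)    # about 6 * 10**11


def table_entries(num_percepts, lifetime):
    """Exact table size, the sum over t = 1..T of |P|**t, for small cases."""
    return sum(num_percepts ** t for t in range(1, lifetime + 1))
```

For the two-square vacuum world with four possible percepts, `table_entries(4, 3)` is already 84 entries for a lifetime of only three steps.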

Despite all this, TABLE-DRIVEN-AGENT does do what we want, assuming the table is filled in

correctly: it implements the desired agent function.

The key challenge for AI is to find out how to write programs that, to the extent possible, produce

rational behavior from a smallish program rather than from a vast table.

We have many examples showing that this can be done successfully in other areas: for

example, the huge tables of square roots used by engineers and schoolchildren prior to the

1970s have now been replaced by a five-line program for Newton’s method running on

electronic calculators. The question is, can AI do for general intelligent behavior what

Newton did for square roots? We believe the answer is yes.
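For concreteness, a short program of the kind alluded to here can be written as below. This is a sketch of Newton's method for square roots, not the code of any particular calculator. The function name, the relative tolerance, and the iteration cap are choices made for this example.

```python
def newton_sqrt(x, tolerance=1e-12, max_iterations=1000):
    """Approximate the square root of a non-negative number by Newton's method."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0
    guess = max(x, 1.0)  # any starting point at or above the true root converges
    for _ in range(max_iterations):
        if abs(guess * guess - x) <= tolerance * x:
            break
        # Newton update for f(g) = g*g - x: average the guess with x / guess.
        guess = (guess + x / guess) / 2
    return guess
```

The whole table of square roots is replaced by one update rule applied repeatedly, which is the point of the comparison.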

In the remainder of this section, we outline four basic kinds of agent programs that embody

the principles underlying almost all intelligent systems:

Simple reflex agents;

Model-based reflex agents;

Goal-based agents; and

Utility-based agents.

Each kind of agent program combines particular components in particular ways to generate

actions. Section 2.4.6 explains in general terms how to convert all these agents into

learning agents that can improve the performance of their components so as to generate

better actions. Finally, Section 2.4.7 describes the variety of ways in which the components

themselves can be represented within the agent. This variety provides a major organizing

principle for the field and for the book itself.

2.4.2 Simple reflex agents

The simplest kind of agent is the simple reflex agent. These agents select actions on the

basis of the current percept, ignoring the rest of the percept history. For example, the

vacuum agent whose agent function is tabulated in Figure 2.3 is a simple reflex agent,


because its decision is based only on the current location and on whether that location

contains dirt. An agent program for this agent is shown in Figure 2.8.

Figure 2.8

The agent program for a simple reflex agent in the two-location vacuum environment. This program

implements the agent function tabulated in Figure 2.3 .
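In Python, a program with this behavior could be sketched as follows. The percept is assumed to be a `(location, status)` pair with locations `"A"` and `"B"`, which is a representation chosen for this example.

```python
def reflex_vacuum_agent(percept):
    """Choose an action from the current percept only; no history is kept."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    elif location == "A":
        return "Right"
    elif location == "B":
        return "Left"


# The four possible percepts and the actions chosen for them.
for percept in [("A", "Clean"), ("A", "Dirty"), ("B", "Clean"), ("B", "Dirty")]:
    print(percept, "->", reflex_vacuum_agent(percept))
```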

Simple reflex agent

Notice that the vacuum agent program is very small indeed compared to the corresponding
table. The most obvious reduction comes from ignoring the percept history, which cuts
down the number of relevant percept sequences from $4^T$ to just 4. A further, small reduction
comes from the fact that when the current square is dirty, the action does not depend on the
location. Although we have written the agent program using if-then-else statements, it is
simple enough that it can also be implemented as a Boolean circuit.

Simple reflex behaviors occur even in more complex environments. Imagine yourself as the
driver of the automated taxi. If the car in front brakes and its brake lights come on, then you
should notice this and initiate braking. In other words, some processing is done on the
visual input to establish the condition we call “The car in front is braking.” Then, this
triggers some established connection in the agent program to the action “initiate braking.”
We call such a connection a condition–action rule, written as

if car-in-front-is-braking then initiate-braking.

6 Also called situation–action rules, productions, or if–then rules.

condition–action rule

Humans also have many such connections, some of which are learned responses (as for

driving) and some of which are innate reflexes (such as blinking when something

approaches the eye). In the course of the book, we show several different ways in which

such connections can be learned and implemented.

The program in Figure 2.8 is specific to one particular vacuum environment. A more

general and flexible approach is first to build a general-purpose interpreter for condition–

action rules and then to create rule sets for specific task environments. Figure 2.9 gives the

structure of this general program in schematic form, showing how the condition–action

rules allow the agent to make the connection from percept to action. Do not worry if this

seems trivial; it gets more interesting shortly.

Figure 2.9

Schematic diagram of a simple reflex agent. We use rectangles to denote the current internal state of the

agent’s decision process, and ovals to represent the background information used in the process.

An agent program for Figure 2.9 is shown in Figure 2.10. The INTERPRET-INPUT function

generates an abstracted description of the current state from the percept, and the RULE-

MATCH function returns the first rule in the set of rules that matches the given state

description. Note that the description in terms of “rules” and “matching” is purely

conceptual; as noted above, actual implementations can be as simple as a collection of logic

gates implementing a Boolean circuit. Alternatively, a “neural” circuit can be used, where

the logic gates are replaced by the nonlinear units of artificial neural networks (see Chapter 21).

Figure 2.10

A simple reflex agent. It acts according to a rule whose condition matches the current state, as defined by

the percept.
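The general rule-interpreting structure can be sketched in Python as below. The function names mirror INTERPRET-INPUT and RULE-MATCH from the text, but everything else is an assumption made for illustration: the `Rule` class, the dictionary-based percept, the `brake_lights_on` feature, and the default `"NoOp"` action.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, bool]


@dataclass
class Rule:
    condition: Callable[[State], bool]  # test applied to the state description
    action: str


def interpret_input(percept: dict) -> State:
    """Abstract the raw percept into a state description (hypothetical features)."""
    return {"car_in_front_is_braking": bool(percept.get("brake_lights_on", False))}


def rule_match(state: State, rules: List[Rule]) -> Optional[Rule]:
    """Return the first rule whose condition holds in the given state, if any."""
    for rule in rules:
        if rule.condition(state):
            return rule
    return None


def make_simple_reflex_agent(rules: List[Rule], default_action: str = "NoOp"):
    def program(percept: dict) -> str:
        state = interpret_input(percept)
        rule = rule_match(state, rules)
        return rule.action if rule else default_action

    return program


rules = [Rule(lambda s: s["car_in_front_is_braking"], "initiate-braking")]
agent = make_simple_reflex_agent(rules)
```

Separating the rule set from the interpreter means a different task environment needs only a new list of rules and a new `interpret_input`.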

Simple reflex agents have the admirable property of being simple, but they are of limited

intelligence. The agent in Figure 2.10 will work only if the correct decision can be made on the

basis of just the current percept—that is, only if the environment is fully observable.

Even a little bit of unobservability can cause serious trouble. For example, the braking rule

given earlier assumes that the condition car-in-front-is-braking can be determined from the

current percept—a single frame of video. This works if the car in front has a centrally

mounted (and hence uniquely identifiable) brake light. Unfortunately, older models have

different configurations of taillights, brake lights, and turn-signal lights, and it is not always

possible to tell from a single image whether the car is braking or simply has its taillights on.

A simple reflex agent driving behind such a car would either brake continuously and

unnecessarily, or, worse, never brake at all.

We can see a similar problem arising in the vacuum world. Suppose that a simple reflex

vacuum agent is deprived of its location sensor and has only a dirt sensor. Such an agent has

just two possible percepts: [Dirty] and [Clean]. It can Suck in response to [Dirty]; what
should it do in response to [Clean]? Moving Left fails (forever) if it happens to start in
square A, and moving Right fails (forever) if it happens to start in square B. Infinite loops
are often unavoidable for simple reflex agents operating in partially observable
environments.

Escape from infinite loops is possible if the agent can randomize its actions. For example, if

the vacuum agent perceives [Clean], it might flip a coin to choose between Right and Left.

It is easy to show that the agent will reach the other square in an average of two steps.

Then, if that square is dirty, the agent will clean it and the task will be complete. Hence, a

randomized simple reflex agent might outperform a deterministic simple reflex agent.

Randomization

We mentioned in Section 2.3 that randomized behavior of the right kind can be rational in

some multiagent environments. In single-agent environments, randomization is usually not

rational. It is a useful trick that helps a simple reflex agent in some situations, but in most

cases we can do much better with more sophisticated deterministic agents.
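The two-step claim for the coin-flipping vacuum agent can be checked both analytically and by simulation. Each flip moves the agent to the other square with probability 1/2, so the number of steps is geometric with mean 1/(1/2) = 2. The simulation below is a sketch that assumes a move into a wall leaves the agent where it is. The function names and the trial count are arbitrary.

```python
import random


def steps_to_reach_other_square(rng, start="A"):
    """Flip a fair coin between Left and Right until the agent leaves its start square."""
    location = start
    steps = 0
    while location == start:
        action = rng.choice(["Left", "Right"])
        steps += 1
        if action == "Right" and location == "A":
            location = "B"
        elif action == "Left" and location == "B":
            location = "A"
        # Otherwise the agent bumps into the wall and stays put.
    return steps


def average_steps(trials=100_000, seed=0):
    """Estimate the expected number of steps by repeated simulation."""
    rng = random.Random(seed)
    total = sum(steps_to_reach_other_square(rng) for _ in range(trials))
    return total / trials
```

With a large number of trials the estimate should settle close to 2, in line with the geometric-distribution argument.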

2.4.3 Model-based reflex agents

The most effective way to handle partial observability is for the agent to keep track of the part

of the world it can’t see now. That is, the agent should maintain some sort of internal state that

depends on the percept history and thereby reflects at least some of the unobserved aspects

of the current state. For the braking problem, the internal state is not too extensive—just the

previous frame from the camera, allowing the agent to detect when two red lights at the

edge of the vehicle go on or off simultaneously. For other driving tasks such as changing

lanes, the agent needs to keep track of where the other cars are if it can’t see them all at

once. And for any driving to be possible at all, the agent needs to keep track of where its

keys are.

Internal state


Updating this internal state information as time goes by requires two kinds of knowledge to

be encoded in the agent program in some form. First, we need some information about how

the world changes over time, which can be divided roughly into two parts: the effects of the

agent’s actions and how the world evolves independently of the agent. For example, when

the agent turns the steering wheel clockwise, the car turns to the right, and when it’s raining

the car’s cameras can get wet. This knowledge about “how the world works”—whether

implemented in simple Boolean circuits or in complete scientific theories—is called a

transition model of the world.

Transition model

Second, we need some information about how the state of the world is reflected in the

agent’s percepts. For example, when the car in front initiates braking, one or more

illuminated red regions appear in the forward-facing camera image, and, when the camera

gets wet, droplet-shaped objects appear in the image partially obscuring the road. This kind

of knowledge is called a sensor model.

Sensor model

Together, the transition model and sensor model allow an agent to keep track of the state of

the world—to the extent possible given the limitations of the agent’s sensors. An agent that

uses such models is called a model-based agent.

Model-based agent

Figure 2.11 gives the structure of the model-based reflex agent with internal state,

showing how the current percept is combined with the old internal state to generate the

updated description of the current state, based on the agent’s model of how the world
works. The agent program is shown in Figure 2.12. The interesting part is the function

UPDATE-STATE, which is responsible for creating the new internal state description. The

details of how models and states are represented vary widely depending on the type of

environment and the particular technology used in the agent design.

Figure 2.11

A model-based reflex agent.

Figure 2.12

A model-based reflex agent. It keeps track of the current state of the world, using an internal model. It

then chooses an action in the same way as the reflex agent.
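A very small model-based reflex agent for the two-square vacuum world can be sketched as follows. It is a simplified illustration, not the book's pseudocode. It assumes a truthful sensor (the sensor model), assumes that an unobserved square keeps its last observed status (the transition model), and adds a `"NoOp"` action for when both squares are believed clean.

```python
def make_model_based_vacuum_agent():
    # Internal state: last known status of each square; None means unknown.
    state = {"A": None, "B": None}

    def update_state(percept):
        location, status = percept
        # Sensor model (assumed): the percept reports the true status of the
        # current square. Transition model (assumed): the other square is
        # unchanged since it was last observed, so its entry is kept.
        state[location] = status

    def program(percept):
        update_state(percept)
        location, status = percept
        if state["A"] == "Clean" and state["B"] == "Clean":
            return "NoOp"   # nothing left to do according to the internal state
        if status == "Dirty":
            return "Suck"
        return "Right" if location == "A" else "Left"

    return program
```

Unlike the simple reflex agent, this one stops shuttling between squares once its internal state says both are clean, which it could not decide from the current percept alone.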

Regardless of the kind of representation used, it is seldom possible for the agent to

determine the current state of a partially observable environment exactly. Instead, the box

labeled “what the world is like now” (Figure 2.11) represents the agent’s “best guess” (or

sometimes best guesses, if the agent entertains multiple possibilities). For example, an

automated taxi may not be able to see around the large truck that has stopped in front of it

and can only guess about what may be causing the hold-up. Thus, uncertainty about the

current state may be unavoidable, but the agent still has to make a decision.

2.4.4 Goal-based agents

Knowing something about the current state of the environment is not always enough to

decide what to do. For example, at a road junction, the taxi can turn left, turn right, or go

straight on. The correct decision depends on where the taxi is trying to get to. In other

words, as well as a current state description, the agent needs some sort of goal information

that describes situations that are desirable—for example, being at a particular destination.

The agent program can combine this with the model (the same information as was used in

the model-based reflex agent) to choose actions that achieve the goal. Figure 2.13 shows

the goal-based agent’s structure.

Figure 2.13

A model-based, goal-based agent. It keeps track of the world state as well as a set of goals it is trying to

achieve, and chooses an action that will (eventually) lead to the achievement of its goals.

Goal

Sometimes goal-based action selection is straightforward—for example, when goal

satisfaction results immediately from a single action. Sometimes it will be more tricky—for

example, when the agent has to consider long sequences of twists and turns in order to find

a way to achieve the goal. Search (Chapters 3 to 5) and planning (Chapter 11) are the

subfields of AI devoted to finding action sequences that achieve the agent’s goals.

Notice that decision making of this kind is fundamentally different from the condition–

action rules described earlier, in that it involves consideration of the future—both “What will

happen if I do such-and-such?” and “Will that make me happy?” In the reflex agent designs,

this information is not explicitly represented, because the built-in rules map directly from

percepts to actions. The reflex agent brakes when it sees brake lights, period. It has no idea

  


why. A goal-based agent brakes when it sees brake lights because that’s the only action that

it predicts will achieve its goal of not hitting other cars.

Although the goal-based agent appears less efficient, it is more flexible because the

knowledge that supports its decisions is represented explicitly and can be modified. For

example, a goal-based agent’s behavior can easily be changed to go to a different

destination, simply by specifying that destination as the goal. The reflex agent’s rules for

when to turn and when to go straight will work only for a single destination; they must all

be replaced to go somewhere new.
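The flexibility argument can be made concrete with a toy goal-based agent that plans over a road map. This is a sketch under stated assumptions: the junction names and the map are invented, and breadth-first search stands in for whatever search or planning method a real agent would use.

```python
from collections import deque

# Hypothetical road map: each junction lists the junctions directly reachable from it.
ROAD_MAP = {
    "Depot": ["Market", "Station"],
    "Market": ["Depot", "Airport"],
    "Station": ["Depot", "Airport", "Harbor"],
    "Airport": ["Market", "Station"],
    "Harbor": ["Station"],
}


def plan_route(road_map, start, goal):
    """Breadth-first search: return a route with the fewest junctions, or None."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for neighbor in road_map[path[-1]]:
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append(path + [neighbor])
    return None


def goal_based_agent(road_map, location, goal):
    """Return the next junction to drive to, or None if no move is needed or possible."""
    path = plan_route(road_map, location, goal)
    if path is None or len(path) < 2:
        return None
    return path[1]
```

Changing the destination means passing a different `goal`. The map and the search code stay the same, whereas a reflex agent's turn-by-turn rules would all have to be rewritten.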

2.4.5 Utility-based agents

Goals alone are not enough to generate high-quality behavior in most environments. For

example, many action sequences will get the taxi to its destination (thereby achieving the

goal), but some are quicker, safer, more reliable, or cheaper than others. Goals just provide

a crude binary distinction between “happy” and “unhappy” states. A more general

performance measure should allow a comparison of different world states according to

exactly how happy they would make the agent. Because “happy” does not sound very

scientific, economists and computer scientists use the term utility instead.

7 The word “utility” here refers to “the quality of being useful,” not to the electric company or waterworks.

Utility

We have already seen that a performance measure assigns a score to any given sequence of

environment states, so it can easily distinguish between more and less desirable ways of

getting to the taxi’s destination. An agent’s utility function is essentially an internalization

of the performance measure. Provided that the internal utility function and the external

performance measure are in agreement, an agent that chooses actions to maximize its utility

will be rational according to the external performance measure.

Utility function


Let us emphasize again that this is not the only way to be rational—we have already seen a

rational agent program for the vacuum world (Figure 2.8) that has no idea what its utility

function is—but, like goal-based agents, a utility-based agent has many advantages in terms

of flexibility and learning. Furthermore, in two kinds of cases, goals are inadequate but a

utility-based agent can still make rational decisions. First, when there are conflicting goals,

only some of which can be achieved (for example, speed and safety), the utility function

specifies the appropriate tradeoff. Second, when there are several goals that the agent can

aim for, none of which can be achieved with certainty, utility provides a way in which the

likelihood of success can be weighed against the importance of the goals.

Partial observability and nondeterminism are ubiquitous in the real world, and so, therefore,

is decision making under uncertainty. Technically speaking, a rational utility-based agent

chooses the action that maximizes the expected utility of the action outcomes—that is, the

utility the agent expects to derive, on average, given the probabilities and utilities of each

outcome. (Appendix A defines expectation more precisely.) In Chapter 16, we show that

any rational agent must behave as if it possesses a utility function whose expected value it

tries to maximize. An agent that possesses an explicit utility function can make rational

decisions with a general-purpose algorithm that does not depend on the specific utility

function being maximized. In this way, the “global” definition of rationality—designating as

rational those agent functions that have the highest performance—is turned into a “local”

constraint on rational-agent designs that can be expressed in a simple program.
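The "local" constraint can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a program from this book: the names `expected_utility` and `best_action`, the two routes, and all of the probabilities and utilities are hypothetical. The point is only that the selection procedure is the same whatever utility function is supplied.

```python
from typing import Callable, Dict, Hashable

# Hypothetical outcome model: for each action, a distribution over outcome states.
OutcomeModel = Dict[str, Dict[Hashable, float]]


def expected_utility(action: str, model: OutcomeModel,
                     utility: Callable[[Hashable], float]) -> float:
    """Average the utility of each outcome, weighted by its probability."""
    return sum(p * utility(outcome) for outcome, p in model[action].items())


def best_action(model: OutcomeModel,
                utility: Callable[[Hashable], float]) -> str:
    """General-purpose choice rule: works for any utility function passed in."""
    return max(model, key=lambda a: expected_utility(a, model, utility))


# Illustrative numbers only (assumed, not from the text): a fast but risky
# route versus a slower, safer one for the automated taxi.
taxi_model: OutcomeModel = {
    "highway": {"on_time": 0.7, "late": 0.2, "accident": 0.1},
    "side_streets": {"on_time": 0.5, "late": 0.5, "accident": 0.0},
}


def taxi_utility(outcome: Hashable) -> float:
    """Hypothetical utility function over outcome states."""
    return {"on_time": 10.0, "late": 2.0, "accident": -100.0}[outcome]


print(best_action(taxi_model, taxi_utility))  # side_streets
```

With these assumed numbers the highway has expected utility -2.6 and the side streets 6.0, so the safer route is chosen. Passing a utility function that penalizes accidents less severely (say -10) would reverse the choice without any change to `best_action`.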

Expected utility

The utility-based agent structure appears in Figure 2.14. Utility-based agent programs

appear in Chapters 16 and 17, where we design decision-making agents that must

handle the uncertainty inherent in nondeterministic or partially observable environments.

Decision making in multiagent environments is also studied in the framework of utility

theory, as explained in Chapter 18.

 

 


Figure 2.14

A model-based, utility-based agent. It uses a model of the world, along with a utility function that

measures its preferences among states of the world. Then it chooses the action that leads to the best

expected utility, where expected utility is computed by averaging over all possible outcome states,

weighted by the probability of the outcome.

At this point, the reader may be wondering, “Is it that simple? We just build agents that

maximize expected utility, and we’re done?” It’s true that such agents would be intelligent,

but it’s not simple. A utility-based agent has to model and keep track of its environment,

tasks that have involved a great deal of research on perception, representation, reasoning,

and learning. The results of this research fill many of the chapters of this book. Choosing the

utility-maximizing course of action is also a difficult task, requiring ingenious algorithms

that fill several more chapters. Even with these algorithms, perfect rationality is usually

unachievable in practice because of computational complexity, as we noted in Chapter 1.

We also note that not all utility-based agents are model-based; we will see in Chapters 22

and 26 that a model-free agent can learn what action is best in a particular situation

without ever learning exactly how that action changes the environment.
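As a rough illustration of the model-free idea, the sketch below keeps only a table of estimated payoffs for each (situation, action) pair and never stores how an action changes the environment. It is a deliberately simplified one-step value estimate, not one of the algorithms developed in the later chapters, and the situations, actions, and rewards are hypothetical.

```python
from collections import defaultdict

# Action-value table: estimated payoff of each action in each situation.
# No transition model ("what my actions do") is ever stored.
q = defaultdict(float)
counts = defaultdict(int)


def update(situation, action, reward):
    """Incrementally average the rewards observed for (situation, action)."""
    key = (situation, action)
    counts[key] += 1
    q[key] += (reward - q[key]) / counts[key]


def greedy(situation, actions):
    """Pick the action with the highest estimated value in this situation."""
    return max(actions, key=lambda a: q[(situation, a)])


# Hypothetical experience: (situation, action, reward) triples.
experience = [
    ("wet_road", "brake_hard", -5.0),
    ("wet_road", "brake_gently", 1.0),
    ("wet_road", "brake_hard", -3.0),
    ("wet_road", "brake_gently", 2.0),
]
for s, a, r in experience:
    update(s, a, r)

print(greedy("wet_road", ["brake_hard", "brake_gently"]))  # brake_gently
```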

Model-free agent


Finally, all of this assumes that the designer can specify the utility function correctly;

Chapters 17, 18, and 22 consider the issue of unknown utility functions in more depth.

2.4.6 Learning agents

We have described agent programs with various methods for selecting actions. We have not,

so far, explained how the agent programs come into being. In his famous early paper, Turing

(1950) considers the idea of actually programming his intelligent machines by hand. He

estimates how much work this might take and concludes, “Some more expeditious method

seems desirable.” The method he proposes is to build learning machines and then to teach

them. In many areas of AI, this is now the preferred method for creating state-of-the-art

systems. Any type of agent (model-based, goal-based, utility-based, etc.) can be built as a

learning agent (or not).

Learning has another advantage, as we noted earlier: it allows the agent to operate in

initially unknown environments and to become more competent than its initial knowledge

alone might allow. In this section, we briefly introduce the main ideas of learning agents.

Throughout the book, we comment on opportunities and methods for learning in particular

kinds of agents. Chapters 19–22 go into much more depth on the learning algorithms

themselves.

Learning element

Performance element

A learning agent can be divided into four conceptual components, as shown in Figure

2.15. The most important distinction is between the learning element, which is

responsible for making improvements, and the performance element, which is responsible

  

 


for selecting external actions. The performance element is what we have previously

considered to be the entire agent: it takes in percepts and decides on actions. The learning

element uses feedback from the critic on how the agent is doing and determines how the

performance element should be modified to do better in the future.

Figure 2.15

A general learning agent. The “performance element” box represents what we have previously considered

to be the whole agent program. Now, the “learning element” box gets to modify that program to improve

its performance.

Critic

The design of the learning element depends very much on the design of the performance

element. When trying to design an agent that learns a certain capability, the first question is

not “How am I going to get it to learn this?” but “What kind of performance element will my

agent use to do this once it has learned how?” Given a design for the performance element,

learning mechanisms can be constructed to improve every part of the agent.

The critic tells the learning element how well the agent is doing with respect to a fixed

performance standard. The critic is necessary because the percepts themselves provide no

indication of the agent’s success. For example, a chess program could receive a percept

indicating that it has checkmated its opponent, but it needs a performance standard to know

that this is a good thing; the percept itself does not say so. It is important that the

performance standard be fixed. Conceptually, one should think of it as being outside the

agent altogether because the agent must not modify it to fit its own behavior.

The last component of the learning agent is the problem generator. It is responsible for

suggesting actions that will lead to new and informative experiences. If the performance

element had its way, it would keep doing the actions that are best, given what it knows, but

if the agent is willing to explore a little and do some perhaps suboptimal actions in the short

run, it might discover much better actions for the long run. The problem generator’s job is

to suggest these exploratory actions. This is what scientists do when they carry out

experiments. Galileo did not think that dropping rocks from the top of a tower in Pisa was

valuable in itself. He was not trying to break the rocks or to modify the brains of unfortunate

pedestrians. His aim was to modify his own brain by identifying a better theory of the

motion of objects.
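The division of labor among the four components can be made concrete with a toy sketch. Everything here is an assumption made for illustration: the class `LearningAgent`, the two actions, the percept names, and the choice of a running average as the learning rule. It is one simple way to wire the components together, not the design of Figure 2.15 itself.

```python
import random


class LearningAgent:
    """Toy learning agent with the four conceptual components."""

    def __init__(self, actions, explore_prob=0.1, seed=0):
        self.actions = list(actions)
        self.values = {a: 0.0 for a in self.actions}  # knowledge used to act
        self.counts = {a: 0 for a in self.actions}
        self.explore_prob = explore_prob
        self.rng = random.Random(seed)

    def performance_element(self):
        """Select the action that looks best given current knowledge."""
        return max(self.actions, key=lambda a: self.values[a])

    def problem_generator(self):
        """Occasionally suggest an exploratory action; otherwise None."""
        if self.rng.random() < self.explore_prob:
            return self.rng.choice(self.actions)
        return None

    @staticmethod
    def critic(percept):
        """Fixed performance standard: turn a percept into feedback."""
        return 1.0 if percept == "smooth_ride" else -1.0

    def learning_element(self, action, feedback):
        """Modify the performance element's knowledge using the feedback."""
        self.counts[action] += 1
        self.values[action] += (feedback - self.values[action]) / self.counts[action]

    def step(self, environment):
        """One cycle: choose, act, perceive, and learn."""
        action = self.problem_generator() or self.performance_element()
        percept = environment(action)
        self.learning_element(action, self.critic(percept))
        return action


def environment(action):
    """Hypothetical environment: gentle driving gives a smooth ride."""
    return "smooth_ride" if action == "gentle" else "shaken_passenger"


agent = LearningAgent(["violent", "gentle"])
for _ in range(50):
    agent.step(environment)

print(agent.performance_element())  # gentle
```

Note that the critic is a static method that the agent never modifies, which reflects the requirement that the performance standard be fixed and conceptually outside the agent.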

Problem generator

The learning element can make changes to any of the “knowledge” components shown in

the agent diagrams (Figures 2.9, 2.11, 2.13, and 2.14). The simplest cases involve

learning directly from the percept sequence. Observation of pairs of successive states of the

environment can allow the agent to learn “What my actions do” and “How the world

evolves” in response to its actions. For example, if the automated taxi exerts a certain

braking pressure when driving on a wet road, then it will soon find out how much

deceleration is actually achieved, and whether it skids off the road. The problem generator

might identify certain parts of the model that are in need of improvement and suggest

   

experiments, such as trying out the brakes on different road surfaces under different

conditions.
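Learning "What my actions do" from successive states can be sketched as simple counting. The example below is a minimal sketch under assumed names: `observe` and `estimated_probability` are hypothetical helpers, the states and outcomes are invented, and relative frequency is only one of many possible estimators.

```python
from collections import Counter, defaultdict

# Counts of observed outcomes for each (state, action) pair.
transition_counts = defaultdict(Counter)


def observe(state, action, next_state):
    """Record one observed transition between successive states."""
    transition_counts[(state, action)][next_state] += 1


def estimated_probability(state, action, next_state):
    """Relative-frequency estimate of P(next_state | state, action)."""
    outcomes = transition_counts[(state, action)]
    total = sum(outcomes.values())
    return outcomes[next_state] / total if total else 0.0


# Hypothetical braking experiments on a wet road.
for result in ["stopped", "stopped", "skidded", "stopped"]:
    observe("wet_road", "firm_brake", result)

print(estimated_probability("wet_road", "firm_brake", "skidded"))  # 0.25
```

A problem generator could use the same table to find (state, action) pairs with few observations and propose them as experiments.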

Improving the model components of a model-based agent so that they conform better with

reality is almost always a good idea, regardless of the external performance standard. (In

some cases, it is better from a computational point of view to have a simple but slightly

inaccurate model rather than a perfect but fiendishly complex model.) Information from the

external standard is needed when trying to learn a reflex component or a utility function.

For example, suppose the taxi-driving agent receives no tips from passengers who have

been thoroughly shaken up during the trip. The external performance standard must inform

the agent that the loss of tips is a negative contribution to its overall performance; then the

agent might be able to learn that violent maneuvers do not contribute to its own utility. In a

sense, the performance standard distinguishes part of the incoming percept as a reward (or

penalty) that provides direct feedback on the quality of the agent’s behavior. Hard-wired

performance standards such as pain and hunger in animals can be understood in this way.

Reward

Penalty

More generally, human choices can provide information about human preferences. For

example, suppose the taxi does not know that people generally don’t like loud noises, and

settles on the idea of blowing its horn continuously as a way of ensuring that pedestrians

know it’s coming. The consequent human behavior—covering ears, using bad language, and

possibly cutting the wires to the horn—would provide evidence to the agent with which to

update its utility function. This issue is discussed further in Chapter 22.

In summary, agents have a variety of components, and those components can be

represented in many ways within the agent program, so there appears to be great variety


among learning methods. There is, however, a single unifying theme. Learning in intelligent

agents can be summarized as a process of modification of each component of the agent to

bring the components into closer agreement with the available feedback information,

thereby improving the overall performance of the agent.

2.4.7 How the components of agent programs work

We have described agent programs (in very high-level terms) as consisting of various

components, whose function it is to answer questions such as: “What is the world like now?”

“What action should I do now?” “What do my actions do?” The next question for a student of

AI is, “How on Earth do these components work?” It takes about a thousand pages to begin

to answer that question properly, but here we want to draw the reader’s attention to some

basic distinctions among the various ways that the components can represent the

environment that the agent inhabits.

Roughly speaking, we can place the representations along an axis of increasing complexity

and expressive power—atomic, factored, and structured. To illustrate these ideas, it helps to

consider a particular agent component, such as the one that deals with “What my actions

do.” This component describes the changes that might occur in the environment as the

result of taking an action, and Figure 2.16 provides schematic depictions of how those

transitions might be represented.

Figure 2.16

Three ways to represent states and the transitions between them. (a) Atomic representation: a state (such

as B or C) is a black box with no internal structure; (b) Factored representation: a state consists of a

vector of attribute values; values can be Boolean, real-valued, or one of a fixed set of symbols. (c)

Structured representation: a state includes objects, each of which may have attributes of its own as well

as relationships to other objects.

In an atomic representation each state of the world is indivisible—it has no internal

structure. Consider the task of finding a driving route from one end of a country to the other

via some sequence of cities (we address this problem in Figure 3.1 on page 64). For the

purposes of solving this problem, it may suffice to reduce the state of the world to just the

name of the city we are in—a single atom of knowledge, a “black box” whose only

discernible property is that of being identical to or different from another black box. The

standard algorithms underlying search and game-playing (Chapters 3–5), hidden

Markov models (Chapter 14), and Markov decision processes (Chapter 17) all work with

atomic representations.
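To see how little an algorithm needs from an atomic state, consider the following sketch of route finding. The road map is hypothetical, and breadth-first search is used here only because it is short; the point is that the code never looks inside a state and only asks whether two states are the same.

```python
from collections import deque

# Hypothetical road map: each state is just a city name, an opaque label.
roads = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}


def find_route(start, goal):
    """Breadth-first search; states are only ever compared for identity."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for city in roads[path[-1]]:
            if city not in visited:
                visited.add(city)
                frontier.append(path + [city])
    return None


print(find_route("A", "E"))  # ['A', 'B', 'D', 'E']
```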

Atomic representation

A factored representation splits up each state into a fixed set of variables or attributes,

each of which can have a value. Consider a higher-fidelity description for the same driving

problem, where we need to be concerned with more than just atomic location in one city or

another; we might need to pay attention to how much gas is in the tank, our current GPS

coordinates, whether or not the oil warning light is working, how much money we have for

tolls, what station is on the radio, and so on. While two different atomic states have nothing

in common—they are just different black boxes—two different factored states can share

some attributes (such as being at some particular GPS location) and not others (such as

having lots of gas or having no gas); this makes it much easier to work out how to turn one

state into another. Many important areas of AI are based on factored representations,

including constraint satisfaction algorithms (Chapter 6), propositional logic (Chapter 7),

planning (Chapter 11), Bayesian networks (Chapters 12–16), and various machine

learning algorithms.
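A factored state can be sketched as a record with a fixed set of fields. The class `DrivingState`, its attribute names, and the sample values below are assumptions chosen to echo the driving example. The helper `shared_attributes` shows that, unlike atomic states, two factored states can be compared attribute by attribute.

```python
from dataclasses import dataclass, asdict
from typing import Tuple


@dataclass(frozen=True)
class DrivingState:
    """Hypothetical factored state: a fixed set of attributes with values."""
    gps: Tuple[float, float]
    fuel_liters: float
    oil_light_working: bool
    toll_money: float
    radio_station: str


def shared_attributes(s1, s2):
    """Names of the attributes on which two factored states agree."""
    d1, d2 = asdict(s1), asdict(s2)
    return {name for name in d1 if d1[name] == d2[name]}


before = DrivingState((37.87, -122.26), 40.0, True, 12.5, "jazz")
after = DrivingState((37.87, -122.26), 0.0, True, 12.5, "news")

print(sorted(shared_attributes(before, after)))
# ['gps', 'oil_light_working', 'toll_money']
```

Knowing that only `fuel_liters` and `radio_station` differ makes it much easier to work out which actions would turn one state into the other.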

Factored representation

Variable

 

 

 

  


Attribute

Value

For many purposes, we need to understand the world as having things in it that are related to

each other, not just variables with values. For example, we might notice that a large truck

ahead of us is reversing into the driveway of a dairy farm, but a loose cow is blocking the

truck’s path. A factored representation is unlikely to be pre-equipped with the attribute

TruckAheadBackingIntoDairyFarmDrivewayBlockedByLooseCow with value true or false.

Instead, we would need a structured representation, in which objects such as cows and

trucks and their various and varying relationships can be described explicitly (see Figure

2.16(c)). Structured representations underlie relational databases and first-order logic

(Chapters 8, 9, and 10), first-order probability models (Chapter 15), and much of

natural language understanding (Chapters 23 and 24). In fact, much of what humans

express in natural language concerns objects and their relationships.
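The truck-and-cow scene can be sketched as a small set of facts about objects and the relations among them, loosely in the style of a relational database. All object names, relation names, and the helper functions below are hypothetical. No single attribute for the whole situation is needed, because the situation is recovered by combining relations.

```python
# Objects are identified by name; facts are (relation, subject, object) triples.
# All object and relation names here are hypothetical.
facts = {
    ("is_a", "truck1", "Truck"),
    ("is_a", "cow1", "Cow"),
    ("is_a", "driveway1", "Driveway"),
    ("ahead_of", "truck1", "us"),
    ("backing_into", "truck1", "driveway1"),
    ("part_of", "driveway1", "dairy_farm1"),
    ("blocking", "cow1", "truck1"),
}


def related(relation, subject=None, obj=None):
    """Return all (subject, object) pairs matching a partially specified fact."""
    return sorted(
        (s, o) for (r, s, o) in facts
        if r == relation
        and (subject is None or s == subject)
        and (obj is None or o == obj)
    )


def blocked_trucks_ahead():
    """Trucks ahead of us whose path is blocked by a cow."""
    return [
        truck for (truck, _) in related("ahead_of", obj="us")
        if related("is_a", truck, "Truck")
        and any(related("is_a", blocker, "Cow")
                for (blocker, _) in related("blocking", obj=truck))
    ]


print(blocked_trucks_ahead())  # ['truck1']
```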

Structured representation

As we mentioned earlier, the axis along which atomic, factored, and structured

representations lie is the axis of increasing expressiveness. Roughly speaking, a more

expressive representation can capture, at least as concisely, everything a less expressive one

can capture, plus some more. Often, the more expressive language is much more concise; for

example, the rules of chess can be written in a page or two of a structured-representation

language such as first-order logic but require thousands of pages when written in a

factored-representation language such as propositional logic and around 10^38 pages when written in

an atomic language such as that of finite-state automata. On the other hand, reasoning and

learning become more complex as the expressive power of the representation increases. To

gain the benefits of expressive representations while avoiding their drawbacks, intelligent

systems for the real world may need to operate at all points along the axis simultaneously.
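The gap in conciseness can be illustrated on a small scale. The sketch below is an assumption-laden toy and makes no attempt to reproduce the page counts quoted above: it states one simplified rule about rooks over square objects, then counts how many separate symbols a propositional encoding of the same rule would need.

```python
# One structured rule, stated once over objects and relations:
#   "a rook on square s attacks every other square in the same rank or file."
# (Blocking pieces are ignored to keep this illustration small.)
squares = [(rank, file) for rank in range(8) for file in range(8)]


def rook_attacks(s, t):
    """The single quantified rule, written as a predicate over square objects."""
    return s != t and (s[0] == t[0] or s[1] == t[1])


# A propositional (factored) encoding needs one symbol per ground instance.
ground_propositions = [
    f"RookAttacks_{s[0]}{s[1]}_{t[0]}{t[1]}"
    for s in squares for t in squares if rook_attacks(s, t)
]

print(len(ground_propositions))  # 896
```

Each of the 64 squares shares a rank or file with 14 others, so the single rule expands into 896 ground propositions.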

Expressiveness

Localist representation

Another axis for representation involves the mapping of concepts to locations in physical

memory, whether in a computer or in a brain. If there is a one-to-one mapping between

concepts and memory locations, we call that a localist representation. On the other hand, if

the representation of a concept is spread over many memory locations, and each memory

location is employed as part of the representation of multiple different concepts, we call that

a distributed representation. Distributed representations are more robust against noise and

information loss. With a localist representation, the mapping from concept to memory

location is arbitrary, and if a transmission error garbles a few bits, we might confuse Truck

with the unrelated concept Truce. But with a distributed representation, you can think of

each concept representing a point in multidimensional space, and if you garble a few bits

you move to a nearby point in that space, which will have similar meaning.
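The contrast can be sketched as follows. The first part loosely mirrors the Truck/Truce example by treating the ASCII spelling as the arbitrary code for a concept; the second part uses made-up three-dimensional vectors as a stand-in for a distributed representation. The vectors, the concept names, and the helper `nearest` are all hypothetical.

```python
import math

# Localist flavor: the concept is stored as an arbitrary code (here ASCII bytes).
word = bytearray(b"Truck")
word[-1] ^= 0b0001110          # flip three bits of the last byte
print(word.decode())           # Truce

# Distributed flavor: each concept is a point in a vector space.
# These three-dimensional vectors are made up for illustration.
concepts = {
    "truck": (0.90, 0.80, 0.10),
    "lorry": (0.85, 0.75, 0.15),
    "truce": (0.10, 0.20, 0.90),
}


def nearest(vector):
    """Name of the stored concept closest to `vector` (Euclidean distance)."""
    return min(concepts, key=lambda name: math.dist(concepts[name], vector))


garbled = (0.80, 0.90, 0.20)   # "truck" with a small error in every coordinate
print(nearest(garbled))
```

Flipping three bits of the code turns one concept into an unrelated one, whereas the perturbed vector still lies among the vehicle concepts and far from "truce".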

Distributed representation

Summary

This chapter has been something of a whirlwind tour of AI, which we have conceived of as

the science of agent design. The major points to recall are as follows:

• An agent is something that perceives and acts in an environment. The agent function

for an agent specifies the action taken by the agent in response to any percept sequence.

• The performance measure evaluates the behavior of the agent in an environment. A

rational agent acts so as to maximize the expected value of the performance measure,

given the percept sequence it has seen so far.

• A task environment specification includes the performance measure, the external

environment, the actuators, and the sensors. In designing an agent, the first step must

always be to specify the task environment as fully as possible.

• Task environments vary along several significant dimensions. They can be fully or

partially observable, single-agent or multiagent, deterministic or nondeterministic,

episodic or sequential, static or dynamic, discrete or continuous, and known or

unknown.

• In cases where the performance measure is unknown or hard to specify correctly, there

is a significant risk of the agent optimizing the wrong objective. In such cases the agent

design should reflect uncertainty about the true objective.

• The agent program implements the agent function. There exists a variety of basic agent

program designs reflecting the kind of information made explicit and used in the

decision process. The designs vary in efficiency, compactness, and flexibility. The

appropriate design of the agent program depends on the nature of the environment.

• Simple reflex agents respond directly to percepts, whereas model-based reflex agents

maintain internal state to track aspects of the world that are not evident in the current

percept. Goal-based agents act to achieve their goals, and utility-based agents try to

maximize their own expected “happiness.”

• All agents can improve their performance through learning.

Bibliographical and Historical Notes

The central role of action in intelligence—the notion of practical reasoning—goes back at

least as far as Aristotle’s Nicomachean Ethics. Practical reasoning was also the subject of

McCarthy’s influential paper “Programs with Common Sense” (1958). The fields of robotics

and control theory are, by their very nature, concerned principally with physical agents. The

concept of a controller in control theory is identical to that of an agent in AI. Perhaps

surprisingly, AI has concentrated for most of its history on isolated components of agents—

question-answering systems, theorem-provers, vision systems, and so on—rather than on

whole agents. The discussion of agents in the text by Genesereth and Nilsson (1987) was an

influential exception. The whole-agent view is now widely accepted and is a central theme

in recent texts (Padgham and Winikoff, 2004; Jones, 2007; Poole and Mackworth, 2017).

Controller

Chapter 1 traced the roots of the concept of rationality in philosophy and economics. In

AI, the concept was of peripheral interest until the mid-1980s, when it began to suffuse

many discussions about the proper technical foundations of the field. A paper by Jon Doyle

(1983) predicted that rational agent design would come to be seen as the core mission of AI,

while other popular topics would spin off to form new disciplines.

Careful attention to the properties of the environment and their consequences for rational

agent design is most apparent in the control theory tradition—for example, classical control

systems (Dorf and Bishop, 2004; Kirk, 2004) handle fully observable, deterministic

environments; stochastic optimal control (Kumar and Varaiya, 1986; Bertsekas and Shreve,

2007) handles partially observable, stochastic environments; and hybrid control (Henzinger

and Sastry, 1998; Cassandras and Lygeros, 2006) deals with environments containing both

discrete and continuous elements. The distinction between fully and partially observable

environments is also central in the dynamic programming literature developed in the field

of operations research (Puterman, 1994), which we discuss in Chapter 17.


Although simple reflex agents were central to behaviorist psychology (see Chapter 1),

most AI researchers view them as too simple to provide much leverage. (Rosenschein (1985)

and Brooks (1986) questioned this assumption; see Chapter 26.) A great deal of work has

gone into finding efficient algorithms for keeping track of complex environments (Bar-

Shalom et al., 2001; Choset et al., 2005; Simon, 2006), most of it in the probabilistic setting.

Goal-based agents are presupposed in everything from Aristotle’s view of practical

reasoning to McCarthy’s early papers on logical AI. Shakey the Robot (Fikes and Nilsson,

1971; Nilsson, 1984) was the first robotic embodiment of a logical, goal-based agent. A full

logical analysis of goal-based agents appeared in Genesereth and Nilsson (1987), and a

goal-based programming methodology called agent-oriented programming was developed

by Shoham (1993). The agent-based approach is now extremely popular in software

engineering (Ciancarini and Wooldridge, 2001). It has also infiltrated the area of operating

systems, where autonomic computing refers to computer systems and networks that

monitor and control themselves with a perceive–act loop and machine learning methods

(Kephart and Chess, 2003). Noting that a collection of agent programs designed to work

well together in a true multiagent environment necessarily exhibits modularity—the

programs share no internal state and communicate with each other only through the

environment—it is common within the field of multiagent systems to design the agent

program of a single agent as a collection of autonomous sub-agents. In some cases, one can

even prove that the resulting system gives the same optimal solutions as a monolithic

design.

Autonomic computing

The goal-based view of agents also dominates the cognitive psychology tradition in the area

of problem solving, beginning with the enormously influential Human Problem Solving

(Newell and Simon, 1972) and running through all of Newell’s later work (Newell, 1990).

Goals, further analyzed as desires (general) and intentions (currently pursued), are central to

the influential theory of agents developed by Michael Bratman (1987).


As noted in Chapter 1, the development of utility theory as a basis for rational behavior

goes back hundreds of years. In AI, early research eschewed utilities in favor of goals, with

some exceptions (Feldman and Sproull, 1977). The resurgence of interest in probabilistic

methods in the 1980s led to the acceptance of maximization of expected utility as the most

general framework for decision making (Horvitz et al., 1988). The text by Pearl (1988) was

the first in AI to cover probability and utility theory in depth; its exposition of practical

methods for reasoning and decision making under uncertainty was probably the single

biggest factor in the rapid shift towards utility-based agents in the 1990s (see Chapter 16).

The formalization of reinforcement learning within a decision-theoretic framework also

contributed to this shift (Sutton, 1988). Somewhat remarkably, almost all AI research until

very recently has assumed that the performance measure can be exactly and correctly

specified in the form of a utility function or reward function (Hadfield-Menell et al., 2017a;

Russell, 2019).

The general design for learning agents portrayed in Figure 2.15 is classic in the machine

learning literature (Buchanan et al., 1978; Mitchell, 1997). Examples of the design, as

embodied in programs, go back at least as far as Arthur Samuel’s (1959, 1967) learning

program for playing checkers. Learning agents are discussed in depth in Chapters 19 –22 .

Some early papers on agent-based approaches are collected by Huhns and Singh (1998) and

Wooldridge and Rao (1999). Texts on multiagent systems provide a good introduction to

many aspects of agent design (Weiss, 2000a; Wooldridge, 2009). Several conference series

devoted to agents began in the 1990s, including the International Workshop on Agent

Theories, Architectures, and Languages (ATAL), the International Conference on

Autonomous Agents (AGENTS), and the International Conference on Multi-Agent Systems

(ICMAS). In 2002, these three merged to form the International Joint Conference on

Autonomous Agents and Multi-Agent Systems (AAMAS). From 2000 to 2012 there were

annual workshops on Agent-Oriented Software Engineering (AOSE). The journal

Autonomous Agents and Multi-Agent Systems was founded in 1998. Finally, Dung Beetle Ecology

(Hanski and Cambefort, 1991) provides a wealth of interesting information on the behavior

of dung beetles. YouTube has inspiring video recordings of their activities.

 


II Problem-solving

Chapter 3

Solving Problems by Searching

In which we see how an agent can look ahead to find a sequence of actions that will eventually

achieve its goal.

When the correct action to take is not immediately obvious, an agent may need to plan

ahead: to consider a sequence of actions that form a path to a goal state. Such an agent is

called a problem-solving agent, and the computational process it undertakes is called

search.

Problem-solving agent

Search

Problem-solving agents use atomic representations, as described in Section 2.4.7 —that is,

states of the world are considered as wholes, with no internal structure visible to the

problem-solving algorithms. Agents that use factored or structured representations of states

are called planning agents and are discussed in Chapters 7 and 11 .

We will cover several search algorithms. In this chapter, we consider only the simplest

environments: episodic, single agent, fully observable, deterministic, static, discrete, and

known. We distinguish between informed algorithms, in which the agent can estimate how

far it is from the goal, and uninformed algorithms, where no such estimate is available.

Chapter 4 relaxes the constraints on environments, and Chapter 5 considers multiple

agents.

 

 


This chapter uses the concepts of asymptotic complexity (that is, $O(n)$ notation). Readers

unfamiliar with these concepts should consult Appendix A.

3.1 Problem-Solving Agents

Imagine an agent enjoying a touring vacation in Romania. The agent wants to take in the

sights, improve its Romanian, enjoy the nightlife, avoid hangovers, and so on. The decision

problem is a complex one. Now, suppose the agent is currently in the city of Arad and has a

nonrefundable ticket to fly out of Bucharest the following day. The agent observes street

signs and sees that there are three roads leading out of Arad: one toward Sibiu, one to

Timisoara, and one to Zerind. None of these are the goal, so unless the agent is familiar with

the geography of Romania, it will not know which road to follow.

1 We are assuming that most readers are in the same position and can easily imagine themselves to be as clueless as our agent. We

apologize to Romanian readers who are unable to take advantage of this pedagogical device.

If the agent has no additional information—that is, if the environment is unknown—then the

agent can do no better than to execute one of the actions at random. This sad situation is

discussed in Chapter 4 . In this chapter, we will assume our agents always have access to

information about the world, such as the map in Figure 3.1 . With that information, the

agent can follow this four-phase problem-solving process:

GOAL FORMULATION: The agent adopts the goal of reaching Bucharest. Goals

organize behavior by limiting the objectives and hence the actions to be considered.

Goal formulation

PROBLEM FORMULATION: The agent devises a description of the states and actions

necessary to reach the goal—an abstract model of the relevant part of the world. For our

agent, one good model is to consider the actions of traveling from one city to an

adjacent city, and therefore the only fact about the state of the world that will change

due to an action is the current city.


Problem formulation

SEARCH: Before taking any action in the real world, the agent simulates sequences of

actions in its model, searching until it finds a sequence of actions that reaches the goal.

Such a sequence is called a solution. The agent might have to simulate multiple

sequences that do not reach the goal, but eventually it will find a solution (such as going

from Arad to Sibiu to Fagaras to Bucharest), or it will find that no solution is possible.

Search

Solution

EXECUTION: The agent can now execute the actions in the solution, one at a time.

Execution
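The four phases above can be sketched in a few lines of Python. This is only an illustrative sketch: the names ROADS, search, and run_agent are our own, the adjacency table is a hand-copied fragment of the map that should be checked against Figure 3.1, and breadth-first enumeration of action sequences is just one assumed way to carry out the SEARCH phase.

```python
from collections import deque

# PROBLEM FORMULATION: an abstract model in which the only thing that
# changes is the current city. Assumed fragment of the road map.
ROADS = {
    "Arad": ["Sibiu", "Timisoara", "Zerind"],
    "Sibiu": ["Arad", "Fagaras", "Rimnicu Vilcea"],
    "Fagaras": ["Sibiu", "Bucharest"],
    "Rimnicu Vilcea": ["Sibiu", "Pitesti"],
    "Pitesti": ["Rimnicu Vilcea", "Bucharest"],
    "Timisoara": ["Arad"],
    "Zerind": ["Arad"],
    "Bucharest": ["Fagaras", "Pitesti"],
}


def search(initial, goal):
    """SEARCH: simulate sequences of moves in the model until one reaches the goal.

    Returns the list of cities to drive to, or None if no solution exists.
    """
    frontier = deque([[initial]])  # each entry is a path of cities
    visited = {initial}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path[1:]
        for city in ROADS[path[-1]]:
            if city not in visited:
                visited.add(city)
                frontier.append(path + [city])
    return None


def run_agent(current_city, goal):
    """GOAL FORMULATION is the `goal` argument; EXECUTION follows the plan."""
    plan = search(current_city, goal)
    if plan is None:
        return current_city  # no solution: stay put
    for city in plan:  # execute the actions one at a time
        current_city = city
    return current_city


print(search("Arad", "Bucharest"))
```

With this adjacency table the sketch finds the same solution mentioned in the text, going through Sibiu and Fagaras. It finds a solution with the fewest actions, which is not necessarily the one with the shortest road distance.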

Figure 3.1

A simplified road map of part of Romania, with road distances in miles.

It is an important property that in a fully observable, deterministic, known environment, the

solution to any problem is a fixed sequence of actions: drive to Sibiu, then Fagaras, then

Bucharest. If the model is correct, then once the agent has found a solution, it can ignore its

percepts while it is executing the actions—closing its eyes, so to speak—because the solution

is guaranteed to lead to the goal. Control theorists call this an open-loop system: ignoring

the percepts breaks the loop between agent and environment. If there is a chance that the

model is incorrect, or the environment is nondeterministic, then the agent would be safer

using a closed-loop approach that monitors the percepts (see Section 4.4 ).

Open-loop

Closed-loop
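The difference between the two styles of execution can be made concrete with a small sketch. Everything here is hypothetical: MODEL is the agent's assumed transition table, faulty_env_step stands in for a real world in which the road out of Arad unexpectedly leads to Zerind, and the function names are ours.

```python
# The agent's model: (state, action) -> predicted next state.
MODEL = {
    ("Arad", "ToSibiu"): "Sibiu",
    ("Sibiu", "ToFagaras"): "Fagaras",
    ("Fagaras", "ToBucharest"): "Bucharest",
}


def model_step(state, action):
    """Predicted outcome; an inapplicable action leaves the state unchanged."""
    return MODEL.get((state, action), state)


def faulty_env_step(state, action):
    """A hypothetical real world that disagrees with the model on one road."""
    if (state, action) == ("Arad", "ToSibiu"):
        return "Zerind"
    return model_step(state, action)


def execute_open_loop(plan, env_step, state):
    """Run the whole plan without ever checking where we ended up."""
    for action in plan:
        state = env_step(state, action)
    return state


def execute_closed_loop(plan, env_step, state):
    """Compare each percept with the model's prediction; stop on a mismatch."""
    for action in plan:
        predicted = model_step(state, action)
        state = env_step(state, action)
        if state != predicted:
            return state, False  # the caller should replan from here
    return state, True


plan = ["ToSibiu", "ToFagaras", "ToBucharest"]
print(execute_open_loop(plan, model_step, "Arad"))
print(execute_closed_loop(plan, faulty_env_step, "Arad"))
```

When the model is correct, both executors reach Bucharest. When it is not, the open-loop executor ends up stranded in Zerind without noticing, while the closed-loop executor reports the deviation after the first action.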


In partially observable or nondeterministic environments, a solution would be a branching

strategy that recommends different future actions depending on what percepts arrive. For

example, the agent might plan to drive from Arad to Sibiu but might need a contingency

plan in case it arrives in Zerind by accident or finds a sign saying “Drum Închis” (Road

Closed).

3.1.1 Search problems and solutions

A search problem can be defined formally as follows:

Problem

A set of possible states that the environment can be in. We call this the state space.

States

State space

The initial state that the agent starts in. For example: Arad.

Initial state

A set of one or more goal states. Sometimes there is one goal state (e.g., Bucharest),

sometimes there is a small set of alternative goal states, and sometimes the goal is

defined by a property that applies to many states (potentially an infinite number). For

example, in a vacuum-cleaner world, the goal might be to have no dirt in any location,

regardless of any other facts about the state. We can account for all three of these

possibilities by specifying an IS-GOAL method for a problem. In this chapter we will

sometimes say “the goal” for simplicity, but what we say also applies to “any one of the

possible goal states.”

Goal states

The actions available to the agent. Given a state $s$, $\text{ACTIONS}(s)$ returns a finite set of actions that can be executed in $s$. We say that each of these actions is applicable in $s$. An example:

$$\text{ACTIONS}(\mathit{Arad}) = \{\mathit{ToSibiu}, \mathit{ToTimisoara}, \mathit{ToZerind}\}.$$

2 For problems with an infinite number of actions we would need techniques that go beyond this chapter.

Action

Applicable

A transition model, which describes what each action does. $\text{RESULT}(s, a)$ returns the state that results from doing action $a$ in state $s$. For example,

$$\text{RESULT}(\mathit{Arad}, \mathit{ToZerind}) = \mathit{Zerind}.$$

Transition model

An action cost function, denoted by $\text{ACTION-COST}(s, a, s')$ when we are programming or $c(s, a, s')$ when we are doing math, that gives the numeric cost of applying action $a$ in state $s$ to reach state $s'$. A problem-solving agent should use a cost function that reflects its own performance measure; for example, for route-finding agents, the cost of an action might be the length in miles (as seen in Figure 3.1), or it might be the time it takes to complete the action.

Action cost function

A sequence of actions forms a path, and a solution is a path from the initial state to a goal state. We assume that action costs are additive; that is, the total cost of a path is the sum of the individual action costs. An optimal solution has the lowest path cost among all solutions. In this chapter, we assume that all action costs will be positive, to avoid certain complications.

3 In any problem with a cycle of net negative cost, the cost-optimal solution is to go around that cycle an infinite number of times. The Bellman–Ford and Floyd–Warshall algorithms (not covered here) handle negative-cost actions, as long as there are no negative cycles. It is easy to accommodate zero-cost actions, as long as the number of consecutive zero-cost actions is bounded. For example, we might have a robot where there is a cost to move, but zero cost to rotate 90°; the algorithms in this chapter can handle this as long as no more than three consecutive 90° turns are allowed. There is also a complication with problems that have an infinite number of arbitrarily small action costs. Consider a version of Zeno’s paradox where there is an action to move half way to the goal, at a cost of half of the previous move. This problem has no solution with a finite number of actions, but to prevent a search from taking an unbounded number of actions without quite reaching the goal, we can require that all action costs be at least $\epsilon$, for some small positive value $\epsilon$.

Path

Optimal solution

The state space can be represented as a graph in which the vertices are states and the

directed edges between them are actions. The map of Romania shown in Figure 3.1 is such

a graph, where each road indicates two actions, one in each direction.

Graph
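The components of a search problem listed above map naturally onto a small class. The sketch below is one possible encoding, not a prescribed interface: the class name RouteProblem, the method names, and the helper path_cost are our own choices, city names are written without spaces so that action names such as ToSibiu are simple strings, and the mileages are a hand-copied subset that should be checked against Figure 3.1.

```python
# Assumed subset of the road map: each undirected road with its length in miles.
ROADS = {
    ("Arad", "Zerind"): 75,
    ("Arad", "Sibiu"): 140,
    ("Arad", "Timisoara"): 118,
    ("Sibiu", "Fagaras"): 99,
    ("Sibiu", "RimnicuVilcea"): 80,
    ("Fagaras", "Bucharest"): 211,
    ("RimnicuVilcea", "Pitesti"): 97,
    ("Pitesti", "Bucharest"): 101,
}


class RouteProblem:
    def __init__(self, initial, goal, roads):
        self.initial = initial
        self.goal = goal
        # Each road gives two directed edges (two actions), one per direction.
        self.miles = {}
        for (a, b), d in roads.items():
            self.miles.setdefault(a, {})[b] = d
            self.miles.setdefault(b, {})[a] = d

    def is_goal(self, s):
        return s == self.goal

    def actions(self, s):
        """The finite set of actions applicable in state s."""
        return {"To" + city for city in self.miles[s]}

    def result(self, s, a):
        """Transition model: the state reached by doing action a in state s."""
        if a not in self.actions(s):
            raise ValueError(f"{a} is not applicable in {s}")
        return a[2:]  # strip the "To" prefix

    def action_cost(self, s, a, s2):
        return self.miles[s][s2]


def path_cost(problem, actions):
    """Follow a sequence of actions; costs are additive along the path."""
    s, total = problem.initial, 0
    for a in actions:
        s2 = problem.result(s, a)
        total += problem.action_cost(s, a, s2)
        s = s2
    return s, total


problem = RouteProblem("Arad", "Bucharest", ROADS)
print(sorted(problem.actions("Arad")))
print(path_cost(problem, ["ToSibiu", "ToFagaras", "ToBucharest"]))
print(path_cost(problem, ["ToSibiu", "ToRimnicuVilcea", "ToPitesti", "ToBucharest"]))
```

Under these assumed distances both action sequences are solutions, but the route through Rimnicu Vilcea and Pitesti has the lower path cost, which illustrates why a solution need not be an optimal solution.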

3.1.2 Formulating problems

Our formulation of the problem of getting to Bucharest is a model—an abstract

mathematical description—and not the real thing. Compare the simple atomic state

description Arad to an actual cross-country trip, where the state of the world includes so

many things: the traveling companions, the current radio program, the scenery out of the

window, the proximity of law enforcement officers, the distance to the next rest stop, the

condition of the road, the weather, the traffic, and so on. All these considerations are left out

of our model because they are irrelevant to the problem of finding a route to Bucharest.

The process of removing detail from a representation is called abstraction. A good problem

formulation has the right level of detail. If the actions were at the level of “move the right

foot forward a centimeter” or “turn the steering wheel one degree left,” the agent would

probably never find its way out of the parking lot, let alone to Bucharest.

Abstraction

Can we be more precise about the appropriate level of abstraction? Think of the abstract

states and actions we have chosen as corresponding to large sets of detailed world states

and detailed action sequences. Now consider a solution to the abstract problem: for

example, the path from Arad to Sibiu to Rimnicu Vilcea to Pitesti to Bucharest. This abstract

solution corresponds to a large number of more detailed paths. For example, we could drive

with the radio on between Sibiu and Rimnicu Vilcea, and then switch it off for the rest of the

trip.

Level of abstraction

The abstraction is valid if we can elaborate any abstract solution into a solution in the more

detailed world; a sufficient condition is that for every detailed state that is “in Arad,” there is

a detailed path to some state that is “in Sibiu,” and so on. The abstraction is useful if

carrying out each of the actions in the solution is easier than the original problem; in our

case, the action “drive from Arad to Sibiu” can be carried out without further search or

planning by a driver with average skill. The choice of a good abstraction thus involves

removing as much detail as possible while retaining validity and ensuring that the abstract

actions are easy to carry out. Were it not for the ability to construct useful abstractions,

intelligent agents would be completely swamped by the real world.

4 See Section 11.4 .


3.2 Example Problems

The problem-solving approach has been applied to a vast array of task environments. We

list some of the best known here, distinguishing between standardized and real-world

problems. A standardized problem is intended to illustrate or exercise various problem-solving

methods. It can be given a concise, exact description and hence is suitable as a

benchmark for researchers to compare the performance of algorithms. A real-world

problem, such as robot navigation, is one whose solutions people actually use, and whose

formulation is idiosyncratic, not standardized, because, for example, each robot has different

sensors that produce different data.

Standardized problem

Real-world problem

3.2.1 Standardized problems

Grid world

A grid world problem is a two-dimensional rectangular array of square cells in which agents

can move from cell to cell. Typically the agent can move to any obstacle-free adjacent cell—

horizontally or vertically and in some problems diagonally. Cells can contain objects, which

the agent can pick up, push, or otherwise act upon; a wall or other impassible obstacle in a

cell prevents an agent from moving into that cell. The vacuum world from Section 2.1 can

be formulated as a grid world problem as follows:


STATES: A state of the world says which objects are in which cells. For the vacuum world, the objects are the agent and any dirt. In the simple two-cell version, the agent can be in either of the two cells, and each cell can either contain dirt or not, so there are $2 \cdot 2 \cdot 2 = 8$ states (see Figure 3.2). In general, a vacuum environment with $n$ cells has $n \cdot 2^n$ states.

Figure 3.2

The state-space graph for the two-cell vacuum world. There are 8 states and three actions for each state: L = Left, R = Right, S = Suck.

INITIAL STATE: Any state can be designated as the initial state.

ACTIONS: In the two-cell world we defined three actions: Suck, move Left, and move Right. In a two-dimensional multi-cell world we need more movement actions. We could add Upward and Downward, giving us four absolute movement actions, or we could switch to egocentric actions, defined relative to the viewpoint of the agent (for example, Forward, Backward, TurnRight, and TurnLeft).

TRANSITION MODEL: Suck removes any dirt from the agent’s cell; Forward moves the agent ahead one cell in the direction it is facing, unless it hits a wall, in which case the action has no effect. Backward moves the agent in the opposite direction, while TurnRight and TurnLeft change the direction it is facing by $90^\circ$.

GOAL STATES: The states in which every cell is clean.

ACTION COST: Each action costs 1.
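As a sanity check on the state count, the vacuum world formulated above can be enumerated directly. The sketch below assumes a single row of $n$ cells with the absolute actions Suck, Left, and Right; the state encoding (agent position plus a tuple of dirt flags) and all function names are our own.

```python
from itertools import product


def vacuum_states(n):
    """All states of an n-cell row: (agent position, dirt flag per cell)."""
    return [
        (pos, dirt)
        for pos in range(n)
        for dirt in product((False, True), repeat=n)
    ]


def result(state, action, n):
    """Transition model for the absolute actions Suck, Left, and Right."""
    pos, dirt = state
    if action == "Suck":
        dirt = dirt[:pos] + (False,) + dirt[pos + 1:]
    elif action == "Left":
        pos = max(pos - 1, 0)  # bumping into the wall has no effect
    elif action == "Right":
        pos = min(pos + 1, n - 1)
    return (pos, dirt)


def is_goal(state):
    """Goal states are those in which every cell is clean."""
    return not any(state[1])


print(len(vacuum_states(2)))
print(len(vacuum_states(5)) == 5 * 2 ** 5)
```

The two-cell case yields the 8 states of the state-space graph, and the general count agrees with $n \cdot 2^n$ because the agent position and each dirt flag vary independently.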

Sokoban puzzle

Another type of grid world is the sokoban puzzle, in which the agent’s goal is to push a

number of boxes, scattered about the grid, to designated storage locations. There can be at

most one box per cell. When an agent moves forward into a cell containing a box and there

is an empty cell on the other side of the box, then both the box and the agent move forward.

The agent can’t push a box into another box or a wall. For a world with $n$ non-obstacle cells and $b$ boxes, there are $n \times n!/(b!(n-b)!)$ states; for example, on an $8 \times 8$ grid with a dozen boxes, there are over 200 trillion states.

In a sliding-tile puzzle, a number of tiles (sometimes called blocks or pieces) are arranged in a grid with one or more blank spaces so that some of the tiles can slide into the blank space. One variant is the Rush Hour puzzle, in which cars and trucks slide around a $6 \times 6$ grid in an attempt to free a car from the traffic jam. Perhaps the best-known variants are the 8-puzzle (see Figure 3.3), which consists of a $3 \times 3$ grid with eight numbered tiles and one blank space, and the 15-puzzle on a $4 \times 4$ grid. The object is to reach a specified goal state, such as the one shown on the right of the figure. The standard formulation of the 8-puzzle is as follows:

STATES: A state description specifies the location of each of the tiles.

INITIAL STATE: Any state can be designated as the initial state. Note that a parity property partitions the state space: any given goal can be reached from exactly half of the possible initial states (see Exercise 3.PART).

ACTIONS: While in the physical world it is a tile that slides, the simplest way of describing an action is to think of the blank space moving Left, Right, Up, or Down. If the blank is at an edge or corner then not all actions will be applicable.

TRANSITION MODEL: Maps a state and action to a resulting state; for example, if we apply Left to the start state in Figure 3.3, the resulting state has the 5 and the blank switched.

Figure 3.3

A typical instance of the 8-puzzle.

GOAL STATE: Although any state could be the goal, we typically specify a state with the numbers in order, as in Figure 3.3.

ACTION COST: Each action costs 1.
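The 8-puzzle formulation above is small enough to write out in full. In this sketch a state is a tuple of nine entries read row by row, with 0 standing for the blank. The start state is our recollection of the instance in Figure 3.3 and should be treated as an assumption, as should all function names. The parity test uses the standard inversion-count argument for a $3 \times 3$ board, which is one way to realize the parity property mentioned under INITIAL STATE.

```python
# Index offsets for moving the blank on a 3 x 3 board stored row by row.
MOVES = {"Left": -1, "Right": 1, "Up": -3, "Down": 3}


def actions(state):
    """Blank moves that are applicable; fewer at an edge or corner."""
    row, col = divmod(state.index(0), 3)
    applicable = []
    if col > 0:
        applicable.append("Left")
    if col < 2:
        applicable.append("Right")
    if row > 0:
        applicable.append("Up")
    if row < 2:
        applicable.append("Down")
    return applicable


def result(state, action):
    """Transition model: swap the blank with the neighboring tile."""
    if action not in actions(state):
        raise ValueError(f"{action} is not applicable")
    i = state.index(0)
    j = i + MOVES[action]
    tiles = list(state)
    tiles[i], tiles[j] = tiles[j], tiles[i]
    return tuple(tiles)


def inversion_parity(state):
    """Parity of the number of out-of-order tile pairs, ignoring the blank."""
    tiles = [t for t in state if t != 0]
    inversions = sum(
        1
        for i in range(len(tiles))
        for j in range(i + 1, len(tiles))
        if tiles[i] > tiles[j]
    )
    return inversions % 2


def same_half(a, b):
    """On a 3 x 3 board, b is reachable from a only if the parities match."""
    return inversion_parity(a) == inversion_parity(b)


start = (7, 2, 4, 5, 0, 6, 8, 3, 1)  # assumed start state of Figure 3.3
goal = (0, 1, 2, 3, 4, 5, 6, 7, 8)
print(result(start, "Left"))
print(same_half(start, goal))
```

Applying Left to the assumed start state swaps the blank with the 5, as described under TRANSITION MODEL. Horizontal moves leave the tile order unchanged and vertical moves carry a tile past exactly two others, so no move changes the parity; this is why the state space splits into two halves.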

Sliding-tile puzzle

8-puzzle

15-puzzle

Note that every problem formulation involves abstractions. The 8-puzzle actions are

abstracted to their beginning and final states, ignoring the intermediate locations where the

tile is sliding. We have abstracted away actions such as shaking the board when tiles get

stuck and ruled out extracting the tiles with a knife and putting them back again. We are left

with a description of the rules, avoiding all the details of physical manipulations.

Our final standardized problem was devised by Donald Knuth (1964) and illustrates how

infinite state spaces can arise. Knuth conjectured that starting with the number 4, a sequence of square root, floor, and factorial operations can reach any desired positive integer. For example, we can reach 5 from 4 as follows:

$$\left\lfloor \sqrt{\sqrt{\sqrt{\sqrt{\sqrt{(4!)!}}}}} \right\rfloor = 5.$$

The problem definition is simple:

STATES: Positive real numbers.

INITIAL STATE: 4.

ACTIONS: Apply square root, floor, or factorial operation (factorial for integers only).

TRANSITION MODEL: As given by the mathematical definitions of the operations.

GOAL STATE: The desired positive integer.

ACTION COST: Each action costs 1.

The state space for this problem is infinite: for any integer greater than 2 the factorial operator will always yield a larger integer. The problem is interesting because it explores very large numbers: the shortest path to 5 goes through $(4!)! = 620{,}448{,}401{,}733{,}239{,}439{,}360{,}000$. Infinite state spaces arise frequently in tasks involving the generation of mathematical expressions, circuits, proofs, programs, and other recursively defined objects.

3.2.2 Real-world problems

We have already seen how the route-finding problem is defined in terms of specified locations and transitions along edges between them. Route-finding algorithms are used in a variety of applications. Some, such as Web sites and in-car systems that provide driving directions, are relatively straightforward extensions of the Romania example. (The main complications are varying costs due to traffic-dependent delays, and rerouting due to road closures.) Others, such as routing video streams in computer networks, military operations planning, and airline travel-planning systems, involve much more complex specifications. Consider the airline travel problems that must be solved by a travel-planning Web site:

STATES: Each state obviously includes a location (e.g., an airport) and the current time. Furthermore, because the cost of an action (a flight segment) may depend on previous segments, their fare bases, and their status as domestic or international, the state must record extra information about these “historical” aspects.

INITIAL STATE: The user’s home airport.

ACTIONS: Take any flight from the current location, in any seat class, leaving after the

current time, leaving enough time for within-airport transfer if needed.

TRANSITION MODEL: The state resulting from taking a flight will have the flight’s

destination as the new location and the flight’s arrival time as the new time.

GOAL STATE: A destination city. Sometimes the goal can be more complex, such as “arrive at the destination on a nonstop flight.”

ACTION COST: A combination of monetary cost, waiting time, flight time, customs and

immigration procedures, seat quality, time of day, type of airplane, frequent-flyer reward

points, and so on.
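The formulation above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `Flight` and `State` types, the `MIN_TRANSFER` value of 45 minutes, the airport codes, and the sample timetable are all hypothetical and do not come from any real travel-planning system. The sketch covers the state, actions, and transition model; fares, seat classes, and the full action cost are left out.

```python
from dataclasses import dataclass
from typing import NamedTuple


class Flight(NamedTuple):
    origin: str
    destination: str
    departs: int  # minutes since midnight
    arrives: int  # minutes since midnight


@dataclass(frozen=True)
class State:
    airport: str          # current location
    time: int             # current time, minutes since midnight
    segments: tuple = ()  # "historical" information: flights taken so far


MIN_TRANSFER = 45  # assumed within-airport transfer time, in minutes


def actions(state, flights):
    """Flights leaving the current airport late enough to be caught."""
    buffer = MIN_TRANSFER if state.segments else 0  # no transfer before the first flight
    return [f for f in flights
            if f.origin == state.airport and f.departs >= state.time + buffer]


def result(state, flight):
    """Transition model: new location and time come from the flight taken."""
    return State(flight.destination, flight.arrives, state.segments + (flight,))


# Hypothetical timetable.
FLIGHTS = [
    Flight("SFO", "ORD", 8 * 60, 14 * 60),
    Flight("ORD", "BOS", 14 * 60 + 30, 17 * 60),  # too tight a connection
    Flight("ORD", "BOS", 15 * 60, 17 * 60 + 30),
]

start = State("SFO", 7 * 60)
at_ord = result(start, actions(start, FLIGHTS)[0])
print(actions(at_ord, FLIGHTS))
```

Keeping the flights taken so far inside the state is what lets a later cost function depend on previous segments, as the STATES entry requires.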

Commercial travel advice systems use a problem formulation of this kind, with many

additional complications to handle the airlines’ byzantine fare structures. Any seasoned

traveler knows, however, that not all air travel goes according to plan. A really good system

should include contingency plans—what happens if this flight is delayed and the connection

is missed?

Touring problems describe a set of locations that must be visited, rather than a single goal

destination. The traveling salesperson problem (TSP) is a touring problem in which every

city on a map must be visited. The aim is to find a tour with cost $< C$ (or in the optimization version, to find a tour with the lowest cost possible). An enormous amount of effort has been expended to improve the capabilities of TSP algorithms. The algorithms can also be extended to handle fleets of vehicles. For example, a search and optimization algorithm for routing school buses in Boston saved $5 million, cut traffic and air pollution, and saved time for drivers and students (Bertsimas et al., 2019). In addition to planning trips, search algorithms have been used for tasks such as planning the movements of automatic circuit-board drills and of stocking machines on shop floors.

Touring problem

Traveling salesperson problem (TSP)
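The two versions of the TSP mentioned above (find a tour with cost below a bound, or find the cheapest tour) can be sketched by brute force on a tiny instance. This is only an illustration, not one of the TSP algorithms the text alludes to: it enumerates every ordering, so it is usable only for a handful of cities. The function names, the four cities, and the distances are hypothetical, and the sketch assumes that a tour returns to its starting city.

```python
from itertools import permutations


def tour_cost(tour, dist):
    """Cost of visiting the cities in order and returning to the first one."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def find_tour_below(cities, dist, bound):
    """Decision version: return some tour with cost < bound, or None."""
    first, rest = cities[0], cities[1:]
    for perm in permutations(rest):
        tour = [first, *perm]
        if tour_cost(tour, dist) < bound:
            return tour
    return None


def cheapest_tour(cities, dist):
    """Optimization version: return a tour of minimum cost."""
    first, rest = cities[0], cities[1:]
    tours = ([first, *perm] for perm in permutations(rest))
    return min(tours, key=lambda t: tour_cost(t, dist))


def symmetric(edges):
    """Build a symmetric distance table from (a, b, cost) triples."""
    dist = {}
    for a, b, cost in edges:
        dist.setdefault(a, {})[b] = cost
        dist.setdefault(b, {})[a] = cost
    return dist


CITIES = ["A", "B", "C", "D"]
DIST = symmetric([("A", "B", 1), ("B", "C", 2), ("C", "D", 3),
                  ("D", "A", 4), ("A", "C", 5), ("B", "D", 6)])

best = cheapest_tour(CITIES, DIST)
print(best, tour_cost(best, DIST))
```

Fixing the first city avoids counting rotations of the same tour more than once.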

A VLSI layout problem requires positioning millions of components and connections on a

chip to minimize area, minimize circuit delays, minimize stray capacitances, and maximize

manufacturing yield. The layout problem comes after the logical design phase and is usually

split into two parts: cell layout and channel routing. In cell layout, the primitive

components of the circuit are grouped into cells, each of which performs some recognized

function. Each cell has a fixed footprint (size and shape) and requires a certain number of

connections to each of the other cells. The aim is to place the cells on the chip so that they

do not overlap and so that there is room for the connecting wires to be placed between the

cells. Channel routing finds a specific route for each wire through the gaps between the

cells. These search problems are extremely complex, but definitely worth solving.

VLSI layout

Robot navigation is a generalization of the route-finding problem described earlier. Rather

than following distinct paths (such as the roads in Romania), a robot can roam around, in

effect making its own paths. For a circular robot moving on a flat surface, the space is

essentially two-dimensional. When the robot has arms and legs that must also be controlled,

the search space becomes many-dimensional—one dimension for each joint angle.

Advanced techniques are required just to make the essentially continuous search space

finite (see Chapter 26 ). In addition to the complexity of the problem, real robots must also

deal with errors in their sensor readings and motor controls, with partial observability, and

with other agents that might alter the environment.

Robot navigation

Automatic assembly sequencing of complex objects (such as electric motors) by a robot has

been standard industry practice since the 1970s. Algorithms first find a feasible assembly

sequence and then work to optimize the process. Minimizing the amount of manual human

labor on the assembly line can produce significant savings in time and cost. In assembly

problems, the aim is to find an order in which to assemble the parts of some object. If the

wrong order is chosen, there will be no way to add some part later in the sequence without

undoing some of the work already done. Checking an action in the sequence for feasibility is

a difficult geometrical search problem closely related to robot navigation. Thus, the

generation of legal actions is the expensive part of assembly sequencing. Any practical

algorithm must avoid exploring all but a tiny fraction of the state space. One important

assembly problem is protein design, in which the goal is to find a sequence of amino acids

that will fold into a three-dimensional protein with the right properties to cure some

disease.

Automatic assembly sequencing

Protein design

3.3 Search Algorithms

A search algorithm takes a search problem as input and returns a solution, or an indication

of failure. In this chapter we consider algorithms that superimpose a search tree over the

state-space graph, forming various paths from the initial state, trying to find a path that

reaches a goal state. Each node in the search tree corresponds to a state in the state space

and the edges in the search tree correspond to actions. The root of the tree corresponds to

the initial state of the problem.

Search algorithm

Node

It is important to understand the distinction between the state space and the search tree.

The state space describes the (possibly infinite) set of states in the world, and the actions

that allow transitions from one state to another. The search tree describes paths between

these states, reaching towards the goal. The search tree may have multiple paths to (and

thus multiple nodes for) any given state, but each node in the tree has a unique path back to

the root (as in all trees).

Expand

Figure 3.4 shows the first few steps in finding a path from Arad to Bucharest. The root

node of the search tree is at the initial state, Arad. We can expand the node, by considering

the available ACTIONS for that state, using the RESULT function to see where those actions lead

to, and generating a new node (called a child node or successor node) for each of the

resulting states. Each child node has Arad as its parent node.

Figure 3.4

Three partial search trees for finding a route from Arad to Bucharest. Nodes that have been expanded are

lavender with bold letters; nodes on the frontier that have been generated but not yet expanded are in

green; the set of states corresponding to these two types of nodes are said to have been reached. Nodes

that could be generated next are shown in faint dashed lines. Notice in the bottom tree there is a cycle

from Arad to Sibiu to Arad; that can’t be an optimal path, so search should not continue from there.

Generating

Child node

Successor node

Parent node

Now we must choose which of these three child nodes to consider next. This is the essence

of search—following up one option now and putting the others aside for later. Suppose we

choose to expand Sibiu first. Figure 3.4 (bottom) shows the result: a set of 6 unexpanded

nodes (outlined in bold). We call this the frontier of the search tree. We say that any state

that has had a node generated for it has been reached (whether or not that node has been

expanded). Figure 3.5 shows the search tree superimposed on the state-space graph.

5 Some authors call the frontier the open list, which is both geographically less evocative and computationally less appropriate,

because a queue is more efficient than a list here. Those authors use the term closed list to refer to the set of previously expanded

nodes, which in our terminology would be the reached nodes minus the frontier.

Figure 3.5

A sequence of search trees generated by a graph search on the Romania problem of Figure 3.1 . At each

stage, we have expanded every node on the frontier, extending every path with all applicable actions that

don’t result in a state that has already been reached. Notice that at the third stage, the topmost city

(Oradea) has two successors, both of which have already been reached by other paths, so no paths are extended from Oradea.

Frontier

Reached

Note that the frontier separates two regions of the state-space graph: an interior region

where every state has been expanded, and an exterior region of states that have not yet

been reached. This property is illustrated in Figure 3.6 .

Figure 3.6

The separation property of graph search, illustrated on a rectangular-grid problem. The frontier (green)

separates the interior (lavender) from the exterior (faint dashed). The frontier is the set of nodes (and

corresponding states) that have been reached but not yet expanded; the interior is the set of nodes (and

corresponding states) that have been expanded; and the exterior is the set of states that have not been

reached. In (a), just the root has been expanded. In (b), the top frontier node is expanded. In (c), the

remaining successors of the root are expanded in clockwise order.

Separator

3.3.1 Best-first search

How do we decide which node from the frontier to expand next? A very general approach is called best-first search, in which we choose a node, $n$, with minimum value of some evaluation function, $f(n)$. Figure 3.7 shows the algorithm. On each iteration we choose a node on the frontier with minimum $f(n)$ value, return it if its state is a goal state, and otherwise apply EXPAND to generate child nodes. Each child node is added to the frontier if it has not been reached before, or is re-added if it is now being reached with a path that has a lower path cost than any previous path. The algorithm returns either an indication of failure, or a node that represents a path to a goal. By employing different $f(n)$ functions, we get different specific algorithms, which this chapter will cover.

Figure 3.7

The best-first search algorithm, and the function for expanding a node. The data structures used here are described in Section 3.3.2. See Appendix B for yield.

Best-first search

Evaluation function

3.3.2 Search data structures

Search algorithms require a data structure to keep track of the search tree. A node in the

tree is represented by a data structure with four components:

node.STATE: the state to which the node corresponds;

node.PARENT: the node in the tree that generated this node;

node.ACTION: the action that was applied to the parent’s state to generate this node;

node.PATH-COST: the total cost of the path from the initial state to this node. In mathematical formulas, we use $g(node)$ as a synonym for PATH-COST.

Following the PARENT pointers back from a node allows us to recover the states and actions along the path to that node. Doing this from a goal node gives us the solution.

We need a data structure to store the frontier. The appropriate choice is a queue of some kind, because the operations on a frontier are:

IS-EMPTY(frontier) returns true only if there are no nodes in the frontier.

POP(frontier) removes the top node from the frontier and returns it.

TOP(frontier) returns (but does not remove) the top node of the frontier.

ADD(node, frontier) inserts node into its proper place in the queue.

Queue

Three kinds of queues are used in search algorithms:

A priority queue first pops the node with the minimum cost according to some evaluation function, $f$. It is used in best-first search.

Priority queue

A FIFO queue or first-in-first-out queue first pops the node that was added to the queue

first; we shall see it is used in breadth-first search.

FIFO queue

A LIFO queue or last-in-first-out queue (also known as a stack) pops first the most

recently added node; we shall see it is used in depth-first search.

LIFO queue

Stack

The reached states can be stored as a lookup table (e.g. a hash table) where each key is a

state and each value is the node for that state.
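The node structure, the priority-queue frontier, and the reached table described above can be put together in a short sketch of best-first search. It is written from the prose description rather than transcribed from Figure 3.7, so the names (`Node`, `expand`, `best_first_search`, `path_states`) and the dictionary-based graph are assumptions. The road fragment uses the 140-mile Arad–Sibiu distance from the text; the three individual distances on the Zerind route are assumed values chosen to add up to the 297 miles quoted for that path.

```python
import heapq
from dataclasses import dataclass
from itertools import count
from typing import Any, Optional


@dataclass
class Node:
    state: Any
    parent: Optional["Node"] = None
    action: Any = None
    path_cost: float = 0.0


def expand(graph, node):
    """Generate one child node per action available in node.state."""
    for next_state, cost in graph[node.state].items():
        # The action is "go to next_state", so it is named after its result.
        yield Node(next_state, node, next_state, node.path_cost + cost)


def best_first_search(graph, initial, goal, f):
    node = Node(initial)
    tie = count()  # tie-breaker so the heap never has to compare Node objects
    frontier = [(f(node), next(tie), node)]  # priority queue ordered by f
    reached = {initial: node}                # lookup table: state -> best node
    while frontier:
        _, _, node = heapq.heappop(frontier)
        if node.state == goal:               # late goal test
            return node
        for child in expand(graph, node):
            s = child.state
            if s not in reached or child.path_cost < reached[s].path_cost:
                reached[s] = child
                heapq.heappush(frontier, (f(child), next(tie), child))
    return None                              # failure


def path_states(node):
    """Follow PARENT pointers back to the root and return the states in order."""
    states = []
    while node is not None:
        states.append(node.state)
        node = node.parent
    return states[::-1]


ROADS = {
    "Arad": {"Zerind": 75, "Sibiu": 140},
    "Zerind": {"Arad": 75, "Oradea": 71},
    "Oradea": {"Zerind": 71, "Sibiu": 151},
    "Sibiu": {"Arad": 140, "Oradea": 151},
}

solution = best_first_search(ROADS, "Arad", "Sibiu", f=lambda n: n.path_cost)
print(path_states(solution), solution.path_cost)
```

Passing a different `f` changes which node is popped first, which is how the same skeleton yields different algorithms.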

3.3.3 Redundant paths

The search tree shown in Figure 3.4 (bottom) includes a path from Arad to Sibiu and back

to Arad again. We say that Arad is a repeated state in the search tree, generated in this case

by a cycle (also known as a loopy path). So even though the state space has only 20 states,

the complete search tree is infinite because there is no limit to how often one can traverse a

loop.

Repeated state

Cycle

Loopy path

A cycle is a special case of a redundant path. For example, we can get to Sibiu via the path

Arad–Sibiu (140 miles long) or the path Arad–Zerind–Oradea–Sibiu (297 miles long). This

second path is redundant—it’s just a worse way to get to the same state—and need not be

considered in our quest for optimal paths.

Redundant path

Consider an agent in a $10 \times 10$ grid world, with the ability to move to any of 8 adjacent squares. If there are no obstacles, the agent can reach any of the 100 squares in 9 moves or fewer. But the number of paths of length 9 is almost $8^9$ (a bit less because of the edges of the grid), or more than 100 million. In other words, the average cell can be reached by over a million redundant paths of length 9, and if we eliminate redundant paths, we can complete a search roughly a million times faster. As the saying goes, algorithms that cannot remember the past are doomed to repeat it. There are three approaches to this issue.

First, we can remember all previously reached states (as best-first search does), allowing us to detect all redundant paths, and keep only the best path to each state. This is appropriate for state spaces where there are many redundant paths, and is the preferred choice when the table of reached states will fit in memory.

Graph search

Tree-like search
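The counting argument for the grid world can be checked with a small dynamic program that counts 9-move paths cell by cell instead of enumerating them. This is only a sketch: the function name and the choice of start square (4, 4) are assumptions, and the exact total depends on where the agent starts.

```python
def count_paths(size=10, steps=9, start=(4, 4)):
    """Return {cell: number of distinct paths of exactly `steps` moves ending there}."""
    moves = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    counts = {start: 1}
    for _ in range(steps):
        nxt = {}
        for (r, c), n in counts.items():
            for dr, dc in moves:
                cell = (r + dr, c + dc)
                if 0 <= cell[0] < size and 0 <= cell[1] < size:  # stay on the grid
                    nxt[cell] = nxt.get(cell, 0) + n
        counts = nxt
    return counts


counts = count_paths()
total = sum(counts.values())
print("upper bound 8**9 =", 8 ** 9)
print("paths of length 9 from (4, 4):", total)
print("average paths per reached cell:", total // len(counts))
```

The program touches at most 100 cells per step, which is the saving a reached table provides: it works per state rather than per path.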

Second, we can not worry about repeating the past. There are some problem formulations where it is rare or impossible for two paths to reach the same state. An example would be an assembly problem where each action adds a part to an evolving assemblage, and there is an ordering of parts so that it is possible to add $A$ and then $B$, but not $B$ and then $A$. For those problems, we could save memory space if we don’t track reached states and we don’t check for redundant paths. We call a search algorithm a graph search if it checks for redundant paths and a tree-like search if it does not check. The BEST-FIRST-SEARCH algorithm in Figure 3.7 is a graph search algorithm; if we remove all references to reached we get a tree-like search that uses less memory but will examine redundant paths to the same state, and thus will run slower.

6 We say “tree-like search” because the state space is still the same graph no matter how we search it; we are just choosing to treat it as if it were a tree, with only one path from each node back to the root.

Third, we can compromise and check for cycles, but not for redundant paths in general. Since each node has a chain of parent pointers, we can check for cycles with no need for additional memory by following up the chain of parents to see if the state at the end of the path has appeared earlier in the path. Some implementations follow this chain all the way up, and thus eliminate all cycles; other implementations follow only a few links (e.g., to the parent, grandparent, and great-grandparent), and thus take only a constant amount of time, while eliminating all short cycles (and relying on other mechanisms to deal with long cycles).

3.3.4 Measuring problem-solving performance

Before we get into the design of various search algorithms, we will consider the criteria used to choose among them. We can evaluate an algorithm’s performance in four ways:

COMPLETENESS: Is the algorithm guaranteed to find a solution when there is one, and

to correctly report failure when there is not?

Completeness

COST OPTIMALITY: Does it find a solution with the lowest path cost of all solutions?

7 Some authors use the term “admissibility” for the property of finding the lowest-cost solution, and some use just “optimality,”

but that can be confused with other types of optimality.

Cost optimality

TIME COMPLEXITY: How long does it take to find a solution? This can be measured in

seconds, or more abstractly by the number of states and actions considered.

Time complexity

SPACE COMPLEXITY: How much memory is needed to perform the search?

Space complexity

To understand completeness, consider a search problem with a single goal. That goal could

be anywhere in the state space; therefore a complete algorithm must be capable of

systematically exploring every state that is reachable from the initial state. In finite state

spaces that is straightforward to achieve: as long as we keep track of paths and cut off ones

that are cycles (e.g. Arad to Sibiu to Arad), eventually we will reach every reachable state.

In infinite state spaces, more care is necessary. For example, an algorithm that repeatedly

applied the “factorial” operator in Knuth’s “4” problem would follow an infinite path from 4

to 4! to (4!)!, and so on. Similarly, on an infinite grid with no obstacles, repeatedly moving

forward in a straight line also follows an infinite path of new states. In both cases the

algorithm never returns to a state it has reached before, but is incomplete because wide

expanses of the state space are never reached.
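The difference between following a single infinite path and exploring the state space level by level can be sketched on Knuth's “4” problem. The code below is an illustration under assumptions, not the book's algorithm: `FACTORIAL_CAP` is an artificial limit (not part of Knuth's problem) that stops the factorial operation from producing numbers too large to represent, `max_depth` is a safety bound, and states are ordinary Python floats, so very large values are only approximate.

```python
import math
from collections import deque

FACTORIAL_CAP = 30  # assumption: never take the factorial of anything larger


def normalize(x):
    """Store whole-number values as ints so that factorial can apply to them."""
    if isinstance(x, float) and x.is_integer():
        return int(x)
    return x


def successors(x):
    """Yield (action, next_state) pairs for the three operations."""
    yield "sqrt", normalize(math.sqrt(x))
    yield "floor", math.floor(x)
    if isinstance(x, int) and x <= FACTORIAL_CAP:  # factorial for integers only
        yield "factorial", math.factorial(x)


def knuth_search(start, goal, max_depth=12):
    """Explore states in order of the number of actions needed to reach them."""
    frontier = deque([(start, [])])
    reached = {start}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        if len(path) >= max_depth:
            continue
        for action, nxt in successors(state):
            if nxt not in reached:
                reached.add(nxt)
                frontier.append((nxt, path + [action]))
    return None


print(knuth_search(4, 5))
```

An algorithm that always chose factorial would never leave the path 4, 4!, (4!)!, and so on; visiting states by increasing number of actions is what lets this sketch reach 5.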

To be complete, a search algorithm must be systematic in the way it explores an infinite state space, making sure it can eventually reach any state that is connected to the initial state. For example, on the infinite grid, one kind of systematic search is a spiral path that covers all the cells that are $s$ steps from the origin before moving out to cells that are $s + 1$ steps away. Unfortunately, in an infinite state space with no solution, a sound algorithm needs to keep searching forever; it can’t terminate because it can’t know if the next state will be a goal.

Systematic

Time and space complexity are considered with respect to some measure of the problem difficulty. In theoretical computer science, the typical measure is the size of the state-space graph, $|V| + |E|$, where $|V|$ is the number of vertices (state nodes) of the graph and $|E|$ is the number of edges (distinct state/action pairs). This is appropriate when the graph is an explicit data structure, such as the map of Romania. But in many AI problems, the graph is represented only implicitly by the initial state, actions, and transition model. For an implicit state space, complexity can be measured in terms of $d$, the depth or number of actions in an optimal solution; $m$, the maximum number of actions in any path; and $b$, the branching factor or number of successors of a node that need to be considered.

Depth

Branching factor

3.4 Uninformed Search Strategies

An uninformed search algorithm is given no clue about how close a state is to the goal(s).

For example, consider our agent in Arad with the goal of reaching Bucharest. An

uninformed agent with no knowledge of Romanian geography has no clue whether going to

Zerind or Sibiu is a better first step. In contrast, an informed agent (Section 3.5 ) who

knows the location of each city knows that Sibiu is much closer to Bucharest and thus more

likely to be on the shortest path.

3.4.1 Breadth-first search

When all actions have the same cost, an appropriate strategy is breadth-first search, in which the root node is expanded first, then all the successors of the root node are expanded next, then their successors, and so on. This is a systematic search strategy that is therefore complete even on infinite state spaces. We could implement breadth-first search as a call to BEST-FIRST-SEARCH where the evaluation function $f(n)$ is the depth of the node, that is, the number of actions it takes to reach the node.

Breadth-first search

However, we can get additional efficiency with a couple of tricks. A first-in-first-out queue will be faster than a priority queue, and will give us the correct order of nodes: new nodes (which are always deeper than their parents) go to the back of the queue, and old nodes, which are shallower than the new nodes, get expanded first. In addition, reached can be a set of states rather than a mapping from states to nodes, because once we’ve reached a state, we can never find a better path to the state. That also means we can do an early goal test, checking whether a node is a solution as soon as it is generated, rather than the late goal test that best-first search uses, waiting until a node is popped off the queue. Figure 3.8 shows the progress of a breadth-first search on a binary tree, and Figure 3.9 shows the algorithm with the early-goal efficiency enhancements.

Figure 3.8

Breadth-first search on a simple binary tree. At each stage, the node to be expanded next is indicated by

the triangular marker.

Figure 3.9

Breadth-first search and uniform-cost search algorithms.

Early goal test

Late goal test
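The efficiency tricks described above (a FIFO queue, a set of reached states, and an early goal test) can be sketched as follows. This is written from the prose rather than copied from Figure 3.9; the function signature, the use of whole paths instead of nodes with parent pointers, and the small integer-labelled binary tree are assumptions made to keep the example short.

```python
from collections import deque


def breadth_first_search(successors, start, is_goal):
    """Return a list of states from start to a goal, or None on failure."""
    if is_goal(start):
        return [start]
    frontier = deque([[start]])  # FIFO queue of paths; the last element is the state
    reached = {start}            # a set suffices: the first path found is the shallowest
    while frontier:
        path = frontier.popleft()
        for child in successors(path[-1]):
            if is_goal(child):   # early goal test, applied when the child is generated
                return path + [child]
            if child not in reached:
                reached.add(child)
                frontier.append(path + [child])
    return None


def binary_tree(n):
    """Hypothetical tree: node n has children 2n and 2n + 1, down to three levels."""
    return [2 * n, 2 * n + 1] if n < 8 else []


print(breadth_first_search(binary_tree, 1, lambda s: s == 6))
```

Storing whole paths costs extra memory compared with parent pointers; it is done here only so the result can be printed directly.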

Breadth-first search always finds a solution with a minimal number of actions, because when it is generating nodes at depth $d$, it has already generated all the nodes at depth $d - 1$, so if one of them were a solution, it would have been found. That means it is cost-optimal for problems where all actions have the same cost, but not for problems that don’t have that property. It is complete in either case. In terms of time and space, imagine searching a uniform tree where every state has $b$ successors. The root of the search tree generates $b$ nodes, each of which generates $b$ more nodes, for a total of $b^2$ at the second level. Each of these generates $b$ more nodes, yielding $b^3$ nodes at the third level, and so on. Now suppose that the solution is at depth $d$. Then the total number of nodes generated is

$$1 + b + b^2 + b^3 + \cdots + b^d = O(b^d).$$

All the nodes remain in memory, so both time and space complexity are $O(b^d)$. Exponential bounds like that are scary. As a typical real-world example, consider a problem with branching factor $b = 10$, processing speed 1 million nodes/second, and memory requirements of 1 Kbyte/node. A search to depth $d = 10$ would take less than 3 hours, but would require 10 terabytes of memory. The memory requirements are a bigger problem for breadth-first search than the execution time. But time is still an important factor. At depth $d = 14$, even with infinite memory, the search would take 3.5 years. In general, exponential-complexity search problems cannot be solved by uninformed search for any but the smallest instances.

3.4.2 Dijkstra’s algorithm or uniform-cost search

When actions have different costs, an obvious choice is to use best-first search where the evaluation function is the cost of the path from the root to the current node. This is called Dijkstra’s algorithm by the theoretical computer science community, and uniform-cost search by the AI community. The idea is that while breadth-first search spreads out in waves of uniform depth (first depth 1, then depth 2, and so on), uniform-cost search spreads out in waves of uniform path-cost. The algorithm can be implemented as a call to BEST-FIRST-SEARCH with PATH-COST as the evaluation function, as shown in Figure 3.9.

Uniform-cost search

Consider Figure 3.10, where the problem is to get from Sibiu to Bucharest. The successors of Sibiu are Rimnicu Vilcea and Fagaras, with costs 80 and 99, respectively. The least-cost node, Rimnicu Vilcea, is expanded next, adding Pitesti with cost $80 + 97 = 177$. The least-cost node is now Fagaras, so it is expanded, adding Bucharest with cost $99 + 211 = 310$. Bucharest is the goal, but the algorithm tests for goals only when it expands a node, not when it generates a node, so it has not yet detected that this is a path to the goal.

Figure 3.10

Part of the Romania state space, selected to illustrate uniform-cost search.

The algorithm continues on, choosing Pitesti for expansion next and adding a second path to Bucharest with cost $80 + 97 + 101 = 278$. It has a lower cost, so it replaces the previous path in reached and is added to the frontier. It turns out this node now has the lowest cost, so it is considered next, found to be a goal, and returned. Note that if we had checked for a goal upon generating a node rather than when expanding the lowest-cost node, then we would have returned a higher-cost path (the one through Fagaras).

The complexity of uniform-cost search is characterized in terms of $C^*$, the cost of the optimal solution, and $\epsilon$, a lower bound on the cost of each action, with $\epsilon > 0$. Then the algorithm’s worst-case time and space complexity is $O(b^{1+\lfloor C^*/\epsilon \rfloor})$, which can be much greater than $b^d$. This is because uniform-cost search can explore large trees of actions with low costs before exploring paths involving a high-cost and perhaps useful action. When all action costs are equal, $b^{1+\lfloor C^*/\epsilon \rfloor}$ is just $b^{d+1}$, and uniform-cost search is similar to breadth-first search.
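The Sibiu-to-Bucharest trace can be reproduced with a short sketch that contrasts the late goal test with an early one. The step costs are the ones quoted in the text; the function name, the `early_goal_test` flag, and the one-directional road table are assumptions made for this illustration (the real map has two-way roads and more cities).

```python
import heapq

ROADS = {
    "Sibiu": {"Rimnicu Vilcea": 80, "Fagaras": 99},
    "Rimnicu Vilcea": {"Pitesti": 97},
    "Fagaras": {"Bucharest": 211},
    "Pitesti": {"Bucharest": 101},
    "Bucharest": {},
}


def uniform_cost_search(roads, start, goal, early_goal_test=False):
    """Return the cost of the path found from start to goal, or None."""
    frontier = [(0, start)]  # priority queue ordered by path cost
    reached = {start: 0}     # best known path cost for each reached state
    while frontier:
        cost, state = heapq.heappop(frontier)
        if cost > reached[state]:
            continue         # stale entry: a cheaper path was found later
        if state == goal:    # late goal test, applied when the node is expanded
            return cost
        for nxt, step in roads[state].items():
            new_cost = cost + step
            if nxt not in reached or new_cost < reached[nxt]:
                if early_goal_test and nxt == goal:
                    return new_cost  # stops at the first path generated
                reached[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, nxt))
    return None


print(uniform_cost_search(ROADS, "Sibiu", "Bucharest"))                        # 278
print(uniform_cost_search(ROADS, "Sibiu", "Bucharest", early_goal_test=True))  # 310
```

With the early test the search returns the 310 path through Fagaras as soon as Bucharest is generated, before the cheaper 278 path through Pitesti has been discovered.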

8 Here, and throughout the book, the “star” in $C^*$ means an optimal value for $C$.

Uniform-cost search is complete and is cost-optimal, because the first solution it finds will have a cost that is at least as low as the cost of any other node in the frontier. Uniform-cost search considers all paths systematically in order of increasing cost, never getting caught going down a single infinite path (assuming that all action costs are $> \epsilon > 0$).

3.4.3 Depth-first search and the problem of memory

Depth-first search

Depth-first search always expands the deepest node in the frontier first. It could be implemented as a call to BEST-FIRST-SEARCH where the evaluation function $f$ is the negative of the depth. However, it is usually implemented not as a graph search but as a tree-like search that does not keep a table of reached states. The progress of the search is illustrated in Figure 3.11; search proceeds immediately to the deepest level of the search tree, where the nodes have no successors. The search then “backs up” to the next deepest node that still has unexpanded successors. Depth-first search is not cost-optimal; it returns the first solution it finds, even if it is not cheapest.

Figure 3.11

A dozen steps (left to right, top to bottom) in the progress of a depth-first search on a binary tree from

start state A to goal M. The frontier is in green, with a triangle marking the node to be expanded next.

Previously expanded nodes are lavender, and potential future nodes have faint dashed lines. Expanded

nodes with no descendants in the frontier (very faint lines) can be discarded.

For finite state spaces that are trees it is efficient and complete; for acyclic state spaces it

may end up expanding the same state many times via different paths, but will (eventually)

systematically explore the entire space.

In cyclic state spaces it can get stuck in an infinite loop; therefore some implementations of

depth-first search check each new node for cycles. Finally, in infinite state spaces, depth-first

search is not systematic: it can get stuck going down an infinite path, even if there are no

cycles. Thus, depth-first search is incomplete.

With all this bad news, why would anyone consider using depth-first search rather than

breadth-first or best-first? The answer is that for problems where a tree-like search is

feasible, depth-first search has much smaller needs for memory. We don’t keep a reached

table at all, and the frontier is very small: think of the frontier in breadth-first search as the

surface of an ever-expanding sphere, while the frontier in depth-first search is just a radius

of the sphere.
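The sphere-versus-radius picture can be checked numerically by traversing the same tree with a FIFO frontier and with a LIFO frontier and recording the largest frontier seen. This is a sketch under assumptions: the tree is a complete binary tree of depth 10 labelled by integers, no goal test is performed, and the function name is hypothetical.

```python
from collections import deque


def max_frontier_size(max_depth, lifo):
    """Traverse a complete binary tree and return the largest frontier size seen."""
    frontier = deque([(0, 0)])  # entries are (node id, depth); the root is node 0
    largest = 1
    while frontier:
        node, depth = frontier.pop() if lifo else frontier.popleft()
        if depth < max_depth:   # nodes at the deepest level have no successors
            frontier.append((2 * node + 1, depth + 1))
            frontier.append((2 * node + 2, depth + 1))
        largest = max(largest, len(frontier))
    return largest


print("breadth-first:", max_frontier_size(10, lifo=False))  # 1024
print("depth-first:  ", max_frontier_size(10, lifo=True))   # 11
```

The breadth-first frontier grows to hold the entire deepest level, while the depth-first frontier holds only the unexpanded siblings along the current path.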

For a finite tree-shaped state-space like the one in Figure 3.11, a depth-first tree-like search takes time proportional to the number of states, and has memory complexity of only $O(bm)$, where $b$ is the branching factor and $m$ is the maximum depth of the tree. Some problems that would require exabytes of memory with breadth-first search can be handled with only kilobytes using depth-first search. Because of its parsimonious use of memory, depth-first tree-like search has been adopted as the basic workhorse of many areas of AI, including constraint satisfaction (Chapter 6), propositional satisfiability (Chapter 7), and logic programming (Chapter 9).

A variant of depth-first search called backtracking search uses even less memory. (See Chapter 6 for more details.) In backtracking, only one successor is generated at a time rather than all successors; each partially expanded node remembers which successor to generate next. In addition, successors are generated by modifying the current state description directly rather than allocating memory for a brand-new state. This reduces the memory requirements to just one state description and a path of $O(m)$ actions; a significant savings over $O(bm)$ states for depth-first search. With backtracking we also have the option of maintaining an efficient set data structure for the states on the current path, allowing us to check for a cyclic path in $O(1)$ time rather than $O(m)$. For backtracking to work, we must be able to undo each action when we backtrack. Backtracking is critical to the success of many problems with large state descriptions, such as robotic assembly.

Backtracking search

3.4.4 Depth-limited and iterative deepening search

To keep depth-first search from wandering down an infinite path, we can use depth-limited search, a version of depth-first search in which we supply a depth limit, $\ell$, and treat all nodes at depth $\ell$ as if they had no successors (see Figure 3.12). The time complexity is $O(b^\ell)$ and the space complexity is $O(b\ell)$. Unfortunately, if we make a poor choice for $\ell$ the algorithm will fail to reach the solution, making it incomplete again.

Figure 3.12

Iterative deepening and depth-limited tree-like search. Iterative deepening repeatedly applies depth-limited search with increasing limits. It returns one of three different types of values: either a solution node; or failure, when it has exhausted all nodes and proved there is no solution at any depth; or cutoff, to mean there might be a solution at a deeper depth than $\ell$. This is a tree-like search algorithm that does not keep track of reached states, and thus uses much less memory than best-first search, but runs the risk of visiting the same state multiple times on different paths. Also, if the IS-CYCLE check does not check all cycles, then the algorithm may get caught in a loop.

Depth-limited search

Since depth-first search is a tree-like search, we can’t keep it from wasting time on redundant paths in general, but we can eliminate cycles at the cost of some computation time. If we look only a few links up in the parent chain we can catch most cycles; longer cycles are handled by the depth limit.

Sometimes a good depth limit can be chosen based on knowledge of the problem. For example, on the map of Romania there are 20 cities. Therefore, $\ell = 19$ is a valid limit. But if we studied the map carefully, we would discover that any city can be reached from any other city in at most 9 actions. This number, known as the diameter of the state-space graph, gives us a better depth limit, which leads to a more efficient depth-limited search. However, for most problems we will not know a good depth limit until we have solved the problem.

Diameter

Iterative deepening search solves the problem of picking a good value for $\ell$ by trying all values: first 0, then 1, then 2, and so on, until either a solution is found, or the depth-limited search returns the failure value rather than the cutoff value. The algorithm is shown in Figure 3.12. Iterative deepening combines many of the benefits of depth-first and breadth-first search. Like depth-first search, its memory requirements are modest: $O(bd)$ when there is a solution, or $O(bm)$ on finite state spaces with no solution. Like breadth-first search, iterative deepening is optimal for problems where all actions have the same cost, and is complete on finite acyclic state spaces, or on any finite state space when we check nodes for cycles all the way up the path.

Iterative deepening search

The time complexity is $O(b^d)$ when there is a solution, or $O(b^m)$ when there is none. Each iteration of iterative deepening search generates a new level, in the same way that breadth-first search does, but breadth-first does this by storing all nodes in memory, while iterative deepening does it by repeating the previous levels, thereby saving memory at the cost of more time. Figure 3.13 shows four iterations of iterative deepening search on a binary search tree, where the solution is found on the fourth iteration.
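The following Python sketch shows one way depth-limited and iterative deepening search could be written, using the three result types described above (a solution, failure, or cutoff). It follows the prose description, in which nodes at the depth limit are treated as having no successors; it is not a transcription of Figure 3.12. The graph representation (a dictionary of successor lists) and the names `CUTOFF` and `FAILURE` are assumptions made for the example, and storing whole paths on the stack is a simplification that costs more memory than parent pointers would.

```python
from typing import Dict, List, Optional, Union

CUTOFF = "cutoff"   # sentinel: a solution might exist below the depth limit
FAILURE = None      # sentinel: there is no solution at any depth

Graph = Dict[str, List[str]]
Result = Union[List[str], str, None]


def depth_limited_search(graph: Graph, start: str, goal: str, limit: int) -> Result:
    """Return a path (list of states), CUTOFF, or FAILURE."""
    frontier = [[start]]                 # LIFO stack of paths
    result: Result = FAILURE
    while frontier:
        path = frontier.pop()
        state = path[-1]
        if state == goal:
            return path
        if len(path) - 1 >= limit:       # node at the depth limit: do not expand
            result = CUTOFF
        else:
            for child in graph.get(state, []):
                if child not in path:    # cycle check all the way up the path
                    frontier.append(path + [child])
    return result


def iterative_deepening_search(graph: Graph, start: str, goal: str) -> Optional[List[str]]:
    """Try limits 0, 1, 2, ... until the result is no longer CUTOFF."""
    depth = 0
    while True:
        result = depth_limited_search(graph, start, goal, depth)
        if result != CUTOFF:
            return result                # either a path or FAILURE
        depth += 1


# Hypothetical binary tree used only for illustration.
TREE = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"],
        "D": [], "E": [], "F": [], "G": []}
print(iterative_deepening_search(TREE, "A", "G"))
```

Because the cycle check walks the whole current path, the loop terminates with FAILURE on any finite graph that has no solution, as the text requires.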

Figure 3.13

Four iterations of iterative deepening search for goal $M$ on a binary tree, with the depth limit varying from 0 to 3. Note the interior nodes form a single path. The triangle marks the node to expand next; green nodes with dark outlines are on the frontier; the very faint nodes provably can’t be part of a solution with this depth limit.

Iterative deepening search may seem wasteful because states near the top of the search tree are re-generated multiple times. But for many state spaces, most of the nodes are in the bottom level, so it does not matter much that the upper levels are repeated. In an iterative deepening search, the nodes on the bottom level (depth $d$) are generated once, those on the next-to-bottom level are generated twice, and so on, up to the children of the root, which are generated $d$ times. So the total number of nodes generated in the worst case is

$$N(\mathrm{IDS}) = (d)b^1 + (d-1)b^2 + (d-2)b^3 + \cdots + b^d,$$

which gives a time complexity of $O(b^d)$, asymptotically the same as breadth-first search. For example, if $b = 10$ and $d = 5$, the numbers are

$$N(\mathrm{IDS}) = 50 + 400 + 3{,}000 + 20{,}000 + 100{,}000 = 123{,}450$$
$$N(\mathrm{BFS}) = 10 + 100 + 1{,}000 + 10{,}000 + 100{,}000 = 111{,}110.$$

If you are really concerned about the repetition, you can use a hybrid approach that runs breadth-first search until almost all the available memory is consumed, and then runs iterative deepening from all the nodes in the frontier. In general, iterative deepening is the preferred uninformed search method when the search state space is larger than can fit in memory and the depth of the solution is not known.

3.4.5 Bidirectional search

The algorithms we have covered so far start at an initial state and can reach any one of multiple possible goal states. An alternative approach called bidirectional search simultaneously searches forward from the initial state and backwards from the goal state(s), hoping that the two searches will meet. The motivation is that $b^{d/2} + b^{d/2}$ is much less than $b^d$ (e.g., 50,000 times less when $b = d = 10$).

Bidirectional search

For this to work, we need to keep track of two frontiers and two tables of reached states, and we need to be able to reason backwards: if state $s'$ is a successor of $s$ in the forward direction, then we need to know that $s$ is a successor of $s'$ in the backward direction. We have a solution when the two frontiers collide.⁹

9 In our implementation, the reached data structure supports a query asking whether a given state is a member, and the frontier data structure (a priority queue) does not, so we check for a collision using reached; but conceptually we are asking if the two frontiers have met up. The implementation can be extended to handle multiple goal states by loading the node for each goal state into the backwards frontier and backwards reached table.

There are many different versions of bidirectional search, just as there are many different unidirectional search algorithms. In this section, we describe bidirectional best-first search. Although there are two separate frontiers, the node to be expanded next is always one with a minimum value of the evaluation function, across either frontier. When the evaluation function is the path cost, we get bidirectional uniform-cost search, and if the cost of the optimal path is $C^*$, then no node with cost $> C^*/2$ will be expanded. This can result in a considerable speedup.

The general best-first bidirectional search algorithm is shown in Figure 3.14. We pass in two versions of the problem and the evaluation function, one in the forward direction (subscript $F$) and one in the backward direction (subscript $B$). When the evaluation function is the path cost, we know that the first solution found will be an optimal solution, but with different evaluation functions that is not necessarily true. Therefore, we keep track of the best solution found so far, and might have to update that several times before the TERMINATED test proves that there is no possible better solution remaining.

Figure 3.14

Bidirectional best-first search keeps two frontiers and two tables of reached states. When a path in one frontier reaches a state that was also reached in the other half of the search, the two paths are joined (by the function JOIN-NODES) to form a solution. The first solution we get is not guaranteed to be the best; the function TERMINATED determines when to stop looking for new solutions.
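To make the two-frontier bookkeeping concrete, here is a deliberately simplified Python sketch. It is a bidirectional breadth-first search, not the best-first algorithm of Figure 3.14: it assumes an undirected graph with unit action costs, so backward successors equal forward successors and no TERMINATED test is needed. It expands one whole level at a time from the smaller frontier and joins the two half-paths at the first state reached from both sides. All names (`undirected`, `expand_level`, `join_paths`) are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Iterable, List, Optional, Tuple

Graph = Dict[str, List[str]]
Parents = Dict[str, Optional[str]]


def undirected(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build an adjacency list in which every edge can be traversed both ways."""
    graph: Graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    return graph


def join_paths(state: str, parent_f: Parents, parent_b: Parents) -> List[str]:
    """Join the forward path start..state with the backward path state..goal."""
    path: List[str] = []
    s: Optional[str] = state
    while s is not None:                 # walk back to the start
        path.append(s)
        s = parent_f[s]
    path.reverse()
    s = parent_b[state]
    while s is not None:                 # walk on towards the goal
        path.append(s)
        s = parent_b[s]
    return path


def expand_level(graph: Graph, frontier: Deque[str],
                 parents: Parents, other: Parents) -> List[str]:
    """Expand one full level; return new states already reached by the other side."""
    meets = []
    for _ in range(len(frontier)):
        state = frontier.popleft()
        for child in graph.get(state, []):
            if child not in parents:
                parents[child] = state
                frontier.append(child)
                if child in other:       # the two searches have collided
                    meets.append(child)
    return meets


def bidirectional_bfs(graph: Graph, start: str, goal: str) -> Optional[List[str]]:
    if start == goal:
        return [start]
    parent_f: Parents = {start: None}    # forward reached table
    parent_b: Parents = {goal: None}     # backward reached table
    frontier_f, frontier_b = deque([start]), deque([goal])
    while frontier_f and frontier_b:
        if len(frontier_f) <= len(frontier_b):
            meets = expand_level(graph, frontier_f, parent_f, parent_b)
        else:
            meets = expand_level(graph, frontier_b, parent_b, parent_f)
        if meets:
            return join_paths(meets[0], parent_f, parent_b)
    return None


# Hypothetical graph: a short route A-B-C-D-E and a longer one through X, Y, Z, W.
G = undirected([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"),
                ("A", "X"), ("X", "Y"), ("Y", "Z"), ("Z", "W"), ("W", "E")])
print(bidirectional_bfs(G, "A", "E"))
```

With non-uniform action costs this shortcut is no longer safe, which is why the best-first version has to keep the best solution found so far and keep searching until TERMINATED succeeds.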

3.4.6 Comparing uninformed search algorithms

Figure 3.15 compares uninformed search algorithms in terms of the four evaluation criteria set forth in Section 3.3.4. This comparison is for tree-like search versions which don’t check for repeated states. For graph searches which do check, the main differences are that depth-first search is complete for finite state spaces, and the space and time complexities are bounded by the size of the state space (the number of vertices and edges, $|V| + |E|$).

Figure 3.15

Evaluation of search algorithms. $b$ is the branching factor; $m$ is the maximum depth of the search tree; $d$ is the depth of the shallowest solution, or is $m$ when there is no solution; $\ell$ is the depth limit. Superscript caveats are as follows: ¹ complete if $b$ is finite, and the state space either has a solution or is finite; ² complete if all action costs are $\ge \epsilon > 0$; ³ cost-optimal if action costs are all identical; ⁴ if both directions are breadth-first or uniform-cost.

3.5 Informed (Heuristic) Search Strategies

This section shows how an informed search strategy (one that uses domain-specific hints about the location of goals) can find solutions more efficiently than an uninformed strategy. The hints come in the form of a heuristic function, denoted $h(n)$:¹⁰

$h(n)$ = estimated cost of the cheapest path from the state at node $n$ to a goal state.

10 It may seem odd that the heuristic function operates on a node, when all it really needs is the node’s state. It is traditional to use $h(n)$ rather than $h(s)$ to be consistent with the evaluation function $f(n)$ and the path cost $g(n)$.

Informed search

Heuristic function

For example, in route-finding problems, we can estimate the distance from the current state to a goal by computing the straight-line distance on the map between the two points. We study heuristics and where they come from in more detail in Section 3.6.

3.5.1 Greedy best-first search

Greedy best-first search is a form of best-first search that expands first the node with the lowest $h(n)$ value (the node that appears to be closest to the goal) on the grounds that this is likely to lead to a solution quickly. So the evaluation function is $f(n) = h(n)$.

Greedy best-first search

Let us see how this works for route-finding problems in Romania; we use the straight-line distance heuristic, which we will call $h_{SLD}$. If the goal is Bucharest, we need to know the straight-line distances to Bucharest, which are shown in Figure 3.16. For example, $h_{SLD}(\mathit{Arad}) = 366$. Notice that the values of $h_{SLD}$ cannot be computed from the problem description itself (that is, the ACTIONS and RESULT functions). Moreover, it takes a certain amount of world knowledge to know that $h_{SLD}$ is correlated with actual road distances and is, therefore, a useful heuristic.

Figure 3.16

Values of $h_{SLD}$: straight-line distances to Bucharest.

Straight-line distance

Figure 3.17 shows the progress of a greedy best-first search using $h_{SLD}$ to find a path from Arad to Bucharest. The first node to be expanded from Arad will be Sibiu because the heuristic says it is closer to Bucharest than is either Zerind or Timisoara. The next node to be expanded will be Fagaras because it is now closest according to the heuristic. Fagaras in turn generates Bucharest, which is the goal. For this particular problem, greedy best-first search using $h_{SLD}$ finds a solution without ever expanding a node that is not on the solution path. The solution it found does not have optimal cost, however: the path via Sibiu and Fagaras to Bucharest is 32 miles longer than the path through Rimnicu Vilcea and Pitesti. This is why the algorithm is called “greedy”: on each iteration it tries to get as close to a goal as it can, but greediness can lead to worse results than being careful.
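The Arad-to-Bucharest example can be reproduced with a short Python sketch of greedy best-first graph search. The road distances and straight-line distances below are transcribed from Figures 3.1 and 3.16 for only the part of the map near the routes discussed (an assumption of this example: the rest of the map is omitted, so Timisoara appears as a dead end). The function names are hypothetical.

```python
import heapq
from typing import Iterator, List, Optional, Tuple

# Subset of the Romania road map (distances as given in Figure 3.1).
ROADS = {
    ("Arad", "Zerind"): 75, ("Arad", "Sibiu"): 140, ("Arad", "Timisoara"): 118,
    ("Zerind", "Oradea"): 71, ("Oradea", "Sibiu"): 151,
    ("Sibiu", "Fagaras"): 99, ("Sibiu", "Rimnicu Vilcea"): 80,
    ("Rimnicu Vilcea", "Pitesti"): 97, ("Rimnicu Vilcea", "Craiova"): 146,
    ("Craiova", "Pitesti"): 138, ("Fagaras", "Bucharest"): 211,
    ("Pitesti", "Bucharest"): 101,
}
# Straight-line distances to Bucharest (as given in Figure 3.16).
H_SLD = {"Arad": 366, "Bucharest": 0, "Craiova": 160, "Fagaras": 176,
         "Oradea": 380, "Pitesti": 100, "Rimnicu Vilcea": 193,
         "Sibiu": 253, "Timisoara": 329, "Zerind": 374}


def neighbors(city: str) -> Iterator[Tuple[str, int]]:
    """Yield (neighbor, road distance); roads can be driven in both directions."""
    for (a, b), dist in ROADS.items():
        if a == city:
            yield b, dist
        elif b == city:
            yield a, dist


def greedy_best_first(start: str, goal: str) -> Optional[Tuple[List[str], int]]:
    """Always expand the frontier node with the lowest h value: f(n) = h(n)."""
    frontier = [(H_SLD[start], start, [start], 0)]
    reached = {start}
    while frontier:
        _, city, path, cost = heapq.heappop(frontier)
        if city == goal:
            return path, cost
        for nxt, dist in neighbors(city):
            if nxt not in reached:
                reached.add(nxt)
                heapq.heappush(frontier, (H_SLD[nxt], nxt, path + [nxt], cost + dist))
    return None


print(greedy_best_first("Arad", "Bucharest"))
```

The path cost is carried along only for reporting; it plays no part in choosing which node to expand, which is exactly why the route found costs 450 rather than the optimal 418.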

Figure 3.17

Stages in a greedy best-first tree-like search for Bucharest with the straight-line distance heuristic $h_{SLD}$. Nodes are labeled with their $h$-values.

Greedy best-first graph search is complete in finite state spaces, but not in infinite ones. The worst-case time and space complexity is $O(|V|)$. With a good heuristic function, however, the complexity can be reduced substantially, on certain problems reaching $O(bm)$.

3.5.2 A* search

The most common informed search algorithm is A* search (pronounced “A-star search”), a best-first search that uses the evaluation function

$$f(n) = g(n) + h(n)$$

where $g(n)$ is the path cost from the initial state to node $n$, and $h(n)$ is the estimated cost of the shortest path from $n$ to a goal state, so we have

$f(n)$ = estimated cost of the best path that continues from $n$ to a goal.

A* search

In Figure 3.18, we show the progress of an A* search with the goal of reaching Bucharest. The values of $g$ are computed from the action costs in Figure 3.1, and the values of $h_{SLD}$ are given in Figure 3.16. Notice that Bucharest first appears on the frontier at step (e), but it is not selected for expansion (and thus not detected as a solution) because at $f = 450$ it is not the lowest-cost node on the frontier; that would be Pitesti, at $f = 417$. Another way to say this is that there might be a solution through Pitesti whose cost is as low as 417, so the algorithm will not settle for a solution that costs 450. At step (f), a different path to Bucharest is now the lowest-cost node, at $f = 418$, so it is selected and detected as the optimal solution.

Figure 3.18

Stages in an A* search for Bucharest. Nodes are labeled with $f = g + h$. The $h$ values are the straight-line distances to Bucharest taken from Figure 3.16.
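The stages described for Figure 3.18 can be checked with a small Python sketch of A* graph search. As an assumption of this example, it uses only the part of the Romania map near the routes discussed, with road distances from Figure 3.1 and straight-line distances from Figure 3.16; the function names and the `expanded` list are hypothetical additions for illustration.

```python
import heapq
from typing import Iterator, List, Optional, Tuple

# Subset of the Romania road map (distances as given in Figure 3.1).
ROADS = {
    ("Arad", "Zerind"): 75, ("Arad", "Sibiu"): 140, ("Arad", "Timisoara"): 118,
    ("Zerind", "Oradea"): 71, ("Oradea", "Sibiu"): 151,
    ("Sibiu", "Fagaras"): 99, ("Sibiu", "Rimnicu Vilcea"): 80,
    ("Rimnicu Vilcea", "Pitesti"): 97, ("Rimnicu Vilcea", "Craiova"): 146,
    ("Craiova", "Pitesti"): 138, ("Fagaras", "Bucharest"): 211,
    ("Pitesti", "Bucharest"): 101,
}
# Straight-line distances to Bucharest (as given in Figure 3.16).
H_SLD = {"Arad": 366, "Bucharest": 0, "Craiova": 160, "Fagaras": 176,
         "Oradea": 380, "Pitesti": 100, "Rimnicu Vilcea": 193,
         "Sibiu": 253, "Timisoara": 329, "Zerind": 374}


def neighbors(city: str) -> Iterator[Tuple[str, int]]:
    """Yield (neighbor, road distance); roads can be driven in both directions."""
    for (a, b), dist in ROADS.items():
        if a == city:
            yield b, dist
        elif b == city:
            yield a, dist


def a_star(start: str, goal: str) -> Optional[Tuple[List[str], int, List[str]]]:
    """Best-first search on f(n) = g(n) + h(n); returns (path, cost, expanded)."""
    frontier = [(H_SLD[start], 0, start, [start])]
    best_g = {start: 0}                   # cheapest known path cost per state
    expanded: List[str] = []
    while frontier:
        _, g, city, path = heapq.heappop(frontier)
        if city == goal:                  # goal test when the node is selected
            return path, g, expanded
        if g > best_g[city]:              # stale entry: a cheaper path was found
            continue
        expanded.append(city)
        for nxt, dist in neighbors(city):
            g2 = g + dist
            if nxt not in best_g or g2 < best_g[nxt]:
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + H_SLD[nxt], g2, nxt, path + [nxt]))
    return None


print(a_star("Arad", "Bucharest"))
```

Note that Bucharest is pushed onto the frontier twice, first with $f = 450$ via Fagaras and then with $f = 418$ via Pitesti, and only the second entry is ever selected.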

Admissible heuristic

A* search is complete.¹¹ Whether A* is cost-optimal depends on certain properties of the heuristic. A key property is admissibility: an admissible heuristic is one that never overestimates the cost to reach a goal. (An admissible heuristic is therefore optimistic.) With an admissible heuristic, A* is cost-optimal, which we can show with a proof by contradiction. Suppose the optimal path has cost $C^*$, but the algorithm returns a path with cost $C > C^*$. Then there must be some node $n$ which is on the optimal path and is unexpanded (because if all the nodes on the optimal path had been expanded, then we would have returned that optimal solution). So then, using the notation $g^*(n)$ to mean the cost of the optimal path from the start to $n$, and $h^*(n)$ to mean the cost of the optimal path from $n$ to the nearest goal, we have:

$$
\begin{aligned}
f(n) &> C^* && \text{(otherwise $n$ would have been expanded)} \\
f(n) &= g(n) + h(n) && \text{(by definition)} \\
f(n) &= g^*(n) + h(n) && \text{(because $n$ is on an optimal path)} \\
f(n) &\le g^*(n) + h^*(n) && \text{(because of admissibility, $h(n) \le h^*(n)$)} \\
f(n) &\le C^* && \text{(by definition, $C^* = g^*(n) + h^*(n)$)}
\end{aligned}
$$

11 Again, assuming all action costs are $> \epsilon > 0$, and the state space either has a solution or is finite.

The first and last lines form a contradiction, so the supposition that the algorithm could return a suboptimal path must be wrong: it must be that A* returns only cost-optimal paths.

A slightly stronger property is called consistency. A heuristic $h(n)$ is consistent if, for every node $n$ and every successor $n'$ of $n$ generated by an action $a$, we have:

$$h(n) \le c(n, a, n') + h(n').$$

Consistency

This is a form of the triangle inequality, which stipulates that a side of a triangle cannot be longer than the sum of the other two sides (see Figure 3.19). An example of a consistent heuristic is the straight-line distance $h_{SLD}$ that we used in getting to Bucharest.

Figure 3.19

Triangle inequality: If the heuristic $h$ is consistent, then the single number $h(n)$ will be no greater than the sum of the cost $c(n, a, n')$ of the action from $n$ to $n'$ plus the heuristic estimate $h(n')$.

Triangle inequality

Every consistent heuristic is admissible (but not vice versa), so with a consistent heuristic, A* is cost-optimal. In addition, with a consistent heuristic, the first time we reach a state it will be on an optimal path, so we never have to re-add a state to the frontier, and never have to change an entry in reached. But with an inconsistent heuristic, we may end up with multiple paths reaching the same state, and if each new path has a lower path cost than the previous one, then we will end up with multiple nodes for that state in the frontier, costing us both time and space. Because of that, some implementations of A* take care to only enter a state into the frontier once, and if a better path to the state is found, all the successors of the state are updated (which requires that nodes have child pointers as well as parent pointers). These complications have led many implementers to avoid inconsistent heuristics, but Felner et al. (2011) argue that the worst effects rarely happen in practice, and one shouldn’t be afraid of inconsistent heuristics.
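The difference between admissibility and consistency can be checked mechanically on a small explicit graph. The Python sketch below is an illustration under stated assumptions: the three-state graph `EDGES` and the heuristics `H_CONSISTENT` and `H_ADMISSIBLE_ONLY` are invented for the example, and both heuristics assign 0 to the goal. Consistency is tested edge by edge with the inequality $h(n) \le c(n, a, n') + h(n')$; admissibility is tested against the true costs $h^*(n)$, computed here with Dijkstra's algorithm on the reversed graph.

```python
import heapq
from typing import Dict

Edges = Dict[str, Dict[str, int]]      # state -> {successor: action cost}

# Hypothetical directed graph: S -> A (1), A -> G (3), S -> G (5).
EDGES: Edges = {"S": {"A": 1, "G": 5}, "A": {"G": 3}, "G": {}}

H_CONSISTENT = {"S": 4, "A": 3, "G": 0}
H_ADMISSIBLE_ONLY = {"S": 4, "A": 1, "G": 0}   # never overestimates, but 4 > 1 + 1


def is_consistent(h: Dict[str, int], edges: Edges) -> bool:
    """Check h(n) <= c(n, a, n') + h(n') on every edge."""
    return all(h[n] <= cost + h[n2]
               for n, succ in edges.items()
               for n2, cost in succ.items())


def true_costs(edges: Edges, goal: str) -> Dict[str, float]:
    """h*(n): cheapest cost from each state to the goal (Dijkstra, reversed edges)."""
    reverse: Edges = {n: {} for n in edges}
    for n, succ in edges.items():
        for n2, cost in succ.items():
            reverse[n2][n] = cost
    dist: Dict[str, float] = {goal: 0}
    heap = [(0, goal)]
    while heap:
        d, n = heapq.heappop(heap)
        if d > dist[n]:
            continue                          # stale queue entry
        for pred, cost in reverse[n].items():
            if d + cost < dist.get(pred, float("inf")):
                dist[pred] = d + cost
                heapq.heappush(heap, (d + cost, pred))
    return dist


def is_admissible(h: Dict[str, int], edges: Edges, goal: str) -> bool:
    """Check h(n) <= h*(n) for every state."""
    h_star = true_costs(edges, goal)
    return all(h[n] <= h_star.get(n, float("inf")) for n in edges)


print(is_consistent(H_CONSISTENT, EDGES), is_admissible(H_CONSISTENT, EDGES, "G"))
print(is_consistent(H_ADMISSIBLE_ONLY, EDGES), is_admissible(H_ADMISSIBLE_ONLY, EDGES, "G"))
```

The second heuristic shows the "not vice versa" case: it is admissible everywhere, yet its estimate drops by 3 across an action that costs only 1.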

With an inadmissible heuristic, A* may or may not be cost-optimal. Here are two cases where it is: First, if there is even one cost-optimal path on which $h(n)$ is admissible for all nodes $n$ on the path, then that path will be found, no matter what the heuristic says for states off the path. Second, if the optimal solution has cost $C^*$, and the second-best has cost $C_2$, and if $h(n)$ overestimates some costs, but never by more than $C_2 - C^*$, then A* is guaranteed to return cost-optimal solutions.

3.5.3 Search contours

A useful way to visualize a search is to draw contours in the state space, just like the contours in a topographic map. Figure 3.20 shows an example. Inside the contour labeled 400, all nodes have $f(n) = g(n) + h(n) \le 400$, and so on. Then, because A* expands the frontier node of lowest $f$-cost, we can see that an A* search fans out from the start node, adding nodes in concentric bands of increasing $f$-cost.

Figure 3.20

Map of Romania showing contours at $f = 380$, $f = 400$, and $f = 420$, with Arad as the start state. Nodes inside a given contour have $f = g + h$ costs less than or equal to the contour value.

Contour

With uniform-cost search, we also have contours, but of $g$-cost, not $g + h$. The contours with uniform-cost search will be “circular” around the start state, spreading out equally in all directions with no preference towards the goal. With A* search using a good heuristic, the $g + h$ bands will stretch toward a goal state (as in Figure 3.20) and become more narrowly focused around an optimal path.

It should be clear that as you extend a path, the $g$ costs are monotonic:¹² the path cost always increases as you go along a path, because action costs are always positive. Therefore you get concentric contour lines that don’t cross each other, and if you choose to draw the lines fine enough, you can put a line between any two nodes on any path.

12 Technically, we say “strictly monotonic” for costs that always increase, and “monotonic” for costs that never decrease, but might remain the same.

Monotonic

But it is not obvious whether the $f = g + h$ cost will monotonically increase. As you extend a path from $n$ to $n'$, the cost goes from $g(n) + h(n)$ to $g(n) + c(n, a, n') + h(n')$. Canceling out the $g(n)$ term, we see that the path’s cost will be monotonically increasing if and only if $h(n) \le c(n, a, n') + h(n')$;¹³ in other words if and only if the heuristic is consistent. But note that a path might contribute several nodes in a row with the same $g(n) + h(n)$ score; this will happen whenever the decrease in $h$ is exactly equal to the action cost just taken (for example, in a grid problem, when $n$ is in the same row as the goal and you take a step towards the goal, $g$ is increased by 1 and $h$ is decreased by 1). If $C^*$ is the cost of the optimal solution path, then we can say the following:

13 In fact, the term “monotonic heuristic” is a synonym for “consistent heuristic.” The two ideas were developed independently, and then it was proved that they are equivalent (Pearl, 1984).

• A* expands all nodes that can be reached from the initial state on a path where every node on the path has $f(n) < C^*$. We say these are surely expanded nodes.

Surely expanded nodes

• A* might then expand some of the nodes right on the “goal contour” (where $f(n) = C^*$) before selecting a goal node.

• A* expands no nodes with $f(n) > C^*$.

We say that A* with a consistent heuristic is optimally efficient in the sense that any algorithm that extends search paths from the initial state, and uses the same heuristic information, must expand all nodes that are surely expanded by A* (because any one of them could have been part of an optimal solution). Among the nodes with $f(n) = C^*$, one algorithm could get lucky and choose the optimal one first while another algorithm is unlucky; we don’t consider this difference in defining optimal efficiency.

Optimally efficient

A* is efficient because it prunes away search tree nodes that are not necessary for finding an optimal solution. In Figure 3.18(b) we see that Timisoara has $f = 447$ and Zerind has $f = 449$. Even though they are children of the root and would be among the first nodes expanded by uniform-cost or breadth-first search, they are never expanded by A* search because the solution with $f = 418$ is found first. The concept of pruning (eliminating possibilities from consideration without having to examine them) is important for many areas of AI.

Pruning

That A* search is complete, cost-optimal, and optimally efficient among all such algorithms is rather satisfying. Unfortunately, it does not mean that A* is the answer to all our searching needs. The catch is that for many problems, the number of nodes expanded can be exponential in the length of the solution. For example, consider a version of the vacuum world with a super-powerful vacuum that can clean up any one square at a cost of 1 unit, without even having to visit the square; in that scenario, squares can be cleaned in any order. With $N$ initially dirty squares, there are $2^N$ states where some subset has been cleaned; all of those states are on an optimal solution path, and hence satisfy $f(n) < C^*$, so all of them would be visited by A*.

3.5.4 Satisficing search: Inadmissible heuristics and weighted A*

Inadmissible heuristic

A* search has many good qualities, but it expands a lot of nodes. We can explore fewer nodes (taking less time and space) if we are willing to accept solutions that are suboptimal, but are “good enough”: what we call satisficing solutions. If we allow A* search to use an inadmissible heuristic (one that may overestimate) then we risk missing the optimal solution, but the heuristic can potentially be more accurate, thereby reducing the number of nodes expanded. For example, road engineers know the concept of a detour index, which is a multiplier applied to the straight-line distance to account for the typical curvature of roads. A detour index of 1.3 means that if two cities are 10 miles apart in straight-line distance, a good estimate of the best path between them is 13 miles. For most localities, the detour index ranges between 1.2 and 1.6.

Detour index

We can apply this idea to any problem, not just ones involving roads, with an approach called weighted A* search where we weight the heuristic value more heavily, giving us the evaluation function $f(n) = g(n) + W \times h(n)$, for some $W > 1$.

Weighted A* search

Figure 3.21 shows a search problem on a grid world. In (a), an A* search finds the optimal solution, but has to explore a large portion of the state space to find it. In (b), a weighted A* search finds a solution that is slightly costlier, but the search is much faster. We see that the weighted search focuses the contour of reached states towards a goal. That means that fewer states are explored, but if the optimal path ever strays outside of the weighted search’s contour (as it does in this case), then the optimal path will not be found. In general, if the optimal solution costs $C^*$, a weighted A* search will find a solution that costs somewhere between $C^*$ and $W \times C^*$; but in practice we usually get results much closer to $C^*$ than $W \times C^*$.

Figure 3.21

Two searches on the same grid: (a) an A* search and (b) a weighted A* search with weight $W = 2$. The gray bars are obstacles, the purple line is the path from the green start to red goal, and the small dots are states that were reached by each search. On this particular problem, weighted A* explores 7 times fewer states and finds a path that is 5% more costly.
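A small Python sketch can make the role of the weight concrete. It runs best-first search with $f(n) = g(n) + W \times h(n)$ on a tiny grid, with unit step costs and the Manhattan distance as the heuristic. The grid below is a hypothetical example, not the one in Figure 3.21, and the function names are invented for illustration. The function returns the solution cost together with the number of expansions, so different weights can be compared.

```python
import heapq
from typing import Optional, Tuple

# Hypothetical grid: 'S' start, 'G' goal, '#' obstacle; moves cost 1 each.
GRID = [
    "S....",
    ".###.",
    "...#.",
    ".#.#G",
]


def locate(ch: str) -> Tuple[int, int]:
    for r, row in enumerate(GRID):
        if ch in row:
            return r, row.index(ch)
    raise ValueError(ch)


def weighted_a_star(w: float) -> Optional[Tuple[int, int]]:
    """Search with f = g + w * h; return (solution cost, nodes expanded)."""
    start, goal = locate("S"), locate("G")

    def h(cell: Tuple[int, int]) -> int:          # Manhattan distance to the goal
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(w * h(start), 0, start)]
    best_g = {start: 0}
    expanded = 0
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g, expanded
        if g > best_g[cell]:                      # stale entry, skip it
            continue
        expanded += 1
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = cell[0] + dr, cell[1] + dc
            inside = 0 <= r < len(GRID) and 0 <= c < len(GRID[0])
            if inside and GRID[r][c] != "#" and g + 1 < best_g.get((r, c), float("inf")):
                best_g[(r, c)] = g + 1
                heapq.heappush(frontier, (g + 1 + w * h((r, c)), g + 1, (r, c)))
    return None


for weight in (0, 1, 2):
    print(weight, weighted_a_star(weight))
```

Setting `w` to 0 reduces the search to uniform-cost search and `w` equal to 1 gives ordinary A*; for larger weights the returned cost is only guaranteed to lie between $C^*$ and $W \times C^*$.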

We have considered searches that evaluate states by combining $g$ and $h$ in various ways; weighted A* can be seen as a generalization of the others:

A* search: $g(n) + h(n)$ ($W = 1$)
Uniform-cost search: $g(n)$ ($W = 0$)
Greedy best-first search: $h(n)$ ($W = \infty$)
Weighted A* search: $g(n) + W \times h(n)$ ($1 < W < \infty$)

You could call weighted A* “somewhat-greedy search”: like greedy best-first search, it focuses the search towards a goal; on the other hand, it won’t ignore the path cost completely, and will suspend a path that is making little progress at great cost.

There are a variety of suboptimal search algorithms, which can be characterized by the criteria for what counts as “good enough.” In bounded suboptimal search, we look for a solution that is guaranteed to be within a constant factor $W$ of the optimal cost. Weighted A* provides this guarantee. In bounded-cost search, we look for a solution whose cost is less than some constant $C$. And in unbounded-cost search, we accept a solution of any cost, as long as we can find it quickly.

Bounded suboptimal search

Bounded-cost search

Unbounded-cost search

An example of an unbounded-cost search algorithm is speedy search, which is a version of greedy best-first search that uses as a heuristic the estimated number of actions required to reach a goal, regardless of the cost of those actions. Thus, for problems where all actions have the same cost it is the same as greedy best-first search, but when actions have different costs, it tends to lead the search to find a solution quickly, even if it might have a high cost.

Speedy search

3.5.5 Memory-bounded search

The main issue with A* is its use of memory. In this section we’ll cover some

implementation tricks that save space, and then some entirely new algorithms that take

better advantage of the available space.

Memory is split between the frontier and the reached states. In our implementation of best-

first search, a state that is on the frontier is stored in two places: as a node in the frontier (so

we can decide what to expand next) and as an entry in the table of reached states (so we

know if we have visited the state before). For many problems (such as exploring a grid), this

duplication is not a concern, because the size of frontier is much smaller than reached, so

duplicating the states in the frontier requires a comparatively trivial amount of memory. But

some implementations keep a state in only one of the two places, saving a bit of space at the

cost of complicating (and perhaps slowing down) the algorithm.

Another possibility is to remove states from reached when we can prove that they are no

longer needed. For some problems, we can use the separation property (Figure 3.6 on

page 72), along with the prohibition of U-turn actions, to ensure that all actions either move

outwards from the frontier or onto another frontier state. In that case, we need only check

the frontier for redundant paths, and we can eliminate the reached table.

For other problems, we can keep reference counts of the number of times a state has been

reached, and remove it from the reached table when there are no more ways to reach the

state. For example, on a grid world where each state can be reached only from its four

neighbors, once we have reached a state four times, we can remove it from the table.


Reference count

Now let’s consider new algorithms that are designed to conserve memory usage.

Beam search limits the size of the frontier. The easiest approach is to keep only the $k$ nodes with the best $f$-scores, discarding any other expanded nodes. This of course makes the search incomplete and suboptimal, but we can choose $k$ to make good use of available memory, and the algorithm executes fast because it expands fewer nodes. For many problems it can find good near-optimal solutions. You can think of uniform-cost or A* search as spreading out everywhere in concentric contours, and think of beam search as exploring only a focused portion of those contours, the portion that contains the $k$ best candidates.

Beam search

An alternative version of beam search doesn’t keep a strict limit on the size of the frontier but instead keeps every node whose $f$-score is within $\delta$ of the best $f$-score. That way, when there are a few strong-scoring nodes only a few will be kept, but if there are no strong nodes then more will be kept until a strong one emerges.
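The fixed-width version of beam search can be sketched in a few lines of Python. This is one possible reading of the description above, not a reference implementation: after every expansion the frontier is truncated to the `k` entries with the lowest $f = g + h$. The toy graph and heuristic are hypothetical and were chosen so that a beam of width 1 discards the node that leads to the optimal solution.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# Hypothetical weighted graph and heuristic, for illustration only.
GRAPH: Dict[str, Dict[str, int]] = {
    "S": {"A": 1, "B": 1},
    "A": {"C": 1},
    "B": {"G": 5},
    "C": {"G": 1},
    "G": {},
}
H = {"S": 2, "A": 2, "B": 1, "C": 1, "G": 0}


def beam_search(start: str, goal: str, k: int) -> Optional[Tuple[List[str], int]]:
    """Best-first search on f = g + h whose frontier never holds more than k nodes."""
    frontier = [(H[start], 0, start, [start])]
    while frontier:
        _, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return path, g
        for child, cost in GRAPH[state].items():
            if child not in path:                      # avoid cycles on this path
                g2 = g + cost
                heapq.heappush(frontier, (g2 + H[child], g2, child, path + [child]))
        # Keep only the k best entries; a sorted list is also a valid heap.
        frontier = heapq.nsmallest(k, frontier)
    return None                                        # beam exhausted: incomplete


print(beam_search("S", "G", k=1))
print(beam_search("S", "G", k=2))
```

With `k=1` the search commits to B and returns the costly route through it; with `k=2` the alternative through A survives long enough to be found.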

Iterative-deepening A* search (IDA*) is to A* what iterative-deepening search is to depth-

first: IDA* gives us the benefits of A* without the requirement to keep all reached states in

memory, at a cost of visiting some states multiple times. It is a very important and

commonly used algorithm for problems that do not fit in memory.

Iterative-deepening A* search


In standard iterative deepening the cutoff is the depth, which is increased by one each iteration. In IDA* the cutoff is the $f$-cost ($g + h$); at each iteration, the cutoff value is the smallest $f$-cost of any node that exceeded the cutoff on the previous iteration. In other words, each iteration exhaustively searches an $f$-contour, finds a node just beyond that contour, and uses that node’s $f$-cost as the next contour. For problems like the 8-puzzle where each path’s $f$-cost is an integer, this works very well, resulting in steady progress towards the goal each iteration. If the optimal solution has cost $C^*$, then there can be no more than $C^*$ iterations (for example, no more than 31 iterations on the hardest 8-puzzle problems). But for a problem where every node has a different $f$-cost, each new contour might contain only one new node, and the number of iterations could be equal to the number of states.
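The cutoff rule described above can be written down compactly. The following Python sketch of IDA* is an illustration under assumptions: the toy graph, the heuristic values, and the function names are hypothetical, and the search is a recursive depth-first search that checks for cycles only along the current path. Each call returns either a solution or the smallest $f$-cost that exceeded the current cutoff, which becomes the cutoff for the next iteration.

```python
import math
from typing import Dict, List, Optional, Tuple

# Hypothetical weighted graph and heuristic, for illustration only.
GRAPH: Dict[str, Dict[str, int]] = {
    "S": {"A": 1, "B": 1},
    "A": {"C": 1},
    "B": {"G": 5},
    "C": {"G": 1},
    "G": {},
}
H = {"S": 2, "A": 2, "B": 1, "C": 1, "G": 0}


def ida_star(start: str, goal: str) -> Tuple[Optional[List[str]], float, List[float]]:
    """Return (path, cost, list of cutoffs tried); path is None if no solution."""

    def search(path: List[str], g: float, bound: float):
        state = path[-1]
        f = g + H[state]
        if f > bound:
            return f, None                 # report the f-cost beyond the contour
        if state == goal:
            return g, list(path)
        smallest = math.inf                # smallest f-cost seen beyond the cutoff
        for child, cost in GRAPH[state].items():
            if child in path:              # cycle check along the current path
                continue
            path.append(child)
            value, found = search(path, g + cost, bound)
            path.pop()
            if found is not None:
                return value, found
            smallest = min(smallest, value)
        return smallest, None

    bound: float = H[start]
    cutoffs: List[float] = []
    while True:
        cutoffs.append(bound)
        value, found = search([start], 0, bound)
        if found is not None:
            return found, value, cutoffs
        if value == math.inf:              # nothing lies beyond the contour
            return None, math.inf, cutoffs
        bound = value                      # next contour


print(ida_star("S", "G"))
```

Only the current path is stored, so memory use is linear in the depth, while states near the root are regenerated on every iteration.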

Recursive best-first search (RBFS) (Figure 3.22) attempts to mimic the operation of standard best-first search, but using only linear space. RBFS resembles a recursive depth-first search, but rather than continuing indefinitely down the current path, it uses the $f\_limit$ variable to keep track of the $f$-value of the best alternative path available from any ancestor of the current node. If the current node exceeds this limit, the recursion unwinds back to the alternative path. As the recursion unwinds, RBFS replaces the $f$-value of each node along the path with a backed-up value: the best $f$-value of its children. In this way, RBFS remembers the $f$-value of the best leaf in the forgotten subtree and can therefore decide whether it’s worth reexpanding the subtree at some later time. Figure 3.23 shows how RBFS reaches Bucharest.

Figure 3.22

The algorithm for recursive best-first search.

Figure 3.23

Stages in an RBFS search for the shortest route to Bucharest. The $f$-limit value for each recursive call is

shown on top of each current node, and every node is labeled with its $f$-cost. (a) The path via Rimnicu

Vilcea is followed until the current best leaf (Pitesti) has a value that is worse than the best alternative

path (Fagaras). (b) The recursion unwinds and the best leaf value of the forgotten subtree (417) is backed

up to Rimnicu Vilcea; then Fagaras is expanded, revealing a best leaf value of 450. (c) The recursion

unwinds and the best leaf value of the forgotten subtree (450) is backed up to Fagaras; then Rimnicu

Vilcea is expanded. This time, because the best alternative path (through Timisoara) costs at least 447,

the expansion continues to Bucharest.


RBFS is somewhat more efficient than IDA*, but still suffers from excessive node

regeneration. In the example in Figure 3.23, RBFS follows the path via Rimnicu Vilcea,

then “changes its mind” and tries Fagaras, and then changes its mind back again. These

mind changes occur because every time the current best path is extended, its $f$-value is

likely to increase: $h$ is usually less optimistic for nodes closer to a goal. When this happens,

the second-best path might become the best path, so the search has to backtrack to follow it.

Each mind change corresponds to an iteration of IDA* and could require many

reexpansions of forgotten nodes to recreate the best path and extend it one more node.

RBFS is optimal if the heuristic function $h(n)$ is admissible. Its space complexity is linear in

the depth of the deepest optimal solution, but its time complexity is rather difficult to

characterize:






it depends both on the accuracy of the heuristic function and on how often the

best path changes as nodes are expanded. It expands nodes in order of increasing $f$-score,

even if $f$ is nonmonotonic.

IDA* and RBFS suffer from using too little memory. Between iterations, IDA* retains only a

single number: the current $f$-cost limit. RBFS retains more information in memory, but it

uses only linear space: even if more memory were available, RBFS has no way to make use

of it. Because they forget most of what they have done, both algorithms may end up

reexploring the same states many times over.

It seems sensible, therefore, to determine how much memory we have available, and allow

an algorithm to use all of it. Two algorithms that do this are MA* (memory-bounded A*)

and SMA* (simplified MA*). SMA* is—well—simpler, so we will describe it. SMA*

proceeds just like A*, expanding the best leaf until memory is full. At this point, it cannot

add a new node to the search tree without dropping an old one. SMA* always drops the

worst leaf node, the one with the highest $f$-value. Like RBFS, SMA* then backs up the value

of the forgotten node to its parent. In this way, the ancestor of a forgotten subtree knows the

quality of the best path in that subtree. With this information, SMA* regenerates the subtree

only when all other paths have been shown to look worse than the path it has forgotten.

Another way of saying this is that if all the descendants of a node $n$ are forgotten, then we

will not know which way to go from $n$, but we will still have an idea of how worthwhile it is

to go anywhere from $n$.


The complete algorithm is described in the online code repository accompanying this book.

There is one subtlety worth mentioning. We said that SMA* expands the best leaf and

deletes the worst leaf. What if all the leaf nodes have the same $f$-value? To avoid selecting

the same node for deletion and expansion, SMA* expands the newest best leaf and deletes

the oldest worst leaf. These coincide when there is only one leaf, but in that case, the current

search tree must be a single path from root to leaf that fills all of memory. If the leaf is not a

goal node, then even if it is on an optimal solution path, that solution is not reachable with the

available memory. Therefore, the node can be discarded exactly as if it had no successors.
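The tie-breaking rule can be made concrete with a small Python sketch. It covers only the selection of which leaf to expand and which to drop, not SMA* as a whole; the `Leaf` record, its `created` counter, and the two helper functions are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    name: str
    f: float
    created: int  # hypothetical insertion counter: a larger value means a newer leaf


def leaf_to_expand(leaves):
    # Best leaf = lowest f; among ties, prefer the newest leaf.
    return min(leaves, key=lambda leaf: (leaf.f, -leaf.created))


def leaf_to_drop(leaves):
    # Worst leaf = highest f; among ties, prefer the oldest leaf.
    return max(leaves, key=lambda leaf: (leaf.f, -leaf.created))


# All leaves share the same f-value, the case discussed in the text.
leaves = [Leaf("x", 10, 1), Leaf("y", 10, 2), Leaf("z", 10, 3)]
print(leaf_to_expand(leaves).name, leaf_to_drop(leaves).name)

# With a single leaf, both choices necessarily coincide.
only = [Leaf("x", 10, 1)]
print(leaf_to_expand(only) is leaf_to_drop(only))
```

With equal $f$-values the two helpers pick different leaves whenever more than one leaf exists, which is the property the rule is meant to secure.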

SMA* is complete if there is any reachable solution, that is, if $d$, the depth of the shallowest

goal node, is less than the memory size (expressed in nodes). It is optimal if any optimal

solution is reachable; otherwise, it returns the best reachable solution. In practical terms,

SMA* is a fairly robust choice for finding optimal solutions, particularly when the state

space is a graph, action costs are not uniform, and node generation is expensive compared

to the overhead of maintaining the frontier and the reached set.

On very hard problems, however, it will often be the case that SMA* is forced to switch

back and forth continually among many candidate solution paths, only a small subset of

which can fit in memory. (This resembles the problem of thrashing in disk paging systems.)

Then the extra time required for repeated regeneration of the same nodes means that

problems that would be practically solvable by A*, given unlimited memory, become

intractable for SMA*. That is to say, memory limitations can make a problem intractable from

the point of view of computation time. Although no current theory explains the tradeoff

between time and memory, it seems that this is an inescapable problem. The only way out is

to drop the optimality requirement.


3.5.6 Bidirectional heuristic search

With unidirectional best-first search, we saw that using $f(n) = g(n) + h(n)$ as the evaluation

function gives us an A* search that is guaranteed to find optimal-cost solutions (assuming

an admissible $h$) while being optimally efficient in the number of nodes expanded.

With bidirectional best-first search we could also try using $f(n) = g(n) + h(n)$, but

unfortunately there is no guarantee that this would lead to an optimal-cost solution, nor that

it would be optimally efficient, even with an admissible heuristic. With bidirectional search,

it turns out that it is not individual nodes but rather pairs of nodes (one from each frontier)

that can be proved to be surely expanded, so any proof of efficiency will have to consider

pairs of nodes (Eckerle et al., 2017).

We’ll start with some new notation. We use $f_F(n) = g_F(n) + h_F(n)$ for nodes going in the

forward direction (with the initial state as root) and $f_B(n) = g_B(n) + h_B(n)$ for nodes in the

backward direction (with a goal state as root). Although both forward and backward

searches are solving the same problem, they have different evaluation functions because, for

example, the heuristics are different depending on whether you are striving for the goal or

for the initial state. We’ll assume admissible heuristics.

Consider a forward path from the initial state to a node $m$ and a backward path from the

goal to a node $n$. We can define a lower bound on the cost of a solution that follows the

path from the initial state to $m$, then somehow gets to $n$, then follows the path to the goal as

$$lb(m,n) = \max(g_F(m) + g_B(n),\ f_F(m),\ f_B(n))$$

In other words, the cost of such a path must be at least as large as the sum of the path costs

of the two parts (because the remaining connection between them must have nonnegative

cost), and the cost must also be at least as much as the estimated $f$ cost of either part

(because the heuristic estimates are optimistic). Given that, the theorem is that for any pair

of nodes $m, n$ with $lb(m,n)$ less than the optimal cost $C^*$, we must expand either $m$ or $n$,

because the path that goes through both of them is a potential optimal solution. The

difficulty is that we don’t know for sure which node is best to expand, and therefore no

bidirectional search algorithm can be guaranteed to be optimally efficient: any algorithm

might expand up to twice the minimum number of nodes if it always chooses the wrong

member of a pair to expand first. Some bidirectional heuristic search algorithms explicitly

manage a queue of $(m,n)$ pairs, but we will stick with bidirectional best-first search (Figure

3.14), which has two frontier priority queues, and give it an evaluation function that

mimics the $lb$ criteria:

$$f_2(n) = \max(2g(n),\ g(n) + h(n))$$

The node to expand next will be the one that minimizes this $f_2$ value; the node can come

from either frontier. This $f_2$ function guarantees that we will never expand a node (from

either frontier) with $g(n) > \frac{C^*}{2}$. We say the two halves of the search “meet in the middle” in

the sense that when the two frontiers touch, no node inside of either frontier has a path cost

greater than the bound $\frac{C^*}{2}$. Figure 3.24 works through an example bidirectional search.
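The two evaluation functions are easy to compare numerically. The following Python sketch is an illustration only: the helper names `lb` and `f2` mirror the formulas in the text, while the frontier below is hypothetical. Only the optimal cost of 10 and the value $g = 4$ for nodes A and F correspond to Figure 3.24; every other number is made up.

```python
def lb(g_f_m, h_f_m, g_b_n, h_b_n):
    """Lower bound on a solution through forward node m and backward node n."""
    return max(g_f_m + g_b_n, g_f_m + h_f_m, g_b_n + h_b_n)


def f2(g, h):
    """Single-node evaluation function that mimics the lb criterion."""
    return max(2 * g, g + h)


C_STAR = 10  # optimal solution cost in the Figure 3.24 example

# Hypothetical frontier nodes as (name, g, h). Apart from g = 4 for A and F,
# these numbers are invented for illustration and are not the figure's labels.
frontier = [("A", 4, 5), ("B", 6, 2), ("C", 7, 1),
            ("D", 6, 3), ("E", 6, 3), ("F", 4, 5)]

by_f = sorted(frontier, key=lambda node: node[1] + node[2])
by_f2 = sorted(frontier, key=lambda node: f2(node[1], node[2]))

print([name for name, _, _ in by_f])    # ordering under f = g + h
print([name for name, _, _ in by_f2])   # ordering under f2
print(lb(4, 5, 4, 5))                   # bound for the pair (A, F)
```

Because $f_2(n) \ge 2g(n)$, any node with $g(n) > C^*/2$ has $f_2(n) > C^*$. In this made-up frontier, ordering by $f$ would put B and C first, whereas ordering by $f_2$ puts A and F first.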

Figure 3.24

Bidirectional search maintains two frontiers: on the left, nodes A and B are successors of Start; on the

right, node F is an inverse successor of Goal. Each node is labeled with $f = g + h$ values and the

$f_2 = \max(2g, g + h)$ value. (The $g$ values are the sum of the action costs as shown on each arrow; the $h$

values are arbitrary and cannot be derived from anything in the figure.) The optimal solution, Start-A-F-

Goal, has cost $C^* = 4 + 2 + 4 = 10$, so that means that a meet-in-the-middle bidirectional algorithm

should not expand any node with $g > \frac{C^*}{2} = 5$; and indeed the next node to be expanded would be A or F

(each with $g = 4$), leading us to an optimal solution. If we expanded the node with lowest $f$ cost first,

then B and C would come next, and D and E would be tied with A, but they all have $g > \frac{C^*}{2}$ and thus are

never expanded when $f_2$ is the evaluation function.








25.6 The 3D World

25.7 Using Computer Vision

Summary

Bibliographical and Historical Notes

26 Robotics

26.1 Robots

26.2 Robot Hardware

26.3 What kind of problem is robotics solving?

26.4 Robotic Perception

26.5 Planning and Control

26.6 Planning Uncertain Movements

26.7 Reinforcement Learning in Robotics

26.8 Humans and Robots

26.9 Alternative Robotic Frameworks

26.10 Application Domains

Summary

Bibliographical and Historical Notes

VII Conclusions

27 Philosophy, Ethics, and Safety of AI

27.1 The Limits of AI

27.2 Can Machines Really Think?

27.3 The Ethics of AI

Summary

Bibliographical and Historical Notes

28 The Future of AI

28.1 AI Components

28.2 AI Architectures

A Mathematical Background

A.1 Complexity Analysis and O() Notation

A.2 Vectors, Matrices, and Linear Algebra

A.3 Probability Distributions

Bibliographical and Historical Notes

B Notes on Languages and Algorithms

B.1 Defining Languages with Backus–Naur Form (BNF)

B.2 Describing Algorithms with Pseudocode

B.3 Online Supplemental Material

Bibliography

Index

I Artificial Intelligence

Chapter 1

Introduction

In which we try to explain why we consider artificial intelligence to be a subject most worthy of

study, and in which we try to decide what exactly it is, this being a good thing to decide before

embarking.

We call ourselves Homo sapiens (man the wise) because our intelligence is so important to

us. For thousands of years, we have tried to understand how we think and act—that is, how

our brain, a mere handful of matter, can perceive, understand, predict, and manipulate a

world far larger and more complicated than itself. The field of artificial intelligence, or AI, is

concerned with not just understanding but also building intelligent entities—machines that

can compute how to act effectively and safely in a wide variety of novel situations.

Surveys regularly rank AI as one of the most interesting and fastest-growing fields, and it is

already generating over a trillion dollars a year in revenue. AI expert Kai-Fu Lee predicts

that its impact will be “more than anything in the history of mankind.” Moreover, the

intellectual frontiers of AI are wide open. Whereas a student of an older science such as

physics might feel that the best ideas have already been discovered by Galileo, Newton,

Curie, Einstein, and the rest, AI still has many openings for full-time masterminds.

AI currently encompasses a huge variety of subfields, ranging from the general (learning,

reasoning, perception, and so on) to the specific, such as playing chess, proving

mathematical theorems, writing poetry, driving a car, or diagnosing diseases. AI is relevant

to any intellectual task; it is truly a universal field.

1.1 What Is AI?

We have claimed that AI is interesting, but we have not said what it is. Historically,

researchers have pursued several different versions of AI. Some have defined intelligence in

terms of fidelity to human performance, while others prefer an abstract, formal definition of

intelligence called rationality—loosely speaking, doing the “right thing.” The subject matter

itself also varies: some consider intelligence to be a property of internal thought processes and

reasoning, while others focus on intelligent behavior, an external characterization.

1 In the public eye, there is sometimes confusion between the terms “artificial intelligence” and “machine learning.” Machine learning

is a subfield of AI that studies the ability to improve performance based on experience. Some AI systems use machine learning

methods to achieve competence, but some do not.

From these two dimensions—human vs. rational and thought vs. behavior—there are four

possible combinations, and there have been adherents and research programs for all four.

The methods used are necessarily different: the pursuit of human-like intelligence must be

in part an empirical science related to psychology, involving observations and hypotheses

about actual human behavior and thought processes; a rationalist approach, on the other

hand, involves a combination of mathematics and engineering, and connects to statistics,

control theory, and economics. The various groups have both disparaged and helped each

other. Let us look at the four approaches in more detail.

2 We are not suggesting that humans are “irrational” in the dictionary sense of “deprived of normal mental clarity.” We are merely

conceding that human decisions are not always mathematically perfect.

1.1.1 Acting humanly: The Turing test approach

The Turing test, proposed by Alan Turing (1950), was designed as a thought experiment

that would sidestep the philosophical vagueness of the question “Can a machine think?” A

computer passes the test if a human interrogator, after posing some written questions,

cannot tell whether the written responses come from a person or from a computer. Chapter

27 discusses the details of the test and whether a computer would really be intelligent if it

passed. For now, we note that programming a computer to pass a rigorously applied test

provides plenty to work on. The computer would need the following capabilities:

natural language processing to communicate successfully in a human language;

knowledge representation to store what it knows or hears;

automated reasoning to answer questions and to draw new conclusions;

machine learning to adapt to new circumstances and to detect and extrapolate patterns.

Turing viewed the physical simulation of a person as unnecessary to demonstrate

intelligence. However, other researchers have proposed a total Turing test, which requires

interaction with objects and people in the real world. To pass the total Turing test, a robot

will need

computer vision and speech recognition to perceive the world;

robotics to manipulate objects and move about.

These six disciplines compose most of AI. Yet AI researchers have devoted little effort to

passing the Turing test, believing that it is more important to study the underlying principles

of intelligence. The quest for “artificial flight” succeeded when engineers and inventors

stopped imitating birds and started using wind tunnels and learning about aerodynamics.

Aeronautical engineering texts do not define the goal of their field as making “machines that

fly so exactly like pigeons that they can fool even other pigeons.”

1.1.2 Thinking humanly: The cognitive modeling approach

To say that a program thinks like a human, we must know how humans think. We can learn

about human thought in three ways:

introspection—trying to catch our own thoughts as they go by;

psychological experiments—observing a person in action;

brain imaging—observing the brain in action.

Once we have a sufficiently precise theory of the mind, it becomes possible to express the

theory as a computer program. If the program’s input–output behavior matches

corresponding human behavior, that is evidence that some of the program’s mechanisms

could also be operating in humans.

For example, Allen Newell and Herbert Simon, who developed GPS, the “General Problem

Solver” (Newell and Simon 1961), were not content merely to have their program solve

problems correctly. They were more concerned with comparing the sequence and timing of

its reasoning steps to those of human subjects solving the same problems. The

interdisciplinary field of cognitive science brings together computer models from AI and

experimental techniques from psychology to construct precise and testable theories of the

human mind.

Cognitive science is a fascinating field in itself, worthy of several textbooks and at least one

encyclopedia (Wilson and Keil 1999). We will occasionally comment on similarities or

differences between AI techniques and human cognition. Real cognitive science, however, is

necessarily based on experimental investigation of actual humans or animals. We will leave

that for other books, as we assume the reader has only a computer for experimentation.

In the early days of AI there was often confusion between the approaches. An author would

argue that an algorithm performs well on a task and that it is therefore a good model of

human performance, or vice versa. Modern authors separate the two kinds of claims; this

distinction has allowed both AI and cognitive science to develop more rapidly. The two

fields fertilize each other, most notably in computer vision, which incorporates

neurophysiological evidence into computational models. Recently, the combination of

neuroimaging methods with machine learning techniques for analyzing such data

has led to the beginnings of a capability to “read minds”—that is, to ascertain the semantic

content of a person’s inner thoughts. This capability could, in turn, shed further light on

how human cognition works.

1.1.3 Thinking rationally: The “laws of thought” approach

The Greek philosopher Aristotle was one of the first to attempt to codify “right thinking”—

that is, irrefutable reasoning processes. His syllogisms provided patterns for argument

structures that always yielded correct conclusions when given correct premises. The

canonical example starts with Socrates is a man and all men are mortal and concludes that

Socrates is mortal. (This example is probably due to Sextus Empiricus rather than Aristotle.)

These laws of thought were supposed to govern the operation of the mind; their study

initiated the field called logic.
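
The syllogism pattern described above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not a description of how any real logic system is built: the `facts` and `rules` data structures and the `forward_chain` function are hypothetical names introduced here, and the only rule form supported is "every X is a Y."

```python
# Toy illustration of a syllogism (hypothetical representation, not from the text).
# A fact is a (category, individual) pair; a rule (X, Y) means "every X is a Y".
facts = {("man", "Socrates")}      # Socrates is a man
rules = [("man", "mortal")]        # all men are mortal


def forward_chain(facts, rules):
    """Apply every rule repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            # Iterate over a snapshot, since we add to the set while looping.
            for category, individual in list(derived):
                new_fact = (conclusion, individual)
                if category == premise and new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


conclusions = forward_chain(facts, rules)
print(("mortal", "Socrates") in conclusions)  # True
```

Given correct premises, the loop can only add conclusions licensed by the rules, which is the sense in which the argument pattern yields correct conclusions.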

Logicians in the 19th century developed a precise notation for statements about objects in

the world and the relations among them. (Contrast this with ordinary arithmetic notation,

which provides only for statements about numbers.) By 1965, programs could, in principle,

solve any solvable problem described in logical notation. The so-called logicist tradition

within artificial intelligence hopes to build on such programs to create intelligent systems.

Logic as conventionally understood requires knowledge of the world that is certain—a

condition that, in reality, is seldom achieved. We simply don’t know the rules of, say,

politics or warfare in the same way that we know the rules of chess or arithmetic. The

theory of probability fills this gap, allowing rigorous reasoning with uncertain information.

In principle, it allows the construction of a comprehensive model of rational thought,

leading from raw perceptual information to an understanding of how the world works to

predictions about the future. What it does not do is generate intelligent behavior. For that,

we need a theory of rational action. Rational thought, by
