S. Russell and E. Wefald (1991). Do the Right Thing: Studies in Limited Rationality (Chapter 1: Limited Rationality), MIT Press
@Book{Russell91,
  author    = "Stuart Russell and Eric Wefald",
  title     = "Do the Right Thing: Studies in Limited Rationality",
  publisher = "MIT Press",
  year      = "1991",
  series    = "Artificial Intelligence",
  address   = "Cambridge, MA",
}
Bounded rational agent design as a simultaneous optimization: minimizing
deliberation time while maximizing decision quality and the speed of
adaptation to the environment.
View of AI as the optimization problem of agent design over all
possible behaviors enabled by the architecture (i.e., constrained by
limited computational resources).
Learning: the need for non-representational feedback from the
environment.
Need for a variety of representation schemes in an agent architecture
(from declarative, decision-theoretic models of belief and utility to
production rules).
Motivation for metareasoning architectures: deliberation control
enables efficient thinking, acting and learning.
Computations as actions to be decided upon based on the utility of
their outcome, namely their influence on the choice of external actions as
well as the time they require.
Summary
This chapter defines bounded optimal agents as agents that act optimally
given their limited computational resources. The authors propose to
redefine the whole AI discipline as the design of such agents,
as opposed to the traditional logicist and decision-theoretic approaches
that do not take resource limitations into account.
"The keystone of [the former] approach is the ability of reasoning
systems to reason about their own deliberations." Using an array of
representational schemes (from declarative to production-like) as well as
knowledge compilation methods, a metareasoning architecture should aim at
achieving bounded rationality as "an equilibrium point of the forces that
tend to maximize the quality of decisions, minimize the time to make those
decisions, and maximize the speed of adaptation to the environment."
Detailed outline
Agents
An agent is a system that senses its environment and acts upon it.
An agent can be described by a mapping from percept sequences to actions.
It is important to distinguish:
the abstract mapping,
its implementation (the program), and
the behavior of the running program.
The program implements the mapping. When run on an architecture (a fixed
interpreter for a class of programs), it will create the agent's behavior.
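A rough sketch (mine, not the book's) of the three-way distinction in code:
the mapping is a specification from percept sequences to actions, the
program is one concrete implementation of it, and the behavior is the
action trace produced when an architecture (here, a plain interpreter
loop) runs the program.

    # Sketch (not from the book): mapping vs. program vs. behavior.
    from typing import Callable, Iterable, List

    # The abstract mapping: percept sequences -> actions (a specification).
    Mapping = Callable[[List[str]], str]

    def thermostat_program(percepts: List[str]) -> str:
        # One concrete program implementing (part of) a mapping.
        table = {"hot": "cool_down", "cold": "heat_up"}
        return table.get(percepts[-1], "wait")

    def run(program: Mapping, percept_stream: Iterable[str]) -> List[str]:
        # The architecture: a fixed interpreter that runs the program;
        # the resulting action trace is the agent's behavior.
        history: List[str] = []
        behavior: List[str] = []
        for percept in percept_stream:
            history.append(percept)
            behavior.append(program(history))
        return behavior

    print(run(thermostat_program, ["cold", "hot", "mild"]))
    # -> ['heat_up', 'cool_down', 'wait']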
AI as designing agents that "do the right thing"
What does it mean to do the right thing? For logicists, it means deriving
from the agent's beliefs an action sequence that will provably achieve
its goals. For decision-theorists, it means finding an action sequence
that will maximize expected payoff or utility.
The main problem with these prescriptive approaches is that they judge
"rightness" only on the product of reasoning (namely the action sequence)
as opposed to evaluating the whole "deliberating plus acting" process.
In other words, they focus on the abstract mapping, not on the
behavior. But a program that implements the optimal mapping may be completely
useless in practice if it takes forever to run.
An alternative approach is to judge "rightness" based on the behavior of
the agent, not the abstract mapping. This implies taking into account the
running of the program on a given architecture, and therefore leads to
the consideration of the available, limited resources. In a way, this
approach favors "acting rationally" over "thinking rationally."
Different types of rationality
Perfect or "Type I" rationality
This is the decision-theoretic principle that an agent should always act so
as to maximize its expected utility. Not realistic for complex environments.
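In symbols (my notation; the chapter states this in words), Type I
rationality prescribes:

    % Perfect rationality: pick the action of maximal expected utility,
    % ignoring the cost of computing the argmax itself.
    a^* = \operatorname*{argmax}_{a \in A}
          \mathbb{E}\left[ U(\mathrm{Result}(a)) \right]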
Good's "Type II" rationality
Same as before while taking into account deliberation costs. Requires
perfect management of deliberations and still seems unrealistic.
Bounded optimality
This means acting optimally given the agent's limited computational
resources. AI = designing bounded rational agents, i.e., maximizing the
agent's utility over all the behaviors enabled by the architecture. Such
an agent must be able to trade off action quality against urgency, in
particular through metareasoning.
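As an optimization problem (notation adapted from Russell's later
formalization of bounded optimality; the chapter itself stays informal),
the design task is to choose the best program among those the
architecture can actually run:

    % Bounded optimality: optimize over the programs l runnable on the
    % machine M, judging each by the expected utility V of the behavior
    % it produces in the class of environments E.
    l_{\mathrm{opt}} = \operatorname*{argmax}_{l \in \mathcal{L}_M}
                       V(\mathrm{Agent}(l, M), \mathbf{E}, U)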
Constraints on the design of bounded optimal agents
The need to learn: an autonomous agent should be able to live in any
environment, given enough time to adapt.
The source of learning: perception of the state of the environment, of
the results of actions, and of the environment's feedback.
Efficiency of learning and deliberation: seems to require compiled
knowledge, whereas learnability seems to benefit from declarative knowledge.
In conclusion, the architecture should allow for different types of knowledge
and the agent's program should contain compilation mechanisms from
declarative to production-like knowledge structures. (Cf. chapter 2)
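A toy sketch (mine; chapter 2 develops the real mechanisms) of what
compiling declarative knowledge into a production-like structure could
look like: the declarative form is inspectable data that a learner can
modify, while the compiled form is a fast condition-action function.

    # Sketch (not from the book): declarative rule -> compiled production.
    # Declarative form: inspectable data, easy to learn and modify.
    declarative_rule = {"if": ("temperature", ">", 30), "then": "cool_down"}

    def compile_rule(rule):
        # Compilation: turn the inspectable structure into a fast,
        # opaque condition-action function (a "production").
        key, op, threshold = rule["if"]
        action = rule["then"]
        def production(percepts):
            if op == ">" and percepts.get(key, 0) > threshold:
                return action
            return None  # rule does not fire
        return production

    fire = compile_rule(declarative_rule)
    print(fire({"temperature": 35}))  # -> 'cool_down'
    print(fire({"temperature": 20}))  # -> None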
Metareasoning
Computations are actions with utilities or expected values. These depend
on two effects: the time the computation consumes, and the difference
between the external actions it leads to and those that were favored
before the deliberation.
Metareasoning is the problem of deciding whether or not to deliberate
(at the object level), of deciding what to think about and
of allocating resources for deliberation.
The authors argue that metareasoning is domain-independent. It does not
need to know what the object-level decisions are about; it need only
compute their expected utility. The key to metareasoning is to take
advantage of structural regularity in the domain, namely that not all
computations will have the same expected utility. This non-uniformity
must be learnable/predictable.
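A minimal sketch of the resulting meta-level control rule (my code; the
book's later chapters derive the actual formulas): a computation step is
worth performing only if its expected improvement in decision quality
exceeds the cost of the time it consumes.

    # Sketch (not from the book): net expected value of a computation.
    # A computation is worthwhile only if the expected gain over the
    # current default action exceeds the cost of the time it takes.

    def net_value(expected_utility_after: float,
                  utility_of_default: float,
                  time_cost: float) -> float:
        return (expected_utility_after - utility_of_default) - time_cost

    def deliberate(estimates, default_utility, time_cost_per_step):
        # Meta-level loop: keep computing while the next computation has
        # positive net expected value; otherwise stop and act.
        best = default_utility
        for estimated_utility in estimates:
            if net_value(estimated_utility, best, time_cost_per_step) > 0:
                best = estimated_utility  # computation revised the choice
            else:
                break  # further deliberation costs more than it gains
        return best

    print(deliberate([5.0, 5.5, 5.6], default_utility=4.0,
                     time_cost_per_step=0.3))  # -> 5.5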