11 PLANNING

In which we see how an agent can take advantage of the structure of a problem to construct complex plans of action.

The task of coming up with a sequence of actions that will achieve a goal is called planning. We have seen two examples of planning agents so far: the search-based problem-solving agent of Chapter 3 and the logical planning agent of Chapter 10. This chapter is concerned primarily with scaling up to complex planning problems that defeat the approaches we have seen so far.

Section 11.1 develops an expressive yet carefully constrained language for representing planning problems, including actions and states. The language is closely related to the propositional and first-order representations of actions in Chapters 7 and 10. Section 11.2 shows how forward and backward search algorithms can take advantage of this representation, primarily through accurate heuristics that can be derived automatically from the structure of the representation. (This is analogous to the way in which effective heuristics were constructed for constraint satisfaction problems in Chapter 5.) Sections 11.3 through 11.5 describe planning algorithms that go beyond forward and backward search, taking advantage of the representation of the problem. In particular, we explore approaches that are not constrained to consider only totally ordered sequences of actions.

For this chapter, we consider only environments that are fully observable, deterministic, finite, static (change happens only when the agent acts), and discrete (in time, action, objects, and effects). These are called classical planning environments. In contrast, nonclassical planning is for partially observable or stochastic environments and involves a different set of algorithms and agent designs, outlined in Chapters 12 and 17.
11.1 THE PLANNING PROBLEM

Let us consider what can happen when an ordinary problem-solving agent using standard search algorithms—depth-first, A*, and so on—comes up against large, real-world problems. That will help us design better planning agents.
The most obvious difficulty is that the problem-solving agent can be overwhelmed by irrelevant actions. Consider the task of buying a copy of AI: A Modern Approach from an online bookseller. Suppose there is one buying action for each 10-digit ISBN number, for a total of 10 billion actions. The search algorithm would have to examine the outcome states of all 10 billion actions to find one that satisfies the goal, which is to own a copy of ISBN 0137903952. A sensible planning agent, on the other hand, should be able to work back from an explicit goal description such as Have(ISBN 0137903952) and generate the action Buy(ISBN 0137903952) directly. To do this, the agent simply needs the general knowledge that Buy(x) results in Have(x). Given this knowledge and the goal, the planner can decide in a single unification step that Buy(ISBN 0137903952) is the right action.

The next difficulty is finding a good heuristic function. Suppose the agent's goal is to buy four different books online. Then there will be 10^40 plans of just four steps, so searching without an accurate heuristic is out of the question. It is obvious to a human that a good heuristic estimate for the cost of a state is the number of books that remain to be bought; unfortunately, this insight is not obvious to a problem-solving agent, because it sees the goal test only as a black box that returns true or false for each state. Therefore, the problem-solving agent lacks autonomy; it requires a human to supply a heuristic function for each new problem. On the other hand, if a planning agent has access to an explicit representation of the goal as a conjunction of subgoals, then it can use a single domain-independent heuristic: the number of unsatisfied conjuncts. For the book-buying problem, the goal would be Have(A) ∧ Have(B) ∧ Have(C) ∧ Have(D), and a state containing Have(A) ∧ Have(C) would have cost 2.
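As a concrete illustration, here is a minimal sketch of this conjunct-counting heuristic. States and goals are encoded as sets of ground-atom strings; that encoding is our own choice for illustration, not anything prescribed by the text.

```python
# Domain-independent "number of unsatisfied conjuncts" heuristic.
# Under the closed-world assumption, any atom absent from a state is false.

def num_unsatisfied(state, goal):
    """Count the goal conjuncts that are not yet true in the state."""
    return len(goal - state)

goal = {"Have(A)", "Have(B)", "Have(C)", "Have(D)"}
state = {"Have(A)", "Have(C)"}

cost_estimate = num_unsatisfied(state, goal)  # 2, as in the book-buying example
```

Because the goal is exposed as an explicit conjunction rather than a black-box test, this one function serves as a heuristic for the book-buying problem and many others without any per-problem human input.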
Thus, the agent automatically gets the right heuristic for this problem, and for many others. We shall see later in the chapter how to construct more sophisticated heuristics that examine the available actions as well as the structure of the goal.

Finally, the problem solver might be inefficient because it cannot take advantage of problem decomposition. Consider the problem of delivering a set of overnight packages to their respective destinations, which are scattered across Australia. It makes sense to find out the nearest airport for each destination and divide the overall problem into several subproblems, one for each airport. Within the set of packages routed through a given airport, whether further decomposition is possible depends on the destination city. We saw in Chapter 5 that the ability to do this kind of decomposition contributes to the efficiency of constraint satisfaction problem solvers. The same holds true for planners: in the worst case, it can take O(n!) time to find the best plan to deliver n packages, but only O((n/k)! × k) time if the problem can be decomposed into k equal parts.

As we noted in Chapter 5, perfectly decomposable problems are delicious but rare.1 The design of many planning systems—particularly the partial-order planners described in Section 11.3—is based on the assumption that most real-world problems are nearly decomposable. That is, the planner can work on subgoals independently, but might need to do some additional work to combine the resulting subplans. For some problems, this assumption breaks down because working on one subgoal is likely to undo another subgoal. These interactions among subgoals are what makes puzzles (like the 8-puzzle) puzzling.

1 Notice that even the delivery of a package is not perfectly decomposable. There may be cases in which it is better to assign packages to a more distant airport if that renders a flight to the nearest airport unnecessary. Nevertheless, most delivery companies prefer the computational and organizational simplicity of sticking with decomposed solutions.

The language of planning problems

The preceding discussion suggests that the representation of planning problems—states, actions, and goals—should make it possible for planning algorithms to take advantage of the logical structure of the problem. The key is to find a language that is expressive enough to describe a wide variety of problems, but restrictive enough to allow efficient algorithms to operate over it. In this section, we first outline the basic representation language of classical planners, known as the STRIPS language.2 Later, we point out some of the many possible variations in STRIPS-like languages.

Representation of states. Planners decompose the world into logical conditions and represent a state as a conjunction of positive literals. We will consider propositional literals; for example, Poor ∧ Unknown might represent the state of a hapless agent. We will also use first-order literals; for example, At(Plane1, Melbourne) ∧ At(Plane2, Sydney) might represent a state in the package delivery problem. Literals in first-order state descriptions must be ground and function-free. Literals such as At(x, y) or At(Father(Fred), Sydney) are not allowed. The closed-world assumption is used, meaning that any conditions that are not mentioned in a state are assumed false.

Representation of goals. A goal is a partially specified state, represented as a conjunction of positive ground literals, such as Rich ∧ Famous or At(P2, Tahiti). A propositional state s satisfies a goal g if s contains all the atoms in g (and possibly others). For example, the state Rich ∧ Famous ∧ Miserable satisfies the goal Rich ∧ Famous.

Representation of actions.
An action is specified in terms of the preconditions that must hold before it can be executed and the effects that ensue when it is executed. For example, an action for flying a plane from one location to another is:

Action(Fly(p, from, to),
  PRECOND: At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to)
  EFFECT: ¬At(p, from) ∧ At(p, to))

This is more properly called an action schema, meaning that it represents a number of different actions that can be derived by instantiating the variables p, from, and to to different constants. In general, an action schema consists of three parts:

• The action name and parameter list—for example, Fly(p, from, to)—serves to identify the action.
• The precondition is a conjunction of function-free positive literals stating what must be true in a state before the action can be executed. Any variables in the precondition must also appear in the action's parameter list.
• The effect is a conjunction of function-free literals describing how the state changes when the action is executed. A positive literal P in the effect is asserted to be true in the state resulting from the action, whereas a negative literal ¬P is asserted to be false. Variables in the effect must also appear in the action's parameter list.

To improve readability, some planning systems divide the effect into the add list for positive literals and the delete list for negative literals.

Having defined the syntax for representations of planning problems, we can now define the semantics. The most straightforward way to do this is to describe how actions affect states. (An alternative method is to specify a direct translation into successor-state axioms, whose semantics comes from first-order logic; see Exercise 11.3.) First, we say that an action is applicable in any state that satisfies the precondition; otherwise, the action has no effect. For a first-order action schema, establishing applicability will involve a substitution θ for the variables in the precondition. For example, suppose the current state is described by

At(P1, JFK) ∧ At(P2, SFO) ∧ Plane(P1) ∧ Plane(P2) ∧ Airport(JFK) ∧ Airport(SFO) .

This state satisfies the precondition At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to) with substitution {p/P1, from/JFK, to/SFO} (among others—see Exercise 11.2). Thus, the concrete action Fly(P1, JFK, SFO) is applicable.

Starting in state s, the result of executing an applicable action a is a state s′ that is the same as s except that any positive literal P in the effect of a is added to s′ and any negative literal ¬P is removed from s′. Thus, after Fly(P1, JFK, SFO), the current state becomes

At(P1, SFO) ∧ At(P2, SFO) ∧ Plane(P1) ∧ Plane(P2) ∧ Airport(JFK) ∧ Airport(SFO) .

Note that if a positive effect is already in s it is not added twice, and if a negative effect is not in s, then that part of the effect is ignored. This definition embodies the so-called STRIPS assumption: that every literal not mentioned in the effect remains unchanged.

2 STRIPS stands for STanford Research Institute Problem Solver.
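The applicability test and the result computation just defined can be sketched directly in code. Here literals are strings and a ground action is a (precondition, add, delete) triple; this set-based encoding, and the helper names `fly`, `applicable`, and `result`, are our own illustrative choices, not the book's notation.

```python
# A sketch of STRIPS semantics: schema instantiation, applicability,
# and the result of executing an action.

def fly(p, frm, to):
    """Ground instance of the Fly(p, from, to) schema."""
    precond = {f"At({p},{frm})", f"Plane({p})",
               f"Airport({frm})", f"Airport({to})"}
    add = {f"At({p},{to})"}
    delete = {f"At({p},{frm})"}
    return precond, add, delete

def applicable(state, action):
    # An action is applicable in any state that satisfies its precondition.
    precond, _, _ = action
    return precond <= state

def result(state, action):
    # s' = (s - delete) + add; every other literal is left unchanged,
    # which is exactly the STRIPS assumption.
    _, add, delete = action
    return (state - delete) | add

s = {"At(P1,JFK)", "At(P2,SFO)", "Plane(P1)", "Plane(P2)",
     "Airport(JFK)", "Airport(SFO)"}
a = fly("P1", "JFK", "SFO")
s2 = result(s, a) if applicable(s, a) else s
```

Running this reproduces the worked example: `s2` contains At(P1, SFO) in place of At(P1, JFK), with all other literals untouched.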
In this way, STRIPS avoids the representational frame problem described in Chapter 10.

Finally, we can define the solution for a planning problem. In its simplest form, this is just an action sequence that, when executed in the initial state, results in a state that satisfies the goal. Later in the chapter, we will allow solutions to be partially ordered sets of actions, provided that every action sequence that respects the partial order is a solution.

Expressiveness and extensions

The various restrictions imposed by the STRIPS representation were chosen in the hope of making planning algorithms simpler and more efficient, without making it too difficult to describe real problems. One of the most important restrictions is that literals be function-free. With this restriction, we can be sure that any action schema for a given problem can be propositionalized—that is, turned into a finite collection of purely propositional action representations with no variables. (See Chapter 9 for more on this topic.) For example, in the air cargo domain for a problem with 10 planes and five airports, we could translate the Fly(p, from, to) schema into 10 × 5 × 5 = 250 purely propositional actions. The planners
in Sections 11.4 and 11.5 work directly with propositionalized descriptions. If we allow function symbols, then infinitely many states and actions can be constructed.

In recent years, it has become clear that STRIPS is insufficiently expressive for some real domains. As a result, many language variants have been developed. Figure 11.1 briefly describes one important one, the Action Description Language or ADL, by comparing it with the basic STRIPS language.

• STRIPS: Only positive literals in states (Poor ∧ Unknown). ADL: Positive and negative literals in states (¬Rich ∧ ¬Famous).
• STRIPS: Closed World Assumption—unmentioned literals are false. ADL: Open World Assumption—unmentioned literals are unknown.
• STRIPS: Effect P ∧ ¬Q means add P and delete Q. ADL: Effect P ∧ ¬Q means add P and ¬Q and delete ¬P and Q.
• STRIPS: Only ground literals in goals (Rich ∧ Famous). ADL: Quantified variables in goals (∃x At(P1, x) ∧ At(P2, x) is the goal of having P1 and P2 in the same place).
• STRIPS: Goals are conjunctions (Rich ∧ Famous). ADL: Goals allow conjunction and disjunction (¬Poor ∧ (Famous ∨ Smart)).
• STRIPS: Effects are conjunctions. ADL: Conditional effects allowed—when P: E means E is an effect only if P is satisfied.
• STRIPS: No support for equality. ADL: Equality predicate (x = y) is built in.
• STRIPS: No support for types. ADL: Variables can have types, as in (p : Plane).

Figure 11.1 Comparison of STRIPS and ADL languages for representing planning problems. In both cases, goals behave as the preconditions of an action with no parameters.

In ADL, the Fly action could be written as

Action(Fly(p : Plane, from : Airport, to : Airport),
  PRECOND: At(p, from) ∧ (from ≠ to)
  EFFECT: ¬At(p, from) ∧ At(p, to)) .

The notation p : Plane in the parameter list is an abbreviation for Plane(p) in the precondition; this adds no expressive power, but can be easier to read. (It also cuts down on the number of possible propositional actions that can be constructed.)
The precondition (from ≠ to) expresses the fact that a flight cannot be made from an airport to itself. This could not be expressed succinctly in STRIPS.

The various planning formalisms used in AI have been systematized within a standard syntax called the Planning Domain Definition Language, or PDDL. This language allows researchers to exchange benchmark problems and compare results. PDDL includes sublanguages for STRIPS, ADL, and the hierarchical task networks we will see in Chapter 12.
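The propositionalization claim above (250 ground Fly actions for 10 planes and five airports) is easy to check mechanically. In this sketch only the counts come from the text; the plane and airport names are invented for illustration.

```python
# Enumerating the ground instances of the function-free Fly(p, from, to)
# schema. Note that plain STRIPS, lacking inequality, also generates the
# degenerate instances with from == to.
from itertools import product

planes = [f"P{i}" for i in range(1, 11)]    # 10 planes (names invented)
airports = [f"A{i}" for i in range(1, 6)]   # 5 airports (names invented)

ground_fly = [("Fly", p, frm, to)
              for p, frm, to in product(planes, airports, airports)]

count = len(ground_fly)  # 10 * 5 * 5 = 250
```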
Init(At(C1, SFO) ∧ At(C2, JFK) ∧ At(P1, SFO) ∧ At(P2, JFK)
  ∧ Cargo(C1) ∧ Cargo(C2) ∧ Plane(P1) ∧ Plane(P2)
  ∧ Airport(JFK) ∧ Airport(SFO))
Goal(At(C1, JFK) ∧ At(C2, SFO))
Action(Load(c, p, a),
  PRECOND: At(c, a) ∧ At(p, a) ∧ Cargo(c) ∧ Plane(p) ∧ Airport(a)
  EFFECT: ¬At(c, a) ∧ In(c, p))
Action(Unload(c, p, a),
  PRECOND: In(c, p) ∧ At(p, a) ∧ Cargo(c) ∧ Plane(p) ∧ Airport(a)
  EFFECT: At(c, a) ∧ ¬In(c, p))
Action(Fly(p, from, to),
  PRECOND: At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to)
  EFFECT: ¬At(p, from) ∧ At(p, to))

Figure 11.2 A STRIPS problem involving transportation of air cargo between airports.

The STRIPS and ADL notations are adequate for many real domains. The subsections that follow show some simple examples. There are still some significant restrictions, however. The most obvious is that they cannot represent in a natural way the ramifications of actions. For example, if there are people, packages, or dust motes in the airplane, then they too change location when the plane flies. We can represent these changes as the direct effects of flying, whereas it seems more natural to represent the location of the plane's contents as a logical consequence of the location of the plane. We will see more examples of such state constraints in Section 11.5. Classical planning systems do not even attempt to address the qualification problem: the problem of unrepresented circumstances that could cause an action to fail. We will see how to address qualifications in Chapter 12.

Example: Air cargo transport

Figure 11.2 shows an air cargo transport problem involving loading and unloading cargo onto and off of planes and flying it from place to place. The problem can be defined with three actions: Load, Unload, and Fly. The actions affect two predicates: In(c, p) means that cargo c is inside plane p, and At(x, a) means that object x (either plane or cargo) is at airport a.
Note that cargo is not At anywhere when it is In a plane, so At really means "available for use at a given location." It takes some experience with action definitions to handle such details consistently. The following plan is a solution to the problem:

[Load(C1, P1, SFO), Fly(P1, SFO, JFK), Unload(C1, P1, JFK),
 Load(C2, P2, JFK), Fly(P2, JFK, SFO), Unload(C2, P2, SFO)] .

Our representation is pure STRIPS. In particular, it allows a plane to fly to and from the same airport. Inequality literals in ADL could prevent this.
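One way to convince ourselves that a plan works is to simulate it. The encoding below, with ground actions as precondition/add/delete triples of strings, is our own sketch of the Figure 11.2 schemas; note that since Load deletes At(c, a), the Unload steps are what finally make each cargo At its destination.

```python
# Simulate a solution plan for the air cargo problem of Figure 11.2.

def step(state, action):
    precond, add, delete = action
    assert precond <= state, "action not applicable in this state"
    return (state - delete) | add

def load(c, p, a):
    return ({f"At({c},{a})", f"At({p},{a})",
             f"Cargo({c})", f"Plane({p})", f"Airport({a})"},
            {f"In({c},{p})"},
            {f"At({c},{a})"})

def unload(c, p, a):
    return ({f"In({c},{p})", f"At({p},{a})",
             f"Cargo({c})", f"Plane({p})", f"Airport({a})"},
            {f"At({c},{a})"},
            {f"In({c},{p})"})

def fly(p, frm, to):
    return ({f"At({p},{frm})", f"Plane({p})",
             f"Airport({frm})", f"Airport({to})"},
            {f"At({p},{to})"},
            {f"At({p},{frm})"})

init = {"At(C1,SFO)", "At(C2,JFK)", "At(P1,SFO)", "At(P2,JFK)",
        "Cargo(C1)", "Cargo(C2)", "Plane(P1)", "Plane(P2)",
        "Airport(JFK)", "Airport(SFO)"}

plan = [load("C1", "P1", "SFO"), fly("P1", "SFO", "JFK"),
        unload("C1", "P1", "JFK"),
        load("C2", "P2", "JFK"), fly("P2", "JFK", "SFO"),
        unload("C2", "P2", "SFO")]

state = init
for action in plan:
    state = step(state, action)
# state now satisfies the goal At(C1, JFK) ∧ At(C2, SFO)
```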
Example: The spare tire problem

Consider the problem of changing a flat tire. More precisely, the goal is to have a good spare tire properly mounted onto the car's axle, where the initial state has a flat tire on the axle and a good spare tire in the trunk. To keep it simple, our version of the problem is a very abstract one, with no sticky lug nuts or other complications. There are just four actions: removing the spare from the trunk, removing the flat tire from the axle, putting the spare on the axle, and leaving the car unattended overnight. We assume that the car is in a particularly bad neighborhood, so that the effect of leaving it overnight is that the tires disappear.

The ADL description of the problem is shown in Figure 11.3. Notice that it is purely propositional. It goes beyond STRIPS in that it uses a negated precondition, ¬At(Flat, Axle), for the PutOn(Spare, Axle) action. This could be avoided by using Clear(Axle) instead, as we will see in the next example.

Init(At(Flat, Axle) ∧ At(Spare, Trunk))
Goal(At(Spare, Axle))
Action(Remove(Spare, Trunk),
  PRECOND: At(Spare, Trunk)
  EFFECT: ¬At(Spare, Trunk) ∧ At(Spare, Ground))
Action(Remove(Flat, Axle),
  PRECOND: At(Flat, Axle)
  EFFECT: ¬At(Flat, Axle) ∧ At(Flat, Ground))
Action(PutOn(Spare, Axle),
  PRECOND: At(Spare, Ground) ∧ ¬At(Flat, Axle)
  EFFECT: ¬At(Spare, Ground) ∧ At(Spare, Axle))
Action(LeaveOvernight,
  PRECOND:
  EFFECT: ¬At(Spare, Ground) ∧ ¬At(Spare, Axle) ∧ ¬At(Spare, Trunk)
    ∧ ¬At(Flat, Ground) ∧ ¬At(Flat, Axle))

Figure 11.3 The simple spare tire problem.

Example: The blocks world

One of the most famous planning domains is known as the blocks world. This domain consists of a set of cube-shaped blocks sitting on a table.3 The blocks can be stacked, but only one block can fit directly on top of another. A robot arm can pick up a block and move it to another position, either on the table or on top of another block.
The arm can pick up only one block at a time, so it cannot pick up a block that has another one on it. The goal will always be to build one or more stacks of blocks, specified in terms of what blocks are on top of what other blocks. For example, a goal might be to get block A on B and block C on D.

³ The blocks world used in planning research is much simpler than SHRDLU's version, shown on page 20.
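To make the semantics of Figure 11.3 concrete, here is one possible encoding as a small forward search in Python. This is only a sketch: the tuple-based literal representation, the action list, and the `plan` function are our own illustrative choices, not part of the ADL formalism.

```python
from collections import deque

# A state is a frozenset of the positive ground literals that hold in it;
# literals not present are false (the closed-world convention of STRIPS).
init = frozenset({("At", "Flat", "Axle"), ("At", "Spare", "Trunk")})
goal = {("At", "Spare", "Axle")}

# Each action: (name, positive preconds, negated preconds, add list, delete list).
actions = [
    ("Remove(Spare,Trunk)",
     {("At", "Spare", "Trunk")}, set(),
     {("At", "Spare", "Ground")}, {("At", "Spare", "Trunk")}),
    ("Remove(Flat,Axle)",
     {("At", "Flat", "Axle")}, set(),
     {("At", "Flat", "Ground")}, {("At", "Flat", "Axle")}),
    ("PutOn(Spare,Axle)",
     {("At", "Spare", "Ground")}, {("At", "Flat", "Axle")},  # negated precondition
     {("At", "Spare", "Axle")}, {("At", "Spare", "Ground")}),
    ("LeaveOvernight", set(), set(), set(),
     {("At", "Spare", "Ground"), ("At", "Spare", "Axle"), ("At", "Spare", "Trunk"),
      ("At", "Flat", "Ground"), ("At", "Flat", "Axle")}),
]

def plan(init, goal):
    """Breadth-first forward search over states; returns a shortest action sequence."""
    frontier, seen = deque([(init, [])]), {init}
    while frontier:
        state, path = frontier.popleft()
        if goal <= state:
            return path
        for name, pre_pos, pre_neg, add, delete in actions:
            if pre_pos <= state and not (pre_neg & state):
                succ = frozenset((state - delete) | add)
                if succ not in seen:
                    seen.add(succ)
                    frontier.append((succ, path + [name]))

print(plan(init, goal))
# → ['Remove(Spare,Trunk)', 'Remove(Flat,Axle)', 'PutOn(Spare,Axle)']
```

Note how the successor computation — delete the negative effect literals, add the positive ones — is the same for every action, which is exactly the "single successor function" property discussed in the next section.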
382 Chapter 11. Planning

We will use On(b, x) to indicate that block b is on x, where x is either another block or the table. The action for moving block b from the top of x to the top of y will be Move(b, x, y). Now, one of the preconditions on moving b is that no other block be on it. In first-order logic, this would be ¬∃x On(x, b) or, alternatively, ∀x ¬On(x, b). These could be stated as preconditions in ADL. We can stay within the STRIPS language, however, by introducing a new predicate, Clear(x), that is true when nothing is on x. The action Move moves a block b from x to y if both b and y are clear. After the move is made, x is clear but y is not. A formal description of Move in STRIPS is

Action(Move(b, x, y),
  PRECOND: On(b, x) ∧ Clear(b) ∧ Clear(y),
  EFFECT: On(b, y) ∧ Clear(x) ∧ ¬On(b, x) ∧ ¬Clear(y)) .

Unfortunately, this action does not maintain Clear properly when x or y is the table. When x = Table, this action has the effect Clear(Table), but the table should not become clear; and when y = Table, it has the precondition Clear(Table), but the table does not have to be clear to move a block onto it. To fix this, we do two things. First, we introduce another action to move a block b from x to the table:

Action(MoveToTable(b, x),
  PRECOND: On(b, x) ∧ Clear(b),
  EFFECT: On(b, Table) ∧ Clear(x) ∧ ¬On(b, x)) .

Second, we take the interpretation of Clear(b) to be "there is a clear space on b to hold a block." Under this interpretation, Clear(Table) will always be true. The only problem is that nothing prevents the planner from using Move(b, x, Table) instead of MoveToTable(b, x). We could live with this problem—it will lead to a larger-than-necessary search space, but will not lead to incorrect answers—or we could introduce the predicate Block and add Block(b) ∧ Block(y) to the precondition of Move. Finally, there is the problem of spurious actions such as Move(B, C, C), which should be a no-op, but which has contradictory effects.
It is common to ignore such problems, because they seldom cause incorrect plans to be produced. The correct approach is to add inequality preconditions, as shown in Figure 11.4.

11.2 PLANNING WITH STATE-SPACE SEARCH

Now we turn our attention to planning algorithms. The most straightforward approach is to use state-space search. Because the descriptions of actions in a planning problem specify both preconditions and effects, it is possible to search in either direction: either forward from the initial state or backward from the goal, as shown in Figure 11.5. We can also use the explicit action and goal representations to derive effective heuristics automatically.

Forward state-space search

Planning with forward state-space search is similar to the problem-solving approach of Chapter 3. It is sometimes called progression planning, because it moves in the forward direction.
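As a sanity check on the inequality preconditions of Figure 11.4, the following sketch (the set-based representation and function names are ours) grounds the Move schema over the three-block initial state and confirms that spurious instances such as Move(B, C, C) never arise, while moves onto the table are filtered out simply because Clear(Table) never holds.

```python
from itertools import product

blocks = ["A", "B", "C"]
objects = blocks + ["Table"]

# Initial state of Figure 11.4: all three blocks on the table, all clear.
state = {("On", b, "Table") for b in blocks} | {("Clear", b) for b in blocks}

def ground_moves():
    """Ground instances of Move(b, x, y) respecting Block(b) and the
    inequality preconditions (b != x), (b != y), (x != y)."""
    for b, x, y in product(blocks, objects, objects):
        if b != x and b != y and x != y:
            yield ("Move", b, x, y)

def applicable(action, state):
    """A ground action applies when its literal preconditions hold in the state."""
    _, b, x, y = action
    return {("On", b, x), ("Clear", b), ("Clear", y)} <= state

apps = [a for a in ground_moves() if applicable(a, state)]
print(len(apps))  # 6: each block can move from the table onto either of the other two
```

Because Clear(Table) is never asserted, only MoveToTable (not shown) can ever place a block on the table — exactly the division of labor the text describes.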
Section 11.2. Planning with State-Space Search 383

Init(On(A, Table) ∧ On(B, Table) ∧ On(C, Table)
  ∧ Block(A) ∧ Block(B) ∧ Block(C)
  ∧ Clear(A) ∧ Clear(B) ∧ Clear(C))
Goal(On(A, B) ∧ On(B, C))
Action(Move(b, x, y),
  PRECOND: On(b, x) ∧ Clear(b) ∧ Clear(y) ∧ Block(b) ∧ (b ≠ x) ∧ (b ≠ y) ∧ (x ≠ y),
  EFFECT: On(b, y) ∧ Clear(x) ∧ ¬On(b, x) ∧ ¬Clear(y))
Action(MoveToTable(b, x),
  PRECOND: On(b, x) ∧ Clear(b) ∧ Block(b) ∧ (b ≠ x),
  EFFECT: On(b, Table) ∧ Clear(x) ∧ ¬On(b, x))

Figure 11.4 A planning problem in the blocks world: building a three-block tower. One solution is the sequence [Move(B, Table, C), Move(A, Table, B)].

[Figure 11.5: two search-tree diagrams over states labeled with At(P1, ·) and At(P2, ·), connected by the actions Fly(P1, A, B) and Fly(P2, A, B).]

Figure 11.5 Two approaches to searching for a plan. (a) Forward (progression) state-space search, starting in the initial state and using the problem's actions to search forward for the goal state. (b) Backward (regression) state-space search: a belief-state search (see page 84) starting at the goal state(s) and using the inverse of the actions to search backward for the initial state.

We start in the problem's initial state, considering sequences of actions until we find a sequence that reaches a goal state. The formulation of planning problems as state-space search problems is as follows:

• The initial state of the search is the initial state from the planning problem. In general, each state will be a set of positive ground literals; literals not appearing are false.
384 Chapter 11. Planning

• The actions that are applicable to a state are all those whose preconditions are satisfied. The successor state resulting from an action is generated by adding the positive effect literals and deleting the negative effect literals. (In the first-order case, we must apply the unifier from the preconditions to the effect literals.) Note that a single successor function works for all planning problems—a consequence of using an explicit action representation.

• The goal test checks whether the state satisfies the goal of the planning problem.

• The step cost of each action is typically 1. Although it would be easy to allow different costs for different actions, this is seldom done by STRIPS planners.

Recall that, in the absence of function symbols, the state space of a planning problem is finite. Therefore, any graph search algorithm that is complete—for example, A*—will be a complete planning algorithm.

From the earliest days of planning research (around 1961) until recently (around 1998), it was assumed that forward state-space search was too inefficient to be practical. It is not hard to come up with reasons why—just refer back to the start of Section 11.1. First, forward search does not address the irrelevant action problem—all applicable actions are considered from each state. Second, the approach quickly bogs down without a good heuristic. Consider an air cargo problem with 10 airports, where each airport has 5 planes and 20 pieces of cargo. The goal is to move all the cargo at airport A to airport B. There is a simple solution to the problem: load the 20 pieces of cargo into one of the planes at A, fly the plane to B, and unload the cargo. But finding the solution can be difficult, because the average branching factor is huge: each of the 50 planes can fly to 9 other airports, and each of the 200 packages can either be unloaded (if it is loaded) or loaded into any plane at its airport (if it is unloaded). On average, let's say there are about 1000 possible actions, so the search tree up to the depth of the obvious solution has about 1000^41 nodes. It is clear that a very accurate heuristic will be needed to make this kind of search efficient. We will discuss some possible heuristics after looking at backward search.

Backward state-space search

Backward state-space search was described briefly as part of bidirectional search in Chapter 3. We noted there that backward search can be difficult to implement when the goal states are described by a set of constraints rather than being listed explicitly. In particular, it is not always obvious how to generate a description of the possible predecessors of the set of goal states. We will see that the STRIPS representation makes this quite easy, because sets of states can be described by the literals that must be true in those states.

The main advantage of backward search is that it allows us to consider only relevant actions. An action is relevant to a conjunctive goal if it achieves one of the conjuncts of the goal. For example, the goal in our 10-airport air cargo problem is to have the 20 pieces of cargo at airport B, or more precisely,

At(C1, B) ∧ At(C2, B) ∧ … ∧ At(C20, B) .

Now consider the conjunct At(C1, B). Working backwards, we can seek actions that have this as an effect. There is only one: Unload(C1, p, B), where plane p is unspecified.
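One such regression step can be sketched as follows, assuming the usual air-cargo Unload(c, p, a) schema with precondition In(c, p) ∧ At(p, a) and positive effect At(c, a); for illustration we bind the unspecified plane p to a hypothetical P1. Regressing a conjunctive goal through an action drops the conjuncts the action achieves and adds the action's preconditions.

```python
def regress(goal, precond, add_effects):
    """Regress a conjunctive goal (a set of literals) through an action:
    remove the conjuncts the action's positive effects achieve, then
    add the action's preconditions."""
    return (goal - add_effects) | precond

# A two-cargo slice of the goal, for brevity.
goal = {("At", "C1", "B"), ("At", "C2", "B")}

# Unload(C1, P1, B), with the plane variable bound to a hypothetical P1.
precond = {("In", "C1", "P1"), ("At", "P1", "B")}
add_effects = {("At", "C1", "B")}

print(sorted(regress(goal, precond, add_effects)))
# The regressed subgoal: At(C2, B) ∧ In(C1, P1) ∧ At(P1, B)
```

The resulting set describes the states from which one Unload step reaches the goal — exactly the belief-state view of backward search shown in Figure 11.5(b).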
On average, let’s say there are about 1000 possible actions, so the search tree up to the depth of the obvious solution has about 100041 nodes. It is clear that a very accurate heuristic will be needed to make this kind of search efficient. We will discuss some possible heuristics after looking at backward search. Backward state-space search Backward state-space search was described briefly as part of bidirectionalsearch in Chapter 3. We noted there that backward search can be difficult to implement when the goal states are described by a set of constraints rather than being listed explicitly. In particular, it is not always obvious how to generate a description of the possible predecessors of the set of goal states. We will see that the STRIPS representation makes this quite easy because sets of states can be described by the literals that must be true in those states. RELEVANCE The main advantage of backward search is that it allows us to consider only relevant actions. An action is relevant to a conjunctive goal if it achieves one of the conjuncts of the goal. For example, the goal in our 10-airport air cargo problem is to have 20 pieces of cargo at airport B, or more precisely, At(C1, B) ∧ At(C2, B) ∧ . . . ∧ At(C20, B) . Now consider the conjunct At(C1, B). Working backwards, we can seek actions that have this as an effect. There is only one: Unload(C1, p, B), where plane p is unspecified