Planning as Heuristic Forward Search
Brian C. Williams
March 31, 2004
16.412J/6.834J
Slides with help from: Prof. Maria Fox

Readings in Planning as Forward Heuristic Search
“The FF Planning System: Fast Plan Generation Through Heuristic Search,” by Jorg Hoffmann and Bernhard Nebel, Journal of Artificial Intelligence Research, 2001.
“Planning as Heuristic Search,” by Blai Bonet and Hector Geffner, Artificial Intelligence Journal, 2001.
Outline
Introduction to FF
FF Search Algorithm
FF Heuristic Fn
FF Example
Appendix: HSP

Example: Polish
Move from room x to room y
  pre: robot is in x, door open
  add: robot is in y
  del: robot is in x
Open door
  pre: door is closed
  add: door is open
  del: door is closed
Close door
  pre: door is open
  add: door is closed
  del: door is open
Polish
  pre: door open
  add: floor polished
Initial State: In(A), Closed
Final State: In(B), Closed, Polished
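The example operators above can be written down as STRIPS-style triples of precondition, add, and delete lists. A minimal sketch in Python (the `Action` type, the `act` helper, and the proposition strings are illustrative, not from any particular planner):

```python
from typing import FrozenSet, NamedTuple

class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]   # preconditions
    add: FrozenSet[str]   # add effects
    dele: FrozenSet[str]  # delete effects ("del" is a Python keyword)

def act(name, pre, add, dele=()):
    return Action(name, frozenset(pre), frozenset(add), frozenset(dele))

# The four operators from the slide, ground for rooms A and B:
move_a_b   = act("move(A,B)", {"in(A)", "opened"}, {"in(B)"}, {"in(A)"})
open_door  = act("open",  {"closed"}, {"opened"}, {"closed"})
close_door = act("close", {"opened"}, {"closed"}, {"opened"})
polish     = act("polish", {"opened"}, {"polished"})

initial = frozenset({"in(A)", "closed"})
goal    = frozenset({"in(B)", "closed", "polished"})
```

Note that `polish` has an empty delete list, which is what later makes it harmless under the relaxation FF uses.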
Planning as Forward Heuristic Search
Planning can be seen as a state space search for a path from the initial state to a goal state.
Planning research has largely not been concerned with finding optimal solutions, although heuristics prefer shorter plans.
Planning research has largely used incomplete or uninformed search methods:
  Breadth first search
  Meta search rules
=> The size of most state spaces requires informative heuristics to guide the search.

Review: Search Strategies
Breadth first search (Uninformed): systematic search of the state space in layers.
A* search (Informed): expands the search node with the best estimated cost, where estimated cost = cost-so-far + optimistic-cost-to-go.
Greedy search: expands the search node closest to the goal according to a heuristic function.
Hill-climbing search: moves towards the goal by random selection from the best children.
=> To apply informed search to planning, we need a heuristic fn.
Fast Forward (FF)
Forward-chaining heuristic search planner.
Basic principle: hill-climb through the space of problem states, starting at the initial state.
  Each child state results from applying a single plan operator.
  Always moves to the first child state found that is closer to the goal.
  Records the operators applied along the path.
=> The operators leading to the goal constitute a plan.

Outline
Introduction to FF
FF Search Algorithm
FF Heuristic Fn
FF Example
Appendix: HSP
Planning Problem and State Space
A planning problem is a tuple <P, A, I, G>:
  Propositions P,
  Ground actions A (instantiated operators),
  Initial state I, a subset of P, and
  Goal state G, a subset of P.
The state space of a problem consists of all subsets of the propositions P.
A transition between two states is any valid application of an action, that is, one whose preconditions are satisfied.

FF Search Strategy
FF uses a strategy called enforced hill-climbing:
  Obtain a heuristic estimate of the value of the current state.
  Find action(s) transitioning to a better state.
  Move to the better state.
  Append the actions to the plan head.
=> Never backtrack over any choice.
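The transition relation just described amounts to two small set operations. A sketch, encoding a state as a frozenset of propositions and an action as a dict (the `pre`/`add`/`del` field names are illustrative):

```python
def applicable(state, action):
    # An action can fire iff all of its preconditions hold in the state.
    return action["pre"] <= state

def apply_action(state, action):
    # Successor state: remove the delete effects, then add the add effects.
    return frozenset((state - action["del"]) | action["add"])

open_door = {"name": "open", "pre": {"closed"}, "add": {"opened"}, "del": {"closed"}}
s0 = frozenset({"in(A)", "closed"})
s1 = apply_action(s0, open_door)   # frozenset({"in(A)", "opened"})
```

Note that `open_door` is not applicable in `s1`, since its precondition "closed" was deleted.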
Maximize Utility h
[Figure: search tree. Init expands to S1, S2, S3; taking action A to S1 and then action B expands S1 to S4, S5, S6.]
h(S1) < h(S4) < h(init) < h(S2) < h(S3) < h(S5) = h(S6)
Plan head: A, B

Finding a better state: Plateaus
[Figure: continuing the search tree from S6 with actions C and D through states S7–S12.]
h(S7) < h(S6) = h(S8) = ... = h(S10) < h(S11) < h(S12)
Perform breadth first search from the current state to states reachable by action applications, stopping as soon as a strictly better one is found.
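The breadth-first plateau search can be sketched as follows, using a frozenset-of-propositions state and dict-encoded actions with `pre`/`add`/`del` sets (all names are illustrative, not FF's actual code):

```python
from collections import deque

def find_better_state(state, actions, h):
    """Breadth-first search from `state` over action applications,
    returning the first reachable state whose heuristic value is
    strictly better than h(state), together with the action sequence
    leading to it, or None when the reachable space is exhausted."""
    best = h(state)
    queue = deque([(state, [])])
    seen = {state}
    while queue:
        s, path = queue.popleft()
        for a in actions:
            if a["pre"] <= s:                       # applicable?
                s2 = frozenset((s - a["del"]) | a["add"])
                if s2 in seen:
                    continue
                if h(s2) < best:                    # strictly better: stop
                    return s2, path + [a["name"]]
                seen.add(s2)
                queue.append((s2, path + [a["name"]]))
    return None
```

Enforced hill-climbing repeatedly calls this from the current state and appends the returned action sequence to the plan head; a None result is what triggers FF's fallback to best-first search.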
Enforced Hill-Climbing (cont.)
The success of this strategy depends on how informative the heuristic is.
FF uses a heuristic found to be informative in a large class of benchmark planning domains.
The strategy is not complete: never backtracking means that some parts of the search space are lost.
If FF fails to find a solution using this strategy, it switches to standard best-first search (e.g., greedy or A* search).

Outline
Introduction to FF
FF Search Algorithm
FF Heuristic Fn
FF Example
Appendix: HSP
FF's Heuristic Estimate
The value of a state is a measure of how close it is to a goal state.
This cannot be determined exactly (too hard), but it can be approximated.
One way of approximating is to solve a relaxed problem.
Relaxation is achieved by ignoring the negative effects of the actions.
The relaxed action set A' is defined by: A' = { <pre(a), add(a), {}> | a in A }

Distance Estimate Extracted From A Relaxed Plan Graph
Current: In(A), Closed; Goal: In(B)
Layer 1: In(A), Closed
Layer 2 (via Open and noops): In(A), Closed, Opened
Layer 3 (via Move, Open, Close, and noops): In(A), In(B), Closed, Opened
Layers correspond to successive time points; the number of layers indicates the minimum time to achieve the goals.
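Dropping the delete list is a one-line transformation. A sketch, again using the illustrative dict encoding of actions:

```python
def relax(action):
    # A relaxed action keeps its preconditions and add effects
    # but has an empty delete list, so facts are never removed.
    relaxed = dict(action)
    relaxed["del"] = set()
    return relaxed

open_door = {"name": "open", "pre": {"closed"}, "add": {"opened"}, "del": {"closed"}}
r = relax(open_door)
# Applying the relaxed action keeps "closed" while adding "opened":
state = frozenset(({"in(A)", "closed"} - r["del"]) | r["add"])
```

Because relaxed actions never delete anything, the set of true propositions only grows, which is what makes the layered graph construction on the next slide terminate.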
Building the Relaxed Plan Graph
Start at the initial state.
Repeatedly apply all relaxed actions whose preconditions are satisfied, asserting their (positive) effects in the next layer.
If all actions have been applied and the goals are not all present in the final graph layer, then the problem is unsolvable.
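The layer-building loop above can be sketched as a fixpoint computation (names are illustrative; real FF additionally records, for each fact, which action first achieved it):

```python
def build_rpg(init, actions, goal):
    """Forward-chain relaxed actions layer by layer until every goal
    appears, returning the list of proposition layers, or None if a
    fixpoint is reached first (the relaxed problem is unsolvable)."""
    layer = set(init)
    layers = [frozenset(layer)]
    while not goal <= layer:
        nxt = set(layer)                 # noops: every fact persists
        for a in actions:
            if a["pre"] <= layer:
                nxt |= a["add"]          # deletes ignored (relaxed)
        if nxt == layer:                 # no new facts, goals still missing
            return None
        layer = nxt
        layers.append(frozenset(layer))
    return layers
```

On the door example this produces exactly the three layers shown in the figure: {In(A), Closed}, then Opened is added, then In(B).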
Extracting a Relaxed Solution
When a layer containing all of the goals is reached, FF searches backwards for a plan.
The first possible achiever found is always used to achieve each goal; this maximizes the possibility of exploiting actions in the relaxed plan.
The relaxed plan might contain many actions happening concurrently at a layer.
The number of actions in the relaxed plan is an estimate of the true cost of achieving the goals.
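Backward extraction can be sketched as follows, over the proposition layers and dict-encoded actions used above (illustrative only; real FF prefers noops and already-selected achievers when breaking ties):

```python
def extract_relaxed_plan(layers, actions, goal):
    """Backward-chain through the relaxed plan graph: for each goal,
    at the layer where it first appears, commit to the first achieving
    action and push that action's preconditions as earlier subgoals.
    Goals first true in layer 0 hold in the current state and need no
    achiever. Returns the relaxed plan's action names."""
    def first_layer(p):
        for i, layer in enumerate(layers):
            if p in layer:
                return i
        raise ValueError("unreachable proposition: %s" % p)

    goals = {i: set() for i in range(len(layers))}
    for g in goal:
        goals[first_layer(g)].add(g)

    plan = []
    for i in range(len(layers) - 1, 0, -1):
        for g in goals[i]:
            # First achiever whose preconditions hold one layer earlier.
            a = next(a for a in actions
                     if g in a["add"] and a["pre"] <= layers[i - 1])
            plan.append(a["name"])
            for p in a["pre"]:
                goals[min(first_layer(p), i - 1)].add(p)
    return plan
```

The length of the returned plan is the heuristic value: for the door example it is 2 (Open to achieve Opened, then Move to achieve In(B)).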