LPG: Local search for Planning Graphs Seung H. Chung 16412J Cognitive Robotics Outline Temporal Action Graph lalksat: Stochastic Local Search Better Neighbor Relaxed Plan A Gerevini, A Saetti, L. Serina"Planning through Stochastic Local Search and Temporal Action Graphs", to appear in Journal of Artificial Intelligence Research JAIR)
LPG: Local search for Planning Graphs Seung H. Chung 16.412J Cognitive Robotics 2 Outline • • • • • Stochastic Local Search and Temporal Action Graphs”, to appear in Journal of Artificial Intelligence Research (JAIR). Temporal Action Graph Walksat: Stochastic Local Search Better Neighbor Relaxed Plan A. Gerevini, A. Saetti, I. Serina “Planning through
Overview Uses Strips-like operator but adds time and metric resources to the planning description: Planning Domain Definition Language(PDDL)2.1 · Features ses Temporal Action graphs(TA-graphs): Similar to a plan graph, but adds temporal representation ses stochastic local search: Similar to Walksat Uses relaxed plan for heuristic to guide the search: similar to fe LPG outperformed all general purpose planners in the time and metric resource domains (3d IPC) Linear Action Graph(LA-graph NiT Level 1 Level 2 Level 3 Level 4 One action er layer @一网
3 Overview • resources to the planning description: Planning Domain Definition Language (PDDL) 2.1 • – plan graph, but adds temporal representation – – to FF • the time and metric resource domains (3rd IPC) 4 INIT Level 1 Level 2 Level 3 Level 4 Linear Action Graph (LA-graph) f1 f2 f3 f4 f5 no-op f6 f3 f4 f5 f6 f9 f5 f6 f12 f5 f13 f12 f5 no-op no-op no-op no-op no-op a1 no-op no-op f7 f7 f7 no-op no-op a4 f10 a2 a no-op 3 One action per layer Uses Strips-like operator but adds time and metric Features: Uses Temporal Action graphs (TA-graphs): Similar to a Uses stochastic local search: Similar to Walksat Uses relaxed plan for heuristic to guide the search: similar LPG outperformed all general purpose planners in
Temporal Action Graph (TA-Graph) INIT Level 1 Level 2 Level 3 Level 4 .澹回回 0 50 (-) (-) Inconsistent 0 120 precondition 100 220 220 Ordering Constraint NiT Level 1 Level 2 Level 3 Level 4 f6 H(6-ooo6 a 50 160 (-) (-) I Causal Constraint 0 120 a1<a4 Exclusive Constraints 0 (-) g a2< (-)
5 INIT Level 1 Level 2 Level 3 Level 4 Temporal Action Graph (TA-Graph) f1 f2 f3 f4 f5 no-op f6 f3 f4 f5 f6 f9 f5 f6 f12 f5 f13 f12 f5 no-op no-op no-op no-op no-op a1 no-op no-op f7 f7 f7 no-op no-op a4 f10 a2 a no-op 3 0 0 0 0 0 [50] 50 [70] 120 [100] 220 [40] 160 50 50 50 (–) (–) (–) (–) 50 160 220 160 0 220 220 120 Inconsistent precondition 6 INIT Level 1 Level 2 Level 3 Level 4 Ordering Constraint f1 f2 f3 f4 f5 no-op f6 f3 f4 f5 f6 f9 f5 f6 f12 f5 f13 f12 f5 no-op no-op no-op no-op no-op a1 no-op no-op f7 f7 f7 no-op no-op a4 f10 a2 a no-op 3 0 0 0 0 0 [50] 50 [70] 120 [100] 220 [40] 160 50 50 50 (–) (–) (–) (–) 50 160 220 160 0 220 220 120 Causal Constraint • a1< a4 • a2< a3 Exclusive Constraints • a1< a2 • a2< a3 • a2< a4 Causal Constraint • a1< a4 • a2< a3 Exclusive Constraints • a1< a2 • a2< a3 • a2< a4
Walkplan: Stochastic Local Search Walkplan(Il, max steps, max restarts, p) II: Planning problem description max steps: Maximum number of search max restarts Maximum number of restart p: Noise factor Solution TA-graph ea With probability p use stochastic local search to find a plan Search the plan space max steps number of times If no plan is found, try restarting the search from the beginning up to max restarts number of time Walkplan Algorithm Walkplan(l, max steps, max restarts, p) for i =1 to max restarts de A= an initial TA-graph derived from n Set of TA-graphs in which an action for 3 =1 to max steps do was inserted or fa is solution then return A removed to o= an inconsistency in A resolve the N(o, A)= neighborhood of A for o inconsistency if彐A′∈N(o,A) such that A′ is no worse than a then else if random p then A=A′∈N(,A) A= best A′∈N(σ,A) What is a better neighbor? return fail
7 Walkplan: Stochastic Local Search • Walkplan(Π,max_steps,max_restarts,p) – Input • Π : Planning problem description • max_steps : Maximum number of search • max_restarts : Maximum number of restart • p : Noise factor – Output • Solution TA-graph • Idea: – With probability p use stochastic local search to find a plan – Search the plan space max_steps number of times – If no plan is found, try restarting the search from the beginning up to max_restarts number of time 8 Walkplan Algorithm Walkplan(Π,max_steps,max_restarts,p) for i = 1 to max_restarts do A = an initial TA-graph derived from Π for j = 1 to max_steps do if A is solution then return A σ = an inconsistency in A N(σ,A) = neighborhood of A for σ if ∃A’∈ N(σ,A) such that A’ is no worse than A then A = A’ else if random < p then A = A’∈ N(σ,A) else A = best A’∈ N(σ,A) return fail Set of TA-graphs in which an action was inserted or removed to resolve the inconsistency What is a better neighbor? What is a better neighbor?
Better Neighbor A neighbor A'E N(o, A)resolves the inconsistency o by inserting or deleting an action Use evaluation function E(a) E(a=a Execution cost (a +β Temporal cost(a) +y Search cost(a) E(a)= E(a=aEXecution cost(a +β Temporal cost(a) +r Search cost(a) Relaxed plan Idea: Don't consider the mutex relation and perform reachability analysis Insert action a Find all actions that is required to support the preconditions of a Compute the maximum time duration required for all actions Return Set of actions added: Aset(EvalAdd(a)) Max time duration of the actions: End time(EvalAdd(a)) Remove action a Find all actions that is required to support all preconditions that were unsupported due to removal of a Compute the maximum time duration required for all newly inserted actions Return Set of actions added: Aset(EvalDel(a Max time duration of the actions: End time(EvalDel(a))
9 Better Neighbor? • A’∈ N(σ,A) resolves the inconsistency σ by inserting or deleting an action. • E(a) = E(a)i = α⋅Execution_cost (a)i + β⋅Temporal_cost(a)i + γ⋅Search_cost(a)i E(a)r = α⋅Execution_cost (a)r + β⋅Temporal_cost(a)r + γ⋅Search_cost(a)r 10 Relaxed Plan • • – – – • • End_time(EvalAdd(a)) • – unsupported due to removal of a – actions – • • End_time(EvalDel(a)) A neighbor Use evaluation function E(a) Idea: Don’t consider the mutex relation and perform reachability analysis. Insert action a Find all actions that is required to support the preconditions of a Compute the maximum time duration required for all actions Return: Set of actions added: Aset(EvalAdd(a)) Max time duration of the actions: Remove action a Find all actions that is required to support all preconditions that were Compute the maximum time duration required for all newly inserted Return: Set of actions added: Aset(EvalDel(a)) Max time duration of the actions:
Better Neighbor Insert an action EXecution_cost (a)'=2a'EAset(EvalAdd(a)Cost(a') Temporal cost(a)) End time(EvalAdd(a)) Search cost(a)' =Aset(EvalAdd(a) 十 a'∈Aset( EvalAdd(a) Threats(a) Remove an action Execution_cost(a)「 a'∈Aset( EvalDel(a) Cost(a') Temporal cost(a) =End time(EvalAdd(a)) Search cost(a) = Aset(EvalAdd(a) a'∈Aset( EvadE(a) Threats(a) Advantages and disadvantages Pros One of the fastest domain-independent planners Relatively expressive domain description languages Can easily be extended to be anytime algorithm ·Cons Algorithm is not guaranteed to be complete No guarantee on the quality of the plan Does not allow flexible time bounds
11 Better Neighbor • • Execution_cost (a)i = Σa’∈Aset(EvalAdd(a))Cost(a’) Temporal_cost(a)i = End_time(EvalAdd(a)) Search_cost(a)i = |Aset(EvalAdd(a))| + Σa’∈Aset(EvalAdd(a))Threats(a’) Execution_cost (a)r = Σa’∈ ))Cost(a’) Temporal_cost(a)r = End_time(EvalAdd(a)) Search_cost(a)r = |Aset(EvalAdd(a))| + Σa’∈ ))Threats(a’) 12 Advantages and Disadvantages – – – – – – Insert an action Remove an action Aset(EvalDel(a Aset(EvalDel(a • Pros One of the fastest domain-independent planners Relatively expressive domain description languages Can easily be extended to be anytime algorithm • Cons Algorithm is not guaranteed to be complete No guarantee on the quality of the plan Does not allow flexible time bounds