Planning in Artificial Intelligence PDF

Planning Artificial intelligence Kristóf Karacs PPKE-ITK Recap n What is intelligence? n Agent model n Problem solving by search ¨ Non-informed search strategies ¨ Informed search strategies n Logic ¨ Propositional logic ¨ Predicate logic 1 Outline n Planning and search n Situation calculus n Partial order planning n Graphplan Planning n Planning ¨ Initial state ¨ Goal state ¨ Set of actions n Can be described as a search problem 2 Planning vs. search Problems of using search for planning n Description of actions ¨ By defining follower states n Description of states ¨ Every state has to be exactly given n Description of goals ¨ Only by defining goal states (and the heuristic) n Description of a plan ¨ Fixed order of actions, can only be started from the start or the goal state 3 Undefined starting state n What if initial state is not known exactly? ¨ E.g. “Start in bottom row, with goal being C” n Search over “sets” of underlying (atomic) states n Inefficient approach ¨ Exponential blowup in the number of sets of atomic states Planning as logic search n A classic approach to planning: situation calculus n It uses ¨ FOPL descriptions of the relevant sets of states and actions ¨ ATP to find a plan 4 Situation Calculus n Reification – treat situations as objects and use them as predicate arguments ¨ At(Agent, Room 13, s8) where s8 refers to a particular situation n Result function – gives the new situation resulting from taking an action in another situation ¨ Result(StandUp, s1) = s3 n Effect Axioms – what is the effect of taking an action in the world ¨ " x.s. Present(x,s) Ù Portable(x) → Holding(x, Result(Grab, s)) ¨ " x.s. ¬ Holding(x, Result(Drop, s)) n Frame Axioms – what does NOT change ¨ " x.s. color(x,s) = color(x, Result(Grab, s)) ¨ Can be included among effect axioms Planning in situation calculus n Use theorem proving to find a plan n Goal state: $s. At(Home, s) Ù Holding(Gold, s) n Initial state: At(Home, s0) Ù ¬ Holding(Gold, s0) Ù Holding(Rope, s0) … n Plan: Result(North, Result(Grab, Result(South, s0))) ¨ A situation that satisfies the requirements ¨ Course of actions can be read out ¨ First, move South, then Grab and then move North 5 Problems of using situation calculus for planning n Reducing specific planning problem to general problem of theorem proving is not efficient ¨ Exponential complexity ¨ Optimality of plan is difficult to assess n A more specialized approach can exploit special properties of planning problems Special properties of planning n Connect action descriptions and state descriptions (focus searching) ¨ If goal contains Holding(Gold) and Grab(Gold) causes Holding(Gold) to be true, then plan should include Grab(Gold) n Add actions to a plan in any order n Sub-problem independence n Restrict language for describing goals, states and actions 6 STRIPS: Stanford Research Institute Problem Solver n ~1971: The first real planning system n Pushing boxes between rooms STRIPS representation n States: conjunctions of ground literals ¨ In(robot, r3) Ù Closed(door6) Ù … n Goals: conjunctions of literals ¨ (implicit $ r) In(Robot, r) Ù In(Charger, r) n Actions (operators) ¨ Name (implicit "): Go(r1, r2) ¨ Preconditions: conjunction of literals n At(r1) Ù Path(r1, r2) ¨ Effects: conjunctions of literals (aka add-list & delete-list) n At(r2) Ù ¬ At(r1) ¨ Assumes no inference in relating predicates (only equality) 7 STRIPS example n Action ¨ Buy(x, store) n Pre: At(store), Sells(store, x) n Eff: Have(x) ¨ Go(x, y) n Pre: At(x) n Eff: At(y), ¬At(x) n Goal ¨ Have(Milk) Ù Have(Banana) Ù Have(Drill) n Start ¨ At(Home) Ù Sells(SM, Milk) Ù Sells(SM, Banana) Ù Sells(HW, Drill) Planning algorithms n Progression planners: consider the effect of all possible actions in a given state n Regression planners: to achieve a goal, what must have been true in previous state ¨ Have(M) Ù Have(B) Ù Have(D) ¨ Buy(M, store) At(store) Ù Sells(store, M) Ù Have(B) Ù Have(D) n Both have the problem of lack of direction – what action or goal to pursue next 8 Search in plan space n Situation space – both progressive and regressive planners plan in space of situations n Plan space – start with null plan and add steps to plan until it achieves the goal ¨ Much smaller complexity ¨ Planning order independent from execution order ¨ Least-commitment n “what actions” before “what order” ¨ Means-ends analysis – Try to match the available means to the current ends Partially ordered plan n Set of steps (instance of an operator) n Set of ordering constraints Si < Sj n Set of variable binding constraints v = x ¨v is a variable in a step; x is a constant or another variable n Set of causal links Si ®c Sj ¨ Step i achieves precondition c for step j 9 Initial plan n Steps: {start, finish} n Ordering: {start < finish} n start ¨ Pre: none ¨ Eff: start conditions n finish ¨ Pre: goal conditions ¨ Eff: none Completeness and consistency n A plan is complete iff every precondition of every step is achieved by some other step n Si ®c Sj (“step i achieves c for step j”) iff ¨ Si < Sj ¨ c Î effects(Si) ¨ ¬$ Sk. ¬c Î effects(Sk)and Si < Sk < Sj is consistent with the ordering constraints n A plan is consistent iff ¨ the ordering constraints are consistent and ¨ the variable binding constraints are consistent 10 Partially Ordered Plan (POP) n Plan ¨ Steps ¨ Ordering constraints ¨ Variable binding constraints ¨ Causal links n POP Algorithm ¨ Make initial plan ¨ Loop until plan is a complete n Select a subgoal n Choose an operator n Resolve threats Choosing an operator n Choose operator(c, Sneeds) ¨ Choose a step S from the plan or a new step S by instantiating an operator that has c as an effect ¨ If there’s no such step, then fail (backtrack) ¨ Add causal link S ®c Sneeds ¨ Add ordering constraint S < Sneeds ¨ Add variable binding constraints if necessary ¨ Add S to steps if necessary 11 Resolving threats n A step S threatens a causal link Si ®c Sj iff ¬ c Î effects(S) and it’s possible that Si < S < Sj n For each threat ¨ Choose n Promote S : S < Si < Sj n Demote S : Si < Sj < S ¨ If resulting plan is inconsistent, then Fail (backtrack) n Threats with variables ¨ S is a threat if there is any instantiation of the variables that makes ¬c Î effects(S) ¨ Negative binding STRIPS example n Action ¨ Buy(x, store) n Pre: At(store), Sells(store, x) n Eff: Have(x) ¨ Go(x, y) n Pre: At(x) n Eff: At(y), ¬At(x) n Goal ¨ Have(Milk) Ù Have(Banana) Ù Have(Drill) n Start ¨ At(Home) Ù Sells(SM, Milk) Ù Sells(SM, Banana) Ù Sells(HW, Drill) 12 Start At(HO) Ù S(HW,D) Ù S(SM,M) Ù S(SM,B) H(D) Ù H(M) Ù H(B) Finish Start At(HO) Ù S(HW,D) Ù S(SM,M) Ù S(SM,B) At(s1) Ù S(s1,D) At(s2) Ù S(s2,M) At(s3) Ù S(s3,B) Buy(D,s1) Buy(M,s2) Buy(B,s3) H(D) H(M) H(B) H(D) Ù H(M) Ù H(B) Finish 13 Start At(HO) Ù S(HW,D) Ù S(SM,M) Ù S(SM,B) s1/HW s2/SM s3/SM At(HW) Ù S(HW,D) At(SM) Ù S(SM,M) At(SM) Ù S(SM,B) Buy(D,HW) Buy(M,SM) Buy(B,SM) H(D) H(M) H(B) H(D) Ù H(M) Ù H(B) Finish Start At(HO) Ù S(HW,D) Ù S(SM,M) Ù S(SM,B) s1/HW s2/SM s3/SM At(HW) Ù S(HW,D) At(SM) Ù S(SM,M) At(SM) Ù S(SM,B) Buy(D,HW) Buy(M,SM) Buy(B,SM) H(D) H(M) H(B) H(D) Ù H(M) Ù H(B) Finish 14 Start At(HO) Ù S(HW,D) Ù S(SM,M) Ù S(SM,B) At(x1) At(x2) Go(x1,HW) Go(x2,SM) At(HW) Ù ØAt(x1) At(SM) Ù ØAt(x2) At(HW) Ù S(HW,D) At(SM) Ù S(SM,M) At(SM) Ù S(SM,B) Buy(D,HW) Buy(M,SM) Buy(B,SM) H(D) H(M) H(B) H(D) Ù H(M) Ù H(B) Finish Start x1/HO At(HO) Ù S(HW,D) Ù S(SM,M) Ù S(SM,B) x2/HO At(HO) At(HO) Go(HO,HW) Go(HO,SM) At(HW) Ù ØAt(HO) At(SM) Ù ØAt(HO) At(HW) Ù S(HW,D) At(SM) Ù S(SM,M) At(SM) Ù S(SM,B) Buy(D,HW) Buy(M,SM) Buy(B,SM) H(D) H(M) H(B) H(D) Ù H(M) Ù H(B) Are we ready? Finish 15 Start At(HO) Ù S(HW,D) Ù S(SM,M) Ù S(SM,B) At(HO) At(HO) Go(HO,HW) Go(HO,SM) At(HW) Ù ØAt(HO) At(SM) Ù ØAt(HO) At(HW) Ù S(HW,D) At(SM) Ù S(SM,M) At(SM) Ù S(SM,B) Buy(D,HW) Buy(M,SM) Buy(B,SM) H(D) H(M) H(B) H(D) Ù H(M) Ù H(B) Finish Start At(HO) Ù S(HW,D) Ù S(SM,M) Ù S(SM,B) At(HO) At(HO) Go(HO,HW) Go(HO,SM) At(HW) Ù ØAt(HO) At(SM) Ù ØAt(HO) At(HW) Ù S(HW,D) At(SM) Ù S(SM,M) At(SM) Ù S(SM,B) Buy(D,HW) Buy(M,SM) Buy(B,SM) H(D) H(M) H(B) H(D) Ù H(M) Ù H(B) Finish 16 Start At(HO) Ù S(HW,D) Ù S(SM,M) Ù S(SM,B) At(HO) At(x2) Go(HO,HW) Go(x2,SM) At(HW) Ù ØAt(HO) At(SM) Ù ØAt(x2) At(HW) Ù S(HW,D) At(SM) Ù S(SM,M) At(SM) Ù S(SM,B) Buy(D,HW) Buy(M,SM) Buy(B,SM) H(D) H(M) H(B) H(D) Ù H(M) Ù H(B) Finish Start x2/HW At(HO) Ù S(HW,D) Ù S(SM,M) Ù S(SM,B) At(HO) At(HW) Go(HO,HW) Go(HW,SM) At(HW) Ù ØAt(HO) At(SM) Ù ØAt(HW) At(HW) Ù S(HW,D) At(SM) Ù S(SM,M) At(SM) Ù S(SM,B) Buy(D,HW) Buy(M,SM) Buy(B,SM) H(D) H(M) H(B) H(D) Ù H(M) Ù H(B) Finish 17 Start At(HO) Ù S(HW,D) Ù S(SM,M) Ù S(SM,B) At(HO) At(HW) Go(HO,HW) Go(HW,SM) At(HW) Ù ØAt(HO) At(SM) Ù ØAt(HW) At(HW) Ù S(HW,D) At(SM) Ù S(SM,M) At(SM) Ù S(SM,B) Buy(D,HW) Buy(M,SM) Buy(B,SM) H(D) H(M) H(B) H(D) Ù H(M) Ù H(B) Finish Start At(HO) Ù S(HW,D) Ù S(SM,M) Ù S(SM,B) At(HO) At(HW) Go(HO,HW) Go(HW,SM) At(HW) Ù ØAt(HO) At(SM) Ù ØAt(HW) At(HW) Ù S(HW,D) At(SM) Ù S(SM,M) At(SM) Ù S(SM,B) Buy(D,HW) Buy(M,SM) Buy(B,SM) H(D) H(M) H(B) H(D) Ù H(M) Ù H(B) Finish 18 Sussman anomaly n Subgoal dependence C ¨ Goal: on(A,B) Ù on(B,C) A B n Exercise ¨ Objects: A, B, C, T ¨ Predicates n on(x,y), clear(x) A ¨ Operators B n move(x,y,z) C Sussman anomaly n Subgoal dependence C ¨ Goal: on(A,B) Ù on(B,C) A B n Exercise ¨ Objects: A, B, C, T ¨ Predicates B n on(x,y), clear(x) A C B C A ¨ Operators n move(x,y,z) 19 Operators n Move(x,y,z) ¨ Pre: on(x,y), clear(x), clear(z) ¨ Eff: on(x,z), clear(y), ¬on(x,y), ¬clear(z) n How do we move to the table? n Move2T(x,y) ¨ Pre: on(x,y), clear(x) ¨ Eff: on(x,T), clear(y), ¬on(x,y) Operators n Move(x,y,z) ¨ Pre: on(x,y), clear(x), clear(z), block(z) ¨ Eff: on(x,z), clear(y), ¬on(x,y), ¬clear(z) n How do we move to the table? n Move2T(x,y) ¨ Pre: on(x,y), clear(x) ¨ Eff: on(x,T), clear(y), ¬on(x,y) 20 Limitations of the STRIPS language n Hierarchical planning ¨ Generating complex plans often requires abstract planning over increasingly detailed search spaces n Complex state conditions ¨ STRIPS variables are limited in their complexity ¨ There is no quantification and no conditional statements n Representing time ¨ The STRIPS framework assumes that everything happens instantly ¨ Not possible to represent durations, deadlines, time windows, etc. n Resource limitations ¨ There is no way to represent the amount of available workers, equipment, money, etc. or constraints on them Planning graphs n POP ¨ “Human-like”, but very slow ¨ Efficiency hard to evaluate n GraphPlan ¨ Simplified planning model n propositional planner (no variables ® no matching) ¨ Bigger – separate propositions are needed for every combination of arguments ¨ Efficient algorithm ¨ Complexity between scheduling and planning 21 Planning graph Planning graph n Main idea ¨ Construct a graph of possible outcomes … … … 22 GraphPlan algorithm n Resembles iterative DFS 1. Make a plan graph of depth k 2. Search for a solution 3. If succeed, return a plan 4. Else k := k + 1 5. Go to step 1 Mutually exclusive actions n Two action instances at level i are mutex if ¨ Inconsistent effects n effect of one action is negation of effect of another ¨ Interference n one action deletes the precondition of the other ¨ Competing needs n the actions have preconditions that are mutex at level i - 1 23 Mutually exclusive propositions n Two propositions at level i are mutex if ¨ Negation n they are negations of one another ¨ Inconsistent support n all ways of achieving the propositions at level i - 1 are pairwise mutex Overview of nontrivial mutual exclusion classes Inconsistent Effects Interference (Precond-Effect) Competing Needs Inconsistent Support 24 Trends with new layers n Propositions monotonically increase n Actions monotonically increase n Proposition mutex relationships monotonically decrease n Action mutex relationships monotonically decrease p p p p A A A ¬q q q q ¬r ¬q ¬q ¬q B B ¬r r r ¬r ¬r Solution extraction n If all the literals in the goal appear at the deepest level and not mutex, then search for a solution for each subgoal at level i ¨ For each subgoal at level i n Choose an action to achieve it n If it’s mutex with another action, Fail ¨ Repeat for preconditions at level i - 2 25 Example: Dinner date n Initial conditions: garbage Ù cleanHands Ù quiet n Goal: dinner Ù present Ù ¬ garbage n Actions: ¨ Cook precondition: cleanHands effect: dinner ¨ Wrap precondition: quiet effect: present ¨ Carry precondition: - effect: ¬ garbage Ù ¬ cleanHands ¨ Dolly precondition effect: ¬ garbage Ù ¬ quiet Search for a solution plan 26 Extensions n Lots of time optimizations n Disjunctive preconditions n Universally quantified (almost :) preconditions and effects n Conditional planning Other approaches n Hierarchical planning n SATPlan ¨ Reduces planning problem to satisfiability problem ¨ Strongly related to GraphPlan n FOPL like planning ¨ Using structural information and heuristics n Comeback of state-space planners ¨ UnPOP (1996), Heuristic Search Planner (1999), FastDownward (2004), LAMA (2008) n Introducing uncertainty ¨ Learning world dynamics ¨ Conditional planning ¨ Replanning 27

Planning in Artificial Intelligence PDF

Document Details

Tags

Related

Summary

Full Transcript