Artificial Intelligence Lecture Notes PDF
Document Details
Uploaded by Deleted User
Prof. Hussein Alzoubi
Tags
Summary
These lecture notes cover constraint satisfaction problems (CSPs) in artificial intelligence. The notes detail the components of a CSP, including variables, domains, and constraints, and provide examples including map coloring and job-shop scheduling.
Full Transcript
Artificial intelligence Prof. Hussein Alzoubi Introduction In this chapter we use a factored representation for each state: a set of variables, each of which has a value. A problem is solved when each variable has a value that satisfies all the constraints on the variable...
Artificial intelligence Prof. Hussein Alzoubi Introduction In this chapter we use a factored representation for each state: a set of variables, each of which has a value. A problem is solved when each variable has a value that satisfies all the constraints on the variable. A problem described this way is called a constraint satisfaction problem, or CSP. 12/17/2024 Prof. Hussein Alzoubi 2 …. Introduction CSP search algorithms take advantage of the structure of states and use general rather than domain-specific heuristics to enable the solution of complex problems. The main idea is to eliminate large portions of the search space all at once by identifying variable/value combinations that violate the constraints. CSPs have the additional advantage that the actions and transition model can be deduced from the problem description. 12/17/2024 Prof. Hussein Alzoubi 3 A constraint satisfaction problem consists of three components, X, D, and C: X is a set of variables, {X1,... , Xn}. D is a set of domains, {D1,... , Dn}, one for each variable. C is a set of constraints that specify allowable combinations of values. A domain, Di, consists of a set of allowable values, {v1,... ,vk}, for variable Xi. For example, a Boolean variable would have the domain {true, false}. Different variables can have different domains of different sizes. 12/17/2024 Prof. Hussein Alzoubi 4 … Defining Constraint Satisfaction Problems Each constraint Cj consists of a pair〈scope, rel〉, where scope is a tuple of variables that participate in the constraint and rel is a relation that defines the values that those variables can take on. A relation can be represented as an explicit set of all tuples of values that satisfy the constraint, or as a function that can compute whether a tuple is a member of the relation. For example, if X1 and X2 both have the domain {1,2,3}, then the constraint saying that X1 must be greater than X2 can be written as 〈(X1,X2), {(3,1), (3,2), (2,1)} 〉 or as 〈(X1,X2), X1 > X2〉. 12/17/2024 Prof. Hussein Alzoubi 5 … Defining Constraint Satisfaction Problems CSPs deal with assignments of values to variables, {Xi = vi, Xj = vj,...}. An assignment that does not violate any constraints is called a consistent or legal assignment. A complete assignment is one in which every variable is assigned a value, and a solution to a CSP is a consistent, complete assignment. A partial assignment is one that leaves some variables unassigned, and a partial solution is a partial assignment that is consistent. Solving a CSP is an NP-complete problem in general, although there are important subclasses of CSPs that can be solved very efficiently. 12/17/2024 Prof. Hussein Alzoubi 6 6.1.1 Example problem: Map coloring 12/17/2024 Prof. Hussein Alzoubi 7 6.1.2 Example problem: Job-shop scheduling We consider a small part of the car assembly, consisting of 15 tasks: install axles (front and back), affix all four wheels (right and left, front and back), tighten nuts for each wheel, affix hubcaps, and inspect the final assembly. We can represent the tasks with 15 variables: Next, we represent precedence constraints between individual tasks. 12/17/2024 Prof. Hussein Alzoubi 8 … Example problem: Job-shop scheduling Suppose we have four workers to install wheels, but they have to share one tool that helps put the axle in place. We need a disjunctive constraint to say that AxleF and AxleB must not overlap in time; either one comes first or the other does: We also need to assert that the inspection comes last and takes 3 minutes. For every variable except Inspect we add a constraint of the form X + dX ≤ Inspect. Finally, suppose there is a requirement to get the whole assembly done in 30 minutes. We can achieve that by limiting the domain of all variables: This particular problem is trivial to solve, but CSPs have been successfully applied to jobshop scheduling problems like this with thousands of variables. 12/17/2024 Prof. Hussein Alzoubi 9 6.1.3 Variations on the CSP formalism The simplest kind of CSP involves variables that have discrete, finite domains. Map-coloring problems and scheduling with time limits are both of this kind. The 8-queens problem can also be viewed as a finite-domain CSP, where the variables Q1,... ,Q8 correspond to the queens in columns 1 to 8, and the domain of each variable specifies the possible row numbers for the queen in that column, Di = {1,2,3,4,5,6,7,8}. The constraints say that no two queens can be in the same row or diagonal. 12/17/2024 Prof. Hussein Alzoubi 10 … Variations on the CSP formalism Constraint satisfaction problems with continuous domains are common in the real world and are widely studied in the field of operations research. The best-known category of continuous-domain CSPs is that of linear programming problems, where constraints must be linear equalities or inequalities. 12/17/2024 Prof. Hussein Alzoubi 11 … Variations on the CSP formalism Types of constraints: The simplest type is the unary constraint e.g.,〈(SA), SA ≠ green〉 A binary constraint relates two variables. For example, SA ≠ NSW is a binary constraint. A binary CSP is one with only unary and binary constraints; it can be represented as a constraint graph, as in Figure 6.1(b). We can also define higher-order constraints. The ternary constraint Between(X,Y,Z), for example, can be defined as 〈(X,Y,Z), X Y > Z〉. 12/17/2024 Prof. Hussein Alzoubi 12 … Variations on the CSP formalism A constraint involving an arbitrary number of variables is called a global constraint. One of the most common global constraints is Alldiff, which says that all of the variables involved in the constraint must have different values. In Sudoku problems (see Section 6.2.6), all variables in a row, column, or 3×3 box must satisfy an Alldiff constraint. 12/17/2024 Prof. Hussein Alzoubi 13 … Variations on the CSP formalism 12/17/2024 Prof. Hussein Alzoubi 14 … Variations on the CSP formalism Every finite-domain constraint can be reduced to a set of binary constraints if enough auxiliary variables are introduced. This means that we could transform any CSP into one with only binary constraints— which certainly makes the life of the algorithm designer simpler. Another way to convert an n-ary CSP to a binary one is the dual graph transformation: create a new graph in which there will be one variable for each constraint in the original graph, and one binary constraint for each pair of constraints in the original graph that share variables. 12/17/2024 Prof. Hussein Alzoubi 15 … Variations on the CSP formalism … Dual graph transformation example: Consider a CSP with the variables X = {X,Y,Z}, each with the domain {1,2,3,4,5}, and with the two constraints C1:〈(X,Y,Z), X+Y = Z〉and C2:〈 (X,Y), X+1=Y 〉 Then the dual graph would have the variables X = {C1,C2}, where the domain of the C1 variable in the dual graph is the set of {(xi,yj,zk)} tuples from the C1 constraint in the original problem, and similarly, the domain of C2 is the set of {(xi,yj)} tuples. The dual graph has the binary constraint 〈(C1, C2), R1〉, where R1 is a new relation that defines the constraint between C1 and C2; in this case it would be R1 = {((1,2,3), (1,2)), ((2,3,5), (2,3))}. 12/17/2024 Prof. Hussein Alzoubi 16 … Variations on the CSP formalism There are however two reasons why we might prefer a global constraint such as Alldiff rather than a set of binary constraints. First, it is easier and less error-prone to write the problem description using Alldiff. Second, it is possible to design special-purpose inference algorithms for global constraints that are more efficient than operating with primitive constraints. 12/17/2024 Prof. Hussein Alzoubi 17 … Variations on the CSP formalism The constraints we have described so far have all been absolute constraints, violation of which rules out a potential solution. Many real-world CSPs include preference constraints indicating which solutions are preferred. For example, in a university class-scheduling problem there are absolute constraints that no professor can teach two classes at the same time. But we also may allow preference constraints: Prof. R might prefer teaching in the morning, whereas Prof. N prefers teaching in the afternoon. A schedule that has Prof. R teaching at 2 p.m. would still be an allowable solution but would not be an optimal one. 12/17/2024 Prof. Hussein Alzoubi 18 … Variations on the CSP formalism Preference constraints can often be encoded as costs on individual variable assignments— for example, assigning an afternoon slot for Prof. R costs 2 points against the overall objective function, whereas a morning slot costs 1. With this formulation, CSPs with preferences can be solved with optimization search methods, either path-based or local. We call such a problem a constrained optimization problem, or COP. Linear programs are one class of COPs. 12/17/2024 Prof. Hussein Alzoubi 19 A CSP algorithm can generate successors by choosing a new variable assignment, or it can do a specific type of inference called constraint propagation: using the constraints to reduce the number of legal values for a variable, which in turn can reduce the legal values for another variable, and so on. The idea is that this will leave fewer choices to consider when we make the next choice of a variable assignment. 12/17/2024 Prof. Hussein Alzoubi 20 … Constraint Propagation: Inference in CSPs Constraint propagation may be intertwined with search, or it may be done as a preprocessing step, before search starts. Sometimes this preprocessing can solve the whole problem, so no search is required at all. The key idea is local consistency. If we treat each variable as a node in a graph and each binary constraint as an edge, then the process of enforcing local consistency in each part of the graph causes inconsistent values to be eliminated throughout the graph. 12/17/2024 Prof. Hussein Alzoubi 21 6.2.1 Node consistency A single variable (corresponding to a node in the CSP graph) is node- consistent if all the values in the variable’s domain satisfy the variable’s unary constraints. We say that a graph is node-consistent if every variable in the graph is node-consistent. It is easy to eliminate all the unary constraints in a CSP by reducing the domain of variables with unary constraints at the start of the solving process. As mentioned earlier, it is also possible to transform all n-ary constraints into binary ones. Because of this, some CSP solvers work with only binary constraints 12/17/2024 Prof. Hussein Alzoubi 22 6.2.2 Arc consistency A variable in a CSP is arc-consistent if every value in its domain satisfies the variable’s binary constraints. More formally, Xi is arc-consistent with respect to another variable Xj if for every value in the current domain Di there is some value in the domain Dj that satisfies the binary constraint on the arc (Xi,Xj). A graph is arc-consistent if every variable is arc-consistent with every other variable. 12/17/2024 Prof. Hussein Alzoubi 23 … Arc consistency For example, consider the constraint Y = X2 where the domain of both X and Y is the set of decimal digits. We can write this constraint explicitly as 〈(X,Y), {(0,0), (1,1), (2,4), (3,9)}〉. To make X arc-consistent with respect to Y, we reduce X’s domain to {0,1,2,3}. If we also make Y arc-consistent with respect to X, then Y’s domain becomes {0,1,4,9}, and the whole CSP is arc-consistent. On the other hand, arc consistency can do nothing for the Australia map- coloring problem, Consider the following inequality constraint on (SA,WA): {(red, green), (red, blue), (green, red), (green, blue), (blue, red), (blue, green)}. No matter what value you choose for SA (or for WA), there is a valid value for the other variable. So, applying arc consistency has no effect on the domains of either variable. 12/17/2024 Prof. Hussein Alzoubi 24 … Arc consistency 12/17/2024 Prof. Hussein Alzoubi 25 6.2.3 Path consistency Path consistency tightens the binary constraints by using implicit constraints that are inferred by looking at triples of variables. A two-variable set {Xi,Xj} is path-consistent with respect to a third variable Xm if, for every assignment {Xi = a, Xj = b} consistent with the constraints (if any) on {Xi,Xj}, there is an assignment to Xm that satisfies the constraints on {Xi, Xm} and {Xm, Xj}. The name refers to the overall consistency of the path from Xi to Xj with Xm in the middle. 12/17/2024 Prof. Hussein Alzoubi 26 6.2.4 K-consistency Stronger forms of propagation can be defined with the notion of k- consistency. A CSP is k-consistent if, for any set of k−1 variables and for any consistent assignment to those variables, a consistent value can always be assigned to any kth variable. 1-consistency says that, given the empty set, we can make any set of one variable consistent: this is what we called node consistency. 2-consistency is the same as arc consistency. For binary constraint graphs, 3-consistency is the same as path consistency. A CSP is strongly k-consistent if it is k-consistent and is also (k−1)- consistent, (k−2)-consistent,... all the way down to 1-consistent. 12/17/2024 Prof. Hussein Alzoubi 27 6.2.5 Global constraints Global constraints occur frequently in real problems and can be handled by special-purpose algorithms that are more efficient than the general-purpose methods described so far. e.g., the Alldiff constraint says that all the variables involved must have distinct values Another important higher-order constraint is the resource constraint, sometimes called the Atmost constraint. For large resource-limited problems with integer values, domains are represented by upper and lower bounds and are managed by bounds propagation. We say that a CSP is bounds-consistent if for every variable X, and for both the lower-bound and upper-bound values of X, there exists some value of Y that satisfies the constraint between X and Y for every variable Y. 12/17/2024 Prof. Hussein Alzoubi 28 6.2.6 Sudoku A row, column, or box is called a unit 12/17/2024 Prof. Hussein Alzoubi 29 … Sudoku The Sudoku puzzles that appear in newspapers and puzzle books have the property that there is exactly one solution. Although some can be tricky to solve by hand, taking tens of minutes, a CSP solver can handle thousands of puzzles per second. A Sudoku puzzle can be considered a CSP with 81 variables, one for each square. We use the variable names A1 through A9 for the top row (left to right), down to I1 through I9 for the bottom row. The empty squares have the domain {1,2,3,4,5,6,7,8,9} and the pre-filled squares have a domain consisting of a single value. 12/17/2024 Prof. Hussein Alzoubi 30 … Sudoku In addition, there are 27 different Alldiff constraints, one for each unit (row, column, and box of 9 squares): Assume that the Alldiff constraints have been expanded into binary constraints (such as A1 ≠ A2) so that we can apply the AC-3 algorithm directly. 12/17/2024 Prof. Hussein Alzoubi 31 … Sudoku AC-3 works only for the easiest Sudoku puzzles. To solve the hardest puzzles and to make efficient progress, we need more complex inference strategies, such as “naked triples.” All the strategies—arc consistency, path consistency, and so on—apply generally to all CSPs, not just to Sudoku problems. Even naked triples is really a strategy for enforcing consistency of Alldiff constraints and is not specific to Sudoku per se. This is the power of the CSP formalism: for each new problem area, we only need to define the problem in terms of constraints; then the general constraint-solving mechanisms can take over. 12/17/2024 Prof. Hussein Alzoubi 32 Sometimes we can finish the constraint propagation process and still have variables with multiple possible values. In that case we have to search for a solution. A standard depth-limited search could solve CSPs A state would be a partial assignment, and an action would extend the assignment A crucial property of CSPs is commutativity A problem is commutative if the order of application of any given set of actions does not matter. 12/17/2024 Prof. Hussein Alzoubi 33 … Backtracking Search for CSPs 12/17/2024 Prof. Hussein Alzoubi 34 … Backtracking Search for CSPs Backtracking search can be improved using domain-independent heuristics that take advantage of the factored representation of CSPs. 12/17/2024 Prof. Hussein Alzoubi 35 6.3.1 Variable and value ordering Choosing the variable with the fewest “legal” values is called the minimum-remaining-values (MRV) heuristic. It also has been called the “most constrained variable” or “fail-first” heuristic. If some variable X has no legal values left, the MRV heuristic will select X and failure will be detected immediately— avoiding pointless searches through other variables. The MRV heuristic doesn’t help at all in choosing the first region to color in Australia, because initially every region has three legal colors. 12/17/2024 Prof. Hussein Alzoubi 36 … Variable and value ordering The degree heuristic attempts to reduce the branching factor on future choices by selecting the variable that is involved in the largest number of constraints on other unassigned variables. 12/17/2024 Prof. Hussein Alzoubi 37 … Variable and value ordering Once a variable has been selected, the algorithm must decide on the order in which to examine its values. The least-constraining-value heuristic prefers the value that rules out the fewest choices for the neighboring variables in the constraint graph. In general, the heuristic is trying to leave the maximum flexibility for subsequent variable assignments. 12/17/2024 Prof. Hussein Alzoubi 38 6.3.2 Interleaving search and inference We saw how AC-3 can reduce the domains of variables before we begin the search. But inference can be even more powerful during the course of a search: every time we make a choice of a value for a variable, we have a brand-new opportunity to infer new domain reductions on the neighboring variables. One of the simplest forms of inference is called forward checking Whenever a variable X is assigned, the forward-checking process establishes arc consistency for it: for each unassigned variable Y that is connected to X by a constraint, delete from Y’s domain any value that is inconsistent with the value chosen for X. 12/17/2024 Prof. Hussein Alzoubi 39 … Interleaving search and inference 12/17/2024 Prof. Hussein Alzoubi 40 … Interleaving search and inference The algorithm called MAC (for Maintaining Arc Consistency) detects inconsistencies ahead. After a variable Xi is assigned a value, the INFERENCE procedure calls AC-3, but instead of a queue of all arcs in the CSP, we start with only the arcs (Xi, Xj) for all Xj that are unassigned variables that are neighbors of Xi,. From there, AC-3 does constraint propagation in the usual way, and if any variable has its domain reduced to the empty set, the call to AC-3 fails and we know to backtrack immediately. 12/17/2024 Prof. Hussein Alzoubi 41 6.3.3 Intelligent backtracking: Looking backward The backtracking adopted in the BACKTRACKING-SEARCH algorithm is called chronological backtracking because the most recent decision point is revisited. The backjumping method backtracks to the most recent assignment in the conflict set. A different–and deeper–notion of the conflict set for a variable X: it is that set of preceding variables that caused X, together with any subsequent variables, to have no consistent solution. A backjumping algorithm that uses conflict sets defined in this way is called conflict-directed backjumping. 12/17/2024 Prof. Hussein Alzoubi 42 … Intelligent backtracking: Looking backward … conflict-directed backjumping let Xj be the current variable, and let conf(Xj) be its conflict set. If every possible value for Xj fails, backjump to the most recent variable Xi in conf(Xj) and recompute the conflict set for Xi as follows: conf(Xi) ← conf(Xi) ∪conf(Xj) −{Xi}. 12/17/2024 Prof. Hussein Alzoubi 43 6.3.4 Constraint learning When the search arrives at a contradiction, we know that some subset of the conflict set is responsible for the problem. Constraint learning is the idea of finding a minimum set of variables from the conflict set that causes the problem. This set of variables, along with their corresponding values, is called a no-good. We then record the no-good, either by adding a new constraint to the CSP to forbid this combination of assignments or by keeping a separate cache of no-goods. 12/17/2024 Prof. Hussein Alzoubi 44 Select the value that results in the minimum number of conflicts with other variables—the min-conflicts heuristic. 12/17/2024 Prof. Hussein Alzoubi 45 … Local Search for CSPs 12/17/2024 Prof. Hussein Alzoubi 46 … Local Search for CSPs Min-conflicts is surprisingly effective for many CSPs. Amazingly, on the n-queens problem, if you don’t count the initial placement of queens, the run time of min-conflicts is roughly independent of problem size. It solves even the million-queens problem in an average of 50 steps (after the initial assignment). It has also been used to schedule observations for the Hubble Space Telescope, reducing the time taken to schedule a week of observations from three weeks (!) to around 10 minutes. 12/17/2024 Prof. Hussein Alzoubi 47 … Local Search for CSPs All the local search techniques are candidates for application to CSPs, and some of those have proved especially effective. The landscape of a CSP under the minconflicts heuristic usually has a series of plateaus. Plateau search—allowing sideways moves to another state with the same score—can help local search find its way off this plateau. This wandering on the plateau can be directed with a technique called tabu search: keeping a small list of recently visited states and forbidding the algorithm to return to those states. Simulated annealing can also be used to escape from plateaus. 12/17/2024 Prof. Hussein Alzoubi 48 … Local Search for CSPs Another technique called constraint weighting aims to concentrate the search on the important constraints. Each constraint is given a numeric weight, initially all 1. At each step of the search, the algorithm chooses a variable/value pair to change that will result in the lowest total weight of all violated constraints. The weights are then adjusted by incrementing the weight of each constraint that is violated by the current assignment. This has two benefits: it adds topography to plateaus, making sure that it is possible to improve from the current state, and it also adds learning: over time the difficult constraints are assigned higher weights. 12/17/2024 Prof. Hussein Alzoubi 49 Independent subproblems— any solution for the mainland combined with any solution for Tasmania yields a solution for the whole map. A constraint graph is a tree when any two variables are connected by only one path. Any tree-structured CSP can be solved in time linear in the number of variables. The key is a new notion of consistency, called directional arc consistency or DAC. A CSP is defined to be directional arc-consistent under an ordering of variables X1,X2,... ,Xn if and only if every Xi is arc-consistent with each Xj for j > i. 12/17/2024 Prof. Hussein Alzoubi 50 … The Structure of Problems Can more general constraint graphs be reduced to trees somehow? 12/17/2024 Prof. Hussein Alzoubi 51 … The Structure of Problems 12/17/2024 Prof. Hussein Alzoubi 52 6.5.1 Cutset conditioning We can delete SA by fixing a value for SA and deleting from the domains of the other variables any values that are inconsistent with the value chosen for SA. 12/17/2024 Prof. Hussein Alzoubi 53 … Cutset conditioning The general algorithm is as follows: 1. Choose a subset S of the CSP’s variables such that the constraint graph becomes a tree after removal of S. S is called a cycle cutset. 2. For each possible assignment to the variables in S that satisfies all constraints on S, (a) remove from the domains of the remaining variables any values that are inconsistent with the assignment for S, and (b) if the remaining CSP has a solution, return it together with the assignment for S. Finding the smallest cycle cutset is NP-hard, but several efficient approximation algorithms are known. The overall algorithmic approach is called cutset conditioning; it comes up again in Chapter 13, where it is used for reasoning about probabilities. 12/17/2024 Prof. Hussein Alzoubi 54 6.5.2 Tree decomposition 12/17/2024 Prof. Hussein Alzoubi 55 … Tree decomposition A given graph admits many tree decompositions; in choosing a decomposition, the aim is to make the subproblems as small as possible. The tree width of a tree decomposition of a graph is one less than the size of the largest node; the tree width of the graph itself is defined to be the minimum width among all its tree decompositions. CSPs with constraint graphs of bounded tree width are solvable in polynomial time. 12/17/2024 Prof. Hussein Alzoubi 56 … Tree decomposition Unfortunately, finding the decomposition with minimal tree width is NP-hard, but there are heuristic methods that work well in practice. Whenever you have a cycle-cutset of size c, there is also a tree width of size w < c+1, and it may be far smaller in some cases. So, time consideration favors tree decomposition, but the advantage of the cycle-cutset approach is that it can be executed in linear memory, while tree decomposition requires memory exponential in w. 12/17/2024 Prof. Hussein Alzoubi 57 6.5.3 Value symmetry On the Australia map, we know that WA, NT, and SA must all have different colors, but there are 3! =6 ways to assign three colors to three regions. This is called value symmetry. We would like to reduce the search space by a factor of d! by breaking the symmetry in assignments. We do this by introducing a symmetry-breaking constraint. For our example, we might impose an arbitrary ordering constraint, NT < SA < WA, that requires the three values to be in alphabetical order. This constraint ensures that only one of the d! solutions is possible: {NT = blue,SA = green,WA = red}. 12/17/2024 Prof. Hussein Alzoubi 58