B551 Elements Of Artificial Intelligence PDF
Document Details
Uploaded by WorldFamousDogwood
Luddy School of Informatics, Computing, and Engineering
2024
Summary
These notes cover elements of artificial intelligence, focusing on search algorithms like local search, informed search, and adversarial search, along with their motivations and applications. The document includes examples and emphasizes the importance of optimization techniques within the field.
Full Transcript
B551 Elements of Artificial Intelligence, 10/5/2024

Search, So Far
- Uninformed search
- Informed search
- Developing search heuristics
- Adversarial search

Taking Stock
We've looked at a lot of search methods. The physical symbol system (PSS) hypothesis says that PSSes have the necessary and sufficient conditions for intelligent behavior, and that this capacity is exercised through search.
In small groups, think about everyday reasoning tasks that would be hard to handle as search problems using our methods. Which tasks would be hard to handle, and why?

Taking Stock (with one more question)
Same questions as above, plus: Does informed search help? Why or why not, when, and how much?

Task Domains Hard for our Methods so Far
- Our methods assume we have a pre-specified goal, or a goal for which we can define a goal test. What if we don't?
- For complex spaces, big branching factors and deep solutions make search expensive or infeasible in time and/or space.
- Informed search helps control costs, but may not be enough.

Searching Complex Spaces: Local Search

Motivations Compared to Informed Search
When might informed search struggle (or not apply)?
- The solution space is large (or infinite!)
- Spaces without good heuristics for direction
- When quick solutions are required
- When local optima are prevalent
- When continuous optimization is needed
When does informed search work too hard?
- When an approximate solution is fine

Local Search
- Light-memory search methods: no search tree; only the current state is represented! Is this a problem?
- Applicable to problems where the path is irrelevant (e.g., design).
- Even if the path matters, there is a workaround: encode entire paths in the state.
- Local search is a subset of optimization techniques (optimization techniques need not be local).

Sample Tasks
- Scheduling (job scheduling, machine scheduling, airline scheduling)
- Constraint satisfaction (crossword puzzles, Sudoku)
- Vehicle routing
- Resource allocation (e.g., in a hospital)

Goal: Minimize Cost or Maximize Goodness
Minimizing cost: Given a goal and a heuristic function h(s) estimating distance to the goal, we can think in terms of minimizing h(s), because h(G) = 0 for any goal G. For many tasks we have a cost function, such as the error when training a machine learning system; e.g., in a neural network we want to find weights on links that minimize error.
Maximizing goodness: For many tasks we want to maximize a function, such as profit or happiness. We may not know the maximum possible value in advance, or ever.
Either way, it's an optimization problem!
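To make the last point concrete, here is a minimal sketch (not from the slides; the states and the goodness function are made-up placeholders) showing that the two framings are interchangeable: maximizing a goodness function is just minimizing its negation.

def minimize(objective, states):
    """Toy exhaustive minimizer: return the state with the lowest objective value."""
    return min(states, key=objective)

def maximize(goodness, states):
    """Maximizing goodness is the same problem as minimizing its negation."""
    return minimize(lambda s: -goodness(s), states)

# Hypothetical usage: states are the integers 0..9 and "goodness" peaks at 5.
if __name__ == "__main__":
    print(maximize(lambda s: 9 - abs(s - 5), range(10)))  # prints 5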
Example Local Search Methods
Iterative improvement methods:
- Hill climbing (maximizing an objective function)
- Simulated annealing (minimizing an energy function)
- Genetic algorithms (maximizing a fitness function)
These approaches start with an initial guess at the solution and gradually improve it, relying on an objective function to estimate quality.

Hill climbing on a surface of states
[Figure: a landscape of states whose height is defined by the evaluation function.]

Hill-climbing (or descent) search
Looks one step ahead to determine whether any successor is better than the current state; if there is one, move to the best successor.
Rule (as written here it goes downhill to minimize, but it could be used to maximize): If there exists a successor s of the current state n such that h(s) < h(n) and h(s) ≤ h(t) for all successors t of n, then move from n to s. Otherwise, halt at n. (Or use a variant that also tries alternatives with the same value, but then you need another stopping condition.)
Similar to greedy search in that it uses h(), but it does not allow backtracking or jumping to an alternative path, since it doesn't "remember" where it has been.
Not complete. Why? The search may terminate at local minima, plateaus, and ridges.

Hill climbing example (Here saving states)
[Figure: an 8-puzzle instance with heuristic h(n) = -(number of tiles out of place); the start state has h = -4 and the goal has h = 0.]
Form groups to trace hill climbing for this problem for the first step, working individually but comparing each step with your group.

Hill climbing example (Here saving states)
[Figure: the same 8-puzzle example expanded, showing several successor states and their h values under h(n) = -(number of tiles out of place).]

Exploring the Landscape
- Local maxima: for maximization, peaks that aren't the highest point in the space. For minimization, the analogue is local minima.
- Plateaus: the space has a broad flat region that gives the search algorithm no direction (random walk).
- Ridges: flat like a plateau, but with drop-offs to the sides; steps to the North, East, South, and West may go down, but a step to the NW may go up.
[Image from: http://classes.yale.edu/fractals/CA/GA/Fitness/Fitness.html]

Drawbacks of hill climbing
Problems: local maxima, plateaus, ridges.
Remedies:
- Random restart: keep restarting the search from random locations until a goal is found.
- Problem reformulation: reformulate the search space to eliminate these problematic features.
Some problem spaces are great for hill climbing; others are terrible.
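Below is a minimal sketch of the hill-climbing rule and the random-restart remedy described above. It assumes caller-supplied helpers that are not in the slides: successors(state) returns the neighboring states, h(state) is the value to minimize, and random_state() draws a fresh starting point.

def hill_climb(start, successors, h):
    """Steepest-descent hill climbing, following the rule above: move to the best
    successor only if it strictly improves h; otherwise halt (possibly at a local
    minimum, plateau, or ridge)."""
    current = start
    while True:
        succs = successors(current)
        if not succs:
            return current
        best = min(succs, key=h)
        if h(best) < h(current):
            current = best          # strictly better: take the step
        else:
            return current          # no improving successor: halt

def random_restart_hill_climb(random_state, successors, h, restarts=20):
    """The random-restart remedy: run hill climbing from several random starting
    states and keep the best result found."""
    results = [hill_climb(random_state(), successors, h) for _ in range(restarts)]
    return min(results, key=h)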
Gradient ascent / descent
[Images from http://en.wikipedia.org/wiki/Gradient_descent]
Gradient descent procedure for finding argmin_x f(x):
- choose the initial x_0 randomly
- repeat x_{i+1} ← x_i − η f'(x_i)
- until the sequence x_0, x_1, …, x_i, x_{i+1} converges
The step size η (eta) is small (perhaps 0.1 or 0.05).

Simulated annealing
Simulated annealing (SA) exploits an analogy between how a metal cools and freezes into a minimum-energy crystalline structure (the annealing process) and the search for a minimum [or maximum] in a more general system.
- SA can avoid becoming trapped at local minima.
- SA uses a random search that accepts changes that increase the objective function f, as well as some that decrease it.
- SA uses a control parameter T, which by analogy with the original application is known as the system "temperature." T starts out high and gradually decreases toward 0.

Simulated annealing (cont.)
A "bad" move from A to B is accepted with probability
P(move A→B) = e^((f(B) − f(A)) / T)
The higher the temperature, the more likely it is that a bad move will be made. As T tends to zero, this probability tends to zero, and SA becomes more and more like hill climbing. If T is lowered slowly enough, SA is highly likely to succeed, and it can often find good solutions fast.

The simulated annealing algorithm
[Figure: pseudocode for the simulated annealing algorithm.]
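Since the algorithm itself appears above only as a figure, here is a minimal sketch of simulated annealing for a maximization problem, using the acceptance rule P = e^((f(B) − f(A))/T). Assumptions not in the slides: neighbor(state) proposes a random successor, f is the objective to maximize, and the temperature follows a simple geometric cooling schedule.

import math
import random

def simulated_annealing(start, neighbor, f, t0=1.0, cooling=0.995, t_min=1e-3):
    """Maximize f by random search with temperature-controlled acceptance of bad moves."""
    current, t = start, t0
    best = current
    while t > t_min:
        candidate = neighbor(current)
        delta = f(candidate) - f(current)
        # Always accept improvements; accept bad moves with probability e^(delta / T).
        if delta >= 0 or random.random() < math.exp(delta / t):
            current = candidate
        if f(current) > f(best):
            best = current
        t *= cooling  # lower the temperature; slower cooling raises the chance of success
    return best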
Example Applications of Simulated Annealing I
(Adapted from https://sites.gatech.edu/omscs7641/2024/02/19/simulated-annealing-methods-and-real-world-applications/)
VLSI design: Examples include "placement, floorplan design, routing, PLA folding, Gate Matrix layout, logic array optimization, two-dimensional compaction, via minimization, logic minimization, transistor sizing, and digital filter design."
Placement: arranging a set of circuit modules on a chip to minimize a specified objective function. The primary objective is to reduce the overall chip area used by both the circuit modules and the interconnections (wire-length reduction). The TimberWolf standard-cell placement algorithm allows module rearrangements both within and between rows, including module overlaps, and then eliminates the overlaps, restricting interchanges to adjacent modules in the same row. The cost function evaluates interconnection cost, module overlaps, and row-length deviations.

Example of Simulated Annealing II: Traveling Salesman Problem
The problem: "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?"
Applications include: school bus routing; Meals on Wheels deliveries; scheduling the order in which an automated drill drills holes; genome sequencing (cities are maps, costs are a measure of the likelihood that one follows another; used for the mouse genome); and NASA optimizing fuel use for maneuvers.

Visualizations
Video comparing greedy, hill climbing, and simulated annealing for TSP: https://www.youtube.com/watch?v=SC5CX8drAtU
Video of simulated annealing in detail: https://www.youtube.com/watch?v=NPE3zncXA5s
Note on terminology in the video: here the sequence of routes is treated as a Markov chain with probabilistic transitions from one state to the next (which depend on the temperature).
- Number of chains is the number of temperature settings tried (for each, a new solution process was run), and
- Chain length is the number of variants of the path tried for a given temperature before stopping.

Example Applications of Simulated Annealing III
(Adapted from https://sites.gatech.edu/omscs7641/2024/02/19/simulated-annealing-methods-and-real-world-applications/)
Vehicle routing: optimizing routes for a limited number of vehicles to serve customers with specific time windows.
Problem definition: How would you define the problem and objective for this task?

Example Applications of Simulated Annealing III (cont.)
Vehicle routing: optimizing routes for a limited number of vehicles to serve customers with specific time windows.
Problem: Defined on a complete graph with nodes representing the depot, the customers, and the connections between them. Customers have fixed delivery and pickup demands, and the central depot serves as the distribution and collection center. A fleet of identical vehicles with capacity and dispatching cost is available at the depot. The depot and customers have time windows, treated as hard constraints, and vehicles must adhere to these time limits (waiting if they are early). Distances between customers are based on Euclidean distance, and routes start and end at the depot, visiting each customer at most once.
Objective: Minimize overall cost while satisfying the constraints of vehicle capacity, time windows, and service requirements.

Genetic Algorithms
- Inspired by evolution: "evolve" a solution.
- How can this be seen as search? What knowledge is needed?

Natural Evolution
Natural evolution occurs by natural selection:
- Individuals pass on traits to offspring.
- Individuals have different traits.
- The fittest individuals survive to produce more offspring.
- Over time, variation can accumulate. Result: new species.

Natural Evolution: Passing on Traits to Offspring
- Chromosomes carry genes for different traits.
- Usually chromosomes are paired, one from each parent.
- Chromosomes are duplicated before mating.
- Crossover mixes genetic material from chromosomes.
- Each parent produces one single-chromosome cell; mating joins the cells.

Natural Evolution
Variation arises from crossover and mutation:
- Crossover produces new gene combinations.
- Mutation produces new genes.
Different traits lead to different fitnesses.

Simulated Evolution
Evolving a solution:
- Begin with a population of individuals (individuals = candidate solutions, analogous to chromosomes).
- Produce offspring with variation: mutation (change features) and crossover (exchange features between individuals).
- Apply natural selection: select the "best" individuals to go on to the next generation.
- Continue until satisfied with the solution.

Genetic algorithms
- Start with k random states (the initial population).
- New states are generated by "mutating" a single state or "reproducing" (combining via crossover) two parent states, selected according to their fitness.
- Representation is still important: the encoding used for the "genome" of an individual strongly affects the behavior of the search.
- GAs explore multiple solutions in parallel, a strength when there are many good solutions. However, they may be at more risk from local optima than simulated annealing.

Practical Uses
Genetic algorithms have been used for many complex, hard-to-characterize tasks, such as jet engine design, setting refinery valves, parameter tuning, feature selection for AI systems, and drug discovery.

Perhaps Not As Practical a Use
How would you apply a GA to developing cookie recipes?

Genetic Algorithm Example (Winston)
Cookie recipes: explore amounts of sugar and flour.
- Individual = a batch of cookies
- Fitness: quality, 0-9
Illustrations: [figure]

Representation and Operations
- Chromosome = 2 genes (one chromosome per individual): Flour Quantity and Sugar Quantity, each 1-9.
- Mutation: randomly select Flour or Sugar and add or subtract 1, staying within [1, 9].
- Crossover: split two chromosomes and rejoin them, keeping both offspring.

Mutation & Crossover
[Figure: example mutations (a single gene nudged by +/-1) and an example crossover that splits two parent chromosomes and rejoins the pieces.]
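A minimal sketch of the cookie-recipe representation and operators just described, assuming (as on the slides) that a chromosome is a (flour, sugar) pair with each gene in 1-9; the function names are my own.

import random

def mutate(chromosome):
    """Mutation: pick flour or sugar at random and nudge it by +/-1, staying in 1..9."""
    genes = list(chromosome)
    i = random.randrange(2)                      # 0 = flour, 1 = sugar
    genes[i] = min(9, max(1, genes[i] + random.choice([-1, 1])))
    return tuple(genes)

def crossover(parent_a, parent_b):
    """Crossover: split both chromosomes at the middle and rejoin, keeping both children."""
    return (parent_a[0], parent_b[1]), (parent_b[0], parent_a[1])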
Fitness
- Natural selection: the most fit survive.
- Fitness = probability of survival to the next generation.
Question: How do we measure fitness?
"Standard method": relate fitness to quality; use each chromosome's percentage of the total quality as its selection probability.

GA Procedure
- Create an initial population (here, 1 chromosome).
- Mutate 1+ genes in 1+ chromosomes, producing one offspring for each chromosome.
- Mate 1+ pairs of chromosomes with crossover.
- Add the mutated and offspring chromosomes to the population.
- Create the new population: the best plus randomly selected chromosomes (biased by fitness).

GA Design Issues
- Genetic design: identify sets of features = genes. Constraints? What would this mean for designing cookie recipes?
- Population: how many chromosomes? What happens if too few? If too many?
- Mutation: how frequent? What happens if too rare? If too frequent?

GA Design Issues (cont.)
- Genetic design: identify sets of features = genes. Constraints? What would this mean for designing cookie recipes?
- Population: how many chromosomes? Too few => inbreeding; too many => too slow.
- Mutation: how frequent? Too rare => slow change; too frequent => wild.
- Crossover: allowed? How are mates selected? Are duplicates kept?

GA Design: Basic Cookie GA
- Genetic design: 2 genes, flour + sugar, each 1-9.
- Population: 1 initial chromosome, 4 max.
- Mutation: 1 gene randomly selected and randomly mutated.
- Crossover: not allowed. Duplicates: no.
- Survival: standard method.

Cookie Fitness. What does this Mean?
Quality as a function of flour (rows, 1-9) and sugar (columns, 1-9):
1 2 3 4 5 4 3 2 1
2 3 4 5 6 5 4 3 2
3 4 5 6 7 6 5 4 3
4 5 6 7 8 7 6 5 4
5 6 7 8 9 8 7 6 5
4 5 6 7 8 7 6 5 4
3 4 5 6 7 6 5 4 3
2 3 4 5 6 5 4 3 2
1 2 3 4 5 4 3 2 1

Example Mutation of 2
[Tables: the chromosomes and their qualities in generations 0 through 3 of a mutation-only run, starting from the single chromosome 1-1 with quality 1 and accumulating higher-quality recipes such as 1-4 with quality 4.]

Basic Cookie GA Results
Results are for 1000 random trials, with the initial state a single 1-1 chromosome of quality 1.
- On average, reaches maximum quality (9) in 16 generations.
- Best case: maximum quality in 8 generations.
Conclusion: a low-dimensionality search, successful even without crossover.

Adding Crossover
- Genetic design: 2 genes, flour + sugar, each 1-9.
- Population: 1 initial chromosome, 4 max.
- Mutation: 1 gene randomly selected and randomly mutated.
- Crossover: allowed; select random mates and cross at the middle. Duplicates: no.
- Survival: standard method.

Basic Cookie GA + Crossover Results
Results are for 1000 random trials, with the initial state a single 1-1 chromosome of quality 1.
- On average, reaches maximum quality (9) in 14 generations.
Conclusion: faster with crossover, which combines the good value in each gene. Key: the global maximum is achievable by maximizing each dimension independently, which reduces the dimensionality.

Why Can't Basic GA Solve This?
Quality as a function of flour (rows, 1-9) and sugar (columns, 1-9), now with a "moat" of zeros around the central peak:
1 2 3 4 5 4 3 2 1
2 0 0 0 0 0 0 0 2
3 0 0 0 0 0 0 0 3
4 0 0 7 8 7 0 0 4
5 0 0 8 9 8 0 0 5
4 0 0 7 8 7 0 0 4
3 0 0 0 0 0 0 0 3
2 0 0 0 0 0 0 0 2
1 2 3 4 5 4 3 2 1

Solving the Moat Problem
(Same moat quality table as above.)
Problem: no single-step mutation can reach the optimal values using standard fitness, because quality = 0 => selection probability = 0.
Solution A: crossover can combine parents that are fit in EACH gene. However, this is still slow: 155 generations on average.

Questions
- How can we avoid the 0-quality problem?
- How can we avoid local maxima?
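A minimal sketch of the cookie quality surface and the "standard method" of fitness-proportional selection. The closed-form quality formula is my reading of the first table above (9 minus the distance of each gene from 5), and the zero-total fallback is an added safeguard, not part of the slides.

import random

def quality(chromosome):
    """Cookie quality, peaking at 9 for flour = sugar = 5 (my reading of the table above)."""
    flour, sugar = chromosome
    return 9 - abs(flour - 5) - abs(sugar - 5)

def standard_select(population):
    """'Standard method': selection probability = a chromosome's share of total quality.
    Note the problem raised above: a quality-0 chromosome can never be selected."""
    qualities = [quality(c) for c in population]
    total = sum(qualities)
    if total == 0:
        return random.choice(population)  # fallback, not from the slides
    return random.choices(population, weights=qualities, k=1)[0]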
Rethinking Fitness
Goal: an explicit bias toward the best candidates, removing implicit biases based on the quality scale.
Solution: the rank method. Ignore the actual quality values except for ranking:
- Step 1: Rank the candidates by quality.
- Step 2: The probability of selecting the ith candidate, given that candidates 1 through i-1 were not selected, is a constant p.
- Step 2b: The last candidate is selected if no other has been.
- Step 3: Select candidates using these probabilities.
(A code sketch of this selection scheme appears at the end of these notes.)

Rank Method
Chromosome  Quality  Rank  Std. Fitness  Rank Fitness
1-4         4        1     0.4           0.667
1-3         3        2     0.3           0.222
1-2         2        3     0.2           0.074
5-2         1        4     0.1           0.025
7-5         0        5     0.0           0.012
Results: averaged over 1000 random runs on the Moat problem, 75 generations (vs. 155 for the standard method).
No 0-probability entries: selection is based on rank, not absolute quality.

What's the problem underlying stopping on local maxima, and what could help?
One approach is to increase the mutation rate.
How could we revise our fitness function to help address this?

Diversity
Diversity: the degree to which chromosomes exhibit different genes.
The rank and standard methods look only at quality. We need diversity to escape local minima and to provide variety for crossover. "As good to be different as to be fit."

GA's and Local Maxima
- Quality metrics only: susceptible to local-maximum problems.
- Quality + diversity: can populate all local maxima, including the global maximum.
- Key: the population must be large enough.

Genetic Algorithms Summary
Evolution mechanisms as a search technique:
- Produce offspring with variation (mutation, crossover).
- Select the "fittest" to continue to the next generation.
- Fitness = probability of survival. Standard: quality values only. Rank: quality rank only. Rank-space: rank of the sum of the quality and diversity ranks.
- A large population is robust to local maxima.

Illustrations
Evolving virtual creatures: https://www.youtube.com/watch?v=bBt0imn77Zg
Learning to walk in a simulator: https://www.youtube.com/watch?v=RThs0nXNWRE
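As noted under "Rethinking Fitness," here is a minimal sketch of rank-based selection. Assumptions: candidates are ranked best-first by quality, each is selected with a constant probability p if reached, and the last candidate is taken if no earlier one was; p = 2/3 reproduces the rank-fitness column in the table above, so it is used as the default here.

import random

def rank_select(population, quality, p=2/3):
    """Rank-based selection: walk the candidates best-first, selecting each
    (if reached) with constant probability p; take the last one as a fallback."""
    ranked = sorted(population, key=quality, reverse=True)
    for candidate in ranked[:-1]:
        if random.random() < p:
            return candidate
    return ranked[-1]

# Toy usage with the chromosomes from the rank-method table above
# (qualities supplied directly rather than computed).
if __name__ == "__main__":
    table = {(1, 4): 4, (1, 3): 3, (1, 2): 2, (5, 2): 1, (7, 5): 0}
    print(rank_select(list(table), quality=table.get))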