MIT Evolutionary Algorithms PDF
MIT Manipal
Dr. Dayananda Pai
Summary
This document presents a lecture on evolutionary algorithms, providing an overview of the key concepts, types, and applications of these algorithms in optimization and machine learning.
Full Transcript
Evolutionary Algorithms
Dr. Dayananda Pai, M.I.T. Manipal

Contents
- What is an Evolutionary Algorithm?
- Need for, and general flow chart of, Evolutionary Algorithms
- Difficulties with classical optimisation techniques
- Types of Evolutionary Algorithms
- Examples

What is an Evolutionary Algorithm (EA)?
EAs are algorithms that perform optimization or learning tasks with the ability to evolve. They have three main characteristics:
- Population-based. EAs maintain a group of solutions, called a population, to optimize or learn the problem in a parallel way. The population is a basic principle of the evolutionary process.
- Fitness-oriented. Every individual has its own gene representation, called its code, and a performance evaluation, called its fitness value. EAs prefer fitter individuals, which is the foundation of the optimization and convergence of these algorithms.
- Variation-driven. Individuals undergo a number of variation operations that mimic genetic changes, which is fundamental to searching the solution space.

What is an Evolutionary Algorithm? (Examples)
When designing aircraft wings, the inputs might represent a description of a proposed wing shape. The model might contain equations of complex fluid dynamics to estimate the drag and lift coefficients of any wing shape. These estimates form the output of the system.
A voice control system for smart homes takes as input the electrical signal produced when a user speaks into a microphone. Suitable outputs might be commands sent to the heating system, the TV set, or the lights. Thus in this case the model consists of a mapping from certain patterns in the electrical waveforms coming from an audio input onto the outputs that would normally be created by key presses on a keyboard.

Flowchart of a Typical Evolutionary Algorithm
A typical evolutionary algorithm cycle:
Step 1: Initialize the population randomly or with potentially good solutions.
Step 2: Compute the fitness of each individual in the population.
Step 3: Select parents using a selection procedure.
Step 4: Create offspring by crossover and mutation operators.
Step 5: Compute the fitness of the new offspring.
Step 6: Select members of the population to die, using a selection procedure.
Step 7: Go to Step 2 until the termination criteria are met.

Why Evolutionary Algorithms?
- Multi-dimensional objectives
- Conflicting objectives
- One of the most striking differences from classical search and optimization algorithms is that EAs use a population of solutions in each iteration instead of a single solution.
- To widen the applicability of an optimization algorithm across different problem domains, natural and physical principles are mimicked to develop robust optimization algorithms.

Difficulties with Classical Optimization Algorithms
- Most classical point-by-point algorithms use a deterministic procedure for approaching the optimum solution.
- Since derivative information is not used, direct search methods are usually slow, requiring many function evaluations for convergence.
- Convergence to an optimal solution depends on the chosen initial solution; most algorithms tend to get stuck at a suboptimal solution.
- Every classical optimization algorithm is designed to solve a specific type of problem.
- In most practical optimization problems, some decision variables are restricted to take discrete values only.

Evolutionary Algorithms (Types)
- Genetic Algorithms
- Simulated Annealing
- Neural Networks
- Tabu Search

Genetic Algorithm
Based on the Darwinian paradigm: reproduction, competition, survival, selection. Intrinsically a robust search and optimization mechanism.

What is a GA?
A genetic algorithm (GA) is a search technique used in computing to find true or approximate solutions to optimization and search problems. Genetic algorithms are categorized as global search heuristics.
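The seven-step cycle listed above can be sketched in Python. This is a minimal illustration only: the bit-string representation, tournament-style parent selection, and the OneMax fitness used in the example are assumptions for the sketch, not prescriptions from the lecture.

```python
import random

def evolve(fitness, n_bits=10, pop_size=20, generations=50,
           p_cross=0.9, p_mut=0.05):
    """Minimal generational EA over binary strings (Steps 1-7)."""
    # Step 1: initialize the population randomly.
    pop = [[random.randint(0, 1) for _ in range(n_bits)]
           for _ in range(pop_size)]
    for _ in range(generations):                      # Step 7: repeat
        # Step 2: compute the fitness of each individual.
        scored = [(fitness(ind), ind) for ind in pop]

        def pick():                                   # Step 3: select a parent
            a, b = random.sample(scored, 2)
            return (a if a[0] >= b[0] else b)[1]

        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = pick(), pick()
            if random.random() < p_cross:             # Step 4: crossover
                k = random.randint(1, n_bits - 1)
                p1, p2 = p1[:k] + p2[k:], p2[:k] + p1[k:]
            for child in (p1, p2):
                child = [1 - g if random.random() < p_mut else g
                         for g in child]              # Step 4: mutation
                offspring.append(child)
        # Steps 5-6: offspring are evaluated next pass; old generation dies.
        pop = offspring[:pop_size]
    return max(pop, key=fitness)

# Example: maximise the number of 1s in the string (OneMax).
best = evolve(sum)
```

Replacing the whole population each generation is only one of the death-selection policies Step 6 allows; steady-state replacement is an equally valid choice.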
Genetic algorithms are a particular class of evolutionary algorithms that use techniques inspired by evolutionary biology such as inheritance, mutation, selection, and crossover.
Genetic algorithms are implemented as a computer simulation in which a population of abstract representations (called chromosomes or the genome) of candidate solutions (called individuals) to an optimization problem evolves toward better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s.
The evolution usually starts from a population of randomly generated individuals and proceeds in generations. In each generation, the fitness of every individual in the population is evaluated; multiple individuals are selected from the current population (based on their fitness) and modified (recombined and possibly mutated) to form a new population.
The new population is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced or a satisfactory fitness level has been reached for the population. If the algorithm has terminated due to the maximum number of generations, a satisfactory solution may or may not have been reached.

Genetic Algorithms (Design Issues)
GAs do not use much knowledge about the problem to be optimised and do not deal directly with the parameters of the problem; they work with codes that represent the parameters. Thus:
- The first issue in a GA application is how to code the problem under study.
- The second issue is how to create the initial population of possible solutions.
- The third issue is how to select a suitable set of genetic operators.
- Finally, the GA has to know the quality of already-found solutions in order to improve them.

Simulated Annealing
A probabilistic optimization technique inspired by the annealing process in metallurgy, where materials are heated and then slowly cooled to remove defects and minimize their energy state.
This concept is applied to optimization problems, where the goal is to find a global minimum (or maximum) of a function, especially in cases with a complex, multi-dimensional search space. It is widely used in engineering, economics, machine learning, and other fields for solving large-scale optimization problems.
The simulated annealing algorithm was derived from statistical mechanics (Kirkpatrick et al.). In the analogy between a combinatorial optimisation problem and the annealing process:
- the states of the solid represent feasible solutions of the optimisation problem;
- the energies of the states correspond to the values of the objective function computed at those solutions;
- the minimum-energy state corresponds to the optimal solution of the problem; and
- rapid quenching can be viewed as local optimization.

Neural Networks
The brain can solve many problems much more easily than even the best computer:
- image recognition
- speech recognition
- pattern recognition
A neural network is a computational model designed to simulate the way biological brains process information. It is a key component of machine learning, especially in areas like deep learning, where neural networks with many layers, called deep neural networks, are used for tasks such as image recognition, natural language processing, and autonomous systems.
Neural networks are modelled on the mechanism of the brain [Kohonen, 1989; Hecht-Nielsen, 1990]. Theoretically, they have a parallel distributed information-processing structure. Two of the major features of neural networks are their ability to learn from examples and their tolerance to noise and damage to their components.
A neural network is a system of interconnected processing elements called neurons or nodes. Each node has a number of inputs and one output, which is a function of the inputs. There are three types of neuron layers: input, hidden, and output layers. Two layers communicate via a weight connection network.
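The node and layer structure just described can be sketched as a tiny forward pass. The logistic activation, the layer sizes, and the example weights below are illustrative assumptions, not values from the lecture:

```python
import math

def node_output(inputs, weights, bias):
    """One neuron: a weighted sum of its inputs passed through a
    logistic (sigmoid) activation, giving the single output."""
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-s))

def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Input -> hidden -> output: two layers communicating via a
    weight connection network."""
    hidden = [node_output(x, w, b) for w, b in zip(hidden_w, hidden_b)]
    return node_output(hidden, out_w, out_b)

# 2 inputs, 2 hidden nodes, 1 output node, with arbitrary example weights.
y = forward([1.0, 0.0],
            hidden_w=[[0.5, -0.4], [0.3, 0.8]], hidden_b=[0.0, -0.1],
            out_w=[1.2, -0.7], out_b=0.05)
```

The learned behaviour of a real network would come from adjusting the weights from examples; here they are fixed for illustration.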
Tabu Search
The tabu search algorithm was developed independently by Glover and Hansen for solving combinatorial optimisation problems. It is a kind of iterative search characterised by the use of a flexible memory. The process by which tabu search overcomes the local-optimality problem is based on an evaluation function that chooses the solution with the highest evaluation at each iteration.

Tabu Strategies
- Forbidding strategy. Employed to avoid cycling by forbidding certain moves, in other words classifying them as tabu. To prevent cycling, it is sufficient to check whether a previously visited solution is being revisited.
- Freeing strategy. Used to decide what exits the tabu list. It deletes the tabu restrictions of solutions so that they can be reconsidered in later steps of the search.
- Short-term (overall) strategy. Manages the interplay between the above two strategies.

Genetic Algorithms (GA)
Based on the mechanics of biological evolution. Initially developed by John Holland, University of Michigan (1970s):
- to understand processes in natural systems;
- to design artificial systems retaining the robustness and adaptation properties of natural systems.
Holland's original GA is known as the simple genetic algorithm (SGA). GAs provide efficient techniques for optimization and machine-learning applications, and are widely used in business, science, and engineering.
(Image adapted from http://today.mun.ca/news.php?news_id=2376)

Genetic Algorithm
Inspired by natural evolution:
- A population of individuals, where each individual is a feasible solution to the problem.
- Each individual is characterized by a fitness function; higher fitness means a better solution.
- Based on their fitness, parents are selected to reproduce offspring for a new generation; fitter individuals have more chance to reproduce.
- The new generation has the same size as the old generation; the old generation dies.
- Offspring combine the properties of their two parents.
- If well designed, the population will converge to an optimal solution.

GAs differ from most optimization techniques in many ways:
- GAs work on a coding of the design variables. This characteristic allows genetic algorithms to be extended to a design space consisting of a mix of continuous, discrete, and integer variables.
- GAs proceed from several points in the design space to another set of design points. Consequently, GA techniques have a better chance of locating the global minimum than schemes that proceed from one point to another.
- GAs work on function evaluations alone and do not require any function derivatives.
- GAs use probabilistic transition rules, which is an important advantage in guiding a highly exploitative search.

Issues with Genetic Algorithms
Choosing parameters:
- population size
- crossover and mutation probabilities
- selection and deletion policies
- crossover and mutation operators, etc.
- termination criteria
Performance:
- Can be slow, but covers a large search space.
- Is only as good as the fitness function.

Benefits of Genetic Algorithms
- The concept is easy to understand.
- Supports multi-objective optimization.
- Always gives an answer, and the answer gets better with time.
- Easy to exploit previous or alternative solutions.
- Flexible building blocks for hybrid applications.

GA Applications
- Aircraft design optimization
- Flight control systems
- Autopilot tuning
- Adaptive flight control
- Trajectory optimization
- Structural design
- Noise reduction
- Engine design

GA Applications
- Vehicle design optimization
- Engine performance optimization
- Hybrid and electric vehicle design
- Vehicle dynamics and control
- Anti-lock braking systems
- Active suspension systems
- Fuel economy and emissions optimization
- Driver assistance and autonomous systems

Genetic Algorithm Terminology
Genes; chromosomes (strings); population (genome).
GAs do not use much knowledge about the problem to be optimised and do not deal directly with the parameters of the problem.
They work with codes that represent the parameters.

Basics of GA
The most common type of genetic algorithm works like this: a population is created from a group of randomly generated individuals. The individuals in the population are then evaluated. The evaluation function is provided by the programmer and gives each individual a score based on how well it performs at the given task. Two individuals are then selected based on their fitness: the higher the fitness, the higher the chance of being selected. These individuals then "reproduce" to create one or more offspring, after which the offspring are mutated randomly. This continues until a suitable solution has been found or a certain number of generations have passed, depending on the needs of the programmer.
Thus:
- The first issue in a GA application is how to code the problem under study.
- The second issue is how to create the initial population of possible solutions.
- The third issue is how to select a suitable set of genetic operators.
- Finally, as with other search algorithms, GAs have to know the quality of already-found solutions in order to improve them further. Therefore, an interface is needed between the problem environment and the GA itself so that the GA has this knowledge.

General Algorithm for GA
Initialization. Initially, many individual solutions are randomly generated to form an initial population. The population size depends on the nature of the problem, but typically contains several hundred or thousand possible solutions. Traditionally, the population is generated randomly, covering the entire range of possible solutions.

Creation of the Initial Population
There are two ways of forming this initial population. The first consists of using randomly produced solutions created by a random number generator. This method is preferred for problems about which no a priori knowledge exists, or for assessing the performance of an algorithm.
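The first (random) method can be sketched as follows, using the rule given later in the lecture (z = 0 if r < 0.5, z = 1 otherwise). The parameter names nb (bits per string) and npop (population size) follow the lecture's NB and NPOP:

```python
import random

def initial_population(nb, npop):
    """Create NPOP random binary strings of NB bits each:
    z = 0 if r < 0.5, z = 1 if r >= 0.5, for r uniform on [0, 1)."""
    pop = []
    for _ in range(npop):
        string = [0 if random.random() < 0.5 else 1 for _ in range(nb)]
        pop.append(string)
    return pop

pop = initial_population(nb=8, npop=10)
```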
The second method employs a priori knowledge about the given optimization problem. Using this knowledge, a set of requirements is obtained, and solutions which satisfy those requirements are collected to form an initial population.

Genetic Operators
- Selection
- Crossover
- Mutation

Selection
The functions of the selection operator are to:
- identify the good solutions in a population;
- make multiple copies of the good solutions;
- eliminate bad solutions from the population so that multiple copies of good solutions can be placed in the population.
Now, how are the good solutions identified? A fitness value can be assigned to evaluate the solutions. A fitness function value quantifies the optimality of a solution; the value is used to rank a particular solution against all the other solutions. A fitness value is assigned to each solution depending on how close it actually is to the optimal solution of the problem.
There are different techniques to implement selection in genetic algorithms:
1. Tournament selection
2. Roulette wheel and proportionate selection
3. Ranking selection
4. Steady-state selection

Tournament Selection
In tournament selection, tournaments are played between two solutions and the better solution is chosen and placed in the mating pool. Two other solutions are picked again, and another slot in the mating pool is filled with the better solution. Basically, each solution can be made to participate in exactly two tournaments. The best solution in a population will win both times, thereby making two copies of itself in the new population. By a similar argument, the worst solution will lose both tournaments and will be eliminated from the population.

The figure shows a random population of six cans. The fitness of each can (the cost term for feasible solutions; the cost plus a penalty in proportion to the constraint violation for infeasible ones) is marked on the can.
It is interesting to note that two solutions do not have an internal volume of 300 ml and are thus penalized by the addition of an extra artificial cost.

1. Roulette Wheel Selection
Pi = individual fitness / total fitness; expected count (EC) = Pi x number of chromosomes; AC = actual count.

Chromosome  Fitness  Pi    (%RW)  EC    AC
1           44       0.25  25%    1.53  2
2           6        0.03  4%     0.21  0
3           36       0.21  21%    1.25  1
4           30       0.17  17%    1.04  1
5           36       0.21  21%    1.25  1
6           21       0.12  12%    0.73  1
Total       173      1.00         6     6

Parents are selected according to their fitness: the better the chromosomes are, the more chances they have to be selected. Imagine a roulette wheel on which all the chromosomes in the population are placed, each with a slot sized according to its fitness value. A marble is then thrown to select a chromosome; chromosomes with higher fitness will be selected more often.
In the roulette wheel and proportionate selection method, solutions are assigned copies, the number of which is proportional to their fitness values. If the average fitness of all population members is f_avg, a solution with fitness f_i gets an expected f_i / f_avg number of copies.

Rank Selection
Rank selection first ranks the population, and then every chromosome receives a fitness from this ranking: the worst has fitness 1, the second worst 2, and so on, with the best having fitness N.

Steady-State Selection
The best chromosome, or a few of the best chromosomes, are copied to the new population; the rest is done in the classical way. Without this, such individuals can be lost if they are not selected to reproduce, or if crossover or mutation destroys them. Preserving them significantly improves the GA's performance.

Crossover Operator
The selection operator cannot create any new solutions in the population.
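Since selection only copies existing solutions, it can be sketched on its own. A minimal roulette-wheel (fitness-proportionate) selection, using the fitness values from the can example (44, 6, 36, 30, 36, 21), might look like this; the function name is ours, not the lecture's:

```python
import random

def roulette_select(fitnesses):
    """Spin the roulette wheel once: each slot's size is
    proportional to the individual's share of the total fitness."""
    total = sum(fitnesses)
    r = random.uniform(0, total)      # throw the marble
    cumulative = 0.0
    for i, f in enumerate(fitnesses):
        cumulative += f
        if r <= cumulative:
            return i                  # index of the selected chromosome
    return len(fitnesses) - 1         # guard against float rounding

fits = [44, 6, 36, 30, 36, 21]        # total = 173, as in the table
pool = [roulette_select(fits) for _ in range(6)]   # mating-pool indices
```

Over many spins, chromosome 1 (Pi = 0.25) should be chosen roughly eight times as often as chromosome 2 (Pi = 0.03), matching the expected counts in the table.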
In crossover, two strings are picked from the mating pool at random, and some portion of the strings is exchanged between them to create two new strings. Variants include:
- single-point crossover
- two-point crossover
- uniform crossover
It is true that not every crossover will produce offspring better than both parent solutions. In order to preserve some of the good strings selected by the reproduction operator, not all strings in the population are used in crossover.

Initial strings:
  11000101 01011000 01101010
  00100100 10111001 01111000
Single-point offspring:
  11000101 01011001 01111000
  00100100 10111000 01101010
Two-point offspring:
  11000101 01111001 01101010
  00100100 10011000 01111000
Uniform offspring:
  01000101 01111000 01111010
  10100100 10011001 01101000

Mutation Operator
The bitwise mutation operator changes a 1 to a 0, and vice versa, with a mutation probability p_m. Mutation is needed to keep diversity in the population.
Goldberg (1989) suggested a mutation clock operator, where the location of the next mutated bit is determined by an exponential distribution. The procedure is: first, create a random number r in [0, 1]; then find the next bit to be mutated by skipping eta = -(1/p_m) ln(1 - r) bits from the current bit.
Alter each gene independently with probability p_m, called the mutation rate, typically between 1/pop_size and 1/chromosome_length.

Step 1. Create the initial population
Using an in-built random number generator, we can generate an "initial population" of binary strings. If r is a random number from a uniform distribution, then we define z = 0 if r < 0.5 and z = 1 if r >= 0.5. Repeating this NB x NPOP times generates a population.

Step 2. Fitness evaluation
The fitness of each member of the population, FIT(i), i = 1, ..., NPOP, is evaluated using the given objective function f(x).
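The crossover and mutation operators described above can be sketched directly on bit strings; the crossover points in the usage example are chosen arbitrarily for illustration:

```python
import random

def single_point(p1, p2, k):
    """Exchange everything after position k between the two parents."""
    return p1[:k] + p2[k:], p2[:k] + p1[k:]

def two_point(p1, p2, k1, k2):
    """Exchange the segment between positions k1 and k2."""
    return (p1[:k1] + p2[k1:k2] + p1[k2:],
            p2[:k1] + p1[k1:k2] + p2[k2:])

def bitwise_mutation(s, pm):
    """Flip each bit independently with mutation probability pm."""
    out = []
    for b in s:
        if random.random() < pm:
            out.append('0' if b == '1' else '1')
        else:
            out.append(b)
    return ''.join(out)

c1, c2 = single_point('11000101', '00100100', k=4)
d1, d2 = two_point('11000101', '00100100', k1=2, k2=5)
m = bitwise_mutation(c1, pm=0.05)
```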
The user must define f such that it is positive for all members of the population.

Step 3. Roulette wheel marking

Step 4. Creation of the mating pool
A random number r in the interval [0, 1] is chosen (i.e., the roulette wheel is spun). Depending on the interval in which r lies, the corresponding string is chosen from the population. For example, if r lies between A(1) and A(2), the second member of the population is chosen. The roulette wheel is spun NPOP times, resulting in a mating pool of NPOP strings.

Step 5. Crossover
Step 6. Mutation

An Example after Goldberg (1989)
Simple problem: maximize x^2 over {0, 1, ..., 31}.
GA approach:
- Representation: binary code, e.g. 01101 represents 13
- Population size: 4
- One-point crossover, bitwise mutation
- Roulette wheel selection
- Random initialisation
One generational cycle is shown done by hand: selection, then crossover, then mutation.

GA for Multi-Variable Optimisation
Maximize f(x) subject to l_i <= x_i <= u_i, i = 1 to n.

Coding and Decoding of Variables
In the most commonly used genetic algorithm, each variable is represented as a binary number of, say, m bits. This is conveniently carried out by dividing the feasible interval of variable x_i into 2^m - 1 intervals. For m = 6, the number of intervals is N = 2^m - 1 = 63. Each variable x_i can then take any of the discrete representations 000000, 000001, 000010, 000011, ..., 111111. For example, the binary number 110110 is decoded as
  x_i = l_i + s_i (1x2^5 + 1x2^4 + 0x2^3 + 1x2^2 + 1x2^1 + 0x2^0) = l_i + 54 s_i
where s_i is the interval step for variable x_i, given by
  s_i = (u_i - l_i) / 63

Step 1. Creation of the Initial Population
The first step in the development of a GA is the creation of the initial population. Each member of the population is a string of n x m bits; the variables in binary form are attached end to end.

Step 2. Evaluation
Sets of six binary digits are read and decoded. The function values f_1, f_2, ..., f_z are evaluated.
The function values are referred to as "fitness values" in GA language. The average fitness is calculated.

Step 3. Creation of a Mating Pool
Here, the weaker members are replaced by stronger ones based on the fitness values. We make use of a simple chance-selection process by simulating a roulette wheel. The first step in this process is to make all the function values positive, which can be done by scaling.

Step 4. Crossover Operation
The crossover operation is now performed on the mating-parent pool to produce offspring. A simple crossover operation involves choosing two members and a random integer k in the range 1 to nm - 1 (6n - 1 for the six-bit choice). The first k positions of the parents are exchanged to produce the two children.

Step 5. Mutation
Mutation is a random operation performed to change the expected fitness. Every bit of every member of the population is visited. A bit-mutation probability b_p of 0.005 to 0.1 is used. A random number r is generated in the range 0 to 1; if r < b_p, the bit is changed to 0 if it is 1, and to 1 if it is 0.

Step 6. Evaluation
The population is evaluated using the ideas developed in Step 2. The highest fitness value and the corresponding variable values are stored in f_max and x_max. A generation is now complete. If the preset number of generations has been completed, the process stops; if not, we go to Step 3.

Simulated Annealing
Proposed by Kirkpatrick et al., based on the analogy between the annealing of solids and the problem of solving combinatorial optimisation problems.

Basic Elements
In the analogy between a combinatorial optimisation problem and the annealing process:
- the states of the solid represent feasible solutions of the optimisation problem;
- the energies of the states correspond to the values of the objective function computed at those solutions;
- the temperature of the states corresponds to the control parameter;
- the minimum-energy state corresponds to the optimal solution of the problem; and
- rapid quenching can be viewed as local optimisation.

Analogy: the water rings game.

Basic Terminology
Slowly cool down a heated solid, so that all particles arrange themselves in the ground energy state. At each temperature, wait until the solid reaches its thermal equilibrium. Introduce a temperature-like parameter.

Boltzmann Probability Distribution
  P(E) = e^(-E / kT)
where k is Boltzmann's constant and P(E) is the probability of being in energy state E. At a high value of T, there is a finite probability of being in any state.

Acceptance rule:
  P(E(t+1)) = min[1, exp(-dE / kT)]
where dE = E(t+1) - E(t). If dE is negative, P = 1, so the new point is accepted; otherwise it is accepted only with a small probability. Reduce T slowly over the iterations.

Algorithm
1. Choose x_0, epsilon, T, and n (the number of iterations at each T); set t = 0.
2. Calculate x_{t+1} = a random point in the neighbourhood of x_t.
3. If dE = E(x_{t+1}) - E(x_t) is negative, accept x_{t+1}; otherwise accept it with probability exp(-dE / kT).
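The algorithm above can be sketched as a minimal minimiser. The neighbourhood move, cooling schedule, parameter values, and test function below are illustrative assumptions (the constant k is absorbed into T):

```python
import math
import random

def simulated_annealing(f, x0, T=10.0, alpha=0.95, n_iter=50,
                        levels=100, step_size=0.5):
    """Minimise f with the acceptance rule P = min(1, exp(-dE / T)),
    reducing T slowly over the iterations."""
    x, fx = x0, f(x0)
    best, fbest = x, fx
    for _ in range(levels):              # outer loop: temperature levels
        for _ in range(n_iter):          # n iterations at each T
            x_new = x + random.uniform(-step_size, step_size)  # neighbour
            dE = f(x_new) - fx           # energy difference
            # Downhill moves always accepted; uphill with prob exp(-dE/T).
            if dE < 0 or random.random() < math.exp(-dE / T):
                x, fx = x_new, f(x_new)
                if fx < fbest:
                    best, fbest = x, fx
        T *= alpha                       # cool slowly (geometric schedule)
    return best, fbest

# Example: minimise (x - 2)^2, a single-variable energy landscape.
best, fbest = simulated_annealing(lambda x: (x - 2.0) ** 2, x0=10.0)
```

Because uphill moves are occasionally accepted at high T, the search can escape local minima; as T falls, it behaves more and more like pure local optimisation, mirroring the quenching analogy above.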