Summary

This document collects lecture notes on search and algorithm design. It covers adversary search (the MinMax algorithm and alpha-beta pruning, used to evaluate possible moves while reducing the search space), heuristic search (best first search, the A* algorithm, and heuristic properties such as admissibility and consistency), dynamic programming (top-down and bottom-up, with shortest paths and Dijkstra's algorithm), graphs (representations, traversal, topological ordering, minimum spanning trees, and PageRank), and an introduction to time and space complexity. Examples and considerations for heuristic functions are given throughout.

Full Transcript

Adversary Search (Jordi Nin)

Index: 01 Introduction, 02 MinMax algorithm, 03 αβ-pruning

01 Introduction

Example I -- Tic Tac Toe. Example II -- 4 in a row. Example III -- Checkers. Example IV -- Chess. Example V -- Go.

What features do these games share?

Common features:
- Two-player games
- Finite number of moves
- Competitive games (no collaboration between players)
- Zero-sum games (the win of one player is the loss of the other)
- Both players have perfect information (unlike poker)
- Deterministic, no randomness (unlike D&D)

Required hypotheses

Adversaries introduce an element of uncertainty into the search process. Therefore, we need some hypotheses to build a search algorithm:
- We assume the adversary will always make the best possible move.
- We can define a heuristic to evaluate any possible game state.
- We have enough time to search (not necessarily to find a complete solution, but at least to simulate a few move levels ahead).

Takeaways: Some games can be solved by searching. Randomness is a problem. We need some hypotheses to model the adversary's behavior.

02 MinMax algorithm

Game tree

A game tree is a tree in which each node represents a configuration (e.g., a board position). The root corresponds to the current position, the children of a node represent the configurations reachable in one move, and each successive level corresponds to the opponent's alternating moves. (Game tree example figure on slides.)

MinMax Algorithm

Algorithm MinMax
Require: state sn, depth d, and max_level (Boolean)
Ensure: h(sn)
 1: if d == 0 OR h(sn) == 0 then        ▷ h(sn) == 0 means game over
 2:   Return h(sn)
 3: end if
 4: if max_level then                   ▷ if it is a maximizing level
 5:   max_eval ← −∞
 6:   for su ∈ sn.children() do         ▷ for all children of sn
 7:     eval ← MinMax(su, d−1, False)
 8:     max_eval ← max(max_eval, eval)
 9:   end for
10:   Return max_eval
11: else
12:   min_eval ← ∞
13:   for su ∈ sn.children() do         ▷ for all children of sn
14:     eval ← MinMax(su, d−1, True)
15:     min_eval ← min(min_eval, eval)
16:   end for
17:   Return min_eval
18: end if

Extra considerations

When the search space is too large to be searched entirely (or the game is played in real time), the analysis is truncated at some fixed level, and the resulting terminal nodes are evaluated by a heuristic function. Heuristic functions can be implemented using any appropriate representation; one common, simple form is a weighted sum of game-specific features, such as material difference, piece mobility, attacked pieces, and so on. A significant share of the secret of writing strong game-playing programs lies in the "black magic" of crafting good evaluation functions: the challenge is striking the right balance between providing an informative value and keeping the computational cost low.

Takeaways: Game trees may store all the possible configurations. The MinMax algorithm is a modification of the A* algorithm. Heuristics play an important role in game search.
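To make the pseudocode above concrete, here is a minimal Python sketch of MinMax. The state interface (a children() enumerator and a heuristic h with h == 0 meaning game over, as in the slides) is hypothetical and stands in for any game-specific representation.

import math

def minmax(state, depth, max_level, h, children):
    """Minimal MinMax sketch. `h` scores a state (0 meaning game over,
    as in the slides) and `children` enumerates successor states."""
    if depth == 0 or h(state) == 0:           # depth limit or game over
        return h(state)
    if max_level:                             # maximizing player's turn
        best = -math.inf
        for child in children(state):
            best = max(best, minmax(child, depth - 1, False, h, children))
        return best
    else:                                     # minimizing player's turn
        best = math.inf
        for child in children(state):
            best = min(best, minmax(child, depth - 1, True, h, children))
        return best

# Toy usage: a state is either a number (leaf value) or a tuple of substates.
toy_h = lambda s: s if isinstance(s, (int, float)) else 1   # nonzero = not over
toy_children = lambda s: list(s) if isinstance(s, tuple) else []
print(minmax(((3, 5), (2, 9)), 2, True, toy_h, toy_children))  # -> 3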
03 αβ-pruning

Idea

The MinMax algorithm performs a depth-first search of the game tree and assigns a value to each node. αβ-pruning is a branch-and-bound technique that determines the MinMax value while examining only part of the game tree. It maintains two bounds (the αβ-window) that restrict the search:
- α is the best (largest) value the maximizing player is already guaranteed.
- β is the best (smallest) value the minimizing player is already guaranteed.
If the initial window is (−∞, ∞), αβ-pruning determines the correct root value.

Possible cut-offs

There are two types to be distinguished, illustrated on the slides: shallow and deep cut-offs.

αβ-pruning Algorithm

Algorithm αβ-pruning
Require: state sn, depth d, max_level (Boolean), α, and β
Ensure: h(sn)
if d == 0 OR h(sn) == 0 then                 ▷ h(sn) == 0 means game over
  Return h(sn)                               ▷ Propagate value
end if
if max_level then                            ▷ if it is a maximizing level
  max_eval ← α
  for su ∈ sn.children() do                  ▷ for all children of sn
    eval ← αβ-pruning(su, d−1, False, max_eval, β)
    max_eval ← max(max_eval, eval)
    if max_eval ≥ β then
      Return max_eval                        ▷ β cut-off: propagate value
    end if
  end for
  Return max_eval                            ▷ Propagate value
else
  min_eval ← β
  for su ∈ sn.children() do                  ▷ for all children of sn
    eval ← αβ-pruning(su, d−1, True, α, min_eval)
    min_eval ← min(min_eval, eval)
    if min_eval ≤ α then
      Return min_eval                        ▷ α cut-off: propagate value
    end if
  end for
  Return min_eval                            ▷ Propagate value
end if

Animated example: αβ-pruning (on slides).

Time complexity

The MinMax algorithm has complexity O(b^d), where b and d stand for the node degree (branching factor) and the search depth, respectively. If the heuristic function correctly sorts the most promising nodes first, αβ-pruning has complexity O(b^(d/2)); the effective branching factor is therefore √b. For a heuristic that orders nodes at random, αβ-pruning has complexity O(b^(3d/4)).

Takeaways: The alternating minimization and maximization process lets us prune branches that cannot influence the result. αβ-pruning never examines more nodes than MinMax, and with good move ordering it substantially improves its time complexity.
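A minimal Python sketch of the pruning algorithm above, using the same hypothetical state interface as the earlier MinMax sketch; called with the full window it returns the same value as plain MinMax.

import math

def alphabeta(state, depth, max_level, alpha, beta, h, children):
    """αβ-pruning sketch: same recursion as MinMax, but branches are
    skipped once the window (alpha, beta) proves they cannot matter."""
    if depth == 0 or h(state) == 0:                  # depth limit or game over
        return h(state)
    if max_level:
        value = alpha
        for child in children(state):
            value = max(value, alphabeta(child, depth - 1, False,
                                         value, beta, h, children))
            if value >= beta:                        # β cut-off
                return value
        return value
    else:
        value = beta
        for child in children(state):
            value = min(value, alphabeta(child, depth - 1, True,
                                         alpha, value, h, children))
            if value <= alpha:                       # α cut-off
                return value
        return value

# Full-window call, equivalent to MinMax but visiting fewer nodes:
# alphabeta(root, d, True, -math.inf, math.inf, h, children)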
Questions?

Heuristic Search (Jordi Nin)

Index: 01 Introduction, 02 Heuristic, 03 A* algorithm, 04 Heuristic properties

01 Introduction

Sliding tiles puzzle

Starting point → Goal. The sliding puzzle consists of a frame of numbered square tiles in random order, with one tile missing. The object of the puzzle is to place the tiles in order by making sliding moves that use the empty space.

N queens problem

The N queens problem is to place n queens on an n×n chessboard so that no two queens attack each other by being in the same row, column, or diagonal.

Systematic heuristic search in state spaces

Search ↓ we solve the problem by searching.
State spaces ↓ we represent the problem using states ↓ we move from one state to another using "legal" operators.
Systematic ↓ we traverse the space exhaustively ↓ we only skip a state if it is invalid (leads to no solution) ↓ complete algorithms ↓ if there is a solution, we find it.
Heuristic ↓ there is a function that indicates how far we are from the solution.

Goal

The solution is the sequence of moves that takes us from the initial to the end state. Usually, we look for the shortest (least costly) solution.

Takeaways: A state is a computer-based representation of a problem. We solve problems by searching. We need a function to guide us during the search.

02 Heuristic

Origin: Eureka! → heuriskein ("to find" in Greek) → heuristic.

Idea

A heuristic is part of a process to solve a problem (sometimes solving it exactly is not possible). It is a function, denoted h(sn), that estimates the remaining cost to solve the problem from the current state sn. This information can be exploited by search algorithms to assess whether one state is more promising than the rest. In this setting, there is a trade-off:
- h(sn) reduces the search time by discarding uninteresting states.
- We have to compute h(sn), which might be expensive.
The use of h(sn) is only a good idea if the time spent computing it is smaller than the search time it saves. It improves the average cost, not the worst-case cost.

Can you propose a possible heuristic for the previous problems?

Example I. Sliding tiles puzzle
- Number of tiles in a wrong place (Hamming distance)
- Sum of the Manhattan distances of the wrongly placed tiles

Example II. N queens problem
- Number of cells discarded by each queen
- Largest number of empty cells in the row with the fewest empty cells

Takeaways: A heuristic h(sn) is a mathematical function that evaluates the goodness of a state sn. h(sn) involves a trade-off: computational cost versus discarded states.

03 A* algorithm

Priority queue

A priority queue is similar to a regular queue or stack, but each element has an associated priority. Elements with high priority are served before elements with low priority. If two elements have the same priority, they are served in the order they were enqueued. When a new element comes in, it is placed in order of its priority.

Best first search

It is a greedy algorithm with the following steps:
1) Visit the first state of the priority queue. If the queue is empty, the problem has no solution.
2) If a successor's heuristic is better than its parent's, the successor is placed at the front of the queue.
3) Otherwise, the successor is inserted into the queue (at a position determined by its heuristic value).
4) If the successor is not a valid solution, go to step 1.

Algorithm BFS (best first search)
Require: initial si and target st states, and a heuristic function h(sn)
Ensure: sequence S
V ← V ∪ si                          ▷ Mark the initial state si as visited
Q.enqueue(si, h(si))                ▷ Enqueue si with priority
while Q ≠ ∅ do
  si ← Q.dequeue()                  ▷ Dequeue the state with minimum estimated cost to target
  S ← S ∪ si                        ▷ Add si as the next state of the solution
  for sn ∈ si.adjacentNodes(V) do   ▷ Non-visited neighbors sn
    if sn == st then
      Return S ∪ sn
    else
      V ← V ∪ sn                    ▷ Mark the state sn as visited
      Q.enqueue(sn, h(sn))          ▷ Enqueue sn with priority
    end if
  end for
end while
Return failure

Is there any issue in the above algorithm? (BFS worked example on slides.)

A* solution

The problem is that h(sn) ignores the cost already paid from the initial state si to sn.
Solution → f(sn) = g(sn) + h(sn)
- g(sn) → cost from the initial state to the state sn
- h(sn) → estimate of the cost from the state sn to the solution
- f(sn) → total estimated cost from the initial state to the solution through sn
(A* worked example on slides.)

A* completeness

A* is a complete algorithm, i.e., if there is a solution, A* will find it, provided that:
- the search space (tree or graph) is finite,
- nodes have finite degree,
- there are no infinite paths, i.e., f(sn) has an upper bound.
If these conditions are met, A* cannot run forever without expanding the solution.

Takeaways: Best first search is a modification of the DFS/BFS traversal algorithms. The priority queue is the distinguishing feature of best first search. A* is a best-first-search algorithm that also considers the accumulated cost. A* is a complete algorithm.
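The following is a minimal sketch of A*, assuming a graph given as an adjacency dict with non-negative edge weights and a heuristic function; the names and encoding are illustrative, not from the slides.

import heapq

def a_star(graph, s, t, h):
    """A* sketch. graph: dict node -> list of (neighbor, weight);
    h: heuristic estimating the remaining cost to t. Returns a path s..t."""
    open_heap = [(h(s), s)]            # priority = f(n) = g(n) + h(n)
    g = {s: 0}                         # best known cost from s
    parent = {s: None}
    while open_heap:
        _, u = heapq.heappop(open_heap)
        if u == t:                     # goal reached: reconstruct the path
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v, w in graph.get(u, []):
            if g[u] + w < g.get(v, float("inf")):   # found a cheaper g(v)
                g[v] = g[u] + w
                parent[v] = u
                heapq.heappush(open_heap, (g[v] + h(v), v))
    return None                        # no path exists

# Toy usage with h ≡ 0, in which case A* degenerates to Dijkstra:
toy = {"s": [("a", 1), ("b", 4)], "a": [("t", 5)], "b": [("t", 1)]}
print(a_star(toy, "s", "t", lambda n: 0))   # -> ['s', 'b', 't']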
04 Heuristic properties

Admissible heuristic

An estimate h(sn) is an admissible heuristic if it is a lower bound on the optimal solution cost, that is,
h(sn) ≤ δ(sn, st) ∀sn ∈ S
where δ(sn, st) is the actual cost from the current state sn to the solution st.

Consistent heuristic

An estimate h(sn) is a consistent heuristic if
h(sn) ≤ h(su) + w(sn, su) ∀su ∈ child(sn)
where w(sn, su) is the actual cost from the current state sn to each of its children su.

Monotone heuristic

Let (s0, …, sk) be any path and g(sk) its cost; define f(sk) = g(sk) + h(sk). An estimate h(sn) is a monotone heuristic if
f(sj) ≥ f(si) for all sj after si on the path, 0 ≤ i, j ≤ k,
that is, the estimate of the total path cost is non-decreasing from a node to its successors.

Equivalence of consistent and monotone heuristics

A heuristic is consistent if and only if it is monotone. (Proof) For two subsequent states si−1 and si on a path (s0, …, sk) we have
f(si) = g(si) + h(si)                      (by the definition of f)
      = g(si−1) + w(si−1, si) + h(si)      (by the definition of path cost)
      ≥ g(si−1) + h(si−1)                  (by the definition of consistency)
      = f(si−1)                            (by the definition of f)

Consistent heuristics are admissible

Moreover, we obtain the following implication. (Reminder) If h(sn) is consistent we have h(su) − h(sv) ≤ w(su, sv) for every child sv of su. Let p = (s0, …, sk) be any path from sn = s0 to st = sk. Then
w(p) = Σ_{i=0..k−1} w(si, si+1) ≥ Σ_{i=0..k−1} (h(si) − h(si+1)) = h(sn) − h(st) = h(sn)
since h(st) = 0. This also holds when p is an optimal path from sn to st; therefore h(sn) ≤ δ(sn, st).

Can we improve A* execution time using better heuristics?

Hints: The maximum of two admissible heuristics is an admissible heuristic. The maximum of two consistent heuristics is a consistent heuristic. (A small sketch follows the takeaways below.)

Dominance

If a heuristic h(sn) is admissible, all the states on the way from the initial state to the optimal solution have a smaller h(sn) than the solution's cost, so A* will expand all of them. Imagine you have two admissible heuristics h1(sn) and h2(sn). We say that h2(sn) is more informative when h2(sn) > h1(sn) ∀sn. This implies that all states visited by A* using h2(sn) are also visited using h1(sn), but not the other way around; therefore, an A* algorithm using h1(sn) will visit at least as many states as one using h2(sn). When we observe this phenomenon, we say that h2(sn) dominates h1(sn).

Takeaways: The selected heuristic affects A* performance. We look for the most dominant heuristic. Consistent (equivalently, monotone) heuristics are always admissible.
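A tiny sketch of the hint above, using the two sliding-puzzle heuristics from the earlier examples; the flat-tuple board encoding is an assumption for illustration.

# States: a 3x3 board as a tuple of 9 entries, 0 = blank; goal = (1,...,8,0).
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def hamming(state):
    # Number of tiles out of place (the blank is not counted).
    return sum(1 for i, v in enumerate(state) if v != 0 and v != GOAL[i])

def manhattan(state):
    # Sum of each tile's horizontal + vertical distance to its goal cell.
    total = 0
    for i, v in enumerate(state):
        if v != 0:
            goal_i = GOAL.index(v)
            total += abs(i // 3 - goal_i // 3) + abs(i % 3 - goal_i % 3)
    return total

def h_max(state):
    # The pointwise max of two admissible heuristics is admissible
    # (and dominates both), so it can only shrink A*'s search.
    return max(hamming(state), manhattan(state))

print(h_max((1, 2, 3, 4, 5, 6, 7, 0, 8)))   # -> 1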
Questions?

Dynamic Programming (Jordi Nin)

Index: 01 Introduction, 02 Top-down DP, 03 Bottom-up DP, 04 Shortest paths

01 Introduction

Intuition

Dynamic programming (DP) is a general and powerful algorithm design technique. It is especially intended for optimization problems: shortest paths, inventory management, the knapsack problem. DP is a kind of exhaustive search, O(2^n), but done in a clever way, O(n^x). We can therefore think of DP as a careful brute-force approach. DP is based on generating subproblems and reusing them.

Origin -- Richard Bellman

"I spent the Fall quarter (of 1950) at RAND. My first task was to find a name for multistage decision processes. An interesting question is, 'Where did the name, dynamic programming, come from?' The 1950s were not good years for mathematical research. We had a fascinating gentleman in Washington named Wilson. He was Secretary of Defense, and he actually had a pathological fear and hatred of the word, research. I'm not using the term lightly; I'm using it precisely. His face would suffuse, he would turn red, and he would get violent if people used the term, research, in his presence. You can imagine how he felt, then, about the term, mathematical. The RAND Corporation was employed by the Air Force, and the Air Force had Wilson as its boss, essentially. Hence, I felt I had to do something to shield Wilson and the Air Force from the fact that I was really doing mathematics inside the RAND Corporation. What title, what name, could I choose? In the first place, I was interested in planning, decision-making, in thinking. But planning is not a good word for various reasons. I decided therefore to use the word, 'programming.' I wanted to get across the idea that this was dynamic, this was multistage, this was time-varying—I thought, let's kill two birds with one stone. Let's take a word with an exact meaning, namely dynamic, in the classical physical sense. It also has a very interesting property as an adjective, and that is it's impossible to use the word, dynamic, in a pejorative sense. Try thinking of some combination that will possibly give it a pejorative meaning. It's impossible. Thus, I thought dynamic programming was a good name. It was something not even a Congressman could object to. So I used it as an umbrella for my activities."

Greedy algorithms recap

A greedy approach consists of always making a locally optimal decision by selecting the best option available to extend the current partial solution, and then solving the unique subproblem resulting from the decision taken. The greedy approach can be applied to problems that exhibit two principles:
- The principle of local optimality states that a globally optimal solution can be reached through optimal local decisions. It allows us to make an optimal global decision without considering future decisions or subproblems.
- The principle of optimal substructure states that every optimal solution of a problem contains optimal solutions of its subproblems. It guarantees optimality.

Divide & conquer steps recap

Divide and conquer is a strategy for solving a large problem by (1) breaking the problem into smaller subproblems, (2) solving the subproblems, and (3) combining their solutions to get the desired output.

Independence principle recap

Independent means that the solution to one subproblem does not depend on or affect the solution to another subproblem. The subproblems can therefore be solved separately, and their solutions combined to get the final solution. (Example: merge sort.)

Overlapping subproblems

In divide and conquer, when the subproblems overlap, a lot of unnecessary work is done, because the algorithm repeatedly solves the same subproblems, which can lead to exponential time complexity. (Example: the Fibonacci sequence.)

Dynamic programming essence

Dynamic programming optimizes the time complexity when subproblems overlap by solving each subproblem only once and storing its solution in an array or hash table (dictionary). This is possible only when the optimal-substructure principle holds, i.e., the optimal solution of one subproblem can be used to obtain the optimal solutions of other subproblems. Avoiding re-solving the same subproblems can improve the time complexity from exponential to polynomial.

Dynamic programming exercises 1 & 2

Takeaways: Divide & conquer is inefficient when subproblems overlap. Dynamic programming solves this issue by storing subproblem results. Dynamic programming is a clever brute-force approach.

02 Top-down DP

Methodology

Top-down dynamic programming, aka memoization, consists of extending a recursive divide-and-conquer algorithm for an optimization problem so that it first checks whether a subproblem has been solved before (its solution being available in a dictionary), and solves it recursively otherwise. An efficient implementation of top-down dynamic programming requires constant-time dictionary queries, so hashing-based structures are preferred over lists or arrays.
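A minimal top-down (memoized) Fibonacci sketch in Python, using a dictionary as the methodology above suggests:

def fib(n, memo=None):
    """Top-down DP: check the dictionary first, recurse only on a miss."""
    if memo is None:
        memo = {0: 0, 1: 1}          # base cases
    if n not in memo:                # solve each subproblem only once...
        memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
    return memo[n]                   # ...then memoized calls cost O(1)

print(fib(6))    # -> 8, with O(n) distinct calls instead of O(2^n)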
Fibonacci example

In the call tree of depth n, every previously calculated value is stored, so no input to the Fibonacci function is evaluated more than once; after returning from the base case, the call tree is traversed only once. This reduces the time complexity from exponential O(2^n) (15 calls in the slides' example) to linear O(n) (6 calls).

Dynamic programming exercises 3, 4 & 5

Takeaways: Top-down DP only recurses the first time a subproblem is called. Memoized calls (non-recursive calls) have time complexity O(1). In the Fibonacci example, the time complexity drops from O(2^n) to O(n).

03 Bottom-up DP

Methodology

In bottom-up dynamic programming, computation starts from the base case, so the subproblems' optimal solutions are already available when solving a larger subproblem.

DAG representation

Bottom-up DP can be represented as a DAG (directed acyclic graph). The execution follows a topological order, in this case from left to right.

Dynamic programming exercises 6 & 7

Top-down vs. bottom-up

A bottom-up DP algorithm can be more efficient than a top-down one when all subproblems need to be solved at least once; the saving (a constant factor) comes from avoiding the cost of recursion and dictionary maintenance. A top-down DP algorithm can be more efficient than a bottom-up one when it is not necessary to solve all the subproblems, because solving from the top down only touches the subproblems that are strictly necessary to find the optimal solution. In some problems, the regular dictionary access pattern of the bottom-up approach can be exploited to reduce its space complexity.

Takeaways: Bottom-up DP relies on iterative programming. We choose bottom-up or top-down DP depending on the problem. Sometimes DP does not require an increase in the algorithm's space complexity. In general, DP time complexity is calculated as |subproblems| × (time per subproblem).

04 Shortest paths

Problem description

(Figure: two small weighted graphs on nodes s, a, b, c, v, with edge weights such as 7, 2, 15, 4, and 10.) We want an algorithm that finds the shortest-path distance δ between s and v, i.e., δ(s, v).

Any idea?

Guessing

What do we do when we do not know an answer? ↓ Guessing ↓ But we do not know which guess is the best one ↓ All right, then try all guesses and take the best one.

Playing the guessing game

When guessing, we also need a base case → δ(s, s) = 0. Finally, take the best guess:
δ(s, v) = min over edges (u, v) ∈ E of ( δ(s, u) + w(u, v) )

What is the complexity of the previous algorithm?

Now it's your turn

(Figure: a small graph on nodes s, a, b, v.) Exercise: use the previous algorithm to find the shortest path δ(s, v) for the graph above.

Any idea? We need an acyclic graph: each time we follow an edge, we move one layer down. This technique converts any graph with cycles into an acyclic one. (Figure: the acyclic, layered graph for the previous example.)

What is the complexity of the previous algorithm?
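A minimal sketch of that recurrence computed bottom-up over a DAG in topological order; the graph encoding and node names are illustrative assumptions.

import math

def dag_shortest_paths(order, incoming, s):
    """Bottom-up DP: `order` is a topological order of the nodes,
    `incoming[v]` lists the (u, w) edges into v. Returns dist = δ(s, ·)."""
    dist = {v: math.inf for v in order}
    dist[s] = 0                                   # base case: δ(s, s) = 0
    for v in order:                               # subproblems in topological order
        for u, w in incoming.get(v, []):
            dist[v] = min(dist[v], dist[u] + w)   # δ(s,v) = min over (u,v) of δ(s,u)+w(u,v)
    return dist

# Toy usage on a layered DAG s -> {a, b} -> v:
order = ["s", "a", "b", "v"]
incoming = {"a": [("s", 7)], "b": [("s", 4)], "v": [("a", 2), ("b", 10)]}
print(dag_shortest_paths(order, incoming, "s")["v"])   # -> 9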
Bottom-up DP algorithm for shortest paths

1) Mark all nodes as unvisited.
2) Assign every node a tentative distance: δ(s) = 0 for the initial node and δ(v) = ∞ for every v ≠ s. Set s as the current node.
3) For the current node, consider all of its unvisited neighbors and calculate their tentative distances through the current node. Compare each newly calculated tentative distance to the one currently assigned and keep the smaller one.
4) When all of the unvisited neighbors of the current node have been considered, remove it from the unvisited set.
5) If the destination node has been visited, finish the algorithm.
6) Otherwise, select the unvisited node with the smallest tentative distance, set it as the new current node, and go back to step 3.

Dijkstra's algorithm (1956-1959)

Algorithm Dijkstra
Require: graph G, node s
Ensure: a distance map dist and a list of previous nodes prev
 1: Q ← ∅
 2: for v ∈ G do
 3:   dist[v] ← ∞, prev[v] ← None, Q ← Q ∪ v
 4: end for
 5: dist[s] ← 0
 6: while Q ≠ ∅ do
 7:   u ← min(dist, Q)                      ▷ min returns the node in Q with the minimum distance
 8:   Q ← Q − u                             ▷ remove u from Q
 9:   for v ∈ u.adjacentNodes(Q) do         ▷ neighbors v of u still in Q
10:     if dist[u] + δ(u, v) < dist[v] then
11:       dist[v] ← dist[u] + δ(u, v), prev[v] ← u    ▷ update dist and prev
12:     end if
13:   end for
14: end while
15: Return dist, prev

Shortest path reconstruction

Algorithm Shortest path
Require: list of previous nodes prev, destination node v, source node s
Ensure: stack q holding the path
q ← ∅, u ← v
if prev[u] ≠ ∅ ∨ u == s then      ▷ Do something only if the vertex is reachable
  while u ≠ ∅ do
    q.stack(u)                    ▷ Construct the path by stacking
    u ← prev[u]                   ▷ Go to the previous node
  end while
end if
Return q

Takeaways: Shortest paths are easy in DAGs (O(|V| + |E|)). Shortest paths are not so easy when graphs have cycles (O(|V|²) with this implementation). Dijkstra's algorithm is based on DP and memoizes tentative distances. A visual explanation of Dijkstra's algorithm is available at https://algorithms.discrete.ma.tum.de.
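A compact Python sketch of Dijkstra, using a binary heap instead of the linear min-scan of the pseudocode (so the complexity is O((V + E) log V) rather than O(V²)); the adjacency-dict encoding is an assumption.

import heapq

def dijkstra(graph, s):
    """graph: dict node -> list of (neighbor, weight), weights >= 0.
    Returns (dist, prev) like the pseudocode above."""
    dist = {v: float("inf") for v in graph}
    prev = {v: None for v in graph}
    dist[s] = 0
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:                  # stale heap entry, skip
            continue
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:    # relax edge (u, v)
                dist[v] = dist[u] + w
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, prev

def reconstruct(prev, s, v):
    """Walk prev-pointers back from v; returns the path s..v ([] if unreachable)."""
    path = []
    if prev[v] is not None or v == s:
        while v is not None:
            path.append(v)
            v = prev[v]
        path.reverse()
    return path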
Questions?

Graphs (Jordi Nin)

Index: 01 Definition, 02 Components, 03 Graph traversal, 04 Topological ordering, 05 Minimum spanning tree, 06 PageRank

01 Definition

Formal definition (Wikipedia)

A graph is a pair G = (V, E), where V is a set whose elements are called vertices, and E is a set of paired vertices, whose elements are called edges. The vertices i and j of an edge {i, j} are called the endpoints of the edge.

Graphs versus trees

Graphs may contain multiple paths (with cycles) between two nodes; trees have only one path (no loops). Graphs have no hierarchy: edges can connect any pair of vertices, even a vertex to itself, whereas in trees a node is linked only to its parent. A graph consists of one or more connected components, where a connected component is a portion of a graph containing at least one path between every pair of its vertices; trees have exactly one connected component.

Can you think of examples of where to use graphs?
Example I. Social networks (the Star Wars social network).
Example II. Information networks (Global Internet Map 2018).
Example III. Transportation networks (Barcelona Metro map).
Example IV. Interaction networks (cross-criticism between candidates in the last Spanish general elections).

02 Components

Node (vertex): one of a set of objects connected together. Node attributes describe a node's characteristics; for example, in a business network where nodes represent businesses, attributes could be their revenue or sector.

Edge: a connection between nodes. If the edges are directed, pointing in only one direction, the graph is called directed; if all edges are undirected, the graph is called undirected. The thickness of an edge indicates how strong the relationship between the two connected nodes is.

Directed adjacency matrix

For a directed graph on nodes A, B, C, D with edges A→B, A→C, B→C, C→D:

        A  B  C  D
    A [ 0  1  1  0 ]
    B [ 0  0  1  0 ]
    C [ 0  0  0  1 ]
    D [ 0  0  0  0 ]

Interpretation: a_ij = 1 → there is an edge from node i to node j; a_ij = 0 → there is no connection from node i to node j.

Undirected adjacency matrix

        A  B  C  D
    A [ 0  1  1  0 ]
    B [ 1  0  1  0 ]
    C [ 1  1  0  1 ]
    D [ 0  0  1  0 ]

Remark: a_ij = a_ji → an undirected graph is represented by a symmetric matrix.

Directed weighted adjacency matrix

        A  B  C  D
    A [ 0  5  2  0 ]
    B [ 0  0  3  0 ]
    C [ 0  0  0  6 ]
    D [ 0  0  0  0 ]

Interpretation: a_ij = w_ij → there is an edge from node i to node j with importance equal to w_ij; a_ij = 0 → there is no connection from node i to node j.

Undirected weighted adjacency matrix

        A  B  C  D
    A [ 0  5  2  0 ]
    B [ 5  0  3  0 ]
    C [ 2  3  0  6 ]
    D [ 0  0  6  0 ]

Remark: a_ij = a_ji → as before, an undirected weighted graph is represented by a symmetric matrix.

Adjacency lists

The same weighted graph as adjacency lists:
A: [(B, 5), (C, 2)]
B: [(A, 5), (C, 3)]
C: [(A, 2), (B, 3), (D, 6)]
D: [(C, 6)]

Space complexity: adjacency matrix O(n²), where n is the number of nodes; adjacency list O(n + m), where m is the number of edges.

Graph degree distribution

The degree of a node in a graph is the number of edges the node has to other nodes. The degree distribution P(k) of a graph is the fraction of nodes in the graph with degree k. In the slides' example, the highlighted node has degree k = 3; in the directed version, k_in = 2 and k_out = 1.

Graph strength distribution

The strength of a node in a weighted graph is the sum of the weights of the edges the node has to other nodes. The strength distribution P(s) of a graph is the fraction of nodes in the graph with strength s. In the slides' example, the highlighted node has strength s = 11, with s_in = 5 and s_out = 6.

Graphs and MST exercise 1

Takeaways: Graphs are everywhere. Graphs are a nice data representation. Graphs can be stored in different ways.
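A small Python sketch contrasting the two storage options for the undirected weighted example above (plain lists and a dict; numpy would work equally well):

# Adjacency matrix: O(n^2) space, O(1) edge lookup.
idx = {"A": 0, "B": 1, "C": 2, "D": 3}
M = [[0, 5, 2, 0],
     [5, 0, 3, 0],
     [2, 3, 0, 6],
     [0, 0, 6, 0]]

# Adjacency lists: O(n + m) space, fast iteration over a node's neighbors.
adj = {"A": [("B", 5), ("C", 2)],
       "B": [("A", 5), ("C", 3)],
       "C": [("A", 2), ("B", 3), ("D", 6)],
       "D": [("C", 6)]}

print(M[idx["A"]][idx["B"]])   # weight of edge A-B -> 5
print(adj["C"])                # C's neighbors -> [('A', 2), ('B', 3), ('D', 6)]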
03 Graph traversal

Idea

From an initial vertex s in a graph G, enumerate all other vertices of G reachable from s through paths of G. If G has several components, we cannot traverse the complete graph starting from a single node s. As in trees, we can traverse a graph G from an initial vertex s following the DFS or BFS strategy, by stacking or queuing the next nodes to visit. If G has cycles, we also need to track visited nodes.

Generic algorithm

Algorithm Graph traversal
Require: graph G = (V, E), node s
Ensure: traversal T
 1: L ← ∅
 2: for v ∈ V do
 3:   visited[v] ← False
 4: end for
 5: L ← L ∪ s                        ▷ The ∪ operator can be a stack push or an enqueue
 6: while L ≠ ∅ do
 7:   v ← L.next()                   ▷ The next method can be a pop or a dequeue
 8:   if not visited[v] then
 9:     T ← T ∪ v                    ▷ Visit node v
10:     visited[v] ← True
11:     for w ∈ G[v].adjacent() do   ▷ The adjacent method returns all of v's neighbors
12:       if not visited[w] then
13:         L ← L ∪ w                ▷ The ∪ operator can be a stack push or an enqueue
14:       end if
15:     end for
16:   end if
17: end while

DFS example

Running Graph traversal(G, s=B) with a stack on the four-node example graph: B is visited first and its unvisited neighbors are pushed; then A is visited, then C (which pushes D), and finally D, yielding T → B, A, C, D.

Traversing directed graphs

In directed graphs, a strongly connected component is a group of nodes where every node is reachable from every other node. In the slides' example, we can only traverse the complete graph if we start from node a, b, or e.

Graphs and MST exercise 2

Takeaways: To avoid infinite loops, we need to track the visited nodes. Graphs can be traversed using variants of the DFS and BFS algorithms. Graph traversal has the same time complexity as tree traversal, O(n).
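A minimal Python sketch of the generic traversal, where swapping the container behavior (stack vs. queue) switches between DFS and BFS; the visiting order also depends on the neighbor order in the adjacency dict.

from collections import deque

def traverse(graph, s, dfs=True):
    """graph: dict node -> list of neighbors. One container serves both
    strategies: pop from the right for DFS (stack), from the left for BFS (queue)."""
    visited = {v: False for v in graph}
    pending = deque([s])
    order = []
    while pending:
        v = pending.pop() if dfs else pending.popleft()
        if not visited[v]:
            order.append(v)                      # visit node v
            visited[v] = True
            for w in graph[v]:
                if not visited[w]:
                    pending.append(w)
    return order

G = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
print(traverse(G, "B"))              # DFS, e.g. -> ['B', 'C', 'D', 'A']
print(traverse(G, "B", dfs=False))   # BFS       -> ['B', 'A', 'C', 'D']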
04 Topological ordering

Definition

A topological ordering of a directed graph G = (V, E) is a linear ordering of its nodes such that for every edge (u, v) ∈ E, node u appears before node v in the ordering. The most common application is the temporal ordering of events, such as ordering the study plan of a university degree based on the prerequisites of its subjects, or ordering purchases in complex supply-chain management processes. Topological ordering is also interesting because a directed graph admits a topological ordering only if it is acyclic.

Ordering algorithm

Algorithm Graph ordering
Require: graph G = (V, E)
Ensure: the topological ordering is given by the order in which the nodes are dequeued
 1: order ← ∅
 2: q ← all v ∈ G with v.inDegree() = 0          ▷ All nodes of zero in-degree are placed in a queue
 3: while q ≠ ∅ do                               ▷ As long as the queue is not empty
 4:   next ← q.dequeue()                         ▷ The next node is dequeued
 5:   order ← order ∪ next                       ▷ Add it to the ordered list
 6:   for w ∈ next.adjacentNodes() do
 7:     w.inDegree() ← w.inDegree() − 1          ▷ The in-degree of adjacent nodes is decremented
 8:   end for
 9:   q ← all not-yet-ordered v ∈ G with v.inDegree() = 0
10: end while
11: Return order

If we represent the graph as adjacency lists extended with each node's in-degree, the time and space complexities are O(n + m), where n and m stand for the numbers of nodes and edges.

Ordering example

For the example graph on nodes A-G, the algorithm proceeds as follows (queue → dequeued node → order so far):
Step 1: q = [A, B] → A → (A)
Step 2: q = [B] → B → (A, B)
Step 3: q = [C] → C → (A, B, C)
Step 4: q = [D, E, F] → D → (A, B, C, D)
Step 5: q = [E, F] → E → (A, B, C, D, E)
Step 6: q = [F, G] → F → (A, B, C, D, E, F)
Step 7: q = [G] → G → (A, B, C, D, E, F, G)
Step 8: q = ∅ → done; order = A, B, C, D, E, F, G

Graphs and MST exercise 3

Takeaways: Only acyclic directed graphs have a topological ordering. Topological ordering complexity is O(n + m). It solves several business problems.
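A compact Python version of the ordering algorithm above (Kahn's algorithm). The example graph's edges are a guess reconstructed to reproduce the queue states shown in the worked example.

from collections import deque

def topological_order(graph):
    """graph: dict node -> list of successors. Returns a topological order,
    or raises if the graph has a cycle (some node never reaches in-degree 0)."""
    indeg = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indeg[w] += 1
    q = deque(sorted(v for v in graph if indeg[v] == 0))
    order = []
    while q:
        v = q.popleft()                  # dequeue a zero in-degree node
        order.append(v)
        for w in graph[v]:
            indeg[w] -= 1                # "remove" v's outgoing edges
            if indeg[w] == 0:
                q.append(w)
    if len(order) != len(graph):
        raise ValueError("graph has a cycle: no topological ordering")
    return order

# Edges chosen (as an assumption) to match the slides' queue states:
G = {"A": ["C"], "B": ["C"], "C": ["D", "E", "F"],
     "D": ["G"], "E": ["G"], "F": [], "G": []}
print(topological_order(G))   # -> ['A', 'B', 'C', 'D', 'E', 'F', 'G']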
05 Minimum spanning tree

Greedy algorithms (recap)

As in the dynamic programming deck: a greedy approach always makes a locally optimal decision by selecting the best option available to extend the current partial solution, then solves the unique subproblem resulting from that decision. It applies to problems exhibiting the principle of local optimality (a globally optimal solution can be reached through optimal local decisions) and the principle of optimal substructure (every optimal solution contains optimal solutions of its subproblems).

Idea

The minimum spanning tree (MST) of a graph is a subgraph containing the edges of minimum total cost (strength) that form a tree connecting all of the graph's nodes. In the slides' example on nodes A-G, the complete graph (with edge weights 1, 2, 2, 2, 3, 4, 7, and 15) is reduced to a minimum spanning tree keeping the edges of weights 1, 2, 2, 2, 3, and 4. The MST drops all the loops in a graph.

Kruskal's algorithm I

- Create a forest F (a set of trees), where each node in the graph is a separate tree.
- Create an ordered list L containing all the edges in the graph.
- While L is non-empty and F is not yet spanning:
  - remove the edge with minimum weight from L;
  - if the removed edge connects two different trees, add it to the forest F, combining the two trees into a single tree.
At termination, the forest forms a minimum spanning forest. If the graph is connected, the forest has a single connected component: a minimum spanning tree.

Kruskal's algorithm II

Algorithm Kruskal
Require: graph G = (V, E)
Ensure: tree F
1: F ← V                           ▷ each node starts as its own component
2: L ← sort(E)
3: for (u, v) ∈ L do
4:   if component(u) ≠ component(v) then
5:     F ← F ∪ (u, v)
6:   end if
7: end for
8: Return F

Kruskal example

With L = [(G,F), (C,A), (C,D), (G,E), (C,B), (C,E), (F,C), (C,G)] sorted by weight, the first six edges, (G,F), (C,A), (C,D), (G,E), (C,B), and (C,E), each join two different trees and are added to the forest; the remaining edges (F,C) and (C,G) would close a cycle and are discarded, leaving the minimum spanning tree.

Graphs and MST exercise 4

Takeaways: Greedy algorithms rely on two principles: local optimality and optimal substructure. Kruskal's MST algorithm is a greedy algorithm. The MST drops all loops in a graph.
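A minimal Kruskal sketch in Python, with a small union-find (disjoint-set) structure playing the role of `component` in the pseudocode. The weight assigned to each edge is inferred from the example figure and is an assumption where the slide is not legible.

def kruskal(nodes, edges):
    """edges: list of (weight, u, v). Returns the MST as a list of edges."""
    parent = {v: v for v in nodes}          # union-find: each node is its own tree

    def find(v):                            # representative of v's component
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    mst = []
    for w, u, v in sorted(edges):           # edges in increasing weight order
        ru, rv = find(u), find(v)
        if ru != rv:                        # different trees: keep the edge
            parent[ru] = rv                 # union the two components
            mst.append((u, v, w))
    return mst

# Weights inferred from the slides' figure (assumption):
edges = [(1, "G", "F"), (2, "C", "A"), (2, "C", "D"), (2, "G", "E"),
         (3, "C", "B"), (4, "C", "E"), (7, "F", "C"), (15, "C", "G")]
print(kruskal("ABCDEFG", edges))
# -> [('G','F',1), ('C','A',2), ('C','D',2), ('G','E',2), ('C','B',3), ('C','E',4)]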
06 PageRank

How many websites are there?

Who is this person? Larry Page (computer scientist, Google co-founder; photo from the European Parliament, 2009).

Goal

PageRank is an algorithm used by Google Search to rank web pages in its search-engine results. PageRank is a way of measuring the importance of website pages.

Idea

PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites; therefore, a website is considered relevant if it is linked to by other highly relevant websites.

Can you guess which algorithmic method might be used to calculate this relevance?

Website relevance computation

(Worked example on four pages A-D: every page starts with the same relevance and, at each iteration, splits its current relevance evenly among its outgoing links; a page's new relevance is the sum of the contributions arriving on its incoming links.)

Can you identify the primary challenge or limitation of this algorithm?

Random walk method

The resulting graph representing the World Wide Web (WWW) is too large, even using a sparse representation such as adjacency lists. To solve the problem by sampling, we introduce random walks: the random-walk model simulates a human clicking links completely at random, i.e., a random surfer.

Random walk

A random walk on a graph is a process that starts at a given vertex and, at each time step, transitions to a neighboring vertex. In an unweighted graph, the next vertex is chosen uniformly at random from the current vertex's neighbors. In a weighted graph, the next vertex is selected according to a probability distribution determined by the weights of the edges.

Is updating page relevance by following random walks alone sufficient to solve the problem?

Mitigating random walk issues

- The World Wide Web is not fully connected; it consists of millions of disconnected components (spider traps).
- Many pages have no outgoing links (also known as dangling pages).

Solution: teleport. At each time step, the random surfer has two options: with probability β, follow a link at random; with probability 1 − β, jump to some random page. If there are no outgoing links, always teleport. β is typically in the range 0.8 to 0.9.

Algorithm

Algorithm Simplified PageRank
Require: graph G = (V, E), damping factor β, and iterations N
Ensure: scores PR
1: for v ∈ V do
2:   PR(v) ← 1/|V|                       ▷ Initialize PageRank values uniformly
3: end for
4: v ← rand_int(|V|)                     ▷ Select the initial vertex id at random
5: for n ∈ [0, …, N] do
6:   for u ∈ v.adjacentNodes() do
7:     PR(v) += PR(u)/u.outDegree()      ▷ Contribution from incoming links
8:   end for
9:   if β …
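The slide's pseudocode is cut off in the transcript after line 9. A small sketch, assuming the teleport rule described above (with probability β follow a random outgoing link, otherwise, or on a dangling page, jump to a random page), estimates PageRank as the visit frequencies of the teleporting random surfer; the graph encoding is illustrative, and this is not Google's implementation.

import random

def pagerank_surfer(graph, beta=0.85, steps=100_000, seed=0):
    """Random-surfer sketch: estimate PageRank as visit frequencies of a
    teleporting random walk. graph: dict node -> list of linked nodes."""
    rng = random.Random(seed)
    nodes = list(graph)
    visits = {v: 0 for v in nodes}
    v = rng.choice(nodes)                       # start at a random page
    for _ in range(steps):
        visits[v] += 1
        out = graph[v]
        if out and rng.random() < beta:         # with probability β: follow a link
            v = rng.choice(out)
        else:                                   # else, or dangling page: teleport
            v = rng.choice(nodes)
    return {v: c / steps for v, c in visits.items()}

# Toy web: D is a dangling page; C is the most-linked page.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A", "D"], "D": []}
ranks = pagerank_surfer(web)
print(max(ranks, key=ranks.get))   # -> 'C'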
Introduction to Algorithm Complexity

Conditional statements

>>> if <condition>:
>>>     A
>>> elif <condition>:
>>>     B
>>> else:
>>>     C

Only one of the code blocks (A, B, or C) will be executed. The resulting complexity is the largest one. Example: if the complexities are A = O(n), B = O(n²), and C = O(log(n)) respectively, we report a complexity of O(n²).

Iterative statements I

>>> for i in range(0, n):
>>>     A

First, we compute the complexity of the code block A; then, we multiply it by the number of times we iterate. Example: if A has complexity O(1), we report a complexity of O(n) for the loop.

Iterative statements II

range(0, n) → O(n) iterations; range(0, n, 2) → O(n/2); a step of 0.5 would give O(2n) iterations (conceptually; Python's range only accepts integer steps). Constant factors are dropped, so all of these are O(n).

Nested iterative statements

>>> for i in range(0, n):
>>>     for j in range(0, n):
>>>         A

First, we compute the complexity of the code block A; then, we multiply it by the number of times we iterate, considering both loops. Example: if A has complexity O(n), we report a complexity of O(n³) for the loop nest.

Can you guess the complexity of a while loop?

While statement

>>> while <condition>:
>>>     A

Generally, it has the same complexity as an iterative loop. Example: what is the complexity of this code?

>>> i = 1
>>> while i < n:
>>>     i *= 2

Since i doubles on every iteration, the loop runs about log2(n) times: O(log(n)).

Introduction to algorithm complexity notebook exercise 5

Takeaways: Each statement has a different complexity. To evaluate the total complexity of an algorithm, we sum the complexities of all its statements and select the dominating term. In while loops, we consider the worst-case complexity.

04 Space complexity

Idea

The space complexity of an algorithm is the amount of memory required to solve an instance of the problem for a given input. Like time complexity, space complexity is expressed using O notation. It includes the input space and the auxiliary space (temporary variables).

Evaluate space complexity

We need to know the amount of memory used by the different data types, but memory allocation in Python is not straightforward. For simplicity we will use the following convention:

Type      Size
boolean   1 byte
char      2 bytes
int       4 bytes
double    8 bytes

We will ignore the extra memory requirements of composite types (lists, numpy types, ...), as well as the space used by the instructions and the call stack.

Introduction to algorithm complexity notebook exercise 6

Takeaways: Space complexity considers the input size and the temporary variables. Space complexity also uses O(n) notation. Python uses objects to store most data types, which makes it difficult to compute exact memory requirements.

Questions?
