M269 CHAPTER 22.pdf
Document Details
Uploaded by CourteousNewton
Tags
Full Transcript
CHAPTER 22 BACKTRACKING In Chapter 11 you learned about exhaustive search, also called brute-force search or generate and test in M269. It iteratively generates candidates and tests them to check if they are solutions. This usually leads to generating far more candidates than there are solutions. Th...
CHAPTER 22 BACKTRACKING In Chapter 11 you learned about exhaustive search, also called brute-force search or generate and test in M269. It iteratively generates candidates and tests them to check if they are solutions. This usually leads to generating far more candidates than there are solutions. There are some techniques to prune the search space, i.e. to generate fewer candidates, but they depend on the problem. This chapter introduces backtracking, a search technique that is elegant and concise (because it’s recursive), efficient (because it can prune the search space substantially) and systematic (because it always follows the same template). The technique only applies to problems where each solution (and therefore each candidate) is a collection of items, because backtracking relies on generating each candidate incrementally, one item at a time. Section 1 of this chapter introduces a recursive exhaustive search for sequences of items that satisfy some constraints. Section 2 turns exhaustive search into a backtracking algorithm by making a small change that prunes the search space. Section 3 shows a typical example of applying backtracking: solving a puzzle. Sections 4 and 5 show how backtracking can also solve optimisation problems, where one best among several possible solutions is asked for, using the travelling salesman problem as an example. Finally, Sections 6 and 7 explain how to adapt backtracking to solve problems that are about sets of items instead of sequences of items, using the 0/1 knapsack problem as an example. Info: Some authors use the term ‘exhaustive search’ to also include backtracking, but the latter in fact avoids searching all candidates. This chapter supports these learning outcomes: Understand the common general-purpose data structures, algorithmic techniques and complexity classes - you will learn about backtracking, which is more efficient than brute-force search. Write readable, tested, documented and efficient Python code – this chapter introduces code templates for backtracking that you can adapt to solve various problems. Before starting to work on this chapter, check the M269 news and errata, and check the TMAs for what is assessed. 22.1 Generate sequences Backtracking can solve many constraint satisfaction and optimisation problems. In M269 we restrict backtracking to problems on sets of items or on sequences of unique items, i.e. without duplicates. We will first look at constraint satisfaction problems on sequences. Here’s one, admittedly contrived. Given an integer n > 2, obtain all permutations of 1,... , n such that: the first and last numbers are at least n / 2 apart (range constraint) the sequence starts with an odd number and then alternates even and odd numbers (parity constraint). For n = 3, only permutations (1, 2, 3) and (3, 2, 1) satisfy both constraints. Permutations like (2, 1, 3) satisfy neither: the difference between the first and last numbers is 1, but it should be at least 1.5 the permutation starts with an even number and has consecutive odd numbers. I first solve the problem with a recursive exhaustive search, because backtracking is based on that. A brute-force search for the solutions to the above problem generates all possible permutations and tests each permutation against both constraints, to decide if it’s a solution. To test the range constraint I compute the difference between the first and last numbers with Python function abs, introduced in Section 18.2. : def satisfies_range(numbers: list, n: int) -> bool: """Check if first and last of numbers are at least n/2 apart. Preconditions: - numbers is a list of integers; len(numbers) >= 2 -n>2 """ return abs(numbers - numbers[-1]) >= n / 2 To test the parity constraint, I must verify there’s an odd number at index 0, an even number at index 1, an odd number at index 2, etc. The general rule is that each number and its index must have different parities. If there are no numbers (the list is empty) then the constraint is satisfied. : def satisfies_parity(numbers: list) -> bool: """Check if numbers is an odd, even, odd,... sequence. (continues on next page) 650 Chapter 22. Backtracking Algorithms, data structures and computability (continued from previous page) Preconditions: numbers is a list of integers """ for index in range(len(numbers)): if index % 2 == numbers[index] % 2: return False return True The test for whether a permutation candidate is a solution is simply: : def is_solution(permutation: list, n: int) -> bool: """Check if permutation satisfies the range and parity constraints. Preconditions: n > 2 and permutation is a rearrangement of 1,...,␣ ˓→n """ return satisfies_range(permutation, n) and satisfies_ ˓→parity(permutation) Now all that remains is to generate the permutations. This could be done iteratively with Python’s permutations function introduced in Section 11.4.4. However, backtracking requires each permutation to be generated one item at a time. (I’ll explain why in Section 2.) We need a different way of generating permutations. 22.1.1 Recursive generation To generate one permutation with pencil and paper, we could start with an empty sequence and take the numbers one by one, in any order, from the set {1,... , n}, appending each one to the sequence. When the set becomes empty, the sequence is a permutation of the numbers from 1 to n. To generate all permutations, we need to systematically go back to an earlier choice (e.g. when we picked the first number in the permutation) and make a different one. The following tree shows the decision process for n = 3. Each node represents the sequence created so far and the available set of choices. Initially the sequence is empty and all numbers are available. The children of a node are all the possible ways of extending the sequence in that node. When the set is empty, the sequence can’t be further extended. Nodes with the empty set are the leaves of the tree and contain the permutations. Figure 22.1.1 22.1. Generate sequences 651 Algorithms, data structures and computability The sequences in the leaves are the complete candidates, which for this problem are the permutations. The sequences in the other nodes are the partial candidates. Each set in a node is the extensions for that node’s candidate. A candidate is complete when it has no extensions. A solution is a complete or partial candidate that satisfies the constraints. For this problem, all solutions are complete candidates (permutations). Because of the constraints imposed, not every complete candidate is a solution. Since the candidates can be organised in a tree, all we need to generate them is a recursive tree-traversal algorithm. The algorithms in Chapter 16 went through an existing tree. It would be a waste of time and memory to create the whole tree in advance, with the partial candidates and their extensions, as we’re only interested in those complete candidates that are solutions. A better approach is to create the nodes as they’re visited. In fact, we don’t need to create node objects with pointers to children: we only need the content of nodes (the candidates and their extensions), which is much simpler and efficient. The following is a recursive exhaustive search. It generates all partial and complete candidates and their extensions, and tests complete candidates (the permutations) against the constraints. If the permutation is a solution then it’s appended to a sequence of solutions, so that we keep the solutions in the order they’re found. I’ve also added some print statements to follow what the search does. : def extend(candidate: list, extensions: set, n: int, solutions: list) ˓→> None: """Add to solutions all valid permutations that extend candidate. Preconditions: n > 2 and - candidate is a list of integers between 1 and n - extensions is a set of integers between 1 and n - candidate and extensions have no integer in common """ print('Visiting node', candidate, extensions) if len(extensions) == 0: # leaf node: candidate is complete print('Testing candidate', candidate) if is_solution(candidate, n): solutions.append(candidate) else: # create and visit children nodes for item in extensions: (continues on next page) 652 Chapter 22. Backtracking Algorithms, data structures and computability (continued from previous page) extend(candidate + [item], extensions - {item}, n,␣ ˓→solutions) Like all recursive algorithms, it has a base case (there are no extensions) and a reduction step (remove one item from the extensions) to make progress towards the base case. Being a tree-traversal function, I must call it on the root node: the empty candidate and the full set of extensions. I must also initialise the solutions sequence. : def valid_permutations(n: int) -> list: """Return all valid permutations of 1,..., n in the order␣ ˓→generated.""" candidate = [] extensions = set(range(1, n+1)) # {1,..., n} solutions = [] extend(candidate, extensions, n, solutions) return solutions print('Solutions:', valid_permutations(3)) Visiting node [] {1, 2, 3} Visiting node {2, 3} Visiting node [1, 2] {3} Visiting node [1, 2, 3] set() Testing candidate [1, 2, 3] Visiting node [1, 3] {2} Visiting node [1, 3, 2] set() Testing candidate [1, 3, 2] Visiting node {1, 3} Visiting node [2, 1] {3} Visiting node [2, 1, 3] set() Testing candidate [2, 1, 3] Visiting node [2, 3] {1} Visiting node [2, 3, 1] set() Testing candidate [2, 3, 1] Visiting node {1, 2} Visiting node [3, 1] {2} Visiting node [3, 1, 2] set() Testing candidate [3, 1, 2] Visiting node [3, 2] {1} Visiting node [3, 2, 1] set() Testing candidate [3, 2, 1] Solutions: [[1, 2, 3], [3, 2, 1]] As you can see, the algorithm tests all 3! = 6 permutations of numbers 1 to 3, but only two of them are solutions. If you follow the nodes visited with your finger on the tree diagram, you see that the algorithm is doing a pre-order traversal of the tree. After visiting a leaf and testing the complete 22.1. Generate sequences 653 Algorithms, data structures and computability candidate in it, the algorithm ‘unwinds’ (because leaves have no children) to the last node with yet unvisited subtrees and traverses the next subtree. For example, if you look at the printed output and at the tree diagram, after producing permutation [1, 3, 2], there are no further subtrees to explore for partial candidate. The execution of the algorithm is back in the for-loop of the call on the root node extend([], {1,2,3}, 3, solutions) and goes into the next iteration, with item being 2. The next recursive call is extend(, {1, 3}, 3, solutions), which starts traversing the middle subtree. At this point you may be rightly thinking that this recursive brute-force search is less efficient than an iterative one, because it also generates and visits all partial candidates as the initially empty candidate is extended one item at a time. For n = 3, there are 10 partial and only 6 complete candidates. One advantage of the incremental generation approach is that it can also solve problems where the solution sequences don’t have all items, i.e. where some partial candidates are solutions too. 22.1.2 Accept partial candidates Let’s change the problem so that solutions can be sequences with only some of the numbers from 1 to n, as long as they satisfy both constraints. All permutations satisfying the range and parity constraints are still solutions, but shorter sequences may be solutions too. For example, for n = 4, (1, 2, 3, 4) is the only permutation that is a solution, but sequences (1, 2, 3) and (3, 2, 1) are solutions too: they each have a difference of at least 4 / 2 = 2 between the first and last numbers and they alternate odd and even numbers. To solve this problem I must make two changes. First, the tree traversal must check every candidate, not just the complete candidates. (I don’t repeat the docstring and print statements.) : def extend(candidate: list, extensions: set, n: int, solutions: list) ˓→> None: if is_solution(candidate, n): solutions.append(candidate) for item in extensions: extend(candidate + [item], extensions - {item}, n, solutions) The change I made was removing the if-else statement that checked if the candidate is complete. The for-loop won’t do anything for complete candidates because they have no extensions. The second change is to the function that tests the range constraint, which requires the sequence to have at least two numbers. Previously, we could assume that as part of the preconditions, because only complete candidates (permutations of n > 2) were tested. Now that we test all partial candidates, we must explicitly check they have at least two numbers. : def satisfies_range(numbers: list, n: int) -> bool: """Check if first and last of numbers are at least n/2 apart. (continues on next page) 654 Chapter 22. Backtracking Algorithms, data structures and computability (continued from previous page) Preconditions: numbers is a list of integers; n > 2 """ return len(numbers) > 1 and abs(numbers - numbers[-1]) >= n / 2 Let’s confirm that we now find more sequences for n = 4. : valid_permutations(4) : [[1, 2, 3], [1, 2, 3, 4], [1, 4], [1, 4, 3], [3, 2, 1], [3, 4, 1]] To sum up, a constraint satisfaction problem on sequences with unique items can be solved with a brute-force search that generates all candidates in a recursive and incremental way, starting from the empty sequence and extending it by one item at a time. Depending on the problem, all candidates or only the complete ones are tested against the constraints. The next section shows the main advantage of incremental generation: a simple change will make the search much more efficient. 22.2 Prune the search space Backtracking takes the brute-force search of the previous section and adds a simple but powerful idea: since the candidates are generated incrementally, one item at a time, stop extending a candidate as soon as it’s clear it won’t lead to a solution. This substantially reduces the number of candidates generated, making backtracking much more efficient than brute-force. Let’s see backtracking in action on the problem of the previous section: find all sequences of non-repeated numbers, taken from 1 to n > 2, such that 1. the first and last numbers are at least n / 2 apart (range) 2. the numbers are odd, even, odd, even,... (parity). The sequences don’t have to be permutations of 1 to n: they can include only some of the n numbers. Here again is the code that checks these constraints. : def satisfies_range(candidate: list, n: int) -> bool: """Check if first and last numbers in candidate are at least n/2␣ ˓→apart. Preconditions: candidate is a list of integers; n > 2 """ return len(candidate) > 1 and abs(candidate - candidate[-1]) >=␣ ˓→n / 2 (continues on next page) 22.2. Prune the search space 655 Algorithms, data structures and computability (continued from previous page) def satisfies_parity(candidate: list) -> bool: """Check if candidate is an odd, even, odd,... sequence. Preconditions: candidate is a list of integers """ for index in range(len(candidate)): if index % 2 == candidate[index] % 2: return False return True 22.2.1 Local and global constraints The key insight that enables pruning the search space is that the two constraints for this problem are of different nature. The range constraint involves the first and last numbers of the sequence and therefore can only be checked on the whole sequence: it’s a global constraint. The second constraint is about the parity of each number, independently of the other numbers: it’s a local constraint. If a partial candidate P doesn’t satisfy a global constraint, an extension of P may satisfy it because of the added items. For example, for n = 3 the sequence (1, 2) doesn’t satisfy the range constraint but (1, 2, 3) does. We therefore must keep extending a candidate that fails a global constraint. However, if partial candidate P doesn’t satisfy a local constraint, neither does any candidate C that extends P because the item in P that breaks the local constraint is also in C. For example, if P starts with an even number, or has two consecutive odd numbers, then so does any extension of P. This means that there’s no point in extending a partial candidate that violates a local constraint: it won’t lead to a solution. If a candidate fails the local constraints, a backtracking algorithm goes immediately back (hence its name) to a previous partial candidate and tries a different way to extend it. In terms of tree traversal, if a node has a candidate that fails the local constraints, a backtracking algorithm doesn’t traverse the subtree rooted at that node: it instead goes back to the node’s parent. From there it starts traversing the next sibling subtree. If there’s no sibling, the algorithm backtracks to the grandparent, and so on. The changes to the previous recursive generate-and-test algorithm are minor: I simply add a new base case. If the candidate fails the local constraints, the algorithm returns (backtracks) immediately instead of recursively extending the candidate. : def extend(candidate: list, extensions: set, n: int, solutions: list) ˓→> None: """Add to solutions all valid permutations that extend candidate. Preconditions: n > 2 and - candidate is a list of integers between 1 and n (continues on next page) 656 Chapter 22. Backtracking Algorithms, data structures and computability (continued from previous page) - extensions is a set of integers between 1 and n - candidate and extensions have no integer in common """ print('Visiting node', candidate, extensions) # base case 1: backtrack if not satisfies_parity(candidate): return # base case 2: candidate is solution # local constraint is satisfied, so only check global constraint if satisfies_range(candidate, n): solutions.append(candidate) for item in extensions: extend(candidate + [item], extensions - {item}, n, solutions) The main function remains the same. : def valid_permutations(n: int) -> list: """Return all valid permutations of 1,..., n in the order␣ ˓→generated.""" candidate = [] extensions = set(range(1, n+1)) # {1,..., n} solutions = [] extend(candidate, extensions, n, solutions) return solutions print('Solutions:', valid_permutations(3)) Visiting node [] {1, 2, 3} Visiting node {2, 3} Visiting node [1, 2] {3} Visiting node [1, 2, 3] set() Visiting node [1, 3] {2} Visiting node {1, 3} Visiting node {1, 2} Visiting node [3, 1] {2} Visiting node [3, 2] {1} Visiting node [3, 2, 1] set() Solutions: [[1, 2, 3], [3, 2, 1]] Now only 10 of the 16 nodes of the full tree are visited. As you can see, sequences (1, 3), (2) and (3, 1) aren’t further extended because they break the parity constraint. As suggested in Chapter 15, we can further reduce the search space by not even generating those sequences, since they will be rejected. 22.2. Prune the search space 657 Algorithms, data structures and computability 22.2.2 Avoid visits The algorithm currently first extends a candidate and then checks whether it should backtrack. Since the local constraint applies to each number individually, we can check the parity of the chosen extension number before appending it to the sequence. In other words, instead of creating and visiting a node and then backtracking if needed, we avoid creating the node in the first place. This further prunes the search space. Here’s the new version, without repeating the docstring. The check for the local parity constraint moves to the for-loop: if the chosen number can extend the current sequence, it will. : def extend(candidate: list, extensions: set, n: int, solutions: list) ˓→> None: print('Visiting node', candidate, extensions) if satisfies_range(candidate, n): solutions.append(candidate) for item in extensions: if can_extend(item, candidate): # added line extend(candidate + [item], extensions - {item}, n,␣ ˓→solutions) Instead of an explicit base case that goes back to a previous candidate if the current one doesn’t satisfy the local constraint, the new version avoids extending candidates with items that fail the local constraint. The new auxiliary function checks the local constraint on a single number instead of a sequence and hence is simpler than satisfies_parity. : def can_extend(item: int, candidate: list) -> bool: """Check if extending candidate with item can lead to a solution."" ˓→" # the number and the index where it will be must have different␣ ˓→parity return item % 2 != len(candidate) % 2 valid_permutations(3) Visiting node [] {1, 2, 3} Visiting node {2, 3} Visiting node [1, 2] {3} Visiting node [1, 2, 3] set() Visiting node {1, 2} Visiting node [3, 2] {1} Visiting node [3, 2, 1] set() : [[1, 2, 3], [3, 2, 1]] As expected, no partial candidate with consecutive numbers of the same parity or starting with an even number is generated. Now only 7 of the 16 nodes are created and visited. The search space has more than halved. 658 Chapter 22. Backtracking Algorithms, data structures and computability To emphasise how much more efficient backtracking can be, consider n = 10 and only complete candidates, i.e. permutations of the ten numbers. There are 5 × 9! ≈ 1.8 million permutations starting with one of the five even numbers (2, 4, 6, 8, 10) followed by a permutation of the other nine numbers. There are further 5 × 4 × 8! ≈ 800 thousand permutations starting with two consecutive odd numbers, followed by a permutation of the other eight numbers. Backtracking won’t generate these 2.6 million permutations and thousands of other ones with consecutive numbers of the same parity, whereas brute-force search generates all 10! ≈ 3.6 million permutations. To sum up, backtracking can efficiently find sequences of items subject to global and local constraints because it generates candidates incrementally. Global constraints are checked on all candidate sequences or only on the complete ones, depending on the problem. Local constraints are checked for each extension being considered, to avoid generating candidates that won’t lead to solutions. This usually prunes vast parts of the search space. If a problem has no local constraints then backtracking becomes brute-force search because all candidates have to be generated. The next section shows a typical problem that can be solved with backtracking. 22.3 Trackword Trackword is a puzzle to find as many words with three to nine letters as we can in a 3×3 grid of letters. We can start in any square and move to any adjacent square, without visiting any square twice. The grid always contains a 9-letter word. Here’s an example created by a fan site. The figure shows the grid, the path for the 9-letter word ‘retrieves’ and two paths for the word ‘tree’. The grid also includes words ire, set, tee, sever, reset, reverse, serve, etc. Figure 22.3.1 For this and other word problems we need a list of valid words. The notebooks folder for this chapter includes the public-domain Enable list, which contains long and inflected words (past tense, plural, etc.). The words in the list are in lowercase, so the grid will be in lowercase too. Info: Enable and other word lists are available from the The National Puzzler’s League. The following code reads words with three to nine letters from the file. It uses Python constructs outside the scope of M269. 22.3. Trackword 659 Algorithms, data structures and computability : vocabulary = set() with open('enable1.txt') as file: for line in file: word = line.strip() if 3 None: """Extend the path with the squares. Add valid words to solutions." ˓→"" (continues on next page) 22.3. Trackword 661 Algorithms, data structures and computability (continued from previous page) if is_word(path, grid, valid): # check the global constraints solutions.append(path) As I write the last line I stop in my tracks. The candidates are paths in the grid, but the problem asks for the words, so I must convert each path of squares to the corresponding string of letters. : def path_to_string(path: list, grid: list) -> str: """Return the sequence of letters visited by the path in the grid." ˓→"" string = '' for square in path: string = string + grid[square][square] return string Now I can write the complete backtracking algorithm. : def extend(path: list, squares: set, grid: list, valid: set, solutions: ˓→ list) -> None: """Extend the path with the squares. Add valid words to solutions." ˓→"" if is_word(path, grid, valid): solutions.append(path_to_string(path, grid)) for square in squares: if can_extend(square, path, grid, valid): extend(path + [square], squares - {square}, grid, valid,␣ ˓→solutions) Finally, let’s implement the auxiliary functions. 22.3.3 The constraints The global constraints on a path are: it has three to nine letters and they form a valid word. Since only words with three to nine letters were read from the file, I can simply check if the path forms a word. : def is_word(path: list, grid: list, valid: set) -> bool: """Check if the letters in the path form a valid word.""" return path_to_string(path, grid) in valid As for the local constraint, we must check if the next square is adjacent to the last square in the path so far. If the path is empty, it can be extended with any square. : def can_extend(square: tuple, path: list, grid: list, valid: set) ->␣ ˓→bool: """Check if square is adjacent to the last square of path.""" if path == []: (continues on next page) 662 Chapter 22. Backtracking Algorithms, data structures and computability (continued from previous page) return True last_square = path[-1] last_row = last_square last_column = last_square row = square column = square return abs(row - last_row) < 2 and abs(column - last_column) < 2 That’s it! Let’s check it works. There are probably many words in this grid, so let’s just see a few of them. : words = trackword(['set', 'err', 'vei'], vocabulary) print('Total paths:', len(words)) print('Found 9-letter word?', 'retrieves' in words) print('First 10:', words[:10]) Total paths: 156 Found 9-letter word? True First 10: ['ere', 'err', 'errs', 'err', 'ere', 'ers', 'ere', 'eerie', ˓→'eerier', 'res'] Some words occur repeatedly in the output list because there are various paths to obtain them. We can easily compute how many unique words there are: : len(set(words)) # convert to set to remove duplicates : 59 Exercise 22.3.1 (optional) Looking at the code again, two improvements come to my mind. First, return a set instead of a sequence of words, to avoid duplicates. Second, remove unnecessary parameters. For example, function can_extend doesn’t need the grid and valid arguments. Copy all functions to the cell below, make the suggested changes and any others you wish and run the code to check it still finds 59 words in the grid. : # copy the functions to here len(trackword(['set', 'err', 'vei'], vocabulary)) == 59 22.3. Trackword 663 Algorithms, data structures and computability Exercise 22.3.2 (optional) Implement a more substantial change: represent each candidate as a sequence–string pair. This is a space–time tradeoff to avoid repeatedly converting a path to a string. When extending a candidate, append a square to the sequence and the corresponding letter to the string. 22.3.4 Template In general, to solve a problem with backtracking, first think of what the items, candidates and extensions represent and which are the global and local constraints. Then follow this backtracking solution template, replacing the generic function and variable names and docstrings with problem-specific ones, and removing unnecessary parameters. def problem(instance: object) -> list: """Return all solutions for the problem instance, in the order␣ ˓→generated.""" candidate = [] extensions =... solutions = [] extend(candidate, extensions, instance, solutions) return solutions def extend(candidate: list, extensions: set, instance: object,␣ ˓→solutions: list) -> None: """Add to solutions all extensions of candidate that solve the␣ ˓→problem instance.""" # remove next line if partial candidates can be solutions if len(extensions) == 0: if satisfies_global(candidate, instance): solutions.append(candidate) for item in extensions: if can_extend(item, candidate, instance): extend(candidate + [item], extensions - {item}, instance,␣ ˓→solutions) def satisfies_global(candidate: list, instance: object) -> bool: """Check if candidate satisfies the global constraints.""" return... def can_extend(item: object, candidate: list, instance: object) ->␣ ˓→bool: """Check if item may extend candidate towards a solution.""" return... 664 Chapter 22. Backtracking Algorithms, data structures and computability 22.4 Optimise I’ve shown you how to solve constraint satisfaction problems on sequences with backtracking. It shouldn’t be a surprise that backtracking can also solve optimisation problems. Instead of collecting all solutions in a list or set, the algorithm keeps only the best solution found so far. Let’s see an example. 22.4.1 The problem I take the earlier problem of finding sequences of numbers from 1 to n that satisfy range and parity constraints, and add an optimisation criterion: we want a sequence with a lowest sum. For example, for n = 4, the best sequence is (1, 4) because its sum 5 is lowest among all other sequences that satisfy both constraints. An optimisation problem asks to find a solution that minimises or maximises some value. We thus need a function that calculates the value of a candidate. For this problem, it’s simply the sum of the numbers in the sequence. : def sum(numbers: list) -> int: """Return the sum of the numbers.""" total = 0 for number in numbers: total = total + number return total With respect to the earlier solution for the constraint satisfaction problem, I need to make two changes. First, instead of appending each solution found to a sequence, the backtracking algorithm must receive the current best solution as input and update it when it finds a better solution. Second, the main function must compute some initial best solution and pass it to the backtracking function. Let’s do these changes in turn. 22.4.2 Keep the best The backtracking algorithm must compare each found solution to the current best and update the latter if the new solution is better. If it’s a minimisation problem, better means having a lower value, otherwise better means higher. To avoid recomputing the best solution’s value every time it’s compared to a new solution, I will trade space for time and keep a solution–value pair instead of just the best solution. Usually I represent a pair with a tuple, but to be able to update the pair, I use a list of length two with constants to name the indices. : SOLUTION = 0 VALUE = 1 I could instead have defined a class with two data fields, similar to the node classes for linked lists and binary trees. I’ll leave that as an optional exercise, if you wish to practise using 22.4. Optimise 665 Algorithms, data structures and computability classes. The backtracking function is largely the same as before. The solutions parameter is replaced by the best solution–value pair, which is updated when the current candidate is a better solution. I add some print statements to later follow the algorithm’s actions. : def extend(candidate: list, extensions: set, n: int, best: list) ->␣ ˓→None: """Update best if needed, and extend candidate.""" print('Visiting node', candidate, extensions) if satisfies_range(candidate, n): candidate_value = sum(candidate) if candidate_value < best[VALUE]: print('New best', candidate, 'with value', candidate_value) best[SOLUTION] = candidate best[VALUE] = candidate_value for item in extensions: if can_extend(item, candidate): extend(candidate + [item], extensions - {item}, n, best) The current best solution must be initialised before starting the search. Since the search updates the current best every time a better one is found, the initial best can be any solution, preferably one that is easy to construct. For this problem, sequence (1, 2,... , n) is a solution: it satisfies both constraints. : def best_sequence(n: int) -> list: """Return a lowest sum sequence that satisfies both constraints.""" candidate = [] extensions = set(range(1, n+1)) solution = list(range(1, n+1)) best = [solution, sum(solution)] extend(candidate, extensions, n, best) return best The constraint checking functions are as before. : def satisfies_range(candidate: list, n: int) -> bool: """Check if first and last numbers in candidate are at least n/2␣ ˓→apart. Preconditions: candidate is a list of integers; n > 2 """ return len(candidate) > 1 and abs(candidate - candidate[-1]) >=␣ ˓→n / 2 def can_extend(item: int, candidate: list) -> bool: """Check if extending candidate with item can lead to a solution."" (continues on next page) 666 Chapter 22. Backtracking Algorithms, data structures and computability (continued from previous page) ˓→" # the number and the index where it will be put must have␣ ˓→different parity return item % 2 != len(candidate) % 2 best_sequence(4) Visiting node [] {1, 2, 3, 4} Visiting node {2, 3, 4} Visiting node [1, 2] {3, 4} Visiting node [1, 2, 3] {4} New best [1, 2, 3] with value 6 Visiting node [1, 2, 3, 4] set() Visiting node [1, 4] {2, 3} New best [1, 4] with value 5 Visiting node [1, 4, 3] {2} Visiting node [1, 4, 3, 2] set() Visiting node {1, 2, 4} Visiting node [3, 2] {1, 4} Visiting node [3, 2, 1] {4} Visiting node [3, 2, 1, 4] set() Visiting node [3, 4] {1, 2} Visiting node [3, 4, 1] {2} Visiting node [3, 4, 1, 2] set() : [[1, 4], 5] As desired, the result is the sequence (1, 4) with lowest sum 5. This version visits 15 nodes. Can we further prune the search space? (All together now: oh yes, we can!) 22.4.3 Avoid worse candidates When using backtracking for an optimisation problem, we must think whether we can avoid generating candidates that won’t lead to a better solution than the current one. In this problem, as a sequence is extended, its sum grows because all extensions are positive numbers. When a sequence reaches or surpasses the sum of the current best solution, we know this sequence can’t lead to a better solution, with a lower sum, so we can stop extending it. To do this check, the can_extend function needs another parameter: the current best. : def can_extend(item: int, candidate: list, best: list) -> bool: """Check if item can extend candidate to a better solution than␣ ˓→best.""" return item % 2 != len(candidate) % 2 and item + sum(candidate) ␣ ˓→None: """Update best if a better extension of candidate is found.""" print('Visiting node', candidate, extensions) if satisfies_range(candidate, n): candidate_value = sum(candidate) if candidate_value < best[VALUE]: print('New best', candidate, 'with value', candidate_value) best[SOLUTION] = candidate best[VALUE] = candidate_value for item in extensions: if can_extend(item, candidate, best): # changed line extend(candidate + [item], extensions - {item}, n, best) best_sequence(4) Visiting node [] {1, 2, 3, 4} Visiting node {2, 3, 4} Visiting node [1, 2] {3, 4} Visiting node [1, 2, 3] {4} New best [1, 2, 3] with value 6 Visiting node [1, 4] {2, 3} New best [1, 4] with value 5 Visiting node {1, 2, 4} : [[1, 4], 5] As you can see, only six of the previous 15 nodes are created and visited. Once a solution with sum 5 is found, partial candidates starting with 3 aren’t generated because number 3 can only be followed by 2 or 4, both leading to sequences with sum at least 5 and thus not improving on the current best. Is this the best (pun intended) we can do to prune the search space? (All together: oh no, it isn’t!) 22.4.4 Start well I mentioned earlier that we can start with any solution because the search continuously improves on it until it finds the best solution. However, now that the algorithm uses the best solution to not generate candidates leading to equally good or worse solutions, we should start with an initial solution close to a best one, to further prune the search space. The trick is to think of a good solution that is easy to construct. Based on the fact that (1, 4) is a best solution for n = 4, I choose (1, n) or (1, 2, n) as the initial solution, depending on whether n is even or odd. (Remember that n > 2.) 668 Chapter 22. Backtracking Algorithms, data structures and computability : def best_sequence(n: int) -> list: """Return a lowest sum sequence that satisfies both constraints.""" candidate = [] extensions = set(range(1, n+1)) if n % 2 == 0: solution = [1, n] else: solution = [1, 2, n] best = [solution, sum(solution)] extend(candidate, extensions, n, best) return best best_sequence(4) Visiting node [] {1, 2, 3, 4} Visiting node {2, 3, 4} Visiting node [1, 2] {3, 4} Visiting node {1, 2, 4} : [[1, 4], 5] The final version only visits four of the initial version’s 15 nodes: the search space has been reduced by over 70%! To sum up, backtracking can solve optimisation problems by keeping track of the current best solution and its value, and by updating both when a better solution is found. For optimisation problems where extending a candidate worsens its value, e.g. makes the value larger but it’s a minimisation problem, the search space can be pruned by stopping extending a candidate when its value is equal to or worse than the current best. Further pruning can be obtained by constructing an initial solution close to a best one. 22.5 Back to the TSP In this section, we solve an optimisation problem with backtracking: the travelling salesman problem (TSP). Function: tsp Inputs: graph, a weighted undirected graph Preconditions: graph is complete; the nodes are 0, 1,...; the weights are positive Output: shortest, a sequence of nodes of graph Postconditions: shortest is a tour that starts and ends at node 0 no other tour of graph has a lower total weight than shortest Real networks seldom have connections between all nodes, but we can model any network as a complete graph by adding the missing edges with infinite weight. This prevents them from 22.5. Back to the TSP 669 Algorithms, data structures and computability being selected for a shortest tour. Here’s a graph to test the algorithm. Edges 0–2 and 1–4 have infinite weight, to show they’re missing in the network being modelled. Figure 22.5.1 The graph has 4! = 24 tours that start and end at node 0, but only those of finite length exist in the network modelled by the graph: (0, 1, 2, 3, 4, 0), (0, 1, 2, 4, 3, 0), (0, 1, 3, 2, 4, 0), (0, 3, 1, 2, 4, 0) and the reverse sequences. The shortest tours are (0, 3, 1, 2, 4, 0) and (0, 4, 2, 1, 3, 0) with length 50. : import math %run -i../m269_digraph.py # (un)weighted directed graph classes %run -i../m269_ungraph.py # undirected graph classes inherit from␣ ˓→directed example = WeightedUndirectedGraph() for node in range(5): example.add_node(node) example.add_edge(0, 1, 20) example.add_edge(0, 2, math.inf) (continues on next page) 670 Chapter 22. Backtracking Algorithms, data structures and computability (continued from previous page) example.add_edge(0, 3, 10) example.add_edge(0, 4, 5) example.add_edge(1, 2, 10) example.add_edge(1, 3, 5) example.add_edge(1, 4, math.inf) example.add_edge(2, 3, 10) example.add_edge(2, 4, 20) example.add_edge(3, 4, 10) Previously, we started with high-level questions about what the items, the candidates and the solutions are, to have a good understanding of the problem before solving it. Another approach is to let the backtracking code template guide the questions. 22.5.1 The main function The TSP is an optimisation problem on tours, which can be represented as sequences of nodes. So let’s start with the main function template for optimisation problems on sequences. SOLUTION = 0 VALUE = 1 def problem(instance: object) -> list: """Return the best solution and its value for the problem instance. ˓→""" candidate = [] extensions =... solution =... best = [solution, value(solution)] extend(candidate, extensions, instance, best) return best Let’s go through it line by line. The problem is the TSP and the instance is a weighted undirected graph, so the function header becomes: def tsp(graph: WeightedUndirectedGraph) -> list: Usually, the initial candidate is the empty sequence and the initial extensions form a set of items to be added to the sequence, each item occurring at most once in a solution. For the TSP problem that’s not true: node 0 occurs at the start and end of each tour. To be able to generate tours, the root candidate must be sequence (0) and the extensions are all the nodes in the graph, so that node 0 can be added a second time to the sequence. candidate = extensions = graph.nodes() 22.5. Back to the TSP 671 Algorithms, data structures and computability As for initialising the best solution, I could construct the tour (0, 1, 2,... , n–1, 0) and compute its length. solution = sorted(graph.nodes()) + best = [solution, value(solution)] # function value to be written However, I take the opportunity to show you a trick that works for any optimisation problem on sequences, especially when there’s no easy way of constructing a good initial solution. We start with a ‘pseudo-solution’ (usually the empty sequence) and an infinitely high or low value, depending on whether it’s a minimisation or maximisation problem. This guarantees that the first solution found is necessarily better. The TSP is a minimisation problem, so we start with an infinitely high value. solution = [] best = [solution, math.inf] The problem only asks for the tour, not its length, so the last template line becomes: return best[SOLUTION] Let’s put all this together and make the docstring and variable names less generic. : SOLUTION = 0 VALUE = 1 def tsp(graph: WeightedUndirectedGraph) -> list: """Return a tour-length pair with shortest length. Preconditions: graph is complete, has nodes 0, 1,..., and positive weights Postconditions: tour starts and ends at node 0 """ path = nodes = graph.nodes() solution = [] shortest = [solution, math.inf] extend(path, nodes, graph, shortest) return shortest[SOLUTION] The backtracking algorithm, implemented by function extend, is mostly boilerplate. The problem-specific computations are in the auxiliary functions that compute the value of a candidate and check the constraints, so let’s tackle them next. 672 Chapter 22. Backtracking Algorithms, data structures and computability 22.5.2 The value function For any optimisation problem, we must compute the value of a candidate to know whether it improves on the best solution so far. For the TSP, what is a candidate and what does the value function compute? A candidate is a sequence of nodes representing a path starting at node 0. The function computes the total weight of the edges between consecutive nodes. The next exercise asks you to implement the value function. Before that, I recommend you uncomment and run the next code line to remind yourself of the available graph methods and their parameters. : # help(WeightedUndirectedGraph) Exercise 22.5.1 For this and the following exercises, you’re given a code template that you must adapt and complete for the TSP. Adapting the code means to: replace the generic docstrings and identifiers with problem-specific ones (you can press F, not Shift-F, in command mode to find and replace text in the current cell or in all cells) remove any unnecessary parameters. Adapt and complete the following value function template for the TSP. : def value(candidate: list, instance: object) -> int: """Return the value of the candidate sequence for the problem␣ ˓→instance.""" pass # replace with your code value([0, 1, 2, 3, 4, 0], example) == 55 Hint Answer 22.5.3 Checking the constraints Two auxiliary functions check the global and local constraints on candidates. For the TSP, candidates are paths (sequences of nodes) starting at node 0, and solutions are candidates representing tours. The questions to think about are as follows. Can partial candidates be solutions? A tour has all the graph’s nodes, so solutions must be complete candidates; that is, when there’s no further node to add to the sequence. What are the constraints on a complete candidate for it to be a solution? 22.5. Back to the TSP 673 Algorithms, data structures and computability The candidate path ends with node 0 and each node has an edge to the next one. Which constraints are local (can be checked on partial candidates) and which are global (must be checked on complete candidates)? The existence of edges between consecutive nodes is a local constraint. The sequence ending with node 0 is a global constraint. Exercise 22.5.2 Adapt and complete the next code template for the TSP. : def satisfies_global(candidate: list, instance: object) -> bool: """Check if the candidate satisfies the global constraints.""" pass # replace with your code Answer Section 22.4.3 mentions that if extending a candidate worsens its value, then it shouldn’t be further extended when it reaches the best value so far. For the TSP, does extending a candidate worsen its value? The value of a candidate path is its length: the total weights of its edges. The problem definition tells us weights are positive, so extending a path increases its length, which for a minimisation problem is a worse value. Exercise 22.5.3 Adapt and complete the next code template for the TSP. : def can_extend(item: object, candidate: list, instance: object, best:␣ ˓→list) -> bool: """Check if item can extend candidate into a better solution than␣ ˓→best.""" # replace... with a check for the local constraints # use < for a minimisation problem return... and value(candidate + [item]) > best[VALUE] Hint Answer 674 Chapter 22. Backtracking Algorithms, data structures and computability 22.5.4 The backtracking function Now we have all auxiliary functions in place for the backtracking algorithm. Exercise 22.5.4 Adapt the next code template to the TSP. Don’t forget to change the calls to satisfies_global, value and can_extend if you changed their names in the cells above. : def extend(candidate: list, extensions: set, instance: object, best:␣ ˓→list) -> None: """Update best if candidate is a better solution, otherwise extend␣ ˓→it.""" print('Visiting node', candidate, extensions) # remove the next line if partial candidates can be solutions if len(extensions) == 0: if satisfies_global(candidate, instance): candidate_value = value(candidate, instance) # in the next line, use < for minimisation problems if candidate_value > best[VALUE]: print('New best with value', candidate_value) best[SOLUTION] = candidate best[VALUE] = candidate_value for item in extensions: if can_extend(item, candidate, instance, best): extend(candidate + [item], extensions - {item}, instance,␣ ˓→best) Answer Finally uncomment the next line and run it to check your solution. : # tsp(example) in [ [0,3,1,2,4,0], [0,4,2,1,3,0] ] The current algorithm still does much wasted work. It keeps extending paths where the second node 0 appears early on, only to later fail the global constraint test. The next exercise asks you to avoid generating such paths. Note: After applying backtracking, look at the candidates generated and think if there are problem-specific ways of further pruning the search space. 22.5. Back to the TSP 675 Algorithms, data structures and computability Exercise 22.5.5 Make a copy of your extend function and change it so that a path is extended with node 0 only if it’s the last remaining extension. This guarantees that all complete candidates end with node 0 and you can remove the call to the global constraint check. Run the example again to see a substantial reduction in the search space. Answer 22.6 Generate subsets Backtracking can also solve problems on sets instead of sequences. The only thing that changes is the core tree-traversal algorithm. The rest – handling constraints, computing the value of each candidate, keeping track of the best solution so far, etc. – remains the same. So let’s see how a tree traversal can generate all subsets of the given items. When we tried to greedily solve the knapsack problem, we first put all items into a sequence, sorted from best to worst. The algorithm then picked one item at a time. If the knapsack still had capacity for that item, it was added to the output set; otherwise it was skipped. A tree-traversal algorithm to generate all subsets works in the same way. It starts with an empty candidate set and with all items in a sequence of extensions. Recursively, the algorithm takes each item in turn from the extensions and adds it to the candidate set or skips it. Here’s the tree of candidate–extension pairs for generating subsets of {1, 2, 3}. In each node, the candidate is the subset constructed so far and the sequence has the numbers yet to consider. Figure 22.6.1 The root’s candidate is the empty set and the root’s extensions are all items (here the numbers from 1 to 3), in any order. Each node has two children, both with the first extension removed. The left child adds the extension to the candidate; the right child doesn’t. Like for the tree of permutations, the complete candidates, with empty extensions, are in the leaves of this tree. The other nodes have partial candidates. Due to skipping items, some subsets occur multiple times, e.g. two nodes have subset {1, 2} and four nodes have the empty set. We only consider the leaves because only then do we know that no further items will be added. 676 Chapter 22. Backtracking Algorithms, data structures and computability The tree traversal for subsets looks like this, using s[i:] as shorthand for s[i:len(s)] (Section 4.9.1). : def extend(candidate: set, extensions: list, solutions: list) -> None: """Extend candidate with all subsets of extensions and add to␣ ˓→solutions.""" print('Visiting node', candidate, extensions) if len(extensions) == 0: # complete candidate solutions.append(candidate) else: item = extensions # head of sequence rest = extensions[1:] # tail of sequence extend(candidate.union({item}), rest, solutions) # add item extend(candidate, rest, solutions) # skip item Like for generating permutations, the base case is when the extensions are empty and the reduction step (the assignment to rest) removes one extension. Like for generating permutations, we must start the algorithm in the root node. : def all_subsets(n: int) -> list: """Return all subsets of 1,..., n in the order generated.""" candidate = set() extensions = list(range(1, n+1)) solutions = [] extend(candidate, extensions, solutions) return solutions print('Subsets:', all_subsets(3)) Visiting node set() [1, 2, 3] Visiting node {1} [2, 3] Visiting node {1, 2} Visiting node {1, 2, 3} [] Visiting node {1, 2} [] Visiting node {1} Visiting node {1, 3} [] Visiting node {1} [] Visiting node set() [2, 3] Visiting node {2} Visiting node {2, 3} [] Visiting node {2} [] Visiting node set() Visiting node {3} [] Visiting node set() [] Subsets: [{1, 2, 3}, {1, 2}, {1, 3}, {1}, {2, 3}, {2}, {3}, set()] If you follow how nodes are visited in the tree diagram, you’ll see the algorithm is making a pre-order traversal. It chooses to add the item (left subtree) before choosing to skip it (right subtree), so the full subset {1, 2, 3} is generated first and the empty subset is generated last. 22.6. Generate subsets 677 Algorithms, data structures and computability The algorithm could of course first choose to skip and then choose to add the item. It would generate the same subsets, in reverse order. You can swap the order of the code lines that add and skip an item, to see for yourself. Changing the order of the extensions, e.g. extensions = list(range(n, 0, -1)), also changes the order in which subsets are generated. I’ll leave it to you to try it out. Now that we have a recursive algorithm to generate each subset incrementally, we can search for a best subset that satisfies some constraints in exactly the same way as we did for sequences. The next section provides an example: the knapsack problem. 22.7 Back to the knapsack To recap, the 0/1 knapsack problem is as follows. Function: 0/1 knapsack Inputs: items, a set of pairs of integers; capacity, an integer Preconditions: no integer is negative; each pair represents a weight and a value Output: packed, a set of pairs of integers Postconditions: packed is a subset of items the sum of the weights in packed doesn’t exceed capacity the sum of the values in packed is largest among all sets satisfying the previous two conditions I’ll use the following example: : ITEMS = {(1, 2), (2, 3), (3, 4), (4, 20), (5, 30)} If the knapsack has a capacity of 4, it can hold the first and second items (weight 1 + 2 = 3, value 2 + 3 = 5) or the first and third items (weight 1 + 3 = 4, value 2 + 4 = 6) or the fourth item (weight 4, value 20). The latter is the desired output, because it has the largest value. Let’s follow the same procedure as before to solve this problem, with ‘stop and think’ lines for you to think along. 22.7.1 The problem The knapsack problem is obviously an optimisation problem on subsets of items, where the value of the items in the subset is the quantity being maximised. Each candidate is the subset of items already put in the knapsack and the corresponding extensions are the items yet to consider. 1. What are the global and local constraints for this problem? 2. Can partial candidates be solutions or only complete candidates? 678 Chapter 22. Backtracking Algorithms, data structures and computability 1. The only constraint is that the weight of the items in the subset cannot exceed the knapsack’s capacity. It’s a local constraint because it can be checked as each item to be added is considered. 2. As mentioned in the previous section, although partial candidates are subsets too, the solutions are the complete candidates, after all items have been considered. We can start implementing the solution, from the auxiliary functions towards the backtracking and main functions. 22.7.2 The value function Every optimisation problem needs a function to compute the value of a solution, so let’s start with that. As usual, I define constants for the indices of pairs, to make the code more readable. : WEIGHT = 0 VALUE = 1 def value(items: set) -> int: """Return the total value of the items.""" total = 0 for item in items: total = total + item[VALUE] return total 22.7.3 The constraints functions Next are the auxiliary functions to check the global and local constraints. Looking at the answers to the earlier questions, how does this problem differ from the other problems in this chapter? There are no global constraints. Any candidate that satisfies the local constraint (fits the capacity) is a solution, but perhaps not a best one. This means we don’t need a satisfies_global function. We must only check if an item can extend a candidate (the items already in the knapsack) towards a better solution than the current one. Let’s break that down into two parts. How can we check whether an item can extend a candidate towards a solution? Or put differently, how do we know if there’s no point in extending the given candidate with the given item? If the weight of the item plus the weight of the candidate exceeds the capacity, then the item shouldn’t be added to the knapsack. 22.7. Back to the knapsack 679 Algorithms, data structures and computability Now let’s assume that an item can extend a candidate towards a solution. Is there a way to know it won’t lead to a better solution than the current best? (Hint: does extending a candidate worsen its value?) The problem definition above states that items cannot have negative values. If an item can extend a candidate, we must extend it, because the new item may lead to a better (i.e. higher value) solution. There is no way of pruning the search space early. In summary, an item can extend a candidate if the sum of their weights doesn’t exceed the capacity. : def can_extend(item: tuple, candidate: set, capacity: int) -> bool: """Check if adding the item to candidate won't exceed the capacity." ˓→"" total = item[WEIGHT] for another_item in candidate: total = total + another_item[WEIGHT] return total None: """Update best if candidate is a better solution, then try to␣ ˓→extend it.""" print('Visiting node', candidate, extensions) if len(extensions) == 0: if satisfies_global(candidate, instance): candidate_value = value(candidate, instance) if candidate_value == best[VALUE]: # replace == with < or > print('New best with value', candidate_value) best[SOLUTION] = candidate best[VALUE] = candidate_value else: item = extensions rest = extensions[1:] if can_extend(item, candidate, instance, best): extend(candidate.union({item}), rest, instance, best) extend(candidate, rest, instance, best) 680 Chapter 22. Backtracking Algorithms, data structures and computability Let’s think what changes are needed for the knapsack problem. Which comparison should be used: less than or greater than? It’s a maximisation problem so the candidate is the new best if its value is greater than the current one. Looking at the auxiliary functions written before, can any code or parameters be removed? Yes, the value function only needs the candidate parameter, the can_extend function only needs the item, candidate and capacity, and the satisfies_global function isn’t needed at all. Finally, given the previous changes, what should the instance: object parameter in the function header be replaced with? It should be capacity: int, which is needed by can_extend. Here’s the backtracking function for the knapsack problem. : SOLUTION = 0 VALUE = 1 def extend(candidate: set, extensions: list, capacity: int, best:␣ ˓→list) -> None: """Update best if candidate is a better solution, then try to␣ ˓→extend it.""" print('Visiting node', candidate, extensions) if len(extensions) == 0: candidate_value = value(candidate) if candidate_value > best[VALUE]: print('New best with value', candidate_value) best[SOLUTION] = candidate best[VALUE] = candidate_value else: item = extensions rest = extensions[1:] if can_extend(item, candidate, capacity): extend(candidate.union({item}), rest, capacity, best) extend(candidate, rest, capacity, best) 22.7. Back to the knapsack 681 Algorithms, data structures and computability 22.7.5 The main function We finally have to think of what the main function needs to do. What are the candidate and the extensions of the root node? The initial candidate is the empty set and the extensions are all the items given in the input, in a sequence. What is a possible initial best solution? We need a solution that can be easily constructed. The only one I could think of is the empty set: it’s a solution of every problem instance though hardly ever the best one (unless no item fits in the knapsack). The main function is thus: : def knapsack(items: set, capacity: int) -> list: """Return a subset of items and their total value. Preconditions: - items is a set of weight-value pairs, both integers - no integer is negative Postconditions: - the output is a set-integer pair - total weight of the output items