L11 - Dynamic Programming.pdf
Uploaded by ZippyPelican
Full Transcript
CP312 Algorithm Design and Analysis I
LECTURE 11: DYNAMIC PROGRAMMING

Dynamic Programming

A technique for algorithm design used to solve optimization-type problems.

Similar to divide-and-conquer:
- Solves problems by combining solutions to smaller subproblems.

Unlike divide-and-conquer, dynamic programming is used when the subproblems are not independent and may repeat.
- DP solves each subproblem once and stores the result, so it can later be reused without re-solving the same subproblem.

Simple Example: Fibonacci Numbers

Consider the straightforward recursive algorithm for computing the n-th Fibonacci number:

  Fib(n):
    If n <= 2 then return 1
    Else return Fib(n-1) + Fib(n-2)

The recursion tree for Fib(7) branches into Fib(6) and Fib(5); those branch into Fib(5), Fib(4) and Fib(4), Fib(3); and so on. It keeps re-solving the same subproblems over and over again.

  T(n) = T(n-1) + T(n-2) + Theta(1)

The running time is exponential: O(2^n).

Consider two alternative ways of computing the Fibonacci number.

Top-down memoization: store the solutions of already-solved subproblems, then reuse the solutions when the subproblems are seen again.

  MemFib(n):
    If n <= 2 then return 1
    If A[n] is not defined
      A[n] = MemFib(n-1) + MemFib(n-2)
    return A[n]

(The array A of size n is initialized once, before the first call.)

Bottom-up iteration: solve subproblems starting with the base case and moving toward larger subproblems.

  IterFib(n):
    Initialize array A of size n
    For i = 1 to n:
      If i <= 2 then A[i] = 1
      Else A[i] = A[i-1] + A[i-2]
    return A[n]

Both have running time O(n). This is dynamic programming; it can be thought of as "smart recursion".

Recursion Trees and Subproblem Graphs

- A recursion tree models the execution of a pure recursive algorithm. For Fib(5) it contains repeated nodes such as Fib(3) and Fib(2).
- A subproblem graph has one node per distinct subproblem, and its edges denote dependencies: e.g. subproblem Fib(5) needs the solutions of subproblems Fib(4) and Fib(3). It identifies the unique number of subproblems needed to build a memoized DP solution.

Steps for Applying Dynamic Programming

1.
Identify optimal substructure
- An optimal solution to a problem contains optimal solutions to subproblems.

2. Identify overlapping subproblems
- A recursive solution contains a "small" number of distinct subproblems that are repeated many times.

3. Define a recursive formulation
- Assuming solutions to subproblems, recursively define the final solution.

4. Compute the value of the final optimal solution
- Store the solutions of subproblems in a table and build the final solution out of the stored values (top-down or bottom-up).
- A subproblem graph helps identify the order in which to compute the values.

Rod Cutting Problem

Problem: Given a rod of length n and a table of prices for each length of rod, determine the maximum revenue obtainable by cutting up the rod and selling the pieces.

Input: initial length n and a list of prices p_i for i = 1, ..., n
Output: the maximum revenue resulting from any rod cutting

Example: given a rod of length n = 4 and the lecture's list of prices, enumerate the possible ways of cutting the rod and the revenue for each way; the rod cutting that yields the highest revenue earns 10.

Naive solution: we can cut up a rod of length n in 2^(n-1) different ways, since we have an independent option of cutting, or not cutting, at each distance i from the left end. Running time: O(2^n). Exponential!

Steps for DP: Rod Cutting

1. Identify optimal substructure
- An optimal solution to a problem contains optimal solutions to subproblems.

2. Identify overlapping subproblems
- A recursive solution contains a "small" number of distinct subproblems that are repeated many times.

3. Define a recursive formulation
- Assuming solutions to subproblems, recursively define the final solution.

4.
Compute the value of the final optimal solution
- Store the solutions of subproblems in a table and build the final solution out of the stored values (top-down or bottom-up).
- A subproblem graph helps identify the order in which to compute the values.

Rod Cutting Problem

We can come up with a DP algorithm based on the following observations:
- The optimal solution contains a left piece of some length i and then a remaining part of length n - i.
- We then only have to consider how to divide the remaining part.

We have optimal substructure (proof omitted), and we have overlapping subproblems.

Rod Cutting: Recursive Formulation

Let r(n) denote the maximum revenue obtainable from a rod of length n. If we first cut the rod at position i and then recursively cut the remaining part of length n - i, we get

  r(n) = p_i + r(n - i)
         (profit from the left piece + profit from the remaining part)

How can we choose among all possibilities for the initial cut? We try them all and pick the best:

  r(n) = max over 1 <= i <= n of ( p_i + r(n - i) )

Straightforward recursion:

  Cut-Rod(p, n):
    If n == 0 then return 0
    q = -infinity
    For i = 1 to n
      q = max(q, p[i] + Cut-Rod(p, n - i))
    return q

The recursion tree for CR(4) expands CR(3), CR(2), CR(1), CR(0), and these subtrees repeat: the same subproblems are solved over and over again. Exponential time! How do we speed this up?

Rod Cutting: Memoized DP

From the subproblem graph, we can see:
- The dependencies between subproblems. For example, we need the solutions to CR(2), CR(1), and CR(0) to solve CR(3).
- The number of distinct subproblems is n + 1 for a problem (rod) of size n.
- The amount of memory to set up for memoization equals the number of subproblems.
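The memoized strategy just described can be sketched in Python; this is a minimal illustration rather than the course's code, and the price table used below is a made-up example:

```python
def cut_rod(n, price):
    """Top-down memoized rod cutting.

    price[i] is the revenue for selling a piece of length i;
    returns the maximum revenue for a rod of length n.
    """
    memo = {0: 0}  # one entry per distinct subproblem CR(0)..CR(n)

    def solve(m):
        if m not in memo:  # solve each subproblem only once
            memo[m] = max(price[i] + solve(m - i) for i in range(1, m + 1))
        return memo[m]

    return solve(n)

# Hypothetical price table: pieces of length 1..4 sell for 1, 5, 8, 9.
# Cutting a length-4 rod into 2 + 2 earns 5 + 5 = 10, the optimum.
print(cut_rod(4, {1: 1, 2: 5, 3: 8, 4: 9}))  # prints 10
```

The dictionary plays the role of the solution array r[0..n]: its size equals the number of subproblems, exactly as noted above.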
The subproblem graph for rod cutting has nodes CR(0), CR(1), ..., CR(n), where each CR(k) depends on all smaller subproblems; the table of solutions can be filled either memoized top-down or bottom-up.

Rod Cutting: Memoized DP

Use an array r[0..n] of solutions, initialized with -1:

  Mem-Cut-Rod(p, n, r):
    If r[n] >= 0 then return r[n]
    If n == 0
      q = 0
    Else
      q = -infinity
      For i = 1 to n
        q = max(q, p[i] + Mem-Cut-Rod(p, n - i, r))
    r[n] = q
    return r[n]

Rod Cutting: Bottom-up DP

Use an array r[0..n] of solutions:

  BottomUp-Cut-Rod(p, n, r):
    r[0] = 0
    For j = 1 to n
      q = -infinity
      For i = 1 to j
        q = max(q, p[i] + r[j - i])
      r[j] = q
    return r[n]

Both implement the recursion r(n) = max over 1 <= i <= n of ( p_i + r(n - i) ).

Longest Common Subsequence (LCS)

Problem: Given two sequences, find a longest (not necessarily contiguous) subsequence that is common to both sequences.

Input: two arrays x[1..m] and y[1..n]
Output: a longest common subsequence of x and y

Example:
  x = <A, B, C, B, D, A, B>
  y = <B, D, C, A, B, A>
  LCS(x, y) = <B, C, B, A>

Naive solution (brute force): check every subsequence of x[1..m] to see whether it is also a subsequence of y[1..n].

Analysis:
- There are 2^m subsequences of x (each bit-vector of length m determines a distinct subsequence of x).
- For each subsequence of x, we make O(n) comparisons to check whether it is also in y.
- Worst-case running time: O(n * 2^m). Exponential time! Can we do better?

Steps for Applying Dynamic Programming

1. Identify optimal substructure
- An optimal solution to a problem contains optimal solutions to subproblems.

2. Identify overlapping subproblems
- A recursive solution contains a "small" number of distinct subproblems that are repeated many times.

3. Define a recursive formulation
- Assuming solutions to subproblems, recursively define the final solution.

4.
Compute the value of the final optimal solution
- Store the solutions of subproblems in a table and build the final solution out of the stored values (top-down or bottom-up).
- A subproblem graph helps identify the order in which to compute the values.

Longest Common Subsequence (LCS)

Simplification:
1. First look only at the length of a longest common subsequence.
2. Then extend the algorithm to find the LCS itself.

Notation: |s| denotes the length of a sequence s.

Strategy: consider prefixes of x and y.
- Define c(i, j) = |LCS(x[1..i], y[1..j])|.
- Then c(m, n) = |LCS(x, y)|.

Theorem:

  c(i, j) = c(i-1, j-1) + 1                  if x[i] = y[j]
          = max( c(i-1, j), c(i, j-1) )      otherwise

Proof for the case x[i] = y[j]:
1. Let z[1..k] = LCS(x[1..i], y[1..j]), where c(i, j) = k. Then z[k] = x[i] = y[j].
2. Thus z[1..k-1] is a common subsequence of x[1..i-1] and y[1..j-1].
3. Furthermore, z[1..k-1] = LCS(x[1..i-1], y[1..j-1]). If it were not, there would exist a longer common subsequence w of x[1..i-1] and y[1..j-1] with |w| > k - 1, and appending x[i] (= y[j]) to w would give a common subsequence of length |w| + 1 > k (contradiction!).
4.
Therefore c(i-1, j-1) = k - 1, which implies that c(i, j) = c(i-1, j-1) + 1.

Property 1: Optimal substructure
- An optimal solution to a problem contains optimal solutions to subproblems: if z = LCS(x, y), then any prefix of z is an LCS of a prefix of x and a prefix of y.

  c(i, j) = c(i-1, j-1) + 1                  if x[i] = y[j]
          = max( c(i-1, j), c(i, j-1) )      otherwise

Recursive algorithm for LCS:

  LCS(x, y, i, j):
    if x[i] == y[j]
      c[i, j] = LCS(x, y, i-1, j-1) + 1
    else
      c[i, j] = max( LCS(x, y, i-1, j), LCS(x, y, i, j-1) )

Worst case: x[i] != y[j], in which case the algorithm evaluates two subproblems, each with only one parameter decremented.

Recursion tree for LCS with m = 3, n = 4: the root (3,4) branches into (2,4) and (3,3), which branch into (1,4), (2,3), (2,3), (3,2), and so on. The height of the tree is m + n, so the work is potentially exponential. But the same subproblems, e.g. (2,3), appear repeatedly: we are re-solving subproblems we already solved!

Property 2: Overlapping subproblems
- A recursive solution contains a "small" number of distinct subproblems that are repeated many times: the number of distinct LCS subproblems for two strings of lengths m and n is only mn.

Memoization algorithm for LCS

Memoization: after computing a solution to a subproblem, store it in a table. Subsequent calls check the table to avoid redoing work.
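As a concrete sketch of this memoization in Python (0-indexed strings rather than the slides' 1-indexed arrays, with functools.lru_cache standing in for the table; the function names are my own):

```python
from functools import lru_cache

def lcs_length(x, y):
    """Length of a longest common subsequence of x and y.

    Subproblem c(i, j) = LCS length of the prefixes x[:i] and y[:j];
    lru_cache stores each of the (m+1)(n+1) answers, so no subproblem
    is solved twice: Theta(mn) time and space.
    """
    @lru_cache(maxsize=None)
    def c(i, j):
        if i == 0 or j == 0:           # an empty prefix: base case
            return 0
        if x[i - 1] == y[j - 1]:       # last characters match
            return c(i - 1, j - 1) + 1
        return max(c(i - 1, j), c(i, j - 1))

    return c(len(x), len(y))

# The lecture's example: LCS of ABCBDAB and BDCABA is BCBA, of length 4.
print(lcs_length("ABCBDAB", "BDCABA"))  # prints 4
```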
  LCS(x, y, i, j):
    if c[i, j] == NULL
      if x[i] == y[j]
        c[i, j] = LCS(x, y, i-1, j-1) + 1    (recurrence same as before)
      else
        c[i, j] = max( LCS(x, y, i-1, j), LCS(x, y, i, j-1) )
    return c[i, j]

Time = Theta(mn): constant work per table entry. Space = Theta(mn).

The values in the table are computed top-down, starting with the call LCS(x, y, m, n); entries are filled only as they are needed.

Bottom-up algorithm for LCS

Alternatively, compute the whole table bottom-up, also in Theta(mn) time. For x = <A, B, C, B, D, A, B> and y = <B, D, C, A, B, A>, the completed table of values c(i, j) (columns indexed by x, rows indexed by y, with a zero row and column for the empty prefixes) is:

        A  B  C  B  D  A  B
     0  0  0  0  0  0  0  0
  B  0  0  1  1  1  1  1  1
  D  0  0  1  1  1  2  2  2
  C  0  0  1  2  2  2  2  2
  A  0  1  1  2  2  2  3  3
  B  0  1  2  2  3  3  3  4
  A  0  1  2  2  3  3  4  4

Reconstructing the LCS: trace backwards through the table from the bottom-right entry (value 4); whenever x[i] = y[j], that character is part of the LCS. This recovers <B, C, B, A>.

0/1 Knapsack Problem

Problem: Given a set of items, find a combination of items to add to a bag of some capacity that yields the most value.
Input: a set of n value/weight pairs S = { (1, v_1, w_1), (2, v_2, w_2), ..., (n, v_n, w_n) } and a capacity W
Output: a subset T of {1, 2, ..., n} that
- maximizes the total value, sum over i in T of v_i,
- subject to the total weight, sum over i in T of w_i, being at most W.

Example: items with weights (6, 2, 4, 3, 11) kg and values ($20, $8, $14, $13, $35); the knapsack can carry at most 10 kg. Suppose we have only one copy of each item: what is the most valuable way to fill the bag? Taking the items of weights 2, 4, and 3 gives total weight 9 and total value $35.

Naive solution (brute force): try all possible subsets of S, e.g. {1}, then {1,2}, then {1,2,3}, then {1,3}, etc. Running time: Theta(2^n). Not good!

Steps for DP: 0/1 Knapsack Problem

1. Identify optimal substructure
- An optimal solution to a problem contains optimal solutions to subproblems.

2. Identify overlapping subproblems
- A recursive solution contains a "small" number of distinct subproblems that are repeated many times.

3. Define a recursive formulation
- Assuming solutions to subproblems, recursively define the final solution.

4. Compute the value of the final optimal solution
- Store the solutions of subproblems in a table and build the final solution out of the stored values (top-down or bottom-up).
- A subproblem graph helps identify the order in which to compute the values.

1. Identify optimal substructure with overlapping subproblems.
- First solve the problem for small knapsack capacities, then for larger and larger capacities.
- Likewise, first solve the problem for few items, then for more and more items.
- We therefore need a two-dimensional table.
Case 1: if the optimal solution for the first i items (capacity x, value V) does not use item i, then that same selection is an optimal solution for the first i - 1 items with capacity x and value V.

Case 2: if the optimal solution for the first i items (capacity x, value V) does use item i, then removing item i yields an optimal solution for the first i - 1 items with capacity x - w_i and value V - v_i.

3. Define a recursive formulation. Let K[i, x] be the optimal value achievable with the first i items and capacity x:

  K[i, x] = 0                                             if i = 0 or x = 0
  K[i, x] = max( K[i-1, x], K[i-1, x - w_i] + v_i )       otherwise

The first term covers the case where item i is not used, the second the case where item i is used (allowed only when w_i <= x).

4.
Compute the value of the final optimal solution
- Store the solutions of subproblems in a table and build the final solution out of the stored values (top-down or bottom-up).

Recall the recurrence:

  K[i, x] = 0                                             if i = 0 or x = 0
  K[i, x] = max( K[i-1, x], K[i-1, x - w_i] + v_i )       otherwise

4. Compute the value of the optimal solution bottom-up:

  Knapsack(W, weights, values):
    n = weights.length
    for x = 0 to W and i = 0 to n
      K[i, x] = 0
    for x = 1 to W
      for i = 1 to n
        K[i, x] = K[i-1, x]
        w_i = weights[i]; v_i = values[i]
        if w_i <= x
          K[i, x] = max( K[i-1, x], K[i-1, x - w_i] + v_i )
    return K[n, W]

For the running example (weights 6, 2, 4, 3, 11; values 20, 8, 14, 13, 35; capacity W = 10), the table has (n+1)(W+1) entries. Runtime: O(nW).

4.
Compute the value of the final optimal solution
- Store the solutions of subproblems in a table and build the final solution out of the stored values (top-down or bottom-up).
- A subproblem graph helps identify the order in which to compute the values.

0/1 Knapsack Problem: Example

Capacity W = 3; weights (1, 2, 3); values (1, 4, 6).

  K[i, x] = 0                                             if i = 0 or x = 0
  K[i, x] = max( K[i-1, x], K[i-1, x - w_i] + v_i )       otherwise

The table K[i, x] (rows i = 0..3, columns x = 0..3) is filled entry by entry. Row i = 0 and column x = 0 are all zeros. For row i = 1 (item of weight 1, value 1): K[1, 1] = K[1, 2] = K[1, 3] = 1.
For row i = 2 (item of weight 2, value 4): K[2, 1] = 1, K[2, 2] = max(1, 0 + 4) = 4, and K[2, 3] = max(1, 1 + 4) = 5. For row i = 3 (item of weight 3, value 6): K[3, 1] = 1, K[3, 2] = 4, and K[3, 3] = max(5, 0 + 6) = 6. The completed table:

  i\x  0  1  2  3
  0    0  0  0  0
  1    0  1  1  1
  2    0  1  4  5
  3    0  1  4  6

The optimal value is K[3, 3] = 6.

Greedy vs. Dynamic Programming
- 0/1 knapsack, using dynamic programming: T(n) = Theta(nW).
- Fractional knapsack, using the greedy approach (repeatedly take the next best choice): T(n) = Theta(n lg n).

Matrix Chain Multiplication

Problem: Given a sequence of matrices, compute their product using as few scalar multiplications as possible.

Input: a sequence of n matrices (M_1, ..., M_n) whose consecutive dimensions agree: if M_i is (p x q) and M_{i+1} is (r x s), then q = r.
Output: an order of matrix multiplications (a parenthesization) that minimizes the number of multiplication operations.

To multiply a p x q matrix A with a q x r matrix B to get a p x r matrix C, we use the formula:

  C[i, j] = sum over k = 1..q of A[i, k] * B[k, j]

Our complexity measure is the number of multiplications:
- q multiplications are needed to compute each C[i, j];
- (p x r) x q multiplications are needed to compute all elements of C.

Example: suppose we are computing the product of the 3-matrix sequence (X, Y, Z):
- the dimensions of X are 100 x 1;
- the dimensions of Y are 1 x 100;
- the dimensions of Z are 100 x 1.

One solution:
- Multiply X and Y to get a 100 x 100 matrix XY. No. of multiplications = 100 x 1 x 100 = 10,000.
- Multiply XY and Z to get the 100 x 1 result. No.
of multiplications = 100 x 100 x 1 = 10,000.
- Total cost: 20,000 multiplications.

Another solution:
- Multiply Y and Z to get a 1 x 1 matrix YZ. No. of multiplications = 1 x 100 x 1 = 100.
- Multiply X and YZ to get the 100 x 1 result. No. of multiplications = 100 x 1 x 1 = 100.
- Total cost: 200 multiplications.

Example: for the 4-matrix sequence (W, X, Y, Z), with W of size 10 x 20, X of size 20 x 50, Y of size 50 x 1, and Z of size 1 x 100, we use parentheses to represent the order of multiplication:
- W x (X x (Y x Z)) leads to 125,000 multiplications;
- (W x (X x Y)) x Z leads to 2,200 multiplications.

Naive approach: try all possible parenthesizations. Is this a good idea? Let P(n) be the number of different ways to parenthesize n matrices. Then:

  P(n) = sum over k = 1..n-1 of P(k) * P(n-k)   if n >= 2
  P(n) = 1                                       if n = 1

This recurrence is related to the Catalan numbers and solves to P(n) = Omega(4^n / n^1.5). Exponential!

Let us try to apply DP to solve this in polynomial time. Let M_1, ..., M_n be the n matrices whose product M_1 x ... x M_n we want to compute, and denote the dimensions of matrix M_i by p_{i-1} x p_i. Let M_{i,j}, where i <= j, be the matrix corresponding to the partial result M_i x M_{i+1} x ... x M_j.

Steps for applying dynamic programming:
1. Identify optimal substructure with overlapping subproblems.
2. Define a recursive formulation: assuming solutions to subproblems, recursively define the final solution.
3. Compute the value of the optimal solution bottom-up: store the solutions of subproblems in a table (memoization) and build the final solution out of the stored values.
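The multiplication counts quoted in the 3-matrix example are easy to verify numerically; a minimal Python sketch (the helper name mult_cost is made up for illustration):

```python
def mult_cost(p, q, r):
    """Scalar multiplications needed to multiply a (p x q) by a (q x r) matrix."""
    return p * q * r

# X is 100 x 1, Y is 1 x 100, Z is 100 x 1.
xy_first = mult_cost(100, 1, 100) + mult_cost(100, 100, 1)  # (X * Y) * Z
yz_first = mult_cost(1, 100, 1) + mult_cost(100, 1, 1)      # X * (Y * Z)
print(xy_first, yz_first)  # prints 20000 200
```

The same associativity gap drives the full problem: the answer depends only on the dimension sequence, not on the matrix entries.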
Matrix Chain Multiplication

Any optimal parenthesization of M_i x M_{i+1} x ... x M_j must split the product into two parts, M_i x ... x M_k and M_{k+1} x ... x M_j, for some k in [i, j).

The cost of such a split = cost of computing M_{i,k} + cost of computing M_{k+1,j} + cost of computing the product M_{i,k} x M_{k+1,j}.

1. Do we have optimal substructure? Yes! Proof sketch: suppose not. If the parenthesization of either subproduct were not optimal, we could substitute a better one and obtain a solution to the original problem with smaller cost. Thus the problem has optimal substructure.

2. Do we have overlapping subproblems? Yes! Proof sketch: recall that there are P(n) = Omega(4^n / n^1.5) different parenthesizations, but only Theta(n^2) unique subproblems, since every subproblem has the form M_{i,j} with i, j in [1, n]. Thus we solve some subproblems multiple times; by tabulating the solutions, we can solve every subproblem just once.

Having defined the type of subproblems, it is easy to recursively define the cost of the optimal solution:
- Let m_{i,j} be the minimum number of multiplications for computing the product M_{i,j} = M_i x M_{i+1} x ... x M_j.
- Our goal is to compute m_{1,n}.

How do we define m_{i,j} recursively? (Hint: consider the cases i = j and i != j.)

Case 1: i = j
- The product M_{i,j} consists of just one matrix, so there is nothing to multiply. The cost is m_{i,i} = 0.

Case 2: i != j
- We use the optimal substructure property: there is some index k such that the product is split into M_i x ... x M_k and M_{k+1} x ... x M_j.
- The cost of this split is m_{i,j} = m_{i,k} + m_{k+1,j} + p_{i-1} * p_k * p_j.
- How do we find the best split?
Try all possible k and take the minimum. Thus, the recursive formulation is:

  m_{i,j} = 0                                                             if i = j
  m_{i,j} = min over i <= k < j of ( m_{i,k} + m_{k+1,j} + p_{i-1} * p_k * p_j )   otherwise
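This recurrence can be computed bottom-up over increasing chain lengths. The following Python sketch (the function name is my own; p is given in the lecture's notation, so matrix i has dimensions p[i-1] x p[i]) reproduces the earlier examples:

```python
def matrix_chain_cost(p):
    """Minimum number of scalar multiplications to compute the chain product.

    p is the dimension sequence p_0, ..., p_n: matrix i is p[i-1] x p[i].
    m[i][j] holds the optimal cost of computing M_i x ... x M_j.
    """
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]  # m[i][i] = 0: single matrix
    for length in range(2, n + 1):             # chains of increasing length
        for i in range(1, n - length + 2):
            j = i + length - 1
            # try every split point k and keep the cheapest
            m[i][j] = min(
                m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                for k in range(i, j)
            )
    return m[1][n]

# Lecture examples: (W, X, Y, Z) with dims 10x20, 20x50, 50x1, 1x100,
# and (X, Y, Z) with dims 100x1, 1x100, 100x1.
print(matrix_chain_cost([10, 20, 50, 1, 100]))  # prints 2200
print(matrix_chain_cost([100, 1, 100, 1]))      # prints 200
```

The two nested loops over (i, j) pairs, each scanning at most n split points, give the expected O(n^3) time over the Theta(n^2) subproblems.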