Lecture 6

Dynamic Programming

Dynamic Programming is a method of constructing efficient algorithms that tends to be especially useful in counting and optimization problems.

We will start with a very simple example. The Fibonacci sequence is given by F_1 = 1, F_2 = 1, F_{n+2} = F_n + F_{n+1}. How do we compute the n-th Fibonacci number?

INPUT: n
OUTPUT: F_n

We could compute F_n with the following recursive program (this method is also called backtracking):
def fib(n):
  if n <= 2:
    return 1
  else:
    return fib(n-2) + fib(n-1)
What is the running time of this algorithm? The value of F_5 will be computed in the following way: F_5 = F_3 + F_4 = (F_1 + F_2) + (F_2 + F_3) = (1+1) + (1 + (F_1 + F_2)) = (1+1) + (1 + (1+1)).

As we can see, our algorithm basically adds F_n ones; thus, the running time will be proportional to F_n. It is known that F_n = \Theta(\phi^n), where \phi = \frac{1+\sqrt{5}}{2} = 1.618.... The memory complexity is O(n), because of the recursion stack.

However, this algorithm can be easily improved. Note how we have computed F_3 twice in the computation above -- we could save time by computing each F_k just once:
def fib(n):
  fibs = [0] * (n+1)
  fibs[1] = 1
  # since fibs[0] = 0, the loop correctly produces fibs[2] = 1
  for i in range(2, n+1):
    fibs[i] = fibs[i-1] + fibs[i-2]
  return fibs[n]
int fib(int n) {
  vector<int> fibs(max(n+1, 3)); // size at least 3, so fibs[2] is valid even for n <= 1
  fibs[1] = fibs[2] = 1;
  for(int i=3; i<=n; i++) fibs[i] = fibs[i-2] + fibs[i-1];
  return fibs[n];
}
It is easy to see that our algorithm runs in time O(n) now. Memory is also O(n).

In fact, we can improve our algorithm even further -- we only ever need the last two computed Fibonacci numbers:
def fib(n):
  a, b = 1, 1   # invariant: a = F_i, b = F_{i+1}
  for _ in range(n-1):
    a, b = b, a+b
  return a
Now the memory complexity is only O(1). This kind of optimization is called Dynamic Programming. It works as follows:
  1. identify the subproblems which the answer depends on;
  2. compute the subproblems in an order which guarantees that, whenever we compute a value, all the values it depends on are already known;
  3. store each computed value, so that every subproblem is solved only once.

The Fibonacci example was very simple -- below we can see several more complex problems which can also be solved with this approach.

Note: It is possible to compute F_n, and to solve linear recurrences in general, in time O(\log n) -- but not by the Dynamic Programming approach.
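
For illustration, here is a sketch of one such O(\log n) method, based on the "fast doubling" identities F_{2k} = F_k (2 F_{k+1} - F_k) and F_{2k+1} = F_k^2 + F_{k+1}^2 (the function name fib_fast and the convention F_0 = 0 are our additions):
def fib_fast(n):
  # returns the pair (F_n, F_{n+1}), with F_0 = 0, F_1 = 1
  if n == 0:
    return (0, 1)
  a, b = fib_fast(n // 2)   # a = F_k, b = F_{k+1} for k = n // 2
  c = a * (2*b - a)         # F_{2k} = F_k * (2*F_{k+1} - F_k)
  d = a*a + b*b             # F_{2k+1} = F_k^2 + F_{k+1}^2
  if n % 2 == 0:
    return (c, d)           # n = 2k
  else:
    return (d, c + d)       # n = 2k+1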

Longest Common Subsequence (LCS)

INPUT: Two sequences of integers a[0..n-1] and b[0..m-1]
OUTPUT: A sequence of integers c which is the longest subsequence of both a and b. (In case of a tie, any longest common subsequence can be returned.)

For example, for a=[1,2,3,4,5,6,7,8] and b=[1,4,7,2,5,8,3,6], a longest common subsequence is [1,2,5,6]. This problem has important applications in computational biology (comparing genomes).

How to solve this? For a sequence s, let last(s) be its last element, and let s' be s with the last element removed. If c is a common subsequence of a and b, one of the following always holds:
  1. c does not use the last element of a, i.e., c is a common subsequence of a' and b;
  2. c does not use the last element of b, i.e., c is a common subsequence of a and b';
  3. c uses the last elements of both a and b; then last(a) = last(b) = last(c), and c' is a common subsequence of a' and b'.
It is easy to see (or prove by induction) that we can therefore compute the LCS of a and b recursively. The LCS of a and b has to be one of the following:
  1. the LCS of a' and b;
  2. the LCS of a and b';
  3. if last(a) = last(b): the LCS of a' and b', extended with this common last element.
We always take the option which gives us the longest common subsequence.
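
In symbols: writing L(i,j) for the length of the LCS of a[0..i-1] and b[0..j-1] (our notation), the case analysis above gives:

L(i,0) = L(0,j) = 0
L(i,j) = \max(L(i-1,j), L(i,j-1), L(i-1,j-1) + 1), where the third option is available only when a[i-1] = b[j-1]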

A recursive solution will run in exponential time, but we can see that every subproblem boils down to computing the LCS of a[0..i-1] and b[0..j-1]. We can use DP to store the results in a two-dimensional array dp, at position [i][j]. Therefore, we can compute the length of the LCS using the following DP program:
import numpy as np

# dp[i][j] = length of the LCS of a[0..i-1] and b[0..j-1]
dp = np.zeros([n+1, m+1], dtype=int)
for i in range(1, n+1):
  for j in range(1, m+1):
    dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    if a[i-1] == b[j-1]:
      dp[i][j] = dp[i-1][j-1] + 1
Once we compute the length of LCS, the actual LCS can be reconstructed by checking how we have achieved dp[n][m], and going back to dp[0][0]. See this visualization (C++ source is also included).
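
For example, the walk back could look like this (a sketch; reconstruct_lcs is our name, and dp is the table computed above):
def reconstruct_lcs(a, b, dp):
  i, j = len(a), len(b)
  result = []
  while i > 0 and j > 0:
    if a[i-1] == b[j-1]:
      result.append(a[i-1])   # this element is part of the LCS
      i, j = i-1, j-1
    elif dp[i-1][j] >= dp[i][j-1]:
      i -= 1                  # the LCS does not use a[i-1]
    else:
      j -= 1                  # the LCS does not use b[j-1]
  return result[::-1]         # the elements were collected backwards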

This program runs in time O(nm) and memory O(nm). If we are just interested in the length of the LCS, we can improve the memory complexity to O(m) (or O(n)) by remembering only the last two rows of the array dp.
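
A sketch of this memory optimization (keeping only the previous row prev and the current row cur; the names are ours):
def lcs_length(a, b):
  n, m = len(a), len(b)
  prev = [0] * (m+1)          # row i-1 of the dp table
  for i in range(1, n+1):
    cur = [0] * (m+1)         # row i of the dp table
    for j in range(1, m+1):
      cur[j] = max(prev[j], cur[j-1])
      if a[i-1] == b[j-1]:
        cur[j] = prev[j-1] + 1
    prev = cur
  return prev[m]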

Knapsack Problem

The Knapsack Problem is a very natural optimization problem. You have a bunch of valuable items you could take, but you cannot take them all, because of limited storage space. Which subset of items should you take in order to maximize the value of your subset?

INPUT: n items numbered from 0 to n-1; for every item i, a weight w_i and a value v_i; a capacity M
OUTPUT: A subset S of {0..n-1} such that \sum_{i \in S} w_i \leq M and \sum_{i \in S} v_i is maximized

We assume that the weights w_i and the capacity M are integers. A simple solution to the Knapsack problem is to take the item which has the greatest value-to-weight ratio and still fits in our storage space, and then fill the remaining storage using the same method. This is called the greedy approach (always start with the move which looks the most promising given some incomplete information). There are many problems where the greedy approach works, but not here: if the capacity is 100, the weights are [50,50,51] and the values are [50,50,52], this approach would make us take the item with value 52, while we could also fit the two items with value 50, for an almost twice as good total value!

To apply the Dynamic Programming approach, we need to define subproblems. Let's decide first whether we want to take the item number n-1 or not. If we take it, we need to fill the remaining space (M - w_{n-1}) optimally with items with smaller numbers. If we do not take it, we need to fill the space M optimally with items with smaller numbers. Therefore, the appropriate subproblem is an instance of the same problem where we have fewer items to choose from, and possibly a smaller carrying capacity.

Let B_{m,k} be the best value achievable when we restrict our capacity to m and our items to 0, ..., k-1. Since we can either take the item k-1 or not, we get the following:

B_{m,0} = 0
B_{m,k} = \max(B_{m,k-1}, B_{m-w_{k-1},k-1} + v_{k-1}) if m \geq w_{k-1}
B_{m,k} = B_{m,k-1} otherwise

To find the best possible value, we need to compute B_{M,n}. Computing all the values of B_{m,k} yields an algorithm which runs in time O(Mn) and memory O(Mn). As in the LCS problem, we can then use the whole B table to retrieve the optimal subset. If we care only about the best value, since we only use the values B_{\cdot,k-1} when computing B_{\cdot,k}, we can reduce the memory complexity to O(M).
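
In code, the recurrence above could be implemented as follows (a sketch; the function name knapsack is ours, and the items are given as lists w and v):
def knapsack(w, v, M):
  n = len(w)
  # B[m][k] = best value with capacity m and items 0..k-1
  B = [[0] * (n+1) for _ in range(M+1)]
  for k in range(1, n+1):
    for m in range(M+1):
      B[m][k] = B[m][k-1]             # do not take item k-1
      if m >= w[k-1]:                 # take item k-1, if it fits
        B[m][k] = max(B[m][k], B[m - w[k-1]][k-1] + v[k-1])
  return B[M][n]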

Above, we could either take an item or not. A similar algorithm can be used in the case when there are many copies of every item, and we can take as many as we want.
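
In that unbounded variant, the recurrence changes only slightly: after taking a copy of item k-1 we may take it again, so B_{m,k} = \max(B_{m,k-1}, B_{m-w_{k-1},k} + v_{k-1}) when m \geq w_{k-1}. A sketch (again, the function name is ours):
def knapsack_unbounded(w, v, M):
  n = len(w)
  B = [[0] * (n+1) for _ in range(M+1)]
  for k in range(1, n+1):
    for m in range(M+1):
      B[m][k] = B[m][k-1]
      if m >= w[k-1]:
        # note the second index k: item k-1 may be taken again
        B[m][k] = max(B[m][k], B[m - w[k-1]][k] + v[k-1])
  return B[M][n]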

Matrix Chain Multiplication

We are given a chain of matrices: A_0 of dimensions d_0 \times d_1, A_1 of dimensions d_1 \times d_2, ..., A_{n-1} of dimensions d_{n-1} \times d_n. We can compute the product of two matrices of dimensions a \times b and b \times c using the straightforward algorithm, using abc number multiplications. Since matrix multiplication is associative, we can parenthesize the chain in any way we want -- what parenthesization should we use to minimize the total number of number multiplications?

Let t(i,j) be the smallest number of number multiplications necessary to obtain the product A_i \ldots A_{j-1}. Suppose the last multiplication multiplies A_i \ldots A_{k-1} by A_k \ldots A_{j-1}; the smallest number of number multiplications possible using this split is t(i,k) + t(k,j) + d_i d_k d_j, and we take the k which minimizes this. Again, we could compute t(i,j) recursively (in exponential time), or use DP to avoid recomputing the same values of t(i,j) again and again. This time it is a bit more challenging to write a program which computes the values of t(i,j) in the correct order (i.e., without using a value which we have not yet computed) -- we can do this e.g. like this:
import numpy as np

t = np.zeros((n+1, n+1), dtype=int)

# when computing t[i][j], we need t[i][k] (same row, k < j, computed for an earlier j)
# and t[k][j] (same column, k > i, computed earlier in the descending loop over i)
for j in range(n+1):
  for i in range(j-1, -1, -1):
    if j == i+1:
      t[i][j] = 0
    else:
      t[i][j] = min([t[i][k] + t[k][j] + d[i]*d[j]*d[k] for k in range(i+1, j)])
# the answer is t[0][n]
This works in time O(n^3) and memory O(n^2). (This time it is not possible to reduce the memory complexity in any easy way, since usually all the computed values will still be necessary later.) See this visualization (C++ source is also included).

Recursion with Memoization

DP is one possible approach to optimizing recursive computations. There is another one -- we modify our backtracking solution in the following way:
  1. once we compute our answer, we record it somewhere;
  2. before we start computing, we check our notes to see if we have already computed the answer -- if yes, just use the computed value.
In Python, we can use a dictionary to record the answers:
memo = {}
def fib(n):
  global memo
  if n in memo:
    return memo[n]
  if n <= 2:
    answer = 1
  else:
    answer = fib(n-2) + fib(n-1)
  memo[n] = answer
  return answer
This approach is called Recursion with Memoization. The two approaches are largely interchangeable, but there are significant differences: