Lecture 6

Dynamic Programming

Dynamic Programming is a method of constructing efficient algorithms that tends to be especially useful in counting and optimization problems.

We will start with a very simple example. The Fibonacci sequence is given by F_1 = 1, F_2 = 1, F_{n+2} = F_n + F_{n+1}. How do we compute the n-th Fibonacci number?

INPUT: n
OUTPUT: F_n

We could compute F_n with the following recursive program (this method is also called backtracking):
def fib(n):
  if n <= 2:
    return 1
  else:
    return fib(n-2) + fib(n-1)
What is the running time of this algorithm? The value of F_5 will be computed in the following way: F_5 = F_3 + F_4 = (F_1 + F_2) + (F_2 + F_3) = (1+1) + (1 + (F_1 + F_2)) = (1+1) + (1 + (1+1)).

As we can see, our algorithm basically adds F_n ones; thus, the running time will be proportional to F_n. It is known that F_n = \Theta(\phi^n), where \phi = \frac{1+\sqrt{5}}{2} = 1.618.... The memory complexity is O(n), because of the recursion stack.

However, this algorithm can be easily improved. Note how we have computed F_3 twice in the computation above -- we could save time by computing each F_k just once:
def fib(n):
  fibs = [0] * (n+1)
  fibs[1] = 1
  # since fibs[0] = 0, the loop correctly produces fibs[2] = 1
  for i in range(2, n+1):
    fibs[i] = fibs[i-1] + fibs[i-2]
  return fibs[n]
int fib(int n) {
  vector<int> fibs(max(n+1, 3)); // size at least 3, so fibs[2] is valid even for n <= 1
  fibs[1] = fibs[2] = 1;
  for(int i=3; i<=n; i++) fibs[i] = fibs[i-2] + fibs[i-1];
  return fibs[n];
}
It is easy to see that our algorithm runs in time O(n) now. Memory is also O(n).

In fact, we can improve our algorithm even further -- we only ever need the last two computed Fibonacci numbers:
def fib(n):
  a, b = 1, 1   # invariant: a = F_i, b = F_{i+1}
  for _ in range(n-1):
    a, b = b, a+b
  return a
Now the memory complexity is only O(1). This kind of optimization is called Dynamic Programming. It works as follows:
  1. identify the subproblems which the answer depends on;
  2. compute the subproblems in an order which guarantees that, whenever we compute a value, all the values it depends on are already known;
  3. store each computed value, so that every subproblem is solved only once.

The Fibonacci example was very simple -- below we can see several more complex problems which can also be solved with this approach.

Note: It is possible to compute F_n, and to solve linear recurrences in general, in time O(\log n) -- but not by the Dynamic Programming approach.
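
For illustration, here is a sketch of one such O(\log n) method, based on the "fast doubling" identities F_{2k} = F_k (2 F_{k+1} - F_k) and F_{2k+1} = F_k^2 + F_{k+1}^2 (the function name fib_fast and the convention F_0 = 0 are our additions):
def fib_fast(n):
  # returns the pair (F_n, F_{n+1}), with F_0 = 0, F_1 = 1
  if n == 0:
    return (0, 1)
  a, b = fib_fast(n // 2)   # a = F_k, b = F_{k+1} for k = n // 2
  c = a * (2*b - a)         # F_{2k} = F_k * (2*F_{k+1} - F_k)
  d = a*a + b*b             # F_{2k+1} = F_k^2 + F_{k+1}^2
  if n % 2 == 0:
    return (c, d)           # n = 2k
  else:
    return (d, c + d)       # n = 2k+1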

Longest Common Subsequence (LCS)

INPUT: Two sequences of integers a[0..n-1] and b[0..m-1]
OUTPUT: A sequence of integers c which is the longest subsequence of both a and b. (In case of a tie, any longest common subsequence can be returned.)

For example, for a=[1,2,3,4,5,6,7,8] and b=[1,4,7,2,5,8,3,6], a longest common subsequence is [1,2,5,6]. This problem has important applications in computational biology (comparing genomes).

How to solve this? For a sequence s, let last(s) be its last element, and let s' be s with the last element removed. If c is a common subsequence of a and b, one of the following always holds:
  1. c does not use the last element of a, i.e., c is a common subsequence of a' and b;
  2. c does not use the last element of b, i.e., c is a common subsequence of a and b';
  3. c uses the last elements of both a and b; then last(a) = last(b) = last(c), and c' is a common subsequence of a' and b'.
It is easy to see (or prove by induction) that we can therefore compute the LCS of a and b recursively. The LCS of a and b has to be one of the following:
  1. the LCS of a' and b;
  2. the LCS of a and b';
  3. if last(a) = last(b): the LCS of a' and b', extended with this common last element.
We always take the option which gives us the longest common subsequence.
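
In symbols: writing L(i,j) for the length of the LCS of a[0..i-1] and b[0..j-1] (our notation), the case analysis above gives:

L(i,0) = L(0,j) = 0
L(i,j) = \max(L(i-1,j), L(i,j-1), L(i-1,j-1) + 1), where the third option is available only when a[i-1] = b[j-1]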

A recursive solution will run in exponential time, but we can see that every subproblem boils down to computing the LCS of a[0..i-1] and b[0..j-1]. We can use DP to store the results in a two-dimensional array dp, at position [i][j]. Therefore, we can compute the length of the LCS using the following DP program:
import numpy as np

# dp[i][j] = length of the LCS of a[0..i-1] and b[0..j-1]
dp = np.zeros([n+1, m+1], dtype=int)
for i in range(1, n+1):
  for j in range(1, m+1):
    dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    if a[i-1] == b[j-1]:
      dp[i][j] = dp[i-1][j-1] + 1
Once we compute the length of LCS, the actual LCS can be reconstructed by checking how we have achieved dp[n][m], and going back to dp[0][0]. See this visualization (C++ source is also included).
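
For example, the walk back could look like this (a sketch; reconstruct_lcs is our name, and dp is the table computed above):
def reconstruct_lcs(a, b, dp):
  i, j = len(a), len(b)
  result = []
  while i > 0 and j > 0:
    if a[i-1] == b[j-1]:
      result.append(a[i-1])   # this element is part of the LCS
      i, j = i-1, j-1
    elif dp[i-1][j] >= dp[i][j-1]:
      i -= 1                  # the LCS does not use a[i-1]
    else:
      j -= 1                  # the LCS does not use b[j-1]
  return result[::-1]         # the elements were collected backwards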

This program runs in time O(nm) and memory O(nm). If we are just interested in the length of the LCS, we can improve the memory complexity to O(m) (or O(n)) by remembering only the last two rows of the array dp.
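
A sketch of this memory optimization (keeping only the previous row prev and the current row cur; the names are ours):
def lcs_length(a, b):
  n, m = len(a), len(b)
  prev = [0] * (m+1)          # row i-1 of the dp table
  for i in range(1, n+1):
    cur = [0] * (m+1)         # row i of the dp table
    for j in range(1, m+1):
      cur[j] = max(prev[j], cur[j-1])
      if a[i-1] == b[j-1]:
        cur[j] = prev[j-1] + 1
    prev = cur
  return prev[m]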

Knapsack Problem

The Knapsack Problem is a very natural optimization problem. You have a bunch of valuable items you could take, but you cannot take them all, because of limited storage space. Which subset of items should you take in order to maximize the value of your subset?

INPUT: n items numbered from 0 to n-1; for every item i, a weight w_i and a value v_i; a capacity M
OUTPUT: A subset S of {0..n-1} such that \sum_{i \in S} w_i \leq M and \sum_{i \in S} v_i is maximized

We assume that the weights w_i and the capacity M are integers. A simple solution to the Knapsack problem is to take the item which has the greatest value-to-weight ratio and still fits in our storage space, and then fill the remaining storage using the same method. This is called the greedy approach (always start with the move which looks the most promising given some incomplete information). There are many problems where the greedy approach works, but not here: if the capacity is 100, the weights are [50,50,51] and the values are [50,50,52], this approach would make us take the item with value 52, while we could also fit the two items with value 50, for an almost twice as good total value!

To apply the Dynamic Programming approach, we need to define subproblems. Let's decide first whether we want to take the item number n-1 or not. If we take it, we need to fill the remaining space (M - w_{n-1}) optimally with items with smaller numbers. If we do not take it, we need to fill the space M optimally with items with smaller numbers. Therefore, the appropriate subproblem is an instance of the same problem where we have fewer items to choose from, and possibly a smaller carrying capacity.

Let B_{m,k} be the best value achievable when we restrict our capacity to m and our items to 0, ..., k-1. Since we can either take the item k-1 or not, we get the following:

B_{m,0} = 0
B_{m,k} = \max(B_{m,k-1}, B_{m-w_{k-1},k-1} + v_{k-1}) if m \geq w_{k-1}
B_{m,k} = B_{m,k-1} otherwise

To find the best possible value, we need to compute B_{M,n}. Computing all the values of B_{m,k} yields an algorithm which runs in time O(Mn) and memory O(Mn). As in the LCS problem, we can then use the whole B table to retrieve the optimal subset. If we care only about the best value, since we only use the values B_{\cdot,k-1} when computing B_{\cdot,k}, we can reduce the memory complexity to O(M).
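
In code, the recurrence above could be implemented as follows (a sketch; the function name knapsack is ours, and the items are given as lists w and v):
def knapsack(w, v, M):
  n = len(w)
  # B[m][k] = best value with capacity m and items 0..k-1
  B = [[0] * (n+1) for _ in range(M+1)]
  for k in range(1, n+1):
    for m in range(M+1):
      B[m][k] = B[m][k-1]             # do not take item k-1
      if m >= w[k-1]:                 # take item k-1, if it fits
        B[m][k] = max(B[m][k], B[m - w[k-1]][k-1] + v[k-1])
  return B[M][n]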

Above, we could either take an item or not. A similar algorithm can be used in the case when there are many copies of every item, and we can take as many as we want.
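
In that unbounded variant, the recurrence changes only slightly: after taking a copy of item k-1 we may take it again, so B_{m,k} = \max(B_{m,k-1}, B_{m-w_{k-1},k} + v_{k-1}) when m \geq w_{k-1}. A sketch (again, the function name is ours):
def knapsack_unbounded(w, v, M):
  n = len(w)
  B = [[0] * (n+1) for _ in range(M+1)]
  for k in range(1, n+1):
    for m in range(M+1):
      B[m][k] = B[m][k-1]
      if m >= w[k-1]:
        # note the second index k: item k-1 may be taken again
        B[m][k] = max(B[m][k], B[m - w[k-1]][k] + v[k-1])
  return B[M][n]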

Matrix Chain Multiplication

We are given a chain of matrices: A_0 of dimensions d_0 \times d_1, A_1 of dimensions d_1 \times d_2, ..., A_{n-1} of dimensions d_{n-1} \times d_n. We can compute the product of two matrices of dimensions a \times b and b \times c using the straightforward algorithm, using abc number multiplications. Since matrix multiplication is associative, we can parenthesize the chain in any way we want -- what parenthesization should we use to minimize the total number of number multiplications?

Let t(i,j) be the smallest number of number multiplications necessary to obtain the product A_i \ldots A_{j-1}. Suppose the last multiplication multiplies A_i \ldots A_{k-1} by A_k \ldots A_{j-1}; the smallest number of number multiplications possible using this split is t(i,k) + t(k,j) + d_i d_k d_j, and we take the k which minimizes this. Again, we could compute t(i,j) recursively (in exponential time), or use DP to avoid recomputing the same values of t(i,j) again and again. This time it is a bit more challenging to write a program which computes the values of t(i,j) in the correct order (i.e., without using a value which we have not yet computed) -- we can do this e.g. like this:
import numpy as np

t = np.zeros((n+1, n+1), dtype=int)

# when computing t[i][j], we need t[i][k] (same row, k < j, computed for an earlier j)
# and t[k][j] (same column, k > i, computed earlier in the descending loop over i)
for j in range(n+1):
  for i in range(j-1, -1, -1):
    if j == i+1:
      t[i][j] = 0
    else:
      t[i][j] = min([t[i][k] + t[k][j] + d[i]*d[j]*d[k] for k in range(i+1, j)])
# the answer is t[0][n]
This works in time O(n^3) and memory O(n^2). (This time it is not possible to reduce the memory complexity in any easy way, since usually all the computed values will still be necessary later.) See this visualization (C++ source is also included).

Recursion with Memoization

DP is one possible approach to optimizing recursive computations. There is another one -- we modify our backtracking solution in the following way:
  1. once we compute our answer, we record it somewhere;
  2. before we start computing, we check our notes to see if we have already computed the answer -- if yes, just use the computed value.
In Python, we can use a dictionary to record the answers:
memo = {}
def fib(n):
  global memo
  if n in memo:
    return memo[n]
  if n <= 2:
    answer = 1
  else:
    answer = fib(n-2) + fib(n-1)
  memo[n] = answer
  return answer
This approach is called Recursion with Memoization. The two approaches are largely interchangeable, but there are significant differences: