Topic F: Dynamic Programming



Section 1: DP Idea and Example

Dynamic programming (DP) is a general technique where we define the solution to a problem in terms of smaller subproblems. We'll start with an example, then describe the general approach.

Objectives: After learning this material, you should be able to:

Longest increasing subsequence

This is the first problem we will solve with dynamic programming. Before defining it, some terminology: Given a list of numbers, also called a sequence, a subsequence is any result of deleting some elements of the list. For example, given the list $(1, 2, 3, 4, 5)$, the list $(1, 3, 5)$ is a subsequence but $(1, 5, 2)$ is not. A list of numbers is (weakly) increasing if each number is at least as large as the previous one.

The longest increasing subsequence (LIS) problem is: given a list of numbers, find a (weakly) increasing subsequence of the list that is as long as possible.

For the previous example, the longest increasing subsequence is the entire list, $(1,2,3,4,5)$. If the input is $(1,5,6,3,4,8)$, then a longest increasing subsequence is $(1,3,4,8)$, with a length of $4$ elements. (Another is $(1,5,6,8)$.)

Exercise 1.

First, let's look at the brute-force algorithm. It considers each possible subsequence of the list. Among all the subsequences that are increasing, it takes the longest one.

What is the time complexity of this algorithm? Suppose the input list has length $n$.

Solution.

We can bound it by $O(n 2^n)$. To get a subsequence, we pick a subset of the indices $1,\dots,n$. There are $2^n$ subsets. For each subset, we check whether the corresponding subsequence is increasing and how long it is, which takes time linear in the size of the subset. In the worst case, the size of the subset is $n$, so we get $O(n 2^n)$.
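As a concrete sketch, the brute-force approach can be written in Python as follows (the function name is ours; the text gives only a description):

```python
from itertools import combinations

def lis_brute_force(A):
    """Brute force: try every subsequence and keep the longest weakly
    increasing one. Matches the O(n 2^n) bound discussed above."""
    n = len(A)
    best = 0
    for size in range(1, n + 1):
        for idxs in combinations(range(n), size):
            sub = [A[i] for i in idxs]
            # Check that the chosen subsequence is weakly increasing.
            if all(sub[t] <= sub[t + 1] for t in range(len(sub) - 1)):
                best = max(best, size)
    return best
```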


Brute force is much too slow, so now we'll solve this problem using dynamic programming. The idea behind the algorithm is to consider all the prefixes of the input list. First, we solve a variant of the LIS problem for the prefix of length one; this is easy. Then we use this to solve it for the prefix of length two. And so on up to the end.

Consider the above example, input $(1,5,6,3,4,8)$. Suppose we're trying to solve the LIS problem for the prefix $(1,5,6,3,4)$, with the requirement that we must include the final element, $4$. Well, an increasing subsequence that includes $4$ can only be one of the following options:

  • just $(4)$ by itself;
  • a shorter increasing subsequence ending at an earlier element that is at most $4$, followed by $4$ itself: here $(1,4)$, $(3,4)$, or $(1,3,4)$.

The solution for the prefix ending at $4$ is whichever of these options is the longest, which is of course the last one, giving a LIS of $(1,3,4)$. Now, we'll use this idea to go through the list and solve the problem up to each element so far, assuming we'll include that element.


// Algorithm 1: Longest Increasing Subsequence
1  longest_increasing_subsequence(A):
2      // A is a list of integers of length n
3      let L[1] = 1
4      for j = 2 to n:
5          let L[j] = 1
6          for i = 1 to j-1:
7              if A[i] <= A[j]:
8                  set L[j] = max(L[j], L[i] + 1)
9      return max(L)

The inner for loop on lines 6-8 implements the logic discussed above: it sets $L[j]$ to be $1$ if there are no prior elements weakly smaller, and otherwise $1$ plus the best of the previous eligible subsequences.
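As a sketch, Algorithm 1 translates directly into Python (0-indexed, so array indices shift down by one):

```python
def longest_increasing_subsequence(A):
    """Algorithm 1 in Python: L[j] is the length of the longest weakly
    increasing subsequence ending exactly at index j."""
    n = len(A)
    L = [1] * n
    for j in range(1, n):
        for i in range(j):
            if A[i] <= A[j]:
                L[j] = max(L[j], L[i] + 1)
    return max(L)
```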

Proposition 1.

Algorithm 1 correctly solves the Longest Increasing Subsequence problem.


Proof.

We prove by induction the following statement: $L[j]$ is the length of the LIS of the input up to index $j$ that includes the element at index $j$. This will prove correctness because we return $\max_j L[j]$, i.e. the longest increasing subsequence that ends at any location.

Base case: $L[1] = 1$ is correct, because we can take the first element to be a subsequence of itself, with length one.

Inductive case: Suppose $L[1], \dots, L[j-1]$ are all correct. Now at index $j$, one possible subsequence is just the element $A[j]$ itself, which has length $1$. The other possibility is a subsequence that starts earlier and ends at $j$, where the last index included before $j$ is some $i < j$. This can only occur if $A[i] \leq A[j]$, and in that case its maximum length is $L[i] + 1$, since we append the element at index $j$ to the best subsequence ending at $i$. The algorithm takes the maximum over all these possibilities, so $L[j]$ is correct.

Now if each $L[j]$ is correct, then the algorithm is correct because it returns the maximum of $L[j]$ for all $j$, one of which must be the LIS of the input.


Proposition 2.

The running time of Algorithm 1 is $O(n^2)$ and space use is $O(n)$.


Proof.

Initialization runs in constant time and returning the answer runs in $O(n)$, finding the maximum element of $L$.

The outer for loop runs $n-1$ times, and each iteration, the inner for loop runs at most $n-1$ times, with constant-time operations. So the running time is $O(n^2)$.

The space usage is dominated by the array $L$, which is $O(n)$.


Exercise 2.

In Algorithm 1, why would it be wrong to return $L[n]$, the last entry of our answer array? Give an example input where returning $L[n]$ would be incorrect and explain how it fails.

Example solution.

$L[n]$ is the length of the LIS that ends exactly at location $n$, but the LIS of the entire sequence may end earlier. An example is the input $(1,9,10,5)$. Here the LIS has length $3$ and is $(1,9,10)$, but $L[n] = 2$ because the longest increasing subsequence ending at the last element is $(1,5)$.


Exercise 3.

Simulate Algorithm 1 on input $(5,1,3,2,4,0)$. What is $L$ and what is the final solution?

Solution.

Index    1   2   3   4   5   6
Input    5   1   3   2   4   0
$L$      1   1   2   2   3   1

The final solution is $\max_j L[j] = 3$.


Reconstructing the subsequence itself

Algorithm 1 returns the length of the LIS, but not the subsequence itself. Luckily, we can modify it quite easily to do this as well. The approach is similar to the modification of breadth-first search and Dijkstra's algorithm to return the shortest path itself (not just its length). We have to keep track, for each result we computed, of "how we got there". The result is Algorithm 2.


// Algorithm 2: LIS with Reconstruction
1  lis_2(A):
2      // A is a list of integers of length n
3      let L[1] = 1
4      let prev[j] = null for all j
5      for j = 2 to n:
6          let L[j] = 1
7          for i = 1 to j-1:
8              if A[i] <= A[j]:
9                  set L[j] = max(L[j], L[i] + 1)
10                 if L[j] == L[i] + 1:
11                     set prev[j] = i
12     let j = argmax(L)
13     return L[j] and lis_reconstruct(prev, j)

// Subroutine 1: LIS Reconstruction
1 lis_reconstruct(prev, j):
2   let S = empty list
3   while j is not null:
4       add j to front of S
5       j = prev[j]
6   return S
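
As a sketch, Algorithm 2 and Subroutine 1 combine into the following Python (0-indexed; function and variable names are ours, and prev records the latest eligible predecessor, matching the tie-breaking on line 11):

```python
def lis_with_subsequence(A):
    """Algorithm 2 in Python: return (length, subsequence) for a
    longest weakly increasing subsequence of A."""
    n = len(A)
    L = [1] * n
    prev = [None] * n  # prev[j]: index before j in the recorded subsequence
    for j in range(1, n):
        for i in range(j):
            if A[i] <= A[j]:
                L[j] = max(L[j], L[i] + 1)
                if L[j] == L[i] + 1:  # same tie-breaking as line 11
                    prev[j] = i
    # Subroutine 1: backtrack from the index achieving the maximum.
    j = max(range(n), key=lambda k: L[k])
    length = L[j]
    S = []
    while j is not None:
        S.append(A[j])
        j = prev[j]
    return length, S[::-1]
```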
Exercise 4.

Revisiting our example, simulate Algorithm 2 on input $(5,1,3,2,4,0)$. Give $L$, $prev$, and the final output.

Solution.

Index    1     2     3     4     5     6
Input    5     1     3     2     4     0
$L$      1     1     2     2     3     1
prev     null  null  2     2     4     null

The final solution is $\max_j L[j] = 3$ with the subsequence $(1,2,4)$. (Note that prev$[5] = 4$, not $3$, because line 11 fires again at $i = 4$; another valid longest increasing subsequence is $(1,3,4)$.)


Components of dynamic programming

Now that we've seen a dynamic programming algorithm, let's lay out the components that all DP algorithms have.

Dynamic programming algorithms can always be broken down into these components:

  1. Subproblem definition. For example with LIS, subproblem $j$ was "compute the length of the LIS of the prefix of the input up to $j$, requiring it to include the final element." We stored the solutions in an array $L$.
  2. Computing the final answer from the subproblem answers. For LIS, we took the maximum solution to any subproblem, i.e. $\max_j L[j]$.
  3. Recurrence. The recurrence states how to solve any given subproblem in terms of previously solved ones. It always has two parts: a base case for the smallest subproblems, and an inductive case that combines solutions to smaller subproblems.
  4. (Optional) reconstructing the object that witnesses the solution. DP algorithms usually return the size or value of some object, for example, the length of the LIS. Then, they can usually be modified in a straightforward, formulaic way to construct that actual object itself, for example, the actual subsequence as we did above. This modification usually proceeds by remembering which choices we made when solving a subproblem, e.g. when setting $L[j] = L[i] + 1$, remembering which index $i$ was used.

Every dynamic programming solution (at least in this class) is made up of the above components.

The key question you usually need to answer is: What are the subproblems, and what is the recurrence? Usually, the subproblems can be arranged in an array, since they must be solved in order. Often this array is multidimensional, as we will see. Sometimes the subproblem is essentially the same as the original problem, just on a prefix or subset of the input. But often the subproblem is slightly different, as in the LIS example where the subproblem required the subsequence to include the final element.

Once you define the above elements, the DP algorithm has essentially been defined: solve the subproblems in an order consistent with their dependencies, filling in each entry using the recurrence, and then compute the final answer from the stored solutions.

To reconstruct the witnessing object as well, it can generally be modified by creating a data structure that remembers the choices made, at each subproblem, when solving the recurrence; then backtracking through these choices.

Proofs of correctness. Every DP algorithm is proven correct with the following inductive proof:

  1. (Base case) We prove the algorithm initializes the base cases or initial subproblems correctly.
  2. (Inductive case) At each step, assuming the previous subproblems were solved correctly, we prove that the next subproblem is solved correctly.
  3. By induction, steps (1) and (2) prove that all subproblems are solved correctly.
  4. (Returning the final answer) We prove that, if all the subproblems were solved correctly, we return the final answer correctly.

For example, with the Longest Increasing Subsequence problem, the base case was that a prefix of length $1$ has a LIS of length $1$. Then, the inductive case was that for each subproblem $j=2,\dots,n$, assuming that $L[1],\dots,L[j-1]$ were all correct, the algorithm computes $L[j]$ correctly (inductive step). Finally, we argued that if $L[j]$ is correct for each $j$, then it is correct to return $\max_j L[j]$.

Because the proof always follows this template, we will always prove correctness by filling in the above parts: correctness of the recurrence -- base case and inductive case -- and of returning the final answer. We also need to make sure that we solve the subproblems in dependency order.

Section 2: Knapsack

This section looks specifically at variants of the knapsack problem and their dynamic programming solutions.

Objectives. After learning this material, you should be able to:

Duplicates allowed

In the knapsack problem, we are given a set of items $i=1,\dots,n$ each with a value $v_i \in \mathbb{R}_+$ (a positive number) and a weight or size $w_i \in \mathbb{N}$ (a nonnegative integer).

We are given a number $W \in \mathbb{N}$ which is the maximum weight our knapsack can hold, also called the capacity or size of the knapsack. We must find the max-value subset of items that can fit in the knapsack.

In the duplicates allowed version, there are unlimited copies of each item available.

Exercise 5.

Given this input instance, what is the optimal solution? Suppose $W = 7$.

Item   Value   Weight
1      4       2
2      5       3
3      8       5

Solution.

The optimal solution is two copies of item 1 and one copy of item 2, for a value of 13. The total weight is $2 \cdot 2 + 3 = 7$, which is feasible as it matches the weight limit. We can check that every other feasible solution has lower value. For example, these are feasible solutions: three copies of item 1; or two copies of item 2; or one copy of item 3 and one copy of item 1.


Let's look for a DP solution. Recall the components of a DP solution: subproblem definition, computing the final value, the recurrence, and reconstructing the solution. Here a natural subproblem is to have a smaller-capacity knapsack. Let's try it: our subproblem definition is to let $C[w] = $ the maximum value we can fit in a knapsack of size $w$. With this subproblem, computing the final value is easy, as it is just $C[W]$.

For the recurrence, the base case is where $w = 0$, i.e. no items can fit, and the optimal value is zero. So we set $C[0] = 0$. For the inductive case: For $w \geq 1$, we set

\begin{equation} C[w] = \max\begin{cases} C[w-1] \\ \max_{i : w_i \leq w} ~ v_i + C[w - w_i] \end{cases} . \end{equation}

In other words, the inductive case considers two options: leave one unit of capacity unused (value $C[w-1]$), or add one copy of some item $i$ that fits (value $v_i + C[w - w_i]$), and takes the best option.

Claim 1.

The recurrence above is correct, i.e. $C[w] = $ the maximum value we can fit in a knapsack of size $w$.


Proof.

If $w=0$, then $C[w] = 0$ because no items can fit.

Now, given $w \geq 1$, either no items fit, or at least one item fits. If no item fits, then $C[w] = C[w-1] = 0$, which is correct.

So now suppose that at least one item fits. The optimal solution has at least one item, say $i$. Now for the remaining space $w-w_i$ (which is at least zero since $i$ fits in the knapsack), it must be used optimally. So the total value from the remaining space is $C[w - w_i]$, by inductive hypothesis. (If it were not used optimally, then we could get a better solution for the space and then add item $i$ to it and obtain a better solution for $C[w]$, which contradicts the assumption that this is optimal.)

So $C[w] = v_i + C[w - w_i]$. So the recurrence is correct, since the optimal solution is the result of picking the best such item $i$.


Combining these gives a dynamic programming algorithm:


// Algorithm 3: Knapsack with duplicates
1  knapsack_dups(v, w, W):
2      // v[i] = value, w[i] = weight, W = weight limit
3      let C[0] = 0
4      for x = 1 to W:
5          C[x] = C[x-1]
6          for i = 1 to n, if w[i] <= x:
7              C[x] = max(C[x], v[i] + C[x - w[i]])
8      return C[W]

Correctness: As usual in dynamic programming, correctness follows from correctness of the DP elements, which were argued above.

Efficiency: Space usage is dominated by $C$, which uses $O(W)$ space. For running time, we have nested loops, the outer one has $W$ iterations and the inner one has $n$ iterations, and the interior operations are constant time per iteration. So running time is $O(nW)$.
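As a sketch, Algorithm 3 in Python (items 0-indexed; the function signature mirrors the pseudocode):

```python
def knapsack_dups(v, w, W):
    """Algorithm 3 in Python: max value achievable with capacity W
    when unlimited copies of each item are allowed."""
    C = [0] * (W + 1)
    for x in range(1, W + 1):
        C[x] = C[x - 1]  # option: leave one unit of capacity unused
        for vi, wi in zip(v, w):
            if wi <= x:  # option: add one copy of this item
                C[x] = max(C[x], vi + C[x - wi])
    return C[W]
```

On the instance from Exercise 5 (values 4, 5, 8; weights 2, 3, 5; $W = 7$), this returns 13, matching the solution above.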

Exercise 6.

Modify the algorithm to reconstruct the actual list of items in the optimal knapsack.

Hint: Recall that for reconstruction, we should keep track of the choices our algorithm needed to make at each subproblem. At subproblem $C[j]$, what were our choices? Then, how do we backtrack from the very end, i.e. $C[W]$, to the beginning, to reconstruct the set?

Solution.

At each x, the choice was which item i to put in the knapsack. This took up w[i] space, and then we needed to add the rest of the solution, which was the optimal solution for C[x - w[i]].


// Algorithm 4: Knapsack with duplicates, with reconstruction
1  knapsack_dups_2(v, w, W):
2      // v[i] = value, w[i] = weight, W = weight limit
3      let C[0] = 0
4      let Item[0] = null
5      for x = 1 to W:
6          C[x] = C[x-1]
7          Item[x] = null
8          for i = 1 to n, if w[i] <= x:
9              C[x] = max(C[x], v[i] + C[x - w[i]])
10             if C[x] == v[i] + C[x - w[i]]:
11                 Item[x] = i
12    return kd_reconstruct(w, W, C, Item)

// Subroutine 2: Knapsack with duplicates reconstruction routine
1  kd_reconstruct(w, W, C, Item):
2      let x = W
3      let solution = empty list
4      while x > 0:
5          if Item[x] == null:
6              set x = x - 1
7          else:
8              add Item[x] to solution
9              set x = x - w[Item[x]]
10     return solution

Instead of a list, we could use an array where the $i$th entry counts how many copies of item $i$ are in the solution. This would be more space-efficient for large instances.
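As a sketch, Algorithm 4 and Subroutine 2 in Python (items 0-indexed; we update Item only on strict improvement, which differs from the pseudocode only in how ties are broken and still yields an optimal solution):

```python
def knapsack_dups_2(v, w, W):
    """Algorithm 4 in Python: return the max value and one optimal
    list of item indices, via the Item array and backtracking."""
    C = [0] * (W + 1)
    item = [None] * (W + 1)  # item[x]: an item added at capacity x
    for x in range(1, W + 1):
        C[x] = C[x - 1]
        for i in range(len(v)):
            if w[i] <= x and v[i] + C[x - w[i]] > C[x]:
                C[x] = v[i] + C[x - w[i]]
                item[x] = i
    # Subroutine 2: backtrack through the recorded choices.
    solution, x = [], W
    while x > 0:
        if item[x] is None:
            x -= 1
        else:
            solution.append(item[x])
            x -= w[item[x]]
    return C[W], solution
```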


No Duplicates

In this version of the knapsack problem, there is just one copy of each item, but the rest of the problem (including the format of the input) is the same.

We might hope to modify the previous solution while keeping the subproblem $C[w]$ essentially the same. For example, by somehow remembering which items were used in $C[w]$. This turns out to fail, in part because there could be multiple optimal subsets for $C[w]$, and remembering all of them turns out to be prohibitive. Instead, we need a trick.

Subproblem definition. The trick is to introduce an extra dimension to our subproblems. Specifically, for our subproblem definition, let $C[k,w]$ be the maximum value one can obtain from a knapsack of size $w$ using only items from the subset $\{1,\dots,k\}$.

Computing the final solution. We will simply return $C[n,W]$ where $n$ is the number of items and $W$ is the knapsack capacity.

Recurrence. The base case is pretty straightforward: $C[k,0] = 0$ for all item indexes $k$ and $C[0,w] = 0$ for all capacities $w$.

For the inductive case, we set for $k \geq 1, w \geq 1$:

\begin{equation} C[k,w] = \max \begin{cases} C[k-1, w] \\ v_k + C[k-1, w - w_k] & \text{(if $w_k \leq w$)} \end{cases} . \end{equation}

Claim 2.

The recurrence is correct, i.e. $C[k,w] = $ the maximum value obtainable from a knapsack of size $w$ using only items $\{1,\dots,k\}$.


Proof.

For the optimal solution with items $1,\dots,k$ and capacity $w$, there are two possibilities: we either include item $k$, or we don't. If we don't, then the optimal solution uses only items $1,\dots,k-1$, so its value is $C[k-1,w]$.

If we do, then the remaining space is $w - w_k$, and to fill it, we are only allowed to use items $1,\dots,k-1$ because we just used item $k$. So the optimal way to fill the remaining space is $C[k-1, w - w_k]$, and our total value is $v_k + C[k-1, w - w_k]$. Note this is only possible if $w_k \leq w$, as otherwise item $k$ cannot fit.

Since these are the only two possibilities (or only one possibility if $w_k > w$), and the recurrence chooses the best of both, it is optimal.


Our algorithm is therefore:


// Algorithm 5: Knapsack, no duplicates
1  knapsack(v, w, W):
2      let C[0,x] = 0 for all x = 0 to W
3      let C[i,0] = 0 for all i = 1 to n
4      for i = 1 to n:
5          for x = 1 to W:
6              let C[i,x] = C[i-1,x]
7              if w[i] <= x:
8                  set C[i,x] = max(C[i,x], v[i] + C[i-1,x-w[i]])
9      return C[n,W]

Correctness. As usual for dynamic programming, correctness follows almost immediately from the above arguments that the three components (subproblem, final solution, recurrence) are correct.

Efficiency. Initialization takes $O(n + W)$ time, and returning the final result is constant time. There are nested loops of $n$ and $W$ iterations, with constant-time operations in the innermost loop, so runtime is $O(n W)$. Space is dominated by $C$, which uses $O(n W)$ space.
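As a sketch, Algorithm 5 in Python (items 0-indexed, so item $k$ in the recurrence corresponds to `v[k-1]`, `w[k-1]`):

```python
def knapsack(v, w, W):
    """Algorithm 5 in Python: 0/1 knapsack. C[k][x] = best value using
    only the first k items with capacity x."""
    n = len(v)
    C = [[0] * (W + 1) for _ in range(n + 1)]
    for k in range(1, n + 1):
        for x in range(1, W + 1):
            C[k][x] = C[k - 1][x]  # option: skip item k
            if w[k - 1] <= x:      # option: take item k (once)
                C[k][x] = max(C[k][x],
                              v[k - 1] + C[k - 1][x - w[k - 1]])
    return C[n][W]
```

On the instance from Exercise 5 with no duplicates allowed, the answer drops from 13 to 12 (items 1 and 3, weight $2 + 5 = 7$).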

Exercise 7.

How do we modify the no-duplicates knapsack algorithm to return the optimal subset of items (i.e. reconstruct the solution), not just its value?

Solution.

We can create an array D[i,x] = False if C[i,x] == C[i-1,x], and otherwise D[i,x] = True, meaning that C[i,x] == v[i] + C[i-1,x-w[i]] and we used item i at this stage.

We then start with D[n,W]. If False, we go to D[n-1, W]. If True, we add item n to the knapsack and go to D[n-1, W - w[n]]. We continue in this way until we get to D[0,x] for any x.


Section 3: Longest Common Subsequence

This section considers another DP example, longest common subsequence (LCS).

Objectives. After learning this material, you should be able to:

The problem and algorithm

In the longest common subsequence problem, our goal is to compare two sequences to find the longest subsequence that they have in common. Recall that a subsequence does not have to be consecutive. This problem is a part of, for example, version control software like git that needs to track changes to a document.

Exercise 8.

Let A = "ALGORITHM" and B = "ANARCHISM". What is their longest common subsequence?

Solution.

One answer is ARIM, for a length of 4. Another is ARHM, also with length 4.


As always, the first question is the subproblem definition. A natural first try is a smaller version of the original problem. In that case, let C[i,j] = the length of the longest common subsequence of A[1:i] and B[1:j], where A[1:i] denotes the prefix of A from characters 1 to i and similarly for B[1:j].

In this case, to compute the final answer, we just return C[n,m], where n = len(A) and m = len(B).

For the recurrence base case, if either input has zero characters, then the answer is zero, so C[0,j] = 0 for all j and C[i,0] = 0 for all i.

For the recurrence inductive case, suppose $i,j \geq 1$. We need to consider cases. If A[i] == B[j], then one possibility for C[i,j] is a subsequence that ends with this character. The optimal length would be the length of a subsequence of A[1:i-1] and B[1:j-1], plus one more for this final character. This gives option a := 1 + C[i-1,j-1]. If A[i] != B[j], then we can set option a := 0.

Then, regardless of whether A[i] and B[j] are equal, we have two more options. If we do not include the last character of A in our subsequence, then our solution is the same as on A[1:i-1], which has value b := C[i-1,j]. And if we do not include the last character of B, then similarly the value is c := C[i,j-1]. (Note that if we do not include both, this will be covered by b and c.)

Putting these together, we can choose the best of these choices, C[i,j] = max{a, b, c}.

Combining all of the elements gives us this DP algorithm. Notice that we need to decide how to iterate through the subproblems. It's important to make sure that when we're solving subproblem (i,j), we've already solved all the subproblems it depends on: in this case, (i-1,j-1), (i-1,j), and (i,j-1).


// Algorithm 6: LCS
1  lcs(A, B):
2      // A has length n, B has length m
3      let C[i,0] = 0 for all i = 1 to n
4      let C[0,j] = 0 for all j = 1 to m
5      for i = 1 to n:
6          for j = 1 to m:
7              let a = 1 + C[i-1,j-1] if A[i] == B[j], else let a = 0
8              let b = C[i-1,j]
9              let c = C[i,j-1]
10             let C[i,j] = max{a, b, c}
11     return C[n,m]
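
As a sketch, Algorithm 6 in Python (0-indexed strings, so A[i] in the pseudocode is `A[i-1]` here):

```python
def lcs_length(A, B):
    """Algorithm 6 in Python: length of the longest common
    subsequence of A and B."""
    n, m = len(A), len(B)
    C = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Option a: match the final characters; b and c: drop one.
            a = 1 + C[i - 1][j - 1] if A[i - 1] == B[j - 1] else 0
            C[i][j] = max(a, C[i - 1][j], C[i][j - 1])
    return C[n][m]
```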
Exercise 9.

Execute the algorithm on input A = ALGORITHM, B = ANARCHISM. Fill in the two-dimensional table and give the final answer. Briefly explain how you filled in the squares where $i \leq 2$ and $j \leq 2$.

Solution.
        ''  A   L   G   O   R   I   T   H   M
''      0   0   0   0   0   0   0   0   0   0
A       0   1   1   1   1   1   1   1   1   1
N       0   1   1   1   1   1   1   1   1   1
A       0   1   1   1   1   1   1   1   1   1
R       0   1   1   1   1   2   2   2   2   2
C       0   1   1   1   1   2   2   2   2   2
H       0   1   1   1   1   2   2   2   3   3
I       0   1   1   1   1   2   3   3   3   3
S       0   1   1   1   1   2   3   3   3   3
M       0   1   1   1   1   2   3   3   3   4

The final answer is C[n,m] = 4.

To fill in those squares, first we used the base case to set C[i,j] = 0 if i == 0 or j == 0. Then for C[1,1], because the first characters are both A, they match and we use C[1,1] = 1 + C[0,0] = 1. For C[1,2], they don't match, so we use the max of C[1,1] and C[0,2], which is 1. For C[2,1], they don't match, so we use the max of C[1,1] and C[2,0], which is 1. For C[2,2], they don't match, so we use the max of C[1,2] and C[2,1], which is 1.


Exercise 10.

Explain how to modify the algorithm to reconstruct the longest subsequence itself. You do not need to write the entire code of a new algorithm, just describe how to do it. Briefly justify correctness.

Solution.

We make a second array D where D[i,j] tells us which choice we made when filling in C[i,j]. We can let D[i,j] = "a", "b", or "c" depending on which one achieved the max.

To reconstruct the subsequence afterward:

  • Start at i = n, j = m. Let S be an empty sequence.
  • If D[i,j] == "a": If A[i] == B[j], then we add this character to the beginning of S. Regardless, we then let i -= 1, j -= 1.
  • If D[i,j] == "b", then we let i -= 1.
  • If D[i,j] == "c", then we let j -= 1.
  • Repeat until i or j is zero, then stop and return S.

This is correct because: if D[i,j] == "a", then the optimal longest common subsequence of A[1:i] and B[1:j] includes the current character A[i] == B[j], preceded by the optimal subsequence of A[1:i-1] and B[1:j-1]. So we add this character to S, and then recurse to the case (i-1,j-1). Similarly for the other options.
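As a sketch, the modification described above can be written in Python (names are ours; D is filled so that "a" is recorded only when the characters actually match):

```python
def lcs(A, B):
    """LCS with reconstruction: record which option ("a", "b", or "c")
    achieved the max at each cell, then backtrack."""
    n, m = len(A), len(B)
    C = [[0] * (m + 1) for _ in range(n + 1)]
    D = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            a = 1 + C[i - 1][j - 1] if A[i - 1] == B[j - 1] else 0
            b, c = C[i - 1][j], C[i][j - 1]
            C[i][j] = max(a, b, c)
            if A[i - 1] == B[j - 1] and C[i][j] == a:
                D[i][j] = "a"  # matched A[i] == B[j]
            elif C[i][j] == b:
                D[i][j] = "b"  # dropped the last character of A
            else:
                D[i][j] = "c"  # dropped the last character of B
    # Backtrack from (n, m), collecting matched characters.
    S, i, j = [], n, m
    while i > 0 and j > 0:
        if D[i][j] == "a":
            S.append(A[i - 1])
            i, j = i - 1, j - 1
        elif D[i][j] == "b":
            i -= 1
        else:
            j -= 1
    return "".join(reversed(S))
```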


Section 4: All-pairs shortest paths

This section uses dynamic programming to solve all-pairs shortest paths.

Objectives. After learning this material, you should be able to:

The problem and algorithm

In the all-pairs shortest paths problem, we are given a weighted graph, and our goal is to output a data structure $D$ where $D[u,v]$ is the length of the shortest path from any starting vertex $u$ to any ending vertex $v$.

The challenge here is that $D[u,v]$ does not give us any useful subproblem to work with. We need a new clever idea to introduce simpler subproblems that enable us to build up a solution. As with knapsack, we'll introduce an extra variable. The key idea is to consider paths that only use a subset of the vertices. We can grow the subset to build up more complex solutions.

Subproblem definition. Let $d[u,v,k] = $ length of the shortest path from $u$ to $v$ using as intermediate nodes only vertices $1,\dots,k$.

Final solution. In particular, with $n$ vertices, $d[u,v,n] = $ the length of the shortest path from $u$ to $v$ when all vertices may be used as intermediate nodes. So if we set $D[u,v] = d[u,v,n]$ for all $u,v$, this will be correct.

Recurrence. For the base cases, we set $d[u,u,0] = 0$ for all $u$. Then, for all edges $(u,v)$ with length $w[u,v]$, we set $d[u,v,0] = w[u,v]$. For all other pairs, we set $d[u,v,0] = \infty$.

For the inductive case with $k \geq 1$, imagine we've solved $d[u,v,k-1]$ for all $u,v$ and now we want to compute $d[u,v,k]$. We set:

\begin{equation} d[u,v,k] = \min\begin{cases} d[u,v,k-1] \\ d[u,k,k-1] + d[k,v,k-1] \end{cases} . \end{equation}

Informally, this says we can either use the old route that didn't include $k$ at all, or we can include $k$. If we do, then we must route from $u$ to $k$ somehow, using distance $d[u,k,k-1]$, and then route to $v$ somehow, using distance $d[k,v,k-1]$.

Claim 3.

The recurrence is correct, i.e. $d[u,v,k] = $ the length of the shortest path from $u$ to $v$ that goes through only vertices $1,\dots,k$.


Proof.

The shortest path from $u$ to $v$, using only intermediate vertices $1,\dots,k$, either uses vertex $k$ or it doesn't. Suppose it doesn't. Then $d[u,v,k] = d[u,v,k-1]$, by definition.

Suppose it does. Then the shortest path using $1,\dots,k$ has the form $u, \dots, k, \dots, v$. The portion $u,\dots,k$ must be a shortest path from $u$ to $k$ using intermediate vertices $1,\dots,k-1$. (Otherwise, we could replace that portion with a shorter one and reduce the total distance from $u$ to $v$, a contradiction.) Similarly, the portion $k,\dots,v$ must be a shortest path from $k$ to $v$. So in this case, $d[u,v,k] = d[u,k,k-1] + d[k,v,k-1]$.

Since the shortest path must be one of these two cases, it is the smaller of the two.


Putting the pieces together, we get this algorithm:


// Algorithm 7: Floyd-Warshall
1  floyd_warshall(G, w):
2      let d[u,v,0] = 0 if u==v, w[u,v] if (u,v) is an edge, or infinity otherwise
3      for k = 1 to n:
4          for u = 1 to n:
5              for v = 1 to n:
6                  set d[u,v,k] = min(d[u,v,k-1], d[u,k,k-1] + d[k,v,k-1])
7      let D[u,v] = d[u,v,n] for all u,v
8      return D

Correctness. As usual with dynamic programming, correctness follows from above arguments that the subproblem, final solution, and recurrence are correct.

Efficiency. Initialization requires up to $O(n^2)$ time, since we set $d[u,v,0]$ for all pairs of nodes. Similarly, returning the solution requires constructing an $O(n^2)$ array, which has the same running time. There are three nested loops, each with $n$ iterations, and constant-time operations within each. So the running time is dominated by $O(n^3)$.

The space includes $D$ and local variables, but is dominated by $d$ which uses $O(n^3)$ space.
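As a sketch, Floyd-Warshall in Python (vertices 0-indexed; as a minor twist, this version keeps only two layers of $d[u,v,k]$ at a time, which cuts space to $O(n^2)$ without changing the recurrence):

```python
INF = float("inf")

def floyd_warshall(n, edges):
    """All-pairs shortest path lengths on vertices 0..n-1; edges maps
    (u, v) to the edge weight. Keeps only the k-1 layer of d[u][v][k]
    while building the k layer."""
    d = [[INF] * n for _ in range(n)]
    for u in range(n):
        d[u][u] = 0
    for (u, v), wt in edges.items():
        d[u][v] = min(d[u][v], wt)
    for k in range(n):
        # d currently holds layer k-1; build layer k.
        d = [[min(d[u][v], d[u][k] + d[k][v]) for v in range(n)]
             for u in range(n)]
    return d
```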

Reconstructing the solution

In this case, we obtained the lengths of the shortest paths, but not the actual paths themselves. As usual, reconstructing the solution will involve remembering the choices made when solving the subproblems, but here the full procedure is a bit unusual.

A merge approach. The most direct approach, applying our usual DP method, is as follows. Let us create an array inter[u,v] standing for an "intermediate" vertex between $u$ and $v$. Initially, we set inter[u,v] = none for all u,v. Whenever we make a modification d[u,v,k] = d[u,k,k-1] + d[k,v,k-1], we set inter[u,v] = k.

Now, we can reconstruct the path from $u$ to $v$ as follows: if inter[u,v] is none, the shortest path is just the edge $(u,v)$; otherwise, letting $k = $ inter[u,v], recursively reconstruct the path from $u$ to $k$ and the path from $k$ to $v$, and concatenate them at $k$.

A "next" approach. Notice that if a shortest path is of the form $u,x,\dots,v$, then it is also true that $x,\dots,v$ is a shortest path from $x$ to $v$. This implies that we only need to know, for each pair $u,v$, what the "next" vertex is on a shortest path. If we find that it is $x$, then we continue by finding the next vertex on the path from $x$ to $v$, etc.

So initialize next[u,v] = v if there is an edge $(u,v)$ and otherwise next[u,v] = none. Whenever we make a modification d[u,v,k] = d[u,k,k-1] + d[k,v,k-1], we can set next[u,v] = next[u,k], since the shortest path to $v$ proceeds by first taking the shortest path to $k$.

In this case, reconstruction is even easier: starting at $u$, repeatedly follow next[u,v] until reaching $v$, outputting each vertex along the way. (If next[u,v] is none, there is no path.)
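
As a sketch, the "next" reconstruction in Python (the table layout and names are ours; next_hop[u][v] is assumed to hold the vertex after u on a shortest u-to-v path, or None if v is unreachable):

```python
def reconstruct_path(next_hop, u, v):
    """Recover a shortest path from u to v by following the next table
    one hop at a time."""
    if u == v:
        return [u]
    if next_hop[u][v] is None:
        return None  # no path exists
    path = [u]
    while u != v:
        u = next_hop[u][v]
        path.append(u)
    return path
```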