Chapter 6: Dynamic Programming
The Sledgehammers of Algorithm Design
In the preceding chapters we have seen some elegant design principles, such as divide-and-conquer, graph exploration, and greedy choice, that yield definitive algorithms for a variety of important computational tasks. The drawback of these tools is that they can only be used on very specific types of problems. We now turn to the two sledgehammers of the algorithms craft, dynamic programming and linear programming, techniques of very broad applicability that can be invoked when more specialized methods fail. Predictably, this generality often comes with a cost in efficiency.
This paragraph introduces dynamic programming (DP) and linear programming as powerful, general-purpose problem-solving techniques. It contrasts them with more specialized methods like greedy algorithms or divide-and-conquer, which are elegant but only work for specific problem structures. Dynamic programming is presented as a "sledgehammer"—a versatile and powerful tool that can solve a very wide range of problems, especially when other methods fail.
6.1 Shortest paths in dags, revisited
A Familiar Problem: Shortest Paths in DAGs
At the conclusion of our study of shortest paths (Chapter 4), we observed that the problem is especially easy in directed acyclic graphs (dags). Let's recapitulate this case, because it lies at the heart of dynamic programming.
This paragraph sets the stage by revisiting a problem that is easy to solve: finding the shortest path in a Directed Acyclic Graph (DAG). The text highlights that the solution to this specific problem contains the core logic that defines the entire paradigm of dynamic programming.
The Key Property of a DAG: Linearization
The special distinguishing feature of a dag is that its nodes can be linearized; that is, they can be arranged on a line so that all edges go from left to right (Figure 6.1). To see why this helps with shortest paths, suppose we want to figure out distances from node $S$ to the other nodes. For concreteness, let's focus on node $D$. The only way to get to it is through its predecessors, $B$ or $C$; so to find the shortest path to $D$, we need only compare these two routes:
The key feature of a DAG that makes it easy to work with is that it has no cycles. This means its vertices can be linearized (also known as being topologically sorted)—arranged in a straight line such that all directed edges only point from left to right. This ordering is fundamental because it guarantees that by the time you process any node, you have already processed all the nodes that could possibly lead to it.
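As a minimal sketch of this linearization step, here is a DFS-based topological sort in Python; the adjacency-list representation, the name `linearize`, and the example graph (loosely modeled on, but not claimed to be, Figure 6.1) are assumptions made for illustration.

```python
def linearize(graph):
    """Return the vertices of a dag in topological (left-to-right) order.

    `graph` is assumed to map each vertex to a list of its successors.
    """
    visited, order = set(), []

    def dfs(u):
        visited.add(u)
        for v in graph.get(u, []):
            if v not in visited:
                dfs(v)
        order.append(u)  # appended only after all of u's descendants

    for u in graph:
        if u not in visited:
            dfs(u)
    return order[::-1]  # reverse postorder is a topological order

# A small illustrative dag: every edge goes "left to right" in the returned order.
print(linearize({'S': ['A', 'C'], 'C': ['A', 'D'], 'A': ['B'],
                 'B': ['D', 'E'], 'D': ['E'], 'E': []}))
# -> ['S', 'C', 'A', 'B', 'D', 'E']
```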

The Recurrence Relation
$$ \operatorname{dist}(D)=\min \{\operatorname{dist}(B)+1, \operatorname{dist}(C)+3\} . $$ A similar relation can be written for every node. If we compute these dist values in the left-to-right order of Figure 6.1, we can always be sure that by the time we get to a node $v$, we already have all the information we need to compute $\operatorname{dist}(v)$. We are therefore able to compute all distances in a single pass:
This section provides the core mathematical relationship, or recurrence relation, for this problem. The shortest path to any node $D$ must pass through one of its immediate predecessors ($B$ or $C$ in the example). Therefore, we only need to compare the path lengths of those incoming routes:
- Path through B: dist(B) + length(B, D)
- Path through C: dist(C) + length(C, D)

We take the minimum of these options. Because we process the nodes in their linearized order, we are guaranteed to have already calculated the final values for dist(B) and dist(C) by the time we get to D.
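For instance, with assumed values $\operatorname{dist}(B)=6$ and $\operatorname{dist}(C)=2$ (illustrative numbers, not read off Figure 6.1), the recurrence gives $\operatorname{dist}(D)=\min\{6+1,\ 2+3\}=5$: the route through $C$ wins even though the edge entering $D$ from $B$ is shorter.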
The Single-Pass Algorithm
\begin{algorithm} \caption{Shortest paths in a dag.} \begin{algorithmic} \State initialize all $dist(\cdot)$ values to $\infty$ \State $dist(s) \gets 0$ \For{each $v \in V \setminus \{s\}$, in linearized order} \State $dist(v) \gets \min_{(u,v) \in E} \{dist(u) + l(u,v)\}$ \EndFor \end{algorithmic} \end{algorithm}
This pseudocode shows the complete algorithm. It works in a single pass through the topologically sorted vertices:
- Initialize all distances to infinity, except for the start node s, which is 0.
- Go through the vertices one by one according to their linearized order.
- For each vertex v, calculate its shortest distance by applying the `min` formula, using the already-computed `dist` values of its predecessors.
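Here is a hedged Python rendering of the pseudocode above; the graph representation (a dict of weighted adjacency lists), the function name, and the sample edge lengths are assumptions for the sake of a runnable example.

```python
import math

def dag_shortest_paths(graph, s, order):
    """Single-pass shortest paths in a dag.

    graph: dict mapping each vertex u to a list of (v, length) pairs.
    s:     the source vertex.
    order: the vertices in linearized (topological) order, e.g. as produced
           by the `linearize` sketch earlier.
    """
    # Collect each vertex's incoming edges, so dist(v) can look at its predecessors.
    preds = {u: [] for u in order}
    for u in order:
        for v, length in graph.get(u, []):
            preds[v].append((u, length))

    dist = {u: math.inf for u in order}
    dist[s] = 0
    for v in order:                      # left-to-right, in linearized order
        if v != s and preds[v]:
            dist[v] = min(dist[u] + length for u, length in preds[v])
    return dist

# Edge lengths below are made up for illustration; they are not read off Figure 6.1.
g = {'S': [('A', 1), ('C', 2)], 'C': [('A', 4), ('D', 3)],
     'A': [('B', 6)], 'B': [('D', 1), ('E', 2)], 'D': [('E', 1)], 'E': []}
print(dag_shortest_paths(g, 'S', ['S', 'C', 'A', 'B', 'D', 'E']))
# -> {'S': 0, 'C': 2, 'A': 1, 'B': 7, 'D': 5, 'E': 6}
```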
Solving Subproblems in Order
Notice that this algorithm is solving a collection of subproblems, $\{\operatorname{dist}(u): u \in V\}$. We start with the smallest of them, $\operatorname{dist}(s)$, since we immediately know its answer to be 0. We then proceed with progressively "larger" subproblems: distances to vertices that are further and further along in the linearization...
This paragraph explicitly connects the algorithm to the core idea of dynamic programming. The main problem ("find all shortest paths") is broken down into a collection of smaller subproblems, where each subproblem is: "what is the shortest distance to node u?". The algorithm solves these subproblems one by one in a specific order (the linearized order), starting with the smallest subproblem (dist(s)) and using its answer to help solve the next ones.
A General Technique
This is a very general technique. At each node, we compute some function of the values of the node's predecessors. It so happens that our particular function is a minimum of sums, but we could just as well make it a maximum, in which case we would get longest paths in the dag. Or we could use a product instead of a sum...
This paragraph highlights the flexibility of the dynamic programming framework. The same fundamental algorithm (traversing a DAG in linear order and applying a recurrence at each node) can solve many different problems simply by changing the function being computed. Instead of a `min` of `sums` for shortest paths, you could use a `max` of `sums` to find longest paths, or a `product` to find something else entirely.
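As a hedged sketch of this generality, the combining rule can simply be passed in as parameters; the names `dag_path_values`, `better`, and `combine` are illustrative assumptions, not taken from the text.

```python
import operator

def dag_path_values(graph, s, order, better=min, combine=operator.add, start=0):
    """Generic single pass over a dag in linearized order.

    better=min, combine=add -> shortest paths
    better=max, combine=add -> longest paths
    better=max, combine=mul -> largest-product paths (use start=1)
    graph maps u to a list of (v, edge_value) pairs; order is topological.
    """
    preds = {u: [] for u in order}
    for u in order:
        for v, w in graph.get(u, []):
            preds[v].append((u, w))

    value = {u: None for u in order}   # None marks "not reachable yet"
    value[s] = start
    for v in order:
        candidates = [combine(value[u], w) for u, w in preds[v] if value[u] is not None]
        if v != s and candidates:
            value[v] = better(candidates)
    return value
```

Calling this with `better=max` on the same dag computes longest paths in one pass, which is exactly the kind of swap the paragraph describes.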
What is Dynamic Programming?
Dynamic programming is a very powerful algorithmic paradigm in which a problem is solved by identifying a collection of subproblems and tackling them one by one, smallest first, using the answers to small problems to help figure out larger ones... In dynamic programming we are not given a dag; the dag is implicit. Its nodes are the subproblems we define, and its edges are the dependencies between the subproblems...
This is the central definition of dynamic programming. It is a technique built on three key ideas:
- Break the problem into subproblems: The first step is to define the smaller pieces of the problem (e.g., `dist(u)`).
- Solve subproblems in order: You must solve them from "smallest" to "largest," so that answers to dependencies are always ready when you need them.
- Store and reuse solutions: Solutions to smaller subproblems are stored and used to build up solutions to larger ones.
The paragraph makes a beautiful point: for most DP problems, the DAG isn't given to you—it's an **implicit** structure where the nodes are the subproblems and the edges are their dependencies. Dynamic programming is the process of solving the subproblems in the correct topological order of this implicit DAG.
On to an Example
But it's time we saw an example.
This is a transition, indicating that the text will now move from the abstract definition of dynamic programming to a concrete example to illustrate how these ideas are applied in practice.
6.2 Longest increasing subsequences
Defining the Problem
In the longest increasing subsequence problem, the input is a sequence of numbers $a_{1}, \ldots, a_{n}$. A subsequence is any subset of these numbers taken in order, of the form $a_{i_{1}}, a_{i_{2}}, \ldots, a_{i_{k}}$ where $1 \leq i_{1} < i_{2} < \cdots < i_{k} \leq n$, and an increasing subsequence is one in which the numbers are getting strictly larger. The task is to find the increasing subsequence of greatest length. For instance, the longest increasing subsequence of $5,2,8,6,3,6,9,7$ is $2,3,6,9$:
This paragraph defines the Longest Increasing Subsequence (LIS) problem. It's important to distinguish between a "substring" (which must be contiguous) and a "subsequence" (which can have gaps). For the sequence 5, 2, 8, 6, the subsequence 2, 6 is valid because 2 appears before 6 in the original sequence. The goal is to find the longest such subsequence in which the values are strictly increasing.
A New Perspective: From Sequences to Graphs
In this example, the arrows denote transitions between consecutive elements of the optimal solution. More generally, to better understand the solution space, let's create a graph of all permissible transitions: establish a node $i$ for each element $a_{i}$, and add directed edges $(i, j)$ whenever it is possible for $a_{i}$ and $a_{j}$ to be consecutive elements in an increasing subsequence, that is, whenever $i < j$ and $a_{i} < a_{j}$ (Figure 6.2).
This is a crucial step in finding a solution: transforming the problem into a different domain. Here, the sequence problem is converted into a graph problem. Each number in the sequence becomes a node in a graph. A directed edge is drawn from node `i` to node `j` only if `j` can follow `i` in an increasing subsequence (meaning `j` appears after `i` in the sequence and has a larger value).
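A hedged one-liner that builds this transition graph for the text's example sequence (the variable names are assumptions):

```python
a = [5, 2, 8, 6, 3, 6, 9, 7]   # the example sequence from the text
n = len(a)

# One node per position; an edge (i, j) whenever a_j may follow a_i in an
# increasing subsequence, i.e. i < j and a[i] < a[j] (strictly increasing).
edges = [(i, j) for i in range(n) for j in range(i + 1, n) if a[i] < a[j]]
```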

The Key Insight: LIS as Longest Path in a DAG
Notice that (1) this graph $G=(V, E)$ is a dag, since all edges $(i, j)$ have $i < j$, and (2) there is a one-to-one correspondence between increasing subsequences and paths in this dag. Therefore, our goal is simply to find the longest path in the dag!
This paragraph reveals the solution path. The graph we constructed is a **Directed Acyclic Graph (DAG)** because edges only point from smaller indices to larger ones, so it's impossible to form a cycle. Crucially, any path through this DAG corresponds exactly to a valid increasing subsequence. Therefore, the LIS problem has now been simplified to finding the **longest path in a DAG**—a well-understood graph problem.
The Dynamic Programming Algorithm
Here is the algorithm: $$ \begin{aligned} & \text { for } j=1,2, \ldots, n \text { : } \\ & \quad L(j)=1+\max \{L(i):(i, j) \in E\} \\ & \text { return } \max _{j} L(j) \end{aligned} $$ $L(j)$ is the length of the longest path—the longest increasing subsequence—ending at $j$...
This presents the core dynamic programming algorithm. Let's break down the logic:
- We define $L(j)$ as the length of the longest increasing subsequence that **ends at position** $j$.
- We iterate from $j=1$ to $n$, calculating each $L(j)$ in order.
- To calculate $L(j)$, we look at all previous nodes $i$ that could connect to $j$ (i.e., where $i < j$ and $a_i < a_j$). We find the one that has the longest path ending at it (the maximum $L(i)$ value).
- The length of the path to $j$ is then $1$ (for $j$ itself) plus the length of the best path leading into it.
- The final answer is the maximum value in our `L` array, since the longest subsequence could end at any position.
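Here is a hedged Python rendering of the algorithm above, 0-indexed and with the implicit dag's edges tested on the fly rather than precomputed; the function name is an assumption.

```python
def longest_increasing_subsequence_length(a):
    """Return the length of the longest strictly increasing subsequence of a."""
    n = len(a)
    if n == 0:
        return 0
    # L[j] = length of the longest increasing subsequence ending at position j.
    L = [1] * n
    for j in range(n):
        for i in range(j):
            if a[i] < a[j]:            # edge (i, j) in the implicit dag
                L[j] = max(L[j], L[i] + 1)
    return max(L)

print(longest_increasing_subsequence_length([5, 2, 8, 6, 3, 6, 9, 7]))  # -> 4
```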
The Essence of Dynamic Programming
This is dynamic programming. In order to solve our original problem, we have defined a collection of subproblems $\{L(j): 1 \leq j \leq n\}$ with the following key property...: (*) There is an ordering on the subproblems, and a relation that shows how to solve a subproblem given the answers to "smaller" subproblems...
This paragraph explicitly defines why the previous algorithm is a perfect example of dynamic programming. It satisfies the key properties:
- A collection of subproblems: Our subproblems are "What is the LIS ending at position $j$?" for all $j$ from 1 to $n$.
- An ordering: We solve the subproblems in a specific, linear order: for $L(1)$, then $L(2)$, then $L(3)$, and so on.
- A recurrence relation: To solve for a subproblem $L(j)$, we only rely on the answers to "smaller" subproblems—those $L(i)$ where $i < j$, which we have already calculated and stored.
This "bottom-up" approach of solving smaller pieces first and using their results to build up to the final answer is the hallmark of dynamic programming.
Efficiency and Path Reconstruction
How long does this step take? It requires the predecessors of $j$ to be known... The computation of $L(j)$ then takes time proportional to the indegree of $j$, giving an overall running time linear in $|E|$. This is at most $O\left(n^{2}\right)$... we can recover the subsequence itself... with the same bookkeeping device we used for shortest paths... While computing $L(j)$, we should also note down $\operatorname{prev}(j)$, the next-to-last node on the longest path to $j$.
The runtime of this algorithm is analyzed. For each element $j$, we look at all previous elements $i < j$. In the worst case (a sorted array), every previous element is a valid predecessor. This results in a nested loop structure, giving a total runtime of $O(n^2)$. To reconstruct the actual subsequence (not just its length), we can use a `prev` array, just as in Dijkstra's algorithm. When we find the best predecessor `i` for `j`, we store it: `prev[j] = i`. After the `L` array is full, we can find the max value, start from that index, and follow the `prev` pointers backward to find the path.
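A minimal sketch of this bookkeeping, extending the previous function with a `prev` array (the function name is an assumption):

```python
def longest_increasing_subsequence(a):
    """Return one longest strictly increasing subsequence of a."""
    n = len(a)
    if n == 0:
        return []
    L = [1] * n          # L[j]: LIS length ending at position j
    prev = [None] * n    # prev[j]: best predecessor of j on that path
    for j in range(n):
        for i in range(j):
            if a[i] < a[j] and L[i] + 1 > L[j]:
                L[j] = L[i] + 1
                prev[j] = i
    # Start from the position where the longest subsequence ends ...
    j = max(range(n), key=lambda k: L[k])
    # ... and follow the prev pointers backward.
    path = []
    while j is not None:
        path.append(a[j])
        j = prev[j]
    return path[::-1]

print(longest_increasing_subsequence([5, 2, 8, 6, 3, 6, 9, 7]))  # -> [2, 3, 6, 9]
```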
Recursion? No, thanks.
Why Not Just Use Recursion?
...the formula for $L(j)$ also suggests an alternative, recursive algorithm. Wouldn't that be even simpler? Actually, recursion is a very bad idea: the resulting procedure would require exponential time! To see why, suppose that the dag contains edges $(i, j)$ for all $i < j$... the formula for subproblem $L(j)$ becomes $L(j)=1+\max \{L(1), L(2), \ldots, L(j-1)\}$. The following figure unravels the recursion for $L(5)$. Notice that the same subproblems get solved over and over again!
This section addresses a common question: if we have a recurrence relation, why not just write a simple recursive function? The text explains this is a terrible idea for dynamic programming problems. A naive recursive implementation would re-calculate the same subproblems an exponential number of times. For example, to calculate `L(5)`, it would call `L(4)`, `L(3)`, `L(2)`, and `L(1)`. To calculate `L(4)`, it would again call `L(3)`, `L(2)`, and `L(1)`. The redundant computations explode exponentially.
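To make the blow-up visible, here is a small hedged experiment for the worst case the text describes (edges $(i, j)$ for all $i < j$); the call counter and `lru_cache` comparison are illustrative devices, not part of the text's algorithm.

```python
from functools import lru_cache

calls = 0

def L_naive(j):
    """Naive recursion for L(j) = 1 + max{L(1), ..., L(j-1)} in the worst-case dag."""
    global calls
    calls += 1
    if j == 1:
        return 1
    return 1 + max(L_naive(i) for i in range(1, j))

L_naive(16)
print(calls)        # -> 32768 (= 2^15): the same subproblems are recomputed over and over

@lru_cache(maxsize=None)
def L_memo(j):
    """Same recurrence, but each distinct subproblem is solved only once."""
    if j == 1:
        return 1
    return 1 + max(L_memo(i) for i in range(1, j))

print(L_memo(16))   # -> 16; only 16 distinct subproblems are ever evaluated
```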
Divide-and-Conquer vs. Dynamic Programming
Then why did recursion work so well with divide-and-conquer? The key point is that in divide-and-conquer, a problem is expressed in terms of subproblems that are substantially smaller... In contrast, in a typical dynamic programming formulation, a problem is reduced to subproblems that are only slightly smaller... However, it turns out that most of these nodes are repeats... Efficiency is therefore obtained by explicitly enumerating the distinct subproblems and solving them in the right order.
This is a critical distinction between two major algorithmic paradigms:
- Divide and Conquer (e.g., Mergesort): Works well with recursion because it breaks a problem into a few, *independent* subproblems that are much smaller (e.g., half the size). The subproblems don't overlap.
- Dynamic Programming (e.g., LIS): Is needed when the subproblems are only slightly smaller (e.g., size `n-1` for a problem of size `n`) and are heavily *overlapping*. DP's power comes from using a table (or array) to store the results of subproblems, ensuring each one is calculated only once.