11: Dynamic Programming
Dynamic programming (DP) solves optimization problems by decomposing them into overlapping subproblems and storing intermediate results to avoid redundant computation. It is the standard approach for sequence alignment, shortest paths, and many combinatorial optimization problems that arise in ML pipelines and systems.
When to Use Dynamic Programming
A problem is amenable to DP when it has:
- **Optimal substructure.** The optimal solution to the problem contains optimal solutions to subproblems. If the shortest path from A to C goes through B, then the sub-path from A to B must itself be shortest.
- **Overlapping subproblems.** The same subproblems are solved multiple times in a naive recursive approach. Without memoization, the recursive Fibonacci algorithm computes $F(n-2)$ twice, $F(n-3)$ three times, and so on, leading to exponential time.
DP trades space for time: store subproblem solutions in a table to avoid recomputation.
Implementation Strategies
Top-Down (Memoization)
Write the recursive solution, then cache results:
```python
def fib(n, memo=None):
    if memo is None:  # avoid the mutable-default-argument pitfall
        memo = {}
    if n <= 1:
        return n
    if n not in memo:
        memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
    return memo[n]
```

Time: $O(n)$, Space: $O(n)$. The recursion tree is pruned: each subproblem is solved once.
Bottom-Up (Tabulation)
Fill a table iteratively from the base cases:
```python
def fib(n):
    if n <= 1:  # base cases; also keeps the table allocation valid
        return n
    dp = [0] * (n + 1)
    dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]
    return dp[n]
```

Same time and space complexity, but avoids recursion overhead and stack depth limits. Bottom-up is generally preferred for production code.
Space optimization. When the recurrence only depends on a fixed number of previous entries, reduce space by keeping only those entries. Fibonacci needs only the last 2 values: $O(1)$ space.
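A minimal sketch of the space-optimized version, keeping only the two most recent values:

```python
def fib(n):
    # prev, curr hold F(i) and F(i+1); the full table is never stored.
    prev, curr = 0, 1
    for _ in range(n):
        prev, curr = curr, prev + curr
    return prev  # after n steps, prev == F(n)
```

The same trick applies to any recurrence with a bounded lookback, e.g. keeping two rows of the LCS table instead of the full matrix.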
Classic Problems
Longest Common Subsequence (LCS)
Given sequences $X$ of length $m$ and $Y$ of length $n$, find the length of their longest common subsequence.

Recurrence:

$$
L[i][j] = \begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0 \\
L[i-1][j-1] + 1 & \text{if } X_i = Y_j \\
\max(L[i-1][j],\; L[i][j-1]) & \text{otherwise}
\end{cases}
$$

Time: $O(mn)$, Space: $O(mn)$ (reducible to $O(\min(m, n))$).
ML connection. LCS and edit distance are used in NLP for string similarity, text alignment, and evaluation metrics (ROUGE-L uses LCS for summarization evaluation).
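A straightforward tabulation of the LCS recurrence:

```python
def lcs_length(x, y):
    m, n = len(x), len(y)
    # dp[i][j] = LCS length of the prefixes x[:i] and y[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1  # extend the common subsequence
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]
```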
Edit Distance (Levenshtein Distance)
Minimum number of insertions, deletions, and substitutions to transform string $a$ into string $b$.

Time: $O(mn)$ for strings of lengths $m$ and $n$. Used in spell checking, DNA sequence alignment, and fuzzy string matching.
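A standard tabulated implementation of Levenshtein distance:

```python
def edit_distance(a, b):
    m, n = len(a), len(b)
    # dp[i][j] = edit distance between a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all i characters of a
    for j in range(n + 1):
        dp[0][j] = j  # insert all j characters of b
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # match / substitution
    return dp[m][n]
```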
0/1 Knapsack
Given $n$ items with weights $w_i$ and values $v_i$, and a capacity $W$, maximize total value without exceeding capacity.

Time: $O(nW)$. This is pseudo-polynomial: polynomial in $n$ and $W$ but exponential in the number of bits needed to represent $W$. The knapsack problem is NP-hard, so no truly polynomial algorithm exists (unless P = NP).
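A sketch using the common one-dimensional rolling-table optimization, which reduces space from $O(nW)$ to $O(W)$:

```python
def knapsack(weights, values, capacity):
    # dp[w] = best value achievable with total weight at most w
    dp = [0] * (capacity + 1)
    for wt, val in zip(weights, values):
        # Iterate capacities downward so each item is used at most once.
        for w in range(capacity, wt - 1, -1):
            dp[w] = max(dp[w], dp[w - wt] + val)
    return dp[capacity]
```

Iterating capacities upward instead would allow items to be reused, which solves the unbounded knapsack variant.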
Matrix Chain Multiplication
Given $n$ matrices $A_1, \dots, A_n$, where $A_i$ has dimensions $p_{i-1} \times p_i$, find the parenthesization that minimizes the total number of scalar multiplications.

Time: $O(n^3)$, Space: $O(n^2)$. The optimal substructure: the best way to multiply $A_i \cdots A_j$ must use an optimal split point $k$ and optimal parenthesizations of both halves.
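A minimal interval-DP sketch, taking the dimension list $p$ where matrix $A_i$ is $p_{i-1} \times p_i$:

```python
def matrix_chain(p):
    n = len(p) - 1  # number of matrices
    # cost[i][j] = min scalar multiplications for A_i ... A_j (1-indexed)
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):              # interval length
        for i in range(1, n - length + 2):
            j = i + length - 1
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j] + p[i - 1] * p[k] * p[j]
                for k in range(i, j)            # try every split point
            )
    return cost[1][n]
```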
Shortest Paths
Single-Source: Bellman-Ford
For a graph with $|V|$ vertices and $|E|$ edges (possibly negative weights), repeatedly relax every edge:

$$
d[v] \leftarrow \min\bigl(d[v],\; d[u] + w(u, v)\bigr) \quad \text{for each edge } (u, v)
$$

Iterate $|V| - 1$ times over all edges. Time: $O(|V| \cdot |E|)$. Detects negative cycles (if any distance decreases on the $|V|$-th iteration).
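A compact sketch with vertices numbered $0 \dots n-1$ and edges as `(u, v, weight)` triples:

```python
def bellman_ford(n, edges, source):
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):          # |V| - 1 rounds of relaxation
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # Extra pass: any further improvement implies a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("negative cycle detected")
    return dist
```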
All-Pairs: Floyd-Warshall
$$
d^{(k)}[i][j] = \min\bigl(d^{(k-1)}[i][j],\; d^{(k-1)}[i][k] + d^{(k-1)}[k][j]\bigr)
$$

Time: $O(|V|^3)$, Space: $O(|V|^2)$. Considers all intermediate vertices $k$ in turn.
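A direct implementation over an adjacency matrix (`inf` for missing edges, 0 on the diagonal):

```python
def floyd_warshall(dist):
    n = len(dist)
    d = [row[:] for row in dist]    # don't mutate the input matrix
    for k in range(n):              # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```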
DP in Machine Learning
| Algorithm | DP Component | Application |
|---|---|---|
| Viterbi | Most probable state sequence in HMMs | Speech recognition, POS tagging |
| CTC decoding | Best alignment between input and output | Speech-to-text |
| Beam search | Approximate sequence decoding | Text generation, machine translation |
| Needleman-Wunsch | Global sequence alignment | Bioinformatics, token alignment |
| Value iteration | Bellman equation for MDPs | Reinforcement learning |
The Viterbi algorithm finds the most likely hidden state sequence in a Hidden Markov Model. It is a DP over the trellis of states and time steps:

$$
\delta_t(j) = \max_i \; \delta_{t-1}(i) \, a_{ij} \, b_j(o_t)
$$

where $\delta_t(j)$ is the probability of the most likely path ending in state $j$ at time $t$, $a_{ij}$ is the transition probability from state $i$ to state $j$, and $b_j(o_t)$ is the probability of emitting observation $o_t$ from state $j$. Time: $O(TN^2)$, where $T$ is the sequence length and $N$ is the number of states.
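A sketch using plain probabilities and dict-based parameters (an arbitrary choice of API; production implementations work in log-space to avoid numerical underflow on long sequences):

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    # delta[t][s]: probability of the best path ending in state s at time t
    delta = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        delta.append({})
        back.append({})
        for j in states:
            best_i = max(states, key=lambda i: delta[t - 1][i] * trans_p[i][j])
            delta[t][j] = delta[t - 1][best_i] * trans_p[best_i][j] * emit_p[j][obs[t]]
            back[t][j] = best_i
    # Trace back the most probable state sequence.
    last = max(states, key=lambda s: delta[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))
```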
Value iteration in reinforcement learning applies the Bellman optimality equation:

$$
V_{k+1}(s) = \max_a \sum_{s'} P(s' \mid s, a) \bigl[ R(s, a, s') + \gamma V_k(s') \bigr]
$$

This is DP over states: each iteration improves the value estimate by propagating information one step backward from successor states.
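A minimal sketch for a tabular MDP, assuming for simplicity that rewards depend only on $(s, a)$ rather than the full $(s, a, s')$ triple:

```python
def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-6):
    # P[s][a]: list of (prob, next_state) pairs; R[s][a]: immediate reward
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(
                R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                for a in actions
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V
```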
Complexity Analysis
| Problem | Time | Space | Type |
|---|---|---|---|
| Fibonacci | $O(n)$ | $O(1)$ | 1D, linear |
| LCS / Edit distance | $O(mn)$ | $O(mn)$ | 2D, quadratic |
| Knapsack | $O(nW)$ | $O(nW)$ | Pseudo-polynomial |
| Matrix chain | $O(n^3)$ | $O(n^2)$ | Interval DP |
| Floyd-Warshall | $O(|V|^3)$ | $O(|V|^2)$ | All-pairs |
| Viterbi | $O(TN^2)$ | $O(TN)$ | Trellis DP |
Summary
Dynamic programming reduces exponential brute-force search to polynomial time by exploiting overlapping subproblems and optimal substructure. The key steps: define the state space, write the recurrence relation, determine the base cases, and choose top-down (memoization) or bottom-up (tabulation) implementation. In ML, DP appears in sequence decoding (Viterbi, CTC, beam search), reinforcement learning (value iteration), and evaluation metrics (ROUGE-L).