15: Approximation Algorithms

Motivation and Approximation Ratios

When a problem is NP-hard, we abandon the requirement of exact optimality and instead seek solutions that are provably close to optimal. An approximation algorithm is a polynomial-time algorithm that returns a feasible solution whose cost is within a guaranteed factor of the optimum.

For a minimization problem, an algorithm has approximation ratio $\rho(n)$ if for every instance of size $n$, the algorithm's cost $C$ satisfies $C \leq \rho(n) \cdot C^*$, where $C^*$ is the optimal cost. For maximization, the condition is $C^* \leq \rho(n) \cdot C$. In both cases $\rho(n) \geq 1$, and a ratio closer to 1 indicates a tighter guarantee.

An approximation scheme is a family of algorithms parametrized by $\epsilon > 0$ achieving ratio $(1 + \epsilon)$. A polynomial-time approximation scheme (PTAS) runs in time polynomial in $n$ for each fixed $\epsilon$ (but possibly exponential in $1/\epsilon$). A fully polynomial-time approximation scheme (FPTAS) runs in time polynomial in both $n$ and $1/\epsilon$.

Vertex Cover: A 2-Approximation

Problem. Given an undirected graph $G = (V, E)$, find a minimum-size set $S \subseteq V$ such that every edge has at least one endpoint in $S$.

Algorithm (Maximal Matching). Repeatedly select an arbitrary uncovered edge $(u, v)$, add both $u$ and $v$ to the cover, and remove all edges incident to $u$ or $v$. Continue until no edges remain.
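A minimal sketch of this procedure, assuming the graph arrives as a list of edges (the function name and representation are illustrative, not from the text):

```python
def vertex_cover_2approx(edges):
    """2-approximate vertex cover via a maximal matching: take any
    uncovered edge, add both endpoints, and skip every edge that is
    already covered."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:  # edge still uncovered
            cover.add(u)
            cover.add(v)
    return cover
```

On the path 1-2-3-4 this returns a cover of size 4 while the optimum {2, 3} has size 2, illustrating that the factor of 2 can actually be reached.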

Analysis. Let $M$ be the set of edges selected. Since no two edges in $M$ share an endpoint (they form a matching), the $|M|$ edges require at least $|M|$ distinct vertices in any vertex cover (each edge must be covered). The algorithm outputs $2|M|$ vertices. Therefore the approximation ratio is:

$$\frac{|S|}{|S^*|} = \frac{2|M|}{|S^*|} \leq \frac{2|M|}{|M|} = 2$$

This bound is tight: consider a complete bipartite graph $K_{n,n}$. The algorithm may select all $n$ edges of a perfect matching and return $2n$ vertices, while the optimal cover (one side of the bipartition) has size $n$. Improving the ratio below 2 for vertex cover is a major open problem; under the Unique Games Conjecture, no polynomial-time algorithm achieves a ratio of $2 - \epsilon$ for any constant $\epsilon > 0$.

Traveling Salesman Problem

Problem. Given a complete graph with edge weights satisfying the triangle inequality ($d(u,w) \leq d(u,v) + d(v,w)$), find a minimum-weight Hamiltonian cycle.

2-Approximation via MST

Compute a minimum spanning tree $T$ of $G$. Since deleting any edge from an optimal tour yields a spanning tree, $w(T) \leq w(\text{OPT})$. Doubling every edge of $T$ gives an Eulerian graph whose total weight is $2\,w(T) \leq 2\,w(\text{OPT})$. An Euler tour of this graph visits every vertex at least once; shortcutting repeated vertices (using the triangle inequality to ensure shortcuts do not increase cost) yields a Hamiltonian cycle of weight at most $2\,w(\text{OPT})$.
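Shortcutting the Euler tour of the doubled tree is equivalent to visiting the vertices in a DFS preorder of the MST, which makes the algorithm short to sketch. The following is a minimal illustration (function name and distance-matrix input are assumptions):

```python
import heapq

def mst_preorder_tour(dist):
    """2-approximate metric TSP: build an MST with Prim's algorithm,
    then output its DFS preorder (the shortcut Euler tour of the
    doubled tree). dist is a symmetric matrix obeying the triangle
    inequality."""
    n = len(dist)
    in_tree = [False] * n
    children = [[] for _ in range(n)]
    pq = [(0, 0, None)]  # (edge weight, vertex, tree parent)
    while pq:
        d, u, p = heapq.heappop(pq)
        if in_tree[u]:
            continue  # stale heap entry
        in_tree[u] = True
        if p is not None:
            children[p].append(u)
        for v in range(n):
            if not in_tree[v]:
                heapq.heappush(pq, (dist[u][v], v, u))
    # DFS preorder of the tree = shortcut Euler tour of the doubled tree.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour  # visit in this order, then return to tour[0]
```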

Christofides’ Algorithm: 3/2-Approximation

Christofides (1976) improves on the doubling approach:

  1. Compute a minimum spanning tree $T$.
  2. Let $O$ be the set of odd-degree vertices in $T$ (by the handshaking lemma, $|O|$ is even).
  3. Find a minimum-weight perfect matching $M$ on the vertices in $O$.
  4. Combine $T$ and $M$ to get an Eulerian multigraph $H$ (all vertices now have even degree).
  5. Find an Euler tour of $H$ and shortcut to a Hamiltonian cycle.
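The five steps can be sketched as follows. A real minimum-weight perfect matching algorithm is beyond a few lines, so this illustrative version brute-forces step 3 and is only practical for small inputs; all names are assumptions, not from the text:

```python
import heapq

def christofides(dist):
    """3/2-approximate metric TSP sketch (small n only: the perfect
    matching is brute-forced). dist is a symmetric matrix obeying
    the triangle inequality."""
    n = len(dist)
    # Step 1: minimum spanning tree (Prim), stored as a multigraph.
    in_tree = [False] * n
    adj = [[] for _ in range(n)]
    degree = [0] * n
    pq = [(0, 0, None)]
    while pq:
        d, u, p = heapq.heappop(pq)
        if in_tree[u]:
            continue
        in_tree[u] = True
        if p is not None:
            adj[p].append(u); adj[u].append(p)
            degree[p] += 1; degree[u] += 1
        for v in range(n):
            if not in_tree[v]:
                heapq.heappush(pq, (dist[u][v], v, u))
    # Step 2: odd-degree vertices (an even count, by handshaking).
    odd = [v for v in range(n) if degree[v] % 2 == 1]

    # Step 3: minimum-weight perfect matching on odd, by brute force.
    def min_matching(verts):
        if not verts:
            return 0.0, []
        u, rest = verts[0], verts[1:]
        best_w, best_m = float('inf'), []
        for i, v in enumerate(rest):
            w, m = min_matching(rest[:i] + rest[i + 1:])
            if dist[u][v] + w < best_w:
                best_w, best_m = dist[u][v] + w, m + [(u, v)]
        return best_w, best_m

    _, matching = min_matching(odd)
    # Step 4: add matching edges; every degree is now even.
    for u, v in matching:
        adj[u].append(v); adj[v].append(u)
    # Step 5: Euler tour (Hierholzer's algorithm), then shortcut.
    stack, euler = [0], []
    while stack:
        u = stack[-1]
        if adj[u]:
            v = adj[u].pop()
            adj[v].remove(u)  # remove one copy of the multi-edge
            stack.append(v)
        else:
            euler.append(stack.pop())
    seen, tour = set(), []
    for v in euler:
        if v not in seen:
            seen.add(v)
            tour.append(v)
    return tour
```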

Analysis. We have $w(T) \leq w(\text{OPT})$. For the matching, consider the optimal tour restricted to the vertices in $O$: this defines a Hamiltonian cycle on $O$ of cost at most $w(\text{OPT})$ (by the triangle inequality, shortcuts only reduce cost). Since $|O|$ is even, this cycle decomposes into two perfect matchings whose weights sum to at most $w(\text{OPT})$, so the cheaper of the two has weight at most $w(\text{OPT})/2$. Since $M$ is a minimum-weight matching, $w(M) \leq w(\text{OPT})/2$. Therefore:

$$w(H) = w(T) + w(M) \leq w(\text{OPT}) + \frac{w(\text{OPT})}{2} = \frac{3}{2}\,w(\text{OPT})$$

For the general (non-metric) TSP, no constant-factor approximation exists unless $\text{P} = \text{NP}$.

Set Cover: Greedy $O(\log n)$ Approximation

Problem. Given a universe $U$ of $n$ elements and a collection $\mathcal{S}$ of subsets of $U$, find the fewest subsets whose union is $U$.

Greedy Algorithm. Repeatedly select the subset covering the most uncovered elements until $U$ is covered.

Theorem. The greedy algorithm achieves an approximation ratio of $H(n) = \sum_{k=1}^{n} 1/k = O(\ln n)$, where $n = |U|$.

Proof sketch. When a set $S_i$ is chosen, assign a cost of $1/|S_i \setminus C|$ to each newly covered element, where $C$ is the set of already covered elements; the total cost assigned equals the number of sets chosen. If the optimal solution uses $k$ sets, then at each step some set covers at least a $1/k$ fraction of the remaining elements. A harmonic-series argument then shows the total greedy cost is at most $H(n) \cdot k$.
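The greedy rule above can be sketched directly with Python sets (function name and input format are illustrative):

```python
def greedy_set_cover(universe, subsets):
    """Greedy set cover: repeatedly pick the subset that covers the
    most still-uncovered elements; achieves ratio H(|universe|)."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(subsets, key=lambda s: len(uncovered & s))
        if not uncovered & best:
            raise ValueError("subsets cannot cover the universe")
        chosen.append(best)
        uncovered -= best
    return chosen
```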

This $\ln n$ bound is essentially tight: under standard complexity assumptions, no polynomial-time algorithm achieves a ratio of $(1 - \epsilon) \ln n$ for any $\epsilon > 0$.

Load Balancing

Problem. Assign $n$ jobs with processing times $p_1, \ldots, p_n$ to $m$ identical machines to minimize the makespan (the maximum load on any machine).

Greedy (List Scheduling). Assign each job to the currently least-loaded machine.

Analysis. Let $T^* = \max\bigl(\max_j p_j,\ \frac{1}{m}\sum_j p_j\bigr)$, which is a lower bound on the optimal makespan. The greedy makespan satisfies $T \leq 2T^*$, giving a 2-approximation. The Longest Processing Time (LPT) rule, which sorts jobs in decreasing order before applying the greedy assignment, improves the ratio to $4/3$.
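Both rules fit in a few lines using a min-heap of machine loads (the function name and the `lpt` flag are illustrative):

```python
import heapq

def list_scheduling(jobs, m, lpt=False):
    """Assign each job to the currently least-loaded of m machines,
    tracked with a min-heap of loads; returns the makespan.
    lpt=True first sorts jobs in decreasing order (the LPT rule)."""
    if lpt:
        jobs = sorted(jobs, reverse=True)
    loads = [0] * m
    heapq.heapify(loads)
    for p in jobs:
        heapq.heappush(loads, heapq.heappop(loads) + p)
    return max(loads)
```

For jobs [1, 1, 2] on two machines, plain list scheduling in the given order yields makespan 3, while LPT places the long job first and achieves the optimal makespan 2.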

PTAS and FPTAS

Knapsack FPTAS. The 0/1 knapsack problem admits an FPTAS. The idea is to scale and round the item values: divide each value $v_i$ by a factor $K = \epsilon \cdot v_{\max} / n$, round down to an integer, and solve the rounded instance exactly via dynamic programming over values in $O(n^3 / \epsilon)$ time. Rounding loses at most $K$ per item, hence at most $nK = \epsilon \cdot v_{\max} \leq \epsilon \cdot \text{OPT}$ in total (assuming every item fits on its own), so the solution is within a $(1-\epsilon)$ factor of optimal.
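The scale-round-solve scheme can be sketched as follows, using the classic value-indexed dynamic program; the function name and the witness-set bookkeeping are illustrative choices, assuming nonnegative values and weights:

```python
def knapsack_fptas(values, weights, capacity, eps):
    """FPTAS sketch for 0/1 knapsack: scale values by
    K = eps * v_max / n, round down, then solve the rounded instance
    exactly. min_weight[s] = least total weight achieving scaled
    value s. Returns the chosen item indices as a set."""
    n = len(values)
    K = eps * max(values) / n
    scaled = [int(v // K) for v in values]
    V = sum(scaled)
    INF = float('inf')
    min_weight = [0.0] + [INF] * V
    keep = [set()] + [None] * V       # witness subset for each value s
    for i in range(n):
        # Descending scan so each item is used at most once (0/1).
        for s in range(V, scaled[i] - 1, -1):
            prev = min_weight[s - scaled[i]]
            if prev + weights[i] < min_weight[s]:
                min_weight[s] = prev + weights[i]
                keep[s] = keep[s - scaled[i]] | {i}
    best = max(s for s in range(V + 1) if min_weight[s] <= capacity)
    return keep[best]
```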

Not all NP-hard problems admit a PTAS. Under appropriate complexity assumptions, problems like general TSP, graph coloring, and maximum clique have no constant-factor approximation, let alone a PTAS.

Inapproximability

The PCP theorem (Arora et al., 1998) establishes that certain problems cannot be approximated beyond specific thresholds unless $\text{P} = \text{NP}$. For example:

  • Maximum 3-SAT cannot be approximated within a factor better than $7/8$.
  • Maximum Clique cannot be approximated within $n^{1-\epsilon}$ for any $\epsilon > 0$.
  • Set Cover cannot be approximated within $(1-\epsilon) \ln n$.

These inapproximability results, derived from the theory of probabilistically checkable proofs, precisely delineate what is achievable in polynomial time.

Connection to Machine Learning

Approximate nearest neighbor search. Finding the exact nearest neighbor in high-dimensional spaces requires time exponential in the dimension (the curse of dimensionality). Approximation algorithms trade exactness for efficiency:

  • Locality-Sensitive Hashing (LSH) uses random hash functions that map nearby points to the same bucket with high probability. For a $(1+\epsilon)$-approximate nearest neighbor, LSH achieves sublinear query time $O(n^{1/(1+\epsilon)})$ with appropriate hash families.

  • Hierarchical Navigable Small World (HNSW) graphs build a multi-layer proximity-graph structure supporting approximate nearest-neighbor queries in $O(\log n)$ time empirically. While it lacks worst-case guarantees as strong as LSH's, HNSW achieves superior practical performance and is the backbone of modern vector search libraries and databases (Faiss, Pinecone, Milvus).
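A toy random-hyperplane (SimHash-style) instance of the LSH idea for cosine similarity looks like this; all parameters and names are illustrative, and this untuned sketch carries none of the guarantees behind the quoted query bound:

```python
import random

def simhash_tables(points, num_planes=8, num_tables=4, seed=0):
    """Build hash tables that key each point by the sign pattern of
    its dot products with random Gaussian hyperplanes, so points in
    similar directions tend to share a bucket."""
    rng = random.Random(seed)
    dim = len(points[0])
    tables = []
    for _ in range(num_tables):
        planes = [[rng.gauss(0, 1) for _ in range(dim)]
                  for _ in range(num_planes)]
        buckets = {}
        for idx, p in enumerate(points):
            key = tuple(sum(a * b for a, b in zip(h, p)) >= 0
                        for h in planes)
            buckets.setdefault(key, []).append(idx)
        tables.append((planes, buckets))
    return tables

def query(tables, q):
    """Return indices of stored points sharing a bucket with q
    in at least one table (the candidate set to rerank exactly)."""
    cands = set()
    for planes, buckets in tables:
        key = tuple(sum(a * b for a, b in zip(h, q)) >= 0
                    for h in planes)
        cands.update(buckets.get(key, []))
    return cands
```

In practice one reranks the returned candidates by exact distance; the point of LSH is that this candidate set is far smaller than the full dataset.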

These methods are essential to large-scale ML systems: retrieval-augmented generation, recommendation engines, and embedding-based search all depend on approximate nearest neighbor algorithms processing billions of vectors efficiently.


Next: 16: Randomized Algorithms