15: Approximation Algorithms

Motivation and Approximation Ratios

When a problem is NP-hard, we abandon the requirement of exact optimality and instead seek solutions that are provably close to optimal. An approximation algorithm is a polynomial-time algorithm that returns a feasible solution whose cost is within a guaranteed factor of the optimum.

For a minimization problem, an algorithm has approximation ratio $\rho(n)$ if for every instance of size $n$, the algorithm's cost $C$ satisfies $C \leq \rho(n) \cdot C^*$, where $C^*$ is the optimal cost. For maximization, the condition is $C^* \leq \rho(n) \cdot C$. In both cases $\rho(n) \geq 1$, and a ratio closer to 1 indicates a tighter guarantee.

An approximation scheme is a family of algorithms parametrized by $\epsilon > 0$ achieving ratio $(1 + \epsilon)$. A polynomial-time approximation scheme (PTAS) runs in time polynomial in $n$ for each fixed $\epsilon$ (but possibly exponential in $1/\epsilon$). A fully polynomial-time approximation scheme (FPTAS) runs in time polynomial in both $n$ and $1/\epsilon$.

Vertex Cover: A 2-Approximation

Problem. Given an undirected graph $G = (V, E)$, find a minimum-size set $S \subseteq V$ such that every edge has at least one endpoint in $S$.

Algorithm (Maximal Matching). Repeatedly select an arbitrary uncovered edge $(u, v)$, add both $u$ and $v$ to the cover, and remove all edges incident to $u$ or $v$. Continue until no edges remain.
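A minimal sketch of this procedure, assuming the graph arrives as a list of edges (the function name and representation are illustrative, not from the text):

```python
def vertex_cover_2approx(edges):
    """2-approximate vertex cover via a maximal matching: take any
    uncovered edge, add both endpoints, and skip every edge that is
    already covered."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:  # edge still uncovered
            cover.add(u)
            cover.add(v)
    return cover
```

On the path 1-2-3-4 this returns a cover of size 4 while the optimum {2, 3} has size 2, illustrating that the factor of 2 can actually be reached.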

Analysis. Let $M$ be the set of edges selected. Since no two edges in $M$ share an endpoint (they form a matching), the $|M|$ edges require at least $|M|$ distinct vertices in any vertex cover (each edge must be covered). The algorithm outputs $2|M|$ vertices. Therefore the approximation ratio is:

$$\frac{|S|}{|S^*|} = \frac{2|M|}{|S^*|} \leq \frac{2|M|}{|M|} = 2$$

This bound is tight: consider a complete bipartite graph $K_{n,n}$. The algorithm may select all $n$ edges of a perfect matching and return $2n$ vertices, while the optimal cover (one side of the bipartition) has size $n$. Improving the ratio below 2 for vertex cover is a major open problem; under the Unique Games Conjecture, no polynomial-time algorithm achieves a ratio of $2 - \epsilon$ for any constant $\epsilon > 0$.

Traveling Salesman Problem

Problem. Given a complete graph with edge weights satisfying the triangle inequality ($d(u,w) \leq d(u,v) + d(v,w)$), find a minimum-weight Hamiltonian cycle.

2-Approximation via MST

Compute a minimum spanning tree $T$ of $G$. Since deleting any edge from an optimal tour yields a spanning tree, $w(T) \leq w(\text{OPT})$. Doubling every edge of $T$ gives an Eulerian graph whose total weight is $2\,w(T) \leq 2\,w(\text{OPT})$. An Euler tour of this graph visits every vertex at least once; shortcutting repeated vertices (using the triangle inequality to ensure shortcuts do not increase cost) yields a Hamiltonian cycle of weight at most $2\,w(\text{OPT})$.
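Shortcutting the Euler tour of the doubled tree is equivalent to visiting the vertices in a DFS preorder of the MST, which makes the algorithm short to sketch. The following is a minimal illustration (function name and distance-matrix input are assumptions):

```python
import heapq

def mst_preorder_tour(dist):
    """2-approximate metric TSP: build an MST with Prim's algorithm,
    then output its DFS preorder (the shortcut Euler tour of the
    doubled tree). dist is a symmetric matrix obeying the triangle
    inequality."""
    n = len(dist)
    in_tree = [False] * n
    children = [[] for _ in range(n)]
    pq = [(0, 0, None)]  # (edge weight, vertex, tree parent)
    while pq:
        d, u, p = heapq.heappop(pq)
        if in_tree[u]:
            continue  # stale heap entry
        in_tree[u] = True
        if p is not None:
            children[p].append(u)
        for v in range(n):
            if not in_tree[v]:
                heapq.heappush(pq, (dist[u][v], v, u))
    # DFS preorder of the tree = shortcut Euler tour of the doubled tree.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour  # visit in this order, then return to tour[0]
```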

Christofides’ Algorithm: 3/2-Approximation

Christofides (1976) improves on the doubling approach:

  1. Compute a minimum spanning tree $T$.
  2. Let $O$ be the set of odd-degree vertices in $T$ (by the handshaking lemma, $|O|$ is even).
  3. Find a minimum-weight perfect matching $M$ on the vertices in $O$.
  4. Combine $T$ and $M$ to get an Eulerian multigraph $H$ (all vertices now have even degree).
  5. Find an Euler tour of $H$ and shortcut to a Hamiltonian cycle.
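The five steps can be sketched as follows. A real minimum-weight perfect matching algorithm is beyond a few lines, so this illustrative version brute-forces step 3 and is only practical for small inputs; all names are assumptions, not from the text:

```python
import heapq

def christofides(dist):
    """3/2-approximate metric TSP sketch (small n only: the perfect
    matching is brute-forced). dist is a symmetric matrix obeying
    the triangle inequality."""
    n = len(dist)
    # Step 1: minimum spanning tree (Prim), stored as a multigraph.
    in_tree = [False] * n
    adj = [[] for _ in range(n)]
    degree = [0] * n
    pq = [(0, 0, None)]
    while pq:
        d, u, p = heapq.heappop(pq)
        if in_tree[u]:
            continue
        in_tree[u] = True
        if p is not None:
            adj[p].append(u); adj[u].append(p)
            degree[p] += 1; degree[u] += 1
        for v in range(n):
            if not in_tree[v]:
                heapq.heappush(pq, (dist[u][v], v, u))
    # Step 2: odd-degree vertices (an even count, by handshaking).
    odd = [v for v in range(n) if degree[v] % 2 == 1]

    # Step 3: minimum-weight perfect matching on odd, by brute force.
    def min_matching(verts):
        if not verts:
            return 0.0, []
        u, rest = verts[0], verts[1:]
        best_w, best_m = float('inf'), []
        for i, v in enumerate(rest):
            w, m = min_matching(rest[:i] + rest[i + 1:])
            if dist[u][v] + w < best_w:
                best_w, best_m = dist[u][v] + w, m + [(u, v)]
        return best_w, best_m

    _, matching = min_matching(odd)
    # Step 4: add matching edges; every degree is now even.
    for u, v in matching:
        adj[u].append(v); adj[v].append(u)
    # Step 5: Euler tour (Hierholzer's algorithm), then shortcut.
    stack, euler = [0], []
    while stack:
        u = stack[-1]
        if adj[u]:
            v = adj[u].pop()
            adj[v].remove(u)  # remove one copy of the multi-edge
            stack.append(v)
        else:
            euler.append(stack.pop())
    seen, tour = set(), []
    for v in euler:
        if v not in seen:
            seen.add(v)
            tour.append(v)
    return tour
```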

Analysis. We have $w(T) \leq w(\text{OPT})$. For the matching, consider the optimal tour restricted to the vertices in $O$: this defines a Hamiltonian cycle on $O$ of cost at most $w(\text{OPT})$ (by the triangle inequality, shortcuts only reduce cost). Since $|O|$ is even, this cycle decomposes into two perfect matchings whose weights sum to at most $w(\text{OPT})$, so the cheaper of the two has weight at most $w(\text{OPT})/2$. Since $M$ is a minimum-weight matching, $w(M) \leq w(\text{OPT})/2$. Therefore:

$$w(H) = w(T) + w(M) \leq w(\text{OPT}) + \frac{w(\text{OPT})}{2} = \frac{3}{2}\,w(\text{OPT})$$

For the general (non-metric) TSP, no constant-factor approximation exists unless $\text{P} = \text{NP}$.

Set Cover: Greedy $O(\log n)$ Approximation

Problem. Given a universe $U$ of $n$ elements and a collection $\mathcal{S}$ of subsets of $U$, find the fewest subsets whose union is $U$.

Greedy Algorithm. Repeatedly select the subset covering the most uncovered elements until $U$ is covered.

Theorem. The greedy algorithm achieves an approximation ratio of $H(n) = \sum_{k=1}^{n} 1/k = O(\ln n)$, where $n = |U|$.

Proof sketch. When a set $S_i$ is chosen, assign a cost of $1/|S_i \setminus C|$ to each newly covered element, where $C$ is the set of already covered elements; the total cost assigned equals the number of sets chosen. If the optimal solution uses $k$ sets, then at each step some set covers at least a $1/k$ fraction of the remaining elements. A harmonic-series argument then shows the total greedy cost is at most $H(n) \cdot k$.
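The greedy rule above can be sketched directly with Python sets (function name and input format are illustrative):

```python
def greedy_set_cover(universe, subsets):
    """Greedy set cover: repeatedly pick the subset that covers the
    most still-uncovered elements; achieves ratio H(|universe|)."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(subsets, key=lambda s: len(uncovered & s))
        if not uncovered & best:
            raise ValueError("subsets cannot cover the universe")
        chosen.append(best)
        uncovered -= best
    return chosen
```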

This $\ln n$ bound is essentially tight: under standard complexity assumptions, no polynomial-time algorithm achieves a ratio of $(1 - \epsilon) \ln n$ for any $\epsilon > 0$.

Load Balancing

Problem. Assign $n$ jobs with processing times $p_1, \ldots, p_n$ to $m$ identical machines to minimize the makespan (the maximum load on any machine).

Greedy (List Scheduling). Assign each job to the currently least-loaded machine.

Analysis. Let $T^* = \max\bigl(\max_j p_j,\ \frac{1}{m}\sum_j p_j\bigr)$, which is a lower bound on the optimal makespan. The greedy makespan satisfies $T \leq 2T^*$, giving a 2-approximation. The Longest Processing Time (LPT) rule, which sorts jobs in decreasing order before applying the greedy assignment, improves the ratio to $4/3$.
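Both rules fit in a few lines using a min-heap of machine loads (the function name and the `lpt` flag are illustrative):

```python
import heapq

def list_scheduling(jobs, m, lpt=False):
    """Assign each job to the currently least-loaded of m machines,
    tracked with a min-heap of loads; returns the makespan.
    lpt=True first sorts jobs in decreasing order (the LPT rule)."""
    if lpt:
        jobs = sorted(jobs, reverse=True)
    loads = [0] * m
    heapq.heapify(loads)
    for p in jobs:
        heapq.heappush(loads, heapq.heappop(loads) + p)
    return max(loads)
```

For jobs [1, 1, 2] on two machines, plain list scheduling in the given order yields makespan 3, while LPT places the long job first and achieves the optimal makespan 2.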

PTAS and FPTAS

Knapsack FPTAS. The 0/1 knapsack problem admits an FPTAS. The idea is to scale and round the item values: divide each value $v_i$ by a factor $K = \epsilon \cdot v_{\max} / n$, round down to an integer, and solve the rounded instance exactly via dynamic programming over values in $O(n^3 / \epsilon)$ time. Rounding loses at most $K$ per item, hence at most $nK = \epsilon \cdot v_{\max} \leq \epsilon \cdot \text{OPT}$ in total (assuming every item fits on its own), so the solution is within a $(1-\epsilon)$ factor of optimal.
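The scale-round-solve scheme can be sketched as follows, using the classic value-indexed dynamic program; the function name and the witness-set bookkeeping are illustrative choices, assuming nonnegative values and weights:

```python
def knapsack_fptas(values, weights, capacity, eps):
    """FPTAS sketch for 0/1 knapsack: scale values by
    K = eps * v_max / n, round down, then solve the rounded instance
    exactly. min_weight[s] = least total weight achieving scaled
    value s. Returns the chosen item indices as a set."""
    n = len(values)
    K = eps * max(values) / n
    scaled = [int(v // K) for v in values]
    V = sum(scaled)
    INF = float('inf')
    min_weight = [0.0] + [INF] * V
    keep = [set()] + [None] * V       # witness subset for each value s
    for i in range(n):
        # Descending scan so each item is used at most once (0/1).
        for s in range(V, scaled[i] - 1, -1):
            prev = min_weight[s - scaled[i]]
            if prev + weights[i] < min_weight[s]:
                min_weight[s] = prev + weights[i]
                keep[s] = keep[s - scaled[i]] | {i}
    best = max(s for s in range(V + 1) if min_weight[s] <= capacity)
    return keep[best]
```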

Not all NP-hard problems admit a PTAS. Under appropriate complexity assumptions, problems like general TSP, graph coloring, and maximum clique have no constant-factor approximation, let alone a PTAS.

Inapproximability

The PCP theorem (Arora et al., 1998) establishes that certain problems cannot be approximated beyond specific thresholds unless $\text{P} = \text{NP}$. For example:

  • Maximum 3-SAT cannot be approximated within a factor better than $7/8$.
  • Maximum Clique cannot be approximated within $n^{1-\epsilon}$ for any $\epsilon > 0$.
  • Set Cover cannot be approximated within $(1-\epsilon) \ln n$.

These inapproximability results, derived from the theory of probabilistically checkable proofs, precisely delineate what is achievable in polynomial time.

Connection to Machine Learning

Approximate nearest neighbor search. Finding the exact nearest neighbor in high-dimensional spaces requires time exponential in the dimension (the curse of dimensionality). Approximation algorithms trade exactness for efficiency:

  • Locality-Sensitive Hashing (LSH) uses random hash functions that map nearby points to the same bucket with high probability. For a $(1+\epsilon)$-approximate nearest neighbor, LSH achieves sublinear query time $O(n^{1/(1+\epsilon)})$ with appropriate hash families.

  • Hierarchical Navigable Small World (HNSW) graphs build a multi-layer proximity-graph structure supporting approximate nearest-neighbor queries in $O(\log n)$ time empirically. While it lacks worst-case guarantees as strong as LSH's, HNSW achieves superior practical performance and is the backbone of modern vector search libraries and databases (Faiss, Pinecone, Milvus).
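A toy random-hyperplane (SimHash-style) instance of the LSH idea for cosine similarity looks like this; all parameters and names are illustrative, and this untuned sketch carries none of the guarantees behind the quoted query bound:

```python
import random

def simhash_tables(points, num_planes=8, num_tables=4, seed=0):
    """Build hash tables that key each point by the sign pattern of
    its dot products with random Gaussian hyperplanes, so points in
    similar directions tend to share a bucket."""
    rng = random.Random(seed)
    dim = len(points[0])
    tables = []
    for _ in range(num_tables):
        planes = [[rng.gauss(0, 1) for _ in range(dim)]
                  for _ in range(num_planes)]
        buckets = {}
        for idx, p in enumerate(points):
            key = tuple(sum(a * b for a, b in zip(h, p)) >= 0
                        for h in planes)
            buckets.setdefault(key, []).append(idx)
        tables.append((planes, buckets))
    return tables

def query(tables, q):
    """Return indices of stored points sharing a bucket with q
    in at least one table (the candidate set to rerank exactly)."""
    cands = set()
    for planes, buckets in tables:
        key = tuple(sum(a * b for a, b in zip(h, q)) >= 0
                    for h in planes)
        cands.update(buckets.get(key, []))
    return cands
```

In practice one reranks the returned candidates by exact distance; the point of LSH is that this candidate set is far smaller than the full dataset.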

These methods are essential to large-scale ML systems: retrieval-augmented generation, recommendation engines, and embedding-based search all depend on approximate nearest neighbor algorithms processing billions of vectors efficiently.


Next: 16: Randomized Algorithms