| |

9: Divide and Conquer

Divide and conquer breaks problems into smaller subproblems, solves them recursively, and combines solutions. The challenge: analyzing recurrences to prove efficiency.


The Divide and Conquer Paradigm

Steps:

  1. Divide the problem into smaller subproblems
  2. Conquer each subproblem recursively
  3. Combine solutions to solve the original problem

Key insight: Subproblems must be substantially smaller (often half the size) for efficiency gains.


Merge Sort

Algorithm

MergeSort(A, lo, hi):
    if lo < hi:
        mid = (lo + hi) / 2
        MergeSort(A, lo, mid)
        MergeSort(A, mid+1, hi)
        Merge(A, lo, mid, hi)

Merge operation: Combine two sorted halves in O(n)O(n) time by comparing front elements.

Recurrence Analysis

T(n)=2T(n/2)+O(n)T(n) = 2T(n/2) + O(n)

Recursion tree method:

  • Level 0: nn work (one problem of size nn)
  • Level 1: nn work (two problems of size n/2n/2)
  • Level ii: nn work (2i2^i problems of size n/2in/2^i)
  • Height: logn\log n levels

Total: O(nlogn)O(n \log n)


Counting Inversions

Problem Definition

Given array A[1..n]A[1..n], count pairs (i,j)(i, j) where i<ji < j but A[i]>A[j]A[i] > A[j].

Applications: Measuring similarity between rankings (e.g., collaborative filtering).

Brute Force

Check all (n2)\binom{n}{2} pairs: O(n2)O(n^2)

Divide and Conquer Approach

Key insight: Inversions come in three types:

  1. Both elements in left half
  2. Both elements in right half
  3. One in each half (split inversions)

Algorithm:

CountInversions(A):
    if |A| = 1: return 0

    Split A into L and R
    (L', countL) = CountInversions(L)
    (R', countR) = CountInversions(R)
    (A', countSplit) = MergeAndCount(L', R')

    return countL + countR + countSplit

MergeAndCount: During merge, when we take from right array, count remaining elements in left array (all form inversions).

T(n)=2T(n/2)+O(n)=O(nlogn)T(n) = 2T(n/2) + O(n) = O(n \log n)


Master Theorem

For recurrences of the form T(n)=aT(n/b)+f(n)T(n) = aT(n/b) + f(n) where a1a \geq 1, b>1b > 1:

Let ccrit=logbac_{\text{crit}} = \log_b a

Case 1: If f(n)=O(nc)f(n) = O(n^c) where c<ccritc < c_{\text{crit}}, then T(n)=Θ(nccrit)T(n) = \Theta(n^{c_{\text{crit}}})

Case 2: If f(n)=Θ(nccritlogkn)f(n) = \Theta(n^{c_{\text{crit}}} \log^k n), then T(n)=Θ(nccritlogk+1n)T(n) = \Theta(n^{c_{\text{crit}}} \log^{k+1} n)

Case 3: If f(n)=Ω(nc)f(n) = \Omega(n^c) where c>ccritc > c_{\text{crit}}, and af(n/b)kf(n)af(n/b) \leq kf(n) for some k<1k < 1, then T(n)=Θ(f(n))T(n) = \Theta(f(n))

Intuition

Compare f(n)f(n) (combine work) to nccritn^{c_{\text{crit}}} (recursive work):

  • Case 1: Recursion dominates → total is Θ(nccrit)\Theta(n^{c_{\text{crit}}})
  • Case 2: Equal work at each level → multiply by logn\log n
  • Case 3: Combine dominates → total is Θ(f(n))\Theta(f(n))

Examples

Recurrenceaabbccritc_{\text{crit}}f(n)f(n)CaseT(n)T(n)
T(n)=2T(n/2)+nT(n) = 2T(n/2) + n221nn2Θ(nlogn)\Theta(n \log n)
T(n)=4T(n/2)+nT(n) = 4T(n/2) + n422nn1Θ(n2)\Theta(n^2)
T(n)=4T(n/2)+n3T(n) = 4T(n/2) + n^3422n3n^33Θ(n3)\Theta(n^3)
T(n)=3T(n/2)+nT(n) = 3T(n/2) + n32log23\log_2 3nn1Θ(nlog23)\Theta(n^{\log_2 3})

Closest Pair in R2\mathbb{R}^2

Problem Definition

Given nn points in the plane, find the pair with minimum Euclidean distance.

Brute force: O(n2)O(n^2)

Algorithm

ClosestPair(P):
    Sort P by x-coordinate → Px
    Sort P by y-coordinate → Py
    return ClosestPairRec(Px, Py)

ClosestPairRec(Px, Py):
    if |Px| ≤ 3: return brute force

    Divide Px into left half Lx and right half Rx
    Construct Ly, Ry from Py (preserving y-order)

    δL = ClosestPairRec(Lx, Ly)
    δR = ClosestPairRec(Rx, Ry)
    δ = min(δL, δR)

    δS = ClosestSplitPair(Px, Py, δ)

    return min(δ, δS)

The Key Insight: The δ\delta-Strip

For split pairs, both points must be within distance δ\delta of the dividing line.

Crucial lemma: In the δ\delta-strip sorted by yy-coordinate, we only need to check each point against the next 7 points.

Why 7? Partition the strip into δ×δ\delta \times \delta boxes. Each box contains at most one point (otherwise we’d have found a closer pair). A 2δ×δ2\delta \times \delta rectangle contains at most 8 points.

ClosestSplitPair(Px, Py, δ):
    x* = rightmost x-coordinate in left half
    S = points in Py with x ∈ [x* - δ, x* + δ]

    best = δ
    for i = 1 to |S|:
        for j = 1 to min(7, |S| - i):
            best = min(best, dist(S[i], S[i+j]))

    return best

Running Time

T(n)=2T(n/2)+O(n)=O(nlogn)T(n) = 2T(n/2) + O(n) = O(n \log n)

The initial sorting is O(nlogn)O(n \log n). Maintaining sorted order during recursion is O(n)O(n) per level.


Karatsuba’s Algorithm (Integer Multiplication)

Problem

Multiply two nn-digit integers xx and yy.

Grade school: O(n2)O(n^2) single-digit multiplications.

Divide and Conquer Attempt

Write x=10n/2a+bx = 10^{n/2} \cdot a + b and y=10n/2c+dy = 10^{n/2} \cdot c + d where a,b,c,da, b, c, d have n/2n/2 digits.

xy=10nac+10n/2(ad+bc)+bdxy = 10^n \cdot ac + 10^{n/2}(ad + bc) + bd

Naive: 4 recursive multiplications → T(n)=4T(n/2)+O(n)=O(n2)T(n) = 4T(n/2) + O(n) = O(n^2)

Karatsuba’s Insight

Compute ad+bcad + bc using only one multiplication:

(a+b)(c+d)=ac+ad+bc+bd(a + b)(c + d) = ac + ad + bc + bd

So: ad+bc=(a+b)(c+d)acbdad + bc = (a+b)(c+d) - ac - bd

Algorithm:

  1. Compute acac (one multiplication)
  2. Compute bdbd (one multiplication)
  3. Compute (a+b)(c+d)(a+b)(c+d) (one multiplication)
  4. Compute ad+bc=(a+b)(c+d)acbdad + bc = (a+b)(c+d) - ac - bd (subtraction)

T(n)=3T(n/2)+O(n)T(n) = 3T(n/2) + O(n)

By Master Theorem (Case 1): T(n)=O(nlog23)O(n1.585)T(n) = O(n^{\log_2 3}) \approx O(n^{1.585})


Strassen’s Algorithm (Matrix Multiplication)

Problem

Multiply two n×nn \times n matrices AA and BB.

Standard: O(n3)O(n^3) operations.

Block Matrix Multiplication

Partition into n/2×n/2n/2 \times n/2 submatrices:

(C11C12C21C22)=(A11A12A21A22)(B11B12B21B22)\begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}

Naive: 8 recursive multiplications → T(n)=8T(n/2)+O(n2)=O(n3)T(n) = 8T(n/2) + O(n^2) = O(n^3)

Strassen’s Insight

Reduce to 7 multiplications using clever linear combinations:

M1=(A11+A22)(B11+B22)M_1 = (A_{11} + A_{22})(B_{11} + B_{22}) M2=(A21+A22)B11M_2 = (A_{21} + A_{22})B_{11} M3=A11(B12B22)M_3 = A_{11}(B_{12} - B_{22}) M4=A22(B21B11)M_4 = A_{22}(B_{21} - B_{11}) M5=(A11+A12)B22M_5 = (A_{11} + A_{12})B_{22} M6=(A21A11)(B11+B12)M_6 = (A_{21} - A_{11})(B_{11} + B_{12}) M7=(A12A22)(B21+B22)M_7 = (A_{12} - A_{22})(B_{21} + B_{22})

Then: C11=M1+M4M5+M7C_{11} = M_1 + M_4 - M_5 + M_7 C12=M3+M5C_{12} = M_3 + M_5 C21=M2+M4C_{21} = M_2 + M_4 C22=M1M2+M3+M6C_{22} = M_1 - M_2 + M_3 + M_6

T(n)=7T(n/2)+O(n2)=O(nlog27)O(n2.807)T(n) = 7T(n/2) + O(n^2) = O(n^{\log_2 7}) \approx O(n^{2.807})


Median and Selection

Problem

Given unsorted array AA of nn distinct elements, find the kk-th smallest element.

Special case: Median when k=n/2k = \lceil n/2 \rceil.

Sorting: O(nlogn)O(n \log n) — can we do better?

QuickSelect (Randomized)

QuickSelect(A, k):
    if |A| = 1: return A[1]

    Choose pivot p uniformly at random
    Partition A into:
        L = elements < p
        R = elements > p

    if k ≤ |L|:
        return QuickSelect(L, k)
    else if k = |L| + 1:
        return p
    else:
        return QuickSelect(R, k - |L| - 1)

Expected time: O(n)O(n)

Worst case: O(n2)O(n^2) (bad pivot choices)

Deterministic Selection (Median of Medians)

Goal: Find a pivot guaranteed to eliminate a constant fraction of elements.

Select(A, k):
    if |A| ≤ 5: sort and return k-th element

    Divide A into groups of 5
    Find median of each group
    p = Select(medians, ⌈|medians|/2⌉)  // Median of medians

    Partition A around p into L and R

    Recurse on appropriate side

Analysis

Key insight: At least 3n/103n/10 elements are p\leq p, and at least 3n/103n/10 elements are p\geq p.

  • About n/5n/5 groups
  • Half of groups have median p\leq p
  • Each such group contributes at least 3 elements p\leq p

So L,R7n/10|L|, |R| \leq 7n/10.

T(n)T(n/5)+T(7n/10)+O(n)T(n) \leq T(n/5) + T(7n/10) + O(n)

Claim: T(n)=O(n)T(n) = O(n)

Proof: Guess T(n)cnT(n) \leq cn and verify: T(n)cn/5+c7n/10+an=cn(1/5+7/10)+an=0.9cn+anT(n) \leq c \cdot n/5 + c \cdot 7n/10 + an = cn(1/5 + 7/10) + an = 0.9cn + an

For c10ac \geq 10a: T(n)cnT(n) \leq cn


Summary

ProblemAlgorithmTime Complexity
SortingMerge SortO(nlogn)O(n \log n)
Counting InversionsModified Merge SortO(nlogn)O(n \log n)
Closest PairDivide by x-coordO(nlogn)O(n \log n)
Integer MultiplicationKaratsubaO(n1.585)O(n^{1.585})
Matrix MultiplicationStrassenO(n2.807)O(n^{2.807})
SelectionMedian of MediansO(n)O(n)

The power of divide and conquer: Reducing the number of subproblems (Karatsuba: 4→3, Strassen: 8→7) yields asymptotic improvements.