| |

9: Divide and Conquer

Divide and conquer breaks problems into smaller subproblems, solves them recursively, and combines solutions. The challenge: analyzing recurrences to prove efficiency.


The Divide and Conquer Paradigm

Steps:

  1. Divide the problem into smaller subproblems
  2. Conquer each subproblem recursively
  3. Combine solutions to solve the original problem

Key insight: Subproblems must be substantially smaller (often half the size) for efficiency gains.


Merge Sort

Algorithm

MergeSort(A, lo, hi):
    if lo < hi:
        mid = (lo + hi) / 2
        MergeSort(A, lo, mid)
        MergeSort(A, mid+1, hi)
        Merge(A, lo, mid, hi)

Merge operation: Combine two sorted halves in O(n)O(n) time by comparing front elements.

Recurrence Analysis

T(n)=2T(n/2)+O(n)T(n) = 2T(n/2) + O(n)

Recursion tree method:

  • Level 0: nn work (one problem of size nn)
  • Level 1: nn work (two problems of size n/2n/2)
  • Level ii: nn work (2i2^i problems of size n/2in/2^i)
  • Height: logโกn\log n levels

Total: O(nlogโกn)O(n \log n)


Counting Inversions

Problem Definition

Given array A[1..n]A[1..n], count pairs (i,j)(i, j) where i<ji < j but A[i]>A[j]A[i] > A[j].

Applications: Measuring similarity between rankings (e.g., collaborative filtering).

Brute Force

Check all (n2)\binom{n}{2} pairs: O(n2)O(n^2)

Divide and Conquer Approach

Key insight: Inversions come in three types:

  1. Both elements in left half
  2. Both elements in right half
  3. One in each half (split inversions)

Algorithm:

CountInversions(A):
    if |A| = 1: return 0

    Split A into L and R
    (L', countL) = CountInversions(L)
    (R', countR) = CountInversions(R)
    (A', countSplit) = MergeAndCount(L', R')

    return countL + countR + countSplit

MergeAndCount: During merge, when we take from right array, count remaining elements in left array (all form inversions).

T(n)=2T(n/2)+O(n)=O(nlogโกn)T(n) = 2T(n/2) + O(n) = O(n \log n)


Master Theorem

For recurrences of the form T(n)=aT(n/b)+f(n)T(n) = aT(n/b) + f(n) where aโ‰ฅ1a \geq 1, b>1b > 1:

Let ccrit=logโกbac_{\text{crit}} = \log_b a

Case 1: If f(n)=O(nc)f(n) = O(n^c) where c<ccritc < c_{\text{crit}}, then T(n)=ฮ˜(nccrit)T(n) = \Theta(n^{c_{\text{crit}}})

Case 2: If f(n)=ฮ˜(nccritlogโกkn)f(n) = \Theta(n^{c_{\text{crit}}} \log^k n), then T(n)=ฮ˜(nccritlogโกk+1n)T(n) = \Theta(n^{c_{\text{crit}}} \log^{k+1} n)

Case 3: If f(n)=ฮฉ(nc)f(n) = \Omega(n^c) where c>ccritc > c_{\text{crit}}, and af(n/b)โ‰คkf(n)af(n/b) \leq kf(n) for some k<1k < 1, then T(n)=ฮ˜(f(n))T(n) = \Theta(f(n))

Intuition

Compare f(n)f(n) (combine work) to nccritn^{c_{\text{crit}}} (recursive work):

  • Case 1: Recursion dominates โ†’ total is ฮ˜(nccrit)\Theta(n^{c_{\text{crit}}})
  • Case 2: Equal work at each level โ†’ multiply by logโกn\log n
  • Case 3: Combine dominates โ†’ total is ฮ˜(f(n))\Theta(f(n))

Examples

Recurrenceaabbccritc_{\text{crit}}f(n)f(n)CaseT(n)T(n)
T(n)=2T(n/2)+nT(n) = 2T(n/2) + n221nn2ฮ˜(nlogโกn)\Theta(n \log n)
T(n)=4T(n/2)+nT(n) = 4T(n/2) + n422nn1ฮ˜(n2)\Theta(n^2)
T(n)=4T(n/2)+n3T(n) = 4T(n/2) + n^3422n3n^33ฮ˜(n3)\Theta(n^3)
T(n)=3T(n/2)+nT(n) = 3T(n/2) + n32logโก23\log_2 3nn1ฮ˜(nlogโก23)\Theta(n^{\log_2 3})

Closest Pair in R2\mathbb{R}^2

Problem Definition

Given nn points in the plane, find the pair with minimum Euclidean distance.

Brute force: O(n2)O(n^2)

Algorithm

ClosestPair(P):
    Sort P by x-coordinate โ†’ Px
    Sort P by y-coordinate โ†’ Py
    return ClosestPairRec(Px, Py)

ClosestPairRec(Px, Py):
    if |Px| โ‰ค 3: return brute force

    Divide Px into left half Lx and right half Rx
    Construct Ly, Ry from Py (preserving y-order)

    ฮดL = ClosestPairRec(Lx, Ly)
    ฮดR = ClosestPairRec(Rx, Ry)
    ฮด = min(ฮดL, ฮดR)

    ฮดS = ClosestSplitPair(Px, Py, ฮด)

    return min(ฮด, ฮดS)

The Key Insight: The ฮด\delta-Strip

For split pairs, both points must be within distance ฮด\delta of the dividing line.

Crucial lemma: In the ฮด\delta-strip sorted by yy-coordinate, we only need to check each point against the next 7 points.

Why 7? Partition the strip into ฮดร—ฮด\delta \times \delta boxes. Each box contains at most one point (otherwise weโ€™d have found a closer pair). A 2ฮดร—ฮด2\delta \times \delta rectangle contains at most 8 points.

ClosestSplitPair(Px, Py, ฮด):
    x* = rightmost x-coordinate in left half
    S = points in Py with x โˆˆ [x* - ฮด, x* + ฮด]

    best = ฮด
    for i = 1 to |S|:
        for j = 1 to min(7, |S| - i):
            best = min(best, dist(S[i], S[i+j]))

    return best

Running Time

T(n)=2T(n/2)+O(n)=O(nlogโกn)T(n) = 2T(n/2) + O(n) = O(n \log n)

The initial sorting is O(nlogโกn)O(n \log n). Maintaining sorted order during recursion is O(n)O(n) per level.


Karatsubaโ€™s Algorithm (Integer Multiplication)

Problem

Multiply two nn-digit integers xx and yy.

Grade school: O(n2)O(n^2) single-digit multiplications.

Divide and Conquer Attempt

Write x=10n/2โ‹…a+bx = 10^{n/2} \cdot a + b and y=10n/2โ‹…c+dy = 10^{n/2} \cdot c + d where a,b,c,da, b, c, d have n/2n/2 digits.

xy=10nโ‹…ac+10n/2(ad+bc)+bdxy = 10^n \cdot ac + 10^{n/2}(ad + bc) + bd

Naive: 4 recursive multiplications โ†’ T(n)=4T(n/2)+O(n)=O(n2)T(n) = 4T(n/2) + O(n) = O(n^2)

Karatsubaโ€™s Insight

Compute ad+bcad + bc using only one multiplication:

(a+b)(c+d)=ac+ad+bc+bd(a + b)(c + d) = ac + ad + bc + bd

So: ad+bc=(a+b)(c+d)โˆ’acโˆ’bdad + bc = (a+b)(c+d) - ac - bd

Algorithm:

  1. Compute acac (one multiplication)
  2. Compute bdbd (one multiplication)
  3. Compute (a+b)(c+d)(a+b)(c+d) (one multiplication)
  4. Compute ad+bc=(a+b)(c+d)โˆ’acโˆ’bdad + bc = (a+b)(c+d) - ac - bd (subtraction)

T(n)=3T(n/2)+O(n)T(n) = 3T(n/2) + O(n)

By Master Theorem (Case 1): T(n)=O(nlogโก23)โ‰ˆO(n1.585)T(n) = O(n^{\log_2 3}) \approx O(n^{1.585})


Strassenโ€™s Algorithm (Matrix Multiplication)

Problem

Multiply two nร—nn \times n matrices AA and BB.

Standard: O(n3)O(n^3) operations.

Block Matrix Multiplication

Partition into n/2ร—n/2n/2 \times n/2 submatrices:

(C11C12C21C22)=(A11A12A21A22)(B11B12B21B22)\begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}

Naive: 8 recursive multiplications โ†’ T(n)=8T(n/2)+O(n2)=O(n3)T(n) = 8T(n/2) + O(n^2) = O(n^3)

Strassenโ€™s Insight

Reduce to 7 multiplications using clever linear combinations:

M1=(A11+A22)(B11+B22)M_1 = (A_{11} + A_{22})(B_{11} + B_{22}) M2=(A21+A22)B11M_2 = (A_{21} + A_{22})B_{11} M3=A11(B12โˆ’B22)M_3 = A_{11}(B_{12} - B_{22}) M4=A22(B21โˆ’B11)M_4 = A_{22}(B_{21} - B_{11}) M5=(A11+A12)B22M_5 = (A_{11} + A_{12})B_{22} M6=(A21โˆ’A11)(B11+B12)M_6 = (A_{21} - A_{11})(B_{11} + B_{12}) M7=(A12โˆ’A22)(B21+B22)M_7 = (A_{12} - A_{22})(B_{21} + B_{22})

Then: C11=M1+M4โˆ’M5+M7C_{11} = M_1 + M_4 - M_5 + M_7 C12=M3+M5C_{12} = M_3 + M_5 C21=M2+M4C_{21} = M_2 + M_4 C22=M1โˆ’M2+M3+M6C_{22} = M_1 - M_2 + M_3 + M_6

T(n)=7T(n/2)+O(n2)=O(nlogโก27)โ‰ˆO(n2.807)T(n) = 7T(n/2) + O(n^2) = O(n^{\log_2 7}) \approx O(n^{2.807})


Median and Selection

Problem

Given unsorted array AA of nn distinct elements, find the kk-th smallest element.

Special case: Median when k=โŒˆn/2โŒ‰k = \lceil n/2 \rceil.

Sorting: O(nlogโกn)O(n \log n) โ€” can we do better?

QuickSelect (Randomized)

QuickSelect(A, k):
    if |A| = 1: return A[1]

    Choose pivot p uniformly at random
    Partition A into:
        L = elements < p
        R = elements > p

    if k โ‰ค |L|:
        return QuickSelect(L, k)
    else if k = |L| + 1:
        return p
    else:
        return QuickSelect(R, k - |L| - 1)

Expected time: O(n)O(n)

Worst case: O(n2)O(n^2) (bad pivot choices)

Deterministic Selection (Median of Medians)

Goal: Find a pivot guaranteed to eliminate a constant fraction of elements.

Select(A, k):
    if |A| โ‰ค 5: sort and return k-th element

    Divide A into groups of 5
    Find median of each group
    p = Select(medians, โŒˆ|medians|/2โŒ‰)  // Median of medians

    Partition A around p into L and R

    Recurse on appropriate side

Analysis

Key insight: At least 3n/103n/10 elements are โ‰คp\leq p, and at least 3n/103n/10 elements are โ‰ฅp\geq p.

  • About n/5n/5 groups
  • Half of groups have median โ‰คp\leq p
  • Each such group contributes at least 3 elements โ‰คp\leq p

So โˆฃLโˆฃ,โˆฃRโˆฃโ‰ค7n/10|L|, |R| \leq 7n/10.

T(n)โ‰คT(n/5)+T(7n/10)+O(n)T(n) \leq T(n/5) + T(7n/10) + O(n)

Claim: T(n)=O(n)T(n) = O(n)

Proof: Guess T(n)โ‰คcnT(n) \leq cn and verify: T(n)โ‰คcโ‹…n/5+cโ‹…7n/10+an=cn(1/5+7/10)+an=0.9cn+anT(n) \leq c \cdot n/5 + c \cdot 7n/10 + an = cn(1/5 + 7/10) + an = 0.9cn + an

For cโ‰ฅ10ac \geq 10a: T(n)โ‰คcnT(n) \leq cn โœ“


Summary

ProblemAlgorithmTime Complexity
SortingMerge SortO(nlogโกn)O(n \log n)
Counting InversionsModified Merge SortO(nlogโกn)O(n \log n)
Closest PairDivide by x-coordO(nlogโกn)O(n \log n)
Integer MultiplicationKaratsubaO(n1.585)O(n^{1.585})
Matrix MultiplicationStrassenO(n2.807)O(n^{2.807})
SelectionMedian of MediansO(n)O(n)

The power of divide and conquer: Reducing the number of subproblems (Karatsuba: 4โ†’3, Strassen: 8โ†’7) yields asymptotic improvements.