
10: Greedy Algorithms

Greedy algorithms build solutions incrementally, making locally optimal choices at each step. The challenge: proving these myopic decisions lead to globally optimal solutions.


The Greedy Paradigm

Goal: Find a solution x that maximizes/minimizes an objective function f.

Challenge: The space of possible solutions is exponentially large.

Insight: The solution x is composed of several parts (e.g., x is a set or sequence).

Approach:

  1. Compute xx one part at a time
  2. Select the next part “greedily” for immediate benefit
  3. Prove this yields an optimal solution despite no foresight

Greedy algorithms typically run in polynomial time. The hard part is proving optimality.


Interval Scheduling

Problem Definition

  • Job j starts at time s_j and finishes at time f_j
  • Two jobs i and j are compatible if [s_i, f_i) and [s_j, f_j) don’t overlap
  • Goal: Find a maximum-size subset of mutually compatible jobs

Greedy Template

Consider jobs in some “natural” order. Take a job if it’s compatible with those already chosen.

Possible orderings:

  • Earliest start time: ascending order of s_j
  • Earliest finish time: ascending order of f_j
  • Shortest interval: ascending order of f_j - s_j
  • Fewest conflicts: ascending order of c_j (number of conflicting jobs)

Only earliest finish time (EFT) works. The others have counterexamples:

  • Earliest start time fails when one long job starts first
  • Shortest interval fails when a short job blocks two longer compatible jobs
  • Fewest conflicts fails in certain symmetric configurations

Algorithm: Earliest Finish Time (EFT)

Sort jobs by finish time: f_1 ≤ f_2 ≤ ... ≤ f_n
S ← ∅
for j = 1 to n:
    if job j is compatible with S:
        S ← S ∪ {j}
return S

Running time: O(n log n)

  • Sorting: O(n log n)
  • Compatibility check: O(1) per job (just check whether s_j ≥ f_{i*}, where i* is the last job added)
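The algorithm is short enough to write out directly. A minimal Python sketch (the function name and sample jobs are invented for illustration):

```python
def earliest_finish_time(jobs):
    """Greedy interval scheduling: maximum-size set of compatible jobs.

    jobs: list of (start, finish) pairs; intervals are half-open [s, f).
    """
    selected = []
    last_finish = float("-inf")
    for s, f in sorted(jobs, key=lambda job: job[1]):  # sort by finish time
        if s >= last_finish:  # O(1) compatibility check against the last pick
            selected.append((s, f))
            last_finish = f
    return selected

jobs = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
        (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]
print(earliest_finish_time(jobs))  # → [(1, 4), (5, 7), (8, 11), (12, 16)]
```

Note that only the finish time of the most recently added job is needed for the compatibility check, which is what makes the loop linear after sorting.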

Proof of Optimality by Contradiction

  1. Suppose greedy is not optimal
  2. Let greedy select jobs i_1, i_2, …, i_k (sorted by finish time)
  3. Consider an optimal solution j_1, j_2, …, j_m that matches greedy for as many indices as possible
    • j_1 = i_1, …, j_r = i_r for the greatest possible r
  4. Both i_{r+1} and j_{r+1} are compatible with the previous selection
  5. Since greedy chose i_{r+1} by EFT: f_{i_{r+1}} ≤ f_{j_{r+1}}
  6. Replace j_{r+1} with i_{r+1} in the optimal solution:
    • Still feasible: f_{i_{r+1}} ≤ f_{j_{r+1}} ≤ s_{j_t} for t ≥ r+2
    • Still optimal: same number of jobs
    • But matches greedy for r+1 indices — contradiction!

Proof of Optimality by Induction

Let S_j be the subset picked by greedy after considering the first j jobs.

Definition: S_j is promising if there exists T ⊆ {j+1, …, n} such that O_j = S_j ∪ T is optimal.

Claim: For all t ∈ {0, 1, …, n}, S_t is promising.

Base case: S_0 = ∅ is promising (any optimal solution extends it).

Induction step: Assume S_{j-1} is promising with optimal extension O_{j-1}.

Case 1: Greedy didn’t select job j (it conflicts with S_{j-1})

  • Since S_{j-1} ⊆ O_{j-1}, job j also conflicts with O_{j-1}
  • So O_j = O_{j-1} also extends S_j = S_{j-1}

Case 2: Greedy selected job j, so S_j = S_{j-1} ∪ {j}

  • Let r be the earliest-finishing job in O_{j-1} \ S_{j-1}
  • Replace r with j in O_{j-1} to get O_j
  • O_j is still feasible (since f_j ≤ f_r) and optimal (same size)
  • O_j extends S_j

Interval Partitioning

Problem Definition

  • Job j starts at time s_j and finishes at time f_j
  • Goal: Partition jobs into fewest groups such that jobs in each group are mutually compatible

Think of scheduling lectures into classrooms.

Algorithm: Earliest Start Time (EST)

Sort lectures by start time: s_1 ≤ s_2 ≤ ... ≤ s_n
d ← 0  (number of classrooms)

for j = 1 to n:
    if lecture j is compatible with some classroom k:
        Schedule lecture j in classroom k
    else:
        Allocate new classroom d+1
        Schedule lecture j in classroom d+1
        d ← d + 1

return schedule

Implementation with priority queue:

  • Store classrooms in a min-heap keyed by latest finish time
  • Check whether s_j ≥ the minimum key
  • O(n) priority queue operations → O(n log n) time
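A minimal sketch of the heap-based version using Python’s `heapq` (the function name and sample lectures are invented):

```python
import heapq

def interval_partition(lectures):
    """Greedy interval partitioning: minimum number of classrooms.

    The heap stores one entry per open classroom: the finish time of
    its latest lecture. Intervals are half-open [s, f).
    """
    finish_times = []  # min-heap of latest finish time per classroom
    for s, f in sorted(lectures):  # ascending start time
        if finish_times and finish_times[0] <= s:
            heapq.heapreplace(finish_times, f)  # reuse a freed classroom
        else:
            heapq.heappush(finish_times, f)     # open a new classroom
    return len(finish_times)

print(interval_partition([(0, 3), (1, 4), (2, 5), (4, 7), (5, 8)]))  # → 3
```

`heapreplace` pops the minimum and pushes the new key in one O(log n) operation, matching the O(n log n) bound above.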

Proof of Optimality

Definition: The depth is the maximum number of lectures running at any time.

Lower bound: # classrooms needed ≥ depth

Claim: Greedy uses exactly depth classrooms.

Proof:

  • Let d = the number of classrooms used by greedy
  • Classroom d was opened because some lecture j conflicted with lectures in all d-1 existing classrooms
  • Those d-1 conflicting lectures all end after s_j
  • Since we sorted by start time, they all start at or before s_j
  • So just after time s_j, those d-1 lectures and lecture j are all running: d mutually overlapping lectures
  • Therefore, depth ≥ d

Since every schedule needs at least depth ≥ d classrooms and greedy uses exactly d, greedy is optimal.
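The depth used in this lower bound can be computed directly with a sweep over start/end events; a minimal sketch (function name and sample lectures invented):

```python
def depth(lectures):
    """Maximum number of lectures running simultaneously (sweep line).

    Intervals are half-open [s, f), so a lecture ending at time t does
    not overlap one starting at t; end events sort before start events
    at the same time because -1 < 1.
    """
    events = []
    for s, f in lectures:
        events.append((s, 1))   # lecture begins
        events.append((f, -1))  # lecture ends
    events.sort()
    best = running = 0
    for _, delta in events:
        running += delta
        best = max(best, running)
    return best

print(depth([(0, 3), (1, 4), (2, 5), (4, 7), (5, 8)]))  # → 3
```

Comparing this value with the number of classrooms the greedy opens is a quick check of the d = depth claim on any instance.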


Minimizing Lateness

Problem Definition

  • Single machine processes jobs
  • Job j requires t_j time and is due at d_j
  • If scheduled at s_j, it finishes at f_j = s_j + t_j
  • Lateness: ℓ_j = max(0, f_j - d_j)
  • Goal: Minimize the maximum lateness L = max_j ℓ_j

Algorithm: Earliest Deadline First (EDF)

Sort jobs by deadline: d_1 ≤ d_2 ≤ ... ≤ d_n
t ← 0

for j = 1 to n:
    Assign job j to interval [t, t + t_j]
    s_j ← t
    f_j ← t + t_j
    t ← t + t_j

return intervals
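A direct Python transcription (function name invented), which also reports the maximum lateness so the counterexample instances below can be checked:

```python
def edf_schedule(jobs):
    """Earliest Deadline First on a single machine.

    jobs: list of (t_j, d_j) pairs (processing time, deadline).
    Returns (max lateness, list of (start, finish) intervals).
    """
    t = 0
    max_lateness = 0  # lateness is max(0, f_j - d_j), so never negative
    intervals = []
    for tj, dj in sorted(jobs, key=lambda job: job[1]):  # ascending deadline
        intervals.append((t, t + tj))
        max_lateness = max(max_lateness, t + tj - dj)
        t += tj
    return max_lateness, intervals

print(edf_schedule([(1, 2), (10, 10)])[0])  # → 1
```

On this two-job instance the smallest-slack order yields maximum lateness 9, while EDF achieves 1.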

Counterexamples for other orderings:

Shortest processing time first:

  Job   t_j   d_j
    1     1   100
    2    10    10

Processing job 1 first delays job 2 unnecessarily.

Smallest slack first:

  Job   t_j   d_j
    1     1     2
    2    10    10

Slack d_j - t_j: job 1 has slack 1, job 2 has slack 0. But doing job 2 first makes job 1 finish at 11 against deadline 2 (lateness 9), while EDF’s order gives maximum lateness 1.

Proof of Optimality

Key observations:

  1. There is an optimal schedule with no idle time
  2. EDF has no idle time
  3. An inversion is a pair (i, j) where d_i < d_j but j is scheduled before i
  4. EDF has no inversions
  5. If a schedule has inversions, it has an adjacent inverted pair

Lemma: Swapping adjacent inverted jobs doesn’t increase lateness but reduces inversions by 1.

Proof:

  • Let ℓ_k and ℓ_k' denote lateness before and after the swap
  • For k ≠ i, j: ℓ_k' = ℓ_k (finish times unchanged)
  • For job i: ℓ_i' ≤ ℓ_i (it moved earlier)
  • For job j: ℓ_j' = f_j' - d_j = f_i - d_j ≤ f_i - d_i = ℓ_i (using d_i < d_j from the inversion)
  • Therefore: L' = max{ℓ_i', ℓ_j', max_{k≠i,j} ℓ_k'} ≤ max{ℓ_i, ℓ_i, max_{k≠i,j} ℓ_k} ≤ L

Proof by contradiction:

  1. Suppose EDF is not optimal
  2. Take an optimal schedule S* with the fewest inversions (WLOG no idle time)
  3. Since EDF is not optimal, S* has at least one inversion
  4. By observation 5, it has an adjacent inversion
  5. Swapping maintains optimality but reduces inversions — contradiction! ∎

Proof by reverse induction:

Claim: For each r ∈ {0, 1, …, (n choose 2)}, there exists an optimal schedule with at most r inversions.

  • Base case r = (n choose 2): trivial, since no schedule has more inversions than that
  • Induction: If the claim holds for r = t+1, take an optimal schedule with ≤ t+1 inversions
    • If it has ≤ t inversions, done
    • If it has exactly t+1 ≥ 1 inversions, swap an adjacent inverted pair → t inversions
  • The claim for r = 0 proves EDF optimality ∎

Huffman Coding (Lossless Compression)

Problem Definition

  • Document uses n distinct symbols with frequencies (w_1, …, w_n)
  • Naive fixed-length encoding: ⌈log n⌉ bits per symbol
  • Goal: Find a prefix-free encoding minimizing Σ_i w_i · ℓ_i, where ℓ_i is the codeword length of symbol i

Prefix-free encoding: No codeword is a prefix of another. This enables unambiguous decoding.
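To see why prefix-freeness makes decoding unambiguous: scanning left to right, the first codeword match must be the intended one, since no shorter codeword is a prefix of a longer one. A small sketch (the three-symbol code is invented for illustration):

```python
def decode(bits, code):
    """Decode a bit string under a prefix-free code by greedy matching.

    code maps symbol -> codeword. Because no codeword is a prefix of
    another, emitting a symbol as soon as the buffer matches a
    codeword can never be wrong.
    """
    inverse = {cw: sym for sym, cw in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:      # buffer is a complete codeword: emit it
            out.append(inverse[buf])
            buf = ""
    assert buf == "", "bit string did not end on a codeword boundary"
    return "".join(out)

code = {"a": "0", "b": "10", "c": "11"}
print(decode("011010", code))  # → "acab"
```

With a non-prefix-free code (say a → 0, b → 01) the same greedy scan would be ambiguous, which is exactly what the prefix-free condition rules out.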

Key insight: Prefix-free encodings correspond to binary trees where symbols are leaves.

Huffman’s Algorithm

Build priority queue with (symbol, frequency) pairs

while |queue| ≥ 2:
    (x, w_x) ← extract-min(queue)
    (y, w_y) ← extract-min(queue)
    Create new node z with children x and y
    Insert (z, w_x + w_y) into queue

return tree

Running time: O(n log n)

Can be made O(n) if symbols are pre-sorted by frequency (using two queues).

Example

Frequencies: (w_a, w_b, w_c, w_d, w_e, w_f) = (42, 20, 5, 10, 11, 12)

Steps:

  1. Merge c (5) and d (10) → node with weight 15
  2. Merge e (11) and f (12) → node with weight 23
  3. Merge the 15-node and b (20) → node with weight 35
  4. Merge the 23-node and the 35-node → node with weight 58
  5. Merge a (42) and the 58-node → root with weight 100

Result:

  • a → 0
  • e → 100
  • f → 101
  • c → 1100
  • d → 1101
  • b → 111

Higher frequency symbols get shorter codes.
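A compact way to sanity-check the total cost Σ_i w_i · ℓ_i: each merge adds one bit to every symbol below the new node, so the cost equals the sum of all merged weights. A minimal sketch (function name invented):

```python
import heapq

def huffman_cost(freqs):
    """Total weighted codeword length of an optimal prefix-free code.

    Repeatedly merges the two lowest frequencies; summing the merged
    weights counts one bit per symbol per level, i.e. sum of w_i * l_i.
    """
    heap = list(freqs)
    heapq.heapify(heap)
    cost = 0
    while len(heap) >= 2:
        x = heapq.heappop(heap)
        y = heapq.heappop(heap)
        cost += x + y              # every symbol under x or y gains one bit
        heapq.heappush(heap, x + y)
    return cost

print(huffman_cost([42, 20, 5, 10, 11, 12]))  # → 231
```

This matches the codes above: 42·1 + 20·3 + 5·4 + 10·4 + 11·3 + 12·3 = 231.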

Proof of Optimality

Induction on the number of symbols n.

Base case: n = 2. Both 1-bit encodings are optimal.

Lemma 1: If w_x < w_y, then ℓ_x ≥ ℓ_y in any optimal tree.

Proof: If w_x < w_y and ℓ_x < ℓ_y, swapping the codewords of x and y reduces the total length: w_x·ℓ_y + w_y·ℓ_x < w_x·ℓ_x + w_y·ℓ_y

Lemma 2: There exists an optimal tree in which the two lowest-frequency symbols x and y are siblings.

Proof:

  1. Take any optimal tree
  2. If x doesn’t have a longest codeword, swap it with a symbol that does (no cost increase by Lemma 1)
  3. x must have a sibling (otherwise its code could be shortened)
  4. If the sibling isn’t y, swap the sibling with y (no cost increase)

Induction step:

  • Let x, y be the two lowest-frequency symbols, which Huffman merges into a combined symbol xy
  • Let H be Huffman’s tree and T an optimal tree in which x, y are siblings
  • Let H' and T' be the trees obtained by treating xy as one symbol with weight w_x + w_y
  • By induction: Length(H') ≤ Length(T')
  • Length(H) = Length(H') + (w_x + w_y) · 1, since x and y each sit one level below xy
  • Length(T) = Length(T') + (w_x + w_y) · 1
  • Therefore Length(H) ≤ Length(T)

Summary of Proof Techniques

  Problem                  Algorithm                   Proof Technique
  Interval Scheduling      Earliest Finish Time        Contradiction/induction (exchange argument)
  Interval Partitioning    Earliest Start Time         Lower bound matching
  Minimizing Lateness      Earliest Deadline First     Inversion counting
  Huffman Coding           Merge lowest frequencies    Induction on problem size

Common pattern: Show that any optimal solution can be transformed into the greedy solution without increasing cost.


Other Classic Greedy Algorithms

  • Dijkstra’s algorithm: Single-source shortest paths
  • Kruskal’s algorithm: Minimum spanning tree (sort edges, add if no cycle)
  • Prim’s algorithm: Minimum spanning tree (grow tree from source)