4: Union-Find

Union-Find tracks disjoint sets with two operations:

  • Find: Which set contains this element? (Returns the root/representative)
  • Union: Merge two sets

With path compression and union by rank, both operations are O(α(n))O(1)O(\alpha(n)) \approx O(1).


Data Structure

UnifNode:
  - key: Key
  - parent: Optional[UnifNode]  # None if root
  - rank: Integer

UnionFind:
  - nodes: Dictionary[Key, UnifNode]
  - make_set(key) -> UnifNode
  - find(key) -> Optional[UnifNode]
  - union(x, y) -> UnifNode

Each set is a tree. Elements point to parents. The root is the set’s representative.


Operations

Make Set

make_set(key):
    node = UnifNode(key)
    node.parent = None
    node.rank = 0
    nodes.insert(key, node)
    return node

Find (with Path Compression)

find(key):
    node = nodes.get(key)
    if node.parent is None:
        return node
    node.parent = find(node.parent.key)  # compress path
    return node.parent

Path compression flattens the tree: after find, all nodes on the path point directly to the root.

Union (by Rank)

union(x, y):
    root_x = find(x)
    root_y = find(y)

    if root_x == root_y:
        return root_x

    # Attach smaller tree under larger
    if root_x.rank < root_y.rank:
        root_x.parent = root_y
        return root_y
    else if root_x.rank > root_y.rank:
        root_y.parent = root_x
        return root_x
    else:
        root_y.parent = root_x
        root_x.rank += 1
        return root_x

Why O(α(n))?

Union by rank: Trees stay balanced. To get rank kk, need 2k\geq 2^k elements. Height is O(logn)O(\log n).

Path compression: After any find, the entire path is flattened.

Together: amortized O(α(n))O(\alpha(n)) where α\alpha is inverse Ackermann. α(n)4\alpha(n) \leq 4 for all practical nn.


Application: Kruskal’s MST

kruskal(graph):
    uf = UnionFind()
    for v in graph.vertices:
        uf.make_set(v)

    for edge in sort_by_weight(edges):
        if uf.find(edge.u) != uf.find(edge.v):
            add edge to MST
            uf.union(edge.u, edge.v)

Without Union-Find: O(EV)O(EV) for connectivity checks. With Union-Find: O(Eα(V))O(E)O(E \cdot \alpha(V)) \approx O(E).


Runtime Summary

OperationTime
make_setO(1)O(1)
findO(α(n))O(\alpha(n)) amortized
unionO(α(n))O(\alpha(n)) amortized

Limitation: No efficient deletion (no “un-union”).