12: Orthogonal Diagonalization and the Spectral Theorem

Symmetric matrices have a special property: they can always be diagonalized using orthogonal eigenvectors. This means you can write $A = QDQ^T$ where $Q$ has orthonormal columns. The “orthogonal” part is crucial: instead of computing a full matrix inverse ($O(n^3)$), you just transpose ($O(n^2)$). Geometrically, symmetric matrices can only stretch along perpendicular axes, never rotate. This is the Spectral Theorem, and it’s why symmetric matrices are the friendliest matrices in linear algebra.

The Spectral Theorem

(Statement)

Any symmetric matrix $A$ (meaning $A^T = A$) can be orthogonally diagonalized:

$$A = QDQ^T$$

where:

  • $D$ is diagonal (eigenvalues on the diagonal)
  • $Q$ is orthogonal (columns are orthonormal eigenvectors)

Key property: $Q^{-1} = Q^T$, so you transpose instead of inverting.
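
A quick numerical check, using NumPy’s symmetric eigensolver `np.linalg.eigh` on a small example matrix (the entries here are just an illustrative choice):

```python
import numpy as np

# A small symmetric matrix (illustrative values).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is the eigensolver for symmetric matrices: real eigenvalues,
# orthonormal eigenvectors returned as the columns of Q.
eigvals, Q = np.linalg.eigh(A)
D = np.diag(eigvals)

print(np.allclose(Q.T @ Q, np.eye(2)))       # True: Q is orthogonal
print(np.allclose(np.linalg.inv(Q), Q.T))    # True: the inverse is just the transpose
print(np.allclose(Q @ D @ Q.T, A))           # True: A = Q D Q^T
```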


(Why This Matters)

Computational savings:

  • Matrix inverse: $O(n^3)$ operations
  • Matrix transpose: $O(n^2)$ operations

When $Q^{-1} = Q^T$, all the expensive inversion work disappears.

Geometric meaning: Rotate to the eigenvector basis (via $Q^T$), scale along axes (via $D$), rotate back (via $Q$).

Symmetric matrices cannot rotate; they only stretch along orthogonal axes. That’s why eigenvectors are automatically perpendicular.


Symmetric Matrices Are Special

(Why Eigenvectors Are Orthogonal)

Theorem: If $A$ is symmetric, eigenvectors from different eigenvalues are orthogonal.

Proof:

Let $A\mathbf{v}_1 = \lambda_1\mathbf{v}_1$ and $A\mathbf{v}_2 = \lambda_2\mathbf{v}_2$ where $\lambda_1 \neq \lambda_2$.

$$\lambda_1(\mathbf{v}_1 \cdot \mathbf{v}_2) = (\lambda_1\mathbf{v}_1) \cdot \mathbf{v}_2 = (A\mathbf{v}_1) \cdot \mathbf{v}_2$$

Since $A$ is symmetric, $(A\mathbf{v}_1) \cdot \mathbf{v}_2 = \mathbf{v}_1 \cdot (A\mathbf{v}_2)$:

$$= \mathbf{v}_1 \cdot (\lambda_2\mathbf{v}_2) = \lambda_2(\mathbf{v}_1 \cdot \mathbf{v}_2)$$

So:

$$\lambda_1(\mathbf{v}_1 \cdot \mathbf{v}_2) = \lambda_2(\mathbf{v}_1 \cdot \mathbf{v}_2)$$

Since $\lambda_1 \neq \lambda_2$, we must have $\mathbf{v}_1 \cdot \mathbf{v}_2 = 0$. ✓


(Repeated Eigenvalues)

When eigenvalues repeat, the corresponding eigenspace might have dimension > 1. But you can always choose an orthonormal basis within that eigenspace using Gram-Schmidt.

Key insight: For symmetric matrices, there’s always enough “room” to find orthogonal eigenvectors, even when eigenvalues repeat. This is why the Spectral Theorem holds for all symmetric matrices.
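
A symmetric eigensolver hands you such an orthonormal basis automatically, even with repeated eigenvalues. A small sketch (the rotation angle and eigenvalues below are arbitrary illustrative choices):

```python
import numpy as np

# Build a symmetric matrix whose eigenvalue 2 has multiplicity two:
# A = R diag(2, 2, 5) R^T for some rotation R (values are illustrative).
theta = 0.7
R = np.array([[np.cos(theta), 0.0, -np.sin(theta)],
              [0.0,           1.0,  0.0],
              [np.sin(theta), 0.0,  np.cos(theta)]])
A = R @ np.diag([2.0, 2.0, 5.0]) @ R.T

eigvals, Q = np.linalg.eigh(A)
print(np.round(eigvals, 6))                          # [2. 2. 5.]: the eigenvalue 2 repeats
print(np.allclose(Q.T @ Q, np.eye(3)))               # True: still a full orthonormal basis
print(np.allclose(Q @ np.diag(eigvals) @ Q.T, A))    # True: A = Q D Q^T
```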


(Symmetric Matrices Have Real Eigenvalues)

Theorem: All eigenvalues of a real symmetric matrix are real.

This is crucial: complex eigenvalues would break the geometric interpretation. The proof pairs $A\mathbf{v} = \lambda\mathbf{v}$ with its complex conjugate and uses $A = A^T$ to show $\lambda = \bar{\lambda}$, which forces $\lambda$ to be real.


Orthogonal Matrices

(Definition)

A matrix $Q$ is orthogonal if:

$$Q^TQ = I$$

Equivalently: $Q^{-1} = Q^T$ (transpose equals inverse).

In terms of columns: If $Q = [\mathbf{q}_1 \mid \cdots \mid \mathbf{q}_n]$, then the columns form an orthonormal set:

$$\mathbf{q}_i \cdot \mathbf{q}_j = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$

(Properties of Orthogonal Matrices)

1. Preserve lengths:

$$\|Q\mathbf{v}\| = \|\mathbf{v}\|$$

Proof: $\|Q\mathbf{v}\|^2 = (Q\mathbf{v})^T(Q\mathbf{v}) = \mathbf{v}^TQ^TQ\mathbf{v} = \mathbf{v}^T\mathbf{v} = \|\mathbf{v}\|^2$

2. Preserve angles:

$$(Q\mathbf{u}) \cdot (Q\mathbf{v}) = \mathbf{u} \cdot \mathbf{v}$$

3. Preserve dot products: since angles are defined via dot products and lengths, this is the same fact as preserving angles.

4. Determinant is $\pm 1$:

$$\det(Q^TQ) = \det(Q^T)\det(Q) = (\det Q)^2 = \det(I) = 1$$

So $\det(Q) = \pm 1$.
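
All four properties are easy to confirm numerically. A sketch using a 2×2 rotation (the angle and test vectors are arbitrary):

```python
import numpy as np

# Rotation by an arbitrary angle; any rotation matrix works here.
theta = np.pi / 6
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

u = np.array([1.0, 2.0])
v = np.array([3.0, -1.0])

print(np.allclose(Q.T @ Q, np.eye(2)))                        # True: orthonormal columns
print(np.isclose(np.linalg.norm(Q @ v), np.linalg.norm(v)))   # True: lengths preserved
print(np.isclose((Q @ u) @ (Q @ v), u @ v))                   # True: dot products preserved
print(np.isclose(np.linalg.det(Q), 1.0))                      # True: a rotation has det +1
```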


(Geometric Interpretation)

Orthogonal matrices represent:

  • Rotations: $\det(Q) = 1$
  • Reflections: $\det(Q) = -1$
  • Combinations: Rotations and reflections preserve lengths and angles; that’s exactly what orthogonal matrices do

Key distinction: Transpose reverses rotations/reflections, but doesn’t undo stretching. That’s why $Q^T = Q^{-1}$ only works for orthogonal matrices (no stretching component).


The Spectral Decomposition

(The Formula)

For a symmetric matrix $A$:

$$A = QDQ^T = \sum_{i=1}^{n} \lambda_i \mathbf{q}_i\mathbf{q}_i^T$$

where $\lambda_i$ are eigenvalues and $\mathbf{q}_i$ are orthonormal eigenvectors.

Outer product form: Each term $\mathbf{q}_i\mathbf{q}_i^T$ is a rank-1 projection matrix. The spectral decomposition says:

A symmetric matrix is a weighted sum of projections onto its eigenvectors, with weights equal to the eigenvalues.
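
In code, this statement is a one-liner: rebuild $A$ from the outer products of its eigenvectors. A sketch on an arbitrary symmetric matrix (the entries are illustrative):

```python
import numpy as np

# An arbitrary symmetric matrix (illustrative values).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

eigvals, Q = np.linalg.eigh(A)

# Sum of rank-1 projections lambda_i * q_i q_i^T over the eigenpairs.
A_rebuilt = sum(lam * np.outer(q, q) for lam, q in zip(eigvals, Q.T))
print(np.allclose(A_rebuilt, A))   # True: the weighted projections add back up to A
```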


(Example: Spectral Decomposition)

$$A = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}$$

Find eigenvalues:

$$\det(A - \lambda I) = \det\begin{bmatrix} 3-\lambda & 1 \\ 1 & 3-\lambda \end{bmatrix} = (3-\lambda)^2 - 1 = \lambda^2 - 6\lambda + 8 = (\lambda - 4)(\lambda - 2) = 0$$

Eigenvalues: $\lambda_1 = 4$, $\lambda_2 = 2$.

Find eigenvectors:

For $\lambda_1 = 4$:

$$(A - 4I)\mathbf{v} = \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}\mathbf{v} = \mathbf{0}$$

Eigenvector: $\mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$, normalize: $\mathbf{q}_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix}$

For $\lambda_2 = 2$:

$$(A - 2I)\mathbf{v} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\mathbf{v} = \mathbf{0}$$

Eigenvector: $\mathbf{v}_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$, normalize: $\mathbf{q}_2 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ -1 \end{bmatrix}$

Build $Q$ and $D$:

$$Q = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \quad D = \begin{bmatrix} 4 & 0 \\ 0 & 2 \end{bmatrix}$$

Verify:

$$QDQ^T = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} 4 & 0 \\ 0 & 2 \end{bmatrix}\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 4 & 2 \\ 4 & -2 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 6 & 2 \\ 2 & 6 \end{bmatrix} = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} = A \quad ✓$$
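
The same result from NumPy, as a sanity check (`np.linalg.eigh` returns eigenvalues in ascending order and eigenvectors only up to sign, so its columns may differ from the hand computation by a factor of $-1$):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])

eigvals, Q = np.linalg.eigh(A)
print(eigvals)                                        # [2. 4.]
print(Q)                                              # normalized eigenvectors as columns
print(np.allclose(Q @ np.diag(eigvals) @ Q.T, A))     # True: A = Q D Q^T
```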

Quadratic Forms and Principal Axes

(Quadratic Forms)

A quadratic form is an expression:

$$Q(\mathbf{x}) = \mathbf{x}^TA\mathbf{x}$$

where $A$ is a symmetric matrix.

Example in $\mathbb{R}^2$:

$$Q(x, y) = ax^2 + 2bxy + cy^2 = \begin{bmatrix} x & y \end{bmatrix}\begin{bmatrix} a & b \\ b & c \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$$

(Geometric Interpretation: Ellipsoids)

The level set $\mathbf{x}^TA\mathbf{x} = 1$ defines a quadric surface:

  • Ellipsoid if all eigenvalues are positive
  • Hyperboloid if eigenvalues have mixed signs
  • Degenerate if any eigenvalue is zero

The eigenvectors of $A$ are the principal axes of this quadric; they point along the directions of maximum and minimum stretching.


(Diagonalizing Quadratic Forms)

Using $A = QDQ^T$, substitute $\mathbf{x} = Q\mathbf{y}$:

$$\mathbf{x}^TA\mathbf{x} = (Q\mathbf{y})^T QDQ^T (Q\mathbf{y}) = \mathbf{y}^T Q^TQ \, D \, Q^TQ \, \mathbf{y} = \mathbf{y}^TD\mathbf{y} = \lambda_1 y_1^2 + \lambda_2 y_2^2 + \cdots + \lambda_n y_n^2$$

In the eigenvector basis, the quadratic form has no cross terms; it’s just a sum of squares with weights $\lambda_i$.

Example: The ellipse $3x^2 + 2xy + 3y^2 = 1$ (from our earlier $A$) becomes $4u^2 + 2v^2 = 1$ in the eigenvector coordinates, where the axes are rotated by $45°$.
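
A quick check of this example: evaluate the quadratic form at a point, rotate the point into eigenvector coordinates with $Q^T$, and confirm the cross term is gone (the test point is arbitrary):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
eigvals, Q = np.linalg.eigh(A)    # eigenvalues [2, 4], eigenvectors at 45 degrees

x = np.array([0.3, -0.8])         # arbitrary test point
y = Q.T @ x                       # same point in eigenvector coordinates

lhs = x @ A @ x                                    # 3x^2 + 2xy + 3y^2
rhs = eigvals[0] * y[0]**2 + eigvals[1] * y[1]**2  # 2u^2 + 4v^2: same form, axes in eigh's order
print(np.isclose(lhs, rhs))                        # True: no cross term survives
```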


Why Transpose ≠ Inverse (Usually)

For a general matrix $A$:

$$A^T \neq A^{-1}$$

Why? Transpose reverses the direction of a transformation (rotations/reflections) but doesn’t undo stretching.

If $A$ stretches by factor $\sigma$ in some direction, then:

  • $A^T$ still stretches by $\sigma$ in the corresponding direction (transpose doesn’t change singular values)
  • $A^{-1}$ compresses by factor $1/\sigma$ (actually inverts the stretching)

When they’re equal: $A^T = A^{-1}$ precisely when $A$ has no stretching component, i.e. when it’s a pure rotation or reflection. These are the orthogonal matrices.
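
A two-matrix comparison makes this concrete (a diagonal stretch versus a rotation; both matrices are illustrative):

```python
import numpy as np

# Pure stretch: transpose is NOT the inverse (both S and S^T still stretch by 2).
S = np.array([[2.0, 0.0],
              [0.0, 1.0]])
print(np.allclose(S.T, np.linalg.inv(S)))   # False

# Pure rotation: transpose IS the inverse (no stretching to undo).
theta = 0.4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
print(np.allclose(R.T, np.linalg.inv(R)))   # True
```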


Connection to SVD

(The Setup)

For a non-square matrix $A$ ($m \times n$), you can’t do eigendecomposition directly. But the matrix $A^TA$ ($n \times n$) is symmetric, so it has orthonormal eigenvectors.

These eigenvectors of $A^TA$ are the right singular vectors of $A$: the optimal input directions.


(Singular Values from $A^TA$)

For an eigenvector $\mathbf{v}$ of $A^TA$ with eigenvalue $\lambda$:

$$\|A\mathbf{v}\|^2 = (A\mathbf{v})^T(A\mathbf{v}) = \mathbf{v}^TA^TA\mathbf{v} = \mathbf{v}^T(\lambda\mathbf{v}) = \lambda\|\mathbf{v}\|^2$$

If $\mathbf{v}$ is normalized ($\|\mathbf{v}\| = 1$):

$$\|A\mathbf{v}\| = \sqrt{\lambda}$$

In particular $\lambda = \|A\mathbf{v}\|^2 \geq 0$, so the square root is always defined. The singular values $\sigma_i$ are defined as:

$$\sigma_i = \sqrt{\lambda_i}$$

where $\lambda_i$ are the eigenvalues of $A^TA$.


(SVD Mechanics)

To build the Singular Value Decomposition $A = U\Sigma V^T$:

  1. Compute $A^TA$ (symmetric, $n \times n$)
  2. Find eigenvalues $\lambda_1, \ldots, \lambda_r$ (rank $r$)
  3. Compute singular values: $\sigma_i = \sqrt{\lambda_i}$
  4. Find eigenvectors of $A^TA$ → normalize → columns of $V$
  5. Apply $A$ to eigenvectors: $\mathbf{u}_i = \frac{1}{\sigma_i}A\mathbf{v}_i$ → columns of $U$
  6. Assemble: $A = U\Sigma V^T$

Key formulas:

$$\mathbf{u} \cdot \mathbf{v} = \mathbf{u}^T\mathbf{v} \quad \text{(dot product as matrix multiplication)}$$

$$(A\mathbf{v})^T = \mathbf{v}^TA^T \quad \text{(transpose reverses the order of a product)}$$

$$\|A\mathbf{v}\|^2 = \mathbf{v}^TA^TA\mathbf{v} \quad \text{(length via } A^TA\text{)}$$
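
The whole recipe fits in a few lines of NumPy. A sketch for a small full-column-rank matrix (the entries are illustrative; for a rank-deficient matrix you would keep only the nonzero eigenvalues):

```python
import numpy as np

# An illustrative 3x2 matrix with full column rank.
A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [1.0, 0.0]])

# Steps 1-4: eigendecomposition of A^T A gives V and the singular values.
eigvals, V = np.linalg.eigh(A.T @ A)
order = np.argsort(eigvals)[::-1]            # SVD convention: descending order
eigvals, V = eigvals[order], V[:, order]
sigma = np.sqrt(eigvals)

# Step 5: u_i = (1/sigma_i) A v_i, assembled column by column.
U = (A @ V) / sigma
Sigma = np.diag(sigma)

# Step 6: check the assembly, and compare against NumPy's own singular values.
print(np.allclose(U @ Sigma @ V.T, A))                           # True
print(np.allclose(sigma, np.linalg.svd(A, compute_uv=False)))    # True
```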

Special Matrices

(Involutions: $A^2 = I$)

Matrices satisfying $A^2 = I$ are called involutions; applying them twice returns to the original.

Examples:

  • Reflections across a line or plane
  • Swap matrices (permutations that swap pairs)

For symmetric involutions, eigenvalues are $\pm 1$.


(Finite Order: $A^n = I$)

Matrices where $A^n = I$ for some integer $n$ have finite order.

Examples:

  • Rotation by $360°/n$ (order $n$)
  • Cyclic permutations

Eigenvalues are $n$-th roots of unity, i.e., of the form $e^{2\pi i k / n}$ for some $k \in \{0, 1, \ldots, n-1\}$.
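
Both kinds of special matrices are easy to check numerically (the 90° rotation and axis reflection below are standard examples):

```python
import numpy as np

# Rotation by 90 degrees: order 4, eigenvalues among the 4th roots of unity (here i and -i).
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])
print(np.allclose(np.linalg.matrix_power(R, 4), np.eye(2)))   # True: R^4 = I
print(np.linalg.eigvals(R))                                   # i and -i (complex, as expected)

# Reflection across the x-axis: a symmetric involution with eigenvalues +1 and -1.
F = np.array([[1.0,  0.0],
              [0.0, -1.0]])
print(np.allclose(F @ F, np.eye(2)))                          # True: F^2 = I
print(np.linalg.eigvalsh(F))                                  # [-1.  1.]
```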


Applications

(Principal Component Analysis)

PCA finds the directions of maximum variance in data. Given a covariance matrix $C$ (symmetric!), the eigenvectors are the principal components, and eigenvalues measure variance along each component.

The spectral decomposition directly gives you the optimal low-rank approximation.
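
A minimal PCA sketch along these lines (the synthetic data and random seed are arbitrary; `np.cov` centers the data internally):

```python
import numpy as np

# Synthetic 2D data with most of its variance along one direction.
rng = np.random.default_rng(0)
t = rng.normal(size=500)
data = np.column_stack([t, 0.5 * t + 0.1 * rng.normal(size=500)])

C = np.cov(data, rowvar=False)        # symmetric covariance matrix (columns are variables)
var, components = np.linalg.eigh(C)   # eigenvalues = variances along each component

# eigh sorts ascending, so the last column is the first principal component.
print(var)                            # small variance, then large variance
print(components[:, -1])              # roughly +/- [0.89, 0.45]: the dominant direction
```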


(Vibrational Modes)

In physics, symmetric matrices appear as stiffness or inertia matrices. Eigenvectors represent vibrational modes (standing waves), and eigenvalues give frequencies.


(Stability Analysis)

For the system $\frac{d\mathbf{x}}{dt} = A\mathbf{x}$ where $A$ is symmetric:

  • Negative eigenvalues → stable (exponential decay)
  • Positive eigenvalues → unstable (exponential growth)
  • Zero eigenvalue → neutral stability

The orthogonal eigenvectors decouple the system into independent modes.
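
A small sketch of that decoupling: project the initial condition onto the eigenvectors, let each mode grow or decay with its own exponential, and rotate back (the matrix and initial condition are illustrative):

```python
import numpy as np

# Symmetric A with one stable mode and one unstable mode (illustrative values).
A = np.array([[-1.0, 2.0],
              [ 2.0, -1.0]])          # eigenvalues: -3 (stable) and 1 (unstable)
eigvals, Q = np.linalg.eigh(A)

def solve(x0, t):
    # x(t) = Q exp(D t) Q^T x0: each mode evolves independently.
    return Q @ (np.exp(eigvals * t) * (Q.T @ x0))

x0 = np.array([1.0, 0.0])
print(solve(x0, 1.0))   # by t = 1 the unstable mode along (1, 1)/sqrt(2) dominates
```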


Summary: Why Symmetric Matrices Are Perfect

Symmetric matrices are the gold standard because:

  1. Always diagonalizable (Spectral Theorem)
  2. Real eigenvalues (no complex numbers)
  3. Orthogonal eigenvectors (automatic perpendicularity)
  4. Efficient inversion ($Q^T$ instead of $Q^{-1}$)
  5. Geometric clarity (stretch along perpendicular axes, no rotation)
  6. Numerical stability (orthogonal transformations preserve conditioning)

The Spectral Theorem says symmetric matrices live in the simplest possible world: they’re diagonal in the right coordinate system, and finding that system is straightforward. Whenever you see $A = A^T$, you know the geometry is clean, the computation is efficient, and the eigenvectors will behave.