8: Orthogonality and Projections

Orthogonality is the geometric idea of “perpendicularity” extended to any dimension. Two vectors are orthogonal when their dot product is zero; they point in completely independent directions. This concept unlocks powerful tools: projections decompose vectors into components, orthogonal bases simplify computations, and the Gram-Schmidt process converts any basis into an orthonormal one. Orthogonality turns complicated geometric problems into simple, coordinate-wise calculations.

The Dot Product

(Definition)

For vectors $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$:

$$\mathbf{u} \cdot \mathbf{v} = u_1v_1 + u_2v_2 + \cdots + u_nv_n = \sum_{i=1}^{n} u_iv_i$$

Matrix notation: $\mathbf{u} \cdot \mathbf{v} = \mathbf{u}^T\mathbf{v}$ (row times column).

Geometric interpretation: The dot product measures how much two vectors “align”:

  • Large positive value: vectors point in similar directions
  • Zero: vectors are perpendicular (orthogonal)
  • Large negative value: vectors point in opposite directions

(Properties)

The dot product is:

  1. Commutative: $\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$
  2. Distributive: $\mathbf{u} \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w}$
  3. Homogeneous: $(c\mathbf{u}) \cdot \mathbf{v} = c(\mathbf{u} \cdot \mathbf{v})$
  4. Positive definite: $\mathbf{v} \cdot \mathbf{v} \geq 0$, with equality iff $\mathbf{v} = \mathbf{0}$

(Length and Angle)

The length (or norm) of a vector:

$$\|\mathbf{v}\| = \sqrt{\mathbf{v} \cdot \mathbf{v}} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$$

The angle between nonzero vectors $\mathbf{u}$ and $\mathbf{v}$:

$$\cos\theta = \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\| \|\mathbf{v}\|}$$

Key insight: The dot product is $\|\mathbf{u}\| \|\mathbf{v}\| \cos\theta$; it captures both magnitude and directional alignment.


(Example)

$$\mathbf{u} = \begin{bmatrix} 3 \\ 4 \end{bmatrix}, \quad \mathbf{v} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
$$\mathbf{u} \cdot \mathbf{v} = 3(1) + 4(2) = 11$$
$$\|\mathbf{u}\| = \sqrt{9 + 16} = 5, \quad \|\mathbf{v}\| = \sqrt{1 + 4} = \sqrt{5}$$
$$\cos\theta = \frac{11}{5\sqrt{5}} = \frac{11\sqrt{5}}{25} \approx 0.9839$$

The angle is $\theta \approx 10.3°$; the vectors are nearly aligned.
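A quick numerical check of this example, as a minimal NumPy sketch (the variable names are ours):

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([1.0, 2.0])

dot = u @ v                      # 11.0
cos_theta = dot / (np.linalg.norm(u) * np.linalg.norm(v))
theta = np.degrees(np.arccos(cos_theta))

print(cos_theta)  # 0.9839...
print(theta)      # 10.30...
```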


Orthogonality

(Definition)

Vectors $\mathbf{u}$ and $\mathbf{v}$ are orthogonal (written $\mathbf{u} \perp \mathbf{v}$) if:

$$\mathbf{u} \cdot \mathbf{v} = 0$$

Geometric meaning: Orthogonal vectors are perpendicular; they point in completely independent directions.

Note: The zero vector $\mathbf{0}$ is orthogonal to every vector, since its dot product with anything is zero.


(Orthogonal Sets)

A set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ is orthogonal if:

$$\mathbf{v}_i \cdot \mathbf{v}_j = 0 \quad \text{for all } i \neq j$$

Key fact: Nonzero orthogonal vectors are automatically linearly independent.

Proof: Suppose $c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k = \mathbf{0}$. Dot both sides with $\mathbf{v}_i$:

$$c_1(\mathbf{v}_1 \cdot \mathbf{v}_i) + \cdots + c_i(\mathbf{v}_i \cdot \mathbf{v}_i) + \cdots + c_k(\mathbf{v}_k \cdot \mathbf{v}_i) = 0$$

All terms vanish except $c_i\|\mathbf{v}_i\|^2 = 0$. Since $\mathbf{v}_i \neq \mathbf{0}$, we have $c_i = 0$. This holds for all $i$, so the set is independent. ✓


(Orthonormal Sets)

A set is orthonormal if it’s orthogonal and every vector has length 1:

$$\mathbf{v}_i \cdot \mathbf{v}_j = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$

This is written compactly as $\mathbf{v}_i \cdot \mathbf{v}_j = \delta_{ij}$ (Kronecker delta).

Example: The standard basis $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ is orthonormal.


(Why Orthonormal Bases Are Perfect)

If $\{\mathbf{q}_1, \ldots, \mathbf{q}_n\}$ is an orthonormal basis, then for any vector $\mathbf{v}$:

$$\mathbf{v} = (\mathbf{v} \cdot \mathbf{q}_1)\mathbf{q}_1 + (\mathbf{v} \cdot \mathbf{q}_2)\mathbf{q}_2 + \cdots + (\mathbf{v} \cdot \mathbf{q}_n)\mathbf{q}_n$$

The coefficients are just dot products; no need to solve a system of equations!

Why this works: Dot both sides with $\mathbf{q}_i$:

$$\mathbf{v} \cdot \mathbf{q}_i = (\mathbf{v} \cdot \mathbf{q}_1)(\mathbf{q}_1 \cdot \mathbf{q}_i) + \cdots + (\mathbf{v} \cdot \mathbf{q}_i)(\mathbf{q}_i \cdot \mathbf{q}_i) + \cdots$$

All terms vanish except $(\mathbf{v} \cdot \mathbf{q}_i) \cdot 1 = \mathbf{v} \cdot \mathbf{q}_i$. ✓
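A minimal NumPy sketch of this fact, assuming a hand-picked orthonormal basis of $\mathbb{R}^2$ (the standard basis rotated by 45°):

```python
import numpy as np

# An orthonormal basis of R^2: the standard basis rotated by 45 degrees.
q1 = np.array([1.0, 1.0]) / np.sqrt(2)
q2 = np.array([-1.0, 1.0]) / np.sqrt(2)

v = np.array([2.0, 3.0])

# The coordinates of v in this basis are just dot products.
c1, c2 = v @ q1, v @ q2
print(np.allclose(c1 * q1 + c2 * q2, v))  # True: v is reconstructed exactly
```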


Projections

(Scalar Projection)

The scalar projection of $\mathbf{v}$ onto $\mathbf{u}$ is:

$$\text{comp}_{\mathbf{u}} \mathbf{v} = \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\|}$$

This is the signed length of the projection: positive if $\mathbf{v}$ points roughly in the direction of $\mathbf{u}$, negative otherwise.


(Vector Projection)

The vector projection (or orthogonal projection) of $\mathbf{v}$ onto $\mathbf{u}$ is:

$$\text{proj}_{\mathbf{u}} \mathbf{v} = \frac{\mathbf{u} \cdot \mathbf{v}}{\mathbf{u} \cdot \mathbf{u}} \mathbf{u} = \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\|^2} \mathbf{u}$$

Geometric meaning: The component of $\mathbf{v}$ that points in the direction of $\mathbf{u}$.

Key property: $\text{proj}_{\mathbf{u}} \mathbf{v}$ is parallel to $\mathbf{u}$.


(Orthogonal Component)

The orthogonal component (or rejection) is:

$$\mathbf{v} - \text{proj}_{\mathbf{u}} \mathbf{v}$$

This is the part of $\mathbf{v}$ that’s perpendicular to $\mathbf{u}$.

Decomposition: Every vector $\mathbf{v}$ splits into:

$$\mathbf{v} = \underbrace{\text{proj}_{\mathbf{u}} \mathbf{v}}_{\parallel \text{ to } \mathbf{u}} + \underbrace{(\mathbf{v} - \text{proj}_{\mathbf{u}} \mathbf{v})}_{\perp \text{ to } \mathbf{u}}$$

(Example: Projection)

Project $\mathbf{v} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$ onto $\mathbf{u} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$.

$$\mathbf{u} \cdot \mathbf{v} = 1(2) + 1(3) = 5$$
$$\mathbf{u} \cdot \mathbf{u} = 1 + 1 = 2$$
$$\text{proj}_{\mathbf{u}} \mathbf{v} = \frac{5}{2}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2.5 \\ 2.5 \end{bmatrix}$$

Orthogonal component:

$$\mathbf{v} - \text{proj}_{\mathbf{u}} \mathbf{v} = \begin{bmatrix} 2 \\ 3 \end{bmatrix} - \begin{bmatrix} 2.5 \\ 2.5 \end{bmatrix} = \begin{bmatrix} -0.5 \\ 0.5 \end{bmatrix}$$

Verify orthogonality:

$$\mathbf{u} \cdot \begin{bmatrix} -0.5 \\ 0.5 \end{bmatrix} = 1(-0.5) + 1(0.5) = 0 \quad ✓$$
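The same computation as a NumPy sketch (`proj` is our helper name, not a library function):

```python
import numpy as np

def proj(u, v):
    """Orthogonal projection of v onto u."""
    return (u @ v) / (u @ u) * u

u = np.array([1.0, 1.0])
v = np.array([2.0, 3.0])

p = proj(u, v)   # [2.5, 2.5], parallel to u
r = v - p        # [-0.5, 0.5], the orthogonal component
print(u @ r)     # 0.0: r is perpendicular to u
```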

Orthogonal Complements

(Definition)

For a subspace $W \subseteq \mathbb{R}^n$, the orthogonal complement $W^\perp$ is:

$$W^\perp = \{\mathbf{v} \in \mathbb{R}^n \mid \mathbf{v} \cdot \mathbf{w} = 0 \text{ for all } \mathbf{w} \in W\}$$

Interpretation: $W^\perp$ contains all vectors perpendicular to everything in $W$.


(Key Properties)

  1. $W^\perp$ is always a subspace
  2. $\dim(W) + \dim(W^\perp) = n$
  3. $W \cap W^\perp = \{\mathbf{0}\}$
  4. $(W^\perp)^\perp = W$
  5. Every vector $\mathbf{v} \in \mathbb{R}^n$ decomposes uniquely as $\mathbf{v} = \mathbf{w} + \mathbf{w}^\perp$ where $\mathbf{w} \in W$ and $\mathbf{w}^\perp \in W^\perp$

Direct sum notation: $\mathbb{R}^n = W \oplus W^\perp$


(Finding $W^\perp$)

If $W = \text{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$, then $\mathbf{x} \in W^\perp$ iff:

$$\begin{cases} \mathbf{v}_1 \cdot \mathbf{x} = 0 \\ \mathbf{v}_2 \cdot \mathbf{x} = 0 \\ \vdots \\ \mathbf{v}_k \cdot \mathbf{x} = 0 \end{cases}$$

This is a homogeneous system: $A\mathbf{x} = \mathbf{0}$, where the rows of $A$ are $\mathbf{v}_1^T, \ldots, \mathbf{v}_k^T$.

So: $W^\perp = \text{null}(A)$.


(Example: Orthogonal Complement)

Find $W^\perp$ where $W = \text{span}\left\{\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}\right\}$ in $\mathbb{R}^3$.

We need $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$ such that:

$$x_1 + 2x_2 + x_3 = 0$$

This is a plane through the origin. The general solution:

$$\mathbf{x} = x_2\begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} + x_3\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}$$

So $W^\perp = \text{span}\left\{\begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}\right\}$.

Dimension check: $\dim(W) + \dim(W^\perp) = 1 + 2 = 3 = n$
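Numerically, this is a null-space computation. A sketch using SciPy's `null_space` (it returns an orthonormal basis, so the vectors differ from the hand computation, but they span the same plane):

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0, 1.0]])    # row spans W
W_perp = null_space(A)             # columns: an orthonormal basis of W-perp

print(W_perp.shape)                # (3, 2): dim(W_perp) = 2, as expected
print(np.allclose(A @ W_perp, 0))  # True: each basis vector is orthogonal to W
```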


The Gram-Schmidt Process

(The Problem)

Given a basis $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ for a subspace, we want to construct an orthogonal (or orthonormal) basis $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ for the same subspace.


(The Algorithm)

Start with the first vector:

$$\mathbf{u}_1 = \mathbf{v}_1$$

For each subsequent vector, subtract off the projections onto all previous orthogonal vectors:

$$\mathbf{u}_2 = \mathbf{v}_2 - \text{proj}_{\mathbf{u}_1} \mathbf{v}_2$$
$$\mathbf{u}_3 = \mathbf{v}_3 - \text{proj}_{\mathbf{u}_1} \mathbf{v}_3 - \text{proj}_{\mathbf{u}_2} \mathbf{v}_3$$

In general:

$$\mathbf{u}_k = \mathbf{v}_k - \sum_{j=1}^{k-1} \text{proj}_{\mathbf{u}_j} \mathbf{v}_k = \mathbf{v}_k - \sum_{j=1}^{k-1} \frac{\mathbf{u}_j \cdot \mathbf{v}_k}{\mathbf{u}_j \cdot \mathbf{u}_j} \mathbf{u}_j$$

Result: $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ is an orthogonal basis.

To make it orthonormal: Normalize each vector:

$$\mathbf{q}_i = \frac{\mathbf{u}_i}{\|\mathbf{u}_i\|}$$
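A minimal implementation of the classical version of the algorithm, as a sketch (in floating point, the "modified" Gram-Schmidt variant is preferred for numerical stability):

```python
import numpy as np

def gram_schmidt(V):
    """Orthonormalize the columns of V (assumed linearly independent)."""
    Q = []
    for v in V.T:                       # process basis vectors left to right
        u = v.astype(float)
        for q in Q:                     # subtract projections onto earlier q's
            u = u - (q @ v) * q         # proj_q(v) = (q . v) q since ||q|| = 1
        Q.append(u / np.linalg.norm(u))
    return np.column_stack(Q)
```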

(Why This Works)

At each step, we take $\mathbf{v}_k$ and remove its components along all previous orthogonal directions. What remains ($\mathbf{u}_k$) is guaranteed to be orthogonal to $\mathbf{u}_1, \ldots, \mathbf{u}_{k-1}$.

Key insight: The span of $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ equals the span of $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ at each stage; we’re just changing the basis vectors, not the subspace.


(Example: Gram-Schmidt in $\mathbb{R}^3$)

Orthogonalize the basis:

$$\mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \quad \mathbf{v}_3 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$$

Step 1: $\mathbf{u}_1 = \mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$

Step 2: Compute $\mathbf{u}_2$:

$$\text{proj}_{\mathbf{u}_1} \mathbf{v}_2 = \frac{\mathbf{u}_1 \cdot \mathbf{v}_2}{\mathbf{u}_1 \cdot \mathbf{u}_1} \mathbf{u}_1 = \frac{1 + 0}{1 + 1}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$$
$$\mathbf{u}_2 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} - \frac{1}{2}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1/2 \\ -1/2 \\ 1 \end{bmatrix}$$

(Can scale by 2: $\mathbf{u}_2 = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}$)

Step 3: Compute $\mathbf{u}_3$:

$$\text{proj}_{\mathbf{u}_1} \mathbf{v}_3 = \frac{0 + 1}{2}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$$
$$\text{proj}_{\mathbf{u}_2} \mathbf{v}_3 = \frac{0 - 1 + 2}{1 + 1 + 4}\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} = \frac{1}{6}\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}$$
$$\mathbf{u}_3 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} - \frac{1}{2}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} - \frac{1}{6}\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} = \begin{bmatrix} -1/2 - 1/6 \\ 1/2 + 1/6 \\ 1 - 1/3 \end{bmatrix} = \begin{bmatrix} -2/3 \\ 2/3 \\ 2/3 \end{bmatrix}$$

(Can scale by 3: $\mathbf{u}_3 = \begin{bmatrix} -2 \\ 2 \\ 2 \end{bmatrix}$, or by $-3/2$: $\mathbf{u}_3 = \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix}$)

Verify orthogonality: Check all pairs have dot product zero.

Normalize for orthonormal basis:

$$\mathbf{q}_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad \mathbf{q}_2 = \frac{1}{\sqrt{6}}\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}, \quad \mathbf{q}_3 = \frac{1}{\sqrt{3}}\begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix}$$
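A quick check that these hand-computed vectors really are orthonormal (a NumPy sketch):

```python
import numpy as np

Q = np.column_stack([
    np.array([1, 1, 0]) / np.sqrt(2),
    np.array([1, -1, 2]) / np.sqrt(6),
    np.array([1, -1, -1]) / np.sqrt(3),
])
print(np.allclose(Q.T @ Q, np.eye(3)))  # True: Q^T Q = I
```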

QR Factorization

(The Connection)

The Gram-Schmidt process gives the QR factorization of a matrix.

If $A = [\mathbf{v}_1 \mid \cdots \mid \mathbf{v}_n]$ has linearly independent columns, then:

$$A = QR$$

where:

  • $Q = [\mathbf{q}_1 \mid \cdots \mid \mathbf{q}_n]$ has orthonormal columns (from Gram-Schmidt)
  • $R$ is upper triangular (encodes the projection coefficients)

Why this matters: QR factorization is numerically stable and used for solving least squares problems, computing eigenvalues (QR algorithm), and more.
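In practice, QR is computed with a library call rather than hand-rolled Gram-Schmidt (NumPy's `qr` uses Householder reflections, which are more stable; column signs may differ from the Gram-Schmidt result). A sketch using the basis from the example above:

```python
import numpy as np

# Columns are v1, v2, v3 from the Gram-Schmidt example.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

Q, R = np.linalg.qr(A)
print(np.allclose(Q @ R, A))            # True: A = QR
print(np.allclose(Q.T @ Q, np.eye(3)))  # True: orthonormal columns
print(np.allclose(np.tril(R, -1), 0))   # True: R is upper triangular
```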


Applications

(Least Squares)

To solve an overdetermined system $A\mathbf{x} = \mathbf{b}$ (more equations than unknowns, so typically inconsistent), find the $\mathbf{x}$ that minimizes $\|A\mathbf{x} - \mathbf{b}\|$.

At the minimizer, $A\mathbf{\hat{x}}$ is the projection of $\mathbf{b}$ onto $\text{col}(A)$, which gives:

$$\mathbf{\hat{x}} = (A^TA)^{-1}A^T\mathbf{b}$$

If $A = QR$ with $Q$ having orthonormal columns, this simplifies dramatically:

$$\mathbf{\hat{x}} = R^{-1}Q^T\mathbf{b}$$

Why? $A^TA = (QR)^T(QR) = R^TQ^TQR = R^TR$ (since $Q^TQ = I$), so $\mathbf{\hat{x}} = (R^TR)^{-1}R^TQ^T\mathbf{b} = R^{-1}Q^T\mathbf{b}$: just a triangular solve.
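A sketch of QR-based least squares on a small line-fitting problem (the data points are invented for illustration):

```python
import numpy as np
from scipy.linalg import solve_triangular

# Fit y ~ c0 + c1*t to four points: 4 equations, 2 unknowns.
t = np.array([0.0, 1.0, 2.0, 3.0])
b = np.array([1.1, 1.9, 3.2, 3.8])
A = np.column_stack([np.ones_like(t), t])

Q, R = np.linalg.qr(A)
x_hat = solve_triangular(R, Q.T @ b)  # back-substitution: R x = Q^T b

print(x_hat)  # ~ [1.09, 0.94]
print(np.allclose(x_hat, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```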


(Orthogonal Decomposition)

Any vector space splits naturally into orthogonal complements. For example:

$$\mathbb{R}^n = \text{row}(A) \oplus \text{null}(A)$$
$$\mathbb{R}^m = \text{col}(A) \oplus \text{null}(A^T)$$

These are the four fundamental subspaces, paired as orthogonal complements.


(Signal Processing)

In Fourier analysis, sine and cosine waves form an orthogonal basis for periodic functions. The Fourier coefficients are just inner products: projections onto each frequency component.
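A toy illustration: recover one coefficient of a signal by projecting onto a single sine mode (the signal and sampling here are invented; on $[0, 2\pi]$, $\|\sin 3t\|^2 = \pi$):

```python
import numpy as np

# Signal with two frequency components.
t = np.linspace(0, 2 * np.pi, 1000, endpoint=False)
f = 2 * np.sin(t) + 0.5 * np.sin(3 * t)

# Inner product <f, sin(3t)> via a Riemann sum, divided by ||sin(3t)||^2 = pi.
c3 = (f @ np.sin(3 * t)) * (t[1] - t[0]) / np.pi
print(c3)  # ~ 0.5: the coefficient of sin(3t)
```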


Summary: Why Orthogonality Simplifies Everything

Orthogonality turns geometry into algebra:

  1. Dot products compute angles and lengths without trigonometry
  2. Orthogonal vectors are automatically independent (no redundancy)
  3. Orthonormal bases make coordinates trivial (just dot products)
  4. Projections decompose vectors into parallel and perpendicular parts
  5. Gram-Schmidt converts any basis into an orthonormal one
  6. Orthogonal matrices preserve structure (lengths, angles, volume)

When vectors are orthogonal, you can work component-wise: no cross-terms, no interactions, just clean decomposition. This is why orthonormal bases are the gold standard: they make every calculation as simple as possible while preserving all the geometry.