
8: Orthogonality and Projections

Orthogonality is the geometric idea of "perpendicularity" extended to any dimension. Two vectors are orthogonal when their dot product is zero; they point in completely independent directions. This concept unlocks powerful tools: projections decompose vectors into components, orthogonal bases simplify computations, and the Gram-Schmidt process converts any basis into an orthonormal one. Orthogonality turns complicated geometric problems into simple, coordinate-wise calculations.

The Dot Product

(Definition)

For vectors $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$:

$$\mathbf{u} \cdot \mathbf{v} = u_1v_1 + u_2v_2 + \cdots + u_nv_n = \sum_{i=1}^{n} u_iv_i$$

Matrix notation: $\mathbf{u} \cdot \mathbf{v} = \mathbf{u}^T\mathbf{v}$ (row times column).

Geometric interpretation: The dot product measures how much two vectors "align":

  • Large positive value: vectors point in similar directions
  • Zero: vectors are perpendicular (orthogonal)
  • Large negative value: vectors point in opposite directions

(Properties)

The dot product is:

  1. Commutative: $\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$
  2. Distributive: $\mathbf{u} \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w}$
  3. Homogeneous: $(c\mathbf{u}) \cdot \mathbf{v} = c(\mathbf{u} \cdot \mathbf{v})$
  4. Positive definite: $\mathbf{v} \cdot \mathbf{v} \geq 0$, with equality iff $\mathbf{v} = \mathbf{0}$

(Length and Angle)

The length (or norm) of a vector:

$$\|\mathbf{v}\| = \sqrt{\mathbf{v} \cdot \mathbf{v}} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$$

The angle $\theta$ between nonzero vectors $\mathbf{u}$ and $\mathbf{v}$ satisfies:

$$\cos\theta = \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\| \|\mathbf{v}\|}$$

Key insight: The dot product equals $\|\mathbf{u}\| \|\mathbf{v}\| \cos\theta$; it captures both magnitude and directional alignment.


(Example)

$$\mathbf{u} = \begin{bmatrix} 3 \\ 4 \end{bmatrix}, \quad \mathbf{v} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$

$$\mathbf{u} \cdot \mathbf{v} = 3(1) + 4(2) = 11$$

$$\|\mathbf{u}\| = \sqrt{9 + 16} = 5, \quad \|\mathbf{v}\| = \sqrt{1 + 4} = \sqrt{5}$$

$$\cos\theta = \frac{11}{5\sqrt{5}} = \frac{11\sqrt{5}}{25} \approx 0.9839$$

The angle is $\theta \approx 10.3^\circ$; the vectors are nearly aligned.
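The example above can be checked numerically. Here is a minimal sketch in plain Python; the function names `dot`, `norm`, and `angle_deg` are illustrative, not part of any standard library:

```python
import math

def dot(u, v):
    """Dot product: sum of componentwise products."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(v):
    """Length: sqrt(v . v)."""
    return math.sqrt(dot(v, v))

def angle_deg(u, v):
    """Angle between nonzero vectors, in degrees."""
    return math.degrees(math.acos(dot(u, v) / (norm(u) * norm(v))))

u, v = [3, 4], [1, 2]
print(dot(u, v))                   # 11
print(norm(u))                     # 5.0
print(round(angle_deg(u, v), 1))   # 10.3
```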


Orthogonality

(Definition)

Vectors $\mathbf{u}$ and $\mathbf{v}$ are orthogonal (written $\mathbf{u} \perp \mathbf{v}$) if:

$$\mathbf{u} \cdot \mathbf{v} = 0$$

Geometric meaning: Orthogonal vectors are perpendicular; they point in completely independent directions.

Note: The zero vector $\mathbf{0}$ is orthogonal to every vector (by convention).


(Orthogonal Sets)

A set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ is orthogonal if:

$$\mathbf{v}_i \cdot \mathbf{v}_j = 0 \quad \text{for all } i \neq j$$

Key fact: Nonzero orthogonal vectors are automatically linearly independent.

Proof: Suppose $c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k = \mathbf{0}$. Dot both sides with $\mathbf{v}_i$:

$$c_1(\mathbf{v}_1 \cdot \mathbf{v}_i) + \cdots + c_i(\mathbf{v}_i \cdot \mathbf{v}_i) + \cdots + c_k(\mathbf{v}_k \cdot \mathbf{v}_i) = 0$$

All terms vanish except $c_i\|\mathbf{v}_i\|^2 = 0$. Since $\mathbf{v}_i \neq \mathbf{0}$, we have $c_i = 0$. This holds for all $i$, so the set is independent. ✓


(Orthonormal Sets)

A set is orthonormal if it's orthogonal and every vector has length 1:

$$\mathbf{v}_i \cdot \mathbf{v}_j = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$

This is written compactly as $\mathbf{v}_i \cdot \mathbf{v}_j = \delta_{ij}$ (Kronecker delta).

Example: The standard basis $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ is orthonormal.


(Why Orthonormal Bases Are Perfect)

If $\{\mathbf{q}_1, \ldots, \mathbf{q}_n\}$ is an orthonormal basis, then for any vector $\mathbf{v}$:

$$\mathbf{v} = (\mathbf{v} \cdot \mathbf{q}_1)\mathbf{q}_1 + (\mathbf{v} \cdot \mathbf{q}_2)\mathbf{q}_2 + \cdots + (\mathbf{v} \cdot \mathbf{q}_n)\mathbf{q}_n$$

The coefficients are just dot products; no need to solve a system of equations!

Why this works: Dot both sides with $\mathbf{q}_i$:

$$\mathbf{v} \cdot \mathbf{q}_i = (\mathbf{v} \cdot \mathbf{q}_1)(\mathbf{q}_1 \cdot \mathbf{q}_i) + \cdots + (\mathbf{v} \cdot \mathbf{q}_i)(\mathbf{q}_i \cdot \mathbf{q}_i) + \cdots$$

All terms vanish except $(\mathbf{v} \cdot \mathbf{q}_i) \cdot 1 = \mathbf{v} \cdot \mathbf{q}_i$. ✓
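The "coefficients are just dot products" claim is easy to demonstrate numerically. This sketch uses an orthonormal basis of $\mathbb{R}^3$ (the one derived in the Gram-Schmidt example later in this chapter); the test vector `v` is an arbitrary choice for illustration:

```python
import numpy as np

# An orthonormal basis of R^3.
q1 = np.array([1, 1, 0]) / np.sqrt(2)
q2 = np.array([1, -1, 2]) / np.sqrt(6)
q3 = np.array([1, -1, -1]) / np.sqrt(3)

v = np.array([4.0, 2.0, 7.0])

# Coordinates are plain dot products -- no linear system to solve.
coeffs = [q.dot(v) for q in (q1, q2, q3)]
reconstructed = sum(c * q for c, q in zip(coeffs, (q1, q2, q3)))

print(np.allclose(reconstructed, v))  # True
```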


Projections

(Scalar Projection)

The scalar projection of $\mathbf{v}$ onto $\mathbf{u}$ is:

$$\text{comp}_{\mathbf{u}} \mathbf{v} = \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\|}$$

This is the signed length of the projection: positive if $\mathbf{v}$ points roughly in the direction of $\mathbf{u}$, negative otherwise.


(Vector Projection)

The vector projection (or orthogonal projection) of $\mathbf{v}$ onto $\mathbf{u}$ is:

$$\text{proj}_{\mathbf{u}} \mathbf{v} = \frac{\mathbf{u} \cdot \mathbf{v}}{\mathbf{u} \cdot \mathbf{u}} \mathbf{u} = \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\|^2} \mathbf{u}$$

Geometric meaning: The component of $\mathbf{v}$ that points in the direction of $\mathbf{u}$.

Key property: $\text{proj}_{\mathbf{u}} \mathbf{v}$ is parallel to $\mathbf{u}$.


(Orthogonal Component)

The orthogonal component (or rejection) is:

$$\mathbf{v} - \text{proj}_{\mathbf{u}} \mathbf{v}$$

This is the part of $\mathbf{v}$ that's perpendicular to $\mathbf{u}$.

Decomposition: Every vector $\mathbf{v}$ splits into:

$$\mathbf{v} = \underbrace{\text{proj}_{\mathbf{u}} \mathbf{v}}_{\parallel \text{ to } \mathbf{u}} + \underbrace{(\mathbf{v} - \text{proj}_{\mathbf{u}} \mathbf{v})}_{\perp \text{ to } \mathbf{u}}$$

(Example: Projection)

Project $\mathbf{v} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$ onto $\mathbf{u} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$.

$$\mathbf{u} \cdot \mathbf{v} = 1(2) + 1(3) = 5, \quad \mathbf{u} \cdot \mathbf{u} = 1 + 1 = 2$$

$$\text{proj}_{\mathbf{u}} \mathbf{v} = \frac{5}{2}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2.5 \\ 2.5 \end{bmatrix}$$

Orthogonal component:

$$\mathbf{v} - \text{proj}_{\mathbf{u}} \mathbf{v} = \begin{bmatrix} 2 \\ 3 \end{bmatrix} - \begin{bmatrix} 2.5 \\ 2.5 \end{bmatrix} = \begin{bmatrix} -0.5 \\ 0.5 \end{bmatrix}$$

Verify orthogonality:

$$\mathbf{u} \cdot \begin{bmatrix} -0.5 \\ 0.5 \end{bmatrix} = 1(-0.5) + 1(0.5) = 0 \quad ✓$$
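The projection/rejection decomposition above translates directly to code. A minimal sketch (the helper name `project` is illustrative):

```python
import numpy as np

def project(v, u):
    """Vector projection of v onto u: (u.v / u.u) u."""
    return (u.dot(v) / u.dot(u)) * u

v = np.array([2.0, 3.0])
u = np.array([1.0, 1.0])

p = project(v, u)   # parallel part:      [2.5, 2.5]
r = v - p           # perpendicular part: [-0.5, 0.5]

print(p, r)
print(u.dot(r))     # rejection is orthogonal to u: 0.0
```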

Orthogonal Complements

(Definition)

For a subspace $W \subseteq \mathbb{R}^n$, the orthogonal complement $W^\perp$ is:

$$W^\perp = \{\mathbf{v} \in \mathbb{R}^n \mid \mathbf{v} \cdot \mathbf{w} = 0 \text{ for all } \mathbf{w} \in W\}$$

Interpretation: $W^\perp$ contains all vectors perpendicular to everything in $W$.


(Key Properties)

  1. $W^\perp$ is always a subspace
  2. $\dim(W) + \dim(W^\perp) = n$
  3. $W \cap W^\perp = \{\mathbf{0}\}$
  4. $(W^\perp)^\perp = W$
  5. Every vector $\mathbf{v} \in \mathbb{R}^n$ decomposes uniquely as $\mathbf{v} = \mathbf{w} + \mathbf{w}^\perp$ where $\mathbf{w} \in W$ and $\mathbf{w}^\perp \in W^\perp$

Direct sum notation: $\mathbb{R}^n = W \oplus W^\perp$


(Finding $W^\perp$)

If $W = \text{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$, then $\mathbf{x} \in W^\perp$ iff:

$$\begin{cases} \mathbf{v}_1 \cdot \mathbf{x} = 0 \\ \mathbf{v}_2 \cdot \mathbf{x} = 0 \\ \vdots \\ \mathbf{v}_k \cdot \mathbf{x} = 0 \end{cases}$$

This is a homogeneous system $A\mathbf{x} = \mathbf{0}$ where the rows of $A$ are $\mathbf{v}_1^T, \ldots, \mathbf{v}_k^T$.

So: $W^\perp = \text{null}(A)$.


(Example: Orthogonal Complement)

Find $W^\perp$ where $W = \text{span}\left\{\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}\right\}$ in $\mathbb{R}^3$.

We need $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$ such that:

$$x_1 + 2x_2 + x_3 = 0$$

This is a plane through the origin. The general solution:

$$\mathbf{x} = x_2\begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} + x_3\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}$$

So $W^\perp = \text{span}\left\{\begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}\right\}$.

Dimension check: $\dim(W) + \dim(W^\perp) = 1 + 2 = 3 = n$ ✓
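A quick numeric sanity check of this example: both spanning vectors of $W^\perp$ are orthogonal to $\mathbf{w} = (1, 2, 1)$, and a rank computation confirms the dimension count:

```python
import numpy as np

w = np.array([1, 2, 1])
b1 = np.array([-2, 1, 0])
b2 = np.array([-1, 0, 1])

print(w.dot(b1), w.dot(b2))  # 0 0

# {b1, b2} is independent, so dim(W-perp) = 2 and 1 + 2 = 3 = n.
rank = np.linalg.matrix_rank(np.column_stack([b1, b2]))
print(1 + rank)  # 3
```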


The Gram-Schmidt Process

(The Problem)

Given a basis $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ for a subspace, we want to construct an orthogonal (or orthonormal) basis $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ for the same subspace.


(The Algorithm)

Start with the first vector:

$$\mathbf{u}_1 = \mathbf{v}_1$$

For each subsequent vector, subtract off the projections onto all previous orthogonal vectors:

$$\mathbf{u}_2 = \mathbf{v}_2 - \text{proj}_{\mathbf{u}_1} \mathbf{v}_2$$

$$\mathbf{u}_3 = \mathbf{v}_3 - \text{proj}_{\mathbf{u}_1} \mathbf{v}_3 - \text{proj}_{\mathbf{u}_2} \mathbf{v}_3$$

In general:

$$\mathbf{u}_k = \mathbf{v}_k - \sum_{j=1}^{k-1} \text{proj}_{\mathbf{u}_j} \mathbf{v}_k = \mathbf{v}_k - \sum_{j=1}^{k-1} \frac{\mathbf{u}_j \cdot \mathbf{v}_k}{\mathbf{u}_j \cdot \mathbf{u}_j} \mathbf{u}_j$$

Result: $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ is an orthogonal basis.

To make it orthonormal: Normalize each vector:

$$\mathbf{q}_i = \frac{\mathbf{u}_i}{\|\mathbf{u}_i\|}$$
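The algorithm is short in code. This sketch normalizes each vector as it goes, so each projection is simply $(\mathbf{q} \cdot \mathbf{u})\mathbf{q}$; it assumes the input vectors are linearly independent (the function name `gram_schmidt` is illustrative):

```python
import numpy as np

def gram_schmidt(vectors):
    """Return an orthonormal basis for span(vectors)."""
    basis = []
    for v in vectors:
        u = np.array(v, dtype=float)
        # Remove the component along each previous direction.
        for q in basis:
            u -= q.dot(u) * q  # q is unit length, so proj_q(u) = (q.u) q
        u /= np.linalg.norm(u)  # normalize (fails only if input is dependent)
        basis.append(u)
    return basis

qs = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
print(qs)  # matches the worked example below, up to sign
```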

(Why This Works)

At each step, we take $\mathbf{v}_k$ and remove its components along all previous orthogonal directions. What remains ($\mathbf{u}_k$) is guaranteed to be orthogonal to $\mathbf{u}_1, \ldots, \mathbf{u}_{k-1}$.

Key insight: The span of $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ equals the span of $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ at each stage; we're just changing the basis vectors, not the subspace.


(Example: Gram-Schmidt in $\mathbb{R}^3$)

Orthogonalize the basis:

$$\mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \quad \mathbf{v}_3 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$$

Step 1:

$$\mathbf{u}_1 = \mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$$

Step 2: Compute $\mathbf{u}_2$:

$$\text{proj}_{\mathbf{u}_1} \mathbf{v}_2 = \frac{\mathbf{u}_1 \cdot \mathbf{v}_2}{\mathbf{u}_1 \cdot \mathbf{u}_1} \mathbf{u}_1 = \frac{1 + 0}{1 + 1}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$$

$$\mathbf{u}_2 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} - \frac{1}{2}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1/2 \\ -1/2 \\ 1 \end{bmatrix}$$

(Can scale by 2: $\mathbf{u}_2 = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}$)

Step 3: Compute $\mathbf{u}_3$:

$$\text{proj}_{\mathbf{u}_1} \mathbf{v}_3 = \frac{0 + 1}{2}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$$

$$\text{proj}_{\mathbf{u}_2} \mathbf{v}_3 = \frac{0 - 1 + 2}{1 + 1 + 4}\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} = \frac{1}{6}\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}$$

$$\mathbf{u}_3 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} - \frac{1}{2}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} - \frac{1}{6}\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} = \begin{bmatrix} -1/2 - 1/6 \\ 1/2 + 1/6 \\ 1 - 1/3 \end{bmatrix} = \begin{bmatrix} -2/3 \\ 2/3 \\ 2/3 \end{bmatrix}$$

(Can scale by 3: $\mathbf{u}_3 = \begin{bmatrix} -2 \\ 2 \\ 2 \end{bmatrix}$, or by $-3/2$: $\mathbf{u}_3 = \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix}$)

Verify orthogonality: Check all pairs have dot product zero.

Normalize for orthonormal basis:

$$\mathbf{q}_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad \mathbf{q}_2 = \frac{1}{\sqrt{6}}\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}, \quad \mathbf{q}_3 = \frac{1}{\sqrt{3}}\begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix}$$

QR Factorization

(The Connection)

The Gram-Schmidt process gives the QR factorization of a matrix.

If $A = [\mathbf{v}_1 \mid \cdots \mid \mathbf{v}_n]$ has linearly independent columns, then:

$$A = QR$$

where:

  • $Q = [\mathbf{q}_1 \mid \cdots \mid \mathbf{q}_n]$ has orthonormal columns (from Gram-Schmidt)
  • $R$ is upper triangular (encodes the projection coefficients)

Why this matters: QR factorization is numerically stable and used for solving least squares problems, computing eigenvalues (QR algorithm), and more.
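A sketch of the $A = QR$ contract using numpy's built-in factorization on the Gram-Schmidt example matrix. (Note that `np.linalg.qr` uses Householder reflections rather than Gram-Schmidt internally, so the signs of columns may differ from a hand computation, but the factorization satisfies the same properties.)

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

Q, R = np.linalg.qr(A)

print(np.allclose(Q @ R, A))             # A = QR
print(np.allclose(Q.T @ Q, np.eye(3)))   # Q has orthonormal columns
print(np.allclose(R, np.triu(R)))        # R is upper triangular
```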


Applications

(Least Squares)

To solve the inconsistent system $A\mathbf{x} = \mathbf{b}$ (more equations than unknowns), find the $\mathbf{x}$ that minimizes $\|A\mathbf{x} - \mathbf{b}\|$.

The solution makes $A\mathbf{\hat{x}}$ the projection of $\mathbf{b}$ onto $\text{col}(A)$:

$$\mathbf{\hat{x}} = (A^TA)^{-1}A^T\mathbf{b}$$

If $A = QR$ (orthonormal columns in $Q$), this simplifies dramatically:

$$\mathbf{\hat{x}} = R^{-1}Q^T\mathbf{b}$$

Why? $A^TA = (QR)^T(QR) = R^TQ^TQR = R^TR$ (since $Q^TQ = I$), so the normal equations $R^TR\mathbf{\hat{x}} = R^TQ^T\mathbf{b}$ reduce to $R\mathbf{\hat{x}} = Q^T\mathbf{b}$, a triangular system solved by back substitution.
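A sketch comparing the two routes on a small line-fitting problem, $y \approx c_0 + c_1 x$ (the data values here are made up for illustration):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 2.2, 2.9, 4.1])

# Design matrix: 4 equations, 2 unknowns (intercept and slope).
A = np.column_stack([np.ones_like(x), x])

# Normal equations: (A^T A) xhat = A^T b
xhat_normal = np.linalg.solve(A.T @ A, A.T @ y)

# QR route: R xhat = Q^T b (a triangular solve)
Q, R = np.linalg.qr(A)
xhat_qr = np.linalg.solve(R, Q.T @ y)

print(np.allclose(xhat_normal, xhat_qr))  # True
```

In floating point, the QR route is preferred: it avoids forming $A^TA$, which squares the condition number of the problem.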


(Orthogonal Decomposition)

Any vector space splits naturally into orthogonal complements. For example:

$$\mathbb{R}^n = \text{row}(A) \oplus \text{null}(A) \qquad \mathbb{R}^m = \text{col}(A) \oplus \text{null}(A^T)$$

These are the four fundamental subspaces, paired as orthogonal complements.


(Signal Processing)

In Fourier analysis, sine and cosine waves form an orthogonal basis for periodic functions. The Fourier coefficients are just inner products: projections onto each frequency component.


Summary: Why Orthogonality Simplifies Everything

Orthogonality turns geometry into algebra:

  1. Dot products compute angles and lengths without trigonometry
  2. Orthogonal vectors are automatically independent (no redundancy)
  3. Orthonormal bases make coordinates trivial (just dot products)
  4. Projections decompose vectors into parallel and perpendicular parts
  5. Gram-Schmidt converts any basis into an orthonormal one
  6. Orthogonal matrices preserve structure (lengths, angles, volume)

When vectors are orthogonal, you can work component-wise: no cross-terms, no interactions, just clean decomposition. This is why orthonormal bases are the gold standard: they make every calculation as simple as possible while preserving all the geometry.