notebook · ch. 1

LINEAR
ALGEBRA

Vectors, matrices, transformations.

A workbook in thirteen pages — the language of straight things.

02 · Vectors

An arrow in space — magnitude and direction. Or, equivalently, an ordered list of numbers: v = (3, 2).

  • addition: place head-to-tail, draw the resultant.
  • scalar multiplication: cv stretches by c; negative c flips direction.
  • Together these two operations are all of linear algebra.
u + v = (u₁+v₁, u₂+v₂)
u and v placed head-to-tail; the resultant is u + v.
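
A quick sketch of both operations (Python with numpy, assumed here purely for illustration):

    import numpy as np

    u = np.array([3.0, 2.0])
    v = np.array([-1.0, 4.0])

    print(u + v)      # head-to-tail: (3 - 1, 2 + 4) -> [2. 6.]
    print(2.5 * u)    # stretch by 2.5 -> [7.5 5. ]
    print(-1.0 * v)   # negative scalar flips direction -> [ 1. -4.]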

03 · Vector Spaces

A vector space V is a set closed under linear combinations: take any vectors in it, scale them, add them — you stay inside.

Basis — a minimal set of vectors whose linear combinations reach every point in V. Like coordinate axes, but you choose them.

Dimension — the number of vectors in any basis. ℝ² has dim 2; the space of polynomials of degree ≤ 5 has dim 6.

span{v₁, v₂, …, vₙ} = { c₁v₁ + ⋯ + cₙvₙ : cᵢ ∈ ℝ }

If the vᵢ are linearly independent, the representation is unique — that's a basis.
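
Numerically, independence is a rank question: stack the vectors as columns and see whether the rank equals their count. A minimal sketch, assuming numpy:

    import numpy as np

    # three candidate basis vectors for R^3, stacked as columns
    V = np.column_stack([[1, 0, 0],
                         [1, 1, 0],
                         [1, 1, 1]])

    # rank 3 = full column rank <=> linearly independent <=> a basis of R^3
    print(np.linalg.matrix_rank(V))   # -> 3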


04 · Matrices

A matrix is a rectangular array of numbers. We write A ∈ ℝᵐˣⁿ for m rows and n columns.

  • Rows index outputs; columns index inputs.
  • Each column is the image of a basis vector.
  • Square matrices (m=n) act on a space and return to it.

Matrices are not just bookkeeping — they are the linear maps.

[ a₁₁  a₁₂  a₁₃ ]
[ a₂₁  a₂₂  a₂₃ ]
[ a₃₁  a₃₂  a₃₃ ]
A 3×3 matrix — nine entries, nine degrees of freedom.
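
To make "each column is the image of a basis vector" concrete, a small numpy sketch; the matrix is invented for illustration:

    import numpy as np

    # columns say where the basis vectors land: e1 -> (2, 0), e2 -> (1, 1)
    A = np.array([[2.0, 1.0],
                  [0.0, 1.0]])

    e1 = np.array([1.0, 0.0])
    e2 = np.array([0.0, 1.0])

    print(A @ e1)   # [2. 0.] -- the first column of A
    print(A @ e2)   # [1. 1.] -- the second column of A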

05 · Matrix Multiplication

Matrix multiplication looks bizarre until you realize: it is the composition of linear maps.

(AB)x = A(Bx)

First apply B, then A. The product AB is the single matrix that does both. Hence it is not commutative — in general AB ≠ BA, just as putting on socks then shoes differs from shoes then socks.

Entry rule: (AB)ᵢⱼ = Σₖ aᵢₖ bₖⱼ — row of A dotted with column of B.

Shape rule: (m×k)·(k×n) = m×n. Inner dimensions must match.
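
Composition and the failure of commutativity, in a few lines of numpy (matrices invented for illustration):

    import numpy as np

    A = np.array([[0.0, -1.0],
                  [1.0,  0.0]])   # rotate 90 degrees
    B = np.array([[2.0, 0.0],
                  [0.0, 1.0]])    # stretch x by 2

    x = np.array([1.0, 1.0])

    # (AB)x = A(Bx): the product is the composition
    print(A @ (B @ x))   # [-1.  2.]
    print((A @ B) @ x)   # [-1.  2.]

    # but order matters: stretch-then-rotate != rotate-then-stretch
    print(A @ B)         # [[ 0. -1.]  [ 2.  0.]]
    print(B @ A)         # [[ 0. -2.]  [ 1.  0.]]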


06 · Linear Transformations

A linear transformation preserves addition and scaling. Every such map on ℝⁿ is a matrix.

Stretch — diag(2, 1); pulls along an axis.

Rotate — angle θ; columns are (cos θ, sin θ) and (−sin θ, cos θ).

Shear — (1, 1; 0, 1); slants the grid.

Project — collapses onto a subspace; loses information.

A shear: grid still parallel, origin fixed.
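
Two of the maps above in numpy: a 45° rotation and the shear (a sketch; the inputs are illustrative):

    import numpy as np

    theta = np.pi / 4
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])   # rotation by theta

    S = np.array([[1.0, 1.0],
                  [0.0, 1.0]])                        # the shear (1, 1; 0, 1)

    print(R @ np.array([1.0, 0.0]))   # e1 rotated: [0.707... 0.707...]
    print(S @ np.array([0.0, 1.0]))   # e2 sheared: [1. 1.]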

07 · Determinant

The determinant det(A) is the signed volume scaling factor of the linear map A. A unit cube of volume 1 becomes a parallelepiped of volume |det(A)|.

det [ a  b ]
    [ c  d ]  =  ad − bc
  • det(A) = 0 means the map collapses dimension — A is singular, has no inverse.
  • Negative determinant means the map flips orientation (mirror).
  • det(AB) = det(A) · det(B) — volumes multiply.
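
All three bullets can be checked numerically; a sketch with made-up 2×2 matrices:

    import numpy as np

    A = np.array([[3.0, 1.0],
                  [1.0, 2.0]])
    print(np.linalg.det(A))    # 3*2 - 1*1 = 5.0 -- areas scale by 5

    S = np.array([[1.0, 2.0],
                  [2.0, 4.0]])  # second row is twice the first
    print(np.linalg.det(S))    # 0.0 -- singular: the plane collapses to a line

    B = np.array([[0.0, -1.0],
                  [1.0,  0.0]])
    print(np.isclose(np.linalg.det(A @ B),
                     np.linalg.det(A) * np.linalg.det(B)))   # True -- volumes multiply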

08 · Eigenvalues & Eigenvectors

An eigenvector v of A is a direction that the map only scales — it does not turn.

A v = λ v

The scalar λ is the matching eigenvalue. Eigen-pairs reveal the intrinsic axes of a transformation — the skeleton beneath the cosmetics.

Found by solving det(A − λI) = 0, the characteristic polynomial.

v keeps its line; only its length changes.
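
In code, eigen-pairs come from np.linalg.eig; a sketch with a symmetric 2×2 example (the order numpy returns eigenvalues in may vary):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

    vals, vecs = np.linalg.eig(A)
    print(vals)          # eigenvalues, here 3 and 1

    v = vecs[:, 0]       # eigenvector paired with vals[0]
    print(np.allclose(A @ v, vals[0] * v))   # True: A v = lambda v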

09 · Solving Ax = b

The fundamental equation. Given matrix A and right-hand side b, find the unknown vector x.

Existence — a solution exists iff b lies in the column space of A.

Uniqueness — the solution is unique iff the null space of A is trivial: only x = 0 maps to 0.

rank(A) + nullity(A) = n

The rank-nullity theorem: every input dimension is either preserved (rank) or crushed (nullity). Linear algebra's conservation law.
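
For a square, full-rank A, solving is one library call; a sketch with an invented 2×2 system:

    import numpy as np

    A = np.array([[3.0, 1.0],
                  [1.0, 2.0]])
    b = np.array([9.0, 8.0])

    x = np.linalg.solve(A, b)         # unique: rank 2, nullity 0, n = 2
    print(x)                          # [2. 3.]
    print(np.allclose(A @ x, b))      # True: b is in the column space
    print(np.linalg.matrix_rank(A))   # 2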


10 · Decompositions

Hard matrices become easy when factored into structured pieces. Three workhorses:

LU — A = LU. Lower-triangular times upper-triangular. Solves Ax = b in two cheap sweeps. Underlies Gaussian elimination.

QR — A = QR. Orthogonal Q times upper-triangular R. The engine of least squares.

SVD — A = UΣVᵀ. Every matrix is a rotation, then a stretch along orthogonal axes (singular values), then another rotation. The single most useful factorization in applied math: PCA, low-rank approximation, pseudo-inverse, latent semantic analysis.
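
All three are one call each in the scientific Python stack; a sketch assuming numpy and scipy are installed:

    import numpy as np
    from scipy.linalg import lu   # LU lives in scipy, not numpy

    A = np.array([[4.0, 3.0],
                  [6.0, 3.0]])

    P, L, U = lu(A)                # A = P L U (P permutes rows for stability)
    print(np.allclose(P @ L @ U, A))   # True

    Q, R = np.linalg.qr(A)         # Q orthogonal, R upper-triangular
    print(np.allclose(Q @ R, A))       # True

    U2, s, Vt = np.linalg.svd(A)   # A = U diag(s) Vt; s holds singular values
    print(np.allclose(U2 @ np.diag(s) @ Vt, A))   # True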


11 · Applications

Computer graphics — every rotation, translation, projection, camera transform is a 4×4 matrix. Pixels on screen are the matrix product of geometry and projection.
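
One concrete instance: translation is not linear on its own, but in homogeneous coordinates it becomes a single matrix. A 2D sketch (graphics pipelines use the 4×4 analogue):

    import numpy as np

    tx, ty = 5.0, 2.0
    T = np.array([[1.0, 0.0, tx],
                  [0.0, 1.0, ty],
                  [0.0, 0.0, 1.0]])   # translation as a matrix

    p = np.array([1.0, 1.0, 1.0])     # the point (1, 1), with a trailing 1
    print(T @ p)                      # [6. 3. 1.] -- moved to (6, 3)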

Machine learning — data is matrices, weights are matrices, gradients are matrices. A neural network is mostly Wx + b, repeated.
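
That sentence as code: one layer is a matrix-vector product plus a shift, then a nonlinearity. A toy sketch with invented shapes:

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.standard_normal((4, 3))   # weights: 3 inputs -> 4 outputs
    b = rng.standard_normal(4)        # bias
    x = rng.standard_normal(3)        # one input vector

    h = np.maximum(0.0, W @ x + b)    # Wx + b, then a ReLU
    print(h.shape)                    # (4,)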

Physics — quantum mechanics: states are vectors, observables are Hermitian operators, eigenvalues are measured outcomes. Classical mechanics: rigid-body inertia tensors, normal modes, coupled oscillators.

A discipline indistinguishable from engineering, science, and economics — once you look closely enough.


12 · Modern Frontiers

  • Tensors — multi-index generalizations of matrices. The native data type of deep learning libraries (PyTorch, JAX). A 4D tensor: batch × channel × height × width.
  • Kernel methods — implicit infinite-dimensional feature maps via inner products. The kernel trick made SVMs and Gaussian processes practical.
  • Numerical linear algebra — randomized SVD, sketching, iterative Krylov methods make billion-dimensional problems tractable.
  • Deep learning — transformers are stacks of matrix multiplications wrapped in nonlinearity. Attention is softmax(QKᵀ/√d) V; a sketch follows after this list. The bedrock is unchanged.
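
The attention formula from the last bullet, sketched for a single head with invented shapes (real implementations batch this and add masking):

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 4, 8                       # 4 tokens, dimension 8 (illustrative)
    Q = rng.standard_normal((n, d))
    K = rng.standard_normal((n, d))
    V = rng.standard_normal((n, d))

    scores = Q @ K.T / np.sqrt(d)                   # QK^T / sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)     # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    out = weights @ V                               # softmax(QK^T/sqrt(d)) V
    print(out.shape)                                # (4, 8)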

13 · Further Reading

  • Sheldon Axler — Linear Algebra Done Right. Determinant-free, eigenvalue-first.
  • Gilbert Strang — Introduction to Linear Algebra; MIT 18.06 lectures.
  • Trefethen & Bau — Numerical Linear Algebra. The numerical bible.
  • 3Blue1Brown — Essence of Linear Algebra (YouTube). The visual gold standard.
  • Eigenvectors, intuitively — Eigenvectors Explained (YouTube).

Close the notebook. Open it again tomorrow. — fin.