2025-09-26 Householder QR¶
Right vs left-looking algorithms
Elementary reflectors
Householder QR
[1]:
using LinearAlgebra
using Plots
using Polynomials
default(lw=4, ms=5, legendfontsize=12, xtickfontsize=12, ytickfontsize=12)
# Here's our Vandermonde matrix again
function vander(x, k=nothing)
if isnothing(k)
k = length(x)
end
m = length(x)
V = ones(m, k)
for j in 2:k
V[:, j] = V[:, j-1] .* x
end
V
end
[1]:
vander (generic function with 2 methods)
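A quick sanity check of the helper (the expected result is noted in the comment):
[ ]:
vander([1, 2, 3])   # columns are 1, x, x.^2, i.e. [1 1 1; 1 2 4; 1 3 9]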
Gram-Schmidt orthogonalization¶
Suppose we’re given some vectors and want to find an orthogonal basis for their span.
[2]:
# Let's form Q and R by making each column orthonormal to all previous columns
function gram_schmidt_naive(A)
m, n = size(A)
Q = zeros(m, n) # Orthonormal basis
R = zeros(n, n) # Upper triangular
for j in 1:n
v = A[:, j] # current column to normalize
for k in 1:j-1
# We do a projection against the previous columns of Q. Why?
r = Q[:, k]' * v
v -= Q[:, k] * r
R[k, j] = r # What is this factor?
end
# Normalize and store our final v
R[j, j] = norm(v)
Q[:, j] = v / R[j, j]
end
Q, R
end
[2]:
gram_schmidt_naive (generic function with 1 method)
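A quick usage sketch on a small Vandermonde matrix (sizes chosen arbitrarily here):
[ ]:
x = LinRange(-1, 1, 10)
A = vander(x, 4)
Q, R = gram_schmidt_naive(A)
@show norm(Q' * Q - I)   # columns of Q should be orthonormal
@show norm(Q * R - A);   # Q R should reproduce A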
A variant with more parallelism¶
[3]:
# Let's form Q and R by making each column orthonormal to all previous columns
function gram_schmidt_classical(A)
m, n = size(A)
# Same setup as our naive version
Q = zeros(m, n)
R = zeros(n, n)
for j in 1:n
v = A[: ,j]
# Here we make the current column orthogonal to all previous
R[1:j-1, j] = Q[:, 1:j-1]' * v
v -= Q[:, 1:j-1] * R[1:j-1, j]
# And normalize our result
R[j, j] = norm(v)
Q[:, j] = v / norm(v)
end
Q, R
end
[3]:
gram_schmidt_classical (generic function with 1 method)
Right-looking modified Gram-Schmidt¶
There are inherent data dependencies in this process.
[4]:
function gram_schmidt_modified(A)
m, n = size(A)
Q = copy(A)
R = zeros(n, n)
for j in 1:n
# First we normalize our column
R[j, j] = norm(Q[:, j])
Q[:, j] /= R[j, j]
# Then we make all following columns orthogonal to the current
R[j, j+1:end] = Q[:, j]' * Q[:, j+1:end]
Q[:, j+1:end] -= Q[:, j] * R[j, j+1:end]'
end
Q, R
end
[4]:
gram_schmidt_modified (generic function with 1 method)
Classical vs modified¶
Classical
Really unstable: orthogonality error of size \(1 \gg \epsilon_\text{machine}\)
Don’t need to know all the vectors in advance
Modified
Needs to be right-looking for efficiency
Less unstable, but orthogonality error \(10^{-9} \gg \epsilon_\text{machine}\) (see the quick check below)
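A quick check of these claims using the functions defined above:
[ ]:
m = 20
x = LinRange(-1, 1, m)
A = vander(x, m)
Q_c, _ = gram_schmidt_classical(A)
Q_m, _ = gram_schmidt_modified(A)
@show norm(Q_c' * Q_c - I)   # classical: orthogonality error of order 1
@show norm(Q_m' * Q_m - I);  # modified: much smaller, but still far above machine epsilon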
Householder QR¶
Gram-Schmidt constructed a triangular matrix \(R\) to orthogonalize \(A\) into \(Q\). Each step was a projector, which is a rank-deficient operation. Householder uses orthogonal transformations (reflectors) to triangularize.
The structure of the algorithm is
\[ \underbrace{Q_n \dotsb Q_1}_{Q^T} A = R \]
Constructing the \(Q_j\)¶
Each of our \(Q_j\) will have the form
\[ Q_j = \begin{bmatrix} I_{j-1} & 0 \\ 0 & F \end{bmatrix} \]
where \(F\) is a “reflection” that achieves
\[ F x = \lVert x \rVert e_1 \]
where \(x\) is the column of \(R\) from the diagonal down. This transformation is a reflection across the plane with normal \(v = F x - x = \lVert x \rVert e_1 - x\).
The reflection, as depicted in Trefethen and Bau (1999), can be written as \(F = I - 2 \frac{v v^T}{v^T v}\).
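To see that this choice of \(v\) indeed sends \(x\) to \(\lVert x \rVert e_1\), note that \(v^T x = \lVert x \rVert x_1 - \lVert x \rVert^2\) while \(v^T v = 2\lVert x \rVert^2 - 2\lVert x \rVert x_1 = -2\,v^T x\), so (provided \(v \ne 0\))
\[ F x = x - 2 v \frac{v^T x}{v^T v} = x + v = \lVert x \rVert e_1 . \]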
Reflection¶
[6]:
# Let's make a random symmetric matrix
A = rand(4, 4); A += A'
# Now let's build v = |x| e_1 - x
v = copy(A[:,1])
v[1] -= norm(v)
v = normalize(v) # v /= norm(v)
# And build F to clear the first column of A below the diagonal
F = I - 2 * v * v'
B = F * A
display(B)
4×4 Matrix{Float64}:
2.10041 1.5861 1.92938 1.9544
-6.71674e-17 0.832466 0.493814 -0.0422542
3.98231e-17 -0.0690713 0.505746 0.218325
-3.4793e-17 0.507456 1.58059 -0.303831
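As a sanity check (not in the original cell), \(F\) is symmetric and orthogonal, and it is its own inverse, so applying it does not amplify errors:
[ ]:
@show norm(F - F')       # symmetric
@show norm(F' * F - I)   # orthogonal
@show norm(F * F - I);   # involution: applying F twice gives the identity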
[7]:
# Now let's keep going with column 2
v = copy(B[2:end, 2]) # Note - starting at 2, not the full column
v[1] -= norm(v);
v = normalize(v)
# Build and apply F to clear the second column of B
F = I - 2 * v * v'
B[2:end, 2:end] = F * B[2:end, 2:end]
display(B)
4×4 Matrix{Float64}:
2.10041 1.5861 1.92938 1.9544
-6.71674e-17 0.977385 1.2055 -0.209166
3.98231e-17 0.0 0.844947 0.138772
-3.4793e-17 5.55112e-17 -0.911459 0.280636
Householder algorithm¶
[8]:
function qr_householder_naive(A)
m, n = size(A)
R = copy(A)
V = [] # list of reflectors
for j in 1:n
# Build v for the current column
v = copy(R[j:end, j])
v[1] -= norm(v)
v = normalize(v)
# Build and apply F
R[j:end,j:end] -= 2 * v * (v' * R[j:end,j:end])
# Keep track of the reflectors
push!(V, v)
end
V, R
end
[8]:
qr_householder_naive (generic function with 1 method)
[9]:
# Let's pick on Vandermonde again
m = 4
x = LinRange(-1, 1, m)
A = vander(x, m)
# Factorize with our naive Householder
V, R = qr_householder_naive(A)
# For comparison, overwrite R with the result from Julia's built-in QR
_, R = qr(A)
display(R)
4×4 Matrix{Float64}:
-2.0 0.0 -1.11111 0.0
0.0 1.49071 1.38778e-17 1.3582
0.0 0.0 0.888889 9.71445e-17
0.0 0.0 0.0 0.397523
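Any two QR factorizations of a full-rank matrix agree up to the signs of the rows of \(R\); since R above holds Julia's result, we can recompute ours and compare magnitudes:
[ ]:
_, R_ours = qr_householder_naive(A)
@show norm(abs.(R_ours) - abs.(R));   # should be at rounding level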
But what about \(Q\)?¶
[10]:
# Here's a function to apply the reflectors V to a vector x
function reflectors_mult(V, x)
y = copy(x)
for v in reverse(V)
n = length(v) - 1
# This is applying F as we did to form R
y[end-n:end] -= 2 * v * (v' * y[end-n:end])
end
y
end
# Create a dense matrix Q from reflectors V
function reflectors_to_dense(V)
m = length(V[1])
Q = diagm(ones(m))
# Apply the reflectors to the identity matrix to build Q
for j in 1:m
Q[:, j] = reflectors_mult(V, Q[:,j])
end
Q
end
[10]:
reflectors_to_dense (generic function with 1 method)
[11]:
# Again with Vandermonde as our example
m = 20
x = LinRange(-1, 1, m)
A = vander(x, m)
# Factorize
V, R = qr_householder_naive(A)
Q = reflectors_to_dense(V)
# And check
@show norm(Q' * Q - I)
@show norm(Q * R - A);
norm(Q' * Q - I) = 4.043305005028868e-15
norm(Q * R - A) = 7.653110366995408e-15
The columns of \(Q\) retain orthogonality! This is better than we saw from our right-looking \(QR\) last time.
An edge case¶
[12]:
# Should be trivial, since A == I
A = [1 0; 0 1.]
# But this result is not good
V, R = qr_householder_naive(A)
[12]:
(Any[[NaN, NaN], [NaN]], [NaN NaN; NaN NaN])
We have the lines
v = copy(R[j:end, j])
v[1] -= norm(v)
v = normalize(v)
What happens when \(R\) is already upper triangular?
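Nothing useful: \(v\) becomes exactly the zero vector, so normalizing it divides \(0/0\) and fills the reflector with NaNs. A minimal illustration:
[ ]:
z = [1.0, 0, 0]   # a column that is already of the form [a, 0, ..., 0] with a > 0
v = copy(z)
v[1] -= norm(z)
@show v              # exactly zero
@show normalize(v);  # 0/0 -> NaN in every entry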
Improved Householder QR¶
Let’s fix that problem by selecting the reflector that moves \(x\) the larger distance.
[13]:
function qr_householder(A)
m, n = size(A)
R = copy(A)
V = [] # list of reflectors
for j in 1:n
# Build v for the current column
v = copy(R[j:end, j])
v[1] += sign(v[1]) * norm(v) # Here's the fix!
v = normalize(v)
# Build and apply F
R[j:end, j:end] -= 2 * v * (v' * R[j:end,j:end])
# Keep track of the reflectors
push!(V, v)
end
V, R
end
[13]:
qr_householder (generic function with 1 method)
[14]:
# Now this works as expected
A = [1 0; 0 1.]
V, R = qr_householder(A)
[14]:
(Any[[1.0, 0.0], [1.0]], [-1.0 0.0; 0.0 -1.0])
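With this sign choice, \(x = e_1\) gives \(v = 2 e_1\) before normalization, so \(v = e_1\) and \(F = I - 2 e_1 e_1^T\), which flips the sign of the first entry. Each reflector is now well defined, and \(R = -I\) rather than \(I\).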
[15]:
# And with a more robust test
A = [2 -1; -1 2] * 1e-10
A += [1 0; 0 1]
V, R = qr_householder(A)
tau = [2*v[1]^2 for v in V]
@show tau
V1 = [v ./ v[1] for v in V]
@show V1
println("R =")
display(R)
tau = [2.0, 2.0]
V1 = [[1.0, -4.999999999e-11], [1.0]]
R =
2×2 Matrix{Float64}:
-1.0 2.0e-10
0.0 -1.0
Householder \(QR\) is backward stable¶
[16]:
# Let's try to 'break' our QR with Vandermonde as before
m = 40
x = LinRange(-1, 1, m)
A = vander(x, m)
V, R = qr_householder(A)
Q = reflectors_to_dense(V)
@show norm(Q' * Q - I)
@show norm(Q * R - A);
norm(Q' * Q - I) = 5.932687575393109e-15
norm(Q * R - A) = 6.6179593854314975e-15
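As a sketch of how the factorization gets used, here is a least-squares solve via the stored reflectors. The helper reflectors_mult_transpose and the fitting example are not part of the lecture code; reflectors_mult above applies the reflectors in reverse order (giving \(Q x\)), so applying them in forward order gives \(Q^T x\).
[ ]:
# Apply Q^T to a vector by applying the reflectors in forward order
function reflectors_mult_transpose(V, x)
    y = copy(x)
    for v in V
        n = length(v) - 1
        y[end-n:end] -= 2 * v * (v' * y[end-n:end])
    end
    y
end

# Least-squares fit of cos(3x) by a degree-5 polynomial (sizes chosen arbitrarily)
m, n = 20, 6
x = LinRange(-1, 1, m)
A = vander(x, n)
b = cos.(3x)
V, R = qr_householder(A)
c = reflectors_mult_transpose(V, b)[1:n]   # first n entries of Q^T b
coeffs = R[1:n, 1:n] \ c
@show norm(coeffs - A \ b);                # should agree with Julia's least-squares solve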
[19]:
# And let's compare to Julia
A = [1 0; 0 1.]
# Ours
println("Our QR")
V, R = qr_householder(A)
println("R =")
display(R)
println("Q = ")
display(reflectors_to_dense(V))
# And Julia's
println("\nJulia's QR")
Q, R = qr(A)
println("R =")
display(R)
println("Q = ")
display(Matrix(Q))
Our QR
R =
2×2 Matrix{Float64}:
-1.0 0.0
0.0 -1.0
Q =
2×2 Matrix{Float64}:
-1.0 0.0
0.0 -1.0
Julia's QR
R =
2×2 Matrix{Float64}:
1.0 0.0
0.0 1.0
Q =
2×2 Matrix{Float64}:
1.0 0.0
0.0 1.0
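The two results differ only by signs: a \(QR\) factorization of a full-rank matrix is unique up to the signs of the diagonal of \(R\). A small sketch of flipping our factors to a non-negative diagonal, which reproduces Julia's answer for this particular matrix:
[ ]:
V, R_ours = qr_householder(A)
S = diagm(sign.(diag(R_ours)))      # diagonal matrix of +/-1
@show S * R_ours                    # now has a non-negative diagonal
@show reflectors_to_dense(V) * S;   # the correspondingly flipped Q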
Orthogonality is preserved¶
[20]:
a = 20
x = LinRange(-1, 1, a)
A = vander(x)
Q, _ = gram_schmidt_classical(A)
v = A[:, end]
@show norm(v)
# Components of A's last column along each earlier column of Q (not expected to be 0)
scatter(abs.(Q[:, 1:end-1]' * v), yscale=:log10, xticks=range(1, a-1; step=2), label="\$v \\cdot Q_{:, j}\$")
norm(v) = 1.4245900685395503
[20]:
(scatter plot of \(\lvert v \cdot Q_{:, j} \rvert\) for the classical Gram-Schmidt \(Q\), log scale)
[21]:
V, R = qr_householder(A)
Q = reflectors_to_dense(V)
v = A[:, end]
@show norm(v)
# Components of A's last column along each earlier column of Q (not expected to be 0)
scatter(abs.(Q[:, 1:end-1]' * v), yscale=:log10, xticks=range(1, a-1; step=2), label="\$v \\cdot Q_{:, j}\$")
norm(v) = 1.4245900685395503
[21]:
(scatter plot of \(\lvert v \cdot Q_{:, j} \rvert\) for the Householder \(Q\), log scale)
[22]:
scatter(abs.(diag(R)), yscale=:log10, label="\$R_{j, j}\$")
[22]:
(scatter plot of \(\lvert R_{j, j} \rvert\), log scale)
[23]:
v = Q[:, end]
@show norm(v)
# If v is orthogonal to the earlier columns of Q, these should all be (near) 0
scatter(abs.(Q[:, 1:end-1]' * v), ylims=[1e-18, 1e-15], yticks=[1e-15, 1e-16, 1e-17, 1e-18], yscale=:log10, xticks=range(1, a-1; step=2), label="\$v \\cdot Q_{:, j}\$")
norm(v) = 1.0
┌ Warning: Invalid negative or zero value 0.0 found at series index 1 for log10 based yscale
└ @ Plots ~/.julia/packages/Plots/xKhUG/src/utils.jl:106
[23]:
(scatter plot of \(\lvert v \cdot Q_{:, j} \rvert\) for the last column of the Householder \(Q\), log scale)