2025-09-03 Rootfinding¶

Review condition numbers
Forward and backward error
Volume of a polygon
Rootfinding examples
Bisection

Reliability¶

Reliablility comes from well-conditioned operations and stable algorithms

Well-conditioned operations¶

Modeling turns an abstract question into a mathematical function
We want well-conditioned models (\(\kappa\))
Some systems are inherantly sensitive though (fracture, chaotic systems, combustion, etc)

Stable algorithms¶

An algorithm f(x) may be unstable
These algorithms are unreliable, but may be practical
Unstable algorithms are constructed from ill-conditioned parts

Forward and backward error¶

Forward error is the difference between our result and the true solution
Backward error tells us which problem our result is the solution for

Additional reading: FNC

An ill-conditioned problem from Paul Olum¶

From Surely You’re Joking, Mr. Feynman (p 113)

So Paul is walking past the lunch place and these guys are all excited. “Hey, Paul!” they call out. “Feynman’s terrific! We give him a problem that can be stated in ten seconds, and in a minute he gets the answer to 10 percent. Why don’t you give him one?” Without hardly stopping, he says, “The tangent of 10 to the 100th.” I was sunk: you have to divide by pi to 100 decimal places! It was hopeless.

[1]:

tan(BigFloat("1e100", precision=400))

[1]:

0.4012319619908143541857543436532949583238702611292440683194415381168718098221194

[2]:

# What values could have answered this question?
using Plots
default(lw=4, ms=5, legendfontsize=12, xtickfontsize=12, ytickfontsize=12)

plot(tan, ylims=(-10, 10), x=LinRange(-10, 10, 1000), label="tan(x)")

[2]:

For ill-conditioned problems, the best we can hope for is small backward error
Feynman could have answered with any real number and the relative backward error would have been less than \(10^{-100}\). All vaues on the graph above have tiny backward error, as \(\tan \left( \text{fl} \left( 10^{100} \right) \right) = \tan \left( 10^{100} \left( 1 + \epsilon \right) \right)\) for some \(\epsilon < 10^{-100}\)

Stability - Volume of a polygon¶

If error exceeds what we can explain via condition numbers, then we call our algorithm unstable.

[3]:

# What is the area of this shape?
X = [1 0; 2 1; 1 3; 0 1; -1 1.5; -2 -1; .5 -2; 1 0]
plot(X[:,1], X[:,2], seriestype=:shape, aspect_ratio=:equal, xlims=(-3, 3), ylims=(-3, 3))

[3]:

[4]:

# Area should be invariant under rotation.
R(θ) = [cos(θ) -sin(θ); sin(θ) cos(θ)]
Y = X * R(deg2rad(30))' .+ [0 0]
plot(Y[:,1], Y[:,2], seriestype=:shape, aspect_ratio=:equal, xlims=(-3, 3), ylims=(-3, 3))

[4]:

[5]:

using LinearAlgebra

# Take the volume by using the determinant on sub-shapes
function pvolume(X)
    n = size(X, 1)
    vol = sum(det(X[i:i+1, :]) / 2 for i in 1:n-1)
end

# Does our algorithm respect this invariance due to rotation?
@show vol_X = pvolume(X)
@show vol_Y = pvolume(Y)
[det(Y[i:i+1, :]) for i in 1:size(Y, 1)-1]
X

vol_X = pvolume(X) = 9.25
vol_Y = pvolume(Y) = 9.25

[5]:

8×2 Matrix{Float64}:
  1.0   0.0
  2.0   1.0
  1.0   3.0
  0.0   1.0
 -1.0   1.5
 -2.0  -1.0
  0.5  -2.0
  1.0   0.0

What happens to this algorithm if the polygon is translated, perhaps far away?

Rootfinding¶

Given \(f \left( x \right)\), find \(\tilde{x}\) such that \(f \left( \tilde{x} \right) = 0\).

We will start with scalars (\(x\) is a single value and \(f\) returns a single value) and come back to this with vectors later.

We don’t have \(f \left( x \right)\) but rather the algorithm f(x) that approximates it.
Sometimes we have extra information, such as fp(x) that approximates \(f' \left( x \right)\).
If we have the source code for f(x), we might be able to transform it to provide fp(x).

Example: Queueing¶

In a simple queueing model, there is an arrival rate and a departure (serving) rate. While waiting in the queue, there is a probability of “dropping out”. The length of the queue in this model is

\[\text{length} = \frac{\text{arrival} - \text{drop}}{\text{departure}}\]

One model for waiting time (with exponential distributions for rates) is

[6]:

wait(arrival, departure) = log(arrival / departure) / departure

plot(d -> wait(1, d), xlims=(.1, 1), xlabel="departure", ylabel="wait", label="wait(1, departure)")

[6]:

Suppose I have a maximum threshold for how long I am willing to wait.

[7]:

my_wait = 0.8

plot([d -> wait(1, d) - my_wait, d -> 0], xlims=(.1, 1), xlabel="departure", ylabel="wait", label=["wait(1, departure)" "0.8"])

[7]:

[8]:

# Lets find the max departure rate where I will leave before my turn
#import Pkg; Pkg.add("Roots")
using Roots
d0 = find_zero(d -> wait(1, d) - my_wait, 1)

plot([d -> wait(1, d) - my_wait, d -> 0], xlims=(.1, 1), xlabel="departure", ylabel="wait", label=["wait(1, departure)" "0.8"])
scatter!([d0], [0], marker=:circle, color=:black, title="d0 = $d0", label="d0")

[8]:

Example: Nonlinear Elasticity¶

Strain-energy formulation¶

\[W = C_1 \left( I_1 - 3 - 2 \log \left( J \right) \right) + D_1 \left( J - 1 \right)^2\]

where \(I_1 = \lambda_1^2 + \lambda_2^2 + \lambda_3^2\) and \(J = \lambda_1 \lambda_2 \lambda_3\) are invariants defined in terms of th eprinciple stretches \(\lambda_i\)

Uniaxial extension ¶

In that experiment, we want to know the stress as a function of the stretch \(\lambda_1\). We don’t know \(J\), and will have to determine it by solving an equation.

https://www.youtube.com/watch?v=9N5SS8f1auI&t=60

What is the change of volume?¶

Using symmetries of uniaxial extension, we can write an equation \(f \left( \lambda, J \right) = 0\) that must be satisfied. We’ll need to solve a rootfinding problem to compute \(J \left( \lambda \right)\).

[9]:

function f(lambda, J)
    C_1 = 1.5e6
    D_1 = 1e8
    D_1 * J^(8/3) - D_1 * J^(5/3) + C_1 / (3*lambda) * J - C_1 * lambda^2/3
end
plot(J -> f(4, J), xlims=(0.1, 3), xlabel="J", ylabel="f(J)", label="f(4, J)")

[9]:

[10]:

find_J(lambda) = find_zero(J -> f(lambda, J), 1)
plot(find_J, xlims=(0.1, 5), xlabel="lambda", ylabel="J", label="J(lambda)")

[10]:

Bisection algorithm¶

Bisection is a rootfinding algorithm that starts with an interval \(\left[ a, b \right]\) containing a root and does not require derivatives. Suppose \(f\) is continuous. What is a sufficient condition for \(f\) to have a root on the interval \(\left[ a, b \right]\)?

[11]:

hasroot(f, a, b) = f(a) * f(b) < 0

f(x) = cos(x) - x
plot(f, label="\$cos(x) - x\$")

[11]:

Bisection¶

[12]:

# Bisection algorithm
function bisect(f, a, b, tol)
    mid = (a + b) / 2
    if abs(b - a) < tol
        return mid
    elseif hasroot(f, a, mid)
        return bisect(f, a, mid, tol)
    else
        return bisect(f, mid, b, tol)
    end
end

x0 = bisect(f, -1, 3, 1e-5)
@show (x0, f(x0));

(x0, f(x0)) = (0.7390861511230469, -1.7035832658995886e-6)

What is my convergence rate?¶

[13]:

# And bisection with history
function bisect_hist(f, a, b, tol)
    mid = (a + b) / 2
    if abs(b - a) < tol
        return [mid]
    elseif hasroot(f, a, mid)
        return prepend!(bisect_hist(f, a, mid, tol), [mid])
    else
        return prepend!(bisect_hist(f, mid, b, tol), [mid])
    end
end

bisect_hist(f, -1, 3, 1e-4)

[13]:

17-element Vector{Float64}:
 1.0
 0.0
 0.5
 0.75
 0.625
 0.6875
 0.71875
 0.734375
 0.7421875
 0.73828125
 0.740234375
 0.7392578125
 0.73876953125
 0.739013671875
 0.7391357421875
 0.73907470703125
 0.739105224609375

Iterative Bisection¶

[14]:

# And bisection with iteration instead of recursion
function bisect_iter(f, a, b, tol)
    hist = Float64[]
    while abs(b - a) > tol
        mid = (a + b) / 2
        push!(hist, mid)
        if hasroot(f, a, mid)
            b = mid
        else
            a = mid
        end
    end
    hist
end

bisect_iter(f, -1, 3, 1e-4)

[14]:

16-element Vector{Float64}:
 1.0
 0.0
 0.5
 0.75
 0.625
 0.6875
 0.71875
 0.734375
 0.7421875
 0.73828125
 0.740234375
 0.7392578125
 0.73876953125
 0.739013671875
 0.7391357421875
 0.73907470703125

Let’s plot the error¶

\[\lvert \text{bisect}^k \left( f, a, b \right) - x_* \rvert, k = 1, 2, \dots\]

where \(r\) is the true root (\(f \left( x_* \right) = 0\)).

[15]:

a = -1
b = 3
r = bisect(f, a, b, 1e-15) # is this a good 'true' root?

# Let's plot the convergence of the iteration on a log-linear plot
hist = bisect_hist(f, a, b, 1e-10)
scatter(abs.(hist .- r), yscale=:log10, label="\$x_k\$")

ks = 1:length(hist)
plot!(ks, (b - a) * (0.5 .^ ks), xlabel="\$k\$", ylabel="\$e_k\$", label="~\$2^{-k}\$")

[15]:

[16]:

# Note the difference without the log-linear scale on the axes
scatter(abs.(hist .- r), label="\$x_k\$")

ks = 1:length(hist)
plot!(ks, (b - a) * (0.5 .^ ks), xlabel="\$k\$", ylabel="\$e_k\$", label="~\$2^{-k}\$")

[16]:

So the error \(e_k = x_k - x_*\) satisfies the bound

\[\left\lvert e_k \right\rvert \leq c 2^{-k}\]

Exploration¶

Share an equation on Zulip that is useful for one task but requires rootfinding for another task.
Or explain a system in which one stakeholder has the natural inputs but a different stakeholder only knows the outputs.

Collaborate!