{ "cells": [ { "cell_type": "markdown", "id": "a571e8fb-ccaf-441b-a202-9cc801c8d061", "metadata": {}, "source": [ "# Newton's Method for Systems of Equations\n", "\n", "We have covered rootfinding for non-linear equations, including Newton's method, and linear algebra.\n", "There is a connection between these two topics - Newton's method for systems of equations.\n", "\n", "We want to find the solution $x_*$ such that\n", "\n", "$$ f \\left( x_* \\right) = 0 $$\n", "\n", "for some non-linear function $f \\left( x \\right)$.\n", "\n", "For a single equation, Newton's method is\n", "\n", "$$ x_n = x_{n - 1} - \\frac{f \\left( x_{n - 1} \\right)}{f' \\left( x_{n - 1} \\right)} $$\n", "\n", "We will generalize this for two equations with two unknown variables.\n", "Therefore, we will find the solution $X_*$ such that\n", "\n", "$$ F \\left( X_* \\right) = 0 $$" ] }, { "cell_type": "code", "execution_count": 1, "id": "efd26d2f-1ce3-4727-b384-09289ec4f288", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "F (generic function with 1 method)" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Let's start with some system of equations\n", "f1(x1, x2) = (x1 - 2) * (x1 - 3) + (x2 - 4) * (x2 - 3)\n", "f2(x1, x2) = (x1 - 2) * (x1 - 4) + (x2 - 5) * (x2 - 3)\n", "\n", "function F(X)\n", " return [f1(X[1], X[2]);\n", " f2(X[1], X[2])]\n", "end" ] }, { "cell_type": "markdown", "id": "596f4744-9ced-4a50-9b1d-04c484865837", "metadata": {}, "source": [ "With Newton's method for scalar equations, we start with an initial guess $x_0$ for $x_*$ and create iterates $x_n$ that get increasingly close to $x_*$\n", "With each iteration, we are solving for the value $x$ that is the root of the tangent line for the current guess $x_{n - 1}$:\n", "\n", "$$ t \\left( x \\right) = f \\left( x_{n - 1} \\right) + f' \\left( x_{n - 1} \\right) \\left( x - x_{n - 1} \\right)$$\n", "\n", "Solving for $t \\left( x \\right) = 0$ gives the next $x_n$ for our Newton iteration:\n", "\n", "$$ x_n = x_{n - 1} - \\frac{f \\left( x_{n - 1} \\right)}{f' \\left( x_{n - 1} \\right)} $$\n", "\n", "Similarly, we can form a [**tangent plane**](https://en.wikipedia.org/wiki/Tangent#Plane), $T \\left( X \\right)$, and find the point on the tangent plane closest to the origin.\n", "\n", "$$ T \\left( X \\right) = F \\left( X_{n - 1} \\right) + J \\left( X_{n - 1} \\right) \\left( X - X_{n - 1} \\right) $$\n", "\n", "Solving for $T \\left( X \\right) = 0$ gives the next $X_n$ for our Newton iteration:\n", "\n", "$$ X_n = X_{n - 1} - J^{-1} \\left( X_{n - 1} \\right) F \\left( X_{n - 1} \\right) $$\n", "\n", "If $J \\left( X_{n - 1} \\right)$ is not invertible, the we want to find the $X_n$ such that $\\left\\lvert \\left\\lvert J \\left( X_{n - 1} \\right) X_{n - 1} - F \\left( X_{n - 1} \\right) \\right\\rvert \\right\\rvert$ is minimized.\n", "This happens frequently with practical problems.\n", "\n", "Especially when $J$ is large, it is common to use a [**Krylov method**](https://en.wikipedia.org/wiki/Krylov_subspace) to solve for $X_n$.\n", "This leads to the [**Newton-Krylov**](https://en.wikipedia.org/wiki/Newton%E2%80%93Krylov_method) methods, which are workhorses of numerical computation.\n", "\n", "In order to form this tangent plane, we need the [**Jacobian matrix**](https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant), $J \\left( X_{n - 1} \\right)$.\n", "This Jacobian matrix is formed with the entries $J_{i, j} = \\frac{\\partial f_i}{\\partial i_j}$." ] }, { "cell_type": "code", "execution_count": 2, "id": "2b892b18-41c3-4993-bc4a-31171edf890a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "J (generic function with 1 method)" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# We also need derivatives\n", "f1dx1(x1, x2) = (x1 - 3) + (x1 - 2)\n", "f1dx2(x1, x2) = (x2 - 3) + (x2 - 4)\n", "f2dx1(x1, x2) = (x1 - 4) + (x1 - 2)\n", "f2dx2(x1, x2) = (x2 - 3) + (x2 - 5)\n", "\n", "function J(X)\n", " return [f1dx1(X[1], X[2]) f1dx2(X[1], X[2]);\n", " f2dx1(X[1], X[2]) f2dx2(X[1], X[2])]\n", "end" ] }, { "cell_type": "markdown", "id": "e310aadb-c61e-4ffe-bd5c-1c2ca920275a", "metadata": {}, "source": [ "Let's start with an initial guess of $X_0 = 0$ and do a few iterations \"by hand\"." ] }, { "cell_type": "code", "execution_count": 3, "id": "0f13f711-8652-483c-9146-cf08e6834e5b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Vector{Float64}:\n", " 8.499999999999986\n", " -3.4999999999999973" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Initial guess\n", "X_0 = [0; 0]\n", "\n", "# Solving Ax = b where A = J(X_O) and b = F(X_0)\n", "b = F(X_0)\n", "A = J(X_0)\n", "X_1 = X_0 - inv(A) * b" ] }, { "cell_type": "markdown", "id": "3f7468cd-9f4a-4adb-9fa6-ff2a492c711d", "metadata": {}, "source": [ "Is this closer?\n", "Let's check." ] }, { "cell_type": "code", "execution_count": 4, "id": "8e5b3d1e-75ad-4ab3-a21a-0be94e5233f0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "norm(F(X_0)) = 29.206163733020468\n", "norm(F(X_1)) = 119.50104602052625\n" ] } ], "source": [ "using LinearAlgebra\n", "\n", "@show norm(F(X_0))\n", "@show norm(F(X_1));" ] }, { "cell_type": "markdown", "id": "91c2ac6f-2056-4803-9b78-2ed318d291b2", "metadata": {}, "source": [ "Hmm, that doesn't look closer, but let's try another iteration." ] }, { "cell_type": "code", "execution_count": 5, "id": "f217cd89-181f-4bbb-96a9-2bd1f3eb22d5", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Vector{Float64}:\n", " 5.249999999999987\n", " -0.24999999999999956" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Solving Ax = b where A = J(X_1) and b = F(X_1)\n", "b = F(X_1)\n", "A = J(X_1)\n", "X_2 = X_1 - inv(A) * b" ] }, { "cell_type": "code", "execution_count": 6, "id": "852576bf-706b-4613-a192-9d730ad210a0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "norm(F(X_0)) = 29.206163733020468\n", "norm(F(X_1)) = 119.50104602052625\n", "norm(F(X_2)) = 29.875261505131533\n" ] } ], "source": [ "@show norm(F(X_0))\n", "@show norm(F(X_1))\n", "@show norm(F(X_2));" ] }, { "cell_type": "markdown", "id": "bc6447e2-027b-41e0-b015-db495f735b49", "metadata": {}, "source": [ "Now we're headed in the right direction again.\n", "Let's take another step." ] }, { "cell_type": "code", "execution_count": 7, "id": "df4a63f9-62f8-405b-8688-97ccb1411deb", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Vector{Float64}:\n", " 3.624999999999995\n", " 1.3750000000000024" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Solving Ax = b where A = J(X_1) and b = F(X_1)\n", "b = F(X_2)\n", "A = J(X_2)\n", "X_3 = X_2 - inv(A) * b" ] }, { "cell_type": "code", "execution_count": 8, "id": "45e8f325-4400-4c86-9374-81c108c93ea7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "norm(F(X_0)) = 29.206163733020468\n", "norm(F(X_1)) = 119.50104602052625\n", "norm(F(X_2)) = 29.875261505131533\n", "norm(F(X_3)) = 7.46881537628288\n" ] } ], "source": [ "@show norm(F(X_0))\n", "@show norm(F(X_1))\n", "@show norm(F(X_2))\n", "@show norm(F(X_3));" ] }, { "cell_type": "markdown", "id": "62a044b2-e9e5-4dfc-ae00-c873ac1f2304", "metadata": {}, "source": [ "Ok, this is starting to look like an algorithm we can code up:" ] }, { "cell_type": "code", "execution_count": 9, "id": "c8b77d16-1fc5-4d21-ae5e-cf96fa355044", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "newton (generic function with 1 method)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "function newton(F, J, X_0; tol=1e-12, max_its=100, verbose=true)\n", " # Initial setup\n", " X_n = X_0\n", " F_n = F(X_n)\n", " if verbose\n", " println(\"Newton residuals:\")\n", " println(\"r_0 = $(norm(F_n))\")\n", " end\n", " \n", " i = 0\n", " while norm(F_n) > tol && i < max_its\n", " # Form Ax = b\n", " b = F_n\n", " A = J(X_n)\n", " # Solve for update\n", " X_n = X_n - inv(A) * b\n", " F_n = F(X_n)\n", " if verbose\n", " println(\"r_$(i) = $(norm(F_n))\")\n", " end\n", " i += 1\n", " end\n", " if i == max_its\n", " println(\"Warning: Max iterations reached\")\n", " end\n", " X_n\n", "end" ] }, { "cell_type": "code", "execution_count": 10, "id": "9e356a40-86b2-4175-a2db-da4b186a8cea", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Newton residuals:\n", "r_0 = 29.206163733020468\n", "r_0 = 119.50104602052625\n", "r_1 = 29.875261505131533\n", "r_2 = 7.46881537628288\n", "r_3 = 1.867203844070719\n", "r_4 = 0.4668009610176802\n", "r_5 = 0.11670024025441997\n", "r_6 = 0.029175060063604052\n", "r_7 = 0.0072937650159012775\n", "r_8 = 0.0018234412539753194\n", "r_9 = 0.00045586031349382984\n", "r_10 = 0.00011396507837345746\n", "r_11 = 2.8491269593364365e-5\n", "r_12 = 7.122817398341091e-6\n", "r_13 = 1.7807043495852728e-6\n", "r_14 = 4.451760873963182e-7\n", "r_15 = 1.1129402184907955e-7\n", "r_16 = 2.7823505462269888e-8\n", "r_17 = 6.955876365567472e-9\n", "r_18 = 1.738969091391868e-9\n", "r_19 = 4.34742272847967e-10\n", "r_20 = 1.0868556821199175e-10\n", "r_21 = 2.7171392052997937e-11\n", "r_22 = 6.792848013249484e-12\n", "r_23 = 1.698212003312371e-12\n", "r_24 = 4.2455300082809277e-13\n", "\n", "Final result:\n", "X_star = [2.000000387430191, 2.999999612569809]\n", "F_star = F(X_star) = [3.0020430585864233e-13, 3.0020430585864233e-13]\n", "norm(F_star) = 4.2455300082809277e-13\n" ] } ], "source": [ "# And let's try it\n", "X_0 = [0; 0]\n", "X_star = newton(F, J, X_0)\n", "\n", "println(\"\\nFinal result:\")\n", "@show X_star\n", "@show F_star = F(X_star)\n", "@show norm(F_star);" ] } ], "metadata": { "kernelspec": { "display_name": "Julia 1.11.6", "language": "julia", "name": "julia-1.11" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.11.6" } }, "nbformat": 4, "nbformat_minor": 5 }