Notes on Numerical Calculations in Physics

February 10, 2018

Unfortunately, not every equation in physics can be solved analytically; that is, we cannot find a nice, closed formula for the solution. Far more often, the only thing we can do is to calculate the numerical values of the solution with the help of a computer.

These notes record various techniques for modeling equations and calculating numerical solutions that I have found useful over time. Sometimes, I give a brief account of the technique, while at other times, I simply link to one of the many excellent online resources. Most of the examples are written in MATLAB code, but other numerical software like NumPy offers similar functionality. Sometimes, I wish to mix symbolic and numerical computations, these examples use Mathematica.

Table of Contents:

Calculus: Discretization
Calculus: Higher dimensions
Root finding: General rules
Root finding: All roots with Chebyshev polynomials
Root finding: Testing when a matrix becomes singular

Calculus: Discretization

In quantum mechanics, one often wants to solve an eigenvalue problem for a differential operator acting on a wave function $\psi(x)$. The first step is to assume that the wave function is defined on a finite interval $[0,L]$. A popular approach is to discretize the problem, i.e. to divide the continuous interval into $N$ smaller intervals, and to approximate the wave function by a linear function on each interval. This means to consider the evenly spaced points $0, a, 2a, \dots, L-a,L$ with $a=L/N$ and to consider the values of the wave function only at these points: $$\psi_j = \psi(j\cdot a) \text{ for } j=0…N.$$ The quantity $a$ is also known as the lattice spacing. This collection of $N+1$ complex numbers defines a vector with $N+1$ components, which represents an approximation of the continuous wave function $\psi(x)$.

In this representation, the usual techniques from linear algebra apply. In particular, the differential operator that we want to diagonalize becomes a matrix that acts on the vector $(\psi_j)_{j=0}^N$. It turns out that the following expression is a good discretization of the second derivative (the Laplace operator) $$ \partial^2_x \psi(x) \to (\triangle \psi)_j = \frac{1}{a^2}(\psi_{j+1} - 2\psi_j + \psi_{j-1}) ,$$ while a good discretization of the first derivative is $$ \partial_x \psi(x) \to (D \psi)_j = \frac{1}{2a}(\psi_{j+1} - \psi_{j-1}) .$$ Note that in the discretization, the second derivative is no longer the square of the second derivative, $D^2 \neq \triangle$; this will only be restored in the limit when the discretization becomes infinitely fine, $N\to \infty$ ($a\to 0$).

When defining the matrices above, we have glossed over one small but important detail: What happens at the boundary points, when the index $j$ becomes $j=0$ or $j=N$? Then, the right-hand side would ask for the values of $\psi_{-1}$ or $\psi_{N+1}$, which we have not defined. This question is related to the boundary conditions. In the continuum, one has to specify boundary conditions explicitly, while in the discretization, the boundary conditions become part of the matrix. The easiest way to handle them is to extend the formulas above with the following conventions: $$\text{Dirichlet boundary conditions:}\quad \psi(0)=\psi(L)=0 \quad\iff\quad \psi_{-1} := 0 \text{ and } \psi_{N+1} := 0.$$ $$\text{Periodic boundary conditions:}\quad \psi(0)=\psi(L) \quad\iff\quad \psi_{-1} := \psi_{N} \text{ and } \psi_{N+1} := \psi_{0}.$$ This makes the left-hand side well-defined for all values of $j$. It is also possible to implement von-Neumann or mixed boundary conditions in a similar fashion, but I will not present the details here.

One of the most elegant way to implement the discretized differential operators is to represent them as sparse matrices. In MATLAB, we can write the variants with Dirichlet boundary conditions as

shiftR  = sparse(2:N, 1:N-1, 1, N,N);
shiftL  = shiftR';

D       = (1/(2*a)) * (shiftL - shiftR)
Laplace = (1/(a*a)) * (shiftR + shiftL - 2*speye(N))

Then, more complicated operators can be easily assembled by summing or multiplying these matrices. Since the matrices are sparse, these operations do not cost much time to perform.

Calculus: Higher dimensions

In the previous section, we have discussed how to discretize a function $ψ(x)$ of a single real variable $x$. This helped us in solving an eigenvalue problem in one dimension. However, we live in a three-dimensional world, and we often have to deal with functions of multiple variables, like $ψ(x,y)$ or $ψ(x_1,x_2,x_3)$.

A natural way to discretize a function of two variables is to consider two (not necessarily different) lattice spacings $a_x,a_y$ and collect the function values in a two-dimensional array $$ ψ_{m,n} = ψ(n\cdot a_x, m\cdot a_y) \text{ for } n=1\dots N, m=1\dots M .$$ Here, we following the MATLAB conventions for indexing: $m$ indexes the rows and $n$ the columns of the two-dimensional array, and these indices correspond to the $y-$ and the $x-$axis, respectively. For instance, to make a three-dimensional plot of the square amplitude $|\psi_{m,n}|^2$, we use the following MATLAB code

[Xs, Ys] = meshgrid(1:N, 1:M);
surf(Xs, Ys, abs(psi).^2)

Here, the two-dimensional MATLAB array psi stores the values such that psi(m,n) = $ψ_{m,n}$. The function surf yields a three-dimensional surface plot where the row index ($m$) corresponds to the $y$-axis and the column index ($n$) corresponds to the $x$ axis. The meshgrid function is used to create the coordinate grid, the function abs computes absolute values, and the operation (.^2) computes the square of each array element.

While storing the values of a function with two parameters in an array with two dimensions is very natural for plotting, it does not work very well when we try to express linear operators on these functions. Remember that linear operators are represented as matrices, which are arrays with two dimensions. For these, it is much more convenient to represent the discretized function as an array $ϕ$ with a single dimension of size $M\cdot N$ instead of an array $ψ$ with two dimensions, of size $M\times N$. Converting between these two can be accomplished by a trick where we multiply the column index by $M$ and add the row index to get a single number, like this: $$ ϕ_{(n-1)\cdot M + m} := ψ_{m,n} \text{ for } n=1\dots N, m=1\dots M .$$ For example, if $N=M=10$, we have $ϕ_{14} = \psi_{4,2}$. This trick gives a one-to-one mapping and corresponds to using a positional numeral system with base $M$. Fortunately, we never have to use this trick explicitly. Instead, the MATLAB function reshape allows us to convert between the two representations:

phi = reshape(psi, [N*M,1])
psi = reshape(phi, [M,N])

The reasoning is that the trick with the numeral system accomplishes the same thing as taking the columns of a two-dimensional matrix and concatenating them vertically into a one-dimensional array, like this: $$ ψ = \begin{pmatrix} ψ_{1,1} & ψ_{1,2} & \cdots & ψ_{1,N} \\ ψ_{2,1} & ψ_{2,2} & \cdots & ψ_{2,N} \\ \vdots & \vdots & \ddots & \vdots \\ ψ_{M,1} & ψ_{M,2} & \cdots & ψ_{M,N} \\ \end{pmatrix} \iff ϕ = \begin{pmatrix} ψ_{1,1} \\ ψ_{2,1} \\ \vdots \\ ψ_{M,1} \\ ψ_{2,1} \\ ψ_{2,2} \\ \vdots \\ ψ_{M,N} \end{pmatrix} .$$ Also, and even more importantly, the kron function ("Kronecker product") offers a very convenient way to express linear operators on these arrays. This function is essentially an implementation of the tensor product. If $A$ is an $N\times N$ matrix and $B$ is an $M\times M$ matrix, then their Kronecker product $A \otimes B$ is an $N\cdot M \times N\cdot M$ matrix which has the form of a block matrix where each block is a matrix element of $A$ multiplying a copy of the matrix $B$, like this: $$ A = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,N} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,N} \\ \vdots & \vdots & \ddots & \vdots \\ a_{N,1} & a_{N,2} & \cdots & a_{N,N} \\ \end{pmatrix} \implies A \otimes B = \begin{pmatrix} a_{1,1}B & a_{1,2}B & \cdots & a_{1,N}B \\ a_{2,1}B & a_{2,2}B & \cdots & a_{2,N}B \\ \vdots & \vdots & \ddots & \vdots \\ a_{N,1}B & a_{N,2}B & \cdots & a_{N,N}B \\ \end{pmatrix} .$$ This works particularly well for differential operators that have been discretized as sparse matrices, as discussed in the previous section. For instance, the two-dimensional Laplacian $$ \partial^2_x + \partial_y^2 $$ is represented by the matrix

Laplace2 = kron(Laplace,speye(M)) + kron(speye(N),Laplace)

where Laplacian is a sparse matrix representing the discretized second derivative in one dimension, and speye(N) is the (sparse) $N\times N$ identity matrix. Here, the first summand corresponds to acting with Laplace on the $x-$axis, and the identity on the $y-$axis, and the second summand is similar, but with the axes switched. This is very similar to how the tensor product is used to treat multiple degrees of freedom in quantum mechanics. There, the Laplace operator in two dimensions is also written $∂_x^2 + ∂_y^2 \equiv \partial^2 \otimes \mathbf{1} + \mathbf{1} \otimes \partial^2$ where $\partial^2$ is the second derivative and $\mathbf{1}$ is the identity operator.

Of course, the Kronecker product can also be used for internal degrees of freedom, like spin. For instance, it is very convenient to define the Pauli matrices

sigma0 = eye(2);

sigmax = [0,1; 1,0];
sigmay = [0,-1i; 1i,0];
sigmaz = [1,0; 0,-1];

Then, the discretized Hamiltonian for a free electron in one dimension with a Zeeman field in $z-$direction can be expressed as

H = 1/(2*mass) * kron(Laplace, sigma0) + ...
    Bz * kron(speye(N), sigmaz)

where Bz corresponds to the strength of the Zeeman term.

Root finding: General rules

A root of a nonlinear function $f(x)$ is a point $x_r$ where the function becomes zero, $f(x_r)=0$. There are many problems in physics that have this form. For instance, the vibrational eigenmodes of a disk correspond to the roots of a Bessel function.

When calculating roots numerically, there are several rules to keep in mind in order to get good results:

The function $f$ should be nice, e.g. continuous, or differentiable, or ….
Use a function $f$ that changes sign when $x$ moves through the root, i.e. which satisfies $f(x)< 0$ for arguments $x$ close and to the left of the root, $x< x_r$, and $f(x)> 0$ for arguments close and to the right of the root, $x> x_r$ (or the other way round).
Mathematically, it does not matter whether we solve the equation $f(x)=0$ or the equation $|f(x)|^2=0$, both have the same roots. However, it matters a lot for numerical computations, as the latter variant is always positive and gives worse results. However, in the former case, if the computer has found two values of opposite sign, then by continuity, there must be a root in between, and the computer is well-advised to continuing the search in between.
Do not compute eigenvalues by finding the roots of the characteristic equation.
Given any matrix $A$, we know that its eigenvalues are the roots $λ$ of the equation $\det(A-λ\mathbf{1})=0$. However, expanding the left-hand side as a polynomial in $λ$ and trying to calculate its roots is not numerically stable. There are much better, iterative, methods for calculating eigenvalues that work directly with matrices and vectors. In fact, there are methods for root finding that map the problem to an eigenvalue problem for a nice matrix, and proceed by solving the latter iteratively.

Root finding: All roots with Chebyshev polynomials

A common method for finding roots numerically is Newton's method. It requires that the function $f$ is differentiable. The method takes a starting point $x_0$, which we need to guess and which should be in the vicinity of the root of interest, and iteratively improves this values until a value $x_r$ is found that satisfies $f(x_r)=0$ to high accuracy. In MATLAB, this method is implemented in fzero and in Mathematica, this method is implemented in FindRoot.

Unfortunately, Newton's method is only able to find a single numerical root and rquires a starting value. However, we are often interested in a certain range of values, say an interval $x\in [a,b]$, and want to find all roots within this range. In full generality, this is a hard problem, even numerically. Neither MATLAB nor Mathematic offer a built-in function to solve this problem. (People have collected approaches to finding all roots in Mathematica)

But it turns out that for a class of "sufficiently nice functions", good numerical methods for finding all roots exist. In particular, this is true for functions that can be expanded in terms of Chebyshev polynomials. There is an excellent software package, Chebfun, for MATLAB, which implements this method in a very user-friendly way. With this package, you don't even need to know what a Chebychev polynomial is, and finding all the roots is as simple as calling a function roots!

Root finding: Testing when a matrix becomes singular

From linear algebra, we know that a matrix $A$ is singular (not invertible) if and only if its determinant vanishes, $\det A = 0$. Sometimes, the matrix depends on a parameter, $A\equiv A(t)$, and we want to find the numerical values of $t$ where the matrix becomes singular.

When the matrix $A(t)$ is of low dimension, say $2\times 2$, it seems to me that simply applying the previous root finding techniques to the function $\det A(t)$ works pretty well.

However, for a large matrix, the rank function or the condition number of a matrix are appropriate numerical tools to test whether it is singular. I have not yet figured out how to combine this with the root finding technique, as the rank function is not continuous, and the condition number is always positive, thus does not change sign.