The exponential is one of the most fundamental functions in mathematics, and especially in differential equations. Using Taylor series, it is possible to prove that
$$e^x = \sum_{k \ge 0} \frac{x^k}{k!} \tag{1}$$
for any $x \in \mathbb{R}$. This sum is absolutely convergent for every $x$, so it also makes sense even in situations where $x$ is not a real number. This is, for instance, how $e^z$ is defined when $z$ is a complex number, giving rise to the meme-worthy formula
$$e^{\pi i} = -1.$$
This sum is also how you define the exponential of a (square) matrix.
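To make this definition concrete, here is a minimal numerical sketch (assuming NumPy and SciPy are available; the truncation order and the example matrix are arbitrary choices) that sums the series (1) for a small matrix and compares the result against scipy.linalg.expm:

```python
import numpy as np
from scipy.linalg import expm

def expm_series(A, terms=30):
    """Approximate e^A by truncating the series sum_{k >= 0} A^k / k!."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k          # maintains term = A^k / k!
        result = result + term
    return result

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])          # generator of 2D rotations
print(np.allclose(expm_series(A), expm(A)))   # True
```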
The Matrix Exponential
The exponential of a real number shows up literally everywhere when studying differential equations with a single variable, such as
$$\frac{d}{dt} f(t) = f(t).$$
In the same way, the matrix exponential shows up all over the place when studying systems of differential equations, such as
$$\frac{d}{dx} f_0(x) = f_1(x), \qquad \frac{d}{dx} f_1(x) = f_0(x).$$
In particular, the matrix exponential shows up in linear differential equations. Take, for instance, the following initial value problem, where $w(t) \in \mathbb{R}^n$ and $A \in M_{n \times n}(\mathbb{R})$:
$$\frac{d}{dt} w(t) = A w(t)$$
$$w(0) = w_0$$
This has the (unique) solution
$$w(t) = e^{At} w_0.$$
This solution can be verified by replacing $e^{At}$ with its defining Taylor series, and differentiating term by term. Estimates of the matrix exponential help mathematicians to understand the behavior of vector-valued differential equations.
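As a quick numerical sanity check (a sketch assuming NumPy/SciPy; the matrix, initial condition, and end time are arbitrary), the closed-form solution $w(t) = e^{At} w_0$ can be compared against a direct numerical integration of the system:

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])           # couples the components, as in the system above
w0 = np.array([1.0, 0.0])
t_final = 2.0

# Closed-form solution w(t) = e^{At} w0 at t = t_final.
w_exact = expm(A * t_final) @ w0

# Numerical integration of w'(t) = A w(t) for comparison.
sol = solve_ivp(lambda t, w: A @ w, (0.0, t_final), w0, rtol=1e-10, atol=1e-12)
print(np.allclose(sol.y[:, -1], w_exact, atol=1e-6))   # True
```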
Preliminary Bounds
My goal in this post is to present a neat bound on the matrix exponential, but first I think it is prudent to go through some alternative bounds on the exponential. Because the following discussion requires us to use eigenvalues and eigenvectors, which may be complex, we will naturally work with matrices in $M_{n \times n}(\mathbb{C})$. For $A \in M_{n \times n}(\mathbb{C})$, let $\|A\|$ denote its spectral norm.
Within the complex numbers, the exponential is very well understood. Euler's famous formula gives that:
$$e^z = e^{\Re z}\left(\cos(\Im z) + i \sin(\Im z)\right)$$
In particular,
$$|e^z| = e^{\Re z} \le e^{|z|} \tag{2}$$
Now, consider exponentials in $M_{n \times n}(\mathbb{C})$. We get the basic bound
$$\|e^A\| = \left\| 1 + \frac{A}{1!} + \frac{A^2}{2!} + \cdots \right\| \le 1 + \frac{\|A\|}{1!} + \frac{\|A\|^2}{2!} + \cdots = e^{\|A\|}$$
Thus,
$$\|e^A\| \le e^{\|A\|} \tag{3}$$
This bound can be pretty lousy at times, but it is a start. For instance, if $A = -t \mathbf{1}$ for $t > 0$, where $\mathbf{1}$ is the identity matrix, then $\|e^A\| = e^{-t}$. In contrast, the above bound only guarantees that $\|e^A\| \le e^t$.
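Here is a quick numerical illustration of how loose (3) can be (a sketch assuming NumPy/SciPy; the value of $t$ is an arbitrary choice):

```python
import numpy as np
from scipy.linalg import expm

t = 5.0
A = -t * np.eye(3)                        # A = -t * identity
actual = np.linalg.norm(expm(A), 2)       # e^{-t}, about 0.0067
bound = np.exp(np.linalg.norm(A, 2))      # (3) only guarantees e^{t}, about 148.4
print(actual, bound)
```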
For certain matrices, we can compute $e^A$ exactly, and use this to get a much sharper bound. Let $A$ be a diagonal matrix:
$$A = \begin{pmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_n \end{pmatrix}$$
Then, the matrix exponential is exactly
$$e^A = \begin{pmatrix} e^{a_1} & 0 & \cdots & 0 \\ 0 & e^{a_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & e^{a_n} \end{pmatrix}$$
and we can compute the norm exactly:
$$\|e^A\| = \max_{i = 1 \ldots n} e^{\Re a_i} \tag{4}$$
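A quick numerical confirmation of (4) (a sketch assuming NumPy/SciPy; the diagonal entries are arbitrary complex numbers):

```python
import numpy as np
from scipy.linalg import expm

a = np.array([1.0 + 2.0j, -0.5 + 1.0j, 0.3 - 4.0j])   # arbitrary diagonal entries
A = np.diag(a)
lhs = np.linalg.norm(expm(A), 2)    # spectral norm of e^A
rhs = np.exp(a.real).max()          # max over i of e^{Re(a_i)}
print(np.isclose(lhs, rhs))         # True
```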
If $A$ is diagonalizable, then we can get a bound similar to (4). Suppose that $A = T D T^{-1}$, where $D$ is a diagonal matrix. Then,
$$e^A = e^{T D T^{-1}} = T e^D T^{-1}$$
The diagonal entries of $D$ are exactly the eigenvalues of $A$. Let $\lambda_1, \ldots, \lambda_n$ be the eigenvalues of $A$, and define
$$\Lambda(A) = \max_{i = 1 \ldots n} \Re \lambda_i.$$
It can be shown that $\Lambda$ is a continuous function $M_{n \times n}(\mathbb{C}) \to \mathbb{R}$. With this notation, we can write the bound:
$$\|e^A\| \le \|T\| \, \|T^{-1}\| \, e^{\Lambda(A)} \tag{5}$$
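Here is a small sketch (assuming NumPy/SciPy; a random matrix is generically diagonalizable, so np.linalg.eig plays the role of the factorization $A = T D T^{-1}$) that checks (5) numerically:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))            # generically diagonalizable

eigvals, T = np.linalg.eig(A)              # columns of T are eigenvectors
Lambda = eigvals.real.max()                # Lambda(A) = max Re(lambda_i)

actual = np.linalg.norm(expm(A), 2)
bound = np.linalg.norm(T, 2) * np.linalg.norm(np.linalg.inv(T), 2) * np.exp(Lambda)
print(actual <= bound)                     # True
```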
More generally, any square matrix $A$ is similar to a matrix $J$ in Jordan canonical form, $A = T J T^{-1}$. The exponential $e^J$ can be computed exactly, yielding the bound:
$$\|e^A\| \le C_n \|T\| \, \|T^{-1}\| \left(1 + \|A\|^{n-1}\right) e^{\Lambda(A)} \tag{6}$$
for some constant $C_n > 0$ that depends only on $n$, the dimension. This bound, although very useful, is not entirely satisfying because it depends on $T$, which may be very hard to compute and may be arbitrarily large.
A Pretty Good Bound
It turns out that:
$$\|e^A\| \le C_n \left(1 + \|A\|^{n-1}\right) e^{\Lambda(A)} \tag{7}$$
The constant $C_n$ may be quite large, but it is at most exponential in $n$. Since $\Lambda$ is a continuous function, this bound is actually continuous in $A$, which is why I like it so much.
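As an informal sanity check (a sketch assuming NumPy/SciPy; the sample size, dimension, and scale are arbitrary choices), one can sample random matrices and watch the ratio between $\|e^A\|$ and $(1 + \|A\|^{n-1}) e^{\Lambda(A)}$ stay bounded, consistent with (7):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
n = 4
ratios = []
for _ in range(1000):
    A = rng.standard_normal((n, n)) * 3.0
    Lambda = np.linalg.eigvals(A).real.max()
    numer = np.linalg.norm(expm(A), 2)
    denom = (1 + np.linalg.norm(A, 2) ** (n - 1)) * np.exp(Lambda)
    ratios.append(numer / denom)
print(max(ratios))   # an empirical lower estimate of C_4 for this sample
```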
For a period of time, I was very interested in understanding the matrix exponential, but I had trouble finding any bounds similar to (7). This is one of the reasons why I'm making this blog post. Eventually, with the help of Math Stack Exchange, I received an answer from user Anton Malyshev. The proof that I present here is a paraphrasing of his proof.
Proof: If $n = 0$, then this bound is clearly true. Proceed by induction on $n$.
Let $A \in M_{n \times n}(\mathbb{C})$. Let $v$ be a unit eigenvector of $A$ with eigenvalue $\lambda$. Without loss of generality, suppose that $\Lambda(A) = \Re \lambda$. Let $V$ be the 1-dimensional subspace of $\mathbb{C}^n$ spanned by $v$, and let $W$ be the orthogonal subspace to $V$. Note that $\dim W = n - 1$. Find $b \in W$ and $D : W \to W$ such that, for $w \in W$,
$$Aw = (b \cdot w)\, v + Dw$$
Informally, we could write $A$ in block matrix form as:
$$A = \begin{pmatrix} \lambda & b \\ 0 & D \end{pmatrix}$$
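Numerically, one way to realize this block form (not part of the proof, just an illustration assuming NumPy) is to complete $v$ to an orthonormal basis via QR and conjugate $A$ by the resulting unitary:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))

# Pick the eigenvector whose eigenvalue has the largest real part.
eigvals, eigvecs = np.linalg.eig(A)
i = np.argmax(eigvals.real)
lam, v = eigvals[i], eigvecs[:, i]

# Complete v to an orthonormal basis: QR of a matrix whose first column is v.
M = np.column_stack([v, rng.standard_normal((n, n - 1))])
Q, _ = np.linalg.qr(M)

B = Q.conj().T @ A @ Q     # block upper triangular: [[lambda, b], [0, D]]
print(np.allclose(B[1:, 0], 0, atol=1e-10))   # entries below lambda are ~0
```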
To compute $e^A$, consider the initial value problem
$$u(0) = u_0$$
$$\frac{d}{dt} u(t) = A u(t) \tag{8}$$
The (unique) solution to this differential equation is:
$$u(t) = e^{At} u_0$$
Thus, to bound $\|e^A\|$, it suffices to bound $\|u(1)\|$ for every unit vector $u_0$.
We next project equation (8) onto $V$ and $W$. Let $u(t) = u_V(t) + u_W(t)$, where $u_V(t) \in V$ and $u_W(t) \in W$. Then, (8) becomes the following:
$$\frac{d}{dt} u_V(t) = \lambda\, u_V(t) + (b \cdot u_W(t))\, v, \qquad \frac{d}{dt} u_W(t) = D\, u_W(t) \tag{9}$$
Now, $u_W(t)$ satisfies a simple linear equation. We can thus solve for $u_W(t)$ exactly.
$$u_W(t) = e^{Dt} u_W(0)$$
By induction, we have the bound:
$$\|u_W(t)\| \le C_{n-1} \left(1 + \|D\|^{n-2} t^{n-2}\right) e^{\Lambda(D) t}\, \|u_W(0)\|$$
Note that $\|D\| \le \|A\|$, $\Lambda(D) \le \Lambda(A)$, and $\|u_W(0)\| \le \|u_0\|$. This yields the following, somewhat simpler bound:
$$\|u_W(t)\| \le C_{n-1} \left(1 + \|A\|^{n-2} t^{n-2}\right) e^{\Lambda(A) t}\, \|u_0\| \tag{10}$$
Bounding $u_V(t)$ is trickier, but possible. Since $u_V(t)$ is a multiple of $v$, we can define $\alpha : \mathbb{R} \to \mathbb{C}$ by $u_V(t) = \alpha(t) v$. Plugging this into (9) gives us the differential equation:
$$\frac{d}{dt} \alpha(t) = \lambda\, \alpha(t) + b \cdot u_W(t) \tag{11}$$
The solution to this differential equation can be approximated using integrating factors. Define $\beta(t) = e^{-\lambda t} \alpha(t)$. Some routine calculations follow...
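For concreteness, here is the standard integrating-factor step (presumably the routine computation alluded to above, spelled out under the definitions already given). The product rule and (11) give
$$\frac{d}{dt} \beta(t) = e^{-\lambda t}\left(\frac{d}{dt} \alpha(t) - \lambda\, \alpha(t)\right) = e^{-\lambda t}\, b \cdot u_W(t),$$
so integrating from $0$ to $t$ yields
$$\alpha(t) = e^{\lambda t}\, \alpha(0) + \int_0^t e^{\lambda (t - s)}\, b \cdot u_W(s)\, ds.$$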