
Section 8.1 Linear operators

Linear operators are first encountered by students in calculus and differential equations, though not always with that name. Consider the problem
\begin{equation*} y^\prime = f(x) \end{equation*}
which can be written
\begin{equation*} \frac{d}{dx} y = f. \end{equation*}
More generally, we learn that any linear differential equation can be written in the form
\begin{equation*} L y = f \end{equation*}
for the differential operator \(L\text{.}\) When we encounter problems like this the first time, we learn a group of recipes that solve the equation for various classes of \(L\text{,}\) functions \(f\) and sets of boundary conditions.
In another set of classes, the linear algebra sequence, we learn an extensive set of tools for analyzing the matrix problem
\begin{equation*} A x = b \end{equation*}
in terms of the structure of the matrix \(A\text{.}\) For example, we learn that the existence of a solution depends on the columnspace of \(A\) and that the structure of the solution set depends on the nullspace (or kernel, thinking of \(A\) as a function) of \(A\text{.}\)
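To make the finite-dimensional picture concrete, here is a small numerical sketch using NumPy. The matrix and right-hand sides are made up for illustration, and the helper `is_solvable` is a hypothetical name, not part of any library. It tests whether \(b\) lies in the columnspace of \(A\) by comparing ranks, and shows that adding a nullspace vector to one solution gives another.

```python
import numpy as np

# Illustrative rank-1 matrix: the second column is twice the first.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

b_in = np.array([1.0, 2.0])   # lies in the columnspace of A
b_out = np.array([1.0, 0.0])  # does not lie in the columnspace of A

def is_solvable(A, b):
    """Ax = b has a solution exactly when appending b does not raise the rank."""
    augmented = np.column_stack([A, b])
    return np.linalg.matrix_rank(augmented) == np.linalg.matrix_rank(A)

# Dimension of the nullspace (rank-nullity theorem).
nullity = A.shape[1] - np.linalg.matrix_rank(A)

# One particular solution and one nullspace vector, found by inspection.
x_particular = np.array([1.0, 0.0])
null_vec = np.array([2.0, -1.0])

# Any x_particular + c * null_vec solves the same equation.
x_other = x_particular + 5.0 * null_vec
```

Here `is_solvable(A, b_in)` holds while `is_solvable(A, b_out)` fails, and both `x_particular` and `x_other` solve \(Ax = b\) for the first right-hand side.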
This chapter will examine the relationship between the matrix situation (in finite dimensions) and the linear operator situation (in infinite dimensions). A surprising amount of structure carries from finite to infinite dimensions.

Definition 8.1.1.

If \(V, W\) are vector spaces over a field \(k\text{,}\) a linear operator from \(V\) to \(W\) is a map \(T:V \to W\) such that
\begin{equation*} T(ax + by) = aT(x) + bT(y), \end{equation*}
for all \(a, b \in k\) and \(x, y \in V\text{.}\)
A linear operator on \(V\) is a linear operator from \(V\) to \(V\text{.}\) (Note: these notes largely follow the same conventions as Axler’s Linear Algebra Done Right. There is one difference here: Axler reserves the term operator for elements of \(\mathcal{L}(V)\text{,}\) that is, maps from \(V\) to \(V\text{.}\))
If \(V, W\) are normed spaces, we say that a linear operator \(T:V \to W\) is bounded if there exists a constant \(M \geq 0\) such that
\begin{equation*} \norm{Tx} \leq M \norm{x} \end{equation*}
for all \(x \in V\text{.}\) For a bounded \(T\text{,}\) we define the operator norm (or just norm when there is no ambiguity) of \(T\) to be the non-negative number
\begin{equation*} \norm{T} = \sup\{\norm{Tx}:x\in V, \norm{x}\leq 1\}. \end{equation*}
The kernel of a linear operator \(T:V\to W\) is the subspace of \(V\) given by \(\{x: Tx = 0\}\text{.}\) The kernel of \(T\) is denoted \(\ker T\text{.}\) The range of \(T\) is the subspace of \(W\) given by \(\{Tx: x \in V\}\text{.}\) The range of \(T\) is denoted \(\ran T\text{.}\)
\(\norm{T}\) can be thought of as the largest factor by which \(T\) stretches any vector, though it is a supremum, not a maximum. (What does this notion mean for a square matrix?) It is also the case that, for every \(x \in V\text{,}\)
\begin{equation*} \norm{Tx} \leq \norm{T}\norm{x} \text{ (why?)} \end{equation*}
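For readers who want to experiment with the matrix case, the following sketch may help. It assumes the Euclidean norm on \(\mathbb{R}^2\text{,}\) where the operator norm of a matrix is its largest singular value. The matrix and the random sample are arbitrary choices made for illustration.

```python
import numpy as np

def operator_norm(A):
    """Operator norm of a matrix for the Euclidean norm: its largest singular value."""
    return np.linalg.norm(A, 2)

# An arbitrary matrix chosen for illustration.
A = np.array([[3.0, 0.0],
              [4.0, 5.0]])
norm_A = operator_norm(A)

# Sample random vectors and measure how much A stretches each one.
rng = np.random.default_rng(0)
xs = rng.standard_normal((1000, 2))
ratios = np.linalg.norm(xs @ A.T, axis=1) / np.linalg.norm(xs, axis=1)

# The largest sampled stretch factor never exceeds the operator norm.
max_ratio = ratios.max()
```

No sampled ratio \(\norm{Ax}/\norm{x}\) exceeds `norm_A`, which is the inequality \(\norm{Ax} \leq \norm{A}\norm{x}\) in numerical form. In finite dimensions the supremum is attained, so here it is in fact a maximum.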

Example 8.1.2. the zero operator.

Let \(V, W\) be normed spaces. The map that sends every element of \(V\) to the zero element in \(W\) is a bounded operator of norm \(0\text{.}\)

Example 8.1.3. a multiplication operator.

Choose a function \(f \in C[a, b]\) and define the operator \(M_f\) on \(L^2[a,b]\) by
\begin{equation*} (M_f x)(t) = f(t)x(t). \end{equation*}
It should be immediately clear that \(M_f\) is linear. Recalling the norms on \(C[a,b]\) and \(L^2\text{,}\) for any \(x\in L^2[a,b]\) we get
\begin{align*} \norm{M_f x}^2 \amp= \int_a^b \abs{f}^2 \abs{x}^2 \, dt \\ \amp\leq \sup_{t \in [a, b]}\abs{f(t)}^2 \int_a^b \abs{x}^2 \, dt\\ \amp = \norm{f}_\infty^2 \norm{x}^2. \end{align*}
Then \(M_f\) is a bounded operator and \(\norm{M_f} \leq \norm{f}_\infty\text{.}\) It turns out (usefully) that \(\norm{M_f} = \norm{f}_\infty\text{.}\) (Can you prove it?)
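The bound can be explored numerically. The sketch below discretizes \([0,1]\) on a uniform grid and approximates the \(L^2\) norm by a Riemann sum. The function \(f\text{,}\) the test vector, and the grid size are all arbitrary choices for illustration, and a grid computation is evidence rather than proof.

```python
import numpy as np

# Uniform grid on [a, b] = [0, 1]; the grid size is an arbitrary choice.
a, b, n = 0.0, 1.0, 2000
t = np.linspace(a, b, n)
dt = (b - a) / (n - 1)

def l2_norm(x):
    """Riemann-sum approximation of the L^2[a, b] norm of sampled values x."""
    return np.sqrt(np.sum(np.abs(x) ** 2) * dt)

# Illustrative multiplier: f(t) = 2 + sin(2 pi t), with sup norm 3 at t = 1/4.
f = 2.0 + np.sin(2.0 * np.pi * t)
sup_f = np.max(np.abs(f))

# Check ||M_f x|| <= ||f||_inf ||x|| on one test function.
x = np.exp(-t)
lhs = l2_norm(f * x)
rhs = sup_f * l2_norm(x)

# A narrow bump concentrated where |f| is largest nearly attains the bound.
bump = np.where(np.abs(t - 0.25) < 0.01, 1.0, 0.0)
ratio = l2_norm(f * bump) / l2_norm(bump)
```

The value of `ratio` lands just below `sup_f`. Functions concentrated near a point where \(\abs{f}\) is largest are one natural place to look when trying to prove the equality of norms.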

Example 8.1.4. an integral operator.

For real numbers \(a, b, c, d\text{,}\) let \(k:[a,b]\times [c,d] \to \C\) be a continuous map, and define \(K:L^2[a,b] \to L^2[c,d]\) by
\begin{equation*} (Kx)(t) = \int_a^b k(t, s) x(s) \, ds \end{equation*}
for \(t \in [c,d]\text{.}\) \(K\) is linear (as integration is). For any \(t\in [c,d]\text{,}\) the Cauchy-Schwarz inequality gives
\begin{equation*} \abs{Kx(t)}^2 \leq \left(\int_a^b \abs{k(t,s)}^2 \, ds\right) \left(\int_a^b \abs{x(s)}^2 \, ds\right). \end{equation*}
Integrating both sides over \(t \in [c,d]\) shows that \(K\) is bounded and
\begin{equation*} \norm{K} \leq \left(\int_c^d \int_a^b \abs{k(t,s)}^2\, ds \, dt \right)^{1/2}. \end{equation*}
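A discretized version of this estimate can be checked directly. In the sketch below the kernel \(k(t,s) = e^{-\abs{t-s}}\text{,}\) the intervals, and the grid sizes are hypothetical choices. Integrals are replaced by midpoint sums, so the operator becomes a matrix and the double integral becomes a scaled Frobenius norm.

```python
import numpy as np

# Intervals [a, b] for s and [c, d] for t; all values are illustrative.
a, b, c, d = 0.0, 1.0, 0.0, 2.0
n, m = 200, 300                      # number of s-points and t-points
ds, dt = (b - a) / n, (d - c) / m
s = a + (np.arange(n) + 0.5) * ds    # midpoints in [a, b]
t = c + (np.arange(m) + 0.5) * dt    # midpoints in [c, d]

# Hypothetical continuous kernel sampled on the grid: k(t, s) = exp(-|t - s|).
kernel = np.exp(-np.abs(t[:, None] - s[None, :]))

# With ||x||^2 ~ sum |x_j|^2 ds and ||Kx||^2 ~ sum |(Kx)_i|^2 dt, the
# discretized operator acts between Euclidean spaces as this scaled matrix.
K_scaled = kernel * np.sqrt(ds * dt)

# Operator norm of the discretization (largest singular value).
op_norm = np.linalg.norm(K_scaled, 2)

# Discrete analogue of the double-integral bound.
bound = np.sqrt(np.sum(np.abs(kernel) ** 2) * ds * dt)
```

For this kernel `op_norm` comes out below `bound`, as the estimate predicts. The bound is generally not sharp: it is the Frobenius norm of the scaled matrix, which dominates the largest singular value.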

Example 8.1.5. the shift operator.

Consider the sequence space \(\ell^2\) and define the shift operator \(S\) on \(\ell^2\) by \(S(x_1, x_2, \ldots) = (0, x_1, x_2, \ldots)\text{.}\) Since \(\norm{Sx} = \norm{x}\) for all \(x \in \ell^2\text{,}\) we see that \(S\) is an isometry, which implies that \(S\) is bounded. The backwards shift \(S\ad\) is defined by \(S\ad(x_1, x_2, \ldots) = (x_2, x_3, \ldots)\text{.}\) While \(S\ad\) is bounded and \(\norm{S\ad} = 1\text{,}\) it is not an isometry: for instance, \(S\ad(1, 0, 0, \ldots) = 0\text{.}\)
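The two shifts are easy to experiment with on finitely supported sequences. In the sketch below a Python list stands for a sequence whose remaining entries are all zero. The function names are made up for illustration.

```python
import math

def shift(x):
    """Forward shift S: prepend a zero, (x1, x2, ...) -> (0, x1, x2, ...)."""
    return [0.0] + list(x)

def backward_shift(x):
    """Backward shift: drop the first entry, (x1, x2, ...) -> (x2, x3, ...)."""
    return list(x[1:])

def norm(x):
    """The l^2 norm of a finitely supported sequence."""
    return math.sqrt(sum(abs(v) ** 2 for v in x))

# The sequence (3, 4, 12, 0, 0, ...), which has norm 13.
x = [3.0, 4.0, 12.0]

norm_x = norm(x)
norm_shifted = norm(shift(x))           # unchanged: S is an isometry
norm_back = norm(backward_shift(x))     # smaller: the entry 3 is lost

# Backward after forward recovers x; forward after backward does not.
round_trip = backward_shift(shift(x))
other_trip = shift(backward_shift(x))
```

Applying the backward shift after \(S\) returns the original sequence, while applying them in the other order replaces the first entry with zero.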
As with linear functionals, continuity and boundedness are equivalent for linear operators on normed spaces.
The proof of this theorem is identical to the proof of Theorem 7.1.7 with absolute values replaced by norms.
In fact, many properties of linear functionals extend to the larger class of linear operators. For normed spaces \(V, W\text{,}\) we denote by \(\L(V, W)\) the set of bounded (and thus continuous) linear operators from \(V \to W\text{.}\) Just as in the case of linear functionals, the bounded linear operators \(\L(V, W)\) form a vector space. If \(V, W\) are normed spaces, we can say more.
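For compatible operators, we can work with compositions. We’ll use the usual operator notation \(BA\) to mean \(B \circ A\) where \(A: U \to V\) and \(B: V \to W\text{.}\) So \(BAx = B(Ax)\text{.}\) The norms act in the manner you might suspect.

Theorem 8.1.8.

If \(U, V, W\) are normed spaces, \(A \in \L(U, V)\) and \(B \in \L(V, W)\text{,}\) then \(BA \in \L(U, W)\) and \(\norm{BA} \leq \norm{B}\norm{A}\text{.}\)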

Proof.

The operator \(BA\) inherits linearity from \(A\) and \(B\) and continuity as a composition of continuous functions. To see the inequality with the norms, for any \(x \in U\text{,}\)
\begin{align*} \norm{BAx} \amp= \norm{B(Ax)} \\ \amp \leq \norm{B}\norm{Ax}\\ \amp \leq \norm{B}\norm{A}\norm{x}. \end{align*}
Taking the supremum over all \(x \in U\) with \(\norm{x} \leq 1\text{,}\) we’ve shown that
\begin{equation*} \norm{BA} \leq \norm{B}\norm{A}. \end{equation*}
In the case that \(A \in \L(V)\text{,}\) we denote repeated compositions of \(A\) in power notation. Hence \(\underbrace{AA\cdots A}_{n \text{ times}} = A^n\text{.}\) As an immediate and useful consequence of Theorem 8.1.8,
\begin{equation*} \norm{A^n} \leq \norm{A}^n. \end{equation*}
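The inequality can be strict, and a small matrix example shows how far apart the two sides can be. The sketch below uses an illustrative nilpotent matrix and the Euclidean operator norm (largest singular value). Both choices are assumptions made for the example.

```python
import numpy as np

# A nilpotent matrix chosen for illustration: A @ A is the zero matrix.
A = np.array([[0.0, 2.0],
              [0.0, 0.0]])

norm_A = np.linalg.norm(A, 2)  # largest singular value of A

# Compare ||A^n|| with ||A||^n for n = 1, 2, 3, 4.
exponents = range(1, 5)
power_norms = [np.linalg.norm(np.linalg.matrix_power(A, n), 2) for n in exponents]
bounds = [norm_A ** n for n in exponents]
```

Here `power_norms` is zero from \(n = 2\) onward while `bounds` keeps doubling, so \(\norm{A^n}\) can be much smaller than \(\norm{A}^n\text{.}\)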