Section 5.4 Complete orthonormal sequences
We are now ready to establish a generalization of coordinates and bases appropriate to Hilbert space. That is, we would like to make sense of finding a coordinate expression for a vector \(x\) in a Hilbert space \(\hilbert\text{.}\) The primary geometric difficulty that we have to wrestle with is that there are infinitely many “directions” in which extra stuff can hide. We need to know that the orthonormal sequence we’re expanding against doesn’t leave any directions hidden.
Given an orthonormal sequence
\((e_n)\text{,}\) we would like to write (in parallel with
Theorem 1.3.1)
\begin{equation*}
x = \sum_{n=1}^\infty \ip{x}{e_n} e_n.
\end{equation*}
Theorem 5.2.1 and
Theorem 5.2.3 guarantee that the right hand side makes sense; the series converges to some vector in
\(\hilbert\text{.}\) But there is no way to know without further assumptions on the orthonormal sequence that there aren’t directions that are “hidden” that might make the equation false. That is, we don’t know if the limit of the right hand side is actually
\(x\text{.}\)
The issue of when a Fourier series is equal to the object that generates it is delicate enough that in practice, mathematicians often use \(\sim\) instead of \(=\) to denote the relationship; that is,
\begin{equation*}
x \sim \sum_{n=1}^\infty \ip{x}{e_n} e_n
\end{equation*}
means that \(x\) has the given Fourier representation, but makes no claims on equality (whatever that might mean).
We can see the issue if we look at an expansion of a vector \(x\in \ell^2\) with respect to the standard orthonormal sequence (much like the standard unit vectors) where \(e_n\) has a 1 in the \(n\)th position and 0s elsewhere. From this, construct a shifted sequence \(f_n = e_{n+1}\text{.}\) Then \((f_n)_{n\in\mathbb{N}}\) remains an infinite orthonormal sequence in \(\ell^2\text{.}\) Now choose \(x = (x_n) \in \ell^2\) with \(x_1 \neq 0\text{.}\) Then
\begin{align*}
\sum_{n=1}^\infty \ip{x}{f_n}f_n \amp= \sum_{n=2}^\infty \ip{x}{e_n} e_n\\
\amp = (0, x_2, x_3, \ldots) \neq x.
\end{align*}
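The computation above can be checked numerically. The following sketch (an illustration only; it truncates \(\ell^2\) to vectors of length \(N\), and the sample vector \(x\) is arbitrary) expands \(x\) against the shifted sequence \(f_n = e_{n+1}\) and shows that the first coordinate is lost.

```python
# Finite truncation of the shifted-sequence example: expanding x against
# f_n = e_{n+1} recovers every coordinate of x except the first.

N = 6
x = [1.0, 0.5, -2.0, 0.25, 3.0, -1.0]  # a sample x with x_1 != 0

def e(n, N):
    """Standard basis vector e_n (1-indexed), truncated to length N."""
    return [1.0 if i == n - 1 else 0.0 for i in range(N)]

def ip(u, v):
    """Inner product (entries here are real, so no conjugation needed)."""
    return sum(a * b for a, b in zip(u, v))

# Expand x against the shifted sequence f_n = e_{n+1}, for n = 1, ..., N-1.
expansion = [0.0] * N
for n in range(1, N):
    f_n = e(n + 1, N)
    c = ip(x, f_n)
    expansion = [s + c * t for s, t in zip(expansion, f_n)]

print(expansion)  # [0.0, 0.5, -2.0, 0.25, 3.0, -1.0] -- the first coordinate is lost
```

Since \(x_1 = 1 \neq 0\text{,}\) the expansion fails to recover \(x\text{:}\) the “direction” \(e_1\) is hidden from the sequence \((f_n)\text{.}\)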
More generally, let us look at the error between \(x\) and an orthonormal expansion in terms of an orthonormal sequence \((e_n)\text{:}\)
\begin{equation*}
y = x - \sum_{n=1}^\infty \ip{x}{e_n}e_n.
\end{equation*}
Then for each \(j \in \mathbb{N}\text{,}\) linearity of the inner product gives
\begin{equation*}
\ip{y}{e_j} = \ip{x}{e_j} - \sum_{n=1}^\infty \ip{x}{e_n}\ip{e_n}{e_j} = 0.
\end{equation*}
That is, the vector \(y\) is orthogonal to each member of the orthonormal sequence. (This property is used to compute the distance from a vector to a subspace in finite dimensions.) So if we know that the only vector orthogonal to every \(e_n\) is the zero vector, then we can infer that \(y = 0\text{,}\) which makes the representation valid.
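A finite-dimensional sketch of this computation may help. In the toy example below (all choices of vectors are illustrative), the orthonormal set is just two standard basis vectors in \(\mathbb{R}^3\text{,}\) so it is not complete; the residual \(y\) is still orthogonal to each member of the set, but it is nonzero, exposing the hidden direction.

```python
# Sketch: the residual y = x - sum <x,e_n> e_n is orthogonal to every e_n,
# even when the orthonormal set is not complete.

x = [3.0, -1.0, 2.0]
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # orthonormal but not complete in R^3

def ip(u, v):
    return sum(a * b for a, b in zip(u, v))

proj = [0.0, 0.0, 0.0]
for e_n in basis:
    c = ip(x, e_n)
    proj = [p + c * t for p, t in zip(proj, e_n)]

y = [a - b for a, b in zip(x, proj)]   # the error vector
print([ip(y, e_n) for e_n in basis])   # [0.0, 0.0] -- orthogonal to each e_n
print(y)                               # [0.0, 0.0, 2.0] -- nonzero: a hidden direction
```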
Definition 5.4.1.
An orthonormal sequence \((e_n)\) in a Hilbert space \(\hilbert\) is complete if the only member of \(\hilbert\) which is orthogonal to every \(e_n\) is the zero vector.
We are ready to prove the Hilbert space analogue of
Theorem 1.3.1.
Theorem 5.4.2.
Let \((e_n)_{n\in\mathbb{N}}\) be a complete orthonormal sequence in a Hilbert space \(\hilbert\text{.}\) For any \(x \in \hilbert\text{,}\)
\begin{equation*}
x = \sum_{n=1}^\infty \ip{x}{e_n}e_n
\end{equation*}
and
\begin{equation*}
\norm{x}^2 = \sum_{n=1}^\infty \abs{\ip{x}{e_n}}^2.
\end{equation*}
Proof.
As above, compute
\begin{equation*}
y = x - \sum_{n=1}^\infty \ip{x}{e_n}e_n.
\end{equation*}
Then for each \(j \in \mathbb{N}\text{,}\) linearity of the inner product gives
\begin{equation*}
\ip{y}{e_j} = \ip{x}{e_j} - \sum_{n=1}^\infty \ip{x}{e_n}\ip{e_n}{e_j} = 0.
\end{equation*}
That is, the vector \(y\) is orthogonal to each member of the orthonormal sequence. Since by hypothesis \((e_n)\) is complete, we deduce the first statement. For the second statement, orthonormality gives, for each \(N\text{,}\)
\begin{equation*}
\norm{\sum_{n=1}^N \ip{x}{e_n}e_n}^2 = \sum_{n=1}^N \abs{\ip{x}{e_n}}^2.
\end{equation*}
On letting \(N\to\infty\text{,}\) continuity of the norm gives the second statement.
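Both conclusions of the theorem can be verified numerically in a finite-dimensional model. The sketch below (an illustration; the rotation angle and the vector \(x\) are arbitrary choices) uses a rotated orthonormal basis of \(\mathbb{R}^2\text{,}\) which is complete, and checks that the expansion recovers \(x\) and that Parseval's identity \(\norm{x}^2 = \sum \abs{\ip{x}{e_n}}^2\) holds.

```python
# Check of Theorem 5.4.2 in R^2 with a rotated (complete) orthonormal basis.
import math

t = 0.7  # an arbitrary rotation angle
basis = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]
x = [2.0, -1.0]

def ip(u, v):
    return sum(a * b for a, b in zip(u, v))

coeffs = [ip(x, e_n) for e_n in basis]  # the Fourier coefficients <x, e_n>

# Reconstruct x from its Fourier coefficients.
recon = [sum(c * e_n[i] for c, e_n in zip(coeffs, basis)) for i in range(2)]

norm_sq = ip(x, x)
parseval = sum(c * c for c in coeffs)
print(recon)              # approximately [2.0, -1.0]
print(norm_sq, parseval)  # equal up to rounding
```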
We have finally arrived at the notion of a basis in a Hilbert space (indexed by
\(\mathbb{N}\)). A complete orthonormal sequence in
\(\hilbert\) is also called an
orthonormal basis of
\(\hilbert\text{.}\) The issue with identifying a basis is typically not orthogonality or norm, but instead completeness. After all, in infinite dimensions, there are lots of places to hide. (The study of bases for infinite dimensional spaces is complicated; see this article on the notion of a
Schauder basis for a first look into the details.)
In finite dimensions, a basis is a minimal spanning set; in particular, the span of a basis is the entire space. A similar fact holds in Hilbert space, with the caveat that we have to work with the closed linear span.
Theorem 5.4.3.
Let \((e_n)_{n\in\mathbb{N}}\) be an orthonormal sequence in a Hilbert space \(\hilbert\text{.}\) The following are equivalent:
\((e_n)\) is complete;
\(\cl \spn \{e_n: n \in \mathbb{N}\} = \hilbert\text{;}\)
\begin{equation*}
\norm{x}^2 = \sum_{n=1}^\infty \abs{\ip{x}{e_n}}^2.
\end{equation*}
Proof.
\((1) \Rightarrow (3)\) follows immediately from
Theorem 5.4.2. For \((1) \Rightarrow (2)\text{:}\) since series convergence is convergence of the sequence of partial sums, and each partial sum lies in
\(\cl \spn \{e_n: n \in \mathbb{N}\}\text{,}\) closedness gives
\begin{equation*}
\sum_{n=1}^\infty \ip{x}{e_n} e_n \in \cl \spn \{e_n: n \in \mathbb{N}\}
\end{equation*}
for all \(x \in \hilbert\text{;}\) by Theorem 5.4.2 this series equals \(x\text{,}\) and so \(\hilbert = \cl \spn \{e_n: n \in \mathbb{N}\}\text{.}\)
\((3) \Rightarrow (1)\text{:}\) Suppose that \((e_n)\) is not complete, which is to say that there exists some non-zero vector \(y\) with \(y\perp e_n\) for all \(n\text{.}\) Then \(\norm{y} \neq 0\text{,}\) but \(\ip{y}{e_n} = 0\) for all \(n\) and so
\begin{equation*}
\sum_{n=1}^\infty \abs{\ip{y}{e_n}}^2 = 0,
\end{equation*}
which contradicts \((3)\text{.}\)
\((2) \Rightarrow (1)\text{:}\) Suppose that \((2)\) holds. Let \(x \in \hilbert\) be any vector that is orthogonal to every \(e_n\text{.}\) Now construct the set of vectors orthogonal to \(x\text{:}\)
\begin{equation*}
\M = \{y: \ip{x}{y} = 0\}.
\end{equation*}
Linearity of the inner product gives that \(\M\) is a subspace of \(\hilbert\text{,}\) and continuity of the inner product gives that \(\M\) is closed. Also, by assumption, every \(e_n \in \M\text{,}\) and thus \(\cl \spn\{e_n\} \subseteq \M\text{,}\) and so (2) implies that \(\hilbert = \M\text{.}\) But then in particular \(x \in \M\text{,}\) and so \(\ip{x}{x} = 0\) by the definition of \(\M\text{,}\) which gives that \(x=0\text{.}\) Thus, \((e_n)\) is complete.
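The failure of condition (3) for an incomplete sequence can also be seen concretely. Returning to the shifted sequence \(f_n = e_{n+1}\) from earlier (truncated to length \(N\) for this illustrative sketch; the vector \(x\) is an arbitrary choice), the coefficient sum misses \(\abs{x_1}^2\) and so falls short of \(\norm{x}^2\) exactly when \(x_1 \neq 0\text{.}\)

```python
# The shifted sequence f_n = e_{n+1} fails Parseval: since <x, f_n> = x_{n+1},
# the coefficient sum omits |x_1|^2.

N = 5
x = [2.0, 1.0, -1.0, 0.5, 3.0]

norm_sq = sum(t * t for t in x)        # ||x||^2 = 15.25
coeff_sq = sum(t * t for t in x[1:])   # sum of |<x, f_n>|^2, skipping x_1
print(norm_sq - coeff_sq)  # 4.0, which is exactly |x_1|^2
```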
While most of the commonly encountered Hilbert spaces possess complete orthonormal sequences, some do not. Fortunately, similar statements to the above apply to orthonormal systems as well, though working with uncountably indexed sets is not something that most readers will encounter until graduate real analysis. It will be sufficient for us to mostly restrict our attention to the rather more easily conceptualized case of sequences.
Definition 5.4.4.
A Hilbert space is called separable if it contains a complete orthonormal sequence (indexed by \(\mathbb{N}\) or finite).
In the same way that we know from linear algebra that every (complex) finite dimensional Hilbert space is isomorphic to
\(\C^n\text{,}\) so that we think of “one” vector space of a given dimension with different labelings (see
Section 1.3), we have already met
the one separable Hilbert space of infinite dimension, namely
\(\ell^2\) (and at last our suspicions are confirmed). The next set of arguments will make this rigorous.
Definition 5.4.5.
A mapping \(U: \hilbert \to \K\) between Hilbert spaces \(\hilbert, \K\) is a unitary operator if it is linear and bijective and preserves inner products. That is, \(U\) satisfies
\begin{equation*}
\ip{Ux}{Uy} = \ip{x}{y}
\end{equation*}
for all \(x, y \in \hilbert.\)
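As a small sketch of this definition (the rotation angle and test vectors are arbitrary choices for illustration), the coordinate map sending \(x\) to its Fourier coefficients against a rotated orthonormal basis of \(\mathbb{R}^2\) preserves inner products, anticipating the construction in the theorem below.

```python
# The coordinate map U(x) = (<x, e_1>, <x, e_2>) against a rotated
# orthonormal basis of R^2 preserves inner products.
import math

t = 1.1  # an arbitrary rotation angle
basis = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

def ip(u, v):
    return sum(a * b for a, b in zip(u, v))

def U(x):
    """Coordinate map: x -> (<x, e_1>, <x, e_2>)."""
    return [ip(x, e_n) for e_n in basis]

x, y = [1.0, 2.0], [-3.0, 0.5]
print(ip(U(x), U(y)), ip(x, y))  # equal up to rounding
```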
Theorem 5.4.6.
Let \(\hilbert\) be a separable Hilbert space. Then either \(\hilbert\) is isomorphic to \(\C^n\) for some \(n \in \mathbb{N}\) or \(\hilbert\) is isomorphic to \(\ell^2\text{.}\)
Proof.
Suppose that \(\hilbert\) contains a finite complete orthonormal sequence \(e_1, \ldots, e_n\text{.}\) For any \(x \in \hilbert\text{,}\) form the error vector
\begin{equation*}
y = x - \sum_{i=1}^n \ip{x}{e_i}e_i.
\end{equation*}
Linearity of the inner product gives \(y \perp e_i\) for all \(i\text{,}\) and so completeness gives \(y=0\text{.}\) Thus \(e_1, \ldots, e_n\) is an orthonormal basis for \(\hilbert\text{.}\) Now define the operator \(U: \hilbert \to \C^n\) by
\begin{equation*}
U(\la_1 e_1 + \la_2 e_2 + \ldots + \la_n e_n) = (\la_1, \ldots , \la_n).
\end{equation*}
Then \(U\) is linear and bijective. From the fact that \(\norm{x}^2 = \sum_{i=1}^n \abs{\ip{x}{e_i}}^2\text{,}\) we infer that \(\norm{Ux} = \norm{x}\) for all \(x \in \hilbert.\) Thus \(U\) is a unitary operator, and so \(\hilbert\) is isomorphic to \(\C^n\text{.}\)
On the other hand, suppose that
\(\hilbert\) contains a complete orthonormal sequence
\((e_n)_{n \in \mathbb{N}}\text{.}\) Define the operator
\(U: \hilbert \to \ell^2\) by
\(Ux = (\la_n)_{n\in \mathbb{N}}\) where
\(\la_n = \ip{x}{e_n}\text{.}\) By
Theorem 5.4.3 (3),
\(Ux \in \ell^2\) and moreover
\(\norm{Ux} = \norm{x}\) for all
\(x \in \hilbert\text{.}\) As
\(U\) is defined in terms of the inner product, it is clearly linear. If
\((\la_n) \in \ell^2\text{,}\) then by
Theorem 5.2.3 the series
\(\sum \la_n e_n\) converges to some \(x \in \hilbert\text{,}\) and by definition
\(Ux = (\la_n)\text{.}\) Thus
\(U\) is surjective. Hence
\(U\) is a unitary operator, and so
\(\hilbert\) is isomorphic to
\(\ell^2\text{.}\)
As in undergraduate linear algebra, one might feel a bit of a letdown here. However, just as in finite dimensions, taking different views of the same space provides valuable insight into the objects being studied. In finite dimensions, for example, the space of polynomials of degree at most \(n\) is isomorphic to \(\C^n\) but is also a set of functions with properties that interact with the geometry of \(\C^n\text{.}\) Both the vector space properties and the functional properties are important - what the isomorphism provides is a set of properties that the space carries by relation with Euclidean space. In the same way, while all separable Hilbert spaces can be thought of as \(\ell^2\) in some sense, often what we are interested in is the relationship between the functions themselves as functions and the geometry induced by their Fourier (coordinate) representations.