
Section 5.1 Geometry in Hilbert space

As we discussed in the introductory chapter, Section 1.3, one of the key ideas of finite dimensional linear algebra is to use a basis for a vector space to move back and forth between the vectors themselves (say \(v \in V\)) and their coordinate representations with respect to that basis. This move lets us consider a vector space of dimension \(n\) as essentially just a renaming of Euclidean space \(\R^n\text{.}\) We are now ready to try to capture this geometry in the context of Hilbert space.

Definition 5.1.1.

Vectors \(x, y\) in an inner product space \(V\) are orthogonal, denoted \(x \perp y\text{,}\) if \(\ip{x}{y} = 0\text{.}\)
A family \((e_\alpha)_{\alpha \in A}\) in \(V - \{0\}\) is called an orthogonal system if \(e_\alpha \perp e_\beta\) when \(\alpha \neq \beta\text{.}\) If further, \(\norm{e_\alpha} = 1\) for all \(\alpha \in A\text{,}\) then \((e_\alpha)_\alpha\) is called an orthonormal system. An orthonormal system is called an orthonormal sequence if it can be indexed by \(\mathbb{N}\text{.}\)
Note that we have not yet defined what it means to be a basis in the context of Hilbert spaces; that remains to be developed later in this chapter. Note also that a system indexed by \(\mathbb{Z}\) can be reindexed in terms of \(\mathbb{N}\text{,}\) and so we can take a typical orthonormal sequence to be indexed by \(\mathbb{N}\) without loss of generality.

Example 5.1.2.

In \(\C^n\) the standard basis vectors constitute an orthonormal system; so does any subset of them.
In \(\ell^2\text{,}\) the standard unit vectors \((e_n)_{n\in\mathbb{N}}\) form an orthonormal sequence, where \(e_n\) has a 1 in the \(n\)th position and 0s elsewhere.
In \(L^2(-\pi,\pi)\text{,}\) one orthonormal sequence \((e_n)_{n \in \mathbb{Z}}\) is given by
\begin{equation*} e_n(t) = \frac{1}{\sqrt{2\pi}} e^{int} \end{equation*}
with respect to the \(L^2\) inner product (2.1.1). An alternative orthonormal sequence in \(L^2(-\pi,\pi)\) is given by the family of functions
\begin{equation*} \frac{1}{\sqrt{2\pi}}, \frac{1}{\sqrt{\pi}} \cos t, \frac{1}{\sqrt{\pi}} \sin t, \frac{1}{\sqrt{\pi}} \cos 2t, \ldots \end{equation*}
These families form the beginnings of Fourier analysis (en.wikipedia.org/wiki/Fourier_analysis), which motivates the following definition. (Compare to Theorem 1.3.1.)
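As a numerical sanity check of the orthonormality claimed in Example 5.1.2, the following sketch approximates the inner products \(\ip{e_m}{e_n}\) for the complex exponentials on \((-\pi,\pi)\text{.}\) The helper names `inner` and `e`, the choice of a midpoint rule, and the number of quadrature steps are illustrative assumptions, not part of the text.

```python
import cmath
import math

def inner(f, g, a=-math.pi, b=math.pi, steps=2000):
    """Midpoint-rule estimate of <f, g> = integral over (a, b) of f(t) * conj(g(t)) dt."""
    h = (b - a) / steps
    total = 0j
    for k in range(steps):
        t = a + (k + 0.5) * h
        total += complex(f(t)) * complex(g(t)).conjugate()
    return total * h

def e(n):
    """Return the function e_n(t) = exp(i*n*t) / sqrt(2*pi)."""
    return lambda t: cmath.exp(1j * n * t) / math.sqrt(2 * math.pi)

# Gram matrix of e_{-2}, ..., e_2 (an arbitrary small sample of indices).
indices = list(range(-2, 3))
gram = [[inner(e(m), e(n)) for n in indices] for m in indices]

# Print the moduli of the entries; an orthonormal family should give
# values near 1 on the diagonal and near 0 elsewhere.
for row in gram:
    print(" ".join(f"{abs(z):.6f}" for z in row))
```

The inner product is taken linear in the first slot and conjugate-linear in the second, matching the way \(\ip{x}{e_n}\) is used in this section.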

Definition 5.1.3.

If \((e_n)\) is an orthonormal sequence in a Hilbert space \(\hilbert\) then for any \(x \in \hilbert\text{,}\) the inner product \(\ip{x}{e_n}\) is the \(n\)th Fourier coefficient of \(x\) with respect to \((e_n)\text{.}\) The Fourier series of \(x\) with respect to \((e_n)\) is the series
\begin{equation*} \sum_{n \in \mathbb{N}} \ip{x}{e_n} e_n. \end{equation*}
At this point, this is only a formal sum in a formal definition. We want to understand the extent to which orthogonal systems can play the role of coordinate systems in finite dimensions, with the ultimate idea of working in coordinates. First, given our definition of orthogonality, we can generalize the Pythagorean theorem.

Checkpoint 5.1.5.

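The basic properties of orthogonal expansions are mainly derived from the following geometric identity. Note that it applies to finite orthonormal systems.

Lemma 5.1.6.

Let \(e_1, \ldots, e_n\) be an orthonormal system in an inner product space \(V\text{,}\) let \(\la_1, \ldots, \la_n\) be scalars, and for \(x \in V\) write \(c_i = \ip{x}{e_i}\text{.}\) Then
\begin{equation*} \norm{x - \sum \la_i e_i}^2 = \norm{x}^2 + \sum \abs{\la_i - c_i}^2 - \sum \abs{c_i}^2. \end{equation*}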

Proof.

\begin{equation*} \ip{\sum \la_i e_i}{\sum \la_i e_i} = \norm{\sum \la_i e_i}^2 = \sum \norm{\la_i e_i}^2 = \sum \abs{\la_i}^2 = \sum \la_i \cc{\la_i}. \end{equation*}
Then
\begin{align*} \norm{x - \sum \la_i e_i}^2 \amp = \ip{x - \sum \la_i e_i}{x - \sum \la_i e_i}\\ \amp= \ip{x}{x} - \sum\la_i\ip{e_i}{x} - \sum\cc{\la_i}\ip{x}{e_i} + \sum \la_i \cc{\la_i}\\ \amp= \norm{x}^2 - \sum \la_i \cc{c_i} - \sum \cc{\la_i} c_i + \sum \la_i \cc{\la_i}\\ \amp= \norm{x}^2 + \sum (\la_i \cc{\la_i} - \la_i \cc{c_i} - \cc{\la_i}c_i + c_i \cc{c_i}) - \sum c_i \cc{c_i}\\ \amp= \norm{x}^2 + \sum (\la_i - c_i)\cc{(\la_i - c_i)} - \sum\abs{c_i}^2\\ \amp= \norm{x}^2 + \sum \abs{\la_i - c_i}^2 - \sum\abs{c_i}^2. \end{align*}
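The final line of this computation can be checked numerically in a small case. The sketch below works in \(\C^3\) with a two-element orthonormal system; the particular vectors `e1`, `e2`, `x`, the scalars `lam`, and the helper names `ip` and `norm_sq` are arbitrary choices made for illustration only.

```python
import math

def ip(x, y):
    """Inner product on C^n, linear in the first slot: sum of x_k * conj(y_k)."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(x, y))

def norm_sq(x):
    """Squared norm ||x||^2 = <x, x>."""
    return ip(x, x).real

# A two-element orthonormal system in C^3 (illustrative choice).
s = 1 / math.sqrt(2)
e1 = [s, s * 1j, 0]
e2 = [s, -s * 1j, 0]
system = [e1, e2]

x = [1 + 2j, 3, -1j]          # an arbitrary vector
lam = [0.5 - 1j, 2 + 0.25j]   # arbitrary scalars lambda_1, lambda_2

# c_i = <x, e_i>
c = [ip(x, e) for e in system]

# The linear combination sum(lambda_i * e_i) and the residual x - sum(lambda_i * e_i).
combo = [sum(l * e[k] for l, e in zip(lam, system)) for k in range(3)]
residual = [xk - sk for xk, sk in zip(x, combo)]

# Left side: ||x - sum(lambda_i e_i)||^2.
lhs = norm_sq(residual)
# Right side: ||x||^2 + sum |lambda_i - c_i|^2 - sum |c_i|^2.
rhs = (norm_sq(x)
       + sum(abs(l - ci) ** 2 for l, ci in zip(lam, c))
       - sum(abs(ci) ** 2 for ci in c))

print(lhs, rhs)  # the two values should agree up to rounding
```

Changing `lam` changes only the middle sum on the right-hand side, since `x` and the `c` values stay fixed.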
Now suppose that \(x\) and the orthonormal system \(e_1, \ldots, e_n\) are fixed. Varying the scalars \(\la_i\) in the expression \(\sum \la_i e_i\) traces out the linear span of the \(e_i\) (as it runs through all linear combinations). Since the \(c_i = \ip{x}{e_i}\) are fixed, Lemma 5.1.6 implies that the quantity \(\norm{x - \sum \la_i e_i}\) is minimized exactly when \(\la_i = c_i\) for \(1 \leq i \leq n\) (which makes the middle term vanish). This gives the following theorem.
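The minimization just described can be illustrated numerically. The sketch below takes \(x(t) = t\) in \(L^2(-\pi,\pi)\) and the orthonormal functions \(e_n(t) = e^{int}/\sqrt{2\pi}\) for \(-2 \leq n \leq 2\text{,}\) computes the coefficients \(c_n = \ip{x}{e_n}\) by a midpoint rule, and compares the squared distance at \(\la_n = c_n\) with the squared distance at a perturbed choice. The function \(x\text{,}\) the index range, the perturbation size, and the helper names `integrate` and `dist_sq` are all assumptions made for this example.

```python
import cmath
import math

STEPS = 4000  # number of midpoint-rule subintervals (illustrative choice)

def integrate(f, a=-math.pi, b=math.pi, steps=STEPS):
    """Midpoint-rule estimate of the integral of f over (a, b)."""
    h = (b - a) / steps
    return sum(f(a + (k + 0.5) * h) for k in range(steps)) * h

def e(n, t):
    """e_n(t) = exp(i*n*t) / sqrt(2*pi)."""
    return cmath.exp(1j * n * t) / math.sqrt(2 * math.pi)

def x(t):
    """The function being approximated (an arbitrary example)."""
    return t

indices = [-2, -1, 0, 1, 2]

# c_n = <x, e_n> = integral of x(t) * conj(e_n(t)) dt.
c = [integrate(lambda t, n=n: x(t) * e(n, t).conjugate()) for n in indices]

def dist_sq(coeffs):
    """Squared L^2 distance from x to sum(coeffs[n] * e_n)."""
    def integrand(t):
        approx = sum(l * e(n, t) for l, n in zip(coeffs, indices))
        return abs(x(t) - approx) ** 2
    return integrate(integrand)

best = dist_sq(c)                            # lambda_n = c_n
perturbed = dist_sq([ci + 0.1 for ci in c])  # every lambda_n shifted by 0.1
norm_x_sq = integrate(lambda t: abs(x(t)) ** 2)

print("distance^2 at lambda = c:      ", best)
print("distance^2 at perturbed lambda:", perturbed)
print("||x||^2 - sum |c_n|^2:         ", norm_x_sq - sum(abs(ci) ** 2 for ci in c))
```

By Lemma 5.1.6 the perturbed distance should exceed the best one by \(\sum \abs{\la_n - c_n}^2\text{,}\) which here is \(5 \times 0.1^2\text{,}\) and the best squared distance should match \(\norm{x}^2 - \sum \abs{c_n}^2\) up to quadrature error.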
We finish with a corollary that resembles (and recovers) the orthonormal expansion of a vector in \(\R^n\text{.}\)

Checkpoint 5.1.9.

In \(L^2(0,1)\) let \(e_0(t) = 1\) and \(e_1(t) = \sqrt{3}(2t -1)\) for all \(t \in (0,1)\text{.}\) Show that \(e_0, e_1\) is an orthonormal system in \(L^2(0,1)\) and show that the polynomial \(y\) of degree 1 which is closest to the function \(x(t) = t^2\) is given by \(y(t) = t - 1/6\text{.}\) What is \(\norm{x - y}\text{?}\)