Section 5 Interpolation
Subsection 5.1 Naive polynomial interpolation
Given a set of points, say \(P(x_1, y_1), Q(x_2, y_2), R(x_3, y_3)\text{,}\) we say that a function \(f\) interpolates \(P, Q, R\) if the graph of \(f\) passes through each prescribed point - that is, \(f(x_i) = y_i\) for each \(i = 1, \ldots 3\) in this case. The function \(f\) can be used to predict the values of \(y\) for values of \(x\) that fall in between these points. (Predicting points outside of the data is called extrapolation.)
A first approach to this problem might take advantage of the following fact. Let \(p_1, \ldots, p_n\) be non-collinear. Then there is a unique polynomial of degree \(n -1\) that interpolates the data.
Theorem 5.1.
So suppose that we are given the points \((1,1), (2,-3), (5,10)\text{.}\) A quick plot shows that they are clearly non-collinear. Thus, there should exist a degree two polynomial that interpolates the points, which will have the general form
The unknowns we need to find are the coefficients \(a, b, c\text{.}\) So we'll plug in the known data and get a system of equations.
We can solve the system using matrices.
Doing this by hand would be potentially very time consuming, particularly if we had to work with many points, not just three. Our result says that the approximate polynomial that interpolates the data is
Note that this polynomial is unique (up to approximation error). That is, any other technique that finds a degree two polynomial through these three points will find the same result.
In the next few sections, we will discuss two more interpolation techniques, one that is very easy for humans to understand (and evaluate with a little modification in certain cases) and one that is easy to program into a computer technique.
Subsection 5.2 Lagrange interpolation
We'll begin with a method due to Lagrange that makes it very easy to write down by hand what the interpolating polynomal is (but doesn't give it in a computationally efficient form). Still, Lagrange interpolation is based on fundamental observations about the graphs of functions.
Suppose that we are given the points \((1,0), (2,0), (5,0)\) and we are asked to provide an interpolating polynomial of degree two (these points are collinear, but they very soon won't be). One obvious choice is to form the polynomial
This polynomial is not a unique polynomial of degree two, but it certainly interpolates them. We are going to use the idea of interpolating roots, as we've done here, to build interpolating functions for data off of the \(x\)-axis.
This exercise can be made slightly harder by allowing one of the points to move off of the \(x\)-axis. Now consider the points \((1,0),(2,0), (5,10)\text{.}\) Ignoring the third point for the moment, we can still interpolate \((1,0), (2,0)\) in the same way as before: with \(p(x) = (x -1)(x-2)\text{.}\) Maybe we're lucky and \(p\) also passes through \((5,10)\text{.}\) We can check -- \(p(5) = (5-1)(4-1) = 12\text{.}\) That is, \(p\) misses \((5,10)\text{.}\)
What we need is a way to adjust \(p\) that keeps the function passing through \((1,0)\) and \((2,0)\text{.}\) We might notice that every function of the form \(f(x) = c(x-1)(x-2) = c\cdot p(x)\) has this property, so let's choose a value of \(c\) that makes \(f\) pass through \((5,10)\text{.}\)
So \(f(x) = \frac{10}{12} p(x) = \frac{10}{12} (x-1)(x-2)\) interpolates the data.
Now, notice that 10 was the value we wanted to be output with input 5, and that \(p(5) = 12\text{.}\) In fact, if \((5,10)\) had been given the name \((x_3, y_3)\text{,}\) then the value of \(c\) would have been \(c = \frac{y_3}{p(x_3)},\) and the interpolating function for \((1,0), (2,0)\) and \((x_3, y_3)\) would have been
We can modify this further: given initial data \((x_1, 0), (x_2, 0), (x_3, y_3)\text{,}\) define
and
\(f\) is the interpolating function for the data.
We're finally ready to tackle the full problem: using this approach to find the interpolating polynomial for \((1,1), (2, -3), (5,10)\text{.}\) We'll approach the problem in three pieces: First, construct \(f_3(x)\) for the points \((1,0), (2,0), (5,10)\) using the ideas from the last section.
Second, construct \(f_2(x)\) for the points \((1,0), (2, -3), (5,0).\) Since \(p_2(x) = (x-1)(x-5)\text{,}\) we get \(c = \frac{-3}{p_2(2)} = \frac{-3}{-3} = 1\text{.}\) So
interpolates this data.
For a third step, construct \(f_1(x)\) for the points \((1,1), (2,0), (5,0)\text{.}\) In this problem, \(p_1(x) = (x-2)(x-5)\text{,}\) and so \(c = \frac{1}{p(1)} = \frac{1}{4}\text{.}\) Thus,
What follows is a plot of the three solutions to the subproblems.
You should notice that each parabola passes through one of the real points and two of the substituted roots. We are at the magic step. Let
What happened? Let's look at \(f\) more closely.
Notice that for a given input for one of the data points, only one term is non-zero, and that term has been built to evaluate to the correct \(y\)-value. That is,
Furthermore, \(f\) must be unique, because \(f\) is a degree two polynomial. (In fact, \(f\) is the same function from the previous section if you multiply it out.)
So what is the general procedure? Start with a list of interpolants, \((x_1, y_1), \ldots, (x_n, y_n)\text{.}\) For each \(i\text{,}\) let
Then define \(f_i\) to be
and finally the Lagrange interpolating polynomial to be
Subsection 5.3 Newton interpolation
Lagrange interpolation is easy to describe and relatively easy to write out by hand, even for many point, with some practice. However, the computation of the Lagrange interpolating polynomial doesn't easily lend itself to the structure of computer programs. We would like a method for writing down an interpolating polynomial that can be performed in a straightforward way in a recursive fashion. The approach that we present here is called Newton interpolation. Of course, the result will be the same, as interpolating polynomials of degree \(n-1\) are unique for a given set of \(m\) points.
Suppose that we're given two points, \((x_0, y_0), (x_1, y_1)\text{.}\) The lowest degree polynomial \(f\) through the first point is the horizontal line \(f(x) = y_0 = f(x_0)\text{.}\) The lowest degree polynomial \(f\) through the two points is
We call the quantities \(y_0\) and \(\frac{y_1 - y_0}{x_1 - x_0}\) divided differences. So now lets see what happens with three points.
Suppose that we're given three points to interpolate, \((x_0,y_0), (x_1, y_1), (x_2, y_2)\text{.}\) We claim that we can construct a unique quadratic polynomial that interpolates the data of the form
for some as yet unknown constants \(b_0, b_1, b_2\) that depend on the points. (You can easily show that for noncollinear points, you get a nonsingular coefficient matrix for the resulting system, which implies a unique solution.)
To find the values of the coefficients, we'll plug in our data points.
and so \(b_0 = f(x_0) = y_0\text{.}\) Moving to \(x_1\text{,}\)
Notice that so far, we've exactly reproduced the line connecting the first two points. Now let's look at \(b_2\text{.}\)
where the last step requires a bit of non-obvious algebra, but provides a more useful form.
It turns out that this process can be repeated for additional points, though one can imagine the formula for the next step is quite complicated to express in \(x_i, y_i\text{.}\) So we introduce a notation that will make this process easy to write in recursive form. The \(m\)th divided difference is given by the recursive formula
For example, in this notation,
and you should convince yourself that the expression for \(f[x_2, x_1, x_0]\) agrees with our previous computation.
We want to use this approach because we'd prefer our computations to be purely in terms of array entries, not the application of general formulas. As an example, we will trace through the example of computing the coefficients of the Newton polynomial for the points \((1,1), (2, 3), (5,10)\) in code.
The code above should provide an example of how to implement the recursion without having to rely on increasingly complex formulas to compute the coefficients. As a goal, one could try to produce all of the data in the matrix “data” in one loop, instead of the many loops that I have (suggestively) used. One could also write a much more revealing computation of the final polynomial.
Subsection 5.4 Problems with polynomials
Polynomial interpolation has a serious drawback in many cases: high degree polynomials can behave wildly between points in a way that doesn't reflect the expected behavior of the underlying function. To see this, consider the function
which is a typical example of a rational function with no vertical asymptotes.
We will sample this function and then attempt to reconstruct it using polynomial interpolation. Suppose that we choose six equally spaced points from \(f\text{.}\)
Now suppose we find the unique polynomial that goes through these six points. (The code here uses the command polyfit.) Notice that the interpolating polynomial doesn't appear to behave like the original function at all, oscillating through the end points. Perhaps we can fix the problem by choosing more points.Unfortunately, this seems to have exacerbated the problem. The polynomial starts to oscillate wildly and to grow sharply between nodes near the outside of the domain. It turns out that this behavior is common in high degree polynomial interpolation of equally-spaced points. Replacing a complicated function with a polynomial interpolant isn't going to be as easy as sampling equidistant points, and we should expect that similar bad behavior is going to crop up in data sets with many points if we use a unique polynomial fit.
There are many approaches to dealing with this undesirable behavior. In the next section, we'll talk about building smooth curves that pass through all of the points by considering them three at a time. In this section, we should at least be aware of better sampling methods that underlie much of modern approximation theory. Note that this is only a flavor - approximation theory is a huge field with myriad applications and techniques.
One way that we can try to make a better approximating polynomial for a function is to sample more points near the edge - this should allow us to control the bad behavior that seems to occur there. But how should we pick the points? For various reasons, it turns out that a natural choice for sampling nodes (that is, the \(x\)-values that we'll use to sample the function we're trying to approximate) is the so-called Chebyshev nodes, which correspond to the \(x\) coordinates of equally spaced points on a unit circle.
The following code will compare high degree approximations of \(f\) using equally spaced and Chebyshev nodes
Subsection 5.5 Spline interpolation
Instead of trying to force a high degree polynomial to fit through many points, if we want an interpolating function that has smooth properties, we could consider a piecewise approach instead. That is, instead of a single formula that interpolates all of the points, we'll construct small segments that join together into a smooth(ish) function that changes definition as we move down the set of data points. Piecewise interpolation is usually done in early math classes with a ruler connecting points with lines - a more sophisticated name for this is linear interpolation. Because a given line only has two parameters (slope and intercept), the best we can expect is to meet two conditions - that is, the line passes through each of two points.
This is reflective of a general principle in mathematics that gets used but not often mentioned throughout college level courses - to meet one condition, an equation or set of equations needs one free parameter. In this case, since we're working with simultaneous equations, essentially we're using the invertible matrix theorem from linear algebra to guarantee a unique solutions.
What we're going to present is a method that breaks a set of data points into consecutive groups of two points. We'll need an interpolating function piece with at least two free parameters that can be chosen. Since we want the curve to be differentiable, we'll need each segment to meet the next with the same slope - that is, we need the first derivatives to match every time we change segments. With \(n+1\) points, that means we have \(n-1\) points that are connections (that is, not the endpoints of the data set). Since we're dictating a condition on derivatives, we know that we'll need at least three parameters. In order to make the changes between the segments curve naturally (which is important in many physical applications), we also want the second derivatives to match. (This is essentially dictating that the curvatures of the segments match, or more geometrically that there is an osculating circle where two pieces meet.) So each segment is actually dealing with four restrictions, and thus we need a function with four free parameters - a cubic polynomial fits the bill!
Where are we at? Suppose that we have \(n + 1\) points to interpolate. Then there are \(n\) segments, which requires \(n\) functions. Each segment has an associated cubic polynomial with 4 parameters, so there are \(4n\) parameters available. How many restrictions are there?
- Two points to interpolate for each segment, so \(2n\) conditions.
- \(n-1\) points at which derivatives match.
- \(n - 1\) points at which second derivatives match.
So far, we have \(4n\) parameters and \(2n + n - 1 + n - 1 = 4n -2\) conditions. Where are the last two conditions going to come from? The endpoints! Since we have two available conditions to impose, we'll get a nice square system if we impose them on the endpoints. There are a couple of common choices for how to do this. The first is called a natural spline, which is the case where the function has no curvature at the endpoints - that is, it's locally linear:
In some physical applications, the designer of the spline might want to have prescribed slopes at the endpoints instead. That is,
Let's take a look at a small example to see how spline fitting works in practice. Suppose we're given three points, \((1,2), (2,4), (3,1)\text{.}\) Note that these are arranged in order of ascending \(x\) values. For each pair of points, we need a cubic spline - that is functions
We'll organize the equations as above. First, we have the interpolating conditions (that is, the functions need to pass through the points.)
Second, we want first derivatives to match at the transition point at \((2,4)\text{.}\)
Third, we want second derivatives to match at \((2,4)\text{.}\)
Finally, we impose the endpoint condition for a natural spline, which is that \(f''(x_1) = f''(x_n) = 0\text{:}\)
There are various algebraic approaches to solving splines that we will not consider here. Instead, we will try to organize these equations into a matrix form.
At this point, we can solve the system using octave and row reduction. As long as there are no three consecutive points that are collinear, this system will have a unique solution (that is, the coefficient matrix is invertible).
Next is a quick code demonstration of the addition of one more points, say at \((4,10)\text{.}\) You should look for the pattern in the equation sets, as this should give a clue about how to implement a general routine. By way of comparison, we will also plot the unique cubic that passes through the four points. You should note that the purple graph makes wider oscillations between the points than the multi-colored spline fit.