Wednesday, July 24, 2013

The Heat Equation and Fourier Series

Disclaimer: As of this post, TeX the World and other LaTeX rendering software will no longer be required to view LaTeX code in blog posts. I have switched to MathJax, which is supported by Blogger and integrates far better than TeX the World ever did. No more nasty white boxes around LaTeX code!

As anyone in engineering, mathematics or the mathematical sciences knows, the name Fourier comes up a lot in various forms. Among my friends and colleagues, it is no secret that I am a huge fan of Fourier theory; in fact, I will be writing a blog post on some work I have done in it in the not-too-distant future. Often the first place it is encountered is a course in partial differential equations, as a means of solving boundary value problems for Laplace's equation, the wave equation and the heat equation on rectangular domains. This is, in fact, what Fourier series were used for in the early days of partial differential equations.

Fourier series are named after Joseph Fourier, a French mathematician and physicist who lived from 1768 to 1830. It was only in the nineteenth century that calculus was placed on rigorous foundations, beginning with Cauchy in the 1820s and completed later by Weierstrass and others. Prior to that, the notion of a limit rested on extremely shaky ground, and many mathematicians scoffed at calculus (or "the calculus," as it was called at the time and still is by some authors) because it lacked rigor; Newton, one of its inventors, was criticized not only for putting forth a very hand-waved piece of mathematics but also for more political reasons surrounding "the calculus." Fourier, who published his work on what are now known as Fourier series in 1822, did not find himself amidst as much criticism and debate. Even so, much mystery surrounded Fourier series for almost a century and a half, despite extensive efforts by the world's most brilliant mathematicians.

I will not attempt to give a full course on Fourier theory, as this can be found in many books and would require a lot of building blocks to do in full detail. Instead, I am going to give a bird's eye view of Fourier series and comment on where things can go wrong.
I will then discuss the inherent limitations of Fourier series and how it leads into Fourier transform theory.

In his study of the transfer of heat, Fourier discovered that heat transfer on a line obeyed the following equation

$$ \frac{\partial v}{\partial t} = k\frac{\partial^2 v}{\partial x^2}, $$

where \(v \) is the temperature and \(k \) is a constant known as the thermal diffusivity, which can be calculated for a given material. This equation cannot be solved exactly without additional information: much like in the case of ordinary differential equations, initial values, boundary values, or some mixture thereof must be given in order to pin down a solution of a partial differential equation.

Since this equation is supposed to model temperature flow, it is good to consider physical examples. Physically, we can imagine a thin wire of length \(L\) that is heated up - perhaps a straight filament in a light bulb - whose ends are each in contact with something that doesn't change much in temperature. So at the endpoints, the temperature is roughly fixed at \( T_0 \), giving \( v(0,t) = T_0 = v(L,t) \). Working with the constant \(T_0\) requires a little more finesse, so instead we define the new temperature function \(u(x,t) = v(x,t) - T_0 \). It is clear from the properties of derivatives that \( u \) also solves the heat equation, but now with boundary conditions \( u(0,t) = 0 = u(L,t) \).

Since we have two spatial derivatives, we need two boundary values to partially specify the solution, but we also need the initial temperature distribution of the thin wire to describe how the temperature will change over time. It stands to reason that if two identical filaments were initially heated in different ways, their temperature distributions would evolve differently, so we must also specify \( u(x,0) = f(x) \). It turns out that these three pieces of information are all that is needed to specify the solution \(u\), and we now turn our attention to finding it.
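Before solving the equation analytically, it can help to see it solved numerically. Below is a minimal sketch (assuming NumPy; the explicit finite-difference scheme, grid sizes, and sample initial profile are my own illustrative choices, not from the post) that marches the heat equation forward in time with the boundary conditions \(u(0,t) = u(L,t) = 0\), and compares the result against the known exact solution for this particular initial profile:

```python
import numpy as np

k, L = 1.0, 1.0          # thermal diffusivity and wire length (illustrative)
nx, nt = 51, 2000        # grid points in space, steps in time
dx = L / (nx - 1)
dt = 0.4 * dx**2 / k     # below the classical stability limit dt <= dx^2/(2k)
x = np.linspace(0.0, L, nx)
u = np.sin(np.pi * x / L)  # sample initial profile f(x), zero at both ends

for _ in range(nt):
    # central second difference approximates u_xx; endpoints stay pinned at 0
    u[1:-1] += k * dt / dx**2 * (u[2:] - 2.0 * u[1:-1] + u[:-2])

# for this f, the exact solution is sin(pi x/L) e^{-pi^2 k t/L^2}
t = nt * dt
exact = np.sin(np.pi * x / L) * np.exp(-np.pi**2 * k * t / L**2)
print(np.max(np.abs(u - exact)))  # small: the scheme tracks the exact decay
```

The stability restriction \( dt \le dx^2/(2k) \) on the time step is a classical fact about this explicit scheme; violate it and the numerical solution blows up.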

Due to the stunning lack of general solutions to partial differential equations (which persists to this day!), Fourier took some liberty with mathematics and made some simplifying assumptions. First he postulated that the time and spatial dependence could be separated, so that \(u(x,t) = X(x)T(t)\). Under this substitution, the partial differential equation becomes

$$ X(x)T'(t) = k X''(x)T(t). $$

If we divide the entire equation by \(kX(x)T(t)\), we get

$$ \frac{1}{k}\frac{T'(t)}{T(t)} = \frac{X''(x)}{X(x)}. $$

It is clear that the left side is a function of time only and the right side is a function of space only, and since they are equal, both must be constant (varying \(t\) changes only the left side while the right stays fixed, which is impossible unless neither side varies at all). So we have that

$$ \frac{1}{k}\frac{T'(t)}{T(t)} = \frac{X''(x)}{X(x)} = -\lambda. $$

Rewriting this slightly, we have that

$$ X''(x) + \lambda X(x) = 0 $$

whose general solution is \( X(x) = Ae^{\sqrt{-\lambda}x} + Be^{-\sqrt{-\lambda}x} \), as can be checked by differentiating twice. The question then becomes: what values may \( \lambda \) take? We want it to be real, so there are three regimes to consider: \(\lambda < 0\), \(\lambda = 0\) and \(\lambda > 0\). To answer this question, we return to our boundary conditions.

Case 1: \(\lambda < 0 \).

$$ u(0,t) = T(t)X(0) = T(t)(A + B) = 0 $$

We could have \( T(t) = 0 \), which gives \( u(x,t) = 0 \) for all times, but this is the trivial solution, so we discard it. Thus \( A = -B \). Making use of the other boundary condition we have

$$ u(L,t) = T(t)(Ae^{\sqrt{-\lambda}L} + Be^{-\sqrt{-\lambda}L}) = 0.$$

Substituting \( A = -B \), this becomes \( T(t)A(e^{\sqrt{-\lambda}L} - e^{-\sqrt{-\lambda}L}) = 0 \), which forces \( A = B = 0 \): for \(\lambda < 0\) the exponent \(\sqrt{-\lambda}L\) is real and nonzero, so the two exponentials are unequal. Since this is again the trivial solution, we discard it. The case \(\lambda = 0 \) is handled in the same way (except that now \(X(x) = Ax + B\)). So we conclude that \(\lambda > 0 \).

For simplicity (to avoid square roots like we just saw), I will let \(\lambda = \eta^2\). We have the following differential equation

$$ X''(x) + \eta^2 X(x) = 0.$$

The solutions to this differential equation are given by sines and cosines. We write the general solution as

$$ X(x) = A\cos(\eta x) + B\sin(\eta x).$$

Making use of our boundary conditions, we get

$$ X(0) = 0 = A\cos(0) + B\sin(0) = A.$$

By making use of just one of the boundary conditions, we have substantially reduced our solution space to sines only. This makes intuitive sense: since \(\cos(0) = 1\), any cosine contribution would make the solution nonzero at \( x = 0 \). To determine one of \( B \) or \( \eta \), we must make use of the other boundary condition. This gives us

$$ X(L) = 0 = B\sin(\eta L). $$

Unfortunately, we cannot determine \( B \) uniquely from this (which means we will have to determine \( B \) from our initial condition!), but we can determine what values \( \eta \) can take. Since sine vanishes at integer multiples of \(\pi\), we have that

$$\eta L = n\pi \Rightarrow \eta = \frac{n\pi}{L},$$

where \(n\) is an integer (we may take \(n \geq 1\), since \(n = 0\) gives the trivial solution and negative \(n\) merely flips the sign of \(B\)). So the general solution satisfying the boundary conditions is given by

$$X(x) = B\sin\left(\frac{n\pi}{L}x\right).$$

We have nearly arrived at the genius of Fourier. He noted that any linear combination of functions satisfying the boundary conditions again satisfies the boundary conditions. Thus the general solution for \(X\) is given by a linear combination of sines:

$$X(x) = \sum_n B_n \sin\left(\frac{n\pi}{L}x\right).$$

The equation for \( T \) reads \( T'(t) = -\lambda k T(t) \), whose solution is \( T(t) = e^{-\lambda k t} \) (the multiplicative constant can be absorbed into the \( B_n \)). Since \(\lambda = n^2\pi^2/L^2\), the general solution is

$$ u(x,t) = \sum_n B_n \sin\left(\frac{n\pi}{L}x\right)e^{-\frac{n^2\pi^2 k}{L^2}t}.\tag{1}$$
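One immediate consequence of (1): the exponential factor damps mode \(n\) at rate \(n^2\pi^2 k/L^2\), so high-frequency components of the initial data die off almost instantly, which is why solutions of the heat equation smooth out so quickly. A quick numerical check (assuming NumPy; the values of \(k\), \(L\), and \(t\) are arbitrary illustrative choices):

```python
import numpy as np

# decay factor e^{-n^2 pi^2 k t / L^2} of mode n at a fixed time t
k, L, t = 1.0, 1.0, 0.1  # illustrative values
factors = {n: np.exp(-n**2 * np.pi**2 * k * t / L**2) for n in (1, 2, 5)}
print(factors)  # mode 1 survives; mode 5 is essentially gone
```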

We still have not determined the \( B_n \), but we will do so shortly using our initial condition. In the case of a finite sum of sines, the above expression was well understood in Fourier's time, but he posited that for certain initial conditions the series would have to be infinite, and infinite trigonometric series were not well understood by any means. What Fourier noted is that cosines and sines are orthogonal over the interval \( [0,2\pi]\): for integers \( n, m \geq 1 \),

$$ \begin{eqnarray}
\int_0^{2\pi}\sin(nx)\sin(mx)dx &=& \pi\delta_{nm}\\
\int_0^{2\pi}\cos(nx)\cos(mx)dx &=& \pi\delta_{nm} \\
\int_0^{2\pi}\cos(nx)\sin(mx)dx &=& 0
\end{eqnarray}$$
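These relations are easy to verify numerically. Here is a sketch assuming NumPy; the grid resolution and the sample indices \(n = 3\), \(m = 5\) are arbitrary choices:

```python
import numpy as np

# left Riemann sum over one full period [0, 2*pi]; for periodic
# integrands on a uniform grid this is extremely accurate
grid = np.linspace(0.0, 2.0 * np.pi, 20001)
dx = grid[1] - grid[0]

def inner(f, g):
    return np.sum(f(grid[:-1]) * g(grid[:-1])) * dx

n, m = 3, 5
print(inner(lambda t: np.sin(n * t), lambda t: np.sin(m * t)))  # ~ 0
print(inner(lambda t: np.sin(n * t), lambda t: np.sin(n * t)))  # ~ pi
print(inner(lambda t: np.cos(n * t), lambda t: np.sin(m * t)))  # ~ 0
```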

where \(\delta_{nm} \) is the Kronecker delta, defined to be \( 1 \) if \( n = m \) and \( 0 \) if \( n \neq m \). What Fourier did (haphazardly, I might add) was the following. First, he allowed the sum to be infinite and claimed that the infinite sum equals \( u \), even though he could not prove that the infinite sum even made sense. He then made use of this orthogonality property to extract the coefficients \( B_n \) from the initial condition. We have that

$$ u(x,0) = f(x) = \sum_n B_n\sin\left(\frac{n\pi}{L}x\right).$$

If we integrate both sides of this equation against \(\frac{2}{L}\sin\left(\frac{m\pi}{L}x\right) \) over the interval \([0,L]\) we have

$$\frac{2}{L}\int_0^L f(x)\sin\left(\frac{m\pi}{L}x\right) dx = \frac{2}{L}\int_0^L\left(\sum_n B_n\sin\left(\frac{n\pi}{L}x\right)\right)\sin\left(\frac{m\pi}{L}x\right)dx. $$

Making use of linearity, he assumed he could pass the integral inside the (possibly infinite) sum to extract the coefficients, though nothing justified this beyond intuition and the fact that it works for a finite number of terms. Commuting the sum and the integral, he obtained the relation

$$ B_n = \frac{2}{L}\int_0^L f(x)\sin\left(\frac{n\pi}{L}x\right)dx.$$
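With the coefficient formula in hand, the whole recipe can be sketched numerically. Assuming NumPy, and taking the arbitrary sample profile \( f(x) = x(L-x) \) (which satisfies the boundary conditions), the following computes the \( B_n \) by quadrature and rebuilds \( f \) from the truncated series:

```python
import numpy as np

L, N = 1.0, 50                    # interval length and truncation (illustrative)
x = np.linspace(0.0, L, 2001)
dx = x[1] - x[0]

def f(s):
    return s * (L - s)            # sample initial profile, zero at both ends

def B(n):
    # B_n = (2/L) * integral_0^L f(x) sin(n pi x / L) dx, via a Riemann sum
    return 2.0 / L * np.sum(f(x[:-1]) * np.sin(n * np.pi * x[:-1] / L)) * dx

series = sum(B(n) * np.sin(n * np.pi * x / L) for n in range(1, N + 1))
print(np.max(np.abs(series - f(x))))  # small: the series reconstructs f
```

Replacing each \(\sin(n\pi x/L)\) in the reconstruction with \(\sin(n\pi x/L)\,e^{-n^2\pi^2 k t/L^2}\) gives the temperature at a later time \(t\), exactly as in (1).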

Nothing in this last bit of analysis had anything to do with the heat equation per se; what Fourier had found was a way to relate a function to a (possibly infinite) sum of sines and cosines. Writing it out with complex exponentials, he claimed that

$$f(x) = \sum_{n=-\infty}^{\infty} \left(\frac{1}{L}\int_0^L f(y)e^{-i\frac{2\pi n}{L}y}dy\right)e^{i\frac{2\pi n}{L}x}.\tag{2}$$

His claim is easily seen to be false in general, and one case where it breaks down is when \( f \) is discontinuous. I will not prove it here, but it is a known result that in the presence of a jump discontinuity, the Fourier series of the function (given in (2)) overshoots the function by about nine percent of the size of the jump. This is known as the Gibbs phenomenon. It can be seen from the following graphs of the Fourier series of a step function (shown with an increasing number of Fourier terms):
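The overshoot is also easy to check numerically. In the sketch below (assuming NumPy; the step and the term counts are my choices), we sum the sine series of the constant function \(1\) on \((0,L)\), whose odd periodic extension is a square wave with jumps of size \(2\) and whose sine coefficients are the standard \(4/(n\pi)\) for odd \(n\):

```python
import numpy as np

L = 1.0
x = np.linspace(0.0, L, 100001)

def partial_sum(N):
    # sine series of f = 1 on (0, L): B_n = 4/(n*pi) for odd n, 0 for even n
    return sum(4.0 / (n * np.pi) * np.sin(n * np.pi * x / L)
               for n in range(1, N + 1, 2))

for N in (9, 99, 999):
    print(N, partial_sum(N).max())  # the peak approaches ~1.179, never 1
```

The maximum of the partial sums converges to \((2/\pi)\,\mathrm{Si}(\pi) \approx 1.179\), an overshoot of about \(0.179\), i.e. roughly nine percent of the jump of \(2\).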

Even though the Fourier series hugs the function more tightly near the jump as more terms are added, the overshoot is always present. Fourier was not aware of this phenomenon when he developed his series representation, but that does not mean Fourier series are without merit: there is more than one notion of convergence one might be interested in. What we see above is that a Fourier series does not necessarily converge pointwise, meaning the value of the Fourier series at a given point does not have to equal the value of the function at that point. But if you look at the plots more closely, you will see that the area between the oscillations and the function is getting much smaller. In fact, we can say the following. Let \( S_Nf \) be the truncated Fourier series over an interval \([0,L]\) of a function \( f \) with the property that \( \int_0^L |f(x)|^2 dx < \infty\); then

$$\lim_{N\rightarrow\infty} \int_0^L|S_Nf(x)-f(x)|^2dx = 0.$$

This is known as \(L^2\) convergence. Even if the Fourier series does not equal the function at every point, the two closely resemble each other in this averaged sense.
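This can be observed numerically. The following sketch (assuming NumPy; the truncation levels are illustrative choices) computes the \(L^2\) error of truncated sine series for the discontinuous step \(f = 1\) on \((0,L)\), whose sine coefficients are \(4/(n\pi)\) for odd \(n\):

```python
import numpy as np

L = 1.0
x = np.linspace(0.0, L, 50001)
dx = x[1] - x[0]
f = np.ones_like(x)

errs = []
for N in (10, 100, 1000):
    S = sum(4.0 / (n * np.pi) * np.sin(n * np.pi * x / L)
            for n in range(1, N + 1, 2))
    errs.append(np.sum((S - f) ** 2) * dx)  # Riemann sum for the L^2 error
    print(N, errs[-1])
```

The pointwise overshoot never disappears, but it lives on a shrinking set, so its contribution to the squared area vanishes as \(N\) grows.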

The strongest notion of convergence is uniform convergence, which can be thought of as asking the following: "If I give you a small number (greater than 0) and you make a little tube around your function whose radius equals that small number, can you guarantee that every sufficiently long truncated Fourier series lies within that tube?" If the function is discontinuous, the answer is clearly no: pick a number less than nine percent of the jump, and the overshoot pushes the Fourier series outside the tube (see the graphs above).

Uniform convergence implies both pointwise and \( L^2 \) convergence, but the converse is not true. One might wonder how pointwise and uniform convergence differ: if the Fourier series agrees with the function at every point, how could the truncated Fourier series fail to lie within the tube? The answer is a bit technical, but the Fourier series can converge to the function at different rates depending on \(x\), and it may be that no single truncation lies within the tube over the whole domain, even though at each fixed point the series eventually settles down. Uniform convergence demands one truncation, with finitely many terms, that works everywhere at once.

(For the mathematically mature reader, here is the precise definition of uniform convergence. If for every \( \varepsilon > 0 \) there exists an \( N \) such that for all \( n > N \) and all \(x\) in the domain on which the Fourier series is defined,

$$ |S_nf(x) - f(x)| < \varepsilon $$

then the Fourier series is said to converge uniformly.)
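To see the contrast concretely, here is a sketch (assuming NumPy; both sample profiles and their standard sine coefficients are my own illustrative choices) comparing the sup-norm error for the smooth profile \(x(L-x)\), which shrinks with \(N\), against the overshoot for the step \(f = 1\), which does not:

```python
import numpy as np

L = 1.0
x = np.linspace(0.0, L, 50001)

def sine_sum(coeff, N):
    return sum(coeff(n) * np.sin(n * np.pi * x / L) for n in range(1, N + 1))

# standard sine coefficients: 8L^2/(n^3 pi^3) for x(L-x), 4/(n pi) for f = 1
smooth_c = lambda n: 8.0 * L**2 / (n**3 * np.pi**3) if n % 2 else 0.0
step_c = lambda n: 4.0 / (n * np.pi) if n % 2 else 0.0

sups, overs = [], []
for N in (11, 101):
    sups.append(np.max(np.abs(sine_sum(smooth_c, N) - x * (L - x))))
    overs.append(sine_sum(step_c, N).max() - 1.0)
    print(N, sups[-1], overs[-1])  # sup error shrinks; overshoot stays ~0.18
```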

The last outstanding piece of the puzzle in Fourier series theory was finally put in place in 1968, about a century and a half after Fourier first put forth his series. It was proved by Richard Hunt, building on work of Lennart Carleson from two years prior, and is therefore known as the Carleson-Hunt theorem: the Fourier series of any function in \(L^p\) with \(p > 1\) converges to the function almost everywhere (Carleson proved the \(L^2\) case). With this final piece in place, the fundamental theory of Fourier series was settled. However, Fourier analysis does not stop there.

While Fourier series are very nice, I am personally more partial to what is known as the Fourier transform. It is a closely related idea, and it comes from considering some of the limitations of Fourier series. If we return to how Fourier series popped out of an attempt to solve the heat equation, one might immediately note that we restricted ourselves to a finite domain. What would happen if we tried to extend the analysis above to an infinite domain? Is it possible? It turns out that we can, but with some caveats: 1) we cannot form a Fourier series, because a Fourier series implicitly assumes a finite domain; 2) what happens to our boundary terms - are they still relevant? Enter the Fourier transform, which will be motivated from (2). This will be the topic of my next blog post, coming soon.
