Fourier Analysis: A Linear Algebra Approach


Clickbait title: My 3Blue1Brown callout post

Nah, Grant (the guy behind the channel) is great, and his videos are genuinely amazing. He does, however, perfectly illustrate my issue with mainstream “pop science” math content: it is too mainstream. He, and most other very popular math YouTubers, takes a very geometric and visual approach, which is great when you’re going for broad appeal, but can counterintuitively be harmful if you’re trying to learn in a more professional setting (like university).

This actually happened to me, specifically with 3Blue1Brown, when I was first doing my linear algebra course some years ago. Picture this: an over-eager 2nd-semester physics student, about to have their first proper math course (Linear Algebra). What does she do in preparation? Well, she of course binges the very popular video series by this new math channel she just discovered, 3Blue1Brown, on that very subject! How handy! Well, imagine my surprise when I had my first lecture and there was no mention of matrices, and no awesome animations of shifting coordinate grids. Instead, it’s all just sets of linear equations! And here comes my gripe with that particular video series: it is called the “essence of linear algebra”, but is focused entirely on a geometric view. And thus, I feel like it misses the (to me) actual essence of linear algebra: the algebraic manipulation of systems that behave linearly.

With that exposition out of the way, it might be prudent of me to actually explain what Fourier Analysis even is. The idea is to ask: can you rewrite any function as a sum of sines and cosines? This question was asked by the French mathematician Joseph Fourier while he was trying to figure out how heat diffuses through an object, using the heat equation. It turned out that he could solve this problem only when the initial heat distribution was a sine or cosine. And so he naturally asked our question: can you rewrite any function as a sum of sines and cosines? And thus, the field of Fourier Analysis arose.

Another more mainstream example of where Fourier Analysis is very appropriate is sound. Our ears actually do this frequency decomposition naturally: we are able to decompose a single signal (air pressure at any given time) into a sum of pure tones. You can even try it yourself using this interactive tool, just remember to tick the box that says “Sound on/off”.

In general, Fourier Analysis is an incredibly useful tool in many highly technical STEM fields, including my own world of Electrical Engineering. I do also think that understanding the base concepts can be a bit of a paradigm shift in how you view the world, as you might start seeing how basically anything around you can be written as a sum of waves.

I will not be spending too much time on explaining what linear algebra is, so if you don’t have any prior knowledge, there will be some things you’re just gonna have to accept. I do require that you at least have a basic understanding of calculus though, especially integrals, as they are the key to Fourier Analysis.

Linearity

I keep using this word, linear, but what does it actually mean? Those with a high school level understanding of math will probably be thinking of straight lines, the classic $f(x) = ax + b$, and while that isn’t wrong, it is very far from the full picture. So here is the most basic idea of linearity: an operator $L(x)$ is said to be linear if the following holds:

$$\begin{align*} L(a \cdot \boldsymbol{u}) &= a \cdot L(\boldsymbol{u}) \\ L(\boldsymbol{u}+\boldsymbol{v}) &= L(\boldsymbol{u}) + L(\boldsymbol{v}) \end{align*}$$
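Before we unpack what a, u, and v actually are, here’s a quick numpy sketch (my own sanity check, not part of the derivation) verifying both properties for matrix multiplication, the prototypical linear operator:

```python
import numpy as np

# A matrix acting on vectors is the prototypical linear operator L(u) = A @ u.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))   # some fixed operator
u = rng.standard_normal(3)
v = rng.standard_normal(3)
a = 2.5                           # an arbitrary scalar

L = lambda x: A @ x

# Property 1: scaling passes through the operator.
print(np.allclose(L(a * u), a * L(u)))      # True
# Property 2: the operator distributes over addition.
print(np.allclose(L(u + v), L(u) + L(v)))   # True
```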

What are a, u, and v? Well, a is what is called a scalar, which is what normal people call “a number”. And what about u and v? They’re what’s called “vectors”. What’s a vector? Weeellllll, it’s complicated. To many, a vector is simply a collection of numbers in a list, but to be able to derive Fourier analysis, we have to be a little bit more general. A vector is an element of a vector space. Huh. Okay, but what’s a vector space? Well, it’s a set of elements that obey certain properties. Oh.

Confused spongebob reading book meme

So that’s kind of unhelpful. Understanding exactly what a vector is will require a full university-level course, and I will not be going into that much detail here. For this blog post, it is sufficient to know of two types of vector spaces: $\mathbb{R}^n$ and $L^2[a,b]$ (pronounced “R n” and “L-square”). $\mathbb{R}^n$ is the classic collection of vectors we all know and love. It is a list of $n$ numbers.

$L^2$ is a little different. It is the collection of all functions, $f(x)$, that take a real number, $a < x < b$, and return a complex number, and which fulfill the following property:

$$\int_a^b |f(x)|^2 \, dx < \infty$$
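To make that criterion a bit more concrete, here’s a small numerical sketch using scipy’s quad to approximate the integral (just an illustration of the idea, nothing rigorous):

```python
import numpy as np
from scipy.integrate import quad

a, b = 0.0, 1.0

def squared_norm(f):
    """Numerically evaluate the L^2 criterion: the integral of |f(x)|^2 over [a, b]."""
    value, _ = quad(lambda x: abs(f(x)) ** 2, a, b)
    return value

# A well-behaved function: the integral is finite, so it lives in L^2[0,1].
print(squared_norm(np.sin))   # ~0.2727
# Functions that blow up too fast near a point (e.g. 1/x on [0,1]) fail the
# criterion; quad will warn rather than return a sensible finite answer.
```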

That inequality may seem quite intimidating, but it’s integral to understanding this blog post, so we’ll spend a lot of time examining it, and another related integral. Also do note that there is an infinite number of different $L^2[a,b]$ vector spaces, one for each choice of $a$ and $b$. But first, we must take a small detour to talk about the normal vector dot product.

The Inner Product

Okay, I lied. It’s called the inner product. The dot product is simply the specific name for the inner product on the $\mathbb{R}^n$ vector spaces. The generalized inner product equation is:

$$\langle \boldsymbol{u},\boldsymbol{v} \rangle = |\boldsymbol{u}||\boldsymbol{v}|\cos(\theta)$$

Where $\boldsymbol{u}$ and $\boldsymbol{v}$ are vectors (the normal kind), the $|\cdot|$ denotes the “length”, and $\theta$ is the angle between them. For the normal dot product, it is customary to substitute the angle brackets for a dot:

$$\boldsymbol{u} \cdot \boldsymbol{v} = |\boldsymbol{u}||\boldsymbol{v}|\cos(\theta)$$

A nice visualization of the dot product can be found on Wikipedia. Here they have used x and y, whereas we used u and v.

Visualization of the dot product

Alright, so the dot product can be used to find the angle between two vectors, but how do you compute it? Well, that depends entirely on what vector space you exist in. For $\mathbb{R}^n$, you have:

$$\boldsymbol{u} \cdot \boldsymbol{v} = u_1 v_1 + u_2 v_2 + \dots + u_n v_n$$

This definition almost begs you to ask: what happens if you take the dot product of a vector with itself? Well, that gives you the “norm squared”, aka the length squared:

$$\boldsymbol{u} \cdot \boldsymbol{u} = u_1^2 + u_2^2 + \dots + u_n^2 = |\boldsymbol{u}|^2$$

Which fits well with our intuition from Pythagoras. We will now define the length of a vector (in any vector space with a defined inner product) as such:

$$|\boldsymbol{u}| = \sqrt{\langle \boldsymbol{u},\boldsymbol{u} \rangle}$$
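In $\mathbb{R}^n$ all of this boils down to a couple of numpy calls; here’s a tiny sketch of the definitions above:

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([1.0, 0.0])

# The R^n inner product is the familiar dot product ...
dot = np.dot(u, v)
# ... and the norm is the square root of a vector's inner product with itself.
norm_u = np.sqrt(np.dot(u, u))   # 5.0, just like Pythagoras promises
norm_v = np.sqrt(np.dot(v, v))

# Recover the angle from <u,v> = |u||v|cos(theta).
theta = np.arccos(dot / (norm_u * norm_v))
print(norm_u, np.degrees(theta))   # 5.0, ~53.13 degrees
```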

I will now define the inner product for the $L^2$ vector space (the weird one that is a collection of functions), but to be completely honest, I don’t entirely know where it came from. HOWEVER, with that said, it is an awfully useful definition, and it is the one from which the Fourier series falls out!

$$\langle f,g \rangle = \int_a^b f(x)\, \overline{g(x)}\, dx$$

The line above the $g(x)$ means the complex conjugate. Remember that $f$ and $g$ are functions of $x \in [a,b]$ which return a complex number, and that fulfill the earlier criterion. Just a quick reminder for those who are a bit rusty with complex numbers: given a complex number $z$, we have $z\overline{z} = |z|^2$, where $|\cdot|$ denotes the norm (absolute value).

With this definition of the inner product, we can now rewrite the criterion as:

$$\langle f,f \rangle = \int_a^b |f(x)|^2\, dx < \infty$$

So the criterion is that the functions in $L^2$ have finite length, for this definition of length.
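Here’s a small numerical sketch of this inner product, splitting the integral into real and imaginary parts since scipy’s quad only handles real integrands (the test functions are just my arbitrary picks):

```python
import numpy as np
from scipy.integrate import quad

a, b = 0.0, 1.0

def l2_inner(f, g):
    """<f, g> = integral of f(x) * conj(g(x)) over [a, b], done numerically."""
    integrand = lambda x: f(x) * np.conj(g(x))
    re, _ = quad(lambda x: integrand(x).real, a, b)
    im, _ = quad(lambda x: integrand(x).imag, a, b)
    return re + 1j * im

f = lambda x: np.exp(2j * np.pi * x)   # a complex-valued function on [0,1]
g = lambda x: x + 0j

print(l2_inner(f, f))   # ~1.0 -> this particular f has "length" 1
print(l2_inner(f, g))   # some complex number
```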

Projections

There is just one more concept we need to touch on before we’re ready to tackle Fourier analysis, and it is, as the heading suggests, projections.

Say we want to project one vector onto another, as shown below:

Projection of one vector onto another

This can be written, in linear algebra terms, as:

$$\boldsymbol{u}_{onto_{\boldsymbol{v}}} = \frac{\langle \boldsymbol{u}, \boldsymbol{v} \rangle}{\langle \boldsymbol{v}, \boldsymbol{v} \rangle} \boldsymbol{v}$$

It can be thought of as a new vector pointing in the direction of $\boldsymbol{v}$, whose length depends on the length of $\boldsymbol{u}$ and on the angle between $\boldsymbol{u}$ and $\boldsymbol{v}$ (through its cosine). Alright cool, but why do we care? In many applications, it can be very useful to rewrite a vector as a weighted sum of other vectors.

To give a simple example of this, consider the vector $\boldsymbol{w} = [3,1]$, and say we want to write it as a weighted (linear) sum of the two vectors $\boldsymbol{u} = [1,1]$ and $\boldsymbol{v} = [1,-1]$:

$$\begin{align*} \boldsymbol{w} &= \frac{\boldsymbol{w} \cdot \boldsymbol{u}}{\boldsymbol{u} \cdot \boldsymbol{u}}\boldsymbol{u} + \frac{\boldsymbol{w} \cdot \boldsymbol{v}}{\boldsymbol{v} \cdot \boldsymbol{v}}\boldsymbol{v} \\ &= \frac{3\cdot 1 + 1\cdot 1}{1\cdot 1+1\cdot 1}\, \boldsymbol{u} + \frac{3\cdot 1 + 1\cdot (-1)}{1\cdot 1+(-1)\cdot(-1)}\, \boldsymbol{v} \\ &= 2\boldsymbol{u} + 1\boldsymbol{v} = 2[1,1] + [1,-1] = [3,1] \end{align*}$$
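If you want to check the arithmetic, here’s the same decomposition done with numpy (it only works out this cleanly because $\boldsymbol{u}$ and $\boldsymbol{v}$ happen to be orthogonal):

```python
import numpy as np

w = np.array([3.0, 1.0])
u = np.array([1.0, 1.0])
v = np.array([1.0, -1.0])

def project(x, onto):
    """Project x onto another vector: (<x, onto> / <onto, onto>) * onto."""
    return (np.dot(x, onto) / np.dot(onto, onto)) * onto

print(project(w, u))                   # [2.  2.]  -> coefficient 2
print(project(w, v))                   # [1. -1.]  -> coefficient 1
print(project(w, u) + project(w, v))   # [3.  1.]  -> back to w
```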

The Fourier Series

I’m going to say some words, and then we’ll go through what they mean after: the set of functions below forms an orthonormal basis in $L^2[t_0,t_1]$

$$f(t) \in \left\{\sqrt{\frac{1}{T}},\ \sqrt{\frac{2}{T}}\cos\!\left(\frac{2\pi}{T} kt\right),\ \sqrt{\frac{2}{T}}\sin\!\left(\frac{2\pi}{T} kt\right)\right\}$$

For all values of $k \in \{1,2,3,\dots\}$, with $T = t_1-t_0$, such that the sines and cosines always complete a whole number of cycles in the range $[t_0,t_1]$. Below is a plot for $k \in \{1,\dots,4\}$ with $t_0 = 0$ and $t_1 = 1$.

Plot of the first few basis functions

Firstly, yes, $f(t) = \sqrt{\frac{1}{T}}$ is a valid function of $t$. Remember that $T$ is just a constant determined by our range $[t_0,t_1]$.

Secondly, what the hell is an “orthonormal basis”? Let’s start with the second word, basis. Essentially, it’s a set of vectors in a vector space such that you can reach any other vector in that space via a weighted sum of the basis vectors. The simplest example is the usual axes of the 2-dimensional coordinate system $\mathbb{R}^2$, $\boldsymbol{x} = [1,0]$ and $\boldsymbol{y} = [0,1]$. You can reach any 2-dimensional vector as a weighted sum of those two:

$$[a,b] = a\boldsymbol{x} + b\boldsymbol{y}$$

Orthonormal means that the vectors in the set are both orthogonal and normal. A vector is normal if it has length 1, i.e.:

$$\langle \boldsymbol{u}, \boldsymbol{u} \rangle = 1$$

That’s what all those constants in front of the sines and cosines are. They’re simply there to make sure they have length 1. Why we want to do that becomes apparent a little later.

For $\mathbb{R}^n$, orthogonal means they’re perpendicular:

Two orthogonal vectors

In more general terms, it means their inner product is zero:

$$\langle \boldsymbol{u}, \boldsymbol{v} \rangle = 0$$

So the above basis, consisting of a constant term and an infinite set of sines and cosines, forms an orthonormal basis in $L^2$. Does that mean sine and cosine are perpendicular? Well, kinda. At least by this definition of perpendicularity. That may sound odd, but it is exceedingly useful. It is fairly trivial to prove that this set of basis vectors is orthonormal, but it’s honestly just kind of a mess of symbols. For the curious reader, here’s a small hint for how to do it yourself.

Firstly, to simplify the math, we will be working in $L^2[0,1]$, but the following still applies without loss of generality. Thus $T = 1$ and our basis functions become:

$$f(t) \in \left\{1,\ \sqrt{2}\sin(2\pi kt),\ \sqrt{2}\cos(2\pi kt)\right\}$$
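Rather than wading through the mess of symbols, here’s a quick numerical check of orthonormality for the first few basis functions (a sketch using scipy’s quad; the exact proof works the same way, just with trig identities instead of numbers):

```python
import numpy as np
from scipy.integrate import quad

# The T = 1 basis: a constant, plus normalized sines and cosines for a few values of k.
basis = [lambda t: 1.0]
for k in range(1, 4):
    basis.append(lambda t, k=k: np.sqrt(2) * np.cos(2 * np.pi * k * t))
    basis.append(lambda t, k=k: np.sqrt(2) * np.sin(2 * np.pi * k * t))

# <f_i, f_j> over [0,1] should be 1 on the diagonal and 0 everywhere else.
gram = np.array([[quad(lambda t: fi(t) * fj(t), 0, 1)[0] for fj in basis]
                 for fi in basis])
print(np.round(gram, 10))   # prints (a numerical approximation of) the identity matrix
```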

Now, let us project an arbitrary function $s(t)$ in our vector space onto one of our basis vectors, say $\sqrt{2}\cos(2\pi kt)$.

From the definition of projection:

$$\begin{align*} s(t)_{onto_{\cos(kt)}} &= \frac{\left\langle s(t),\, \sqrt{2}\cos(2\pi kt)\right\rangle}{\left\langle \sqrt{2}\cos(2\pi kt),\, \sqrt{2}\cos(2\pi kt)\right\rangle}\, \sqrt{2}\cos(2\pi kt) \\ &= 2\left\langle s(t),\, \cos(2\pi kt)\right\rangle \cos(2\pi kt) \\ &= \left(2\int_0^1 s(t)\cos(2\pi kt)\, dt\right) \cos(2\pi kt) \end{align*}$$

Where we have used the fact that we normalized the cosine (its inner product with itself is one). We’ll call the stuff in the parentheses $a_k$. The same procedure can be applied to the projection onto $\sqrt{2}\sin(2\pi kt)$:

$$s(t)_{onto_{\sin(kt)}} = \left(2\int_0^1 s(t)\sin(2\pi kt)\, dt\right) \sin(2\pi kt)$$

Where we’ll call the stuff in the parentheses $b_k$. Lastly, we need to deal with the constant term, $f(t) = 1$:

$$\begin{align*} s(t)_{onto_{constant}} &= \frac{\left\langle s(t),\, 1\right\rangle}{\left\langle 1,1\right\rangle}\, 1 \\ &= \left\langle s(t),\, 1\right\rangle \\ &= \int_0^1 s(t)\, dt \end{align*}$$

Which we will call $a_0$. We now have a recipe for rewriting our function $s(t)$ as a sum of sines and cosines, shown below for any value of $T$:

$$\begin{align*} s(t) &= a_0 + \sum_{k=1}^\infty a_k \cos\!\left(\frac{2\pi}{T} kt\right) + \sum_{k=1}^\infty b_k \sin\!\left(\frac{2\pi}{T} kt\right), \\ a_0 &= \frac{1}{T}\int_T s(t)\, dt \\ a_k &= \frac{2}{T}\int_T s(t)\cos\!\left(\frac{2\pi}{T} kt\right) dt \\ b_k &= \frac{2}{T}\int_T s(t)\sin\!\left(\frac{2\pi}{T} kt\right) dt \end{align*}$$
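To see the recipe in action, here’s a small numerical sketch that computes the coefficients of a square wave with scipy’s quad and rebuilds the signal from its first harmonics (the square wave is just my arbitrary choice of test signal):

```python
import numpy as np
from scipy.integrate import quad

T = 1.0
s = lambda t: 1.0 if (t % T) < T / 2 else -1.0    # square wave with period T

def integrate(f):
    # tell quad where the jump is so it doesn't choke on the discontinuity
    return quad(f, 0, T, points=[T / 2])[0]

a0 = integrate(s) / T
a = lambda k: 2 / T * integrate(lambda t: s(t) * np.cos(2 * np.pi * k * t / T))
b = lambda k: 2 / T * integrate(lambda t: s(t) * np.sin(2 * np.pi * k * t / T))

# For a square wave, b_k should be 4/(pi*k) for odd k and 0 for even k; a_k should vanish.
print([round(b(k), 4) for k in range(1, 6)])   # ~[1.2732, 0, 0.4244, 0, 0.2546]
print([round(a(k), 4) for k in range(1, 6)])   # ~[0, 0, 0, 0, 0]

# Rebuild the signal from the first N harmonics (the synthesis sum above).
reconstruct = lambda t, N=50: a0 + sum(
    a(k) * np.cos(2 * np.pi * k * t / T) + b(k) * np.sin(2 * np.pi * k * t / T)
    for k in range(1, N + 1))
print(reconstruct(0.1), reconstruct(0.7))      # close to +1 and -1
```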

That recipe right there is the Fourier series. However, there is a slightly nicer way to write it, if we allow ourselves to use complex exponentials instead.

Consider this alternative set of basis functions for $L^2[0,1]$:

$$f(t) \in \left\{e^{2\pi i k t}\right\}, \quad k \in \{\dots,-1,0,1,\dots\}$$

So now we just have one kind of basis function, and we allow $k$ to be any integer. Note that this set of basis functions is already normalized. Using Euler’s formula, it is trivial to show that these two sets are equivalent. Let’s project our function $s(t)$ onto this set of basis functions:

$$\begin{align*} s(t)_{onto_{exp}} &= \frac{\left\langle s(t),\, e^{2\pi i k t}\right\rangle}{\left\langle e^{2\pi i k t},\, e^{2\pi i k t}\right\rangle}\, e^{2\pi i k t} \\ &= \left\langle s(t),\, e^{2\pi i k t}\right\rangle e^{2\pi i k t} \\ &= \left(\int_0^1 s(t)\, e^{-2\pi i k t}\, dt\right) e^{2\pi i k t} \end{align*}$$

Where we’ll call the stuff in the parentheses $c_k$. Note the sign of the exponential in the integral. This is due to the fact that you take the complex conjugate of the second argument in the inner product, and the complex conjugate of a complex exponential is $\overline{e^{ix}} = e^{-ix}$.

We may now rewrite $s(t)$ as a sum of these projections, like we did before:

$$\begin{align*} s(t) &= \sum_{k=-\infty}^{\infty} c_k\, e^{\frac{2\pi i k}{T} t}, \\ c_k &= \frac{1}{T}\int_T s(t)\, e^{-\frac{2\pi i k}{T} t}\, dt \end{align*}$$

The first line is generally referred to as the Synthesis equation, and the second line the Analysis equation.
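Here’s the same idea in the complex form, sketched numerically for a sawtooth (another arbitrary test signal), for which the exact coefficients happen to be known:

```python
import numpy as np
from scipy.integrate import quad

T = 1.0
s = lambda t: t % T       # a sawtooth: s(t) = t on [0, 1)

def c(k):
    """Analysis equation: c_k = (1/T) * integral of s(t) e^{-2*pi*i*k*t/T} over one period."""
    re = quad(lambda t: s(t) * np.cos(2 * np.pi * k * t / T), 0, T)[0]
    im = quad(lambda t: -s(t) * np.sin(2 * np.pi * k * t / T), 0, T)[0]
    return (re + 1j * im) / T

# For the sawtooth the exact answer is c_0 = 1/2 and c_k = i/(2*pi*k) for k != 0.
print(c(0))                        # ~0.5
print(c(1), 1j / (2 * np.pi))      # both ~0.159j
print(c(-3), 1j / (2 * np.pi * -3))

# Synthesis equation: a partial sum over -N..N gets you back (close to) s(t).
synth = lambda t, N=40: sum(c(k) * np.exp(2j * np.pi * k * t / T) for k in range(-N, N + 1))
print(synth(0.3).real)             # ~0.3
```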

If you’re at all familiar with the Fourier series, then this is probably how you’ve seen it written. However, we can very easily switch between the two sets of basis functions. To go from the complex exponentials to the set of sines and cosines, you can use the following formulas:

$$\begin{align*} a_0 &= c_0 \\ a_k &= c_k + c_{-k} \\ b_k &= i(c_k - c_{-k}) \end{align*}$$

And to go the other way, you can use:

$$c_k = \left\{ \begin{array}{ll} \frac{1}{2} (a_k - ib_k) & k > 0 \\ a_0 & k = 0 \\ \frac{1}{2} (a_{-k} + ib_{-k}) & k < 0 \end{array} \right.$$
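If you don’t trust me on those conversion formulas (fair, given how easy sign errors are here), here’s a quick numerical sanity check on an arbitrary period-1 test signal of my choosing:

```python
import numpy as np
from scipy.integrate import quad

s = lambda t: np.exp(np.sin(2 * np.pi * t))   # arbitrary smooth period-1 test signal

inner = lambda f: quad(f, 0, 1)[0]
a = lambda k: 2 * inner(lambda t: s(t) * np.cos(2 * np.pi * k * t))
b = lambda k: 2 * inner(lambda t: s(t) * np.sin(2 * np.pi * k * t))
c = lambda k: (inner(lambda t: s(t) * np.cos(2 * np.pi * k * t))
               - 1j * inner(lambda t: s(t) * np.sin(2 * np.pi * k * t)))

for k in range(1, 4):
    print(np.isclose(c(k), (a(k) - 1j * b(k)) / 2),    # the k > 0 case
          np.isclose(c(-k), (a(k) + 1j * b(k)) / 2))   # the k < 0 case
print(np.isclose(c(0), inner(s)))                      # c_0 = a_0
```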

And there you have it, we have now derived the Fourier series. One really important thing to note, though, is that this only works for periodic signals, since the recreation is periodic with period $T$. Technically it also works for functions that are only defined on the range $[t_0,t_1]$, but the recreation will then still repeat for values of $t$ outside that range. This means that the Fourier series cannot deal with signals that are defined over infinite time. To analyse functions that are either aperiodic (so non-repeating) or infinite in the input dimension, we must turn to the Fourier transform.

The Fourier transform

So let’s do it! One small heads-up first, though. The Fourier transform does not fit into the traditional definition of the $L^2$ vector space, since its basis vectors are $e^{iwt}$ defined over all of $t \in (-\infty,\infty)$. But we can still tease it out from the complex exponential definition of the Fourier series, by considering the limiting behaviour as $T \rightarrow \infty$.

Let’s consider the vector space $L^2[-\frac{T}{2},\frac{T}{2}]$, such that the length of the range is still $T$. The Analysis equation from before then becomes:

$$c_k = \frac{1}{T}\int_{-T/2}^{T/2} s(t)\, e^{-\frac{2\pi i k}{T} t}\, dt$$

We’re going to manipulate this equation slightly: firstly by multiplying both sides by $T$, and secondly by introducing a new variable for the exponential, $\Delta w = \frac{2\pi}{T}$:

$$T c_k = \int_{-T/2}^{T/2} s(t)\, e^{-i k \Delta w t}\, dt$$

Now, consider the limiting behaviour of the right-hand side as we let $T \rightarrow \infty$. This means our integral is over infinite time, allowing our signal to be infinite in time. It also implies that $\Delta w$ becomes infinitely small, and so by calculus convention, we’re going to rename it $dw$. We’re then going to immediately get rid of it by defining a new continuous variable $w = k\, dw$.

$$\begin{align*} T c_k &= \int_{-\infty}^{\infty} s(t)\, e^{-i w t}\, dt \\ S(w) &= \int_{-\infty}^{\infty} s(t)\, e^{-i w t}\, dt \end{align*}$$

And that last line is the Fourier transform. It is convention to capitalize the name of the function you’re Fourier transforming. For those who aren’t familiar, the variable introduced above is the angular frequency $w$, and it’s measured in $\frac{\textrm{radians}}{\textrm{s}}$. It’s often used when working with sines and cosines, and is related to the classical frequency by $f = \frac{w}{2\pi}$, which has units of Hertz [Hz].

The notation you’ll often see in literature is:

$$S(w) = \mathcal{F}[s(t)]$$
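To get a feel for it, here’s a small numerical sketch of the transform of a rectangular pulse (my choice of test signal), compared against its well-known closed form, a sinc-shaped spectrum:

```python
import numpy as np
from scipy.integrate import quad

# A rectangular pulse: 1 for |t| < 1/2, 0 elsewhere.
s = lambda t: 1.0 if abs(t) < 0.5 else 0.0

def fourier_transform(w):
    """S(w) = integral of s(t) e^{-iwt} dt, split into real and imaginary parts for quad."""
    re = quad(lambda t: s(t) * np.cos(w * t), -1, 1, points=[-0.5, 0.5])[0]
    im = quad(lambda t: -s(t) * np.sin(w * t), -1, 1, points=[-0.5, 0.5])[0]
    return re + 1j * im

# The known closed form for this pulse is S(w) = 2*sin(w/2)/w.
for w in [0.5, 2.0, 7.0]:
    print(fourier_transform(w), 2 * np.sin(w / 2) / w)
```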

The last thing to do is find the Inverse Fourier Transform. Consider the Synthesis equation from before:

$$s(t) = \sum_{k=-\infty}^{\infty} c_k\, e^{\frac{2\pi i k}{T} t}$$

Now substitute $c_k = \frac{S(w)}{T}$ and $T = \frac{2\pi}{\Delta w}$:

$$s(t) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} S(w)\, e^{i \Delta w k t}\, \Delta w$$

Again, consider the limiting behaviour as we let $T \rightarrow \infty$, such that $\Delta w = dw$:

$$s(t) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} S(w)\, e^{i w t}\, dw$$

The sum is the definition of the Riemann integral, so substituting the sum for an integral yields:

$$s(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S(w)\, e^{i w t}\, dw$$

And that right there is the Inverse Fourier Transform, often denoted in the literature as:

$$s(t) = \mathcal{F}^{-1}[S(w)]$$
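And as a last sanity check, here’s a numerical round trip on a Gaussian (another arbitrary test signal), whose transform is famously another Gaussian; the nested quad is slow, but good enough for a sketch:

```python
import numpy as np
from scipy.integrate import quad

s = lambda t: np.exp(-t**2 / 2)   # a Gaussian, which decays fast enough for quad

def forward(w):
    # S(w) = integral of s(t) e^{-iwt} dt; the Gaussian is even, so only the cosine part survives.
    return quad(lambda t: s(t) * np.cos(w * t), -np.inf, np.inf)[0]

def inverse(t):
    # s(t) = (1/2pi) * integral of S(w) e^{iwt} dw, again keeping only the even (cosine) part.
    return quad(lambda w: forward(w) * np.cos(w * t), -np.inf, np.inf)[0] / (2 * np.pi)

# The transform of a Gaussian is another Gaussian, and the round trip gives back s(t).
print(forward(1.0), np.sqrt(2 * np.pi) * np.exp(-0.5))   # both ~1.520
print(inverse(0.7), s(0.7))                              # both ~0.783
```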

Other Basis Functions

Since Fourier analysis is basically just projections of functions in the $L^2$ vector space, why did we choose to use sines and cosines (or complex exponentials), and not some other infinite family of functions? For one, there is the whole physical intuition in terms of things like sound being a sum of pure tones, but there is also a good mathematical reason. The set of sines and cosines is naturally orthogonal for any choice of region $[t_0,t_1]$. This is not at all true for most other sets of basis functions you might choose, like the set of powers. Consider the set of functions in $L^2[t_0,t_1]$:

$$f(t) \in \left\{t^k\right\}, \quad k \in \{0, 1, 2, \dots\}$$

It is fairly trivial to show that this set is not orthogonal:

$$\begin{align*} f(t) &= t^n \\ g(t) &= t^m \\ \langle f(t),g(t) \rangle &= \int_{t_0}^{t_1} t^n t^m\, dt \\ &= \int_{t_0}^{t_1} t^{n+m}\, dt \\ &= \frac{1}{n+m+1}\left[t^{n+m+1}\right]_{t_0}^{t_1} \\ &= \frac{t_1^{n+m+1}-t_0^{n+m+1}}{n+m+1} \end{align*}$$

It should be fairly obvious that the above is, in general, not equal to zero for $n \ne m$ (pick, say, $[t_0,t_1] = [0,1]$ and any $n \ne m$ you like).

It is possible to create an orthogonal basis of polynomials, but it will: 1) be a mess, and 2) depend on your choice of $[t_0,t_1]$. How you would go about it is beyond the scope of this blog post, but the eager reader may want to check out the Gram-Schmidt process.
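Just to give a taste of it, here’s a tiny sympy sketch running Gram-Schmidt on the first few powers over $[-1,1]$ (my arbitrary choice of interval); what falls out are the Legendre polynomials, up to scaling:

```python
import sympy as sp

t = sp.symbols('t')
t0, t1 = -1, 1                      # the choice of interval matters for the result

inner = lambda f, g: sp.integrate(f * g, (t, t0, t1))

def gram_schmidt(funcs):
    """Orthogonalize a list of functions with respect to the L^2[t0, t1] inner product."""
    ortho = []
    for f in funcs:
        for e in ortho:
            # subtract the projection onto each earlier basis function
            f = f - inner(f, e) / inner(e, e) * e
        ortho.append(sp.expand(f))
    return ortho

print(gram_schmidt([1, t, t**2, t**3]))
# -> [1, t, t**2 - 1/3, t**3 - 3*t/5]  (the Legendre polynomials, up to scaling)
```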

Why Care?

If you’re someone who just engages in mathematics recreationally, for example via popular YouTube channels, then the general geometric approach of many videos is probably fine. But if you actually use it professionally, as I do, then I believe it’s a really good idea to have at least a superficial understanding of the algebra behind it. And I don’t just mean for Fourier analysis or linear algebra, but for all mathematical tools you may use professionally. Knowing how and why something works, on a deeper, more intuitive level, allows you to manipulate and apply it far more broadly.