Anda di halaman 1dari 12

#AltDevBlogADay

EACH DAY A LITTLE MORE #GAMEDEV LOVE


Keep data close, part 1 CES to E3: Some Thoughts For Designers

Seach
Search

May

17

2011

Understanding the Fourier transform by Stuart Riffle


Posted by Stuart Riffle at 4:00 am #gamedev, Audio, General Interest, Programming, Uncategorized

Pages
#readme (for Readers) #readme (for Writers) Post Schedule

117

Share 15

Like

Meta
Log in Entries RSS Comments RSS

Yes, I realize that after reading the title of this post, 99% of potential readers just kept scrolling. So to the few of you who clicked on it, welcome! Dont worry, this wont take long. A very long time ago, I was curious how to detect the strength of the bass and treble in music, in order to synchronize some graphical effects. I had no idea how to do such a thing, so I tried to figure it out, but I didnt get very far. Eventually I learned that I needed something called a Fourier transform, so I took a trip to the library and looked it up (which is what we had to do back in those days). What I found was the Discrete Fourier Transform (DFT), which looks like this:

Archives
May 2011 (61) April 2011 (111) March 2011 (94) February 2011 (77) January 2011 (48)

@mike_acton
If you are a gamedev and have been looking for an excuse (or kick in the pants) to write and share your experience, passion and ideas with other like-minded folk, then you should join us! Shoot me an email: macton@gmail.com

This formula, as anyone can see, makes no sense at all. I decided that Fourier must have

And

if

you'd

like

to

follow

all the

#AltDevBlogADay updates on Twitter, you can follow me @mike_acton

been speaking to aliens, because if you gave me all the time and paper in the world, I would not have been able to come up with that. Eventually, I was able to visualize how it works, which was a bit of a lightbulb for me. Thats what I want to write about today: an intuitive way to picture the Fourier transform. This may be obvious to you, but it wasnt to me, so if you work with audio or rendering, I hope theres something here you find useful. Disclaimer: my math skills are pitch-patch at best, and this is just intended to be an informal article, so please dont expect a rigorous treatment. However, I will do my best not to flat-out lie about anything, and Im sure people will set me straight if I get something wrong. A quick bit of background what does the Fourier transform do? It translates between two different ways to represent a signal: The time domain representation (a series of evenly spaced samples over time) The frequency domain representation (the strength and phase of waves, at different frequencies, that can be used to reconstruct the signal)

can follow me @mike_acton

@twoscomplement
Jonathan Adamczewski helps administer the site - follow and/or bug him on twitter (@twoscomplement) with comments or suggestions.

Categories
#AltDev Updates (3) #AltDevPodcast (4) #AltDevUnconference (1) #gamedev (98) Animation (12) Audio (7) Bizdev (23) Game Design (60) General Interest (110) Guest Post (4) Programming (218) Protected (5) Tools, UI and UX (44) Uncategorized (24) Visual Arts (23)

Authors (109)
Mike Acton (11) Jaymin Kessler (9) Aras Pranckeviius (8) Ben Carter (8) Fabrice Lt (8)

The picture on the left shows 3 cycles of a sine wave, and the picture on the right shows the Fourier transform of those samples. The output bars show energy at 3 cycles (and, confusingly enough, negative 3 cycles more on that below).

Mike Jungbluth (8) Nick Darnell (8) Niklas Frykholm (8) Colleen Delzer (7) David Czarnecki (7) Jon Moore (7) (7)

The inputs and outputs are actually complex numbers, so to feed a real signal (like some music) into the Fourier transform, we just set all the imaginary components to zero. And to check the strength of the frequency information, we just look at the magnitude of the outputs, and ignore the phase. But lets never mind all that for now. What are we trying to accomplish? Weve got a sampled signal, and we want to extract frequency information from it. The Fourier transform works on a periodic, or looping signal. This seems like a problem, since we dont actually have any signals like that. In practice, you just take a small slice of a longer signal, fade both ends to zero so that they can be joined (which is a whole topic unto itself), and pretend its a loop. Lets make things simple and say that our loop repeats once per second.

Julien Hamaide (7) Martin Pichlmair (7) Paul Evans (7) Savas Ziplies (7) Tomasz Dbrowski (7) David Evans (6) Ed Bartley (6) Jake Simpson (6) Jonathan Adamczewski (6) Justin Liew (6) Brett Douville (6) Phil Carlisle (6) Sean Parsons (6) Dylan Cuthbert (5) Eddie Cameron (5) James Podesta (5) Jason Weesner (5) Alex Evans (5) Matt Yaney (5) Max Burke (5) Paul Laska (5) Thaddaeus Frogley (5) Garett Bass (4) Gavan Woolery (4) Claire Blackshaw (4) Jason Swearingen (4) Adam Rademacher (4) Joe Hegarty (4) Kyle Howells (4) Matt Dalby (4)

Picture it as a bead, sliding up and down along a thin rod, tracing out the signal. So as this bead is bobbing up and down, look what happens if we spin the rod at a rate of, say, 10 revolutions per second:

Pat Wilson (4) Pete Collier (4) Richard Fine (4) Richard Sim (4) Bernat Muoz (4) Georg Backer (3)

Joe Valenzuela (3) Bjoern Knafla (3) Colin Riley (3) Keith Judge (3) Kevin Gadd (3) Geraldo Nascimento (3) Mark Lee (3) Ajari Wilson (3) Bruno Evangelista (3) Erwin Coumans (3) Francesco Carucci (3) Rob Braun (3)

We get a scribble, as youd expect. And it is roughly centered on the origin. Now, lets assume we know theres some energy in the signal at 3 Hz, and we want to measure it. What that means is that on top of whatever else is causing the signal to wobble around, weve added a wave that oscillates 3 times per second. It has a high point every 1/3 of a second, and corresponding low points in between, also spaced 1/3 of a second apart. You can probably see now how we might be able to detect it lets try spinning our signal at a matching rate of 3 revolutions per second.

Shervin Ghazazani (3) Steven Tovey (3) Andre Leiradella (3) Chris Kosanovich (3) Tony Albrecht (3) Wolfgang Engel (3) Christoph Lang (2) Mattias Jansson (2) Dean Calver (2) Michael Taylor (2) Noel Austin (2) Norm Nazaroff (2) Bob Koon (2) Paul Firth (2) Julien Koenen (2) Forrest Smith (2) Pete Garcin (2) Daniel Collin (2) Don Olmstead (2) Christian Arca (2) Rob Ashton (2) Rune Vendler (2) Chris Hargrove (2)

Since the signal completes a rotation every 1/3 of a second, all the high points in our 3 Hz wave line up at the same part of the rotation, and this pulls the whole scribble off-center.

Jesse Cluff (2) Szymon Swistun (2) Glenn Watson (2) Yacine Salmi (2) Promit Roy (1)

How can we quantify that? The easiest way would be to record a bunch of points as we rotate, and average them to find their midpoint:

Promit Roy (1) Raul Aliaga Diaz (1) Michael Tedder (1) Gustavo Ambrozio (1) Adam Kidd (1) Jean-Francois St-Amour (1) Andy Thomason (1) Ryan Stubblefield (1) Mitch Zamara (1) Drew Thaler (1) Ann Cudworth (1) Geoff Evans (1) Stuart Riffle (1) Sven Bergstrm (1)

It makes sense that the distance of this midpoint from the origin is proportional to the strength of the signal, because as the high points in our signal get higher, they will move the scribble farther away. But what if the signal contains no energy at 3 Hz? Lets remove the 3 Hz wave and see:

Joo Costa (1) Luke Dicken (1) Todd Keller (1) Jose Antonio Escribano Ayllon (1) Craig Bishop (1) Unai Landa (1) Werner Nemetz (1) Matt Ditton (1) Phil Shenk (1)

Now there is nothing to pull the scribble off center, and all of the other oscillations tend to (approximately) balance each other out. This looks promising as a way to detect energy at a given frequency. Time to translate it into math! For a looping signal of N samples:

(Raising e to an imaginary power produces rotation around a unit circle in the complex plane, according to Eulers formula. How? Magic, as far as I can tell. But apparently its true). So this equation is a little different from what we started with. Ive added a normalization factor of 1/N, and changed the sign of the exponent. I also rearranged the terms slightly for clarity. This form is normally called the inverse DFT, which is confusing, but apparently the difference between the DFT and IDFT is a matter of convention, and can depend on the application. So, lets call that close enough.

Anyway, once you can see whats going on in your head, a lot of the quirks of working with the DFT become much less mysterious. If youve had to work with DFT output before, you may have wondered: Why does the first element in the result (k=0) contain the DC offset? Because in that case, our samples dont spin at all, so all were doing is averaging them. Why doesnt the DC offset affect the frequency information? Because adding a constant value to all the samples just makes the whole scribble bigger, which doesnt affect the midpoint. Why does the second half of the output array contain a mirror image of the

first half? Its just our old friend aliasing. When calculating the last element (k=N-1), were rotating by (N1)/N at each step, which is almost all of the way around. This is the same as taking small steps (1/N) in the wrong direction. Thats why the result at (k=N1) has the same magnitude as (k=1). Its equivalent to processing a negative frequency of (k=-1). Why does a sine wave with amplitude 1.0 come out of the DFT as 0.5? When we spin the sine wave, we get a circle of diameter 1.0, but its midpoint is only half that distance away from the origin. Where is the other half of the energy then? Its hiding in the negative frequency part! Hopefully confusing. And if youd like to get updates on my game development work, come subscribe to my RSS feed at pureenergygames.com! this was more helpful than

117

Share 15

Like

Related posts:
Its a trap! Tips for Managing Multiple Coordinate Systems My programming techniques Part 0 Particle system memory patterns Learning from Others

Stuart Riffle
I'm a graphics/systems programmer

with 14 years in the industry, and my credits include Madden, Bond, and Need for Speed. I'm currently doing the indie thing. My specialties are optimization, rendering, concurrency, and compilers. I am delighted by all the pretty colors.
Add comments

15 Responses to Understanding the Fourier transform


Slagh says:
May 17, 2011 at 4:34 am

You, sir, are a star. My math skills are also pitchpatch and this lights the AHA! globe like nothing else Ive read. Reply

Stuart Riffle says:


May 17, 2011 at 12:50 pm

Aw, shucks. Thank you for your kind words, and Im happy I could help! Reply

Cam says:
May 17, 2011 at 7:23 am

Thats a pretty cool explanation. I challenge you to give FFT the same treatment What? Butterflys?

Reply

S.J. says:
May 17, 2011 at 7:41 am

Thank you, thank you very much. I have to say this to you. Reply

Revolves says:
May 17, 2011 at 8:10 am

Nice post there, Stuart. Fourier transform is useful because we normally use sinusoidal inputs to analyse the behavior of various circuit components. The output that we get from each circuit depends on the frequency of the input sinusoids. Once we know how a circuit responds to sinusoids, we can use Fourier transform on any arbitrary signal. The Fourier transform would give us a set of frequency components and their amplitudes, which when added together would give you your original signal. Since each component is a sinusoid, you find out the output of the circuit for each of the input signal component, and then add them together to find out the final output. I know I might not be making much sense above. But it is quite difficult to explain this without considerable amount of examples. You did a great job at it though! Reply

CJ says:
May 17, 2011 at 8:42 am

Hmmmm..I really, really like the color-coded formula with the intuitive explanation. I need to find a way to incorporate that into my math writings in grad school. Reply

Adrien says:
May 17, 2011 at 8:50 am

Thank you for the explanation. Now you know what a DFT does. If you actually need to perform it. Dont use a DFT which comes from Mars. Use a FFT (Fast Fourier Transform) which comes from another galaxy. Dont expect me to light any bulb in this comment. But there is Wikipedia article called FFT. Just have a look, the concept is the

same, the algorithm is just faster. It is generally embedded in electronics (thats where I learnt it exists) but the algorithm can be implemented with complexity O(n log n). Reply

Stuart Riffle says:


May 17, 2011 at 12:35 pm

Thank you (and Cam above) for bringing up the FFT! I agree, and it would have been better if I mentioned exactly this at the end of the article. (The FFT translates naturally to the GPU too, and there is a quick-anddirty implementation described here on my blog, which is not pretty, but hasnt yet been a bottleneck, knock on wood). http://pureenergygames.com/blog/34development/58-unoptimized-gpu-fft Reply

lithander says:
May 17, 2011 at 10:16 am

Awesome, if all math would be explained like this I bet a lot less people would hate it! Reply

Andrew says:
May 17, 2011 at 11:28 am

Im educated as an acoustical engineer, and we often struggle to find more physical analogs (sorry!) for acoustic phenomenon. I hadnt heard such a good explanation for Fourier. Well done sir. Reply

carl says:
May 17, 2011 at 1:02 pm

When youre talking about getting the energy at a particular level .. is that not inherent _IN_ that energy level? i guess im missing why the 3Hz is added in Reply

Anton says:
May 17, 2011 at 2:01 pm

Thanks, its a really nice way of explaining it! Reply

Jedd Haberstro says:


May 17, 2011 at 3:39 pm

Great article! I really like the colored visualize of the DFT equation. However, Im having a hard visualizing the bead and spinning rod. Do you mean there is a vertical rod (parallel to the y-axis of the time domain signal) with the bead be placed along the rod so that it matches the amplitude of the signal, and the then the rod is moving across the x-axis while being spun? Thanks! Reply

Mike says:
May 17, 2011 at 3:43 pm

Need more teachers like you Reply

Vinney says:
May 17, 2011 at 3:52 pm

If all math lessons were this intuitive (and colorcoded!) I would have passed calculus the first time. Thanks, Stuart! Reply

Leave a Reply
Name (required) E-mail (required) URI

Your Comment bilink href=""b-quotecode? Preview

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> line="" escaped="">
Submit Comment

<pre

lang=""

Notify me of follow-up comments via e-mail

2011 #AltDevBlogADay

Suffusion theme by Sayontan Sinha