Normalize FFT magnitude to imitate WMP
R

4

6

So, I've been working on a little visualizer for sound files, just for fun. I basically wanted to imitate the "Scope" and "Ocean Mist" visualizers in Windows Media Player. Scope was easy enough, but I'm having problems with Ocean Mist. I'm pretty sure that it is some kind of frequency spectrum, but when I do an FFT on my waveform data, the result doesn't match what Ocean Mist displays. The spectrum itself looks correct, so I don't think anything is wrong with the FFT. I'm assuming that the visualizer runs the spectrum through some kind of filter, but I have no idea what it might be. Any ideas?

EDIT2: I posted an edited version of my code here (editor's note: link doesn't work anymore). By edited, I mean that I removed all the experimental comments everywhere, and left only the active code. I also added some descriptive comments. The visualizer now looks like this.

EDIT: Here are images. The first is my visualizer, and the second is Ocean Mist.

my visualizer

ocean mist

Resinous answered 17/3, 2010 at 21:55 Comment(11)
It might help if you posted a link to a screenshot of what you're trying to achieve (e.g., an example of the ocean mist visualization) for the lazy/non-WMP users.Ole
@Resinous - I made some changes to your code. THEY ARE UNTESTED so I can't guarantee the syntax, but I hope the spirit of them makes sense. I'm about to head out for a while, but will check for updates later. Also, it would be helpful if you could post the documentation for the FFT you're using.Tadd
Well, you should have copied the link in the address bar after saving, because pastebin doesn't actually change the existing code, it makes a new "pad". I can wait :)Resinous
Well, getting late for me. Anyway, here's the place where I got the FFT. It isn't as big as say, FFTW, but it seems to work. The original page can't be reached, so here is a Google cache page. 74.125.77.132/search?hl=en&q=cache:http://www.librow.com/…Resinous
@Resinous - that was very silly of me, sorry. Anyway, I reconstructed the changes. See pastebin.com/8WgaaAMY. Make sure that when you pass a sine wave in, you get something like the green line in the loglog graph I posted earlier. Yours should be smoother due to no random noise, but the spike should be about the same width and at roughly the same horizontal place.Tadd
I made the changes that you specified, and it turned out like this: i43.tinypic.com/25ahroz.jpg It seems quite noisy. The sine wave test worked just fine, it was a straight line in the same place as yours.Resinous
Also, I suppose I should somehow create "fake" lines in between, to fill the gaps.Resinous
Well, after a bit of tinkering, it looks like this: i44.tinypic.com/2jacft0.jpg I'm content. I tried using a e^output to enlarge the taller spikes, and it seems to be working. Just needs a bit of tweaking, and it should be great. Thanks!Resinous
@Resinous - Glad it worked out. If you're doing e^output, that's essentially undoing the log in the calculation of the y-coordinate, so maybe just change that line to make y proportional to output.Tadd
You sure? I thought e^ was the inverse of ln, not log.Resinous
@Resinous - two things: 1. when you want to draw someone's attention to a comment, put @their username in your comment (see meta.#1593). 2. ln and log are proportional to each other. If 10^x = b, x = log(b). But you could also write ln(10^x) = ln(b) -> x*ln(10) = ln(b) -> x = ln(b)/ln(10). So, log(b) = ln(b)/ln(10). Since you're not displaying absolute numbers, this proportionality should be good enough for your purposes.Tadd
T
6

Here's some Octave code that shows what I think should happen. I hope the syntax is self-explanatory:

%# First generate some test data
%# make a time domain waveform of sin + low level noise
N = 1024;
x = sin(2*pi*200.5*((0:1:(N-1))')/N) + 0.01*randn(N,1);

%# Now do the processing the way the visualizer should
%# first apply Hann window = 0.5*(1 - cos(2*pi*n/N))
%# (in Octave, hann() is provided by the signal package: pkg load signal)
xw = x.*hann(N, 'periodic');
%# Calculate FFT.  Octave returns double sided spectrum
Sw = fft(xw);
%# Calculate the magnitude of the first half of the spectrum
Sw = abs(Sw(1:(1+N/2))); %# abs is sqrt(real^2 + imag^2)

%# For comparison, also calculate the unwindowed spectrum
Sx = fft(x); %# semicolon suppresses printing all N values
Sx = abs(Sx(1:(1+N/2)));

subplot(2,1,1);
plot([Sx Sw]); %# linear axes, blue is unwindowed version
subplot(2,1,2);
loglog([Sx Sw]); %# both axes logarithmic

which results in the following graph: top: regular spectral plot, bottom: loglog spectral plot (blue is unwindowed) http://img710.imageshack.us/img710/3994/spectralplots.png

I'm letting Octave handle the scaling from linear to log x and y axes. Do you get something similar for a simple waveform like a sine wave?

OLD ANSWER

I'm not familiar with the visualizer you mention, but in general:

  • Spectra are often displayed using a log y-axis (or colormap for spectrograms).
  • Your FFT might be returning a double-sided spectrum, but you probably want to use only the first half (looks like you're doing that already).
  • Applying a window function to your time data reduces spectral leakage, so a tone shows up as a clean peak instead of smearing across many bins. The peak itself gets slightly wider, but the skirts around it drop off much faster (looks like you're doing this too).
  • You might need to divide by the transform blocksize if you're concerned with absolute magnitudes (I guess not important in your case).
  • It looks like the Ocean Mist visualizer is using a log x-axis too. It might also be smoothing adjacent frequency bins in sets or something.
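
To illustrate the last point, here is a minimal Python sketch of averaging adjacent bins into log-spaced bands. This is only a guess at the kind of smoothing a visualizer might do, not WMP's actual method. The names `log_band_edges` and `band_average` and the choice to skip the DC bin are assumptions made up for this example.

```python
def log_band_edges(n_bins, n_bands, first_bin=1):
    """Bin indices that split first_bin..n_bins into n_bands log-spaced bands.

    first_bin=1 skips the DC bin (an assumption for this sketch).
    """
    ratio = n_bins / first_bin
    edges = []
    for i in range(n_bands + 1):
        edge = int(round(first_bin * ratio ** (i / n_bands)))
        # Make sure each band starts after the previous one.
        if edges and edge <= edges[-1]:
            edge = edges[-1] + 1
        edges.append(edge)
    return edges


def band_average(magnitudes, n_bands):
    """Average the magnitudes inside each log-spaced band."""
    edges = log_band_edges(len(magnitudes), n_bands)
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        chunk = magnitudes[lo:hi]
        # A band can end up empty if n_bands is too large for the bin count.
        bands.append(sum(chunk) / len(chunk) if chunk else 0.0)
    return bands
```

With 16 bins and 4 bands the edges come out as 1, 2, 4, 8, 16, so each band covers twice as many bins as the one before it. That is why the high end of a log-x display looks smoother than the low end.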
Tadd answered 17/3, 2010 at 22:2 Comment(8)
I assume you mean log y-axis there, or is there a distinction? How would I implement it?Resinous
+1 for noting that both the x and y axis are logarithmic. The log-x aspect explains why the first narrow peak in the top plot is stretched to about 1/3 of the view in the lower plot. The log-y scaling explains why the variation between the peaks and the average values are compressed in the lower plot.Sallyanne
@Resinous - Both axes are logarithmic. I usually use Octave (a Matlab clone) for graphing, so I have to confess I'm not that good at mapping data to pixels myself. If you have a plotting library, look for loglog plotting (see en.wikipedia.org/wiki/Logarithmic_scale#Log-log_plots). If you're doing it yourself, make the display height proportional to log(spectrum amplitude), as @Paul R suggested. Then make display width proportional to log(freq/FMin), where FMin is the lowest frequency you want to display. I suggest 20 Hz to start with, but a higher number might look better.Tadd
@Tadd - Well, I (think I) implemented what you said, and it ended up like this: i41.tinypic.com/28jslj.jpg Not really what I expected. I might have screwed up though.Resinous
@Resinous - that definitely doesn't look right. Give me a few minutes, I'll make some graphs of what I think should happen.Tadd
Well, there's a clear difference between your graph and mine. Perhaps I can post my code and you can take a look? Not the FFT or anything, just the code that does the actual calculations and plotting.Resinous
@Bevin, sure go ahead. I'm going to be off-line for a couple of hours, but if you don't mind the delay I'd be happy to take a look, or maybe someone else will spot the issue.Tadd
Well, I posted it. The link is in the post at the top.Resinous
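
For anyone mapping the data to pixels by hand, the log-x rule from the comments above (display position proportional to log(freq/FMin)) could be sketched in Python roughly like this. The function names, the 20 kHz upper limit `f_max`, and the rounding to whole pixel columns are assumptions for illustration, not something taken from the original code.

```python
import math


def bin_frequency(k, n_fft, sample_rate):
    """Centre frequency in Hz of FFT bin k for a transform of size n_fft."""
    return k * sample_rate / n_fft


def freq_to_x(freq, width, f_min=20.0, f_max=20000.0):
    """Pixel column for a frequency on a log axis.

    Returns None for frequencies outside f_min..f_max, which are not drawn.
    f_max is a hypothetical upper display limit.
    """
    if freq < f_min or freq > f_max:
        return None
    t = math.log(freq / f_min) / math.log(f_max / f_min)
    return int(round(t * (width - 1)))
```

Note that at the low end several pixel columns fall between two neighbouring bins, so some interpolation or line drawing between points is needed to avoid gaps.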
B
3

Normally for this kind of thing you want to convert your FFT output to a power spectrum, usually with a log (dB) amplitude scale, e.g. for a given output bin:

p = 10.0 * log10 (re * re + im * im);
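
A slightly more defensive version of the same idea, sketched in Python: it clamps the result so that an all-zero bin does not hit log10(0), and then maps the dB value onto a 0..1 fraction of the display height. The floor of -120 dB and the -60..0 dB display range are arbitrary assumptions; pick whatever offsets suit your data.

```python
import math


def bin_to_db(re, im, floor_db=-120.0):
    """Power of one FFT bin in dB, clamped at floor_db (assumed value)."""
    power = re * re + im * im
    if power <= 0.0:
        # log10(0) is undefined, so report the floor for silent bins.
        return floor_db
    return max(floor_db, 10.0 * math.log10(power))


def db_to_height(db, min_db=-60.0, max_db=0.0):
    """Map a dB value to 0.0..1.0 of the display height, clipping outside."""
    t = (db - min_db) / (max_db - min_db)
    return min(1.0, max(0.0, t))
```

Multiply the result of `db_to_height` by the pixel height of the view to get the bar or line height.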

Bookmaker answered 17/3, 2010 at 22:3 Comment(5)
Do I have to normalize this "p"? Like, dividing it by n/2 afterward?Resinous
It's a dB value - you can add or subtract a suitable dB offset to get it into whatever range you want. You can then convert this dB value to screen coordinates or pixel intensity or whatever is appropriate for your visualizer.Bookmaker
Well, I tried using your formula, and it came across as kind of noisy. Here, take a look: i39.tinypic.com/15eig3s.jpgResinous
In order to test your implementation you want to start with a simple signal with a known spectrum. Start with e.g. a single pure tone (sine wave) at say 1 kHz and see what that looks like - you should just get a single large peak. If not then you're doing something wrong with your FFT and/or plotting code.Bookmaker
@Resinous - @Paul R's suggestion for taking the log of the squared amplitude is right on. Looking at your second picture, it looks like you need to add a window. Multiply your time domain data by a function of the form 0.5*(1 - cos(2*pi*n/N)), where N is your transform blocksize. See en.wikipedia.org/wiki/Window_function for background.Tadd
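
The window from the last comment, 0.5*(1 - cos(2*pi*n/N)), could be written in plain Python roughly as follows. The helper names `hann_periodic` and `apply_window` are made up for this sketch; the formula itself is the one quoted above.

```python
import math


def hann_periodic(n_samples):
    """Periodic Hann window: 0.5 * (1 - cos(2*pi*n/N)) for n = 0..N-1."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / n_samples))
            for n in range(n_samples)]


def apply_window(samples):
    """Multiply a block of time-domain samples by the Hann window."""
    window = hann_periodic(len(samples))
    return [s * w for s, w in zip(samples, window)]
```

Apply it to each block of time-domain samples before the FFT, not to the spectrum afterwards.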
C
1

It definitely looks like the ocean mist Y-Axis is logarithmic.

Context answered 17/3, 2010 at 22:19 Comment(1)
So, how would I implement a Y-log scale? Use the log(absolute magnitude) as the y-value?Resinous
K
1

It seems to me that not only the y axis but also the x axis is logarithmic. The distance between peaks seems to shrink at higher frequencies.

Kalina answered 17/3, 2010 at 22:44 Comment(0)
