How do I obtain the frequencies of each value in an FFT?

Asked 6/12, 2010 at 9:18 Answered 20/8, 2012 at 9:57

179

I have an FFT result. These are stored in two double arrays: a real part array and an imaginary part array. How do I determine the frequencies that correspond to each element in these arrays?

In other words, I would like have create an array that stores the frequencies for each real and imaginary component of my FFT.

Peta answered 6/12, 2010 at 9:18 Comment(3)

I do it in C#.net. Can you help me? – Peta 6/12, 2010 at 17:55

If you don't understand the relevance of the real and imaginary parts of an FFT then you aren't going to get any meaningful results, so you should hunt out some FFT and signal processing tutorials to understand how to interpret the results. I think it's quite likely that whatever you're using it for, you are wanting the magnitude of the FFT or the Power Spectral Density. – Teazel 6/12, 2010 at 23:28

Thank you! I want to get peak frequencies of each frame (frame length depend in Window Length and Shift Length) – Peta 7/12, 2010 at 3:45

435

The first bin in the FFT is DC (0 Hz), the second bin is Fs / N, where Fs is the sample rate and N is the size of the FFT. The next bin is 2 * Fs / N. To express this in general terms, the nth bin is n * Fs / N.

So if your sample rate, Fs is say 44.1 kHz and your FFT size, N is 1024, then the FFT output bins are at:

  0:   0 * 44100 / 1024 =     0.0 Hz
  1:   1 * 44100 / 1024 =    43.1 Hz
  2:   2 * 44100 / 1024 =    86.1 Hz
  3:   3 * 44100 / 1024 =   129.2 Hz
  4: ...
  5: ...
     ...
511: 511 * 44100 / 1024 = 22006.9 Hz

Note that for a real input signal (imaginary parts all zero) the second half of the FFT (bins from N / 2 + 1 to N - 1) contain no useful additional information (they have complex conjugate symmetry with the first N / 2 - 1 bins). The last useful bin (for practical aplications) is at N / 2 - 1, which corresponds to 22006.9 Hz in the above example. The bin at N / 2 represents energy at the Nyquist frequency, i.e. Fs / 2 ( = 22050 Hz in this example), but this is in general not of any practical use, since anti-aliasing filters will typically attenuate any signals at and above Fs / 2.

Washedup answered 6/12, 2010 at 22:35 Comment(16)

Ah, if I get sqrt(realreal + imaginaryimaginary) on each of those complex numbers(real and imaginary parts of result), should it return the frequency (Hz)? – Peta 7/12, 2010 at 3:30

@user532017: no - sqrt(re*re+im*im) will be the magnitude of the signal at the frequency of the given bin. The frequency of the bin is determined by its index as per my answer above. If you find the bin with the largest magnitude then you can determine the frequency of this peak from the index as above. – Washedup 7/12, 2010 at 8:4

I used FFT in a wave file. If I use NPoint = 512 , I will have Peak frequency is a Hz. If I use NPoint= 1024, I will have peak frequency is b Hz. Why the same wave file is the peak frequency difference? I don't understand that much. Please explain this for me, thank so very much. – Peta 7/12, 2010 at 9:3

Follow some document so that find the peak frequency, before I have to calculate spectrogram(rere + imim) array, then I have to find max value in spectrogram array above. Final I get index of element is max value in array, peakFrequency = index * SampleRate / NumSample. Is this right? Thank for your enthusiasm – Peta 7/12, 2010 at 9:30

@user532017: yes, this is a very important point - the larger N (the longer the FFT) the higher the resolution - bigger N means each bin is narrower. The frequency range remains the same though: 0 to Fs/2. – Washedup 7/12, 2010 at 10:12

@user523017: yes, I think you have this correct now - the term spectrogram is not quite correct (it's really just a single power spectrum, not a spectrogram, which is a time-varying sequence of power spectra), but you have the right general idea. – Washedup 7/12, 2010 at 10:14

Thank Paul very much. I had done it. But the results aren't slightly bad. I'm a wave file, I want to get peaks frequency at each Frame. I choose Frame's window length and shift length are 512 and 256, my N-Point=window length=512. Then I perform FFT in each Frame. My wave file is only a segment of whistles. My final result is the peaks frequency array of each frame. This is: [3750, 3750,3750,3750,4156.25, 531.25,875,406.25, 4125,4250,4250]. I don't understand a reason do my result have [..., 531.25, 875, 406.25,...] because frequency of whistle is between 3500-4500Hz. Do you know a reason? – Peta 8/12, 2010 at 3:14

@user532017: you should try plotting your power spectrum to see if it look reasonable - you may have bugs or other problems still to iron out. Also you need to use a window function prior to the FFT, otherwise you will get artefacts in the power spectrum. Read this: en.wikipedia.org/wiki/Window_function - I suggest you use a von Hann (Hanning) window. – Washedup 8/12, 2010 at 7:56

@PaulR - I wanted to thank you for this wonderful answer that has served me over the years. I would visit this answer before I had a StackOverflow account, and I actually forgot about thanking you once I signed up. I was recently taking a look at FFT stuff and I remembered your answer and just visited it now. Once I got here, I remembered to thank you... so thank you! Whenever I have a debate with someone on interpreting what the each point on the horizontal axis of the FFT is, I just point them to this link. – Tennes 30/1, 2015 at 21:21

@rayryeng: thank you so much - I think that's the nicest acknowledgement I've ever had in ~5 years of answering questions here on SO! – Washedup 30/1, 2015 at 22:17

I'm confused by the notion that 0 Hz has an amplitude, though: 0 Hz is literally no signal, so is that actually not 0 Hz but instead "the bin for any frequencies between 0Hz and 43.1Hz"? – Sandpaper 12/4, 2020 at 21:44

@Mike'Pomax'Kamermans: think of the bin 0 component of the signal as the average value aka arithmetic mean (which is exactly what it is, mathematically). From an electronics or audio perspective this is the DC component. You’re right though that this bin will also contain some energy from signals with very low frequencies. – Washedup 13/4, 2020 at 6:56

Does this mean when we calculate magnitude later, we can do it for half the array? Does the other half of the array have any use at all? – Womenfolk 26/7, 2021 at 8:42

@KillzoneKid: for the case of a purely real input you can just ignore the upper half of the spectrum. If you care about absolute magnitude then you'll need to apply a factor of 2 to the magnitudes in the lower half of the spectrum to get the correct value (apart from the DC component at bin 0, which has no complement). – Washedup 26/7, 2021 at 11:13

@PaulR Thank you for clarification. The documentation for FFT function says "Internally input is downscaled by 2 for every stage to avoid saturations inside CFFT/CIFFT process" which I assume you are reffering to. But this would mean that upscaling factor will depend on FFT size, or am I missing something? – Womenfolk 26/7, 2021 at 11:50

@KillzoneKid: no, that’s a different (internal) scaling factor. Usually there is a factor of 1/N for the size of the FFT (this is implementation-dependent however), and then there’s a factor of 2 if you’re only using the bottom half of the resulting FFT as mentioned above. You may also need other factors, for hardware calibration, window function loss, etc - these can often be combined into a single overall scale factor, for the sake of efficiency. – Washedup 26/7, 2021 at 12:21

Take a look at my answer here.

Answer to comment:

The FFT actually calculates the cross-correlation of the input signal with sine and cosine functions (basis functions) at a range of equally spaced frequencies. For a given FFT output, there is a corresponding frequency (F) as given by the answer I posted. The real part of the output sample is the cross-correlation of the input signal with cos(2*pi*F*t) and the imaginary part is the cross-correlation of the input signal with sin(2*pi*F*t). The reason the input signal is correlated with sin and cos functions is to account for phase differences between the input signal and basis functions.

By taking the magnitude of the complex FFT output, you get a measure of how well the input signal correlates with sinusoids at a set of frequencies regardless of the input signal phase. If you are just analyzing frequency content of a signal, you will almost always take the magnitude or magnitude squared of the complex output of the FFT.

Ventricle answered 6/12, 2010 at 16:51 Comment(3)

The real and Imaginary part are FFT's result that used for? Please explain for me. Thank you – Peta 6/12, 2010 at 17:59

Could it be that the magnitude of the complex outputs has to be doubled each? (if I restrict my interpretation to the lower half) – Ejection 21/1, 2016 at 11:31

@Wolf: DFT uses complex exponentials. Real-valued signals are sums of cos functions, exponentials appear by pair of conjugates (look for "inverse Euler formula" for cos function), making the mathematical spectrum symmetrical about 0 Hz. Usually it's not necessary to process the negative conjugates, the spectrum can be converted back into actual cos functions via Euler formula. In practical it consists in doubling magnitudes of positive components, except for DC (+/-0 Hz). Note negative frequency exponentials are only mathematical objects, actual (cos) signals have only positive frequencies. – Introject 28/5, 2023 at 7:0

I have used the following:

public static double Index2Freq(int i, double samples, int nFFT) {
  return (double) i * (samples / nFFT / 2.);
}

public static int Freq2Index(double freq, double samples, int nFFT) {
  return (int) (freq / (samples / nFFT / 2.0));
}

The inputs are:

i: Bin to access
samples: Sampling rate in Hertz (i.e. 8000 Hz, 44100Hz, etc.)
nFFT: Size of the FFT vector

Nystagmus answered 20/8, 2012 at 9:57 Comment(3)

The accepted answer says this should be i * samples / nFFT. Why is the extra 2 there? Am I missing something? – Pliam 14/5, 2014 at 9:38

@yatisagade if your sampling freq is 44100Hz, the highest frequency you can actually obtain is half, so 22050Hz. It's called Nyquist Frequency (check wikipedia for detailed info). The human hearing system can not perceive anything >~20kHz, so that is why 44.1kHz is the most common sampling rate. – Consignment 15/12, 2021 at 16:2

Also, I think in this case nFFT relates to only half of the actual FFT (many functions do that, since the second half is simply the mirror of the first half) – Consignment 15/12, 2021 at 16:11

The FFT output coefficients (for complex input of size N) are from 0 to N - 1 grouped as [LOW,MID,HI,HI,MID,LOW] frequency.

I would consider that the element at k has the same frequency as the element at N-k since for real data, FFT[N-k] = complex conjugate of FFT[k].

The order of scanning from LOW to HIGH frequency is

0,

 1,
 N-1,

 2,
 N-2

 ...

 [N/2] - 1,
 N - ([N/2] - 1) = [N/2]+1,

 [N/2]

There are [N/2]+1 groups of frequency from index i = 0 to [N/2], each having the frequency = i * SamplingFrequency / N

So the frequency at bin FFT[k] is:

if k <= [N/2] then k * SamplingFrequency / N
if k >= [N/2] then (N-k) * SamplingFrequency / N

Imperium answered 30/8, 2011 at 11:28 Comment(0)

Your k^th FFT result's frequency is 2*pi*k/N.

Atthia answered 6/12, 2010 at 13:11 Comment(1)

I guess this will be in radians – Symphonist 11/8, 2014 at 20:54

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags