Transforming Audio Samples From Time Domain to Frequency Domain

As a software engineer, I am facing some difficulties while working on a signal processing problem. I don't have much experience in this area.

What I am trying to do is sample environmental sound at a 44100 Hz sampling rate and, for fixed-size windows, test whether a specific frequency (20 kHz) is present with a level above a threshold value.

Here is what I do, following the excellent answer to "How to extract frequency information from samples from PortAudio using FFTW in C":

102400 samples (2320 ms) are gathered from the audio port at a 44100 Hz sampling rate. Sample values are between 0.0 and 1.0.

int samplingRate = 44100;
int numberOfSamples = 102400;
float samples[numberOfSamples] = ListenMic_Function(numberOfSamples,samplingRate);

Window size or FFT Size is 1024 samples (23.2 ms)

int N = 1024;

Number of windows is 100

int noOfWindows = numberOfSamples / N;

Splitting the samples into noOfWindows (100) windows, each of size N (1024) samples

float windowSamplesIn[noOfWindows][N];
for i:= 0 to noOfWindows -1 
    windowSamplesIn[i] = subarray(samples,i*N,(i+1)*N);
endfor

Applying Hanning window function on each window

float windowSamplesOut[noOfWindows][N];
for i:= 0 to noOfWindows -1 
    windowSamplesOut[i] = HanningWindow_Function(windowSamplesIn[i]);
endfor
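The splitting and windowing steps above can be sketched in plain Python as follows. This is only an illustration under assumptions: `hann_window` and `split_and_window` are hypothetical names that do not come from the original post, the windows do not overlap, and the periodic form of the Hann window is used.

```python
import math

def hann_window(n):
    """Return n Hann coefficients (periodic form, a common choice for FFT analysis)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def split_and_window(samples, n):
    """Split samples into consecutive, non-overlapping windows of n samples
    and multiply each window by a Hann window. Leftover samples are dropped."""
    w = hann_window(n)
    count = len(samples) // n
    return [[samples[k * n + i] * w[i] for i in range(n)] for k in range(count)]

# 102400 samples of a constant signal, split into windows of 1024 samples
windows = split_and_window([1.0] * 102400, 1024)
print(len(windows), len(windows[0]))  # 100 1024
```

The window tapers each block to zero at its edges, which reduces leakage into neighbouring bins when the block is passed to an FFT.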

Applying FFT on each window (real to complex conversion done inside the FFT function)

float frequencyData[noOfWindows][samplingRate/2]; 
for i:= 0 to noOfWindows -1 
    frequencyData[i] = RealToComplex_FFT_Function(windowSamplesOut[i], samplingRate);
endfor

In the last step, I use the FFT function implemented at this link: http://www.codeproject.com/Articles/9388/How-to-implement-the-FFT-algorithm, because I cannot implement an FFT function from scratch.

What I am not sure about is this: when I give N (1024) samples to the FFT function as input, samplingRate/2 (22050) decibel values are returned as output. Is that what an FFT function does?

I understand that, because of the Nyquist frequency, I can detect frequencies up to at most half the sampling rate. But is it possible to get a decibel value for each frequency up to samplingRate/2 (22050) Hz?

Thanks, Vahit

Probability answered 19/8, 2012 at 20:11 Comment(0)

See "How do I obtain the frequencies of each value in an FFT?"

From a 1024-sample real input, you get back 513 meaningful frequency levels: bins 0 (DC) through 512 (Nyquist). The remaining bins are mirror images of these.

So, yes, within your window, you'll get back a level for the Nyquist frequency.

The lowest frequency level you'll see is for DC (0 Hz), the next one up will be for SampleRate/1024, or around 43 Hz, the next for 2 * SampleRate/1024, and so on, up to 512 * SampleRate/1024 Hz, which is 22050 Hz.
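To make the bin-to-frequency mapping concrete, here is a minimal Python sketch. The helper names (`bin_frequency`, `nearest_bin`, `bin_level_db`) are hypothetical and not part of any library. The last function evaluates a single DFT bin directly from its definition rather than with an FFT, and it assumes the input is scaled so that a full-scale sine centred on a bin reads 0 dB when no window is applied.

```python
import cmath
import math

def bin_frequency(k, sample_rate, n):
    """Centre frequency in Hz of bin k for an n-point transform."""
    return k * sample_rate / n

def nearest_bin(freq_hz, sample_rate, n):
    """Index of the bin whose centre is closest to freq_hz."""
    return int(round(freq_hz * n / sample_rate))

def bin_level_db(window, k):
    """Level of bin k in dB, computed directly from the DFT definition.
    Valid for 0 < k < n/2; a unit-amplitude sine centred on bin k gives 0 dB."""
    n = len(window)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(window))
    amplitude = 2.0 * abs(acc) / n
    return 20.0 * math.log10(max(amplitude, 1e-12))  # floor avoids log10(0)

sample_rate = 44100
n = 1024
k = nearest_bin(20000.0, sample_rate, n)
print(k, bin_frequency(k, sample_rate, n))  # 464 19982.8125
```

So with a 1024-sample window at 44100 Hz, a 20 kHz tone falls closest to bin 464, and that is the level to compare against the threshold. A tone that is not exactly centred on a bin spreads some of its energy into adjacent bins.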

Emaemaciate answered 19/8, 2012 at 20:24 Comment(4)
For another perspective you may find this post useful: blog.bjornroche.com/2012/07/… - Twosome
And for a perspective on why the above blog post incorrectly confuses frequency with pitch, here's another blog post: musingpaw.com/2012/04/… - Doloritas
I agree with you, David. In conclusion, I think the FFT implementation here has a mistake at the end of the function, where the fundamental frequency is calculated. What do you think, friends? - Probability
@Vahocan : FFTs do not compute fundamental frequency. An FFT gives you all the frequencies, of which the fundamental frequency may not have the largest magnitude nor be at any bin center. - Doloritas

Since only one band of your FFT is used, I would expect your results to be tarnished by side-band effects, even with proper windowing. It might work, but you might also get false positives with some input frequencies. Also, your signal is close to your Nyquist frequency, so you are assuming a fairly good signal path up to your FFT. I don't think this is the right approach.

I think a better approach to this kind of signal detection would be a high-order filter (depending on your requirements, I would guess fourth or fifth order, which isn't actually that high). If you don't know how to design a high-order filter, you could use two or three second-order filters in series. Designing a second-order filter, sometimes called a "biquad", is described here:

http://www.musicdsp.org/files/Audio-EQ-Cookbook.txt

albeit very tersely and with some assumptions of prior knowledge. I would use a high-pass (HP) filter with corner frequency as low as you can make it, probably between 18 and 20 kHz. Keep in mind there is some attenuation at the corner frequency, so after applying a filter multiple times you will drop a little signal.
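As a rough illustration of the above, here is a Python sketch of a cascade of identical second-order high-pass sections using the high-pass coefficient formulas from the Audio EQ Cookbook. The function names, the 18 kHz corner, the Q of 0.7071, and the choice of three stages are assumptions made for the example, not a tuned design.

```python
import math

def highpass_biquad(f0, sample_rate, q=0.7071):
    """Audio EQ Cookbook high-pass coefficients, normalized so that a[0] == 1."""
    w0 = 2.0 * math.pi * f0 / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 + cos_w0) / 2.0 / a0, -(1.0 + cos_w0) / a0, (1.0 + cos_w0) / 2.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a

def apply_biquad(samples, b, a):
    """Run one biquad over the samples in direct form I, sample by sample."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def cascade(samples, f0, sample_rate, stages=3):
    """Apply the same high-pass biquad several times in series."""
    b, a = highpass_biquad(f0, sample_rate)
    for _ in range(stages):
        samples = apply_biquad(samples, b, a)
    return samples

# Example: a 20 kHz tone should pass mostly intact, a 1 kHz tone should not.
fs = 44100
tone_20k = [math.sin(2.0 * math.pi * 20000.0 * i / fs) for i in range(4096)]
filtered = cascade(tone_20k, 18000.0, fs)
```

Repeating one section is the simplest way to raise the order. Each pass also attenuates a little more at the corner frequency, which is the small signal loss mentioned above.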

After you filter the audio, take the RMS or average amplitude (that is, the average of the absolute value), to find the average level over a time period.
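The level measurement can be sketched like this in Python. The names `rms`, `mean_abs`, and `exceeds_threshold` are hypothetical, and the threshold value is something you would choose for your own signal path.

```python
import math

def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def mean_abs(samples):
    """Average of the absolute sample values."""
    return sum(abs(x) for x in samples) / len(samples)

def exceeds_threshold(samples, threshold):
    """True if the RMS level of the (already filtered) block is above threshold."""
    return rms(samples) > threshold

# A unit-amplitude sine over whole cycles has an RMS of about 0.707.
sine = [math.sin(2.0 * math.pi * i / 100.0) for i in range(1000)]
print(round(rms(sine), 3))  # 0.707
```

Either measure works for a simple presence test. RMS weights peaks more heavily, while the mean absolute value is slightly cheaper to compute.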

This technique has several advantages over what you are doing now, including better latency (you can start detecting within a few samples) and better reliability (you won't get false positives in response to loud signals at spurious frequencies).

This post might be of relevance: http://blog.bjornroche.com/2012/08/why-eq-is-done-in-time-domain.html

Twosome answered 20/8, 2012 at 0:43 Comment(5)
At first I tried to solve my problem in the time domain, but I totally failed. What I did was simply: collect samples, then apply the HPF published here to all of them. Should I apply a window function before applying the HPF to the samples? Really, I am a long way from understanding signal processing. - Probability
No windowing needed; just go sample by sample. The filter you link to is first order, which is not nearly selective enough. - Twosome
Then can you direct me to high-order HPF implementations, e.g. on your blog? Thanks. - Probability
Best thing I know of is the Audio-EQ-Cookbook. Those are second order. Do two or three (or more!) of those in series. - Twosome
Better late than never. Here is a how-to for filters. It's written for audio EQing, but the principles are the same: blog.bjornroche.com/2012/08/basic-audio-eqs.html - Twosome
