Trying to understand the output of AKFFTTap in AudioKit

Using AudioKit, I'm trying to build an app that analyses the microphone input and separates the incoming sound into three frequency ranges (low, mid, high), each with its own amplitude.

This is the code I have:

class ViewController: UIViewController {

    var mic: AKMicrophone!
    var amplitude: AKAmplitudeTracker!
    var fftTap: AKFFTTap?
    var timer:  Timer!

    override func viewDidLoad() {
        super.viewDidLoad()
        // Do any additional setup after loading the view, typically from a nib.

        mic = AKMicrophone()
        fftTap = AKFFTTap.init(mic)

    }

    override func viewDidAppear(_ animated: Bool) {
        super.viewDidAppear(animated)

        do {
            try AudioKit.start()
        } catch {
            AKLog("AudioKit did not start!")
        }

        mic.start()

        timer = Timer.scheduledTimer(withTimeInterval: 0.01, repeats: true, block: { (timer) in

            for i in 0...256 {
                print(Double(self.fftTap?.fftData[i] ?? 0.0))
            }

        })
    }

}

But now I have no idea what the output actually means.

How do I get the max amplitude for a certain frequency range? I need all three ranges at the same time, so I think a frequency tracker alone won't do it.

From reading documentation about FFT, I understand that each of the first 256 bins represents the amplitude of a certain frequency. But I only found Matlab examples that turn those values into plots (which don't really make sense to me).

Cyrie answered 7/10, 2018 at 10:43

I found a code-snippet on Google that helped me solve my problem:

https://groups.google.com/forum/#!topic/comp.dsp/cZsS1ftN5oI

Specifically this part:

/* do FFT (taken from NR [http://www.nr.com] but uses array of doubles) */
    four1(fftBuffer-1, FFT_SIZE, 1);

/* display 15 bins around the frequency of interest */
    for (long k = 80; k < 110; k += 2) {

    /* real */
        double re = fftBuffer[k];

    /* imaginary */
        double im = fftBuffer[k+1];

    /* get normalized bin magnitude */
        double normBinMag = 2.*sqrt(re*re + im*im) / FFT_SIZE;

    /* convert to dB value */
        double amplitude = 20. * log10( normBinMag );

    /* and display */
        /* k is a long, so the bin index k/2 needs %ld */
        printf("bin: %ld,\tfreq: %f [Hz],\tmag: %f,\t ampl.: %f [dB]\n",
               k/2, sampleRate*.5*(double)k/FFT_SIZE, normBinMag, amplitude);
    }
}

/* Program output:

bin: 40,    freq: 861.328125 [Hz],  mag: 0.000000,   ampl.: -182.347994 [dB]
bin: 41,    freq: 882.861328 [Hz],  mag: 0.000000,   ampl.: -180.895076 [dB]
bin: 42,    freq: 904.394531 [Hz],  mag: 0.000000,   ampl.: -179.201401 [dB]
bin: 43,    freq: 925.927734 [Hz],  mag: 0.000000,   ampl.: -177.156879 [dB]
bin: 44,    freq: 947.460938 [Hz],  mag: 0.000000,   ampl.: -174.555312 [dB]
bin: 45,    freq: 968.994141 [Hz],  mag: 0.000000,   ampl.: -170.934049 [dB]
bin: 46,    freq: 990.527344 [Hz],  mag: 0.000000,   ampl.: -164.817195 [dB]
bin: 47,    freq: 1012.060547 [Hz], mag: 1.000000,   ampl.: 0.000000 [dB]
bin: 48,    freq: 1033.593750 [Hz], mag: 0.000000,   ampl.: -164.633624 [dB]
bin: 49,    freq: 1055.126953 [Hz], mag: 0.000000,   ampl.: -170.566625 [dB]
bin: 50,    freq: 1076.660156 [Hz], mag: 0.000000,   ampl.: -174.003468 [dB]
bin: 51,    freq: 1098.193359 [Hz], mag: 0.000000,   ampl.: -176.419757 [dB]
bin: 52,    freq: 1119.726562 [Hz], mag: 0.000000,   ampl.: -178.277857 [dB]
bin: 53,    freq: 1141.259766 [Hz], mag: 0.000000,   ampl.: -179.783660 [dB]
bin: 54,    freq: 1162.792969 [Hz], mag: 0.000000,   ampl.: -181.046952 [dB]

*/
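
For readers who don't read C, the per-bin arithmetic of that loop can be sketched in plain Swift. This is only a sketch: it assumes the buffer holds interleaved (real, imaginary) pairs like `fftBuffer` above, and `binInfo` with its parameter names is a made-up helper, not part of any library. The sample values (a buffer of 4096 doubles, `fftSize` 2048) are assumptions chosen so the result lines up with the "bin: 47" line of the program output; the snippet itself does not state its `FFT_SIZE`.

```swift
import Foundation

/// Hypothetical helper: per-bin values computed the same way as the C loop.
/// Assumes `buffer` holds interleaved (real, imaginary) pairs, so index k
/// is the real part and k + 1 the imaginary part of bin k / 2.
func binInfo(buffer: [Double], k: Int, fftSize: Int, sampleRate: Double)
    -> (bin: Int, frequency: Double, magnitude: Double, decibels: Double) {
    let re = buffer[k]
    let im = buffer[k + 1]
    // Normalized bin magnitude, same formula as normBinMag in the C code
    let magnitude = 2.0 * sqrt(re * re + im * im) / Double(fftSize)
    // Convert to dB; a magnitude of 0 gives -infinity
    let decibels = 20.0 * log10(magnitude)
    // Same frequency formula as in the printf call
    let frequency = sampleRate * 0.5 * Double(k) / Double(fftSize)
    return (k / 2, frequency, magnitude, decibels)
}

// Made-up test data: all zeros except the real part of bin 47.
var buffer = [Double](repeating: 0.0, count: 4096)
buffer[94] = 1024.0
let info = binInfo(buffer: buffer, k: 94, fftSize: 2048, sampleRate: 44100)
print("bin: \(info.bin), freq: \(info.frequency) Hz, mag: \(info.magnitude), ampl.: \(info.decibels) dB")
```

With these numbers the magnitude is 2 * 1024 / 2048 = 1.0, which is 0 dB, at roughly 1012.06 Hz.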

[Edit]

As requested, here's the Swift code:

//
//  ViewController.swift
//

import AudioKit
import UIKit

class ViewController: UIViewController {

    var mic: AKMicrophone!
    var fftTap: AKFFTTap?
    var timer:  Timer!
    let FFT_SIZE = 512
    let sampleRate: Double = 44100

    override func viewDidLoad() {
        super.viewDidLoad()

        mic = AKMicrophone()

        fftTap = AKFFTTap.init(mic)

    }

    override func viewDidAppear(_ animated: Bool) {
        super.viewDidAppear(animated)

        do {
            try AudioKit.start()
        } catch {
            AKLog("AudioKit did not start!")
        }

        mic.start()

        timer = Timer.scheduledTimer(withTimeInterval: 0.1, repeats: true, block: { (timer) in

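            // Step by 2 so that each bin is read once as a (re, im) pair,
            // mirroring the C loop above (k += 2).
            for i in stride(from: 0, to: self.FFT_SIZE - 1, by: 2) {

                let re = self.fftTap!.fftData[i]
                let im = self.fftTap!.fftData[i + 1]
                // FFT_SIZE is an Int, so convert it before dividing a Double
                let normBinMag = 2.0 * sqrt(re * re + im * im) / Double(self.FFT_SIZE)
                let amplitude = 20.0 * log10(normBinMag)
                // Same frequency formula as in the C snippet
                let frequency = self.sampleRate * 0.5 * Double(i) / Double(self.FFT_SIZE)

                print("bin: \(i/2) \t freq: \(frequency)\t ampl.: \(amplitude)")
            }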

            // Now do anything you like with the data.
            // Be aware, though, that the amplitude is a negative number:
            // the lower it is, the less input it represents.
            // In my tests, the lowest number was around -260.
            // Read more on Google about converting the negative
            // number to a positive one.

        })
    }

}
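
To get back to the original question (one value each for low, mid and high), the per-bin values can be grouped by frequency. The following is a minimal sketch in plain Swift with no AudioKit involved. It assumes you already have one linear magnitude per bin (such as `normBinMag` above) and that bin i sits at i * binWidth Hz. `bandPeaks`, `normalized`, the 250 Hz and 4000 Hz band edges and the -100 dB floor are all made up for illustration, so pick whatever suits your app.

```swift
import Foundation

/// Hypothetical helper: largest magnitude in each of three frequency bands.
/// `magnitudes[i]` is assumed to be the linear magnitude of bin i,
/// and bin i is assumed to sit at i * binWidth Hz.
func bandPeaks(magnitudes: [Double], binWidth: Double,
               lowMidSplit: Double = 250.0, midHighSplit: Double = 4000.0)
    -> (low: Double, mid: Double, high: Double) {
    var low = 0.0, mid = 0.0, high = 0.0
    for (i, m) in magnitudes.enumerated() {
        let frequency = Double(i) * binWidth
        if frequency < lowMidSplit {
            low = max(low, m)
        } else if frequency < midHighSplit {
            mid = max(mid, m)
        } else {
            high = max(high, m)
        }
    }
    return (low, mid, high)
}

/// Hypothetical helper: map a (negative) dB value onto 0...1.
/// Anything at or below `floorDB` becomes 0, anything at or above 0 dB becomes 1.
func normalized(decibels: Double, floorDB: Double = -100.0) -> Double {
    let clamped = min(max(decibels, floorDB), 0.0)
    return (clamped - floorDB) / (0.0 - floorDB)
}

// Made-up data: 256 bins with the bin width implied by the code above
// (44100 Hz sample rate, 512 array entries).
let binWidth = 44100.0 / 512.0
var magnitudes = [Double](repeating: 0.0, count: 256)
magnitudes[1] = 0.2    // about 86 Hz, low band
magnitudes[10] = 0.5   // about 861 Hz, mid band
magnitudes[100] = 0.1  // about 8613 Hz, high band

let peaks = bandPeaks(magnitudes: magnitudes, binWidth: binWidth)
print("low: \(peaks.low), mid: \(peaks.mid), high: \(peaks.high)")
print("-50 dB maps to \(normalized(decibels: -50.0))")
```

Taking the maximum per band is only one option; summing or averaging the magnitudes within each band would work the same way if that fits the app better.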
Cyrie answered 13/10, 2018 at 19:24 Comment(6)
Hi, I am facing the same problem these days. Can you please also share your AudioKit code and how you normalized the numbers retrieved from the FFTTap? And what does "bin" mean? Thanks! - Sachasachem
By the way, I thought the values returned from the AKFFTTap were already the magnitude values. - Sachasachem
The return values from FFTTap are a combination of frequency and amplitude. A «bin» is basically just one entry in the array. You have to do some math to get those values separately. I'll post the final code when I get home after work. - Cyrie
I updated my code above with the Swift code. Hope it helps! - Cyrie
Thanks, that helps a lot! Just a few questions as a newbie in the sound-processing world: is normBinMag a relative number calculated for each bin, or is it the magnitude spread over all bins (meaning the sum of normBinMag over all bins should be 1)? And are the negative amplitude values in dB? - Sachasachem
Sorry, I have to pass on your first question (my knowledge of sound processing might be as limited as yours ;). I have no idea what normBinMag is, but you might be right. The negative amplitude values should be in dB, but I'm not quite sure about this either. For my app it doesn't really matter, as I just need the relative values. Hope Google can help you further. A search for "FFT dB values" should bring up some results. - Cyrie
