Extract Treble and Bass from audio in iOS

Asked 16/3, 2013 at 22:55 Answered 25/4, 2013 at 15:58

I'm looking for a way to get the treble and bass data from a song at some fixed time increment (say every 0.1 seconds), in the range of 0.0 to 1.0. I've googled around but haven't been able to find anything remotely close to what I'm looking for. Ultimately I want to be able to represent the treble and bass level while the song is playing.

Thanks!

Madelinemadella answered 16/3, 2013 at 22:55 Comment(4)

See the accepted answer to this question: #1794510 – Jolinejoliotcurie 17/3, 2013 at 3:38

OK, that explains the procedure but not how to perform that procedure on iOS, or at least where to start. – Madelinemadella 17/3, 2013 at 15:11

iOS has low and high pass filters built-in in the audio unit framework. – Jolinejoliotcurie 17/3, 2013 at 23:24

can you supply some example code? I have no idea what direction to take with this. – Madelinemadella 21/4, 2013 at 10:1

It's reasonably easy. You need to perform an FFT and then sum up the bins that interest you. Which bins you select will depend on the sampling rate of your audio.

You then need to choose an appropriate FFT order to get good information in the frequency bins returned.

So if you do an order 8 FFT you will need 256 samples (2^8). This will give you 128 usable complex bins; for real input, any bins beyond those simply mirror them.

Next you need to convert these to magnitudes. This is actually quite simple. If you are using std::complex you can simply call std::abs on the complex number and you will have its magnitude (sqrt( r^2 + i^2 )).
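As a minimal sketch of that step, assuming the FFT result is already in a std::vector of std::complex<float> holding all N outputs (the function name binMagnitudes is made up for illustration):

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Convert the first half of an N-point FFT result (bins 0 .. N/2 - 1)
// to magnitudes. For real input the second half mirrors the first,
// so it is skipped here.
std::vector<float> binMagnitudes(const std::vector<std::complex<float>>& bins)
{
    const std::size_t half = bins.size() / 2;
    std::vector<float> mags(half);
    for (std::size_t i = 0; i < half; ++i)
        mags[i] = std::abs(bins[i]);   // sqrt(re^2 + im^2)
    return mags;
}
```

For a 256-point FFT this returns 128 magnitudes, one per usable bin.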

Interestingly, at this point there is something called Parseval's theorem. This theorem states that, after performing a Fourier transform, the sum of the squared magnitudes of the bins returned is equal (up to a scale factor that depends on the FFT implementation) to the sum of the squares of the input samples.
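If you want to convince yourself of this, here is a small numerical check. It is only a sketch: it uses a naive, unnormalised O(N^2) DFT as a stand-in for a real FFT, and the helper names (naiveDft, timeEnergy, freqEnergy) are made up for illustration. For an unnormalised N-point transform, the sum of the squared samples should equal the sum of the squared bin magnitudes divided by N.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive, unnormalised DFT. Far too slow for real use, but easy to verify.
std::vector<std::complex<double>> naiveDft(const std::vector<double>& x)
{
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> acc(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = -2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            acc += x[t] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        out[k] = acc;
    }
    return out;
}

// Sum of the squared input samples.
double timeEnergy(const std::vector<double>& x)
{
    double sum = 0.0;
    for (double v : x) sum += v * v;
    return sum;
}

// Sum of the squared bin magnitudes, divided by N.
double freqEnergy(const std::vector<std::complex<double>>& bins)
{
    double sum = 0.0;
    for (const auto& b : bins) sum += std::norm(b);   // std::norm is |b|^2
    return sum / static_cast<double>(bins.size());
}
```

The two energies should agree to within floating-point error for any input. A library FFT may apply a different scale factor, so check its documentation before relying on absolute values.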

This means that to get the amplitude of a specific set of bins you can square their magnitudes, add the squares together, divide by the number of bins and then take the square root to get the RMS amplitude value of those bins.
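A rough sketch of that calculation, assuming you already have a vector of bin magnitudes (bandRms is a hypothetical helper name, and this ignores whatever scale factor your FFT applies):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// RMS of the bin magnitudes in the half-open range [firstBin, lastBin).
// Returns 0 for an empty or out-of-range band.
float bandRms(const std::vector<float>& mags, std::size_t firstBin, std::size_t lastBin)
{
    if (lastBin > mags.size()) lastBin = mags.size();
    if (firstBin >= lastBin) return 0.0f;

    float sumSquares = 0.0f;
    for (std::size_t i = firstBin; i < lastBin; ++i)
        sumSquares += mags[i] * mags[i];

    return std::sqrt(sumSquares / static_cast<float>(lastBin - firstBin));
}
```

Using a half-open range means adjacent bands can share a boundary index without counting any bin twice.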

So where does this leave you?

Well from here you need to figure out which bins you are adding together.

A treble tone is defined as above 2000Hz.
A bass tone is below 300Hz (if my memory serves me correctly).
Mids are between 300Hz and 2kHz.

Now suppose your sample rate is 8kHz. The Nyquist limit says that the highest frequency you can represent at 8kHz sampling is 4kHz. Each bin thus represents 4000/128 or 31.25Hz.

So the first 10 bins (bins 0 to 9, up to 312.5Hz) are used for the bass frequencies. Bins 10 to 63 represent the mids. Finally, bins 64 to 127 are the trebles.
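The same arithmetic can be done for any sample rate and FFT size. The following is a sketch only; BandEdges and bandEdges are names invented for this example, and the 300Hz and 2000Hz defaults are just the cut-offs quoted above.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

struct BandEdges {
    std::size_t midStart;     // bass is bins [0, midStart)
    std::size_t trebleStart;  // mids are bins [midStart, trebleStart)
    std::size_t end;          // treble is bins [trebleStart, end), end = fftSize / 2
};

// Work out which bins belong to bass, mids and treble.
// Bin k starts at k * sampleRate / fftSize Hz.
BandEdges bandEdges(float sampleRate, std::size_t fftSize,
                    float bassTopHz = 300.0f, float trebleBottomHz = 2000.0f)
{
    const std::size_t usableBins = fftSize / 2;
    const float binWidth = sampleRate / static_cast<float>(fftSize);

    // First bin whose lower edge is at or above the given frequency,
    // clamped to the number of usable bins.
    auto firstBinAtOrAbove = [&](float hz) {
        const std::size_t bin = static_cast<std::size_t>(std::ceil(hz / binWidth));
        return std::min(bin, usableBins);
    };

    BandEdges e;
    e.midStart = firstBinAtOrAbove(bassTopHz);
    e.trebleStart = firstBinAtOrAbove(trebleBottomHz);
    e.end = usableBins;
    return e;
}
```

With a sample rate of 8000 and an FFT size of 256 this gives midStart 10, trebleStart 64 and end 128, which matches the bin ranges worked out by hand above.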

You can then calculate the RMS value of each band as described above.

RMS values can be converted to dBFS values by computing 20.0f * log10f( rmsVal ). This will return you a value from 0dB (max amplitude, i.e. an RMS of 1.0) down to -infinity dB (min amplitude). Be aware amplitudes do not range from -1 to 1.
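Since the question asks for a level between 0.0 and 1.0, one possible approach is to clamp the dBFS value to a floor and rescale it. This is a sketch under assumptions: the -60dB floor is an arbitrary choice for display purposes, not something the maths dictates, and the function names are made up.

```cpp
#include <algorithm>
#include <cmath>

// Convert an RMS value (1.0 = full scale) to dBFS.
// An RMS of 0 gives -infinity.
float rmsToDbfs(float rms)
{
    return 20.0f * std::log10(rms);
}

// Map a dBFS value onto 0.0 .. 1.0 for display.
// ASSUMPTION: anything at or below floorDb (default -60dB) is shown as 0.
float dbfsToLevel(float db, float floorDb = -60.0f)
{
    if (!(db > floorDb)) return 0.0f;   // also catches -infinity and NaN
    return std::min(1.0f, (db - floorDb) / -floorDb);
}
```

For example, dbfsToLevel( rmsToDbfs( bassRms ) ) would give a meter value for the bass band, where bassRms is whatever RMS you computed for it after scaling it so that 1.0 is full scale.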

To help you along, here is a bit of my C++ based FFT class for iPhone (which uses vDSP under the hood):

MacOSFFT::MacOSFFT( unsigned int fftOrder ) :
    BaseFFT( fftOrder )
{
    mFFTSetup   = (void*)vDSP_create_fftsetup( mFFTOrder, 0 );
    mImagBuffer.resize( 1 << mFFTOrder );
    mRealBufferOut.resize( 1 << mFFTOrder );
    mImagBufferOut.resize( 1 << mFFTOrder );
}

MacOSFFT::~MacOSFFT()
{
    vDSP_destroy_fftsetup( (FFTSetup)mFFTSetup );
}

bool MacOSFFT::ForwardFFT( std::vector< std::complex< float > >& outVec, const std::vector< float >& inVec )
{
    return ForwardFFT( &outVec.front(), &inVec.front(), inVec.size() );
}

bool MacOSFFT::ForwardFFT( std::complex< float >* pOut, const float* pIn, unsigned int num )
{
    // Bring in a pre-allocated imaginary buffer that is initialised to 0.
    DSPSplitComplex dspscIn;
    dspscIn.realp = (float*)pIn;
    dspscIn.imagp = &mImagBuffer.front();

    DSPSplitComplex dspscOut;
    dspscOut.realp  = &mRealBufferOut.front();
    dspscOut.imagp  = &mImagBufferOut.front();

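    vDSP_fft_zop( (FFTSetup)mFFTSetup, &dspscIn, 1, &dspscOut, 1, mFFTOrder, kFFTDirection_Forward );

    // Interleave the split-complex result into pOut. The stride on the
    // interleaved side is counted in floats, so it must be 2.
    vDSP_ztoc( &dspscOut, 1, (DSPComplex*)pOut, 2, num );

    return true;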
}

Coordinate answered 24/4, 2013 at 15:55 Comment(0)

It seems that you're looking for Fast Fourier Transform sample code.

It is quite a large topic to cover in an answer.

The tools you will need are already built into iOS: the vDSP API

This should help you: vDSP Programming Guide

And there is also a FFT Sample Code available

You might also want to check out iPhoneFFT. Though that code is slightly outdated, it can help you understand the processes "under the hood".

Incense answered 23/4, 2013 at 13:51 Comment(0)

Refer to the aurioTouch2 example from Apple; it has everything from frequency analysis to a UI representation of what you want.

Fleer answered 25/4, 2013 at 15:58 Comment(0)
