How to get BPM and tempo audio features in Python [closed]
L

6

24

I am involved in a project which requires me to extract song features like beats per minute (BPM), tempo, etc. However, I have not found a suitable Python library that can accurately detect these features.

Does anyone have any advice?

(In MATLAB, I know of a project called MIRtoolbox, which can report the BPM and tempo after processing a local MP3 file.)

Lumpfish answered 26/12, 2011 at 10:54 Comment(1)
What is the encoding format? I have never heard of a Python sound library... Then again, I'm far from omnipotent and all-knowing. Go crank-start your Google machine and feed it "python sound library". - Nichollenicholls
F
7

The Echo Nest API is what you are looking for:

http://echonest.github.io/remix/

The Python bindings are rich, though installing Echo Nest can be a pain, as the team does not seem to be able to build solid installers.

However, it does not do local processing. Instead, it calculates an audio fingerprint and uploads the song to the Echo Nest servers, which extract the information using algorithms they don't expose.

Flowerer answered 26/12, 2011 at 11:12 Comment(6)
Is there a project that does local processing and can extract BPM features from just a local mp3/wav file? - Lumpfish
I did some research on this about a year ago, and Echo Nest was the easiest solution for Python. I am not sure whether other libraries are available now; please post them here as an answer if you find them. - Flowerer
I have the same finding as you. There are no usable libraries that can extract music features. - Lumpfish
Or is there any other Echo Nest-like library, even one that only includes a few feature extraction functions? - Lumpfish
EchoNest won't issue API keys any more... developer.echonest.com/account/register - Crown
Their registration page doesn't even load anymore. - Proceeds
I
18

This answer comes a year later, but anyway, for the record: I found three audio libraries with Python bindings that extract features from audio. They are not that easy to install, since they are actually written in C and you need to compile the Python bindings properly and add them to your import path, but here they are:

Ichneumon answered 28/1, 2013 at 23:43 Comment(1)
Now I would recommend using Essentia (essentia.upf.edu). It is a great library that I contributed to some time ago. - Ichneumon
R
2

I found this code by @scaperot, which could help you:

import argparse
import array
import wave

import matplotlib.pyplot as plt
import numpy
import pywt
from scipy import signal


def read_wav(filename):
    # Open the file and read the audio metadata.
    try:
        wf = wave.open(filename, 'rb')
    except IOError as e:
        print(e)
        return None, None

    # typ = choose_type( wf.getsampwidth() ) #TODO: implement choose_type
    nsamps = wf.getnframes()
    assert nsamps > 0

    fs = wf.getframerate()
    assert fs > 0

    # Read the entire file into an array.
    # NOTE: type code 'i' reads the raw bytes as (typically) 32-bit integers
    # regardless of the file's actual sample width (see the TODO above).
    samps = list(array.array('i', wf.readframes(nsamps)))
    if nsamps != len(samps):
        print(nsamps, "not equal to", len(samps))

    return samps, fs


# Print an error when no data can be found.
def no_audio_data():
    print("No audio data for sample, skipping...")
    return None, None


# Simple peak detection.
def peak_detect(data):
    max_val = numpy.amax(abs(data))
    peak_ndx = numpy.where(data == max_val)
    if len(peak_ndx[0]) == 0:  # if nothing found then the max must be negative
        peak_ndx = numpy.where(data == -max_val)
    return peak_ndx


def bpm_detector(data, fs):
    cA = []
    cD = []
    correl = []
    cD_sum = []
    levels = 4
    max_decimation = 2 ** (levels - 1)
    # Autocorrelation lag range (in decimated samples) for 220 down to 40 BPM.
    min_ndx = int(60. / 220 * (fs / max_decimation))
    max_ndx = int(60. / 40 * (fs / max_decimation))

    for loop in range(0, levels):
        cD = []
        # 1) DWT
        if loop == 0:
            [cA, cD] = pywt.dwt(data, 'db4')
            cD_minlen = len(cD) // max_decimation + 1
            cD_sum = numpy.zeros(cD_minlen)
        else:
            [cA, cD] = pywt.dwt(cA, 'db4')
        # 2) Filter
        # NOTE: [1 - 0.99] evaluates to [0.01], so this call passes cD through
        # essentially unchanged; a one-pole low-pass would use [1, -0.99].
        cD = signal.lfilter([0.01], [1 - 0.99], cD)

        # 5) Decimate for reconstruction later.
        cD = abs(cD[::(2 ** (levels - loop - 1))])
        # 4) Subtract out the mean.
        cD = cD - numpy.mean(cD)
        # 6) Recombine the signal before ACF:
        #    each level's detail coefs (i.e. the HPF values)
        #    are added to the running sum.
        cD_sum = cD[0:cD_minlen] + cD_sum

    if not numpy.any(cA):
        return no_audio_data()
    # Adding in the approximate data as well...
    cA = signal.lfilter([0.01], [1 - 0.99], cA)
    cA = abs(cA)
    cA = cA - numpy.mean(cA)
    cD_sum = cA[0:cD_minlen] + cD_sum

    # ACF
    correl = numpy.correlate(cD_sum, cD_sum, 'full')

    midpoint = len(correl) // 2
    correl_midpoint_tmp = correl[midpoint:]
    peak_ndx = peak_detect(correl_midpoint_tmp[min_ndx:max_ndx])
    if len(peak_ndx[0]) > 1:  # more than one equal peak: ambiguous
        return no_audio_data()

    peak_ndx_adjusted = peak_ndx[0][0] + min_ndx
    bpm = 60. / peak_ndx_adjusted * (fs / max_decimation)
    print(bpm)
    return bpm, correl


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Process .wav file to determine the Beats Per Minute.')
    parser.add_argument('--filename', required=True,
                        help='.wav file for processing')
    parser.add_argument('--window', type=float, default=3,
                        help='size of the window (seconds) that will be scanned to determine the bpm.  Typically less than 10 seconds. [3]')

    args = parser.parse_args()
    samps, fs = read_wav(args.filename)
    if samps is None:
        raise SystemExit(1)

    correl = []
    nsamps = len(samps)
    window_samps = int(args.window * fs)
    max_window_ndx = nsamps // window_samps
    bpms = numpy.zeros(max_window_ndx)

    # Iterate through all windows.
    for window_ndx in range(0, max_window_ndx):
        # Get a new set of samples. Deriving the offset from window_ndx means
        # a skipped window still advances to the next one.
        samps_ndx = window_ndx * window_samps
        data = samps[samps_ndx:samps_ndx + window_samps]
        if not ((len(data) % window_samps) == 0):
            raise AssertionError(str(len(data)))

        bpm, correl_temp = bpm_detector(data, fs)
        if bpm is None:
            continue
        bpms[window_ndx] = bpm
        correl = correl_temp

    bpm = numpy.median(bpms)
    print('Completed.  Estimated Beats Per Minute:', bpm)

    # Show the autocorrelation of the last analysed window for 10 seconds.
    n = range(0, len(correl))
    plt.plot(n, numpy.abs(correl))
    plt.show(block=False)  # plot non-blocking
    plt.pause(10)
    plt.close()
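
The heart of the script above is its last step: autocorrelate an envelope-like signal, find the strongest lag inside the 40 to 220 BPM range, and convert that lag to beats per minute. Here is a minimal standard-library sketch of just that step, run on a synthetic impulse train rather than real audio. The function name estimate_bpm, the 200 Hz rate and the test signal are illustrative assumptions, not part of @scaperot's code.

```python
def estimate_bpm(envelope, rate, min_bpm=40.0, max_bpm=220.0):
    """Estimate tempo from the strongest autocorrelation lag (illustrative helper)."""
    # A tempo of B beats per minute corresponds to a lag of 60 / B * rate samples.
    min_lag = int(60.0 / max_bpm * rate)
    max_lag = int(60.0 / min_bpm * rate)

    def autocorr(lag):
        # Unnormalised autocorrelation of the envelope at the given lag.
        return sum(envelope[i] * envelope[i + lag]
                   for i in range(len(envelope) - lag))

    best_lag = max(range(min_lag, max_lag), key=autocorr)
    return 60.0 * rate / best_lag


# Synthetic check: one impulse every 0.5 s (120 BPM) in 8 s of signal at 200 Hz.
rate = 200
envelope = [0.0] * (8 * rate)
for start in range(0, len(envelope), rate // 2):
    envelope[start] = 1.0

bpm = estimate_bpm(envelope, rate)
print(bpm)
```

Real recordings need an onset or energy envelope first (the script derives one from wavelet detail coefficients), and the strongest lag can land on half or double the true tempo.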
Rusch answered 21/3, 2017 at 17:56 Comment(1)
This is pretty cool, but I'd be curious how well it works, given that the simple BPM detector is indeed simple. Did you ever try something like this as an alternative? - Aikoail
C
1

Librosa has the librosa.beat.beat_track() function, which takes an initial tempo guess through the start_bpm parameter (it defaults to 120 BPM if you don't supply one). Not sure how accurate it is, but perhaps worth a shot.

Crown answered 27/5, 2016 at 18:16 Comment(0)
P
1

librosa is the package you are looking for. It contains an extensive range of functions for audio analysis. The librosa.beat.beat_track() and librosa.beat.tempo() functions will extract the required features for you.

Spectral features such as chroma, MFCC and zero-crossing rate, and rhythm features such as the tempogram, can also be obtained using the functions available in librosa.

Prehension answered 22/10, 2018 at 8:50 Comment(0)
O
-1

Well, I recently came across Vampy, which is a wrapper plugin that enables you to use Vamp plugins written in Python in any Vamp host. Vamp is an audio processing plugin system for plugins that extract descriptive information from audio data. Hope it helps.

Orestes answered 9/5, 2014 at 17:42 Comment(1)
The Vamp website is rather unclear on how to install Vampy. They suggest using a tool such as SonicAnnotator, but the website appears to be down... omras2.org/SonicAnnotator It would be far more useful if Vampy were a Python package that could be easily installed with pip/conda, or cloned via git, with a simple way of using it as a command-line tool. - Crown
