Audio spectrum extraction from an audio file in Python

Asked 24/6, 2014 at 9:18 Answered 29/8, 2018 at 0:45

Sorry if this is a duplicate, but I wonder whether there is a Python library that can extract the sound spectrum from audio files. I want to take an audio file and write an algorithm that returns a set of data {TimeStampInFile; Frequency-Amplitude}.

I heard that this is usually called Beat Detection, but as far as I can see beat detection is not a precise method; it is good only for visualisation, while I want to manipulate the extracted data and then convert it back to an audio file. I don't need to do this in real time.

I will appreciate any suggestions and recommendations.

Jene answered 24/6, 2014 at 9:18 Comment(0)

You can compute and visualize both the spectrum and the spectrogram using scipy. For this test I used the audio file vignesh.wav:

from scipy.io import wavfile # scipy library to read wav files
import numpy as np

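AudioName = "vignesh.wav" # Audio file
fs, Audiodata = wavfile.read(AudioName)
if Audiodata.ndim > 1: # stereo or multichannel file: keep only the first channel
    Audiodata = Audiodata[:, 0]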

# Plot the audio signal in time
import matplotlib.pyplot as plt
plt.plot(Audiodata)
plt.title('Audio signal in time',size=16)

# spectrum
from scipy.fftpack import fft # fourier transform
n = len(Audiodata) 
AudioFreq = fft(Audiodata)
AudioFreq = AudioFreq[0:int(np.ceil((n+1)/2.0))] #Half of the spectrum
MagFreq = np.abs(AudioFreq) # Magnitude
MagFreq = MagFreq / float(n)
# power spectrum
MagFreq = MagFreq**2
if n % 2 > 0: # odd number of FFT points: no Nyquist bin, double everything except DC
    MagFreq[1:] = MagFreq[1:] * 2
else: # even number of FFT points: double everything except DC and the Nyquist bin
    MagFreq[1:-1] = MagFreq[1:-1] * 2

plt.figure()
freqAxis = np.arange(0, int(np.ceil((n+1)/2.0)), 1.0) * (float(fs) / n)
plt.plot(freqAxis/1000.0, 10*np.log10(MagFreq + np.finfo(float).eps)) # Power spectrum; eps avoids log10(0)
plt.xlabel('Frequency (kHz)'); plt.ylabel('Power spectrum (dB)')


# Spectrogram
from scipy import signal
N = 512 # Number of points in the FFT
f, t, Sxx = signal.spectrogram(Audiodata, fs, window=signal.windows.blackman(N), nfft=N)
plt.figure()
plt.pcolormesh(t, f, 10*np.log10(Sxx + np.finfo(float).eps)) # dB spectrogram; eps avoids log10(0)
#plt.pcolormesh(t, f, Sxx) # Linear spectrogram
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [s]')
plt.title('Spectrogram with scipy.signal', size=16)

plt.show()

I tested all the code and it works. You need numpy, matplotlib, and scipy.

cheers

Martie answered 29/8, 2018 at 0:45 Comment(3)

Thanks for the script! Just add "import numpy as np" at the top to make it work. – Dzerzhinsk 16/9, 2019 at 7:18

I get these errors: `RuntimeWarning: divide by zero encountered in log10` at `plt.plot(freqAxis/1000.0, 10*np.log10(MagFreq)) #Power spectrum`, followed by a traceback ending in `packages/scipy/signal/_spectral_py.py", line 1971, in _triage_segments raise ValueError('window is longer than input signal')` and `ValueError: window is longer than input signal`. – Brigid 16/3, 2022 at 11:58

I ran into the same problem. It seems to be related to reading files that store multiple sound channels (like 2 channels stereo sound). Depending on how you interpret the different channels there are different ways to handle this. The simplest is just to take the first channel and throw away the rest (i.e. put Audiodata = Audiodata[:, 0] right after the data is read in.) – Clothe 20/4 at 18:40
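The channel handling described in the comment above can be sketched as follows. The helper name `to_mono` and its `how` argument are made up for illustration; it assumes the array has the shape (n_samples, n_channels), which is what scipy.io.wavfile.read returns for multichannel files.

```python
import numpy as np

def to_mono(samples, how="first"):
    """Reduce (n_samples, n_channels) audio to a 1-D array (hypothetical helper)."""
    samples = np.asarray(samples)
    if samples.ndim == 1:
        return samples  # already mono, nothing to do
    if how == "first":
        return samples[:, 0]  # keep the first channel, drop the rest
    if how == "mean":
        # average the channels; cast first so integer samples do not overflow
        return samples.astype(np.float64).mean(axis=1)
    raise ValueError("how must be 'first' or 'mean'")

# Tiny made-up stereo signal: 3 samples, 2 channels
stereo = np.array([[1, 3], [2, 4], [5, 7]], dtype=np.int16)
first = to_mono(stereo)
mean = to_mono(stereo, how="mean")
```

Taking the first channel keeps the original integer type; averaging returns floats, which may or may not be what the later processing expects.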

I think your question has three separate parts:

How to load audio files into python?
How to calculate spectrum in python?
What to do with the spectrum?

1. How to load audio files in python?

You are probably best off using scipy, as it provides a lot of signal processing functions. For loading audio files:

import scipy.io.wavfile

samplerate, data = scipy.io.wavfile.read("mywav.wav")

Now you have the sample rate (samples/s) in samplerate and data as a numpy.array in data. You may want to transform the data into floating point, depending on your application.
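As a rough sketch of the floating-point conversion mentioned above: the helper `to_float` is hypothetical, and scaling to roughly [-1, 1] is one common convention rather than the only option. It assumes the integer types that WAV files typically yield (unsigned 8-bit, signed 16-bit or 32-bit).

```python
import numpy as np

def to_float(data):
    """Scale PCM samples to floats in roughly [-1, 1] (hypothetical helper)."""
    data = np.asarray(data)
    if np.issubdtype(data.dtype, np.floating):
        return data.astype(np.float64)  # already floating point
    if data.dtype == np.uint8:
        # 8-bit WAV data is unsigned and centered at 128
        return (data.astype(np.float64) - 128.0) / 128.0
    # signed integers: divide by the magnitude of the most negative value
    info = np.iinfo(data.dtype)
    return data.astype(np.float64) / -float(info.min)

pcm16 = np.array([0, 16384, -32768], dtype=np.int16)
as_float = to_float(pcm16)
```

Working in floats avoids integer overflow when samples are later squared or summed.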

There is also a standard python module wave for loading wav-files, but numpy/scipy offers a simpler interface and more options for signal processing.
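For comparison, here is a minimal sketch of reading a file with the standard wave module. The function name `read_wav_16bit` is made up, and the sketch assumes 16-bit PCM only; the demo writes a short test tone to a temporary file so that it does not depend on any existing audio file.

```python
import os
import tempfile
import wave

import numpy as np

def read_wav_16bit(path):
    """Read a 16-bit PCM WAV file with the stdlib wave module (hypothetical helper)."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        samplerate = wf.getframerate()
        channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())
    data = np.frombuffer(raw, dtype="<i2")  # little-endian signed 16-bit
    if channels > 1:
        data = data.reshape(-1, channels)  # one column per channel
    return samplerate, data

# Demo: write one second of a 440 Hz tone, then read it back
path = os.path.join(tempfile.mkdtemp(), "tone.wav")
tone = (0.5 * 32767 * np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)).astype("<i2")
with wave.open(path, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(8000)
    wf.writeframes(tone.tobytes())

samplerate, data = read_wav_16bit(path)
```

The extra bookkeeping (sample width, channel count, byte order) is what scipy.io.wavfile.read does for you.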

2. How to calculate the spectrum

Brief answer: Use FFT. For more words of wisdom, see:

Analyze audio using Fast Fourier Transform

The longer answer is quite long. Windowing is very important: without it, the abrupt edges of each analysis frame leak energy into neighbouring frequency bins and you'll have strange spectra.
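To make the windowing point concrete, here is a small sketch using only numpy. The function `windowed_spectrum`, the choice of a Hann window, and the normalization by the window sum are illustrative assumptions, not the only reasonable choices.

```python
import numpy as np

def windowed_spectrum(frame, samplerate):
    """Magnitude spectrum of one frame after a Hann window (illustrative sketch)."""
    frame = np.asarray(frame, dtype=np.float64)
    window = np.hanning(len(frame))
    spectrum = np.fft.rfft(frame * window)  # one-sided FFT of the real signal
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / samplerate)
    # Dividing by the window sum makes a sine of amplitude A peak near A
    # (the factor 2 is not strictly right for the DC and Nyquist bins)
    magnitude = 2.0 * np.abs(spectrum) / window.sum()
    return freqs, magnitude

# Synthetic test frame: a 1 kHz sine with amplitude 0.5, sampled at 8 kHz
samplerate = 8000
t = np.arange(1024) / samplerate
frame = 0.5 * np.sin(2 * np.pi * 1000 * t)
freqs, magnitude = windowed_spectrum(frame, samplerate)
peak_hz = freqs[np.argmax(magnitude)]
```

For a whole file you would slide such a frame along the signal, which is essentially what a spectrogram does.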

3. What to do with the spectrum

This is a bit more difficult. Filtering is often performed in the time domain for longer signals. Maybe if you tell us what you want to accomplish, you'll receive a good answer for this one. Calculating the frequency spectrum is one thing; getting meaningful results with it in signal processing is a bit more complicated.
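As one example of time-domain filtering, the sketch below applies a Butterworth low-pass filter with scipy.signal. The function name `lowpass`, the filter order, and the cutoff are arbitrary choices for illustration, and the input is a synthetic two-tone signal rather than a real recording.

```python
import numpy as np
from scipy import signal

def lowpass(data, samplerate, cutoff_hz, order=4):
    """Zero-phase Butterworth low-pass filter (illustrative sketch)."""
    sos = signal.butter(order, cutoff_hz, btype="low", fs=samplerate, output="sos")
    # sosfiltfilt runs the filter forward and backward, so there is no phase shift
    return signal.sosfiltfilt(sos, data)

# Synthetic input: a 200 Hz tone plus a weaker 3 kHz tone, one second at 8 kHz
samplerate = 8000
t = np.arange(samplerate) / samplerate
low = np.sin(2 * np.pi * 200 * t)
high = 0.5 * np.sin(2 * np.pi * 3000 * t)
filtered = lowpass(low + high, samplerate, cutoff_hz=1000)
```

The output has the same length as the input, so it can be written back to an audio file directly; away from the edges it should be close to the 200 Hz tone alone.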

(I know you did not ask this one, but I see it coming with a probability >> 0. Of course, it may be that you have good knowledge on audio signal processing, in which case this is irrelevant.)

Analyst answered 24/6, 2014 at 16:13 Comment(2)

Thanks, that was really helpful. I plan to write software using scikit-learn or PyBrain that will analyze audio files and try to determine which music genre they belong to. – Jene 27/6, 2014 at 18:36

Late to the party, but given your goal (write a library to classify music genres) you could take a look at this github.com/tyiannak/pyAudioAnalysis – Serinaserine 29/6, 2016 at 20:42
