I'm working on a school project where I have to work with large WAV files (> 250 MB). When I open such a file in Audacity, it takes about 40 seconds to be read and plotted, but when I read it in Python using scipy.io.wavfile.read, it just takes forever.
So my question is: how does Audacity make it that fast, and is there something I can do in Python to get similar speed?
EDIT: I added a new section to my code which computes and plots the envelope of a WAV file, but when I try it on a large WAV file, it takes far too long to finish.
Is there any way to read and process large wav files faster?
This is the code I'm using:
import matplotlib.pyplot as plt
import numpy as np
from scipy.io.wavfile import read
from tkinter import filedialog
# Browse for a file, read the signal and extract signal information (fs, duration)
filename = filedialog.askopenfilename(filetypes=(("Template files", "*.wav"), ("All files", "*")))
fs, data = read(filename, mmap=True)
nsamples = len(data)  # number of samples (must be an int for np.linspace)
T = nsamples / fs  # duration in seconds
time = np.linspace(0, T, nsamples)
# Compute the envelope of the signal
from scipy.signal import hilbert
# axis=0 is the sample axis, so this also works if the file has several channels
analytic_signal = hilbert(data, axis=0)
amplitude_envelope = np.abs(analytic_signal)
instantaneous_phase = np.unwrap(np.angle(analytic_signal))
instantaneous_frequency = (np.diff(instantaneous_phase) /(2.0*np.pi) * fs)
len_E = len(amplitude_envelope)
t2 = np.linspace(0,T,len_E)
# Plot the signal and its envelope
plt.figure()
plt.subplot(211)
plt.plot(time, data)
plt.subplot(212)
plt.plot(t2,amplitude_envelope)
plt.show()
librosa.read() to be much faster than scipy, but I just ran a test and it was slower. – Goren
wav format. But if the format is usable that way, I would read the whole file into RAM, then split it and hand the parts over to multiple processes (I would use multiprocessing.Pool for that). Then it would be computed on multiple cores. – Indent
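To make the multiprocessing.Pool suggestion from the comment above a bit more concrete, here is a minimal sketch of how the envelope could be computed chunk by chunk. It is not tested on a 250 MB file and rests on a few assumptions: the signal is mono (1-D), and the helper names `chunk_envelope`, `split_into_chunks` and `envelope_parallel` are made up for this illustration. Because each chunk gets its own Hilbert transform, the envelope near chunk boundaries will differ from what a single whole-signal transform would give.

```python
from multiprocessing import Pool

import numpy as np
from scipy.signal import hilbert


def chunk_envelope(chunk):
    """Amplitude envelope of one chunk (this runs inside a worker process)."""
    # Convert to float so integer PCM samples are handled as real numbers.
    return np.abs(hilbert(np.asarray(chunk, dtype=np.float64)))


def split_into_chunks(data, chunk_len):
    """Split a 1-D signal into consecutive pieces of at most chunk_len samples."""
    return [data[i:i + chunk_len] for i in range(0, len(data), chunk_len)]


def envelope_parallel(data, chunk_len, processes=None):
    """Envelope computed chunk by chunk, optionally on several cores.

    Assumption: `data` is mono (1-D). Chunks are transformed independently,
    so values close to chunk boundaries are only approximate.
    """
    chunks = split_into_chunks(data, chunk_len)
    if processes == 1:
        # Serial fallback without worker processes, handy for debugging.
        parts = [chunk_envelope(c) for c in chunks]
    else:
        # processes=None lets Pool use one worker per available CPU.
        with Pool(processes) as pool:
            parts = pool.map(chunk_envelope, chunks)
    return np.concatenate(parts)
```

In a script this would be called from under an `if __name__ == "__main__":` guard, for example `amplitude_envelope = envelope_parallel(data, chunk_len=10 * fs, processes=4)`, where the ten-second chunk length is just a guess to tune. Whether it actually ends up faster depends on the cost of sending the chunks to the workers, so it is worth timing against the single-process version.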