Fastest way to read/process large WAV files (or any large file) in Python

I'm working on a school project where I have to work with large WAV files (> 250 MB). When I open such a file in Audacity, it takes about 40 seconds to be read and plotted, but when I read it in Python using scipy.io.wavfile.read, it takes forever.

So my question is: how does Audacity make it that fast, and is there something I can do in Python to get similar speed?

EDIT: I added a new section to my code that computes and plots the envelope of a WAV file, but when I try it on a large WAV file, it takes forever.

Is there any way to read and process large wav files faster?

This is the code I'm using:

import matplotlib.pyplot as plt
import numpy as np
from scipy.io.wavfile import read
from scipy.signal import hilbert
from tkinter import filedialog

# Browse for a file, read the signal, and extract its properties (fs, duration)
filename = filedialog.askopenfilename(
    filetypes=(("Template files", "*.wav"), ("All files", "*")))

fs, data = read(filename, mmap=True)

nsamples = len(data)              # number of samples per channel (an integer)
T = nsamples / fs                 # duration in seconds
time = np.arange(nsamples) / fs   # time axis in seconds

# Compute the envelope of the signal.
# axis=0 runs the transform along time, so stereo data of shape (nsamples, 2) also works.
analytic_signal = hilbert(data, axis=0)
amplitude_envelope = np.abs(analytic_signal)
instantaneous_phase = np.unwrap(np.angle(analytic_signal), axis=0)
instantaneous_frequency = np.diff(instantaneous_phase, axis=0) / (2.0 * np.pi) * fs

# Plot the signal and its envelope
plt.figure()
plt.subplot(211)
plt.plot(time, data)

plt.subplot(212)
plt.plot(time, amplitude_envelope)
plt.show()
Chimerical answered 15/3, 2019 at 6:28 Comment(4)
Pass 1: Read the headers, skip the data chunks. Pass 2: Read the necessary data chunks. Pass 3: Read the remaining data chunks. See also soundfile.sapp.org/doc/WaveFormat. Show your code for more specific comments. - Breaking
Thank you for the information. I added the code, but I have no idea how to implement what you described in my code. Any help would be much appreciated. - Wentletrap
You can try using one of the other libraries that read WAV files. Personally, I had found librosa.load() to be much faster than scipy, but I just ran a test and it was slower. - Goren
I know nothing about the WAV format, but if the format allows it, I would read the whole file into RAM, then split it and hand the parts over to multiple processes (I would use multiprocessing.Pool for that). Then it would be computed on multiple cores. - Indent
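
The "read the headers first, then the data chunks" idea from the comments can be sketched with the standard library's wave module. This is only an illustration under some assumptions: the file is uncompressed 16-bit PCM, only the first channel is used, and the function name peak_envelope and its parameters block_size and blocks_per_read are made up for this example. Note that it computes a per-block min/max (peak) envelope, which is a much cheaper stand-in for plotting and not the Hilbert envelope used in the question.

```python
import wave

import numpy as np


def peak_envelope(path, block_size=4096, blocks_per_read=256):
    """Per-block min/max of channel 0 of a 16-bit PCM WAV file.

    Hypothetical helper: the name and parameters are illustrative only.
    Returns (fs, times, lo, hi), where times marks the start of each block.
    """
    lows, highs = [], []
    with wave.open(path, "rb") as wf:
        # Opening the file parses only the header; no samples are loaded yet.
        fs = wf.getframerate()
        n_channels = wf.getnchannels()
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")

        frames_per_read = block_size * blocks_per_read
        while True:
            raw = wf.readframes(frames_per_read)
            if not raw:
                break
            # Interleaved little-endian int16 -> (frames, channels), keep channel 0.
            samples = np.frombuffer(raw, dtype="<i2").reshape(-1, n_channels)[:, 0]

            # Reduce all complete blocks in this read with one vectorized call.
            n_full = len(samples) // block_size
            if n_full:
                blocks = samples[: n_full * block_size].reshape(n_full, block_size)
                lows.append(blocks.min(axis=1))
                highs.append(blocks.max(axis=1))

            # A shorter block can remain at the end of the file.
            tail = samples[n_full * block_size:]
            if tail.size:
                lows.append(np.array([tail.min()], dtype=samples.dtype))
                highs.append(np.array([tail.max()], dtype=samples.dtype))

    lo = np.concatenate(lows) if lows else np.array([], dtype="<i2")
    hi = np.concatenate(highs) if highs else np.array([], dtype="<i2")
    times = np.arange(len(lo)) * block_size / fs
    return fs, times, lo, hi
```

With the result, something like plt.fill_between(times, lo, hi) draws an overview from a few thousand points instead of tens of millions, and memory use stays bounded by the size of one read rather than the size of the file.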
