I have over a thousand audio files, and I want to check whether their sample rate is 16 kHz. Doing it manually would take me forever. Is there a way to check the sample rate using Python?
Python has a built-in module, wave, for dealing with WAV files.
You can write a simple script that iterates over all files in a directory, something along these lines:
import os
import wave

for file_name in os.listdir(FOLDER_PATH):
    # os.listdir() returns bare names, so join them with the folder path
    with wave.open(os.path.join(FOLDER_PATH, file_name), "rb") as wave_file:
        frame_rate = wave_file.getframerate()
        # .... DO WHATEVER ....
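To answer the original question directly, the loop above could be turned into a small checker that reports every file whose rate is not 16 kHz. This is only a sketch: the function name `find_wrong_sample_rates`, the `.wav` extension filter, and the choice to record unreadable files as `None` are assumptions for illustration, not part of the original answer.

```python
import os
import wave

TARGET_RATE = 16000  # 16 kHz, the rate asked about in the question


def find_wrong_sample_rates(folder_path, target_rate=TARGET_RATE):
    """Return {file_name: sample_rate} for .wav files whose rate differs from target_rate.

    Files that the wave module cannot read are reported with a rate of None.
    """
    mismatches = {}
    for file_name in sorted(os.listdir(folder_path)):
        if not file_name.lower().endswith(".wav"):
            continue  # skip anything that is not named like a WAV file
        full_path = os.path.join(folder_path, file_name)
        try:
            with wave.open(full_path, "rb") as wave_file:
                frame_rate = wave_file.getframerate()
        except (wave.Error, EOFError):
            # wave only understands uncompressed PCM WAV data, so flag the file
            mismatches[file_name] = None
            continue
        if frame_rate != target_rate:
            mismatches[file_name] = frame_rate
    return mismatches
```

Calling `find_wrong_sample_rates("path/to/folder")` returns an empty dict when every readable WAV file is at 16 kHz, so the result can be printed or written to a log in one pass over the folder.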
In older Python versions, wave.open() doesn't return a context manager; there, the call can be wrapped in contextlib.closing(). – Inquisitive
For .wav files the solution might be:
from scipy.io.wavfile import read as read_wav
import os
os.chdir('path') # change to the file directory
sampling_rate, data = read_wav("filename.wav") # enter your filename
print(sampling_rate)
Solution without importing external libraries
Most probably you already have 'ffmpeg' and 'ffprobe' installed on your system (these are the core tools that many Python audio libraries rely on). You can then read any information about the audio directly from ffprobe's output. This might be easier than installing additional APIs or libraries, many of which call ffprobe in the background anyway.
- ffprobe can export its results directly in JSON format
- you can specify which audio parameters ffprobe should output
My example gathers only 'sample_rate' from an audio file:
import json
import subprocess
# specify parameters for ffprobe
file_path = "your_audio_file.mp4"
out_data = "stream=sample_rate:format=0:stream_tags=0:format_tags=0"
# pass the arguments as a list so that paths containing spaces are not split
command = ["ffprobe", "-v", "quiet", "-print_format", "json", "-show_format",
           "-select_streams", "a:0", "-show_entries", out_data, file_path]
# run ffprobe as a subprocess
process = subprocess.Popen(command, stdout=subprocess.PIPE)
output, error = process.communicate()
# gather output from the json
metadata = json.loads(output)
audio_stream = metadata["streams"][0]
sample_rate = audio_stream.get("sample_rate", None)
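For checking many files, the ffprobe call above could be wrapped in a function, with the JSON parsing split out so it can be tested without ffprobe installed. This is a rough sketch: the helper names `parse_sample_rate` and `probe_sample_rate` are made up for illustration, the command is trimmed to request only `stream=sample_rate`, and returning `None` on any failure is an assumed convention.

```python
import json
import shutil
import subprocess


def parse_sample_rate(ffprobe_json):
    """Extract the first listed stream's sample rate (as an int) from ffprobe JSON output."""
    metadata = json.loads(ffprobe_json)
    streams = metadata.get("streams", [])
    if not streams or "sample_rate" not in streams[0]:
        return None  # no audio stream was reported
    return int(streams[0]["sample_rate"])  # ffprobe reports the rate as a string


def probe_sample_rate(file_path):
    """Run ffprobe on one file; return None if ffprobe is missing or fails."""
    if shutil.which("ffprobe") is None:
        return None  # ffprobe is not on the PATH
    command = ["ffprobe", "-v", "quiet", "-print_format", "json",
               "-select_streams", "a:0", "-show_entries", "stream=sample_rate",
               file_path]
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0 or not result.stdout.strip():
        return None
    return parse_sample_rate(result.stdout)
```

A loop such as `[p for p in paths if probe_sample_rate(p) != 16000]` would then list the files that are not at 16 kHz (including any that could not be probed).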
I ended up getting an unknown file format error (wave-error) with Python's wave package.
Alternatively, the sox wrapper for Python (pysox) works for me.
!pip install sox
import sox
sox.file_info.sample_rate("file1.wav")
Hope it helps
!pip install pydub
from pydub.utils import mediainfo
info = mediainfo("abc.wav")
print(info)
I use the code given below whenever I want to find the sample rate.
import torchaudio
metadata = torchaudio.info('path/to/audio/file.extension')
print(metadata)
The output will look something like this:
AudioMetaData(sample_rate=8000, num_frames=625920, num_channels=1, bits_per_sample=16, encoding=PCM_S)
This is a solution with pydub:
from pydub.utils import mediainfo
sr = int(mediainfo("file_path")['sample_rate'])
And here is another way with pydub:
from pydub import AudioSegment
audio = AudioSegment.from_file(audio_file)
sr = audio.frame_rate
By the way, pydub here just runs ffprobe or avprobe as a command-line tool via Popen.