How to extract the raw data from an MP3 file using Python?
M

4

8

I have homework on audio data analysis using Python. Is there a good module I can use to extract the raw data from an MP3 file? I mean the raw audio data, not the metadata or ID3 tags.

I know how to use the wave module to process .wav files: I can call readframes to get the raw data. But I don't know how to do the same with MP3. I have searched a lot on Google and Stack Overflow and found eyeD3. Unfortunately its documentation is rather frustrating, and the current version is 0.7.1, which differs from most examples I can find on the Internet.

Is there a good module that can extract raw data from an MP3 file? A pointer to good documentation for eyeD3 would also help.

Middle answered 19/5, 2013 at 11:22 Comment(3)
Check this: #3050072. Apparently, the easiest would be to convert mp3 to wav using an external program. - Nitwit
The phrase "raw data" is very confusing. If you say raw data, I think you want to get the bytes of the file (which you get with open('your.mp3', 'rb')). But I think you don't want this kind of raw data. - Winser
I do want that kind of raw data, the bytes of the file. But not all bytes of the file are the contents of the music. There are also some tags and maybe some other things. So I wonder if there is any module that can extract just the music. @IchUndNichtDu - Middle
A
22

If I understand your question, you can try using pydub (a library I wrote) to get the audio data like so:

from pydub import AudioSegment

sound = AudioSegment.from_mp3("test.mp3")

# sound.raw_data is a bytestring of the decoded PCM samples
# (it is the public accessor for sound._data)
raw_data = sound.raw_data
Ardys answered 16/9, 2013 at 14:24 Comment(5)
I'm getting the following error message: File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/decoder.py", line 355, in raw_decode raise JSONDecodeError("Expecting value", s, err.value) from None json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0) - Stockroom
I solved the problem and described the solution over here: github.com/jiaaro/pydub/issues/450 - Stockroom
Does it not support mp4? - Lilla
@StephenBoesch ffmpeg definitely supports mp4, so it should, but you may have to specify which codec it should use. - Ardys
How do I convert this into float16? Divide by 255 or by float16.max? Thanks. - Buiron
I
5

There are a few similar questions floating around Stack Overflow. They cover two distinct use cases.

  1. The user wants to convert .mp3 files to PCM files such as .wav files.

  2. The user wants to access the raw data in the .mp3 file (that is, not treat it as compressed PCM). Here the use case is one of understanding how compression schemes like MP3 and AAC work.

This answer is aimed at the second of these, though I do not have working code to share or point to.

Compression schemes such as MP3 generally work in the frequency domain. As a simplified example, you could take a .wav file 1024 samples at a time, transform each block of 1024 samples using an FFT, and store that. Roughly speaking, the lossy compression then throws away information from the frequency domain so as to allow for smaller encodings.

A pure python implementation is highly impractical if all you want to do is convert from .mp3 to .wav. But if you want to explore how .mp3 and related schemes work, having something which you can easily tinker with, even if the code runs 1000 times slower than what ffmpeg uses, can actually be useful, especially if written in a way which allows the reader of the source code to see how .mp3 compression works. For example see http://bugra.github.io/work/notes/2014-07-12/discre-fourier-cosine-transform-dft-dct-image-compression/ for an IPython workbook that walks through how frequency domain transforms are used in image compression schemes like JPEG. Something like that for MP3 compression and similar would be useful for people learning about compression.

An .mp3 file is basically a sequence of MP3 frames, each of which has a header and a data component. The first task then is to write a Python class (or classes) to represent these, and read them from an .mp3 file. First read the file in binary mode, that is, f = open(filename, "rb") and then data = f.read(). On a modern machine, given that a typical 5-minute song in .mp3 is about 5 MB, you may as well read the whole thing in one go.
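As a starting point for that first task, here is a minimal sketch of walking the frames in a byte string. It is not a decoder: it only reads headers, and it assumes MPEG-1 Layer III frames with a fixed (non "free format") bitrate. The function names id3v2_size, parse_frame_header and iter_frames are my own, not part of any library. A leading ID3v2 tag, if present, is skipped first, since those bytes are not audio.

```python
import struct

# Lookup tables for MPEG-1 Layer III only; other versions and layers use
# different tables. None marks "free format" or reserved values, which this
# sketch does not handle.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]


def id3v2_size(data):
    """Return the length in bytes of a leading ID3v2 tag, or 0 if there is none."""
    if len(data) < 10 or data[:3] != b"ID3":
        return 0
    size = 0
    for byte in data[6:10]:  # 28-bit "synchsafe" size: 7 useful bits per byte
        size = (size << 7) | (byte & 0x7F)
    footer = 10 if data[5] & 0x10 else 0  # optional 10-byte footer flag
    return 10 + size + footer


def parse_frame_header(data, offset):
    """Parse the 4-byte header at offset; return a dict, or None if it is not
    an MPEG-1 Layer III header this sketch understands."""
    if offset + 4 > len(data):
        return None
    (word,) = struct.unpack_from(">I", data, offset)
    sync = word >> 21                # 11 sync bits, all set
    version = (word >> 19) & 0x3     # 3 means MPEG-1
    layer = (word >> 17) & 0x3       # 1 means Layer III
    bitrate_kbps = BITRATES_KBPS[(word >> 12) & 0xF]
    sample_rate_hz = SAMPLE_RATES_HZ[(word >> 10) & 0x3]
    padding = (word >> 9) & 0x1
    if sync != 0x7FF or version != 3 or layer != 1:
        return None
    if bitrate_kbps is None or sample_rate_hz is None:
        return None
    # Frame length in bytes, header included, for MPEG-1 Layer III.
    length = 144 * bitrate_kbps * 1000 // sample_rate_hz + padding
    return {
        "offset": offset,
        "length": length,
        "bitrate_kbps": bitrate_kbps,
        "sample_rate_hz": sample_rate_hz,
    }


def iter_frames(data):
    """Yield a header dict for each frame found in data."""
    offset = id3v2_size(data)
    while offset + 4 <= len(data):
        frame = parse_frame_header(data, offset)
        if frame is None:
            offset += 1  # not a header here; resynchronise byte by byte
        else:
            yield frame
            offset += frame["length"]
```

With data read as described above, list(iter_frames(data)) gives the offset and length of each frame. Note that a byte pattern resembling a header can occasionally occur by chance in non-audio bytes, so a more careful reader would also check that the next header follows where the computed length says it should.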

It may also be worth writing a simpler (and far less efficient) coding scheme along these lines to explore how it works, gradually adding the tricks that schemes like MP3 and AAC use as you go. For example, split a PCM input file into 1024-sample blocks, apply an FFT or DCT or similar, transform back again, and check that you get your original data back. Then explore how you can throw data away from the frequency-transformed version, and see what effect it has when transformed back to PCM data. The end result will be very poor at first, but by seeing the problems, and seeing what e.g. MP3 and AAC do, you can learn why these compression schemes do things the way they do.
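A minimal sketch of that experiment in pure Python might look like the following. It uses a naive DCT-II and its inverse on a single block, and "throws data away" by keeping only the strongest coefficients. This is a toy to tinker with, not how MP3 itself works, and the names dct, idct and keep_largest are my own. The naive transform is O(N^2) per block, so it is slow, which fits the learning use case rather than the getting-stuff-done one.

```python
import math


def dct(block):
    """Forward DCT-II of a list of samples (naive O(N^2) version)."""
    n = len(block)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(block))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct() above (a scaled DCT-III)."""
    n = len(coeffs)
    return [
        (coeffs[0] + 2 * sum(coeffs[k] * math.cos(math.pi / n * (i + 0.5) * k)
                             for k in range(1, n))) / n
        for i in range(n)
    ]


def keep_largest(coeffs, keep):
    """Zero every coefficient except the `keep` largest in magnitude."""
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    kept = set(order[:keep])
    return [c if k in kept else 0.0 for k, c in enumerate(coeffs)]


if __name__ == "__main__":
    # A made-up test block: two sine waves, 64 samples long.
    block = [math.sin(2 * math.pi * 3 * i / 64) + 0.5 * math.sin(2 * math.pi * 7 * i / 64)
             for i in range(64)]
    exact = idct(dct(block))
    lossy = idct(keep_largest(dct(block), 8))
    print("round-trip error:", max(abs(a - b) for a, b in zip(block, exact)))
    print("lossy error:     ", max(abs(a - b) for a, b in zip(block, lossy)))
```

Changing how many coefficients are kept, and comparing the lossy error each time, is one simple way to see the size versus quality trade-off described above.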

In short, if your use case is a 'getting stuff done' one, you probably don't want to use Python. If, on the other hand, your use case is a 'learning how stuff gets done' one, that is different. (As a rough rule of thumb, what you could do with optimised assembly on a Pentium 100 from the 90s, you can do at roughly the same performance using Python on a modern Core i5 -- something like that -- there is a factor of 100 or so in raw performance, and a similar slowdown from using Python).

Idocrase answered 31/1, 2016 at 14:59 Comment(0)
S
4

I use pydub from Jiaaro's answer, but I wanted to add some code for this question that can actually extract the PCM data from the MP3 file.

Here is a commented, complete program for reading an MP3 file, extracting the PCM data into a list of signed integers, then plotting it with matplotlib. Obviously pydub and matplotlib will need to be installed.

from pydub import AudioSegment
from matplotlib import pyplot as plt

# This will open and read the audio file with pydub.  Replace the file path with
# your own file.
audio_file = AudioSegment.from_file("./2021-02-23-22:00:11-edited.mp3")

# Set up a list for us to dump PCM samples into, and create a 'data' variable
# so we don't need to type audio_file._data again
data = audio_file._data
pcm16_signed_integers = []

# This loop decodes the bytestring into PCM samples.
# The bytestring is a stream of little-endian encoded signed integers.
# This basically just cuts each two-byte sample out of the bytestring, converts
# it to an integer, and appends it to the list of samples.
for sample_index in range(len(data)//2):
    sample = int.from_bytes(data[sample_index*2:sample_index*2+2], 'little', signed=True)
    pcm16_signed_integers.append(sample)

# Now plot the samples!
plt.plot(pcm16_signed_integers)
plt.show()
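As a possible alternative to the loop above, the standard-library array module can unpack the whole bytestring in one call, which tends to be much faster for long files. This is only a sketch under the same assumption as the loop: the data is 16-bit signed little-endian PCM (for multi-channel audio the samples are interleaved). The function names are my own. The second helper scales the samples to floats in the range -1.0 to 1.0 by dividing by 32768, which is one common convention.

```python
import array
import sys


def pcm16_to_ints(data):
    """Decode little-endian signed 16-bit PCM bytes into an array of ints."""
    samples = array.array("h")  # "h" is a signed short, 2 bytes on common platforms
    usable = len(data) - len(data) % 2  # ignore a trailing odd byte, if any
    samples.frombytes(data[:usable])
    if sys.byteorder == "big":
        samples.byteswap()  # the data is little-endian; fix up on big-endian hosts
    return samples


def pcm16_to_floats(data):
    """Decode the same bytes into floats scaled to the range [-1.0, 1.0)."""
    return [sample / 32768.0 for sample in pcm16_to_ints(data)]
```

In the program above, pcm16_to_ints(data) would stand in for the loop, and its result can be passed to plt.plot directly.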

Here is what my plot looked like (I zoomed in to a good section):

[Image: Audio data plotted with Matplotlib]

And yes, this chart is generated from the code above :D

Sudhir answered 24/2, 2021 at 7:5 Comment(1)
Nice. I found at least one use for the raw data from MP3! - Harbor
U
3

Have you tried opening the file in read binary mode?

with open("test.mp3", "rb") as f:
    first16bytes = f.read(16)
    # ...keep reading from f as needed
Unpaid answered 11/7, 2013 at 4:55 Comment(4)
I'm pretty sure the OP wants the audio data which is encoded in the mp3. In that case the mp3 will need to be decoded before it is read. - Ardys
@Ardys you should post an answer explaining that. - Unpaid
How do you interpret what the first 16 bytes represent? Normally I'm used to seeing a frequency-over-time representation in two dimensions. How does this read for the single dimension of bytes? - Cristincristina
This does not even address the MP3 decoding problem. Opening and reading files is one of the first things you learn in Python 3. I think this answer needs a section on how to actually decode the file. I just tested pydub from Jiaaro's answer, and it works well. But it would be incompatible with the code currently here. Maybe another good library exists that you can mention to contribute to the answers here? - Sudhir
