What does librosa.load return?

I'm working with the librosa library, and I would like to know what information the librosa.load function returns when I read an audio (.wav) file. Is it the instantaneous sound pressure in Pa, or just the instantaneous amplitude of the sound signal with no units?

Lime answered 24/5, 2020 at 13:12 Comment(2)

It returns a floating-point audio time series. What is your main question? – Utopian 24/5, 2020 at 13:25

"I would like to know what is the information that is returned by librosa.load function when I read a audio(.wav) file using it. Is it the instantaneous sound pressure in pa? or the just the instantaneous amplitude of the sound signal with no unit? " – Lime 24/5, 2020 at 13:27

To confirm the previous answer, librosa.load returns a time series, which the librosa glossary defines as:

"time series: Typically an audio signal, denoted by y, and represented as a one-dimensional numpy.ndarray of floating-point values. y[t] corresponds to the amplitude of the waveform at sample t."

The amplitude is usually measured as a function of the change in pressure around the microphone or receiver device that originally picked up the audio. (See more here).
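To make the "no units" point concrete, here is a minimal sketch using only the Python standard library. It assumes the common convention of scaling 16-bit PCM integers by 32768 to get floats in [-1.0, 1.0); it is not librosa's actual decoding code (librosa delegates file reading to a backend), and the helper names write_int16_wav and read_as_float are made up for illustration. The resulting floats are relative to digital full scale, so converting them to Pa would need calibration information that a plain .wav file does not carry.

```python
import io
import struct
import wave


def write_int16_wav(samples, sr):
    """Return the bytes of a mono 16-bit PCM WAV file holding `samples`."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(sr)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()


def read_as_float(wav_bytes):
    """Decode 16-bit PCM and scale it to dimensionless floats in [-1.0, 1.0)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        n = w.getnframes()
        raw = w.readframes(n)
        sr = w.getframerate()
    ints = struct.unpack("<%dh" % n, raw)
    # Assumed convention: divide by 2**15 so full scale maps to about +/-1.0.
    return [s / 32768.0 for s in ints], sr


pcm = [0, 16384, -16384, 32767, -32768]  # hypothetical sample values
y, sr = read_as_float(write_int16_wav(pcm, 8000))
print(y, sr)
```

Half of full scale (16384) comes back as 0.5, whatever sound pressure the microphone actually saw.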

Thorma answered 16/9, 2020 at 11:4 Comment(2)

Excellent! This is the clarification I was looking for . – Lime 17/9, 2020 at 16:19

This wasn't easy to find. Thank you for the help! – Mitch 8/3, 2022 at 15:5

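As far as I know, the amplitude reflects the change in air pressure at the microphone during recording, but the values librosa returns are dimensionless floating-point samples, not calibrated pressures in Pa. According to the librosa.load documentation here, the function returns two things:

The sample rate sr: the number of samples per second in the returned signal.
The audio time series y, a NumPy array of floating-point amplitudes:
- By default (mono=True) it is one-dimensional, with shape (n_samples,).
- With mono=False, a multichannel file is returned with shape (n_channels, n_samples): the first axis indexes the channels and the last axis indexes the samples.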

Here is an example from the official documentation:

>>> import librosa
>>> filename = librosa.util.example_audio_file()
>>> y, sr = librosa.load(filename)
>>> sr  # sample rate
22050
>>> y.shape  # mono (1 channel)
(1355168,)
>>> y.shape[0] / sr  # duration of the audio file in seconds
61.45886621315193

As we can see:

The sample rate is 22050, which means the loaded signal has 22050 samples per second. This is librosa's default target rate, not necessarily the rate at which the file was recorded.
y.shape is (1355168,), which means the signal holds 1355168 samples on a single channel (mono).
The duration of the audio in seconds is the total number of samples divided by the sample rate: 1355168 / 22050 ≈ 61.46.
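The duration arithmetic can be sketched as a small helper. This assumes librosa's layout, in which the last axis of y indexes samples, so the same function handles a mono shape and a multichannel one. The name duration_seconds and the stereo shape are hypothetical, chosen only for illustration.

```python
def duration_seconds(shape, sr):
    """Duration in seconds from y.shape and sr.

    Assumes the last axis of the array indexes samples, so that
    (n_samples,) and (n_channels, n_samples) are both handled.
    """
    return shape[-1] / sr


# Shape and rate taken from the example above (mono).
print(duration_seconds((1355168,), 22050))

# Hypothetical stereo version of the same signal: same duration.
print(duration_seconds((2, 1355168), 22050))
```

Using shape[-1] rather than len(y) matters only for multichannel arrays, where len(y) would give the number of channels.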

Added from comments

Do note that if you read the file as y, sr = librosa.load(filename), librosa will resample the signal to 22050 Hz by default. As stated in the documentation, if you want to get the native sampling rate, you should read the signal as y, sr = librosa.load(filename, sr=None).
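Here is a rough sketch of how to see the difference yourself. It reads the native rate straight from a WAV header with the standard-library wave module, estimates how many samples a resampled signal would have, and wraps the two librosa.load calls in a function. The helper names and the path argument are hypothetical. The length estimate assumes the output length is the input length times the rate ratio, rounded up, so treat it as an approximation rather than a guarantee.

```python
import io
import math
import wave

LIBROSA_DEFAULT_SR = 22050  # default target rate of librosa.load


def native_sample_rate(wav_file):
    """Read the sample rate stored in a WAV header (path or file object)."""
    with wave.open(wav_file, "rb") as w:
        return w.getframerate()


def expected_length_after_resampling(n_native, native_sr,
                                     target_sr=LIBROSA_DEFAULT_SR):
    """Approximate sample count after resampling (assumed: ratio, rounded up)."""
    return math.ceil(n_native * target_sr / native_sr)


def load_both_ways(path):
    """Load `path` with the default rate and with the native rate."""
    import librosa  # imported lazily so the helpers above work without it

    y_default, sr_default = librosa.load(path)          # resampled to 22050 Hz
    y_native, sr_native = librosa.load(path, sr=None)   # keeps the file's rate
    return (y_default.shape, sr_default), (y_native.shape, sr_native)


# Two seconds recorded at 44100 Hz shrink to about 44100 samples at 22050 Hz.
print(expected_length_after_resampling(88200, 44100))
```

The duration is the same either way; only the number of samples per second changes.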

Alyshaalysia answered 24/5, 2020 at 13:31 Comment(2)

Do note that if you read the file as y, sr = librosa.load(filename), librosa will resample the signal to 22050 Hz by default. As stated in the documentation, if you want to get the native sampling rate, you should read the signal as y, sr = librosa.load(filename, sr=None) – Beeline 24/8, 2022 at 16:44

thx @ArturoMoncada-Torres for this important info, I have added it to the answer! – Alyshaalysia 24/8, 2022 at 18:2

To add to the above answer, you can also call librosa.get_duration(y=y, sr=sr) to get the duration of the audio in seconds (recent librosa versions require keyword arguments here). Alternatively, len(y) / sr gives the same result for a mono signal.

Chancery answered 2/9, 2020 at 16:43 Comment(0)
