sampling audio doesn't preserve waves (vectors)!

I made a Telegram bot, and one of its jobs is to create samples from audio files. For most audio files sent to it, the sample is perfectly fine; something like this:

[Image: voice message with its waveform displayed]

However, for some audios, the sample looks a bit odd:

[Image: voice message with a flat, empty waveform]

As you can see, the waveform for this file is not shown! (I can assure you that the audio is not empty.)

To create the sample, I use pydub (Thanks, James!). Here's the part where I create the sample:

from pydub import AudioSegment

song = AudioSegment.from_mp3('song.mp3')
# start and end are in seconds; pydub slices in milliseconds
sliced = song[start*1000:end*1000]
sliced.export('song.ogg', format='ogg', parameters=["-acodec", "libopus"])

Then I send the sample using the bot.send_voice method, like this:

bot.send_voice(
    chat_id=update.message.chat.id,
    voice=open('song.ogg', 'rb'),
    caption=settings.caption,
    parse_mode=ParseMode.MARKDOWN,
    timeout=1000
)

The documentation of Telegram Bot API says:

Use this method to send audio files, if you want Telegram clients to display the file as a playable voice message. For this to work, your audio must be in an .ogg file encoded with OPUS (other formats may be sent as Audio or Document).

That's why in this line of code:

sliced.export('song.ogg', format='ogg', parameters=["-acodec", "libopus"])

I used parameters=["-acodec", "libopus"].

Can anyone tell me what I'm doing wrong? Thanks in advance!

Mastodon answered 26/3, 2019 at 17:34

Shot in the dark guess:

Having just sampled those two Muse songs, I can say that "Pressure" is a much louder rock song than "The Void". I suspect the Telegram service itself simply treats the music as noise when performing speech-to-text translation. Unlike speech, which has a wide dynamic range between spoken words, music tends to stay at roughly the same volume. The relative volume of each sample is therefore about the same, which would produce a flat line.
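One way to test this guess locally is to measure how much the loudness varies across a slice. The sketch below is only an illustration: the helper names `window_rms` and `loudness_spread_db` are made up for this example, and it does not reproduce whatever Telegram actually does to draw the waveform. It takes a plain sequence of integer PCM samples and reports the gap in dB between the loudest and quietest window.

```python
import math


def window_rms(samples, window):
    """RMS level of consecutive, non-overlapping windows of integer PCM samples."""
    levels = []
    for i in range(0, len(samples) - window + 1, window):
        chunk = samples[i:i + window]
        levels.append(math.sqrt(sum(s * s for s in chunk) / window))
    return levels


def loudness_spread_db(levels, floor=1.0):
    """Gap in dB between the loudest and the quietest window.

    Levels below `floor` are clamped so that digital silence
    does not cause a division by zero.
    """
    loudest = max(max(levels), floor)
    quietest = max(min(levels), floor)
    return 20 * math.log10(loudest / quietest)


# Uniformly loud signal: every window has the same level.
steady = [1000, -1000] * 100
# Loud passage followed by a quiet one, as in speech with pauses.
varied = [1000, -1000] * 50 + [10, -10] * 50

steady_spread = loudness_spread_db(window_rms(steady, 50))
varied_spread = loudness_spread_db(window_rms(varied, 100))
```

With pydub, the samples for a slice could be obtained with something like `sliced.set_channels(1).get_array_of_samples()`. A spread near 0 dB for the problematic song and a much larger one for the working song would be consistent with the guess above, though it would not prove it.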

Introject answered 28/3, 2019 at 18:33

Since it happens only with some of the songs, I believe the issue is linked to the original song format. Make sure that pydub got the file parameters right, e.g. number of channels, sample width, frame rate, etc. Sometimes the resulting sample format also changes, so you can get audio in the range [-1..1] (float), and sometimes in [-32768..32767] (integer).
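A quick way to compare a working file with a failing one is to print what pydub decoded for each. This is a minimal sketch: `describe_segment` and `describe_file` are hypothetical helper names, and the code only reads attributes that pydub's `AudioSegment` exposes (`channels`, `sample_width`, `frame_rate`, `max`, `max_possible_amplitude`).

```python
def describe_segment(segment):
    """Collect the parameters pydub reports for an AudioSegment."""
    return {
        "channels": segment.channels,
        "sample_width_bytes": segment.sample_width,
        "frame_rate_hz": segment.frame_rate,
        "duration_ms": len(segment),
        # Peak sample relative to full scale: close to 0 means a near-silent slice.
        "peak_ratio": segment.max / segment.max_possible_amplitude,
    }


def describe_file(path):
    """Decode a file with pydub (needs pydub and ffmpeg installed) and describe it."""
    from pydub import AudioSegment  # imported here so describe_segment stays dependency-free

    return describe_segment(AudioSegment.from_file(path))
```

Calling `describe_file('song.mp3')` for a good and a bad song and comparing the two dictionaries should show whether the channel count, sample width, or frame rate differs between them. Running `describe_segment(sliced)` on the slice just before export shows whether the slice itself is unexpectedly quiet.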

Scabious answered 31/3, 2019 at 6:27
