FFmpeg - resampling from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16 got very bad sound quality (slow, out of tune, noise)

Asked 2/4, 2014 at 20:37 Answered 13/1, 2015 at 0:42

I am confused by the resampling result in the new ffmpeg. I decode AAC audio into PCM, and ffmpeg shows the audio information as:

Stream #0:0: Audio: aac, 44100 Hz, stereo, fltp, 122 kb/s

In the new ffmpeg, the output samples are in fltp format, so I have to convert them from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16.

PS: in old ffmpeg, such as libavcodec 54.12.100, the output is directly S16, so no resampling is needed and there is no sound quality problem.

I then tried three ways of resampling:

using swr_convert
using avresample_convert
converting manually

All of them yield the same result: the sound quality is really bad, very slow and out of tune, with some noise too.

My resampling code is as follows:

void resampling(AVFrame* frame_, AVCodecContext* pCodecCtx, int64_t want_sample_rate, uint8_t* outbuf){
    SwrContext      *swrCtx_ = 0;
    AVAudioResampleContext *avr = 0;

    // Initializing the sample rate convert. We only really use it to convert float output into int.
    int64_t wanted_channel_layout = AV_CH_LAYOUT_STEREO;

#ifdef AV_SAMPLEING
    avr = avresample_alloc_context();
    av_opt_set_int(avr, "in_channel_layout", frame_->channel_layout, 0);
    av_opt_set_int(avr, "out_channel_layout", wanted_channel_layout, 0);
    av_opt_set_int(avr, "in_sample_rate", frame_->sample_rate, 0);
    av_opt_set_int(avr, "out_sample_rate", 44100, 0);
    av_opt_set_int(avr, "in_sample_fmt", pCodecCtx->sample_fmt, 0); //AV_SAMPLE_FMT_FLTP
    av_opt_set_int(avr, "out_sample_fmt", AV_SAMPLE_FMT_S16, 0);
    av_opt_set_int(avr, "internal_sample_fmt", pCodecCtx->sample_fmt, 0);
    avresample_open(avr);
    avresample_convert(avr, &outbuf, frame_->linesize[0], frame_->nb_samples, frame_->extended_data, frame_->linesize[0], frame_->nb_samples);
    avresample_close(avr);
    return;
#endif

#ifdef USER_SAMPLEING
    if (pCodecCtx->sample_fmt == AV_SAMPLE_FMT_FLTP)
    {
            int nb_samples = frame_->nb_samples;
            int channels = frame_->channels;
            int outputBufferLen = nb_samples * channels * 2; // bytes: samples * channels * sizeof(int16_t)
            auto outputBuffer = (int16_t*)outbuf;

            for (int i = 0; i < nb_samples; i++)
            {
                    for (int c = 0; c < channels; c++)
                    {
                            float* extended_data = (float*)frame_->extended_data[c];
                            float sample = extended_data[i];
                            if (sample < -1.0f) sample = -1.0f;
                            else if (sample > 1.0f) sample = 1.0f;
                            outputBuffer[i * channels + c] = (int16_t)round(sample * 32767.0f);
                    }
            }
            return;
    }
#endif
    swrCtx_ = swr_alloc_set_opts(
            NULL, //swrCtx_,
            wanted_channel_layout,
            AV_SAMPLE_FMT_S16,
            want_sample_rate,
            pCodecCtx->channel_layout,
            pCodecCtx->sample_fmt,
            pCodecCtx->sample_rate,
            0,
            NULL);

    if (!swrCtx_ || swr_init(swrCtx_) < 0) {
            printf("swr_init: Failed to initialize the resampling context");
            return;
    }

    // convert audio to AV_SAMPLE_FMT_S16
    int swrRet = swr_convert(swrCtx_, &outbuf, frame_->nb_samples, (const uint8_t **)frame_->extended_data, frame_->nb_samples);
    if (swrRet < 0) {
            printf("swr_convert: Error while converting %d", swrRet);
            return;
    }
}

What should I do?

PS1: playing the file with ffplay works fine.

PS2: saving the resampled S16 PCM to a file and playing it shows the same sound quality problem.

Thanks a lot for your help and suggestions!

I've also noticed that in old ffmpeg, AAC is recognized as FLT format and decoded directly into 16-bit PCM, while in new ffmpeg, AAC is treated as FLTP format and still produces 32-bit IEEE float output.

Thus the same code produces quite different output with different versions of ffmpeg. So I'd like to ask: what is the right way to convert AAC audio to 16-bit PCM in the new version?

Thanks a lot in advance!

Crowson answered 2/4, 2014 at 20:37 Comment(7)

Why not let FFmpeg do the work and output 16-bit PCM for you? – Pen 2/4, 2014 at 21:3

Please tell me how. It's supposed to be an audio stream; I ran my test on an AAC file only to make the problem easier to analyze, but the result is the same. Could you explain how to decode AAC and output 16-bit PCM directly? (In old ffmpeg this is exactly the default behavior, which I appreciated very much.) Thanks a lot! – Crowson 2/4, 2014 at 21:44

-f s16le -acodec pcm_s16le – Pen 2/4, 2014 at 21:49

It's in code; I cannot use an external exe file. Please tell me how to code this with ffmpeg. Thanks! – Crowson 2/4, 2014 at 21:52

There are many ways to interface with FFmpeg. I can't help you with that, but the principle is the same. Set the format to s16le and the output audio codec to pcm_s16le. – Pen 2/4, 2014 at 22:19

Well, the new FFmpeg forces AAC to FLTP format, see for example github.com/libav/libav/blob/master/libavcodec/aacdec.c#L993, while the old one does not, see for example ffmpeg.org/doxygen/0.11/libavcodec_2aacdec_8c-source.html line 00878, and there is no function to change this format! – Crowson 3/4, 2014 at 0:48

Could you help me with an example of how to convert AAC to PCM, please? We prefer no DLL and no exe; only source code or a static lib link is OK. Thanks a lot. – Crowson 3/4, 2014 at 0:54

You need to remember that AV_SAMPLE_FMT_FLTP is a planar mode. If your code is expecting an AV_SAMPLE_FMT_S16 (interleaved mode) output, you need to reorder the samples after converting. Considering 2 audio channels and using interleaved mode, the samples are ordered as "c0, c1, c0, c1, c0, c1, ...". Planar mode is "c0, c0, c0, ..., c1, c1, c1, ...".

Details here: http://www.ffmpeg.org/doxygen/2.0/samplefmt_8h.html
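
The reordering described above can be sketched in isolation from FFmpeg. The helper name `interleave` and the use of `std::vector` are assumptions made for illustration only; with a real decoded frame, `planes[c]` would correspond to `frame->extended_data[c]`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of FFmpeg: packs planar channels into one
// interleaved buffer. planes[c][i] is sample i of channel c, which mirrors
// how a planar sample format stores one plane per channel.
template <typename Sample>
std::vector<Sample> interleave(const std::vector<std::vector<Sample>>& planes)
{
    if (planes.empty()) return {};
    const std::size_t channels = planes.size();
    const std::size_t nb_samples = planes[0].size();

    // Every plane must hold the same number of samples.
    for (const auto& plane : planes) assert(plane.size() == nb_samples);

    std::vector<Sample> packed(nb_samples * channels);
    for (std::size_t i = 0; i < nb_samples; ++i)
        for (std::size_t c = 0; c < channels; ++c)
            packed[i * channels + c] = planes[c][i]; // c0, c1, c0, c1, ...
    return packed;
}
```

For two channels holding {10, 11, 12} and {20, 21, 22}, the packed order is 10, 20, 11, 21, 12, 22. Playing planar data as if it were already interleaved mixes the channels sample by sample, which is one plausible source of noise.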

Scow answered 29/4, 2014 at 3:40 Comment(0)

I've had good luck doing something similar. In your code block:

int nb_samples = frame_->nb_samples;
int channels = frame_->channels;
int outputBufferLen = nb_samples * channels * 2;
auto outputBuffer = (int16_t*)outbuf;

for (int i = 0; i < nb_samples; i++) {
   for (int c = 0; c < channels; c++) {
      float* extended_data = (float*)frame_->extended_data[c];
      float sample = extended_data[i];
      if (sample < -1.0f) sample = -1.0f;
      else if (sample > 1.0f) sample = 1.0f;
      outputBuffer[i * channels + c] = (int16_t)round(sample * 32767.0f);
   }
}

Try replacing with the following:

int nb_samples = frame_->nb_samples;
int channels = frame_->channels;
int outputBufferLen = nb_samples * channels * 2;
auto outputBuffer = (int16_t*)outbuf;

for(int i=0; i < nb_samples; i++) {
   for(int c=0; c < channels; c++) {
      // extended_data[c] is the plane for channel c; index it with i to read the sample
      outputBuffer[i*channels+c] = (int16_t)(((float *)frame_->extended_data[c])[i] * 32767.0f);
   }
}
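
One caveat about the replacement loop above: it casts to int16_t without the clamping that the original loop had. Decoded float samples may slightly exceed the range [-1.0, 1.0], and in C++ converting an out-of-range floating-point value to an integer type is undefined behavior. A minimal sketch of a per-sample conversion that keeps the clamp (the name `float_to_s16` is hypothetical, not an FFmpeg function):

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: convert one float sample to signed 16-bit PCM.
inline std::int16_t float_to_s16(float sample)
{
    assert(!std::isnan(sample)); // NaN has no meaningful PCM value

    // Clamp first so the scaled value always fits in 16 bits.
    if (sample < -1.0f) sample = -1.0f;
    else if (sample > 1.0f) sample = 1.0f;

    // Round to nearest instead of truncating toward zero.
    return static_cast<std::int16_t>(std::lround(sample * 32767.0f));
}
```

Inside the loop this would be used as `outputBuffer[i*channels+c] = float_to_s16(((float *)frame_->extended_data[c])[i]);`.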

Sapanwood answered 18/11, 2014 at 18:17 Comment(0)

You need to resample only when you convert to a different sample rate. If the sample rate is the same, you only need to convert from the planar floating-point format to the interleaved 16-bit integer format.
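
The difference shows up in the output size. At the same rate the sample count per channel is unchanged, so only the byte layout changes; at a different rate the sample count scales with the rate ratio. The sketch below uses hypothetical helper names and deliberately ignores any samples a real resampler buffers internally, so treat the second function as an estimate rather than what libswresample reports.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bytes needed for interleaved S16 output when the sample rate is unchanged.
inline std::size_t s16_buffer_bytes(int nb_samples, int channels)
{
    assert(nb_samples >= 0 && channels > 0);
    return static_cast<std::size_t>(nb_samples) * channels * sizeof(std::int16_t);
}

// Estimated samples per channel after a rate change, rounded up.
// Assumption: no internal resampler delay is taken into account.
inline std::int64_t resampled_count(std::int64_t nb_samples, int in_rate, int out_rate)
{
    assert(nb_samples >= 0 && in_rate > 0 && out_rate > 0);
    return (nb_samples * out_rate + in_rate - 1) / in_rate;
}
```

For a 1024-sample stereo frame, `s16_buffer_bytes(1024, 2)` gives 4096 bytes, and `resampled_count(1024, 44100, 44100)` stays at 1024. Writing or playing a different number of samples than the conversion actually produced changes the playback speed and pitch.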

Serpigo answered 13/1, 2015 at 0:42 Comment(0)
