So I want to grab a frame from a video at a specific time using libav for the use as a thumbnail.
What I'm using is the following code. It compiles and works fine (in regards to retrieving a picture at all), yet I'm having a hard time getting it to retrieve the right picture.
I simply can't get my head around the all but clear logic behind libav's apparent use of multiple time-bases per video. Specifically figuring out which functions expect/return which type of time-base.
The docs were of basically no help whatsoever, unfortunately. SO to the rescue?
#define ABORT(x) do {fprintf(stderr, x); exit(1);} while(0)
av_register_all();
AVFormatContext *format_context = ...;
AVCodec *codec = ...;
AVStream *stream = ...;
AVCodecContext *codec_context = ...;
int stream_index = ...;
// open codec_context, etc.
AVRational stream_time_base = stream->time_base;
AVRational codec_time_base = codec_context->time_base;
printf("stream_time_base: %d / %d = %.5f\n", stream_time_base.num, stream_time_base.den, av_q2d(stream_time_base));
printf("codec_time_base: %d / %d = %.5f\n\n", codec_time_base.num, codec_time_base.den, av_q2d(codec_time_base));
AVFrame *frame = avcodec_alloc_frame();
printf("duration: %lld @ %d/sec (%.2f sec)\n", format_context->duration, AV_TIME_BASE, (double)format_context->duration / AV_TIME_BASE);
printf("duration: %lld @ %d/sec (stream time base)\n\n", format_context->duration / AV_TIME_BASE * stream_time_base.den, stream_time_base.den);
printf("duration: %lld @ %d/sec (codec time base)\n", format_context->duration / AV_TIME_BASE * codec_time_base.den, codec_time_base.den);
double request_time = 10.0; // 10 seconds. Video's total duration is ~20sec
int64_t request_timestamp = request_time / av_q2d(stream_time_base);
printf("requested: %.2f (sec)\t-> %2lld (pts)\n", request_time, request_timestamp);
av_seek_frame(format_context, stream_index, request_timestamp, 0);
AVPacket packet;
int frame_finished;
do {
if (av_read_frame(format_context, &packet) < 0) {
break;
} else if (packet.stream_index != stream_index) {
av_free_packet(&packet);
continue;
}
avcodec_decode_video2(codec_context, frame, &frame_finished, &packet);
} while (!frame_finished);
// do something with frame
int64_t received_timestamp = frame->pkt_pts;
double received_time = received_timestamp * av_q2d(stream_time_base);
printf("received: %.2f (sec)\t-> %2lld (pts)\n\n", received_time, received_timestamp);
Running this with a test movie file I get this output:
stream_time_base: 1 / 30000 = 0.00003
codec_time_base: 50 / 2997 = 0.01668
duration: 20062041 @ 1000000/sec (20.06 sec)
duration: 600000 @ 30000/sec (stream time base)
duration: 59940 @ 2997/sec (codec time base)
requested: 10.00 (sec) -> 300000 (pts)
received: 0.07 (sec) -> 2002 (pts)
The times don't match. What's going on here? What am I doing wrong?
While searching for clues I stumbled upon this this statement from the libav-users mailing list…
[...] packet PTS/DTS are in units of the format context's time_base,
where the AVFrame->pts value is in units of the codec context's time_base.In other words, the container can have (and usually does) a different time_base than the codec. Most libav players don't bother using the codec's time_base or pts since not all codecs have one, but most containers do. (This is why the dranger tutorial says to ignore AVFrame->pts)
…which confused me even more, given that I couldn't find any such mention in the official docs.
Anyway, I replaced…
double received_time = received_timestamp * av_q2d(stream_time_base);
…with…
double received_time = received_timestamp * av_q2d(codec_time_base);
…and the output changed to this…
...
requested: 10.00 (sec) -> 300000 (pts)
received: 33.40 (sec) -> 2002 (pts)
Still no match. What's wrong?
avformat_seek_file
instead ofav_seek_frame
. Thanks for your great explanation of timestamps in libav though. Have an upvote! – Thinner