Understanding PTS and DTS in video frames

I had fps issues when transcoding from AVI to MP4 (x264). The problem turned out to be in the PTS and DTS values, so lines 12-15 were added before the av_interleaved_write_frame call:

1.  AVFormatContext* outContainer = NULL;
2.  avformat_alloc_output_context2(&outContainer, NULL, "mp4", "c:\\test.mp4");
3.  AVCodec *encoder = avcodec_find_encoder(AV_CODEC_ID_H264);
4.  AVStream *outStream = avformat_new_stream(outContainer, encoder);
5.  // outStream->codec initiation
6.  // ...
7.  avformat_write_header(outContainer, NULL);

8.  // reading and decoding packet
9.  // ...
10. avcodec_encode_video2(outStream->codec, &encodedPacket, decodedFrame, &got_frame);
11. 
12. if (encodedPacket.pts != AV_NOPTS_VALUE)
13.     encodedPacket.pts =  av_rescale_q(encodedPacket.pts, outStream->codec->time_base, outStream->time_base);
14. if (encodedPacket.dts != AV_NOPTS_VALUE)
15.     encodedPacket.dts = av_rescale_q(encodedPacket.dts, outStream->codec->time_base, outStream->time_base);
16. 
17. av_interleaved_write_frame(outContainer, &encodedPacket);

After reading many posts I still do not understand:

  1. outStream->codec->time_base = 1/25 and outStream->time_base = 1/12800. I set the first one myself, but I cannot figure out who set 12800, or why. I noticed that before line (7) outStream->time_base = 1/90000, and right after it changes to 1/12800. Why? When I transcode from avi to avi, i.e. change line (2) to avformat_alloc_output_context2(&outContainer, NULL, "avi", "c:\\test.avi");, outStream->time_base stays at 1/25 both before and after line (7), unlike in the mp4 case. Why?
  2. What is the difference between time_base of outStream->codec and outStream?
  3. To calculate the pts, av_rescale_q takes the two time bases, cross-multiplies their fractions, and then computes the pts. Why does it work this way? While debugging I saw that encodedPacket.pts already increments by 1 per frame, so why change it if it already has a value?
  4. At the beginning the dts value is -2, and after each rescaling it is still negative, yet the video plays correctly! Shouldn't it be positive?
Bayonne answered 27/11, 2012 at 23:46 Comment(0)
  1. The time_base is just a unit of measurement. Different units may be used to represent the same times (approximately, if they are not exact multiples). In some cases a container format requires a certain time base, and the muxer will set it to that. In other cases the container doesn't require a time base but has a default that you might have to override. I'm not sure about 1/12800 specifically; I know 1/600 is a special value in the MP4 spec.

  2. The two time bases are the units of measurement of time for the codec and for the container. If using constant fps, the codec unit of measurement is commonly set to the interval between each frame and the next (the duration that each frame gets displayed), so that frame times are successive integers. It doesn't have to be set to 1/fps, however, as long as the pts times are correct in whatever units are used.

  3. What you describe is simply what one has to do to convert from one unit to another (i.e., multiply by the old unit, divide by the new one). A time t in units of a/b can be converted to units of c/d as t*(a*d)/(b*c).

  4. The dts sequence can start from any value, there is no special significance to dts 0. At start of playback, the difference between wall clock time and the starting dts is computed, and all future dts are converted to wall clock using that. A video stream with dts=-10, -9, -8, ... is perfectly ok. The difference between successive dts is what is used, the absolute values don't matter.
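
Points 1, 3 and 4 can be sketched in plain C without linking against libav. This is only an illustration: `Rational`, `rescale_sketch` and `guess_mp4_timescale` are hypothetical names, not FFmpeg API. The rounding (to nearest, halves away from zero) is assumed to match the default behaviour of av_rescale_q, and the sketch ignores the overflow that the real function guards against. The timescale helper rests on a separate assumption about muxer internals that is not stated in the answer: as far as I know, FFmpeg's mov/mp4 muxer doubles the codec time base denominator until it reaches at least 10000, which for 25 would give 12800.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for AVRational: a time base of num/den seconds. */
typedef struct { int64_t num, den; } Rational;

/* Convert timestamp t from units of `from` to units of `to`:
 *   t * (from.num * to.den) / (from.den * to.num)
 * rounded to nearest, halves away from zero.
 * Sketch only: no overflow protection. */
static int64_t rescale_sketch(int64_t t, Rational from, Rational to)
{
    assert(from.num > 0 && from.den > 0 && to.num > 0 && to.den > 0);
    int64_t mul    = from.num * to.den;
    int64_t div    = from.den * to.num;
    int64_t scaled = t * mul;
    int64_t half   = div / 2;
    return scaled >= 0 ? (scaled + half) / div
                       : -((-scaled + half) / div);
}

/* ASSUMPTION about the mp4 muxer: start from the codec time base
 * denominator and double it until it is at least 10000. */
static int64_t guess_mp4_timescale(int64_t codec_den)
{
    assert(codec_den > 0);
    int64_t timescale = codec_den;
    while (timescale < 10000)
        timescale *= 2;
    return timescale;
}
```

With a codec time base of 1/25 and a stream time base of 1/12800, a pts of 1 becomes 512 and a dts of -2 becomes -1024. Successive timestamps stay 512 ticks apart, so the spacing between frames is preserved even though the absolute values are negative.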

Tuberculous answered 30/11, 2012 at 13:32 Comment(7)
Sorry, I didn't see your answer. I didn't understand the 2nd paragraph. What does "duration of one frame" mean? If I use 25 fps, will the pts values be 40 apart (1/25)? - Bayonne
@theateist: Duration of one frame is 1 second / fps (it might be clearer to call that the interval between frames, but it is also how long one frame is supposed to be displayed). So at 25 fps, it is 0.040 seconds. If the timebase is set to 1/25 and your stream is 25 fps, then frame times (pts in timebase units) will simply be 1, 2, 3, ... - Tuberculous
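
The relation between a pts and the time it represents can be written as a one-line conversion. A minimal sketch, assuming a hypothetical `TimeBase` struct and `pts_to_seconds` helper (neither is part of libav):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical time base of num/den seconds per tick. */
typedef struct { int64_t num, den; } TimeBase;

/* Presentation time in seconds = pts * num / den. */
static double pts_to_seconds(int64_t pts, TimeBase tb)
{
    assert(tb.den > 0);
    return (double)pts * (double)tb.num / (double)tb.den;
}
```

With a time base of 1/25, pts values 1, 2, 3 correspond to roughly 0.04, 0.08 and 0.12 seconds, i.e. one frame duration apart at 25 fps.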
I thought the timebase determines the fps. I mean, if the timebase is 1/25 then the fps is 25, and if the timebase is 1/40 then the fps is 40, isn't it? - Bayonne
@theateist: Timebase and fps are independent but somewhat related. You could have fps 25 and timebase (for example) 1/1000. Then successive frames will have times which are (1/25) / (1/1000) = 40 timebase units apart. If the frame interval is not a whole number of timebase units, then frame times have to be approximate/rounded. That is why it is customary to pick (for example) timebase 1001/30000 for 29.97 fps. The timebase is the unit of time measurement, and it is customary to pick a unit which is a simple fraction that either is equal to, or is an integer fraction of, 1/fps. - Tuberculous
P.S. That is also why transport streams have timebase 1/90000 or 1/27000000: these units are small enough that all common frame rates can be expressed almost exactly, and the most common ones (25, 24, 30) are exact. - Tuberculous
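
Whether a frame rate fits a timebase exactly is a divisibility check. A minimal sketch, assuming the frame rate is given as a fraction fps_num/fps_den (e.g. 30000/1001 for 29.97 fps) and the timebase is 1/tb_den; `FrameInterval` and `frame_interval` are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Result of expressing one frame interval in timebase ticks. */
typedef struct {
    int64_t ticks;  /* interval in ticks, truncated if not exact */
    int     exact;  /* 1 if the interval is a whole number of ticks */
} FrameInterval;

/* One frame lasts fps_den/fps_num seconds; in units of 1/tb_den seconds
 * that is tb_den * fps_den / fps_num ticks. */
static FrameInterval frame_interval(int64_t fps_num, int64_t fps_den,
                                    int64_t tb_den)
{
    assert(fps_num > 0 && fps_den > 0 && tb_den > 0);
    int64_t n = tb_den * fps_den;
    FrameInterval r;
    r.ticks = n / fps_num;
    r.exact = (n % fps_num == 0);
    return r;
}
```

For 25 fps in a 1/1000 timebase this gives exactly 40 ticks, while 30000/1001 fps does not fit 1/1000 exactly. In a 1/90000 timebase, 24, 25 and 30 fps give 3750, 3600 and 3000 ticks, and 30000/1001 fps gives 3003 ticks, all exact.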
I have one video that has a PTS at 0 and then another PTS at 900 ms, with 30 FPS after that. The idea is that the first frame is presented and then lingers for 900 ms (it's the intro to a movie trailer, so the screen is static for a long time). So you can't rely on 1/FPS. - Horribly
@ajs410: Of course, you are correct, 1/FPS is only true under the assumption of constant FPS. I will add a clarification to that effect in the answer. You may find that not all players will handle the video you're describing; in particular, a lot of set-top boxes will have problems with it. - Tuberculous
