Flush & Latency Issue with Fragmented MP4 Creation in FFMPEG
I'm creating a fragmented mp4 for html5 streaming, using the following command:

-i rtsp://172.20.28.52:554/h264 -vcodec copy -an -f mp4 -reset_timestamps 1 -movflags empty_moov+default_base_moof+frag_keyframe -loglevel quiet -
  1. "-i rtsp://172.20.28.52:554/h264" because the source is an H.264 stream in RTP packets from an IP camera. For the sake of testing, the camera is set to a GOP of 1 (i.e. all frames are key frames).
  2. "-vcodec copy" because I don't need transcoding, only remuxing to mp4.
  3. "-movflags empty_moov+default_base_moof+frag_keyframe" to create a fragmented mp4 according to the Media Source Extensions spec.
  4. "-" at the end in order to output the mp4 to stdout. I'm grabbing the output and sending it to the web client through web sockets.
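
For readers who want to reproduce the setup, here is a minimal Python sketch of the "grab stdout and timestamp each chunk" step. It is not the original code (which was C#); the helper names `build_ffmpeg_args` and `read_chunks` are made up for illustration, and it assumes an `ffmpeg` binary is on the PATH.

```python
import subprocess
import time
from typing import Iterator, List, Tuple


def build_ffmpeg_args(rtsp_url: str) -> List[str]:
    """Return the remux command from the question as an argument list."""
    return [
        "ffmpeg",
        "-i", rtsp_url,
        "-vcodec", "copy",          # remux only, no transcoding
        "-an",                      # drop audio
        "-f", "mp4",
        "-reset_timestamps", "1",
        "-movflags", "empty_moov+default_base_moof+frag_keyframe",
        "-loglevel", "quiet",
        "-",                        # write the mp4 to stdout
    ]


def read_chunks(args: List[str], max_read: int = 32768) -> Iterator[Tuple[float, bytes]]:
    """Yield (arrival_time, data) for each chunk the child writes to stdout."""
    # bufsize=0 makes proc.stdout a raw, unbuffered pipe, so read() returns
    # as soon as some data is available instead of waiting for max_read bytes.
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, bufsize=0)
    try:
        while True:
            data = proc.stdout.read(max_read)
            if not data:            # empty read means the child closed stdout
                break
            yield time.time(), data
    finally:
        if proc.poll() is None:     # consumer stopped early: stop the child too
            proc.kill()
        proc.stdout.close()
        proc.wait()
```

Iterating `read_chunks(build_ffmpeg_args("rtsp://172.20.28.52:554/h264"))` and printing `len(data)` with each timestamp should give a log of the same shape as the one in the question.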

Everything is working well, except for a latency issue which I'm trying to solve. If I log every chunk of data that comes in from stdout, with its arrival timestamp, I get this output:

16/06/2015 15:40:45.239 got data size = 24
16/06/2015 15:40:45.240 got data size = 7197
16/06/2015 15:40:45.241 got data size = 32768
16/06/2015 15:40:45.241 got data size = 4941
16/06/2015 15:40:45.241 got data size = 12606
16/06/2015 15:40:45.241 got data size = 6345
16/06/2015 15:40:45.241 got data size = 6339
16/06/2015 15:40:45.242 got data size = 6336
16/06/2015 15:40:45.242 got data size = 6361
16/06/2015 15:40:45.242 got data size = 6337
16/06/2015 15:40:45.242 got data size = 6331
16/06/2015 15:40:45.242 got data size = 6359
16/06/2015 15:40:45.243 got data size = 6346
16/06/2015 15:40:45.243 got data size = 6336
16/06/2015 15:40:45.243 got data size = 6338
16/06/2015 15:40:45.243 got data size = 6357
16/06/2015 15:40:45.243 got data size = 6357
16/06/2015 15:40:45.243 got data size = 6322
16/06/2015 15:40:45.243 got data size = 6359
16/06/2015 15:40:45.244 got data size = 6349
16/06/2015 15:40:45.244 got data size = 6353
16/06/2015 15:40:45.244 got data size = 6382
16/06/2015 15:40:45.244 got data size = 6403
16/06/2015 15:40:45.304 got data size = 6393
16/06/2015 15:40:45.371 got data size = 6372
16/06/2015 15:40:45.437 got data size = 6345
16/06/2015 15:40:45.504 got data size = 6352
16/06/2015 15:40:45.571 got data size = 6340
16/06/2015 15:40:45.637 got data size = 6331
16/06/2015 15:40:45.704 got data size = 6326
16/06/2015 15:40:45.771 got data size = 6360
16/06/2015 15:40:45.838 got data size = 6294
16/06/2015 15:40:45.904 got data size = 6328
16/06/2015 15:40:45.971 got data size = 6326
16/06/2015 15:40:46.038 got data size = 6326
16/06/2015 15:40:46.105 got data size = 6340
16/06/2015 15:40:46.171 got data size = 6341
16/06/2015 15:40:46.238 got data size = 6332

As you can see, the first 23 lines (which contain about 1.5 seconds of video) arrive almost instantly; after that, consecutive lines are ~70 ms apart, which makes sense because the video is 15 frames per second. This behavior introduces a latency of about 1.5 seconds.
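
The numbers can be checked with a few lines of Python. This is only a rough sketch: the five timestamps are copied from the steady-state part of the log, and it assumes each of the 23 burst lines holds roughly one frame, which slightly overestimates because the first few lines carry the MP4 header.

```python
from datetime import datetime

# Arrival times copied from the steady-state part of the log.
stamps = [
    "16/06/2015 15:40:45.304",
    "16/06/2015 15:40:45.371",
    "16/06/2015 15:40:45.437",
    "16/06/2015 15:40:45.504",
    "16/06/2015 15:40:45.571",
]
times = [datetime.strptime(s, "%d/%m/%Y %H:%M:%S.%f") for s in stamps]

# Gap between consecutive chunks, in milliseconds.
gaps_ms = [(b - a).total_seconds() * 1000 for a, b in zip(times, times[1:])]
mean_gap_ms = sum(gaps_ms) / len(gaps_ms)

FPS = 15
expected_gap_ms = 1000 / FPS          # one frame interval at 15 fps

BURST_LINES = 23                      # lines that arrived almost at once
held_back_s = BURST_LINES / FPS       # assumption: about one frame per line

print(f"mean gap: {mean_gap_ms:.1f} ms, expected: {expected_gap_ms:.1f} ms")
print(f"video held back: about {held_back_s:.2f} s")
```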

It looks like a flushing issue, because I don't see any reason why ffmpeg would need to hold the first 23 frames in memory, especially since each frame is a fragment of its own inside the mp4. However, I couldn't find any option that would make ffmpeg flush this data faster.

Has anyone got a suggestion?

I'd like to note that this is a follow-up question to this one: Live streaming dash content using mp4box

Algor answered 16/6, 2015 at 13:23 Comment(10)
It occurred to me that you have control over the blocksize used for buffering the output. Check ffmpeg.org/ffmpeg-all.html#toc-pipe and see if tweaking that value can help you there. - Diphthong
@PabloMontilla I tried different values of blocksize, and although it affected the output in some way, it didn't solve the initial delay. - Algor
Hello @galbarm! I can't get video running on the page with your ffmpeg params; I always get "Skipping unrecognized top-level box: ftyp" (h264 IP cam). I also tried changing -vcodec to libx264, and in that case I get "Skipping unrecognized top-level box: mdat". Can you please describe your code in more detail or post it as a gist somewhere? The most interesting part is the .addSourceBuffer param, i.e. the codec string. Thanks in advance! - Dulcle
Hi @Dulcle, I'm also seeing the "skipping ftyp" error but it doesn't seem to have any functional effect. Here is a gist of the client code, I'm sure it will help you: gist.github.com/galbarm/8cb1b684652de648ded3 - Algor
My problem was not about the code or params; it was about the camera's encoding. - Dulcle
@Algor what's your overall latency doing this? I'd be interested in knowing a little more about how you "grab the output and send it to the webclient through web sockets". Can you please tell me more about this? If the latency is good I'd be interested in taking a similar approach. Thanks! - Hydrostatic
@Hydrostatic The overall latency is ~700ms+GOP in Chrome. Here you'll find why there's a GOP restriction: code.google.com/p/chromium/issues/detail?id=229412 Once this is fixed, I expect a latency of no more than 700ms in Chrome. - Algor
@Hydrostatic regarding how I grab the output and send it through a web socket: try looking for a server-side implementation of web sockets; I used Fleck (C#). You'll need to use it to send your MP4 as binary data to the client. Regarding grabbing the output: try reading about opening a process and reading its standard output. - Algor
@Algor thanks, I made a little test using node.js... it works BUT my latency is huge, around 30 sec! What's your environment? Are you using a local network server? What should I do to lower latency? Did you use a specific encoding for this? - Hydrostatic
@Hydrostatic see the accepted answer I posted now. Maybe this is your issue. - Algor
A
5

The key to removing the delay is to use the -probesize argument:

probesize integer (input)

Set probing size in bytes, i.e. the size of the data to analyze to get stream information. A higher value will enable detecting more information in case it is dispersed into the stream, but will increase latency. Must be an integer not lesser than 32. It is 5000000 by default.

By default the value is 5,000,000 bytes which was equivalent to ~1.5 sec of video. I was able to almost completely eliminate the delay by reducing the value to 200,000.
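
As a sketch of where the option goes: -probesize is an input option, so it has to appear before the "-i" it applies to. The helper `with_probesize` below is a made-up name for illustration, and the base command is the one from the question.

```python
from typing import List

MIN_PROBESIZE = 32  # lower bound quoted from the ffmpeg documentation


def with_probesize(args: List[str], probesize: int) -> List[str]:
    """Return a copy of args with -probesize placed before the first -i."""
    if probesize < MIN_PROBESIZE:
        raise ValueError("probesize must be at least 32 bytes")
    i = args.index("-i")  # input options must precede the input they apply to
    return args[:i] + ["-probesize", str(probesize)] + args[i:]


# Command from the question, as an argument list.
base = [
    "ffmpeg", "-i", "rtsp://172.20.28.52:554/h264",
    "-vcodec", "copy", "-an", "-f", "mp4", "-reset_timestamps", "1",
    "-movflags", "empty_moov+default_base_moof+frag_keyframe",
    "-loglevel", "quiet", "-",
]

low_latency = with_probesize(base, 200000)
print(" ".join(low_latency))
```

The value 200,000 worked for this particular camera; a stream with a different bitrate or with audio may need a different value for ffmpeg to still detect the stream parameters.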

Algor answered 30/9, 2015 at 7:49 Comment(1)
I'm a little late to the party but I'm working on doing exactly the same thing, except I'm planning to send data with an RTCDataChannel (essentially UDP) instead of WebSockets, which should technically give me even better latency. I'm quite new at all this video stuff and I am having a really hard time understanding all this talk of moof, mdat and moov, and what I need to do with the mp4 chunks I receive before passing them on to the SourceBuffer. Can you provide some guidance? - Kilburn
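
On the moof/mdat/moov question: an MP4 file is a sequence of "boxes", each starting with a 4-byte big-endian size and a 4-byte type. The sketch below (the function name `iter_boxes` is made up, and the sample bytes are synthetic, not real video) walks the top-level boxes in a buffer, which is one way to see where the header (ftyp, moov) ends and where each fragment (moof followed by mdat) begins. It is only meant for inspection; it does not show what a SourceBuffer requires.

```python
import struct
from typing import Iterator, Tuple


def iter_boxes(buf: bytes) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, offset, size) for each complete top-level box in buf."""
    offset = 0
    while offset + 8 <= len(buf):
        size, box_type = struct.unpack_from(">I4s", buf, offset)
        header = 8
        if size == 1:
            # size == 1 means a 64-bit size follows the type field
            if offset + 16 > len(buf):
                break
            (size,) = struct.unpack_from(">Q", buf, offset + 8)
            header = 16
        elif size == 0:
            # size == 0 means the box runs to the end of the data
            size = len(buf) - offset
        if size < header or offset + size > len(buf):
            break  # malformed or incomplete box: wait for more data
        yield box_type.decode("latin-1"), offset, size
        offset += size


# Synthetic example: a 24-byte ftyp, then one moof/mdat pair.
ftyp = struct.pack(">I4s", 24, b"ftyp") + b"\x00" * 16
moof = struct.pack(">I4s", 16, b"moof") + b"\x00" * 8
mdat = struct.pack(">I4s", 12, b"mdat") + b"abcd"
stream = ftyp + moof + mdat

boxes = list(iter_boxes(stream))
for box_type, offset, size in boxes:
    print(box_type, offset, size)
```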
C
1

As some already pointed out, one way is to transcode the video using ffmpeg and choose a small GOP size, e.g. -g 1. This works for me, and gives a latency of a few hundred ms from the IP camera to the html5 <video> element when using MediaSource. My guess is that when you set your GOP to 1 on the camera, it doesn't actually give you every frame as a keyframe, but rather 1 keyframe per second.

However, for my use case transcoding is not an option, so the key to reducing latency to a few hundred ms was to add the ffmpeg option -frag_duration 100. This makes ffmpeg create very small fMP4 fragments and give out a quick and steady stream of packets to stdout instead of batching them into 1-2 second chunks.
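
A sketch of where the option would go, assuming the base command from the question (this answer does not show its full command line). The ffmpeg muxer documentation gives -frag_duration in microseconds, so 100 is far shorter than one frame interval at 15 fps, which would explain why fragments come out at nearly every frame. The helper name `with_frag_duration` is made up for illustration.

```python
from typing import List


def with_frag_duration(args: List[str], micros: int) -> List[str]:
    """Return a copy of args with -frag_duration placed before the output."""
    if micros <= 0:
        raise ValueError("frag_duration must be positive")
    # Output options go before the output target, which is the last argument.
    return args[:-1] + ["-frag_duration", str(micros)] + args[-1:]


# Assumed base command (taken from the question, not from this answer).
base = [
    "ffmpeg", "-i", "rtsp://172.20.28.52:554/h264",
    "-vcodec", "copy", "-an", "-f", "mp4", "-reset_timestamps", "1",
    "-movflags", "empty_moov+default_base_moof+frag_keyframe",
    "-loglevel", "quiet", "-",
]

cmd = with_frag_duration(base, 100)

FPS = 15
frame_interval_us = 1_000_000 // FPS  # about 66,666 microseconds per frame
print(" ".join(cmd))
print(f"frag_duration 100 us vs frame interval {frame_interval_us} us")
```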

Corbitt answered 1/8, 2023 at 12:12 Comment(0)
C
0

I solved the latency issue by using the -g option to set the number of frames in the group. In my case I used -g 2. I suspect that if you don't make it explicit, the fragment either waits for the source to provide the keyframe or uses a really large default value to generate the keyframe before closing off the fragment and dumping it to stdout.

Convexoconvex answered 30/7, 2015 at 5:14 Comment(1)
I guess you're transcoding the video so you have control over the output GOP size. I'm using ffmpeg in "vcodec copy" mode, thus only remuxing it into fragmented mp4. I tried setting the source of my video (IP camera) to provide key frames only but it didn't help with the initial latency. - Algor
N
0

Usually the buffering for stdout is disabled in case of console output. If you run ffmpeg from code, the buffering is enabled, so you will get your data only when the buffer is full or the command ends.

You have to eliminate the stdout buffering of your OS. On Windows it's impossible IMO, but on Ubuntu, for example, there is stdbuf: http://manpages.ubuntu.com/manpages/maverick/man1/stdbuf.1.html
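
A small sketch of how that could look when launching ffmpeg from code. `stdbuf -o0` is the GNU coreutils way to request unbuffered stdout; the helper name `with_stdbuf` is made up. Note that stdbuf only influences programs that write through C stdio, and whether ffmpeg's pipe output does so is not established in this thread, so treat this as something to try rather than a known fix.

```python
import shutil
from typing import List


def with_stdbuf(args: List[str]) -> List[str]:
    """Prefix args with `stdbuf -o0` if stdbuf is installed, else copy args."""
    if shutil.which("stdbuf") is None:
        return list(args)  # e.g. on Windows, where stdbuf is not available
    return ["stdbuf", "-o0"] + list(args)


cmd = with_stdbuf(["ffmpeg", "-i", "rtsp://172.20.28.52:554/h264", "-f", "mp4", "-"])
print(" ".join(cmd))
```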

Neotype answered 26/8, 2015 at 6:33 Comment(0)
