Sending per-frame metadata with H.264-encoded frames

We're looking for a way to send per-frame metadata (for example, an ID) with H.264-encoded frames from a server to a client.

We're currently developing a remote rendering application in which both the client and the server are actively involved. The server renders a high-quality image with all effects, lighting, etc. The client also has the model information and renders a diffuse image that is used when the bandwidth is too low or when the images have to be warped to avoid stuttering.

So far we're encoding the frames on the server side with ffmpeg and streaming them with live555 to the client, which receives an RTSP stream and decodes the frames again using ffmpeg.

For our application, we now need to send per-frame metadata. We want the client to tell the server where the camera is right now. Ideally, we'd be able to send the client's view matrix to the server, render the corresponding frame, and send it back to the client together with its view matrix. So when the client receives a frame, we need to know exactly at which camera position the frame was rendered.

Alternatively, we could tag each view matrix with an ID, send it to the server, render the frame, tag the frame with the same ID, and send it back. In this case, we'd have to match the frame to the right matrix again on the client side.
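The ID-based variant could look roughly like the following on the client side. This is only a minimal sketch: the class name `PendingViews`, its methods, and the 32-bit wrap-around of the ID are assumptions made for illustration, not part of ffmpeg or live555. It keeps every view matrix that was sent to the server until a frame carrying the same ID comes back.

```python
from collections import OrderedDict


class PendingViews:
    """Client-side table of view matrices that still await their rendered frame.

    Hypothetical helper: names and limits are illustrative assumptions.
    """

    def __init__(self, max_pending=256):
        self._next_id = 0
        self._max = max_pending
        self._pending = OrderedDict()  # frame_id -> view matrix, oldest first

    def register(self, view_matrix):
        """Store a matrix and return the ID to send to the server with it."""
        frame_id = self._next_id
        self._next_id = (self._next_id + 1) & 0xFFFFFFFF  # wrap at 32 bits
        self._pending[frame_id] = view_matrix
        while len(self._pending) > self._max:
            self._pending.popitem(last=False)  # forget the oldest entry
        return frame_id

    def resolve(self, frame_id):
        """Return the matrix for a received frame ID, or None if unknown.

        Entries older than the resolved one are dropped as well, since
        their frames were evidently skipped or lost.
        """
        if frame_id not in self._pending:
            return None
        matrix = self._pending[frame_id]
        while self._pending:
            oldest_id, _ = self._pending.popitem(last=False)
            if oldest_id == frame_id:
                break
        return matrix
```

With this approach only a small integer has to travel with each frame. The trade-off is that the client must tolerate dropped frames, which is why older entries are discarded once a newer ID is resolved.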

After several attempts to implement this with ffmpeg, we came to the conclusion that ffmpeg does not provide the required functionality. ffmpeg only provides a fixed, predefined set of metadata fields, which either cannot store a matrix or can only be set on every key frame, which is not frequent enough for our purpose.

Now we're considering using live555. So far we have an on-demand server, which gets a VideoSubsession with an H264VideoStreamDiscreteFramer that wraps our own FramedSource class. In this class we load the encoded AVPacket (from ffmpeg) and send its data buffer over the network. Now we need a way to send some kind of metadata with every frame to the client.

Do you have any ideas how to solve this metadata problem with live555 or another library?

Thanks for your help!

Spoondrift answered 22/6, 2013 at 20:16 Comment(2)
Did you find a solution to this problem? (Tager)
Pipe the output of ffmpeg through a custom tool that embeds the data in the H.264 elementary stream via an SEI message. (Arietta)

It seems this question was answered in the comments:

Pipe the output of ffmpeg through a custom tool that embeds the data in the H.264 elementary stream via an SEI message.

Someone also gave the following answer, which was deleted a few years ago for dubious reasons (it is brief but does seem to contain sufficient information):

You can do so using MPEG-4. See MPEG-4 Part 14 for details.

Bryonbryony answered 3/8, 2021 at 20:14 Comment(0)
