Xuggler encoding and muxing

I'm trying to use Xuggler (which I believe uses ffmpeg under the hood) to do the following:

  • Accept a raw M-JPEG video bitstream (from a small TTL serial camera) and encode/transcode it to H.264; and
  • Accept a raw audio bitstream (from a microphone) and encode it to AAC; then
  • Mux the two (audio and video) bitstreams together into an MPEG-TS container

I've watched/read some of their excellent tutorials, and so far here's what I've got:

// I'll worry about implementing this functionality later, but
// involves querying native device drivers.
byte[] nextMjpeg = getNextMjpegFromSerialPort();

// I'll also worry about implementing this functionality as well;
// I'm simply providing these for thoroughness.
BufferedImage mjpeg = MjpegFactory.newMjpeg(nextMjpeg);

// Specify a h.264 video stream (how?)
String h264Stream = "???";

IMediaWriter writer = ToolFactory.makeWriter(h264Stream);
writer.addVideoStream(0, 0, ICodec.ID.CODEC_ID_H264);
writer.encodeVideo(0, mjpeg);

I think I'm close here, but it's still not correct; and I've only gotten this far by reading the video code examples (not the audio - I can't find any good audio examples).

Literally, I'll be getting byte-level access to the raw video and audio feeds coming into my Xuggler implementation. But for the life of me I can't figure out how to get them into an H.264/AAC/MPEG-TS format. Thanks in advance for any help here.

Bak answered 12/12, 2012 at 12:31 Comment(3)
I should also have mentioned in the bounty text that I am not "married" to Xuggler. If someone can figure out how to do everything I need (specified in the bounty) with, say, ffmpeg or some other tool that can run on Linux, I'd be interested in that solution as well!Bak
Can you attach the camera through USB? Do you know if Xuggler can read input from your camera through SPI?Ruthenium
Yes on USB & SPI but I won't be using that option for reasons that are outside the scope of this question. The only thing that really matters is that I will be getting raw audio and video bitstreams in the form of byte[]s.Bak

Looking at the Xuggler sample code, the following should work to encode video as H.264 and mux it into an MPEG-TS container:

IMediaWriter writer = ToolFactory.makeWriter("output.ts");
writer.addVideoStream(0, 0, ICodec.ID.CODEC_ID_H264, width, height);
for (...)
{

   BufferedImage mjpeg = ...;

   writer.encodeVideo(0, mjpeg);
}

The container type is guessed from the file extension; the codec is specified explicitly.
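Since each M-JPEG frame is just a standalone JPEG image, the BufferedImage fed into the loop above can be produced with plain javax.imageio, with no Xuggler involvement in the decode step. A minimal sketch (the class name MjpegFrameDecoder is mine, and main() merely round-trips a synthetic frame to demonstrate the call):

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.imageio.ImageIO;

public class MjpegFrameDecoder {
    /** Decode one M-JPEG frame (a plain JPEG image) into a BufferedImage. */
    public static BufferedImage decodeFrame(byte[] jpegBytes) throws Exception {
        BufferedImage img = ImageIO.read(new ByteArrayInputStream(jpegBytes));
        if (img == null) {
            throw new IllegalArgumentException("not a decodable JPEG frame");
        }
        return img;
    }

    public static void main(String[] args) throws Exception {
        // Round-trip a synthetic 16x8 frame to show the decode works.
        BufferedImage src = new BufferedImage(16, 8, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ImageIO.write(src, "jpg", baos);
        BufferedImage decoded = decodeFrame(baos.toByteArray());
        System.out.println(decoded.getWidth() + "x" + decoded.getHeight());
    }
}
```

The result can be handed straight to writer.encodeVideo().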

To mux audio and video, you would do something like this:

writer.addVideoStream(videoStreamIndex, 0, videoCodec, width, height);
writer.addAudioStream(audioStreamIndex, 0, audioCodec, channelCount, sampleRate);

while (... have more data ...)
{
    BufferedImage videoFrame = ...;
    long videoFrameTime = ...; // this is the time to display this frame
    writer.encodeVideo(videoStreamIndex, videoFrame, videoFrameTime, DEFAULT_TIME_UNIT);

    short[] audioSamples = ...; // the size of this array should be number of samples * channelCount
    long audioSamplesTime = ...; // this is the time to play back this bit of audio
    writer.encodeAudio(audioStreamIndex, audioSamples, audioSamplesTime, DEFAULT_TIME_UNIT);
}

In this case I believe your code is responsible for interleaving the audio and video: on each pass through the loop you want to call either encodeAudio() or encodeVideo(), depending on whether the pending audio chunk or the pending video frame has the earlier timestamp.

There is another, lower-level API you may end up using, based on IStreamCoder, which gives more control over various parameters. I don't think you will need to use that.

To answer the specific questions you asked:

(1) "Encode a BufferedImage (M/JPEG) into a h.264 stream" - you already figured that out, writer.addVideoStream(..., ICodec.ID.CODEC_ID_H264) makes sure you get the H.264 codec. To get a transport stream (MPEG2 TS) container, simply call makeWriter() with a filename with a .ts extension.

(2) "Figure out what the "BufferedImage-equivalent" for a raw audio feed is" - that is either a short[] or an IAudioSamples object (both seem to work, but IAudioSamples has to be constructed from an IBuffer which is much less straightforward).

(3) "Encode this audio class into an AAC audio stream" - call writer.addAudioStream(..., ICodec.ID.CODEC_ID_AAC, channelCount, sampleRate).

(4) "multiplex both stream into the same MPEG-TS container" - call makeWriter() with a .ts filename, which sets the container type. For correct audio/video sync you probably need to call encodeVideo()/encodeAudio() in the correct order.
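Regarding (2) and (3): if your raw audio arrives as little-endian signed 16-bit PCM bytes (the usual javax.sound.sampled capture format), converting it to the short[] that encodeAudio() accepts is a few lines of plain Java. A sketch (the class name PcmConverter is illustrative; note the & 0xFF masks, which prevent sign extension of each byte):

```java
public class PcmConverter {
    /**
     * Convert little-endian signed 16-bit PCM bytes (e.g. captured from a
     * TargetDataLine) into the short[] form that encodeAudio() expects.
     */
    public static short[] toShorts(byte[] pcm) {
        short[] out = new short[pcm.length / 2];
        for (int i = 0; i < out.length; i++) {
            // Mask both bytes to 0..255 before combining, then narrow to short.
            out[i] = (short) (((pcm[2 * i + 1] & 0xFF) << 8) | (pcm[2 * i] & 0xFF));
        }
        return out;
    }

    public static void main(String[] args) {
        // 0x0102 little-endian is bytes {0x02, 0x01}; -1 is {0xFF, 0xFF}.
        short[] s = toShorts(new byte[] { 0x02, 0x01, (byte) 0xFF, (byte) 0xFF });
        System.out.println(s[0] + " " + s[1]); // prints "258 -1"
    }
}
```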

P.S. Always pass the earliest audio/video available first. For example, if you have audio chunks which are 440 samples long (at 44000 Hz sample rate, 440 / 44000 = 0.01 seconds), and video at exactly 25fps (1 / 25 = 0.04 seconds), you would give them to the writer in this order:

video0 @ 0.00 sec
audio0 @ 0.00 sec
audio1 @ 0.01 sec
audio2 @ 0.02 sec
audio3 @ 0.03 sec
video1 @ 0.04 sec
audio4 @ 0.04 sec
audio5 @ 0.05 sec

... and so forth

Most playback devices are probably ok with the stream as long as the consecutive audio/video timestamps are relatively close, but this is what you'd do for a perfect mux.
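The merge rule above can be sketched as a plain-Java simulation, assuming 0.01-second audio chunks and 25 fps video as in the example (the class name MuxOrder is hypothetical; a real implementation would call encodeAudio()/encodeVideo() at these points instead of collecting strings):

```java
import java.util.ArrayList;
import java.util.List;

public class MuxOrder {
    /**
     * Emit the submission order for 0.01 s audio chunks (440 samples at
     * 44000 Hz) and 25 fps video (0.04 s per frame): earliest timestamp
     * first, video before audio on ties.
     */
    public static List<String> order(int audioChunks) {
        List<String> out = new ArrayList<>();
        long audioUs = 0, videoUs = 0;   // pending timestamps in microseconds
        int a = 0, v = 0;
        while (a < audioChunks) {
            if (videoUs <= audioUs) {
                out.add("video" + (v++) + " @ " + (videoUs / 1_000_000.0) + " sec");
                videoUs += 40_000;       // one frame at 25 fps
            } else {
                out.add("audio" + (a++) + " @ " + (audioUs / 1_000_000.0) + " sec");
                audioUs += 10_000;       // 440 samples / 44000 Hz
            }
        }
        return out;
    }

    public static void main(String[] args) {
        order(6).forEach(System.out::println);
    }
}
```

Running it reproduces the ordering listed above: video0, audio0 through audio3, then video1, audio4, audio5.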

P.P.S. There are a few docs you may want to refer to: the Xuggler class diagram, ToolFactory, IMediaWriter, ICodec.

Bunk answered 18/12, 2012 at 14:9 Comment(5)
How exactly do you get the short[] samples array?Dabbs
@wrahool: Use a TargetDataLine or AudioInputStream with an AudioFormat which has sampleSizeInBits=16, encoding=PCM_SIGNED, bigEndian=false. Then convert the bytes to shorts, masking to avoid sign extension, like this: shortBuf[i] = (short) (((byteBuf[2*i + 1] & 0xFF) << 8) | (byteBuf[2*i] & 0xFF)); I don't think there is a more direct method, although it depends on where you are getting the audio from.Bunk
I'm getting the audio from a laptop microphone, though I may need to also get it from USB mics later.Dabbs
@wrahool: You can post that as a separate question if you want, sounds a bit too complicated to fully explore in comments.Bunk
have done that now. #21570203Dabbs

I think you should look at GStreamer: http://gstreamer.freedesktop.org/ You would have to look for a plugin that can capture the camera input, pipe it to the libx264 and AAC plugins, and then pass both through an MPEG-TS muxer.

A pipeline in gstreamer would look like:

v4l2src queue-size=15 ! video/x-raw,framerate=25/1,width=384,height=576 ! \
  avenc_mpeg4 name=venc \
alsasrc ! audio/x-raw,rate=48000,channels=1 ! audioconvert ! lamemp3enc name=aenc \
avimux name=mux ! filesink location=rec.avi venc. ! mux. aenc. ! mux.

In this pipeline, MPEG-4 and MP3 encoders are used and the streams are muxed into an AVI container. You should be able to find plugins for libx264 and AAC. Let me know if you need further pointers.

Abyssal answered 15/12, 2012 at 14:45 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.