Merging mp4 clips with mp4parser makes the audio behind the video

I am developing an application that merges mp4 clips using the mp4parser library (isoparser-1.0-RC-27.jar and aspectjrt-1.8.0.jar). Merging two clips works fine, but as more clips are appended, the audio in the output mp4 falls progressively behind the video.

Here is the code:

    Movie[] clips = new Movie[2];

    //location of the movie clip storage
    File mediaStorageDir = new File(Environment.getExternalStoragePublicDirectory(
            Environment.DIRECTORY_PICTURES), "TestMerge");

    //Build the two clips into movies
    Movie firstClip = MovieCreator.build(first);
    Movie secondClip = MovieCreator.build(second);

    //Add both movie clips
    clips[0] = firstClip;
    clips[1] = secondClip;

    //List for audio and video tracks
    List<Track> videoTracks = new LinkedList<Track>();
    List<Track> audioTracks = new LinkedList<Track>();

    //Iterate all the movie clips and find the audio and videos
    for (Movie movie: clips) {
        for (Track track : movie.getTracks()) {
            if (track.getHandler().equals("soun")) 
                audioTracks.add(track);                
            if (track.getHandler().equals("vide"))
                videoTracks.add(track);
        }
    }

    //Result movie from putting the audio and video together from the two clips
    Movie result = new Movie();

    //Append all audio and video
    if (videoTracks.size() > 0)
        result.addTrack(new AppendTrack(videoTracks.toArray(new Track[videoTracks.size()])));

    if (audioTracks.size() > 0) 
        result.addTrack(new AppendTrack(audioTracks.toArray(new Track[audioTracks.size()])));

    //Output the resulting movie to a new mp4 file
    String timeStamp = new SimpleDateFormat("yyyyMMdd_HHmmss").format(new Date());
    String outputLocation = mediaStorageDir.getPath() + File.separator + timeStamp + ".mp4";
    Container out = new DefaultMp4Builder().build(result);
    FileChannel fc = new RandomAccessFile(outputLocation, "rw").getChannel();
    out.writeContainer(fc);
    fc.close();

    //Now set the active URL to play as the combined videos!
    setURL(outputLocation);
}

My guess is that as more clips are added, the audio/video synchronization drifts further out, since merging two longer clips keeps the sync fine. Is there any way to prevent this poor sync of audio and video across many smaller clips, or has anyone found a solution to this using mp4parser? FFmpeg is another solution I am considering, but I haven't found anyone else using it for this.

EDIT: I have discovered that the audio track is typically slightly longer than the video track, so the offset accumulates as more and more clips are combined into one. I am going to solve this by trimming samples off the end of the audio.

Flatt answered 15/9, 2014 at 21:50 Comment(0)

I was able to fix this problem using the technique from the edit above. The trick is to keep track of how many clips are being merged, and remove samples from the end of the audio track of the most recently added clip. As the resulting mp4 grows with more clips, you need to strip more and more off the end. This is due to the difference in length between the audio and video tracks: the audio track might be 1020 ms while the video is 1000 ms, so after 5 clips you would have an offset of about 100 ms between the audio and video lengths, and you have to compensate for that.

Flatt answered 18/5, 2015 at 1:40 Comment(4)
Please add code for cropping the audio tracks for other people that find your solution. Thanks! – Heilman
I wrote this a while ago, let me dig up my solution. – Flatt
Thanks for the interest :). I found out how to crop the audio, but I'm still having trouble choosing the right number of samples to remove. I'm wondering if you've found a solution to this. – Heilman
The problem is the audio is ahead of the video, so removing audio samples won't help. It's removing video samples: enough to sync audio with video, but not enough that the human eye would notice. – Flatt
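The counting logic behind "choosing the right number of samples" can be sketched as a standalone helper. This is a hypothetical illustration, not code from the original answer; the 1024-tick frame size, 44100 Hz timescale, and drift values below are illustrative assumptions:

```java
// Hypothetical helper showing the arithmetic behind the fix: walk the last
// track's sample durations backwards and count how many trailing samples can
// be dropped without overshooting the measured drift.
public class DriftTrim {

    // sampleDurations are in timescale ticks; drift is in seconds.
    public static int samplesToDrop(long[] sampleDurations, long timescale, double drift) {
        int count = 0;
        for (int i = sampleDurations.length - 1; i >= 0; i--) {
            double sampleSeconds = (double) sampleDurations[i] / timescale;
            if (sampleSeconds > drift) {
                break; // dropping this sample would overshoot the correction
            }
            drift -= sampleSeconds;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // AAC-style audio: 1024-tick frames on a 44100 Hz timescale, ~23.2 ms each.
        long[] durations = new long[44];
        java.util.Arrays.fill(durations, 1024);
        // With the audio running ~50 ms long, two trailing frames get dropped.
        System.out.println(samplesToDrop(durations, 44100, 0.050)); // prints 2
    }
}
```

The count would then feed into a `CroppedTrack` to produce the shortened track.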

Just adding code to Lucas's answer above:

1.

LinkedList<Track> videoTracks = new LinkedList<>();
LinkedList<Track> audioTracks = new LinkedList<>();
double[] audioDuration = {0}, videoDuration = {0};
for (Movie m : clips) {
    for (Track t : m.getTracks()) {
        if (t.getHandler().equals("soun")) {
            for (long a : t.getSampleDurations())
                audioDuration[0] += ((double) a) / t.getTrackMetaData().getTimescale();
            audioTracks.add(t);
        } else if (t.getHandler().equals("vide")) {
            for (long v : t.getSampleDurations())
                videoDuration[0] += ((double) v) / t.getTrackMetaData().getTimescale();
            videoTracks.add(t);
        }
    }

    //re-sync after every clip so the drift never accumulates
    adjustDurations(videoTracks, audioTracks, videoDuration, audioDuration);
}

2.

private void adjustDurations(LinkedList<Track> videoTracks, LinkedList<Track> audioTracks, double[] videoDuration, double[] audioDuration) {
    double diff = audioDuration[0] - videoDuration[0];

    //nothing to do
    if (diff == 0) {
        return;
    }

    //audio is longer: trim the audio track
    LinkedList<Track> tracks = audioTracks;
    double[] duration = audioDuration;

    //video is longer: trim the video track instead
    if (diff < 0) {
        tracks = videoTracks;
        duration = videoDuration;
        diff *= -1;
    }

    Track track = tracks.getLast();
    long[] sampleDurations = track.getSampleDurations();
    long counter = 0;
    //walk backwards, dropping trailing samples until removing one more would overshoot
    for (int i = sampleDurations.length - 1; i > -1; i--) {
        double sampleSeconds = ((double) sampleDurations[i]) / track.getTrackMetaData().getTimescale();
        if (sampleSeconds > diff) {
            break;
        }
        diff -= sampleSeconds;
        duration[0] -= sampleSeconds;
        counter++;
    }

    if (counter == 0) {
        return;
    }

    //replace the last track with a copy that drops the trailing samples
    track = new CroppedTrack(track, 0, track.getSamples().size() - counter);

    //update the original reference
    tracks.removeLast();
    tracks.addLast(track);
}
Renzo answered 16/6, 2016 at 23:0 Comment(1)
Just one suggestion: while calculating sample duration, it should be divided by the track timescale to get the actual duration. – Waylan
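As the comment notes, `getSampleDurations()` returns values in timescale ticks, not seconds. A minimal sketch of the conversion; the frame sizes and timescales below are illustrative assumptions, not values from the original clips:

```java
public class TimescaleDemo {

    // Convert a duration in timescale ticks to seconds.
    public static double ticksToSeconds(long ticks, long timescale) {
        return (double) ticks / timescale;
    }

    public static void main(String[] args) {
        // A 1024-tick AAC frame on a 44100 Hz audio timescale lasts ~23.2 ms,
        // while a 3000-tick video frame on a 90000 Hz timescale lasts ~33.3 ms.
        System.out.println(ticksToSeconds(1024, 44100));
        System.out.println(ticksToSeconds(3000, 90000));
    }
}
```

Comparing raw tick counts across tracks with different timescales is what produces the bogus drift values; converting to seconds first makes the audio/video difference directly comparable.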
