Extracting metadata from incomplete video files

Can anyone tell me where metadata is stored in common video file formats, and whether it is located towards the start of the file or scattered throughout?

I'm working with a remote object store containing a lot of video files and I want to extract metadata, in particular video duration and video dimensions from those files, without streaming the entire file contents to the local machine.

I'm hoping that this metadata will be stored in the first X bytes of files, and so I can just fetch a byte range starting at the beginning instead of the whole file, passing this partial file data to ffprobe.
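For illustration, fetching only the leading bytes might look roughly like the Python sketch below. It assumes the object store is reachable over HTTP and honours `Range` requests, which may not hold for every store. The URL and the function names `build_range_request` and `fetch_prefix` are placeholders made up for this example.

```python
import urllib.request


def build_range_request(url: str, num_bytes: int) -> urllib.request.Request:
    """Build a GET request asking for only the first num_bytes bytes of url."""
    if num_bytes <= 0:
        raise ValueError("num_bytes must be positive")
    # HTTP byte ranges are inclusive, so the last byte index is num_bytes - 1.
    return urllib.request.Request(
        url, headers={"Range": f"bytes=0-{num_bytes - 1}"}
    )


def fetch_prefix(url: str, num_bytes: int) -> tuple[int, bytes]:
    """Fetch the leading bytes of a remote file (hypothetical helper).

    Returns (status, data). Status 206 means the server honoured the range;
    status 200 means it ignored the Range header and is sending the whole
    file, in which case we still stop reading after num_bytes.
    """
    with urllib.request.urlopen(build_range_request(url, num_bytes)) as resp:
        return resp.status, resp.read(num_bytes)
```

The bytes returned by `fetch_prefix` could then be written to ffprobe's standard input, in the same way as the `head` pipe in the test.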

For testing purposes I created a 22MB MP4 file, and used the following command to supply only the first 1MB of data to ffprobe:

head -c1024K '2013-07-04 12.20.07.mp4' | ffprobe -

It prints:

avprobe version 0.8.6-4:0.8.6-0ubuntu0.12.04.1, Copyright (c) 2007-2013 the Libav developers
  built on Apr  2 2013 17:02:36 with gcc 4.6.3
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x1a6b7a0] stream 0, offset 0x10beab: partial file
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'pipe:':
  Metadata:
    major_brand     : isom
    minor_version   : 0
    compatible_brands: isom3gp4
    creation_time   : 1947-07-04 11:20:07
  Duration: 00:00:09.84, start: 0.000000, bitrate: N/A
    Stream #0.0(eng): Video: h264 (High), yuv420p, 1920x1080, 20028 kb/s, PAR 65536:65536 DAR 16:9, 29.99 fps, 30 tbr, 90k tbn, 180k tbc
    Metadata:
      creation_time   : 1947-07-04 11:20:07
    Stream #0.1(eng): Audio: aac, 48000 Hz, stereo, s16, 189 kb/s
    Metadata:
      creation_time   : 1947-07-04 11:20:07

So I see the first 1MB was enough to extract video duration 9.84 seconds and video dimensions 1920x1080, even though ffprobe printed the warning about detecting a partial file. If I supply less than 1MB, it fails completely.

Would this approach work for other common video file formats to reliably extract metadata, or do any common formats scatter metadata throughout the file?

I'm aware of the concept of container formats and that various codecs may be used to represent the audio/video data inside those containers. I'm not familiar with the details though. So I guess the question may apply to common combinations of containers + codecs? Thanks in advance.

Cheerful answered 5/7, 2013 at 14:20

Okay to answer my own question after a lot of digging through the specs for MP4, 3GP and AVI...

AVI

Metadata is at the start of AVI files, according to the AVI file format specification.

Video duration is not stored verbatim in AVI files, but is calculated (in microseconds) as dwMicroSecPerFrame x dwTotalFrames.

Reading between the lines of the spec, it seems that many items of metadata can be read directly from offsets within AVI files without parsing at all. But the spec does not mention these offsets explicitly so using this rule of thumb could be risky.

Offset 32: dwMicroSecPerFrame, offset 48: dwTotalFrames, offset 64: dwWidth, offset 68: dwHeight.

So for AVI, it is possible to extract this metadata with only the first X bytes of the file.
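As a rough illustration of the offsets listed above, here is a minimal Python sketch that reads them from the first 72 bytes of a file. The function name `parse_avi_main_header` is made up for this example. The check that the `avih` chunk sits directly at offset 24 is my own safeguard, added because the spec does not guarantee the fixed offsets; a file laid out differently is rejected rather than misread.

```python
import struct


def parse_avi_main_header(data: bytes) -> dict:
    """Read frame timing and dimensions from the start of an AVI file.

    Only works when the main header ('avih') is the first chunk inside the
    'hdrl' LIST at the very start of the file, which is what makes the
    fixed offsets 32/48/64/68 valid.
    """
    if len(data) < 72:
        raise ValueError("need at least the first 72 bytes of the file")
    if data[0:4] != b"RIFF" or data[8:12] != b"AVI ":
        raise ValueError("not a RIFF/AVI file")
    if data[12:16] != b"LIST" or data[20:24] != b"hdrl" or data[24:28] != b"avih":
        raise ValueError("unexpected layout: fixed offsets do not apply")

    # All fields are 32-bit little-endian unsigned integers.
    (usec_per_frame,) = struct.unpack_from("<I", data, 32)
    (total_frames,) = struct.unpack_from("<I", data, 48)
    width, height = struct.unpack_from("<II", data, 64)

    return {
        "width": width,
        "height": height,
        "total_frames": total_frames,
        # Duration in microseconds is dwMicroSecPerFrame x dwTotalFrames.
        "duration_seconds": usec_per_frame * total_frames / 1_000_000,
    }
```

For example, a header with 33333 microseconds per frame and 300 frames gives a duration of 33333 x 300 = 9,999,900 microseconds, or about 10 seconds.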

MP4, 3GP (3GPP), 3G2 (3GPP2)

All of these file formats are based on the ISO base media file format known as ISO/IEC 14496-12 (MPEG-4 Part 12).

This format allows metadata to be stored anywhere in the file, but in practice it will be either at the start or the end because the raw captured audio/video data is saved contiguously in the middle. (An exception, however, would be "fragmented" MP4 files, which are rare.)

Only files with the metadata stored at the start can be played via progressive download, but it is up to the capture device or encoder to write files this way.

AFAICT this means that to extract metadata from these files, only the first X bytes of the file would be required, and from that information it could be determined that potentially also the last X bytes would be required. But bytes in the middle would not be required.
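To make that concrete, here is a hedged Python sketch of how the first bytes could reveal where the metadata lives. In these formats the file is a sequence of "boxes", each starting with a 4-byte big-endian size and a 4-byte type; the metadata box is `moov` and the media data box is `mdat`. The function name `find_moov` and its return convention are my own invention for this example, and it does not attempt to handle fragmented files.

```python
import struct


def find_moov(prefix: bytes) -> tuple[str, int, int]:
    """Walk the top-level boxes in the leading bytes of an MP4/3GP file.

    Returns (status, offset, size):
      ("moov", offset, size)  moov header found at offset (size 0 means the
                              box runs to the end of the file)
      ("next", offset, 0)     no moov seen yet; the next box header starts
                              at offset, so fetch more data from there
      ("eof", offset, 0)      the box at offset runs to the end of the file,
                              so no moov can follow it
    """
    offset = 0
    while offset + 8 <= len(prefix):
        size, box_type = struct.unpack_from(">I4s", prefix, offset)
        header_len = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if offset + 16 > len(prefix):
                break
            (size,) = struct.unpack_from(">Q", prefix, offset + 8)
            header_len = 16
        if box_type == b"moov":
            return ("moov", offset, size)
        if size == 0:
            return ("eof", offset, 0)
        if size < header_len:
            raise ValueError(f"corrupt box header at offset {offset}")
        # Skip over the whole box, even if its body lies beyond the prefix.
        offset += size
    return ("next", offset, 0)
```

If the result is `("moov", offset, size)` and `offset + size` fits inside the prefix, nothing more needs to be fetched. If it is `("next", offset, 0)`, the metadata is probably after the media data, and a second range request starting at `offset` should reach it without downloading the middle of the file.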

Cheerful answered 15/7, 2013 at 16:5

Comments:
Can getting the metadata from the file beginning be relied on in practice? - Zoomorphism
This method is failing for videos in certain codecs (DXV and HAP), where every tool requires the entire file. Any chance you have any updates or solutions for those? - Ponceau
