Media Source Extensions (MSE) needs fragmented MP4 for playback in the browser.
A fragmented MP4 contains a series of segments which can be requested individually if your server supports byte-range requests.
Boxes aka Atoms
All MP4 files use an object-oriented format made up of boxes, also known as atoms.
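To make the box idea concrete, here is a minimal sketch of walking the boxes in a byte buffer. It relies on the ISOBMFF box header layout (a 4-byte big-endian size followed by a 4-byte type, with size 1 meaning a 64-bit size follows and size 0 meaning "runs to the end"). The function name `iter_boxes` and the helper `make_box` are made up for this illustration; they are not part of any library.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_boxes(data: bytes, start: int = 0,
               end: Optional[int] = None) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, offset, size) for each box found in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing container.
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError(f"corrupt box at offset {pos}")
        yield box_type.decode("latin-1"), pos, size
        pos += size


def make_box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Hypothetical helper: serialize a box with a 32-bit size field."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic data standing in for a real file (payloads are placeholders).
sample = (make_box(b"ftyp", b"isom\x00\x00\x00\x00")
          + make_box(b"moov", make_box(b"mvhd", b"\x00" * 4))
          + make_box(b"mdat", b"\x00" * 10))

for name, offset, size in iter_boxes(sample):
    print(name, offset, size)
```

Container boxes such as moov can be descended into by calling `iter_boxes` again with `start` set just past the parent's header and `end` set to the parent's last byte.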
You can view a representation of the boxes in your MP4 using an online tool such as MP4 Parser or, if you're using Windows, MP4 Explorer. Let's compare a normal MP4 with one that is fragmented:
Non-Fragmented MP4
This screenshot (from MP4 Parser) shows an MP4 that hasn't been fragmented and quite simply has one massive mdat (Media Data) box.
If we were building a video player that supports adaptive bitrate, we might need to know the byte position of the 10-second mark in a 0.5 Mbps and a 1 Mbps file in order to switch the video source between the two files at that moment. Determining this exact byte position within one massive mdat in each respective file is not trivial.
Fragmented MP4
This screenshot shows a fragmented MP4 which has been segmented using MP4Box with the onDemand profile.
You'll notice the sidx and a series of moof+mdat boxes. The sidx is the Segment Index and stores metadata about the precise byte ranges of the moof+mdat segments.
Essentially, you can independently load the sidx (its byte range will be defined in the accompanying .mpd Media Presentation Description file) and then choose which segments you'd like to subsequently load and add to the MSE SourceBuffer.
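As a rough illustration of how the sidx maps to byte ranges, the sketch below decodes a sidx box and returns the first and last byte of each referenced segment. It follows the Segment Index Box layout from ISOBMFF, where offsets are counted from the first byte after the sidx itself. The function name `sidx_byte_ranges` is made up here, and the sketch assumes a 32-bit box size and that every reference points at media (a moof+mdat pair) rather than at a nested sidx.

```python
import struct
from typing import List, Tuple


def sidx_byte_ranges(sidx: bytes, sidx_offset: int) -> List[Tuple[int, int, float]]:
    """Return (first_byte, last_byte, duration_seconds) for each segment.

    `sidx` is the complete sidx box including its header, and `sidx_offset`
    is the position of that box within the file.
    """
    size, box_type = struct.unpack_from(">I4s", sidx, 0)
    if box_type != b"sidx":
        raise ValueError("not a sidx box")
    version = sidx[8]  # FullBox header: 1 byte version, 3 bytes flags
    _reference_id, timescale = struct.unpack_from(">II", sidx, 12)
    if version == 0:
        _earliest_time, first_offset = struct.unpack_from(">II", sidx, 20)
        pos = 28
    else:
        _earliest_time, first_offset = struct.unpack_from(">QQ", sidx, 20)
        pos = 36
    _reserved, reference_count = struct.unpack_from(">HH", sidx, pos)
    pos += 4

    # Segment offsets start at the first byte after the sidx box.
    cursor = sidx_offset + size + first_offset
    ranges = []
    for _ in range(reference_count):
        ref, duration, _sap = struct.unpack_from(">III", sidx, pos)
        pos += 12
        ref_size = ref & 0x7FFFFFFF  # top bit is reference_type, ignored here
        ranges.append((cursor, cursor + ref_size - 1, duration / timescale))
        cursor += ref_size
    return ranges
```

Each resulting pair can be sent as an HTTP `Range: bytes=first-last` header, and the response body appended to the SourceBuffer.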
Importantly, each segment is created at a regular interval of your choosing (e.g. every 5 seconds), so the segments can have temporal alignment across files of different bitrates, making it easy to adapt the bitrate during playback.
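A small sketch of why alignment helps: once segments share the same boundaries, the segment index for a given playback time is identical in every rendition, so switching bitrate only means looking up the same index in another file's range table. The function names and the byte values below are invented for illustration (three 5-second segments at roughly 0.5 Mbps and 1 Mbps).

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Sequence, Tuple

# (first_byte, last_byte, duration_seconds), e.g. as derived from a sidx box
Segment = Tuple[int, int, float]


def segment_index_at(durations: Sequence[float], t: float) -> int:
    """Return the index of the segment that contains playback time t."""
    ends = list(accumulate(durations))
    if not 0 <= t < ends[-1]:
        raise ValueError("time is outside the presentation")
    return bisect_right(ends, t)


def range_for_time(segments: List[Segment], t: float) -> Tuple[int, int, int]:
    """Return (index, first_byte, last_byte) of the segment covering time t."""
    index = segment_index_at([duration for _, _, duration in segments], t)
    first, last, _ = segments[index]
    return index, first, last


# Made-up range tables for two temporally aligned renditions.
low = [(1000, 313499, 5.0), (313500, 625999, 5.0), (626000, 938499, 5.0)]
high = [(1000, 625999, 5.0), (626000, 1250999, 5.0), (1251000, 1875999, 5.0)]

# Switching at the 10-second mark: same index, different byte ranges.
print(range_for_time(low, 10.0))
print(range_for_time(high, 10.0))
```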
Media File Formats
Media data streams are wrapped in a container format. The container includes the physical data of the media but also the metadata that is necessary for playback. For example, it signals to the video player the codec used, the subtitle tracks, etc. In video streaming there are two main formats that are used for storage and presentation of multimedia content: MPEG-2 Transport Streams (MPEG-2 TS) [25] and the ISO Base Media File Format (ISOBMFF) [24] (MP4 and fragmented MP4).
MPEG-2 Transport Streams are specified by [25] and were designed for broadcasting video through satellite networks. However, Apple adopted the format for its adaptive streaming protocol, making it an important format. In MPEG-2 TS, audio, video and subtitle streams are multiplexed together. MP4 and fragmented MP4 (fMP4) are both based on the MPEG-4 Part 12 standard, which defines ISOBMFF. MP4 is the best-known multimedia container format and is widely supported across operating systems and devices. The structure of an MP4 video file is shown in figure 2.2a. As shown, MP4 consists of different boxes, each with a different functionality. These boxes are the basic building blocks of every MP4 container.
For example, the file type box ('ftyp') specifies the compatible brands (specifications) of the file. MP4 files have a Movie Box ('moov') that contains metadata of the media file and sample tables ('stbl') that are important for timing and indexing the media samples. There is also a Media Data Box ('mdat') that contains the corresponding samples. In the fragmented container, shown in figure 2.2b, media samples are interleaved by using Movie Fragment boxes ('moof'), which contain the sample table for the specific fragment (mdat box).
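As an example of reading one of these boxes, the sketch below decodes an 'ftyp' box into its major brand, minor version and list of compatible brands, following the File Type Box layout in ISOBMFF. The function name `parse_ftyp` and the brand values in the sample are illustrative only, and a 32-bit box size is assumed.

```python
import struct
from typing import List, Tuple


def parse_ftyp(box: bytes) -> Tuple[str, int, List[str]]:
    """Return (major_brand, minor_version, compatible_brands) of an ftyp box."""
    size, box_type = struct.unpack_from(">I4s", box, 0)
    if box_type != b"ftyp":
        raise ValueError("not an ftyp box")
    major_brand, minor_version = struct.unpack_from(">4sI", box, 8)
    # The rest of the payload is a list of 4-character brand codes.
    brands = [box[i:i + 4].decode("ascii") for i in range(16, size, 4)]
    return major_brand.decode("ascii"), minor_version, brands


# Illustrative ftyp box: 16 bytes of header and brand info, then three brands.
sample_ftyp = struct.pack(">I4s4sI", 28, b"ftyp", b"iso5", 0) + b"iso5dashmp41"

print(parse_ftyp(sample_ftyp))
```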
Ref : https://repository.tudelft.nl/islandora/object/uuid%3Ae06cde4c-1514-4a8d-90be-7e10eee5aac1