During the last few weeks, I had the opportunity to read two documents:
- The MPEG-4 Part 2 specification (ISO/IEC 14496-2), which people just call "mpeg-4"
- The MPEG-4 Part 10 specification (ISO/IEC 14496-10), which is also called "h.264" or "AVC"
After reading about all the cool ideas in "mpeg-4", like identifying facial expressions, tracking the motion of people's limbs, and sprites, I got really excited. The ideas sound very fun, maybe even fantastic, for something from 1999.
But then I read the "h.264" standard, and none of those ideas were there. It has a lot of detail on how to encode pixels, but none of the really cool ideas.
What happened? Why were these ideas removed?
This is not a code question, but as a programmer I feel I should try to understand as much of the intent behind a specification as I can. If the code I write adheres to the spirit in which the specification was meant to be used, it is more likely to be able to take advantage of the entire specification.