As described in this question, I'm trying to create a custom audio codec and save its output to an MP4 file with the Sink Writer. I got this working by setting MF_MT_MPEG4_SAMPLE_DESCRIPTION, following some information from this link.
The sample descriptor I set is this one:
UINT8 cz[72] = {
0x0,0x0,0x0,0x48, // len
0x73,0x74,0x73,0x64, // stsd
0x0,0x0,0x0,0x0, // verflag
0x0,0x0,0x0,0x1, // num
0x0,0x0,0x0,0x38, // len
'e','c','d','c', // code
0x0,0x0,0x0,0x0,0x0,0x0, // 6-null
// index
0x0,0x1,
// version
0x0,0x1,
// Revision Level
0x0,0x0,
// vendor
0x0,0x0,0x0,0x0,
// number channels
0x0,0x2,
// sample size
0x0,0x10,
// Compression ID
0xFF,0xFE,
// Packet Size,
0x0,0x0,
// Sample rate
0xBB,0x80,0x0,0x0,
// Sound Level 1 fields ?
0x0,0x0,0x0,0x0,
0x0,0x0,0x0,0x0,
0x0,0x0,0x0,0x0,
0x0,0x0,0x0,0x0,
};
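To double-check that the length fields and offsets in this blob are consistent, the big-endian fields can be read back with a small parser. This is only a sketch: `StsdAudioEntry`, `ReadBE16`, `ReadBE32` and `ParseStsdAudio` are hypothetical names of mine, not Media Foundation APIs, and the offsets assume the sound sample entry layout used in the comments above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper types and names; nothing here is a Media Foundation API.
struct StsdAudioEntry {
    uint32_t boxSize;     // size field of the 'stsd' box
    uint32_t entryCount;  // number of sample entries
    uint32_t entrySize;   // size field of the first sample entry
    std::string fourcc;   // format code of the first entry, e.g. "ecdc"
    uint16_t version;     // sound description version
    uint16_t channels;
    uint16_t sampleSize;  // bits per sample
    uint32_t sampleRate;  // integer part of the 16.16 fixed-point rate
};

static uint16_t ReadBE16(const uint8_t* p) {
    return static_cast<uint16_t>((p[0] << 8) | p[1]);
}

static uint32_t ReadBE32(const uint8_t* p) {
    return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) |
           (uint32_t(p[2]) << 8) | uint32_t(p[3]);
}

// Returns false if the buffer is too short or its size fields disagree.
bool ParseStsdAudio(const uint8_t* data, size_t len, StsdAudioEntry& out) {
    const size_t kHeader = 16;      // size + 'stsd' + version/flags + entry count
    const size_t kFixedEntry = 36;  // entry bytes up to and including the sample rate
    if (len < kHeader + kFixedEntry) return false;

    out.boxSize = ReadBE32(data);
    if (out.boxSize != len) return false;
    if (std::string(reinterpret_cast<const char*>(data + 4), 4) != "stsd") return false;
    out.entryCount = ReadBE32(data + 12);

    const uint8_t* e = data + kHeader;  // start of the first sample entry
    out.entrySize = ReadBE32(e);
    if (out.entrySize < kFixedEntry || kHeader + out.entrySize > len) return false;

    out.fourcc.assign(reinterpret_cast<const char*>(e + 4), 4);
    out.version = ReadBE16(e + 16);
    out.channels = ReadBE16(e + 24);
    out.sampleSize = ReadBE16(e + 26);
    out.sampleRate = ReadBE32(e + 32) >> 16;  // drop the fractional 16 bits
    return true;
}
```

Called as `ParseStsdAudio(cz, sizeof(cz), entry)`, this should accept the blob and report 'ecdc', version 1, 2 channels, 16 bits and 48000 Hz. Note that the initializer above only spells out 68 of the 72 bytes; the remaining 4 are zero-filled by the compiler, which is what makes the two length fields agree with the array size.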
My custom audio media type has the GUID {0000ECDC-0000-0010-8000-00AA00389B71}, set with MF_MT_SUBTYPE.
Just before calling Finalize, I check the media type of the MP4 being written, and it is indeed valid:
MF_MT_MPEG4_SAMPLE_DESCRIPTION byte array
MF_MT_AUDIO_NUM_CHANNELS 2
MF_MT_MAJOR_TYPE MFMediaType_Audio
MF_MT_AUDIO_SAMPLES_PER_SECOND 48000
MF_MT_MPEG4_CURRENT_SAMPLE_ENTRY 0
MF_MT_SUBTYPE {0000ECDC-0000-0010-8000-00AA00389B71}
The odd part comes when I reopen the file:
CComPtr<IMFSourceReader> srr;
MFCreateSourceReaderFromURL(fi.c_str(), 0, &srr);
CComPtr<IMFMediaType> c;
srr->GetCurrentMediaType(MF_SOURCE_READER_FIRST_AUDIO_STREAM, &c);
LogMediaType(c);
This time the media type comes back as:
MF_MT_AUDIO_AVG_BYTES_PER_SECOND 383
MF_MT_AVG_BITRATE 3071
MF_MT_MPEG4_SAMPLE_DESCRIPTION byte array
MF_MT_AUDIO_NUM_CHANNELS 2
MF_MT_MAJOR_TYPE MFMediaType_Audio
MF_MT_AUDIO_SAMPLES_PER_SECOND 48000
MF_MT_MPEG4_CURRENT_SAMPLE_ENTRY 0
MF_MT_AUDIO_BITS_PER_SAMPLE 16
MF_MT_SUBTYPE {65636463-767A-494D-B478-F29D25DC9037}
Now I'm forced to register my decoder with the odd {65636463-767A-494D-B478-F29D25DC9037} GUID as its subtype, and I also get some garbage values such as the average bitrate.
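One thing I did notice: the first field of that GUID, 0x65636463, is just the ASCII bytes of 'ecdc' read most significant byte first, so the subtype seems to be derived from the format code in my descriptor rather than being random. The rest of it, 767A-494D-B478-F29D25DC9037, looks like the MFMPEG4Format_Base value declared in mfapi.h, although I have not confirmed that this is what the source reader uses. A small sketch of the decoding, with hypothetical helper names of my own:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Reads a GUID's 32-bit Data1 value as four ASCII characters,
// most significant byte first. Hypothetical helper, not an MF API.
std::string Data1AsFourCC(uint32_t data1) {
    std::string s(4, '\0');
    s[0] = static_cast<char>((data1 >> 24) & 0xFF);
    s[1] = static_cast<char>((data1 >> 16) & 0xFF);
    s[2] = static_cast<char>((data1 >> 8) & 0xFF);
    s[3] = static_cast<char>(data1 & 0xFF);
    return s;
}

// Inverse of the above: packs four characters into a 32-bit value,
// most significant byte first.
uint32_t FourCCAsData1(const std::string& fourcc) {
    assert(fourcc.size() == 4);
    return (uint32_t(static_cast<uint8_t>(fourcc[0])) << 24) |
           (uint32_t(static_cast<uint8_t>(fourcc[1])) << 16) |
           (uint32_t(static_cast<uint8_t>(fourcc[2])) << 8) |
           uint32_t(static_cast<uint8_t>(fourcc[3]));
}
```

With these, `Data1AsFourCC(0x65636463)` gives "ecdc". That still does not tell me why the subtype I set with MF_MT_SUBTYPE is not the one I get back.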
What could cause this?
If I push the AAC descriptor, then the media type is correctly returned from the source reader. This is the AAC descriptor:
UINT8 caac[100] = {
0x0,0x0,0x0,0x64, // len
0x73,0x74,0x73,0x64, // stsd
0x0,0x0,0x0,0x0, // verflag
0x0,0x0,0x0,0x1, // num of descriptors
0x0,0x0,0x0,0x54, // len
0x6D,0x70,0x34,0x61, // 'mp4a' AAC
0x0,0x0,0x0,0x0,0x0,0x0, //6-null
0x0,0x1, // index
0x0,0x0, // version
0x0,0x0, // revision
0x0,0x0,0x0,0x0, // vendor
0x0,0x2, // channels
0x0,0x10, // sample size
0x0,0x0, // compression ID
0x0,0x0, // packet size
// Sample rate
0xBB,0x80,0x0,0x0,
0x0,0x0,0x0,0x30, // 48 bytes
0x65,0x73,0x64,0x73, // 'esds'
0x0,0x0,0x0,0x0,0x3,0x80,0x80,0x80,0x1F,0x0,0x0,0x0,0x4,0x80,0x80,0x80,0x14,0x40,0x15,0x0,0x6,0x0,0x0,0x2,0xEE,0x0,0x0,0x2,0xEE,0x0,0x5,0x80,0x80,0x80,0x2,0x11,0x90,0x6,0x1,0x2
};
So it includes an 'esds' descriptor... why? But even if I make a duplicate of the AAC code with the only difference being the 'ecdc' string, it still results in a corrupt media type.
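To be precise about what "duplicate" means here: only the four format-code bytes of the first sample entry change, which sit at offset 20 (16 bytes of 'stsd' header plus the 4-byte entry length). A sketch of that substitution, using a hypothetical helper `WithFormatCode` that is not part of any library:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: returns a copy of an 'stsd' blob with the format code
// of its first sample entry replaced. All other bytes are left untouched.
std::vector<uint8_t> WithFormatCode(const std::vector<uint8_t>& stsd,
                                    const std::string& fourcc) {
    // 16 bytes of 'stsd' header, then 4 bytes of entry size, then the code.
    const size_t kCodeOffset = 20;
    assert(fourcc.size() == 4 && stsd.size() >= kCodeOffset + 4);

    std::vector<uint8_t> out = stsd;
    for (size_t i = 0; i < 4; ++i) {
        out[kCodeOffset + i] = static_cast<uint8_t>(fourcc[i]);
    }
    return out;
}
```

For example, `WithFormatCode(std::vector<uint8_t>(caac, caac + sizeof(caac)), "ecdc")` yields the same 100 bytes as the AAC descriptor except for bytes 20 to 23, 'esds' box included.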