Currently, I'm doing a little test project to see if I can get samples from an AVAssetReader to play back using an AudioQueue on iOS.
I've read this: ( Play raw uncompressed sound with AudioQueue, no sound ) and this: ( How to correctly read decoded PCM samples on iOS using AVAssetReader -- currently incorrect decoding ),
Which both actually did help. Before reading, I was getting no sound at all. Now, I'm getting sound, but the audio is playing SUPER fast. This is my first foray into audio programming, so any help is greatly appreciated.
I initialize the reader thusly:
NSDictionary * outputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
[NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
[NSNumber numberWithFloat:44100.0], AVSampleRateKey,
[NSNumber numberWithInt:2], AVNumberOfChannelsKey,
[NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
[NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
[NSNumber numberWithBool:NO], AVLinearPCMIsFloatKey,
[NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
nil];
output = [[AVAssetReaderAudioMixOutput alloc] initWithAudioTracks:uasset.tracks audioSettings:outputSettings];
[reader addOutput:output];
...
And I grab the data thusly:
CMSampleBufferRef ref= [output copyNextSampleBuffer];
// NSLog(@"%@",ref);
if(ref==NULL)
return;
//copy data to file
//read next one
AudioBufferList audioBufferList;
NSMutableData *data = [NSMutableData data];
CMBlockBufferRef blockBuffer;
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(ref, NULL, &audioBufferList, sizeof(audioBufferList), NULL, NULL, 0, &blockBuffer);
// NSLog(@"%@",blockBuffer);
if(blockBuffer==NULL)
{
[data release];
return;
}
if(&audioBufferList==NULL)
{
[data release];
return;
}
//stash data in same object
for( int y=0; y<audioBufferList.mNumberBuffers; y++ )
{
// NSData* throwData;
AudioBuffer audioBuffer = audioBufferList.mBuffers[y];
[self.delegate streamer:self didGetAudioBuffer:audioBuffer];
/*
Float32 *frame = (Float32*)audioBuffer.mData;
throwData = [NSData dataWithBytes:audioBuffer.mData length:audioBuffer.mDataByteSize];
[self.delegate streamer:self didGetAudioBuffer:throwData];
[data appendBytes:audioBuffer.mData length:audioBuffer.mDataByteSize];
*/
}
which eventually brings us to the audio queue, set up in this way:
//Apple's own code for canonical PCM
audioDesc.mSampleRate = 44100.0;
audioDesc.mFormatID = kAudioFormatLinearPCM;
audioDesc.mFormatFlags = kAudioFormatFlagsAudioUnitCanonical;
audioDesc.mBytesPerPacket = 2 * sizeof (AudioUnitSampleType); // 8
audioDesc.mFramesPerPacket = 1;
audioDesc.mBytesPerFrame = 1 * sizeof (AudioUnitSampleType); // 8
audioDesc.mChannelsPerFrame = 2;
audioDesc.mBitsPerChannel = 8 * sizeof (AudioUnitSampleType); // 32
err = AudioQueueNewOutput(&audioDesc, handler_OSStreamingAudio_queueOutput, self, NULL, NULL, 0, &audioQueue);
if(err){
#pragma warning handle error
//never errs, am using breakpoint to check
return;
}
and we enqueue thusly
while (inNumberBytes)
{
size_t bufSpaceRemaining = kAQDefaultBufSize - bytesFilled;
if (bufSpaceRemaining < inNumberBytes)
{
AudioQueueBufferRef fillBuf = audioQueueBuffer[fillBufferIndex];
fillBuf->mAudioDataByteSize = bytesFilled;
err = AudioQueueEnqueueBuffer(audioQueue, fillBuf, 0, NULL);
}
bufSpaceRemaining = kAQDefaultBufSize - bytesFilled;
size_t copySize;
if (bufSpaceRemaining < inNumberBytes)
{
copySize = bufSpaceRemaining;
}
else
{
copySize = inNumberBytes;
}
if (bytesFilled > packetBufferSize)
{
return;
}
AudioQueueBufferRef fillBuf = audioQueueBuffer[fillBufferIndex];
memcpy((char*)fillBuf->mAudioData + bytesFilled, (const char*)(inInputData + offset), copySize);
bytesFilled += copySize;
packetsFilled = 0;
inNumberBytes -= copySize;
offset += copySize;
}
}
I tried to be as code inclusive as possible so as to make it easy for everyone to point out where I'm being a moron. That being said, I have a feeling my problem occurs either in the output settings declaration of the track reader or in the actual declaration of the AudioQueue (where I describe to the queue what kind of audio I'm going to be sending it). The fact of the matter is, I don't really know mathematically how to actually generate those numbers (bytes per packet, frames per packet, what have you). An explanation of that would be greatly appreciated, and thanks for the help in advance.