Recording audio and passing the data to a UIWebView (JavascriptCore) on iOS 8/9

We have an app that is mostly a UIWebView wrapping a heavily JavaScript-based web app. The requirement we have come up against is to play audio to the user, record the user, play that recording back for confirmation, and then send the audio to a server. This works in Chrome, on Android and on other platforms because the ability is built into the browser; no native code is required.

Sadly, the iOS (iOS 8/9) web view lacks the ability to record audio.

The first workaround we tried was recording the audio with an AudioQueue and passing the data (Linear PCM, 16-bit) to a JS AudioNode so the web app could process the iOS audio exactly the same way it does on other platforms. We got to the point where we could pass the audio to JS, but the app would eventually crash with a bad-memory-access error, or the JavaScript side simply could not keep up with the data being sent.

The next idea was to save the audio recording to a file and send partial audio data to JS for visual feedback, a basic audio visualizer displayed during recording only.

The audio records to a WAVE file as signed 16-bit Linear PCM and plays back fine. The JS visualizer is where we are stuck: it expects unsigned 8-bit Linear PCM, so I added a conversion step that may be wrong. I've tried several different approaches, mostly found online, and none of them works, which makes me think something else is wrong or missing before we even get to the conversion step.

Since I don't know exactly what or where the problem is, I'll dump the code for the audio recording and playback classes below. Any suggestions for resolving, or somehow bypassing, this issue would be welcome.

One idea I had was to record in a different format (CAF) using different format flags. Looking at the values that are produced, none of the signed 16-bit ints come anywhere near the maximum value; I rarely see anything above +/-1000. Is that because of the kLinearPCMFormatFlagIsPacked flag in the AudioStreamBasicDescription? Removing that flag causes the audio file not to be created because of an invalid format. Maybe switching to CAF would work, but we would need to convert to WAVE before sending the audio back to our server.

Or maybe my conversion from signed 16-bit to unsigned 8-bit is wrong? I have also tried bit shifting and casting. The only difference is that with this conversion all the audio values end up compressed to between 125 and 130, while bit shifting and casting change that to 0-5 and 250-255. Neither really solves the problem on the JS side.
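
For comparison, here is a minimal sketch (plain C, not the exact code I used) of the standard mapping from signed 16-bit samples to the unsigned 8-bit range centred on 128, which is the same range that, for example, a Web Audio AnalyserNode's getByteTimeDomainData uses. The helper name and parameters are made up for the example:

#include <stdint.h>
#include <stddef.h>

// Map signed 16-bit PCM (-32768..32767) onto unsigned 8-bit PCM (0..255),
// keeping silence at 128. Quiet input around +/-1000 still lands near
// 124-132, so values clustered there are expected rather than a bug.
static void ConvertS16ToU8(const int16_t *in, uint8_t *out, size_t count)
{
    for ( size_t i = 0; i < count; i++ )
    {
        // Offset into the unsigned range first, then drop the low byte.
        out[i] = (uint8_t)(((int32_t)in[i] + 32768) >> 8);
    }
}

The important part is the +32768 offset (equivalently, adding 128 after an arithmetic shift); casting a small negative sample straight to uint8_t wraps around, which would explain values showing up as 0-5 and 250-255.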

The next step would be, instead of passing the raw data to JS, to run it through an FFT function and produce values the JS audio visualizer can use directly. I'd rather figure out whether I have done something obviously wrong before going in that direction.

AQRecorder.h - EDIT: updated audio format to LinearPCM 32bit Float.

#ifndef AQRecorder_h  
#define AQRecorder_h  
#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioToolbox.h>
#define NUM_BUFFERS 3  
#define AUDIO_DATA_TYPE_FORMAT float  
#define JS_AUDIO_DATA_SIZE 32  
@interface AQRecorder : NSObject {  
    AudioStreamBasicDescription  mDataFormat;  
    AudioQueueRef                mQueue;  
    AudioQueueBufferRef          mBuffers[ NUM_BUFFERS ];  
    AudioFileID                  mAudioFile;  
    UInt32                       bufferByteSize;  
    SInt64                       mCurrentPacket;  
    bool                         mIsRunning;  
}  
- (void)setupAudioFormat;  
- (void)startRecording;  
- (void)stopRecording;  
- (void)processSamplesForJS:(UInt32)audioDataByteSize audioData:(void *)audioData;
- (Boolean)isRunning;  
@end  
#endif 

AQRecorder.m - EDIT: updated audio format to LinearPCM 32bit Float. Added FFT step in processSamplesForJS instead of sending audio data directly.

#import <AVFoundation/AVFoundation.h>
#import <Accelerate/Accelerate.h>   // vDSP FFT routines
#import "AQRecorder.h"
#import "JSMonitor.h"
@implementation AQRecorder  
void AudioQueueCallback(void * inUserData,   
                        AudioQueueRef inAQ,  
                        AudioQueueBufferRef inBuffer,  
                        const AudioTimeStamp * inStartTime,  
                        UInt32 inNumberPacketDescriptions,  
                        const AudioStreamPacketDescription* inPacketDescs)  
{  

    AQRecorder *aqr = (__bridge AQRecorder *)inUserData;  
    if ( [aqr isRunning] )  
    {  
        if ( inNumberPacketDescriptions > 0 )  
        {  
            AudioFileWritePackets(aqr->mAudioFile, FALSE, inBuffer->mAudioDataByteSize, inPacketDescs, aqr->mCurrentPacket, &inNumberPacketDescriptions, inBuffer->mAudioData);  
            aqr->mCurrentPacket += inNumberPacketDescriptions;  
            // Process only the bytes actually filled, not the buffer's full capacity.
            [aqr processSamplesForJS:inBuffer->mAudioDataByteSize audioData:inBuffer->mAudioData];
        }  

        AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, NULL);  
    }  
}  
- (void)debugDataFormat
{
    NSLog(@"format=%u, sampleRate=%f, channels=%u, flags=%u, BPC=%u, BPF=%u",
          (unsigned int)mDataFormat.mFormatID, mDataFormat.mSampleRate,
          (unsigned int)mDataFormat.mChannelsPerFrame, (unsigned int)mDataFormat.mFormatFlags,
          (unsigned int)mDataFormat.mBitsPerChannel, (unsigned int)mDataFormat.mBytesPerFrame);
}
- (void)setupAudioFormat  
{  
    memset(&mDataFormat, 0, sizeof(mDataFormat));  

    mDataFormat.mSampleRate = 44100.;  
    mDataFormat.mChannelsPerFrame = 1;  
    mDataFormat.mFormatID = kAudioFormatLinearPCM;  
    mDataFormat.mFormatFlags = kLinearPCMFormatFlagIsFloat | kLinearPCMFormatFlagIsPacked;  

    // Derive the sample size from the type used for processing (float = 32 bits).
    mDataFormat.mBitsPerChannel = 8 * sizeof(AUDIO_DATA_TYPE_FORMAT);
    mDataFormat.mBytesPerPacket = mDataFormat.mBytesPerFrame = (mDataFormat.mBitsPerChannel / 8) * mDataFormat.mChannelsPerFrame;
    mDataFormat.mFramesPerPacket = 1;
    mDataFormat.mReserved = 0;  

    [self debugDataFormat];  
}  
- (void)startRecording
{  
    [self setupAudioFormat];  

    mCurrentPacket = 0;  

    NSString *recordFile = [NSTemporaryDirectory() stringByAppendingPathComponent:@"AudioFile.wav"];
    // Build a file URL from the path.
    CFURLRef url = CFURLCreateWithFileSystemPath(kCFAllocatorDefault, (__bridge CFStringRef)recordFile, kCFURLPOSIXPathStyle, false);
    OSStatus stat = AudioFileCreateWithURL(url, kAudioFileWAVEType, &mDataFormat, kAudioFileFlags_EraseFile, &mAudioFile);
    if ( stat != noErr )
    {
        NSLog(@"AudioFileCreateWithURL failed :: %@", [NSError errorWithDomain:NSOSStatusErrorDomain code:stat userInfo:nil]);
    }
    CFRelease(url);

    bufferByteSize = 896 * mDataFormat.mBytesPerFrame;  
    AudioQueueNewInput(&mDataFormat, AudioQueueCallback, (__bridge void *)(self), NULL, NULL, 0, &mQueue);  
    for ( int i = 0; i < NUM_BUFFERS; i++ )  
    {  
        AudioQueueAllocateBuffer(mQueue, bufferByteSize, &mBuffers[i]);  
        AudioQueueEnqueueBuffer(mQueue, mBuffers[i], 0, NULL);  
    }  
    mIsRunning = true;  
    AudioQueueStart(mQueue, NULL);  
}  
- (void)stopRecording  
{  
    mIsRunning = false;
    AudioQueueStop(mQueue, false);  
    AudioQueueDispose(mQueue, false);  
    AudioFileClose(mAudioFile);  
}  
- (void)processSamplesForJS:(UInt32)audioDataByteSize audioData:(void *)audioData
{
    int sampleCount = audioDataByteSize / sizeof(AUDIO_DATA_TYPE_FORMAT);
    AUDIO_DATA_TYPE_FORMAT *samples = (AUDIO_DATA_TYPE_FORMAT *)audioData;

    NSMutableArray *audioDataBuffer = [[NSMutableArray alloc] initWithCapacity:JS_AUDIO_DATA_SIZE];

    // FFT code taken mostly from Apple's aurioTouch example
    const Float32 kAdjust0DB = 1.5849e-13;

    int bufferFrames = sampleCount;
    int bufferlog2 = round(log2(bufferFrames));
    float fftNormFactor = (1.0/(2*bufferFrames));
    FFTSetup fftSetup = vDSP_create_fftsetup(bufferlog2, kFFTRadix2);

    Float32 *outReal = (Float32*) malloc((bufferFrames / 2)*sizeof(Float32));
    Float32 *outImaginary = (Float32*) malloc((bufferFrames / 2)*sizeof(Float32));
    COMPLEX_SPLIT mDspSplitComplex = { .realp = outReal, .imagp = outImaginary };

    Float32 *outFFTData = (Float32*) malloc((bufferFrames / 2)*sizeof(Float32));

    //Generate a split complex vector from the real data
    vDSP_ctoz((COMPLEX *)samples, 2, &mDspSplitComplex, 1, bufferFrames / 2);

    //Take the fft and scale appropriately
    vDSP_fft_zrip(fftSetup, &mDspSplitComplex, 1, bufferlog2, kFFTDirection_Forward);
    vDSP_vsmul(mDspSplitComplex.realp, 1, &fftNormFactor, mDspSplitComplex.realp, 1, bufferFrames / 2);
    vDSP_vsmul(mDspSplitComplex.imagp, 1, &fftNormFactor, mDspSplitComplex.imagp, 1, bufferFrames / 2);

    //Zero out the nyquist value
    mDspSplitComplex.imagp[0] = 0.0;

    //Convert the fft data to dB
    vDSP_zvmags(&mDspSplitComplex, 1, outFFTData, 1, bufferFrames / 2);

    //In order to avoid taking log10 of zero, an adjusting factor is added in to make the minimum value equal -128dB
    vDSP_vsadd(outFFTData, 1, &kAdjust0DB, outFFTData, 1, bufferFrames / 2);
    Float32 one = 1;
    vDSP_vdbcon(outFFTData, 1, &one, outFFTData, 1, bufferFrames / 2, 0);

    // Average the FFT dB values down into JS_AUDIO_DATA_SIZE buckets for the JS visualizer
    int grpSize = (bufferFrames / 2) / JS_AUDIO_DATA_SIZE;
    int c = 1;
    Float32 avg = 0;
    int d = 1;
    for ( int i = 1; i < bufferFrames / 2; i++ )
    {
        if ( isnan(outFFTData[ i ]) || isinf(outFFTData[ i ]) )
        { // Skip NaN / infinite values
            c++;
        }
        else
        {
            avg += outFFTData[ i ];
            d++;
            //NSLog(@"db = %f, avg = %f", outFFTData[ i ], avg);

            if ( ++c >= grpSize )
            {
                uint8_t u = (uint8_t)((avg / d) + 128); //dB values seem to range from -128 to 0.
                NSLog(@"%i = %i (%f)", i, u, avg);
                [audioDataBuffer addObject:[NSNumber numberWithUnsignedInt:u]];
                avg = 0;
                c = 0;
                d = 1;
            }
        }
    }

    // Free the FFT setup and scratch buffers allocated above; without this the
    // recording callback leaks memory on every buffer.
    vDSP_destroy_fftsetup(fftSetup);
    free(outReal);
    free(outImaginary);
    free(outFFTData);

    [[JSMonitor shared] passAudioDataToJavascriptBridge:audioDataBuffer];
}
- (Boolean)isRunning  
{  
    return mIsRunning;  
}  
@end 
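
One possible refactor, shown here only as a sketch and not as the code we currently run: since the queue callback fires many times per second, the FFT setup and scratch buffers can be created once and reused instead of being allocated per buffer. The FFTHelper type and function names below are invented for the example; only the vDSP calls are real API, and the math mirrors processSamplesForJS above.

#include <math.h>
#include <stdlib.h>
#import <Accelerate/Accelerate.h>

// Invented helper: owns the FFT setup and scratch buffers so they are
// allocated once, not on every audio queue callback.
typedef struct {
    FFTSetup setup;
    vDSP_Length log2n;
    int frames;
    float *realp;
    float *imagp;
    float *magnitudes;   // dB values, frames / 2 entries
} FFTHelper;

static FFTHelper *FFTHelperCreate(int frames)
{
    FFTHelper *h = calloc(1, sizeof(FFTHelper));
    h->frames = frames;
    h->log2n = (vDSP_Length)round(log2(frames));
    h->setup = vDSP_create_fftsetup(h->log2n, kFFTRadix2);
    h->realp = malloc((frames / 2) * sizeof(float));
    h->imagp = malloc((frames / 2) * sizeof(float));
    h->magnitudes = malloc((frames / 2) * sizeof(float));
    return h;
}

// Same steps as processSamplesForJS, but reusing the cached setup and buffers.
static void FFTHelperProcess(FFTHelper *h, const float *samples)
{
    DSPSplitComplex split = { .realp = h->realp, .imagp = h->imagp };
    float norm = 1.0f / (2 * h->frames);
    float adjust0DB = 1.5849e-13f;
    float one = 1.0f;

    vDSP_ctoz((const DSPComplex *)samples, 2, &split, 1, h->frames / 2);
    vDSP_fft_zrip(h->setup, &split, 1, h->log2n, kFFTDirection_Forward);
    vDSP_vsmul(split.realp, 1, &norm, split.realp, 1, h->frames / 2);
    vDSP_vsmul(split.imagp, 1, &norm, split.imagp, 1, h->frames / 2);
    split.imagp[0] = 0.0f;
    vDSP_zvmags(&split, 1, h->magnitudes, 1, h->frames / 2);
    vDSP_vsadd(h->magnitudes, 1, &adjust0DB, h->magnitudes, 1, h->frames / 2);
    vDSP_vdbcon(h->magnitudes, 1, &one, h->magnitudes, 1, h->frames / 2, 0);
}

static void FFTHelperDestroy(FFTHelper *h)
{
    vDSP_destroy_fftsetup(h->setup);
    free(h->realp);
    free(h->imagp);
    free(h->magnitudes);
    free(h);
}

A recorder could create one of these in startRecording and destroy it in stopRecording, so the callback itself does no allocation.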

Audio playback and recording controller classes. Audio.h

#ifndef Audio_h  
#define Audio_h  
#import <AVFoundation/AVFoundation.h>  
#import "AQRecorder.h"  
@interface Audio : NSObject <AVAudioPlayerDelegate> {  
    AQRecorder* recorder;  
    AVAudioPlayer* player;  
    bool mIsSetup;  
    bool mIsRecording;  
    bool mIsPlaying;  
}  
- (void)setupAudio;  
- (void)startRecording;  
- (void)stopRecording;  
- (void)startPlaying;  
- (void)stopPlaying;  
- (Boolean)isRecording;  
- (Boolean)isPlaying;  
- (NSString *) getAudioDataBase64String;  
@end  
#endif 

Audio.m

#import "Audio.h"  
#import <AudioToolbox/AudioToolbox.h>  
#import "JSMonitor.h"  
@implementation Audio  
- (void)setupAudio  
{  
    NSLog(@"Audio->setupAudio");  
    AVAudioSession *session = [AVAudioSession sharedInstance];
    NSError *error = nil;
    if ( ![session setCategory:AVAudioSessionCategoryPlayAndRecord error:&error] )
    {
        NSLog(@"AVAudioSession setCategory failed :: %@", error);
    }
    if ( ![session setActive:YES error:&error] )
    {
        NSLog(@"AVAudioSession setActive failed :: %@", error);
    }

    recorder = [[AQRecorder alloc] init];  

    mIsSetup = YES;  
}  
- (void)startRecording  
{  
    NSLog(@"Audio->startRecording");  
    if ( !mIsSetup )  
    {  
        [self setupAudio];  
    }  

    if ( mIsRecording ) {  
        return;  
    }  

    if ( [recorder isRunning] == NO )  
    {  
        [recorder startRecording];  
    }  

    mIsRecording = [recorder isRunning];  
}  
- (void)stopRecording  
{  
    NSLog(@"Audio->stopRecording");  
    [recorder stopRecording];  
    mIsRecording = [recorder isRunning];  

    [[JSMonitor shared] sendAudioInputStoppedEvent];  
}  
- (void)startPlaying  
{  
    if ( mIsPlaying )  
    {  
        return;  
    }  

    mIsPlaying = YES;  
    NSLog(@"Audio->startPlaying");  
    NSError* error = nil;  
    NSString *recordFile = [NSTemporaryDirectory() stringByAppendingPathComponent: @"AudioFile.wav"];  
    player = [[AVAudioPlayer alloc] initWithContentsOfURL:[NSURL fileURLWithPath:recordFile] error:&error];  

    if ( player == nil )
    {
        NSLog(@"AVAudioPlayer failed :: %@", error);
        mIsPlaying = NO;
        return;
    }

    player.delegate = self;  
    [player play];  
}  
- (void)stopPlaying  
{  
    NSLog(@"Audio->stopPlaying");  
    [player stop];  
    mIsPlaying = NO;  
    [[JSMonitor shared] sendAudioPlaybackCompleteEvent];  
}  
- (NSString *) getAudioDataBase64String  
{  
    NSString *recordFile = [NSTemporaryDirectory() stringByAppendingPathComponent: @"AudioFile.wav"];  

    NSError* error = nil;  
    NSData *fileData = [NSData dataWithContentsOfFile:recordFile options: 0 error: &error];  
    if ( fileData == nil )  
    {  
        NSLog(@"Failed to read file, error %@", error);  
        return @"DATAENCODINGFAILED";  
    }  
    else  
    {  
        return [fileData base64EncodedStringWithOptions:0];  
    }  
}  
- (Boolean)isRecording { return mIsRecording; }  
- (Boolean)isPlaying { return mIsPlaying; }  

- (void)audioPlayerDidFinishPlaying:(AVAudioPlayer *)player successfully:(BOOL)flag  
{  
    NSLog(@"Audio->audioPlayerDidFinishPlaying: %i", flag);  
    mIsPlaying = NO;  
    [[JSMonitor shared] sendAudioPlaybackCompleteEvent];  
}  
- (void)audioPlayerDecodeErrorDidOccur:(AVAudioPlayer *)player error:(NSError *)error  
{  
    NSLog(@"Audio->audioPlayerDecodeErrorDidOccur: %@", error.localizedFailureReason);  
    mIsPlaying = NO;  
    [[JSMonitor shared] sendAudioPlaybackCompleteEvent];  
}  
@end 
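
One platform detail not shown in setupAudio above: on iOS 8/9 the first use of the microphone prompts the user for permission, and recording typically captures silence if it is denied. A minimal sketch of requesting it explicitly; requestRecordPermission: is a real AVAudioSession API, and where to call it from is an app-level choice:

[[AVAudioSession sharedInstance] requestRecordPermission:^(BOOL granted) {
    if ( !granted )
    {
        // Without permission the audio queue will not deliver usable audio.
        NSLog(@"Microphone permission denied; recording will not work");
    }
}];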

The JSMonitor class is a bridge between the UIWebView's JavaScriptCore and the native code. I'm not including it because it does nothing for audio other than passing data and calls between these classes and JSCore.
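
For context only, a rough sketch of how such a bridge can push the averaged values into the page. This is not the actual JSMonitor; the class name and the window.onNativeAudioData callback are invented for the example, while NSJSONSerialization and stringByEvaluatingJavaScriptFromString: are real APIs:

#import <UIKit/UIKit.h>

// Invented bridge sketch: takes the NSArray of NSNumbers produced above and
// hands it to a JS callback in the UIWebView.
@interface AudioBridgeSketch : NSObject
@property (nonatomic, weak) UIWebView *webView;
- (void)passAudioDataToJavascriptBridge:(NSArray *)values;
@end

@implementation AudioBridgeSketch
- (void)passAudioDataToJavascriptBridge:(NSArray *)values
{
    // An array of NSNumbers serializes cleanly to a JSON array.
    NSData *json = [NSJSONSerialization dataWithJSONObject:values options:0 error:nil];
    NSString *jsonString = [[NSString alloc] initWithData:json encoding:NSUTF8StringEncoding];
    NSString *js = [NSString stringWithFormat:@"window.onNativeAudioData && window.onNativeAudioData(%@);", jsonString];

    // UIWebView must only be touched on the main thread; the audio queue
    // callback is not guaranteed to run there.
    dispatch_async(dispatch_get_main_queue(), ^{
        [self.webView stringByEvaluatingJavaScriptFromString:js];
    });
}
@end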

EDIT

The audio format has changed to Linear PCM 32-bit float. Instead of sending the raw audio data, it is now run through an FFT and the averaged dB values are sent to JS.

Brigadier asked 12/11, 2015 at 14:54

Comments:

Did you check the values that you receive from inside Objective-C, or only at the very end in the UIWebView? – Dentoid

You are not passing the samples directly. Instead, you seem to pass something like a moving average of the last 32 samples (avg += *v; avg /= 2;). Is this your intention? – Dentoid

Why is AUDIO_DATA_TYPE_FORMAT *v a pointer? Shouldn't it be a sample value? – Dentoid

The values are being checked in Obj-C and on the JS side. The averaging is done because the JS visualizer doesn't need all the data, just enough to visualize correctly. The code has also changed and that isn't a pointer anymore. – Brigadier

Core Audio is a pain to work with. Fortunately, AVFoundation provides AVAudioRecorder to record audio, and it also gives you access to the average and peak audio power, which you can send back to your JavaScript to update your UI visualizer. From the docs:

An instance of the AVAudioRecorder class, called an audio recorder, provides audio recording capability in your application. Using an audio recorder you can:

  • Record until the user stops the recording
  • Record for a specified duration
  • Pause and resume a recording
  • Obtain input audio-level data that you can use to provide level metering

This Stack Overflow question has an example of how to use AVAudioRecorder.
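
As a rough illustration (not taken from the linked question), a minimal metering setup might look like the following; the settings dictionary and the idea of polling from a timer are just example choices:

// Record to a WAVE file with metering enabled, then poll the levels.
NSString *path = [NSTemporaryDirectory() stringByAppendingPathComponent:@"AudioFile.wav"];
NSDictionary *settings = @{ AVFormatIDKey : @(kAudioFormatLinearPCM),
                            AVSampleRateKey : @44100.0,
                            AVNumberOfChannelsKey : @1,
                            AVLinearPCMBitDepthKey : @16 };
NSError *error = nil;
AVAudioRecorder *recorder = [[AVAudioRecorder alloc] initWithURL:[NSURL fileURLWithPath:path]
                                                        settings:settings
                                                           error:&error];
recorder.meteringEnabled = YES;
[recorder prepareToRecord];
[recorder record];

// Later, e.g. from a repeating NSTimer, refresh and read the levels (in dB).
[recorder updateMeters];
float avgPower  = [recorder averagePowerForChannel:0];
float peakPower = [recorder peakPowerForChannel:0];
NSLog(@"avg=%f dB, peak=%f dB", avgPower, peakPower);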

Megmega answered 10/12, 2015 at 3:02

Comments:

The JS visualizer needs more than peak and average power to work. I can get those values from the AudioQueue as well. – Brigadier
