Realtime Audio with AVAudioEngine

Hi. I want to implement a real-time audio application with the new AVAudioEngine in Swift. Does anyone have experience with the new framework? How do real-time applications work with it?

My first idea was to store the (processed) input data in an AVAudioPCMBuffer object and then play it back with an AVAudioPlayerNode, as you can see in my demo class:

import AVFoundation

class AudioIO {
    var audioEngine: AVAudioEngine
    var audioInputNode: AVAudioInputNode
    var audioPlayerNode: AVAudioPlayerNode
    var audioMixerNode: AVAudioMixerNode
    var audioBuffer: AVAudioPCMBuffer

    init() {
        audioEngine = AVAudioEngine()
        audioPlayerNode = AVAudioPlayerNode()
        audioMixerNode = audioEngine.mainMixerNode

        let frameLength = AVAudioFrameCount(256)
        audioBuffer = AVAudioPCMBuffer(pcmFormat: audioPlayerNode.outputFormat(forBus: 0),
                                       frameCapacity: frameLength)!
        audioBuffer.frameLength = frameLength

        audioInputNode = audioEngine.inputNode

        // Tap the input node and copy the incoming samples into my own buffer.
        audioInputNode.installTap(onBus: 0,
                                  bufferSize: frameLength,
                                  format: audioInputNode.outputFormat(forBus: 0)) { buffer, _ in
            guard let input = buffer.floatChannelData,
                  let output = self.audioBuffer.floatChannelData else { return }
            let channelStride = Int(self.audioMixerNode.outputFormat(forBus: 0).channelCount)

            for i in stride(from: 0, to: Int(self.audioBuffer.frameLength), by: channelStride) {
                // doing my real time stuff
                output[0][i] = input[0][i]
            }
        }

        // set up the audio engine
        audioEngine.attach(audioPlayerNode)
        audioEngine.connect(audioPlayerNode, to: audioMixerNode,
                            format: audioPlayerNode.outputFormat(forBus: 0))
        do {
            try audioEngine.start()
        } catch {
            print("Could not start the engine: \(error)")
        }

        // start the player and loop my buffer
        audioPlayerNode.play()
        audioPlayerNode.scheduleBuffer(audioBuffer, at: nil, options: .loops, completionHandler: nil)
    }
}

But this is far from real time and not very efficient. Any ideas or experiences? It does not matter whether you prefer Objective-C or Swift; I am grateful for any notes, remarks, comments, solutions, etc.

Faultless answered 24/6, 2014 at 9:30 Comment(6)
Objective-C is not recommended for real-time programming. I'm not aware of Apple taking an official position on real-time programming in Swift yet, but there was some discussion on prod.lists.apple.com/archives/coreaudio-api/2014/Jun/… – Coactive
Thank you for the link, but the essence of that discussion so far is: no one knows anything. ;-) The question is not really about the programming language, or whether Objective-C can process in real time; it is about how to use AVAudioEngine for real-time applications, which Apple advertises in its WWDC14 session no. 502. – Faultless
Objective-C can be used for writing real-time audio apps, but there are restrictions on what can be done inside Core Audio's IOProcs. For example, no memory allocation, no locks, no Objective-C method calls, etc. See rossbencina.com/code/… I imagine that internally AVAudioEngine uses only C inside the realtime methods, and I also bet that the taps have the same restrictions as IOProcs. – Coactive
Michael, for buffer taps I would suggest using simple and plain C. Swift and ObjC both introduce unpredictable overhead because of ARC, internal locks, and memory allocations. C is best suited to processing buffers. When it comes to feeding data to the main thread for display, use lock-free circular buffers and ObjC. But why are you copying the input buffer yourself? You can connect AVAudioEngine.inputNode directly to AVAudioEngine.outputNode (a sketch of that direct connection follows these comments). – Vaughn
By "real-time" do you mean recording, and doing stuff like drawing a waveform of the microphone's signal, or feeding the captured audio to a speech recognizer on the fly? If so, let me know, and I will post my code as an answer. – Dearden
The second one, signal processing in real time. Thanks in advance. – Faultless
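
A minimal sketch of the direct connection suggested in the comment above, assuming current Swift and that the audio session and microphone permission are already set up elsewhere:

import AVFoundation

// Minimal sketch: route the microphone input straight to the output and let
// AVAudioEngine handle the buffering, instead of copying samples by hand.
let engine = AVAudioEngine()
let input = engine.inputNode
let format = input.outputFormat(forBus: 0)

// The main mixer node is implicitly connected to the output node.
engine.connect(input, to: engine.mainMixerNode, format: format)

do {
    try engine.start()
} catch {
    print("Could not start the engine: \(error)")
}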

I've been experimenting with AVAudioEngine in both Objective-C and Swift. In the Objective-C version of my engine, all audio processing is done purely in C (by caching the raw C sample pointers available through AVAudioPCMBuffer, and operating on the data with only C code). The performance is impressive. Out of curiosity, I ported this engine to Swift. With tasks like playing an audio file linearly, or generating tones via FM synthesis, the performance is quite good, but as soon as arrays are involved (e.g. with granular synthesis, where sections of audio are played back and manipulated in a non-linear fashion), there is a significant performance hit. Even with the best optimization, CPU usage is 30-40% greater than with the Objective-C/C version. I'm new to Swift, so perhaps there are other optimizations of which I am ignorant, but as far as I can tell, C/C++ are still the best choice for realtime audio. Also look at The Amazing Audio Engine. I'm considering this, as well as direct use of the older C API.
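
To illustrate the pointer-caching idea: the answer describes doing this in pure C, but the access pattern looks roughly like the Swift sketch below, with a placeholder gain change standing in for the real DSP.

import AVFoundation

// Sketch: grab the raw sample pointers from an AVAudioPCMBuffer once and
// operate on them in place, instead of going through Swift arrays.
func processInPlace(_ buffer: AVAudioPCMBuffer) {
    guard let channels = buffer.floatChannelData else { return }
    let channelCount = Int(buffer.format.channelCount)
    let frameCount = Int(buffer.frameLength)

    for channel in 0..<channelCount {
        let samples = channels[channel]   // UnsafeMutablePointer<Float>
        for frame in 0..<frameCount {
            samples[frame] *= 0.5         // placeholder DSP: attenuate by 6 dB
        }
    }
}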

If you need to process live audio, then AVAudioEngine may not be for you. See my answer to this question: I want to call 20 times per second the installTapOnBus:bufferSize:format:block:

Malaysia answered 28/10, 2014 at 13:29 Comment(6)
Actually, my question was about the AVAudioEngine framework itself, not about Objective-C vs. Swift. But of course, the two are closely connected. – Faultless
Michael Dorner, since you said that I didn't answer your question, and it seems to me that I did, perhaps if you rephrased it, I could add some additional info that could be useful. I've been working through similar problems in my free time (experimenting with AVAudioEngine from higher-level languages), and am interested in sharing what I've learned. – Malaysia
@JasonMcClinsey: Would you be willing to post some code that demonstrates how you are using C and AVAudioPCMBuffer to do FM synthesis? (There is a new class called AVAudioUnitGenerator, which sounds promising, but the documentation is thin and says it is an API 'in development'.) – Organ
@JasonMcClinsey: I should have been more specific: I'm looking for Swift code. – Organ
@JasonMcClinsey I'm very confused: you had AVAudioEngine running just fine in Objective-C, you ported it to Swift for some reason, you then saw the performance go way down, and you blamed... AVAudioEngine? Sounds like the Swift port was 100% to blame. – Telford
@RoyLovejoy, please read my comment again, as I did not blame AVAudioEngine. – Malaysia

I think this does what you want: https://github.com/arielelkin/SwiftyAudio

Comment out the distortions and you have a clean loop.
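
For reference, a microphone-to-output chain with an effect in the middle looks roughly like the sketch below (an assumption about what such a loop involves, not the repository's actual code; dropping the effect and connecting the input straight to the mixer gives a clean pass-through):

import AVFoundation

// Sketch of a mic -> distortion -> output loop.
let engine = AVAudioEngine()
let distortion = AVAudioUnitDistortion()
distortion.loadFactoryPreset(.multiEcho1)

engine.attach(distortion)
let format = engine.inputNode.outputFormat(forBus: 0)
engine.connect(engine.inputNode, to: distortion, format: format)
engine.connect(distortion, to: engine.mainMixerNode, format: format)

do {
    try engine.start()
} catch {
    print("Could not start the engine: \(error)")
}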

Hitandmiss answered 15/6, 2015 at 15:12 Comment(1)
I am getting streamed audio buffers through a socket connection. How can I play those buffers with AVAudioEngine? – Brookebrooker

Apple has taken an official position on real-time coding in Swift. In the 2017 WWDC session on Core Audio, an Apple engineer said not to use either Swift or Objective-C methods inside the real-time audio callback context (implying: use only C, or maybe C++ or assembly code).

This is because use of Swift and Objective C methods can involve internal memory management operations that do not have bounded latency.

This implies the proper scheme might be to use Objective-C for your AVAudioEngine controller class, but only the C subset (of Objective-C) inside the installTapOnBus block.

Queston answered 29/7, 2017 at 0:37 Comment(1)
WWDC video on Apple's current site is developer.apple.com/videos/play/wwdc2017/501/?time=1268 ("Now here comes the actual render logic. And note that this part of the code is written in C++, and that is because as I mentioned it's not safe to use Objective-C or Swift runtime from a real-time context.") [update: I see @LloydRochester quoted a similar section too.] – Jat

According to Apple, in the WWDC 2017 "What's New in Core Audio" video at approximately 19:20, the engineer says that using Swift is "not safe" from a real-time context. Quoted from the transcript:

Now, because the rendering of the Engine happens from a real-time context, you will not be able to use the offline render Objective-C or Swift methods that we saw in the demo.

And that is because it is not safe to use the Objective-C or Swift runtime from a real-time context. So, instead, the engine itself provides you a render block that you can fetch and cache, and then later use this render block to render the engine from the real-time context. The next thing to do is to set up your input node so that you can provide your input data to the Engine. And here, you specify the format of the input that you will provide, and this can be a different format than the output. And you also provide a block which the Engine will call whenever it needs the input data. And when this block gets called, the Engine will let you know how many input frames it actually needs.
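
For what it's worth, the flow described above maps onto the manual rendering API (available since iOS 11 / macOS 10.13) roughly as in the sketch below; the format and frame count are arbitrary, and the cached block would be called from your own real-time context:

import AVFoundation

// Sketch: enable real-time manual rendering, register an input block, and
// cache the render block for later use from the real-time context.
func setUpManualRendering(_ engine: AVAudioEngine) throws -> AVAudioEngineManualRenderingBlock {
    let format = AVAudioFormat(standardFormatWithSampleRate: 44_100, channels: 2)!

    // Must be called while the engine is stopped.
    try engine.enableManualRenderingMode(.realtime,
                                         format: format,
                                         maximumFrameCount: 4096)

    // The engine calls this block whenever it needs input; return a pointer
    // to a pre-filled AudioBufferList, or nil if no input is available.
    _ = engine.inputNode.setManualRenderingInputPCMFormat(format) { frameCount in
        return nil // placeholder: supply real input data here
    }

    try engine.start()

    // Fetch and cache the render block now; never fetch it from the
    // real-time context itself.
    return engine.manualRenderingBlock
}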

Illuminating answered 25/7, 2018 at 19:50 Comment(0)