AVAudioEngine synchronization for MIDI playback and recording
Question 1: My first question concerns playback synchronization when using an AVAudioPlayerNode together with an AVAudioSequencer for MIDI. Essentially I'm trying to play audio over the MIDI sequence, but the two need to be perfectly synchronized.

I'm aware there are sync methods for AVAudioPlayerNode, but the sequencer does not seem to offer anything similar.

Currently I've tried using CACurrentMediaTime() plus a delay, and usleep on separate threads, but neither works very well.

Question 2: I'm using a tap on engine.inputNode to capture the recording, separately from the music playback. However, the recording seems to start earlier than playback: when I compare the recorded data with the original playback, the difference is around 300 ms. I could start recording 300 ms later, but even then that would not guarantee precise sync, and the offset is likely machine dependent.

So my question is, what would be a good way to ensure that the recording starts precisely at the moment the playback starts?

Guido answered 20/10, 2018 at 5:35 Comment(0)
For synchronizing audio I/O, it is often best to create a single reference time, then use that time for all timing-related calculations.
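For example, a shared reference time might be built like this (a minimal sketch, assuming `player` is an AVAudioPlayerNode attached to a running engine; the 0.25 s lead-in is an arbitrary choice):

```swift
import AVFoundation

// Build one shared reference time a short lead-in ahead of "now",
// then schedule everything against it.
let leadIn: TimeInterval = 0.25  // scheduling headroom, arbitrary
let startHostTime = mach_absolute_time() + AVAudioTime.hostTime(forSeconds: leadIn)
let referenceTime = AVAudioTime(hostTime: startHostTime)

player.play(at: referenceTime)  // the node starts rendering at that host time
```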

AVAudioPlayerNode.play(at:) is what you need for the player. For the tap you need to filter out (partial) buffers manually using the time provided in the closure. AVAudioSequencer unfortunately does not have a facility for starting at a specific time, but you can get a reference time correlated to a beat with an already playing sequencer using hostTime(forBeats). If I remember correctly, you cannot set the sequencer to a negative position, so this is not ideal.

Here's a hacky workaround that should yield very accurate results:

Because AVAudioSequencer has to be started before you can get a reference time from it:

  • Offset all of your MIDI data by one beat.
  • Start the sequencer, then immediately get the reference time correlated to beat 1.
  • Synchronize the start of the player to this time, and also use it to filter out unwanted audio captured by the tap.

func syncStart() throws {
    //setup
    sequencer.currentPositionInBeats = 0
    player.scheduleFile(myFile, at: nil)
    player.prepare(withFrameCount: 4096)

    // Start and get reference time of beat 1
    try sequencer.start()
    // Wait until first render cycle completes or hostTime(forBeats) will err - AVAudioSequencer is fragile :/
    while (self.sequencer.currentPositionInBeats <= 0) { usleep(1000) } // poll every 1 ms
    var nsError: NSError?
    let hostTime = sequencer.hostTime(forBeats: 1, error: &nsError)
    if let error = nsError { throw error }
    let referenceTime = AVAudioTime(hostTime: hostTime)

    // AVAudioPlayerNode can be scheduled to start at a specific host time.
    player.play(at: referenceTime)

    // This just rejects buffers that come too soon. To do this right you need to record partial buffers.
    engine.inputNode.installTap(onBus: 0, bufferSize: 1024, format: nil) { (buffer, audioTime) in
        guard audioTime.hostTime >= referenceTime.hostTime else { return }
        self.recordBuffer(buffer: buffer)
    }
}
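The partial-buffer point deserves elaboration: the first buffer that straddles the reference time should be trimmed, not rejected. One way to do that might look like this (a hypothetical helper, not part of the answer above; it assumes the tap delivers non-interleaved Float32 buffers, and that the tap's AVAudioTime carries a valid host time):

```swift
import AVFoundation

// Trim the leading frames of a captured buffer so that recording
// starts exactly at the reference time. Returns nil if the whole
// buffer precedes the reference time.
func trimmed(_ buffer: AVAudioPCMBuffer,
             capturedAt audioTime: AVAudioTime,
             reference: AVAudioTime) -> AVAudioPCMBuffer? {
    let offsetSeconds = AVAudioTime.seconds(forHostTime: reference.hostTime)
                      - AVAudioTime.seconds(forHostTime: audioTime.hostTime)
    guard offsetSeconds > 0 else { return buffer }  // buffer starts after the reference
    let offsetFrames = AVAudioFrameCount(offsetSeconds * buffer.format.sampleRate)
    guard offsetFrames < buffer.frameLength else { return nil }  // ends before the reference
    let remaining = buffer.frameLength - offsetFrames
    guard let out = AVAudioPCMBuffer(pcmFormat: buffer.format, frameCapacity: remaining),
          let src = buffer.floatChannelData,
          let dst = out.floatChannelData else { return nil }
    for channel in 0..<Int(buffer.format.channelCount) {
        dst[channel].update(from: src[channel] + Int(offsetFrames), count: Int(remaining))
    }
    out.frameLength = remaining
    return out
}
```

Inside the tap closure you would then write `trimmed(buffer, capturedAt: audioTime, reference: referenceTime)` to the file instead of the raw buffer, skipping nil results.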
Actually answered 24/10, 2018 at 1:43 Comment(2)
Thanks. This unfortunately didn't work for me because hostTime(forBeats:error:) kept causing a crash, but it definitely set me on the correct path. I'll reference this answer in my own answer and post what I did. – Guido
There, I fixed it by waiting for the first render cycle to complete before calling hostTime(forBeats:). – Actually

dave234's answer unfortunately didn't work for me because hostTime(forBeats:error:) kept crashing even after starting the sequencer first. (It did work when I dispatched asynchronously after some delay, but that would have caused further complications.) However, it provided valuable insight into synchronization methods, and here's what I did:

var refTime: AVAudioTime

if isMIDIPlayer {
    sequencer!.tracks.forEach { $0.offsetTime = 1 }
    sequencer!.currentPositionInBeats = 0

    // One beat's worth of lead-in: convert beat 1 to seconds, then to
    // host ticks, and add to "now" (mach_absolute_time is the host clock).
    let sec = sequencer!.seconds(forBeats: 1)
    let delta = AVAudioTime.hostTime(forSeconds: sec) + mach_absolute_time()
    refTime = AVAudioTime(hostTime: delta)

    try sequencer!.start()
} else {
    player!.prepare(withFrameCount: 4096)

    // Give the player a 0.5 s lead-in from "now" on the host clock.
    let delta = AVAudioTime.hostTime(forSeconds: 0.5) + mach_absolute_time()
    refTime = AVAudioTime(hostTime: delta)

    player!.play(at: refTime)
}

mixer.installTap(
    onBus: 0,
    bufferSize: 8, // the engine may substitute a supported buffer size
    format: mixer.outputFormat(forBus: 0)
) { [weak self] (buffer, time) in
    guard let strongSelf = self else { return }
    guard time.hostTime >= refTime.hostTime else { print("NOPE"); return }

    do {
        try strongSelf.recordFile!.write(from: buffer)
    } catch {
        // TODO: Handle error
        print(error)
    }
}

Some explanation about the code snippet:

  • I have made a generic AudioPlayer that can play both MIDI and other song files, and the code is from a method inside AudioPlayer.
  • sequencer is used for MIDI playback.
  • player is used for other song files.

Synchronizing MIDI playback with another player uses the same approach:

midi!.sequencer!.tracks.forEach { $0.offsetTime = 1 }

let sec = midi!.sequencer!.seconds(forBeats: 1)
let delta = AVAudioTime.hostTime(forSeconds: sec) + mach_absolute_time()
let refTime = AVAudioTime(hostTime: delta)

do {
    try midi!.play()
} catch {
    // TODO: Add error handler
    print(error)
}

song2.play(at: refTime)

Here, midi is the AVAudioSequencer object, and song2 is the AVAudioPlayerNode that plays a regular song.

Works like a charm!

Guido answered 26/10, 2018 at 6:11 Comment(5)
This is a pretty inaccurate method. If sample accuracy is not needed it may be fine, but the method I posted above should get you very accurate results. – Actually
@Actually I wonder where you got the notion of inaccuracy. As far as I can tell, both the AVAudioSequencer and the AVAudioPlayerNode are scheduled to play at a future point in time as counted by mach_absolute_time(), a.k.a. host time. The only inaccuracy can stem from the write buffer being too large, which your code has as well. I've personally tested my code. Have you tested yours? – Guido
You are making the assumption that the machine time when start() is called will correlate with the actual start time of the sequencer, but this isn't the case. The actual start time of the sequencer after start() is called is undocumented, but empirically it is the start of the next audio render cycle, provided you called prepareToPlay() beforehand. As far as testing goes, compare delta with hostTime(forBeats: 0) and you'll see the discrepancy with this method. – Actually
Yeah, well, as long as people hear the notes played at the same time they'll be happy. They won't come to me saying one player is playing 10 CPU ticks fast, so I'm done here. – Guido
It's more like 0.02 seconds, but if it sounds good, it sounds good :) – Actually