Time delay estimation between two audio signals

I have two audio recordings of the same signal made by two different microphones (for example, in WAV format), but one of them is delayed relative to the other, for example by several seconds.

It's easy to identify such a delay visually when viewing the signals in a waveform viewer, i.e. by spotting the first visible peak in each signal and checking that the peaks have the same shape:


(source: greycat.ru)

But how do I find this delay (t) programmatically? The two digitized signals are slightly different (the microphones are different, they were at different positions, the ADC setups differ, etc.).

I've dug around a bit and found out that this problem is usually called "time-delay estimation" and that there are myriad approaches to it - for example, one of them.

But are there any simple, ready-made solutions available, such as a command-line utility, a library, or a straightforward algorithm?

Conclusion: I found no simple implementation, so I wrote a simple command-line utility myself. It is available at https://bitbucket.org/GreyCat/calc-sound-delay (GPLv3-licensed) and implements a very simple search-for-maximum algorithm described on Wikipedia.

Platto answered 11/2, 2011 at 9:33 Comment(0)

The technique you're looking for is called cross-correlation. It's a very simple, if somewhat compute-intensive, technique that can be used to solve various problems, including measuring the time difference (aka lag) between two similar signals (the signals do not need to be identical).

If you have a reasonable idea of your lag value (or at least the range of lag values that are expected) then you can reduce the total amount of computation considerably. Ditto if you can put a definite limit on how much accuracy you need.
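As a rough illustration of the idea, here is a minimal sketch of a direct (time-domain) cross-correlation that only searches lags within an expected range. The function name `estimate_delay` and the `max_lag_s` parameter are made up for this example, and it assumes two mono signals sampled at the same rate `fs`:

```python
import numpy as np

def estimate_delay(a, b, fs, max_lag_s):
    """Return the delay of b relative to a in seconds (positive: b lags a).

    Hypothetical helper: only lags within +/- max_lag_s are tested.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Remove the DC offset so it does not dominate the correlation
    a = a - a.mean()
    b = b - b.mean()
    max_lag = int(round(max_lag_s * fs))
    lags = np.arange(-max_lag, max_lag + 1)
    scores = np.empty(len(lags))
    for i, lag in enumerate(lags):
        if lag >= 0:
            # Compare a[n] with b[n + lag]
            n = min(len(a), len(b) - lag)
            scores[i] = np.dot(a[:n], b[lag:lag + n]) if n > 0 else 0.0
        else:
            # Compare a[n - lag] with b[n]
            n = min(len(a) + lag, len(b))
            scores[i] = np.dot(a[-lag:-lag + n], b[:n]) if n > 0 else 0.0
    # The lag with the highest correlation score is the delay estimate
    return lags[np.argmax(scores)] / fs
```

The cost grows with the number of lags tested, which is why narrowing the lag range (or downsampling first, if coarse accuracy is enough) cuts the computation.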

Curbstone answered 11/2, 2011 at 9:46 Comment(6)
Yes, cross-correlation, exactly. Good point that the computation can be reduced if a good starting point can be guesstimated. - Iraq
I dug around and found no simple implementations of this algorithm, so I made one myself and published it at bitbucket.org/GreyCat/calc-sound-delay - Platto
Cross-correlation is a lot faster if you use an FFT. gist.github.com/376572 - Elan
Intuition for convolution and cross-correlation: youtube.com/watch?v=MQm6ZP1F6ms - Alves
Is it possible to use this technique for acoustic signals? - Rorry
@Aniiya0978: yes, audio signals are a very common use case. - Curbstone

Having had the same problem, and having found no tool that automatically syncs the start of video/audio recordings, I decided to make syncstart (GitHub).

It is a command-line tool. The basic code behind it is this:

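import numpy as np
from scipy import fft
from scipy.io import wavfile

# in1, in2: paths of the two WAV files to compare (not defined in this snippet)
r1, s1 = wavfile.read(in1)
r2, s2 = wavfile.read(in2)
# Both files must share one sample rate; this snippet also assumes mono data
assert r1 == r2, "syncstart normalizes using ffmpeg"
fs = r1
ls1 = len(s1)
ls2 = len(s2)
# Zero-pad to a power of two above ls1+ls2 so the circular correlation does not wrap
padsize = ls1 + ls2 + 1
padsize = 2 ** (int(np.log2(padsize)) + 1)
s1pad = np.zeros(padsize)
s1pad[:ls1] = s1
s2pad = np.zeros(padsize)
s2pad[:ls2] = s2
# Cross-correlation via FFT: ifft(fft(s1) * conj(fft(s2)))
corr = fft.ifft(fft.fft(s1pad) * np.conj(fft.fft(s2pad)))
ca = np.absolute(corr)
xmax = np.argmax(ca)
# A peak in the upper half is a negative lag (s2 starts late); otherwise s1 starts late
if xmax > padsize // 2:
    file, offset = in2, (padsize - xmax) / fs
else:
    file, offset = in1, xmax / fs
# file: the recording with extra lead-in; offset: how much to trim from its start, in seconds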
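For comparison, roughly the same computation can be sketched with SciPy's ready-made helpers `scipy.signal.correlate` and `scipy.signal.correlation_lags` (the latter needs SciPy 1.6 or newer). This is not the syncstart code; the function name `lag_seconds` is invented for the example, and it assumes two mono arrays at the same sample rate `fs`:

```python
import numpy as np
from scipy import signal

def lag_seconds(s1, s2, fs):
    """Return how much s1 is delayed relative to s2, in seconds.

    Hypothetical helper: a positive result means s1 has extra lead-in.
    """
    s1 = np.asarray(s1, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    # FFT-based cross-correlation; zero-padding is handled internally
    corr = signal.correlate(s1, s2, mode="full", method="fft")
    # Lag value (in samples) that belongs to each entry of corr
    lags = signal.correlation_lags(len(s1), len(s2), mode="full")
    return lags[np.argmax(np.abs(corr))] / fs
```

A positive result corresponds to the `else` branch above (trim the first file); a negative one corresponds to trimming the second file.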
Bekelja answered 19/2, 2021 at 22:36 Comment(2)
Roland, I'm currently testing your code and there's a mistake in it: the last if statement uses the variables in1 and in2, but they are not defined anywhere. - Lebrun
That would be the file names. See the GitHub version. - Bekelja

A very straightforward thing to do is to check where the peaks exceed some threshold: the time between the high peak on line A and the high peak on line B is probably your delay. Try tinkering a bit with the thresholds; if the graphs are usually as clear as the picture you posted, you should be fine.
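A minimal sketch of that thresholding idea, assuming mono sample arrays and a threshold expressed as a fraction of each signal's own maximum (the names `first_onset`, `delay_by_threshold` and `rel_threshold` are invented for this example):

```python
import numpy as np

def first_onset(x, fs, rel_threshold=0.5):
    """Time in seconds of the first sample whose magnitude reaches the threshold."""
    mag = np.abs(np.asarray(x, dtype=float))
    # Threshold is relative to this signal's own peak, so gain differences matter less
    above = np.flatnonzero(mag >= rel_threshold * mag.max())
    return above[0] / fs

def delay_by_threshold(a, b, fs, rel_threshold=0.5):
    """Delay of b relative to a, taken as the difference of the first onsets."""
    return first_onset(b, fs, rel_threshold) - first_onset(a, fs, rel_threshold)
```

This only works when both recordings contain the same clear, loud event; with noisy signals the first threshold crossing can land on different peaks in each recording.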

Parthenogenesis answered 11/2, 2011 at 9:37 Comment(0)
