How to find the lag between two time series using cross-correlation - McMap

Asked 9/9, 2021 at 11:46 Answered 9/9, 2021 at 11:56

Solved python numpy cross-correlation

Say the two series are:

x = [4,4,4,4,6,8,10,8,6,4,4,4,4,4,4,4,4,4,4,4,4,4,4]
y = [4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,6,8,10,8,6,4,4]

Series y clearly lags x by 12 time periods: the peak sits at index 6 in x and at index 18 in y. However, using the following code, as suggested in "Python cross correlation":

import numpy as np
c = np.correlate(x, y, "full")
lag = np.argmax(c) - c.size/2

leads to an incorrect lag of -0.5.
What's wrong here?

Cystic asked 9/9, 2021 at 11:46 Comment(1)

What's the desired output? – Thibault 9/9, 2021 at 11:46

The easy way is to use scipy.signal.correlation_lags, which returns the lag that corresponds to each index of the correlation output.

Also, remember to subtract the mean from the inputs.

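import numpy as np
from scipy import signal

# Convert to arrays so the mean can be subtracted element-wise
x = np.array([4,4,4,4,6,8,10,8,6,4,4,4,4,4,4,4,4,4,4,4,4,4,4])
y = np.array([4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,6,8,10,8,6,4,4])
# Cross-correlate the mean-removed signals over all possible shifts
correlation = signal.correlate(x - np.mean(x), y - np.mean(y), mode="full")
# Lag (in samples) associated with each entry of `correlation`
lags = signal.correlation_lags(len(x), len(y), mode="full")
# Lag at which the absolute correlation peaks
lag = lags[np.argmax(abs(correlation))]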

This gives lag = -12, which is the difference between the index of the first 6 in x (index 4) and in y (index 16). If you swap the inputs, it gives +12.
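If you would rather stay with np.correlate as in the question, here is a minimal sketch of how that code could be corrected. It relies on two things: in "full" mode the zero-lag term sits at index len(y) - 1 (not at c.size/2, which is 22.5 here), and the means are removed first. The lags array built with np.arange is my own construction for illustration; it is meant to mirror what correlation_lags returns.

```python
import numpy as np

x = np.array([4,4,4,4,6,8,10,8,6,4,4,4,4,4,4,4,4,4,4,4,4,4,4], dtype=float)
y = np.array([4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,6,8,10,8,6,4,4], dtype=float)

# Remove the means before correlating
c = np.correlate(x - x.mean(), y - y.mean(), "full")

# In "full" mode, index 0 corresponds to lag -(len(y) - 1),
# so zero lag sits at index len(y) - 1
lags = np.arange(-(len(y) - 1), len(x))
lag = lags[np.argmax(c)]

# Equivalent form, closer to the code in the question
lag_alt = np.argmax(c) - (len(y) - 1)

print(lag, lag_alt)  # -12 -12
```

The sign convention is the same as in the scipy version above: a negative lag here means the feature appears later in y than in x.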

Edit

Why subtract the mean

If the signals have a non-zero mean, the terms near the center of the correlation become larger, because more samples overlap there and every overlapping pair contributes the product of the two means. Furthermore, for very large data, subtracting the mean makes the calculations more accurate numerically.
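A rough sketch of the reasoning, in my own notation (the symbols $\tilde{x}$, $\tilde{y}$, $\mu_x$, $\mu_y$ and $N_k$ are introduced here for illustration and are not taken from the SciPy documentation). Write each signal as its mean plus a zero-mean part and expand the correlation at lag $k$, with all sums running over the samples that overlap at that lag:

```latex
\begin{aligned}
x[n] &= \tilde{x}[n] + \mu_x, \qquad y[n] = \tilde{y}[n] + \mu_y \\
c[k] &= \sum_{n} x[n+k]\, y[n] \\
     &= \sum_{n} \tilde{x}[n+k]\, \tilde{y}[n]
      + \mu_x \sum_{n} \tilde{y}[n]
      + \mu_y \sum_{n} \tilde{x}[n+k]
      + N_k\, \mu_x \mu_y
\end{aligned}
```

Here $N_k$ is the number of overlapping samples at lag $k$; for two signals of equal length $N$ it is $N_k = N - |k|$, a triangle that peaks at zero lag. The last term depends only on the means and on the overlap, not on how well the shapes match, so it can dominate the first term and pull the maximum toward the center.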

Here I illustrate what would happen in this example if the mean were not subtracted.

import matplotlib.pyplot as plt

plt.plot(abs(correlation))  # mean subtracted
plt.plot(abs(signal.correlate(x, y, mode="full")))  # mean kept
plt.plot(abs(signal.correlate(np.ones_like(x)*np.mean(x), np.ones_like(y)*np.mean(y))))  # constant signals at the mean
plt.legend(['subtracting mean', 'keeping the mean', 'constant signal'])
plt.show()

Notice that the maximum of the blue curve (mean subtracted, at index 10, i.e. lag -12) does not coincide with the maximum of the orange curve (mean kept), which sits at the center.
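The same comparison can be checked numerically without a plot. This is only a sketch using np.correlate; the variable names raw, centered and overlap are mine. Correlating two arrays of ones counts how many samples overlap at each shift, which is the triangle that biases the uncentered result toward zero lag.

```python
import numpy as np

x = np.array([4,4,4,4,6,8,10,8,6,4,4,4,4,4,4,4,4,4,4,4,4,4,4], dtype=float)
y = np.array([4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,6,8,10,8,6,4,4], dtype=float)

# Lag associated with each index of a "full" correlation
lags = np.arange(-(len(y) - 1), len(x))

raw = np.correlate(x, y, "full")                            # means kept
centered = np.correlate(x - x.mean(), y - y.mean(), "full")  # means removed
# Number of overlapping samples at each shift (1, 2, ..., 23, ..., 2, 1)
overlap = np.correlate(np.ones_like(x), np.ones_like(y), "full")

print(lags[np.argmax(raw)])       # 0: the peak follows the overlap triangle
print(lags[np.argmax(centered)])  # -12: the peak follows the actual shift
print(lags[np.argmax(overlap)], overlap.max())
```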

Chabazite answered 9/9, 2021 at 11:56 Comment(8)

why do you need to subtract the mean when calculating the correlation? – Dungeon 13/5, 2022 at 13:19

If the two signals have the same length, the number of terms summed at each lag follows a triangle shape, which will probably place the maximum correlation at the center. – Chabazite 15/5, 2022 at 16:18

Added one plot to help there. – Chabazite 15/5, 2022 at 16:27

Thank you. You say 'the terms at the center of the correlation will become larger'. Why is this not reported in any official documentation? Do you have an official link that elaborates more on that and the use of the mean? – Dungeon 16/5, 2022 at 8:17

They give the definition in the documentation. I use this to calculate an unnormalized version of the Pearson correlation coefficient for all the possible shifts. – Chabazite 16/5, 2022 at 9:13

Sure, I just can't get from the definition how the correlation becomes stronger towards the center of the array, nor is there any mention of the mean subtraction. – Dungeon 16/5, 2022 at 9:22

So maybe you need to post a specific question for your specific doubts. I will be happy to give an answer with more details if I can. – Chabazite 16/5, 2022 at 9:37

I have done it here following this thread. – Dungeon 16/5, 2022 at 9:39
