I have two numpy arrays containing time series (Unix timestamps).
I want to find pairs of timestamps (one from each array) whose difference is within a threshold.
To achieve this, I need to align the two time series into two arrays such that each index holds a timestamp and its closest counterpart. (If two timestamps in one array are equally close to a timestamp in the other array, I don't mind which one is chosen, as the count of pairs is more important than the actual values.)
So the aligned data set will consist of two arrays of the same size, with the smaller array being filled with empty data.
I was thinking of using the timeseries package and the align function, but I am not sure how to use aligned for my data, which is a time series.
For example, consider two time series arrays:
import numpy as np

ts1 = np.array([1311242821.0, 1311242882.0, 1311244025.0, 1311244145.0,
                1311251330.0, 1311282555.0, 1311282614.0])
ts2 = np.array([1311226761.0, 1311227001.0, 1311257033.0, 1311257094.0, 1311281265.0])
Output sample:
Now for ts2[2] (1311257033.0), its closest pair should be ts1[4] (1311251330.0), because the difference is 5703.0, which is within the threshold, and it is the smallest. Now that ts2[2] and ts1[4] are already paired, they should be left out of the other calculations.
Such pairs should be found for the remaining elements, so the output arrays might be longer than the original arrays:
abs(ts1[0]-ts2[0]) = 16060
abs(ts1[0]-ts2[1]) = 15820 //pair
abs(ts1[0]-ts2[2]) = 14212
abs(ts1[0]-ts2[3]) = 14273
abs(ts1[0]-ts2[4]) = 38444
abs(ts1[1]-ts2[0]) = 16121
abs(ts1[1]-ts2[1]) = 15881
abs(ts1[1]-ts2[2]) = 14151
abs(ts1[1]-ts2[3]) = 14212
abs(ts1[1]-ts2[4]) = 38383
abs(ts1[2]-ts2[0]) = 17264
abs(ts1[2]-ts2[1]) = 17024
abs(ts1[2]-ts2[2]) = 13008
abs(ts1[2]-ts2[3]) = 13069
abs(ts1[2]-ts2[4]) = 37240
abs(ts1[3]-ts2[0]) = 17384
abs(ts1[3]-ts2[1]) = 17144
abs(ts1[3]-ts2[2]) = 12888
abs(ts1[3]-ts2[3]) = 12949
abs(ts1[3]-ts2[4]) = 37120
abs(ts1[4]-ts2[0]) = 24569
abs(ts1[4]-ts2[1]) = 24329
abs(ts1[4]-ts2[2]) = 5703 //pair
abs(ts1[4]-ts2[3]) = 5764
abs(ts1[4]-ts2[4]) = 29935
abs(ts1[5]-ts2[0]) = 55794
abs(ts1[5]-ts2[1]) = 55554
abs(ts1[5]-ts2[2]) = 25522
abs(ts1[5]-ts2[3]) = 25461
abs(ts1[5]-ts2[4]) = 1290 //pair
abs(ts1[6]-ts2[0]) = 55853
abs(ts1[6]-ts2[1]) = 55613
abs(ts1[6]-ts2[2]) = 25581
abs(ts1[6]-ts2[3]) = 25520
abs(ts1[6]-ts2[4]) = 1349
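The table of differences above does not have to be written out by hand. As a minimal sketch (the name `diff` is only illustrative), NumPy broadcasting can compute every pairwise absolute difference at once:

```python
import numpy as np

ts1 = np.array([1311242821.0, 1311242882.0, 1311244025.0, 1311244145.0,
                1311251330.0, 1311282555.0, 1311282614.0])
ts2 = np.array([1311226761.0, 1311227001.0, 1311257033.0, 1311257094.0,
                1311281265.0])

# diff[i, j] == abs(ts1[i] - ts2[j]); broadcasting a column against a row
# yields a matrix of shape (len(ts1), len(ts2)).
diff = np.abs(ts1[:, None] - ts2[None, :])

# Print the matrix in the same layout as the hand-written table.
for i, j in np.ndindex(diff.shape):
    print(f"abs(ts1[{i}]-ts2[{j}]) = {diff[i, j]:.0f}")
```

The matrix needs len(ts1) * len(ts2) entries, so this is only practical for arrays of moderate size.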
So the pairs are: (ts1[0], ts2[1]), (ts1[4], ts2[2]), (ts1[5], ts2[4])
The rest of the elements should have null as their pair.
The final two arrays will be of size 9.
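One possible reading of the pairing rule described above is a greedy match: repeatedly take the smallest remaining difference that is within the threshold, pair those two elements, and exclude both from further matching. The sketch below is only an illustration under that assumption. The function name `greedy_align`, the `threshold` parameter, the use of NaN as the "null" value, the output ordering, and the toy input data are all hypothetical choices, not something specified in the question.

```python
import numpy as np

def greedy_align(a, b, threshold):
    """Greedily pair closest values; return the pairs and two NaN-padded arrays."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = np.abs(a[:, None] - b[None, :])        # all pairwise differences
    order = np.argsort(diff, axis=None, kind="stable")  # smallest first
    used_a = np.zeros(len(a), dtype=bool)
    used_b = np.zeros(len(b), dtype=bool)
    pairs = []
    for flat in order:
        i, j = np.unravel_index(flat, diff.shape)
        if diff[i, j] > threshold:
            break                                 # everything after is too far
        if not used_a[i] and not used_b[j]:
            used_a[i] = used_b[j] = True          # each element pairs only once
            pairs.append((int(i), int(j)))
    # Paired values first, then unpaired values with NaN as their partner.
    out_a = [a[i] for i, _ in pairs]
    out_b = [b[j] for _, j in pairs]
    for i in np.flatnonzero(~used_a):
        out_a.append(a[i])
        out_b.append(np.nan)
    for j in np.flatnonzero(~used_b):
        out_a.append(np.nan)
        out_b.append(b[j])
    return pairs, np.array(out_a), np.array(out_b)

# Toy data (not the arrays from the question), with an arbitrary threshold.
a = np.array([0.0, 10.0, 25.0])
b = np.array([11.0, 100.0])
pairs, aligned_a, aligned_b = greedy_align(a, b, threshold=5.0)
print(pairs)      # index pairs (i, j) into a and b
print(aligned_a)
print(aligned_b)
```

With one pair found, the aligned arrays have length len(a) + len(b) - 1. Note that a greedy match is not guaranteed to maximize the number of pairs, so if the pair count matters most, a different matching strategy may be needed.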
Please let me know if this question is clear.
data_a = np.array([12345, 12846, 789789]) etc. would help people trying to help you. – Flout