Multidimensional/multivariate dynamic time warping (DTW) library/code in Python

5

11

I am working with multivariate time series data: for every instant of time there are three data points available. Format:

| X | Y | Z |

One time series in the above format is generated in real time. I am trying to find a good match for this real-time series within another, already stored base time series (which is much larger and was collected at a different frequency). If I apply standard DTW to each of the components (X, Y, Z) individually, they might each match at a different point within the base data, which is undesirable. So I need to find a point in the base data where all three components (X, Y, Z) match well, and at the same point.

I have researched the matter and found that multidimensional DTW is a good solution to this problem. In R, the dtw package includes multidimensional DTW, but I have to implement it in Python. The R-Python bridging package "rpy2" could probably help here, but I have no experience in R. I have looked through the available DTW packages in Python, such as mlpy and dtw, but they are no help. Can anyone suggest a Python package that does this, or code for multidimensional DTW using rpy2?

Thanks in advance!

Ledaledah answered 20/5, 2016 at 14:48 Comment(0)
6

It seems like tslearn's dtw_path() is exactly what you are looking for. To quote its documentation:

Compute Dynamic Time Warping (DTW) similarity measure between (possibly multidimensional) time series and return both the path and the similarity.

[...]

It is not required that both time series share the same size, but they must be the same dimension. [...]

The implementation they provide follows:

H. Sakoe, S. Chiba, “Dynamic programming algorithm optimization for spoken word recognition,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 26(1), pp. 43–49, 1978.
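To make "multidimensional" concrete, here is a minimal pure-Python sketch of the idea. It is not tslearn's actual implementation, and the function name dtw_multivariate and the toy data are made up for illustration. Each time step is treated as one vector and the local cost is the squared Euclidean distance between vectors, so all dimensions are forced onto a single shared warping path, which is what the question asks for.

```python
import math


def dtw_multivariate(a, b):
    """Multivariate DTW with one shared warping path (illustrative sketch).

    a, b: sequences of points; every point is a sequence of d numbers.
    Returns (path, distance), where path is a list of (i, j) index pairs.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: minimal accumulated squared distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost uses all dimensions of the two points at once
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1]))
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end to recover the warping path
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path, math.sqrt(cost[n][m])


# Toy bivariate series of different lengths (made-up data)
s1 = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
s2 = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
path, dist = dtw_multivariate(s1, s2)
print(path, dist)
```

The two series may differ in length, but every point must have the same number of dimensions, matching the constraint quoted above.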

Weswesa answered 27/7, 2019 at 17:21 Comment(0)
5

Thanks @lgautier. I dug deeper and found a way to run multivariate DTW from Python through rpy2. Passing the template and query as 2-D matrices (matrices as in R) makes the R dtw package perform a multivariate DTW. Also, if you have R installed, loading the R dtw library and running "?dtw" gives access to the library's documentation and the different functionalities it offers.

For future reference to other users with similar questions:

Official documentation of the R dtw package: https://cran.r-project.org/web/packages/dtw/dtw.pdf

Sample code, passing two 2-D matrices for multivariate DTW; the open_begin and open_end arguments enable subsequence matching:

import numpy as np
import rpy2.robjects.numpy2ri
rpy2.robjects.numpy2ri.activate()
from rpy2.robjects.packages import importr
import rpy2.robjects as robj

R = rpy2.robjects.r
DTW = importr('dtw')

# Generate our data
template = np.array([[1,2,3,4,5],[1,2,3,4,5]]).transpose()
rt,ct = template.shape
query = np.array([[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]]).transpose()
rq,cq = query.shape

#converting numpy matrices to R matrices
templateR=R.matrix(template,nrow=rt,ncol=ct)
queryR=R.matrix(query,nrow=rq,ncol=cq)

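# Calculate the alignment and the corresponding distance.
# open_begin/open_end let the template match a sub-part of the query.
alignment = R.dtw(templateR, queryR, keep=True, step_pattern=R.rabinerJuangStepPattern(4, "c"), open_begin=True, open_end=True)

# Extract the scalar distance from the returned R list
dist = alignment.rx('distance')[0][0]

print(dist)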
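For readers without R, the subsequence idea behind open_begin/open_end can be sketched in plain Python. This is a simplified illustration, not the R dtw algorithm: it uses the basic three-way step with no step-pattern normalization, so its numbers are not expected to agree with R's in general. The names subsequence_dtw and point_distance are made up here.

```python
import math


def point_distance(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def subsequence_dtw(template, query):
    """Find the part of `query` that best matches the whole `template`.

    Returns (distance, start, end), where start and end are inclusive
    0-based indices into `query`.
    """
    n, m = len(template), len(query)
    cost = [[0.0] * m for _ in range(n)]   # accumulated cost
    start = [[0] * m for _ in range(n)]    # query index where each path began
    for i in range(n):
        for j in range(m):
            d = point_distance(template[i], query[j])
            if i == 0:
                # Open begin: a match may start at any query position for free
                cost[i][j], start[i][j] = d, j
                continue
            candidates = [(cost[i - 1][j], start[i - 1][j])]
            if j > 0:
                candidates.append((cost[i - 1][j - 1], start[i - 1][j - 1]))
                candidates.append((cost[i][j - 1], start[i][j - 1]))
            best_cost, best_start = min(candidates)
            cost[i][j] = d + best_cost
            start[i][j] = best_start
    # Open end: the match may finish at any query position
    end = min(range(m), key=lambda j: cost[n - 1][j])
    return cost[n - 1][end], start[n - 1][end], end


# Same toy data as above: 5 and 16 bivariate points, one row per time step
template = [[float(v), float(v)] for v in range(1, 6)]
query = [[float(v), float(v)] for v in range(1, 17)]
print(subsequence_dtw(template, query))  # prints (0.0, 0, 4)
```

The template occurs exactly at the first five points of the query, so the distance is 0 and the matched span is query indices 0 to 4.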
Ledaledah answered 24/5, 2016 at 11:21 Comment(8)
Hey @Ledaledah, I am currently trying to implement multivariate dynamic time warping to cluster my time series. I couldn't understand your example. Does the template correspond to timestamps for 5-D data? What is the query? Thanks in advance. - Besetting
Hi, the above example carries out pattern matching using subsequence DTW: it tries to find which sub-part (subsequence) of the query matches the template. The template is a 5x2 matrix, meaning it is bivariate data with 5 data points (5 timestamps) for each of the two variables; the query is also bivariate, but with 16 data points. - Ledaledah
Thanks so much! Did you try mlpy, or do you think it has multivariate DTW? mlpy.sourceforge.net/docs/3.5 - Besetting
Yes, I tried mlpy, but it (a) does not support multivariate DTW and (b) gives very little freedom to fine-tune DTW behaviour through properties such as the step pattern or different distance measures. I would recommend rpy2 for a long list of reasons; performance-wise, too, in my experience rpy2 is faster than the other libraries available in Python, even though it needs to access R. For big data applications there are a few customized libraries in Python which perform better. - Ledaledah
Hey @moskdr, when I tried running the above code I got the following: "RRuntimeError: Error in globalCostMatrix(lm, step.matrix = dir, window.function = wfun, : step.matrix is no stepMatrix object". Do you know why this might be happening? - Besetting
When I removed step_pattern from the arguments, it worked. Not sure why. Can you please print the output of the above code? I got "93.3380951166". Thanks again! - Besetting
Yes, 93.338 is what I get too (but it's wrong). I have just updated the code with two more arguments, open_begin=True and open_end=True, which enable subsequence matching. By default dtw aligns the two sequences in their entirety, first point to first point and last point to last point, whereas here we pass a template that is smaller than the query and want to find the template as a sub-part of the query. The final output should be 0, since the entire template can be found in the query, so it is a complete match. - Ledaledah
@AdityaPatel you need to check your R library path to see if you have the package. First run base = importr('base') followed by print(base._libPaths()) in Python. Then check the library directory that is printed out to see if you have the package. - Reno
2

I think it is a good idea to try out a method in whatever implementation is already available before considering whether it is worth working on a reimplementation.

Did you try the following?

from rpy2.robjects.packages import importr
# You'll obviously need the R package "dtw" installed with your R
dtw = importr("dtw")

# all functions and objects in the R package "dtw" are now available
# with `dtw.<function or object>`
Unmixed answered 20/5, 2016 at 17:56 Comment(0)
0

I happened upon this post and thought I would provide some updated information in case anyone else is trying to find a way to do multivariate DTW in Python. The DTAIDistance package has the option to perform multivariate DTW.

Anoa answered 14/2, 2023 at 0:39 Comment(0)
0

As Jamie mentioned, DTAIDistance can do this, and it is very fast compared to tslearn for the same task.

Here is the project page: link

Here is the pypi page: link

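From the project page, here is example code (this particular example is univariate; for multivariate series, the package's dtw_ndim module takes 2-D arrays with one row per time step):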

from dtaidistance import dtw
import numpy as np
s1 = np.array([0.0, 0, 1, 2, 1, 0, 1, 0, 0])
s2 = np.array([0.0, 1, 2, 0, 0, 0, 0, 0, 0])
d = dtw.distance_fast(s1, s2)
Henri answered 17/8, 2023 at 9:59 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center. - Marthmartha
