First, your data sample is in a proprietary format, am I right? Even using the biosig toolbox for Python this format cannot be read. Maybe I'm wrong, but I didn't succeed to read it.
Thus, I'll base my answer on artificial data, generated from a Rössler-oscillator. It is a chaotic, 3d-oscillator, often used in the field of nonlinear timeseries analysis.
import numpy as np
from scipy.signal import butter, lfilter
##############################################################
# For generating sample-data
##############################################################
from scipy.integrate import odeint
def roessler_ode(y,t,omega=1,a=0.165,b=0.2,c=10):
dy = np.zeros((3))
dy[0] = -1.0*(omega*y[1] + y[2]) #+ e1*(y[3]-y[0])
dy[1] = omega * y[0] + a * y[1]
dy[2] = b + y[2] * (y[0] - c)
return dy
class Roessler(object):
"""A single coupled Roessler oscillators"""
def __init__(self, y=None, omega=1.0, a=0.165,b=0.2,c=10):
self.omega = omega
self.a = a
self.b = b
self.c = c
if y==None:
self.y = np.random.random((3))+0.5
else:
self.y = y
def ode(self,y,t):
dy = roessler_ode(y[:],t,self.omega,self.a,self.b,self.c)
return dy
def integrate(self,ts):
rv = odeint(self.ode,self.y,ts)
self.y = rv[-1,:]
return rv
###############################################################
Now come your function definitions:
def butter_bandpass(lowcut,highcut,fs,order=8):
nyq = 0.5*fs
low = lowcut/nyq
high = highcut/nyq
b,a = butter(order, [low, high], btype='band')
return b,a
def butter_bandpass_filter(data,lowcut,highcut,fs,order=8):
b,a = butter_bandpass(lowcut,highcut,fs,order=order)
return lfilter(b,a,data)
I left them unchanged. I generate some data with my oscillator, but I only take the 3rd component of it. I add some gaussian noise in order to have something to filter out.
# generate sample data
data = Roessler().integrate(np.arange(0,1000,0.1))[:,2]
data += np.random.normal(size=data.shape)
Now let's come to the speed question. I use the timeit
-module to check execution times. These statements execute the filtering 100 times, and measure the overall time. I do this measurement for order 2 and order 8 (yes, you want sharper spectral edges, I know, but wait)
# time execution
from timeit import timeit
time_order8 = timeit("butter_bandpass_filter(data,300,2000,20000,8)", "from __main__ import butter_bandpass_filter, butter_bandpass, data", number=100)
time_order2 = timeit("butter_bandpass_filter(data,300,2000,20000,2)", "from __main__ import butter_bandpass_filter, butter_bandpass, data", number=100)
print "For order 8: %.2f seconds" % time_order8
print "For order 2: %.2f seconds" % time_order2
The output of these two print statements is:
For order 8: 11.70 seconds
For order 2: 0.54 seconds
This makes a factor of 20! Using order 8 for a butterworth filter is definitely not a good idea. I cannot think of any situation where this would make sense. To prove the other problems that arise when using such a filter, let's look at the effect of those filters. We apply bandpass filtering to our data, once with order 8 and once with order 2:
data_bp8 = butter_bandpass_filter(data,300,2000,20000,8)
data_bp2 = butter_bandpass_filter(data,300,2000,20000,2)
Now let's do some plots. First, simple lines (I didn't care about the x-axis)
# plot signals
import matplotlib.pyplot as plt
plt.figure(1)
plt.plot(data, label="raw")
plt.plot(data_bp8, label="order 8")
plt.plot(data_bp2, label="order 2")
plt.legend()
This gives us:
Oh, what happened to the green line? Weird, isn't it? The reason is that butterworth filters of order 8 become a rather instable thing. Ever heard of resonance disaster / catastrophe? This is what it looks like.
The power spectral densities of these signals can be plotted like:
# plot power spectral densities
plt.figure(2)
plt.psd(data, Fs=200000, label="raw")
plt.psd(data_bp8, Fs=20000, label="order 8")
plt.psd(data_bp2, Fs=20000, label="order 2")
plt.legend()
plt.show()
Here, you see that the green line has sharper edges, but at what price? Artificial peak at approx. 300 Hz. The signal is completely distorted.
So what shall you do?
- Never use butterworth filter of order 8
- Use lower order, if it is sufficient.
- If not, create some FIR filter with the Parks-McGlellan or Remez-Exchange-Algorithms. There is scipy.signal.remez, for example.
Another hint: if you care about the phase of your signal, you should definitely filter forwards and backwards in time. lfilter
does shift the phases otherwise. An implementation of such an algorithm, commonly refered to as filtfilt
, can be found at my github repository.
One more programming hint:
If you have the situation of pass-through parameters (four parameters of butter_bandpass_filter
are only passed through to butter_bandpass
, you could make use of *args
and **kwargs
.
def butter_bandpass(lowcut,highcut,fs,order=8):
nyq = 0.5*fs
low = lowcut/nyq
high = highcut/nyq
b,a = butter(order, [low, high], btype='band')
return b,a
def butter_bandpass_filter(data, *args, **kwargs):
b,a = butter_bandpass(*args, **kwargs)
return lfilter(b,a,data)
This reduces code redundancy and will make your code less error-prone.
Finally, here is the complete script, for easy copy-pasting to try it out.
import numpy as np
from scipy.signal import butter, lfilter
##############################################################
# For generating sample-data
##############################################################
from scipy.integrate import odeint
def roessler_ode(y,t,omega=1,a=0.165,b=0.2,c=10):
dy = np.zeros((3))
dy[0] = -1.0*(omega*y[1] + y[2]) #+ e1*(y[3]-y[0])
dy[1] = omega * y[0] + a * y[1]
dy[2] = b + y[2] * (y[0] - c)
return dy
class Roessler(object):
"""A single coupled Roessler oscillators"""
def __init__(self, y=None, omega=1.0, a=0.165,b=0.2,c=10):
self.omega = omega
self.a = a
self.b = b
self.c = c
if y==None:
self.y = np.random.random((3))+0.5
else:
self.y = y
def ode(self,y,t):
dy = roessler_ode(y[:],t,self.omega,self.a,self.b,self.c)
return dy
def integrate(self,ts):
rv = odeint(self.ode,self.y,ts)
self.y = rv[-1,:]
return rv
###############################################################
def butter_bandpass(lowcut,highcut,fs,order=8):
nyq = 0.5*fs
low = lowcut/nyq
high = highcut/nyq
b,a = butter(order, [low, high], btype='band')
return b,a
def butter_bandpass_filter(data,lowcut,highcut,fs,order=8):
b,a = butter_bandpass(lowcut,highcut,fs,order=order)
return lfilter(b,a,data)
# generate sample data
data = Roessler().integrate(np.arange(0,1000,0.1))[:,2]
data += np.random.normal(size=data.shape)
# time execution
from timeit import timeit
time_order8 = timeit("butter_bandpass_filter(data,300,2000,20000,8)", "from __main__ import butter_bandpass_filter, butter_bandpass, data", number=100)
time_order2 = timeit("butter_bandpass_filter(data,300,2000,20000,2)", "from __main__ import butter_bandpass_filter, butter_bandpass, data", number=100)
print "For order 8: %.2f seconds" % time_order8
print "For order 2: %.2f seconds" % time_order2
data_bp8 = butter_bandpass_filter(data,300,2000,20000,8)
data_bp2 = butter_bandpass_filter(data,300,2000,20000,2)
# plot signals
import matplotlib.pyplot as plt
plt.figure(1)
plt.plot(data, label="raw")
plt.plot(data_bp8, label="order 8")
plt.plot(data_bp2, label="order 2")
plt.legend()
# plot power spectral densities
plt.figure(2)
plt.psd(data, Fs=200000, label="raw")
plt.psd(data_bp8, Fs=20000, label="order 8")
plt.psd(data_bp2, Fs=20000, label="order 2")
plt.legend()
plt.show()
data
would be helpful. This is interesting dsp.stackexchange.com/questions/2864/… – Uncalledfornumpy
to perform the calculations than a pure python implementation.numpy
uses C and Fortran compiled code to implement functions and that will prove to be much much faster (factor of 4 or more). – Ametropiascipy.signal.lfilter
to do the meat of the job. – Accrue