Why do the convolution results have different lengths when performed in time domain vs in frequency domain?
I'm not a DSP expert, but I understand that there are two ways I can apply a discrete time-domain filter to a discrete time-domain waveform. The first is to convolve them in the time domain, and the second is to take the FFT of both, multiply the two complex spectra, and take the IFFT of the result. One key difference between these methods is that the second approach is subject to circular convolution.

As an example, if the filter and waveform are both N points long, the first approach (i.e. convolution) produces a result that is N+N-1 = 2N-1 points long, where the first half of this response is the filter filling up and the second half is the filter emptying. To get a steady-state response, the filter needs to have fewer points than the waveform being filtered.

Continuing this example with the second approach, and assuming the discrete time-domain waveform data is all real (not complex), the FFTs of the filter and the waveform are both N points long. Multiplying the two spectra and IFFT'ing the result produces a time-domain result that is also N points long. Here the portions of the response where the filter fills up and empties overlap each other in the time domain, and there is no steady-state response. This is the effect of circular convolution. To avoid this, typically the filter would be made smaller than the waveform and both would be zero-padded to allow space for the circular convolution to expand in time after the IFFT of the product of the two spectra.
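As a quick sanity check of the zero-padding point (an illustrative NumPy sketch of my own, not from any reference):

import numpy as np

N = 64
x = np.random.randn(N)                  # waveform, N points
h = np.random.randn(N)                  # filter, N points

lin = np.convolve(x, h)                 # linear (time-domain) convolution, 2N-1 points

# Unpadded N-point FFTs -> circular convolution, only N points, wrapped around
circ = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))

# Zero-padded to 2N points -> the FFT product now matches the linear convolution
pad = np.real(np.fft.ifft(np.fft.fft(x, 2*N) * np.fft.fft(h, 2*N)))[:2*N - 1]

print(circ.shape, lin.shape)            # (64,) vs (127,)
print(np.allclose(pad, lin))            # True: padding recovers the linear result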

My question is: I often see work in the literature from well-established experts/companies where they have a discrete (real) time-domain waveform (N points), they FFT it, multiply it by some filter (also N points), and IFFT the result for subsequent processing. My naive thinking is that this result should contain no steady-state response and should thus contain artifacts from the filter filling/emptying that would lead to errors in interpreting the resulting data, but I must be missing something. Under what circumstances can this be a valid approach?

Any insight would be greatly appreciated.

Interlinear answered 2/10, 2010 at 20:1 Comment(2)
Here is one such example: pcisig.com/specifications/pciexpress/technical_library/… See Figure 20 (blue curve is before and red curve is after filtering FFT spectrum), and Figure 21 is IFFT of red curve in Figure 20.Interlinear
Please consider a more specific title. I found it necessary to read all 4 paragraphs to get a vague idea of what you are asking, and even then it's not completely clear.Increase
The basic problem is not about zero padding vs the assumed periodicity, but that Fourier analysis decomposes the signal into sine waves which, at the most basic level, are assumed to be infinite in extent. Both approaches are correct in that the IFFT using the full FFT will return the exact input waveform, and both approaches are incorrect in that using less than the full spectrum can lead to effects at the edges (that usually extend a few wavelengths). The only difference is in the details of what you assume fills in the rest of infinity, not in whether you are making an assumption.

Back to your first paragraph: Usually, in DSP, the biggest problem I run into with FFTs is that they are non-causal, and for this reason I often prefer to stay in the time domain, using, for example, FIR and IIR filters.

Update:

In the question statement, the OP correctly points out some of the problems that can arise when using FFTs to filter signals, for example edge effects, which can be particularly problematic when the convolution kernel is comparable in length (in the time domain) to the sampled waveform. It's important to note, though, that not all filtering is done using FFTs; in the paper cited by the OP they are not using FFT filters, and the problems that would arise with an FFT filter implementation do not arise using their approach.

Consider, for example, a filter that implements a simple average over 128 sample points, using two different implementations.

FFT: In the FFT/convolution approach one would have a sample of, say, 256 points and convolve it with a kernel waveform that is constant for the first half and zero for the second half. The question here is (even after this system has run a few cycles): what determines the value of the first point of the result? The FFT assumes that the waveform is circular (i.e. infinitely periodic), so either the first point of the result is determined by the last 127 (i.e. future) samples of the waveform (skipping over the middle of the waveform), or by 127 zeros if you zero-pad. Neither is correct.
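To make the wrap-around concrete, here's a small NumPy sketch (my own illustration of the point above, not from the paper): with no zero padding, the first output point of the circular result mixes in samples from the far end of the record.

import numpy as np

L = 256
kernel = np.zeros(L)
kernel[:128] = 1.0 / 128          # flat 128-point averaging kernel, zeros for the rest
x = np.arange(L, dtype=float)     # any waveform; a ramp makes the wrap-around obvious

# circular convolution via 256-point FFTs (no zero padding)
y_circ = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(kernel)))

# the first output point averages x[0] with the *last* 127 samples of the record
wrap = (x[0] + x[-127:].sum()) / 128
print(np.isclose(y_circ[0], wrap))   # True: the far end of the waveform leaks in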

FIR: Another approach is to implement the average with an FIR filter. For example, here one could use the average of the values in a 128 register FIFO queue. That is, as each sample point comes in, 1) put it in the queue, 2) dequeue the oldest item, 3) average all of the 128 items remaining in the queue; and this is your result for this sample point. This approach runs continuously, handling one point at a time, and returning the filtered result after each sample, and has none of the problems that occur from the FFT as it's applied to finite sample chunks. Each result is just the average of the current sample and the 127 samples that came before it.
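A minimal sketch of that FIFO averager (illustrative only; a real implementation would keep a running sum rather than re-averaging the queue on every sample):

from collections import deque

def moving_average_fir(samples, ntaps=128):
    """Streaming, causal 128-point average: one output per input sample."""
    queue = deque([0.0] * ntaps, maxlen=ntaps)   # FIFO holding the last ntaps samples
    out = []
    for x in samples:
        queue.append(x)                          # enqueue new sample; oldest drops out
        out.append(sum(queue) / ntaps)           # average of this sample and the 127 before it
    return out

Each output depends only on past samples, so there is no circular wrap-around, only the ordinary start-up transient while the queue fills with real data.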

The paper that the OP cites takes an approach much more similar to the FIR filter than to the FFT filter (note, though, that the filter in the paper is more complicated, and the whole paper is basically an analysis of this filter). See, for example, this free book, which describes how to analyze and apply different filters, and note also that the Laplace approach to analyzing FIR and IIR filters is quite similar to what's found in the cited paper.

Cerveny answered 3/10, 2010 at 18:23 Comment(28)
Thanks tom10, "(that usually extend a few wavelengths)." This is what I observe on the few test cases I've investigated. That is, the middle 1/3'rd of the waveform appears perfectly stable after IFFT of the product of the two spectrums (e.g. freq conv approach). Is there any theory/math that can show the dividing line where the steady-state response becomes perfectly valid in this approach? That is, I'm looking for a way to predict based on theory what region of the filtered time-domain waveform (e.g. the result from IFFT of the product of the filter FFT and the waveform FFT) is stable.Interlinear
sigh... now I have to get out a book... to quote from Numerical Recipes: "To summarize — we need to pad the data with a number of zeros on one end equal to the maximum positive duration or maximum negative duration of the response function, whichever is larger. (For a symmetric response function of duration M, you will need only M/2 zero pads.) Combining this operation with the padding of the response rk described above, we effectively insulate the data from artifacts of undesired periodicity." see hebb.mit.edu/courses/9.29/2002/readings/c13-1.pdf around Fig. 13.1.14Cerveny
At least this answers the specific question about convolutions, though not the general consequences of assumed periodicity of FFTs in general. As I recall, there is some theory for that too, but I don't recall it off-hand.Cerveny
Thanks tom10. However, I probably didn't communicate the problem correctly above. Note, the FILTER length EQUALS the waveform length, both N points long (no zero padding in either array). The link above has the filter length < the waveform length. In my case, where both filter and waveform are N points long, I don't know how 0-padding can help... one could add N zero points to both arrays, FFT both arrays, and take the IFFT of the result to end up with 2N-1 time-response, but even here there is no steady-state response (the 1st N pts are filter filling, and last N pts are filter emptying).Interlinear
As I stated above, my preliminary results show if I have a filter N points long (no zero padding) and a waveform N points long (no zero padding), take the FFT of both, multiply the complex spectrums, and take IFFT, then look at this time-domain result, the middle 1/3rd (approximately) of the time-domain waveform appears perfectly stable (matches expected result exactly). I would have expected the whole time-domain waveform to be the filter filling up, and 100% overlapped with the filter emptying. So it's nice to see some of the array being stable, but I don't know any theory why this occurs.Interlinear
It's a bit unusual to have a filter convolution kernel that's as long as your sample length. Are you sure that what you're doing makes sense, for example, maybe you're trying to filter frequencies that are too long wavelength to be represented in your (too short) sample?Cerveny
But I think if your convolution kernel is this long, you can't guarantee that your waveform will look "right" in the center, since the convolution can drag edge effects into the center. E.g., it's possible to construct a kernel so that the result's center is only edge effects. Also, I can't find a discussion that answers this question spot-on, but there are numerous discussions that talk about windowing and circular convolutions, etc., and these are the right discussions to read to understand the issues. But with such long kernels, the rules of what will work won't be simple.Cerveny
Hi tom10, yes this may be 'unusual' in mainstream DSP circles, but commonly practiced in certain industry circles (here is one such example: pcisig.com/specifications/pciexpress/technical_library/… See Figure 20 (blue curve is before and red curve is after filtering FFT spectrum), and Figure 21 is IFFT of red curve in Figure 20). Here, the filter length is the same length as the signal length, since the filter is DERIVED from the signal. Textbook theory says the filtered response will not contain any steady state signal, yet well-established people/companies are using such results every day.Interlinear
One can, for example, think of a sample average as a convolution with a flat constant kernel that spans the sample, and this may be a reasonable thing to do. But if the kernel is as long as the sample, you are likely to have edge effects, and the details of these, from dramatic to zero, will depend on the details of your kernel in a way that is complicated even to specify, as it combines the time and frequency domains. There's not, as far as I know, a "rule of thumb" for this; you'd need to understand the details of the signals involved, case by case, what's the windowing, etc...Cerveny
(Also, btw, I didn't read the 56 page manual you linked to. I might have one more round on this left in me, but you'd need to link to something a bit more focused.)Cerveny
Thanks Tom, I've really appreciated your feedback here. Gosh, no need to read the PCI express document I linked to -- I only intended for people to look at figures 20 and 21 which show (in Fig 20) the freq spectrum of the signal and the filter (both N pts long), and (in Fig 21) the IFFT of the product of the two (also N pts long).Interlinear
I'll keep searching, thanks. Will let you know if I can dig anything else up that's interesting.Interlinear
I see, you did say Fig 20 and 21 very clearly. My mistake, and I apologize.Cerveny
Looking at these figs though, I think you may be misinterpreting this. My guess here is that they aren't doing what you say, but instead they are applying a causal filter to the data, and the figures you see are only an analysis of this filter. That is, the filter is not applied by convolution with a kernel; instead it's more like, in real time, it takes in a value and puts out a different value. In doing this it uses the values that came before it, but not after it (more like my answer's 2nd paragraph recommends), so no problem.Cerveny
For a simplified example of what they are doing, consider an RC circuit. It's reasonable to talk about it as a filter, put in one spectrum, get out another, and to plot the data using FFTs (like fig 20). But to analyze it, you don't want to think of convolving a filter but you'd do more of a transfer function/Laplace analysis, like Eq 4.3 and 4.4; that is, just like for an RC, e.g. en.wikipedia.org/wiki/RC_circuit#Series_circuit.Cerveny
Actually, Appendix A in this reference provides the entire Matlab code for Figures 20 and 21. More specifically, PDF page 51 shows where the code multiplies the complex FFT spectrums for the filter, HFinal(i), and the signal, L(i), storing the results in array M(i). See the section of code on page 51 below comment "% apply transfer function Ht to data set L and store in M." As you can see from the code deriving both L and HFinal, no zero-padding is used.Interlinear
Yes, so there are two transforms going on here. The first is in how the system works, and we agree, I guess that this is fine... real time and causal no end-point issues. The second is in an after-the-fact analysis, and that's where the Matlab code comes into play. There things are much easier because you can usually use very long samples, which is what the author does. The author doesn't specify the important point for your question, but I'm sure if you calculate it, it won't be a problem, mostly because the author says to use this method you must use 150K data points...Cerveny
To work this through completely, just calculate the impulse response fcn from the transfer fcn, and I highly suspect that it'll be short relative to the sample length. I say this because all timescales in the system are ps to 10 ns at the most, so the impulse response will at the longest be what, maybe 10 ns? But let's give ourselves some margin and say 1000 ns. The author recommends 150K pts sampled at 100MHz, or 1.5 ms, which is more than 1000x the exaggerated impulse response, and the edge effects wouldn't be visible on the plot.Cerveny
It's been in the back of my mind here, so I just want to be sure that we're clear on this... what matters is the length of the convolution waveform in the time domain, right? (In the freq domain, it must be the same as the input waveform.) What makes you think that this waveform will be long in the time domain?Cerveny
Hi tom10, I'm just trying to understand which portion of the time-domain waveform is usable steady-state result (I want to truncate the transient response, for example, so that it's not used for subsequent analysis). I don't have a feeling if the impulse response function will be long (or short) in the time domain. I think your analysis of timescales is getting me on the right track (that the edge-effect transient is in the output but it's just very short). Can you baby-step me through your thinking process again, leading to "the edge effects wouldn't be visible on the plot"? Thanks so much!Interlinear
Let's say I start in the frequency domain with N-pt complex FFT coefficients of a real time-domain signal and N-pt complex FFT coefficients of a filter (neither signal nor filter have 0-padding), I multiply the two and take an IFFT. How would one know where the transient portion of the IFFT time-domain waveform would be, versus the steady-state output? For example, could I just IFFT the filter alone and analyze the IFFT(filter) time-domain response to somehow identify the length of the transient response that should be expected?Interlinear
Yes, your last sentence is the exact idea. The IFFT of the transfer function (filter) is the impulse response function. If you IFFT that (starting with the full N-pt frequency space representation), you should see the impulse response fcn as a short blip with zeros everywhere else.Cerveny
So, the non-zero duration of IFFT(filter) is the time duration I should expect the transient response to consume in the time-domain response obtained by IFFT(fft_filter x fft_signal), is that right? Will this transient response exist at BOTH the start and end of the time sequence obtained by IFFT(fft_filter x fft_signal), so that I should truncate both ends by the same duration?Interlinear
I suppose a general rule would be sharp changes in the freq-domain filter would take a long time to play out in the time-domain impulse response (and vice versa), whereas if the freq-domain filter changes slowly, it'll consume a very short duration in the time domain.Interlinear
Yes (on the "non-zero duration of..." point). And the transient response will affect both ends (see the NR section I linked to above for the details). I think the paragraph immediately above is less correct, in that it assumes "necessary" is "sufficient"... not all slow changing filters will have short time domain wfms... but anyway, this is tangential to what's been a very long discussion.Cerveny
Also, when you do plot the impulse response function, please say what the timescale is.Cerveny
Thanks tom10, the linked paper is an example, so I don't have data for the timescale of the impulse response function. I'm working with the same concepts however. Thanks so much for your input here. I have a better understanding now. I was originally confused why convolving an N pt filter with an N pt signal in time domain created 2N-1 pt response, whereas taking FFTs of each, multiplying in freq domain, and IFFTing the result produced N pt time response, and why this was different than 2N-1.Interlinear
... I know freq convolution is the reason why, but I felt, as in time conv, that freq conv should also produce a similar (N pt) transient at the start and end where the filter fills up and empties. From our discussions it seems that was my incorrect thinking! If I want to get the same time conv response using freq conv, I can always zero pad the filter and signal to 2N total length (then FFT each, multiply, and IFFT). But the response from FFTing an N pt filter and N pt signal, multiplying them and IFFTing the result only has a transient determined by the impulse response of the filter. Thanks!!!Interlinear
Here's an example of DFT convolution without zero padding (circular convolution) vs linear convolution. This is the convolution of a length M=32 sequence with a length L=128 sequence (using NumPy/Matplotlib):

from numpy import convolve, real
from numpy.random import rand
from numpy.fft import fft, ifft
from matplotlib.pyplot import plot, grid, show

f = rand(32); g = rand(128)
h1 = convolve(f, g)                  # linear convolution: L+M-1 = 159 points
h2 = real(ifft(fft(f, 128)*fft(g)))  # circular convolution via 128-point FFTs
plot(h1); plot(h2,'r')
grid(); show()

(plot: h1 and h2 overlaid)

The first M-1 points are different, and the circular result is short by M-1 points since it wasn't zero padded. These differences are a problem if you're doing block convolution, but techniques such as overlap-save or overlap-add are used to overcome them. Otherwise, if you're just computing a one-off filtering operation, the valid result starts at index M-1 and ends at index L-1, with a length of L-M+1.
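For reference, a rough overlap-save sketch (my own illustration, not from the cited paper; in practice something like scipy.signal.fftconvolve or oaconvolve would normally be used instead):

import numpy as np

def overlap_save(x, h, nfft=256):
    """Block convolution by overlap-save: the first M-1 (wrapped) outputs of each block are discarded."""
    M = len(h)
    step = nfft - (M - 1)                         # valid output samples per block
    H = np.fft.fft(h, nfft)
    x_pad = np.concatenate([np.zeros(M - 1), x])  # prime the overlap with zeros
    out = []
    for start in range(0, len(x), step):
        block = x_pad[start:start + nfft]
        block = np.concatenate([block, np.zeros(nfft - len(block))])   # pad the last block
        y = np.real(np.fft.ifft(np.fft.fft(block) * H))
        out.append(y[M - 1:])                     # keep only the uncorrupted samples
    return np.concatenate(out)[:len(x)]

x, h = np.random.rand(1000), np.random.rand(32)
print(np.allclose(overlap_save(x, h), np.convolve(x, h)[:len(x)]))   # True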

As to the paper cited, I looked at their MATLAB code in appendix A. I think they made a mistake in applying the Hfinal transfer function to the negative frequencies without first conjugating it. Otherwise, you can see in their graphs that the clock jitter is a periodic signal, so using circular convolution is fine for a steady-state analysis.

Edit: Regarding conjugating the transfer function, the PLLs have a real-valued impulse response, and every real-valued signal has a conjugate symmetric spectrum. In the code you can see that they're just using Hfinal[N-i] to get the negative frequencies without taking the conjugate. I've plotted their transfer function from -50 MHz to 50 MHz:

from numpy import arange, pi, real, imag
from matplotlib.pyplot import subplot, plot, ylabel, show

N = 150000                    # number of samples; need >50k to get a good spectrum
res = 100e6/N                 # resolution of a single frequency point
f = res * arange(-N/2, N/2)   # frequency sweep [-50MHz, 50MHz), N points
s = 2j*pi*f                   # s = j*2*pi*f for evaluating the transfer functions

f1 = 22e6       # define 3dB corner frequency for H1 
zeta1 = 0.54    # define peaking for H1 
f2 = 7e6        # define 3dB corner frequency for H2 
zeta2 = 0.54    # define peaking for H2    
f3 = 1.0e6      # define 3dB corner frequency for H3 

# w1 = natural frequency   
w1 = 2*pi*f1/((1 + 2*zeta1**2 + ((1 + 2*zeta1**2)**2 + 1)**0.5)**0.5)  
# H1 transfer function 
H1 = ((2*zeta1*w1*s + w1**2)/(s**2 + 2*zeta1*w1*s + w1**2))            

# w2 = natural frequency 
w2 = 2*pi*f2/((1 + 2*zeta2**2 + ((1 + 2*zeta2**2)**2 + 1)**0.5)**0.5)  
# H2 transfer function  
H2 = ((2*zeta2*w2*s + w2**2)/(s**2 + 2*zeta2*w2*s + w2**2))            

w3 = 2*pi*f3        # w3 = 3dB point for a single pole high pass function. 
H3 = s/(s+w3)       # the H3 xfer function is a high pass

Ht = 2*(H1-H2)*H3   # Final transfer based on the difference functions

subplot(311); plot(f, abs(Ht)); ylabel("abs")
subplot(312); plot(f, real(Ht)); ylabel("real")
subplot(313); plot(f, imag(Ht)); ylabel("imag")
show()

(plot: magnitude, real, and imaginary parts of Ht from -50 MHz to 50 MHz)

As you can see, the real component has even symmetry and the imaginary component has odd symmetry. In their code they only calculated the positive frequencies for a loglog plot (reasonable enough). However, for the inverse transform they reused the positive-frequency values for the negative frequencies by indexing Hfinal[N-i], but forgot to conjugate them.
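A minimal sketch of the fix (variable names are my own, purely illustrative): fill the negative-frequency bins with the conjugate of the mirrored positive-frequency bins before taking the IFFT, so the impulse response comes out real.

import numpy as np

def apply_real_filter(X, H_pos):
    """Multiply an N-point spectrum X by a transfer function given only for
    bins 0..N/2; the negative-frequency bins are filled by conjugate symmetry.
    The DC and Nyquist entries of H_pos should be real."""
    N = len(X)                                   # N assumed even, len(H_pos) == N//2 + 1
    H = np.empty(N, dtype=complex)
    H[:N//2 + 1] = H_pos                         # bins 0 .. N/2 (DC .. Nyquist)
    H[N//2 + 1:] = np.conj(H_pos[1:N//2][::-1])  # bins N/2+1 .. N-1 = conj of mirrored positives
    return np.fft.ifft(X * H)                    # imaginary part ~ 0 when X comes from real data

Equivalently, NumPy's rfft/irfft pair handles the symmetry automatically: np.fft.irfft(np.fft.rfft(x) * H_pos) gives the real filtered waveform directly.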

Mayramays answered 19/11, 2010 at 9:6 Comment(5)
Thanks eryksun, for your nice plot and comments. I agree with the first paragraph. Regarding the second paragraph, you may be right (I'm not a DSP expert) -- would you have any online or other references where I could learn more about how to apply a transfer function to an FFT spectrum?Interlinear
Also, I'm not sure which graphs you refer to above. The graphs before p.14 use a simple sinusoidal model for illustration only. The graph on p.27 is a snapshot in time, where the start and end times are arbitrarily chosen (note the curve starts and ends on different values). This routine is applied to real analog signals, where nothing is truly periodic. Period jitter is the difference in each clock period from the average clock period. Another way to think about it: a real analog signal always has some random noise on its clock edges, preventing it from being periodic.Interlinear
Thanks eryksun, with your analysis above, I tend to agree with you that they forgot the conjugate.Interlinear
Regarding the 33kHz component, it's introduced by spread-spectrum technology, and this is optional to the standard. Thus, the code can't depend on a periodic signal.Interlinear
Regarding the Laplace xfr function, the standard provides the required xfr function written in Laplace form. I think your approach sounds reasonable though. Wouldn't both end up producing the same result (or where do errors get introduced by sampling the Laplace xfr function, and why avoid it)?Interlinear
I can shed some light on why "windowing" is applied before the FFT.

As already pointed out, the FFT assumes that we have an infinite signal. When we take a sample over a finite time T, this is mathematically equivalent to multiplying the signal by a rectangular window.

Multiplying in the time domain becomes convolution in the frequency domain. The frequency response of a rectangular window is the sinc function, i.e. sin(x)/x. The x in the denominator is the kicker, because the sidelobes die down only as O(1/x).

If you have frequency components that are exact multiples of 1/T this does not matter, as the sinc function is zero at every other bin frequency and 1 at that frequency.

However, if you have a sine that falls between two bins, you will see the sinc function sampled at the bin frequencies. It looks like a magnified version of the sinc function, and the 'ghost' signals caused by the convolution die down only as 1/x, or 6 dB/octave. If you have a signal 60 dB above the noise floor, you will not see the noise for 1000 frequencies left and right of your main signal; it will be swamped by the "skirts" of the sinc function.

If you use a different time window you get a different frequency response; a cosine window, for example, dies down with 1/x^2, and there are specialized windows for different measurements. The Hann window (often called Hanning) is a common general-purpose choice.

The point is that the rectangular window you get when not applying any "windowing function" creates far worse artefacts than a well chosen window, i.e. by "distorting" the time samples we get a much better picture in the frequency domain, one which more closely resembles "reality", or rather the "reality" we expect and want to see.
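A quick way to see this (an illustrative sketch of my own, not tied to any particular measurement): compare the spectrum of an off-bin sine with a rectangular window versus a Hann window.

import numpy as np
import matplotlib.pyplot as plt

N = 1024
n = np.arange(N)
x = np.sin(2 * np.pi * 10.5 * n / N)        # 10.5 cycles: falls between two FFT bins

rect = np.abs(np.fft.rfft(x))               # rectangular window (i.e. no window at all)
hann = np.abs(np.fft.rfft(x * np.hanning(N)))

plt.semilogy(rect, label='rectangular')     # broad skirts falling off ~6 dB/octave
plt.semilogy(hann, label='Hann')            # sidelobes drop much faster
plt.xlabel('FFT bin'); plt.legend(); plt.show()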

Leucoma answered 4/10, 2010 at 22:3 Comment(1)
Do you mean sinc function and Hamming or Hann window?Increase
Although there will be artifacts from assuming that a rectangular window of data is periodic at the FFT aperture width, which is one interpretation of what circular convolution does without sufficient zero padding, the differences may or may not be large enough to swamp the data analysis in question.

Passivism answered 3/10, 2010 at 16:29 Comment(2)
Thanks hotpaw2, it's certainly possible that some data doesn't get affected much, but this practice seems so prevalent that I have a hard time believing the data is never affected. Typically the literature will use windowing to reduce leakage (e.g. when the samplings are not perfectly periodic in the window of captured data). However, the issue of the filter filling/emptying causing a transient that leads to additional artifacts is never discussed, which makes me wonder what I'm missing.Interlinear
The filter can be assumed to be filled by all the periodic images. It may be a common assumption that the tail of the filter response from the previous images is either small enough, or overlaps exactly as if the data really were periodic with the FFT aperture being an integer multiple of the period (sometimes it is).Passivism
