Python NumPy - FFT and Inverse FFT?
Asked Answered
R

4

7

I've been working with FFT, and I'm currently trying to get a sound waveform from a file with FFT, (modify it eventually), but then output that modified waveform back to a file. I've gotten the FFT of the soundwave and then used an inverse FFT function on it, but the output file doesn't sound right at all. I haven't done any filtering on the waveform - I'm just testing out getting the frequency data and then putting it back into a file - it should sound the same, but it sounds wildly different.

I have since been working on this project a bit, but haven't yet gotten desired results. The outputted sound file is noisy (both more loud, as well as extra noise that wasn't present in the original file), and sound from one channel leaked into the other channel (which was previously silent). The input sound file is a stereo, 2-channel file with sound only coming from one channel. Here's my code:

import scipy
import wave
import struct
import numpy
import pylab

from scipy.io import wavfile

rate, data = wavfile.read('./TriLeftChannel.wav')

filtereddata = numpy.fft.rfft(data, axis=0)
print(data)

filteredwrite = numpy.fft.irfft(filtereddata, axis=0)
print(filteredwrite)

wavfile.write('TestFiltered.wav', rate, filteredwrite)

I don't quite see why this doesn't work.

I've zipped up the problem .py file and audio file, if that can help solve the issue here.

Rematch answered 19/4, 2012 at 6:36 Comment(3)
try adding filteredwrite = numpy.round(filteredwrite).astype('int16') before you saveAdjacent
@Bago - Thanks a lot! That totally fixed the problem. I was wondering, does forcing the filtered ifft to 'int16' mean that it will be a 16-bit depth sound file?Rematch
I don't know too much about wav files, I always assumed they were raw, uncompressed data, but you'll have to read up on the wav format specs to know for sure.Adjacent
A
5
>>> import numpy as np
>>> a = np.vstack([np.ones(11), np.arange(11)])

# We have two channels along axis 0, the signals are along axis 1
>>> a
array([[  1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.],
       [  0.,   1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.]])
>>> np.fft.irfft(np.fft.rfft(a, axis=1), axis=1)
array([[  1.1       ,   1.1       ,   1.1       ,   1.1       ,
          1.1       ,   1.1       ,   1.1       ,   1.1       ,
          1.1       ,   1.1       ],
       [  0.55      ,   1.01836542,   2.51904294,   3.57565618,
          4.86463721,   6.05      ,   7.23536279,   8.52434382,
          9.58095706,  11.08163458]])
# irfft returns an even number along axis=1, even though a was (2, 11)

# When a is even along axis 1, we get a back after the irfft.
>>> a = np.vstack([np.ones(10), np.arange(10)])
>>> np.fft.irfft(np.fft.rfft(a, axis=1), axis=1)
array([[  1.00000000e+00,   1.00000000e+00,   1.00000000e+00,
          1.00000000e+00,   1.00000000e+00,   1.00000000e+00,
          1.00000000e+00,   1.00000000e+00,   1.00000000e+00,
          1.00000000e+00],
       [  7.10542736e-16,   1.00000000e+00,   2.00000000e+00,
          3.00000000e+00,   4.00000000e+00,   5.00000000e+00,
          6.00000000e+00,   7.00000000e+00,   8.00000000e+00,
          9.00000000e+00]])

# It seems like you signals are along axis 0, here is an example where the signals are on axis 0
>>> a = np.vstack([np.ones(10), np.arange(10)]).T
>>> a
array([[ 1.,  0.],
       [ 1.,  1.],
       [ 1.,  2.],
       [ 1.,  3.],
       [ 1.,  4.],
       [ 1.,  5.],
       [ 1.,  6.],
       [ 1.,  7.],
       [ 1.,  8.],
       [ 1.,  9.]])
>>> np.fft.irfft(np.fft.rfft(a, axis=0), axis=0)
array([[  1.00000000e+00,   7.10542736e-16],
       [  1.00000000e+00,   1.00000000e+00],
       [  1.00000000e+00,   2.00000000e+00],
       [  1.00000000e+00,   3.00000000e+00],
       [  1.00000000e+00,   4.00000000e+00],
       [  1.00000000e+00,   5.00000000e+00],
       [  1.00000000e+00,   6.00000000e+00],
       [  1.00000000e+00,   7.00000000e+00],
       [  1.00000000e+00,   8.00000000e+00],
       [  1.00000000e+00,   9.00000000e+00]])
Adjacent answered 25/4, 2012 at 1:37 Comment(0)
M
7
  1. You don't appear to be applying any filter here
  2. You probably want to take the ifft of the fft (post-filtering), not of the input waveform.
Mcclinton answered 19/4, 2012 at 6:39 Comment(0)
I
5

Shouldn't it be more like this?

filtereddata = numpy.fft.fft(data)
# do fft stuff to filtereddata
filteredwrite = numpy.fft.ifft(filtereddata)
wavfile.write('TestFiltered.wav', rate, filteredwrite)
Impecunious answered 19/4, 2012 at 6:49 Comment(1)
@Mcclinton - Sorry about that - edited my original post to have more info.Rematch
A
5
>>> import numpy as np
>>> a = np.vstack([np.ones(11), np.arange(11)])

# We have two channels along axis 0, the signals are along axis 1
>>> a
array([[  1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.],
       [  0.,   1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.]])
>>> np.fft.irfft(np.fft.rfft(a, axis=1), axis=1)
array([[  1.1       ,   1.1       ,   1.1       ,   1.1       ,
          1.1       ,   1.1       ,   1.1       ,   1.1       ,
          1.1       ,   1.1       ],
       [  0.55      ,   1.01836542,   2.51904294,   3.57565618,
          4.86463721,   6.05      ,   7.23536279,   8.52434382,
          9.58095706,  11.08163458]])
# irfft returns an even number along axis=1, even though a was (2, 11)

# When a is even along axis 1, we get a back after the irfft.
>>> a = np.vstack([np.ones(10), np.arange(10)])
>>> np.fft.irfft(np.fft.rfft(a, axis=1), axis=1)
array([[  1.00000000e+00,   1.00000000e+00,   1.00000000e+00,
          1.00000000e+00,   1.00000000e+00,   1.00000000e+00,
          1.00000000e+00,   1.00000000e+00,   1.00000000e+00,
          1.00000000e+00],
       [  7.10542736e-16,   1.00000000e+00,   2.00000000e+00,
          3.00000000e+00,   4.00000000e+00,   5.00000000e+00,
          6.00000000e+00,   7.00000000e+00,   8.00000000e+00,
          9.00000000e+00]])

# It seems like you signals are along axis 0, here is an example where the signals are on axis 0
>>> a = np.vstack([np.ones(10), np.arange(10)]).T
>>> a
array([[ 1.,  0.],
       [ 1.,  1.],
       [ 1.,  2.],
       [ 1.,  3.],
       [ 1.,  4.],
       [ 1.,  5.],
       [ 1.,  6.],
       [ 1.,  7.],
       [ 1.,  8.],
       [ 1.,  9.]])
>>> np.fft.irfft(np.fft.rfft(a, axis=0), axis=0)
array([[  1.00000000e+00,   7.10542736e-16],
       [  1.00000000e+00,   1.00000000e+00],
       [  1.00000000e+00,   2.00000000e+00],
       [  1.00000000e+00,   3.00000000e+00],
       [  1.00000000e+00,   4.00000000e+00],
       [  1.00000000e+00,   5.00000000e+00],
       [  1.00000000e+00,   6.00000000e+00],
       [  1.00000000e+00,   7.00000000e+00],
       [  1.00000000e+00,   8.00000000e+00],
       [  1.00000000e+00,   9.00000000e+00]])
Adjacent answered 25/4, 2012 at 1:37 Comment(0)
B
2

Two problems.

You are FFTing 2 channel data. You should only FFT 1 channel of mono data for the FFT results to make ordinary sense. If you want to process 2 channels of stereo data, you should IFFT(FFT()) each channel separately.

You are using a real fft, which throws away information, and thus makes the fft non-invertible.

If you want to invert, you will need to use an FFT which produces a complex result, and then IFFT this complex frequency domain vector back to the time domain. If you modify the frequency domain vector, make sure it stays conjugate symmetric if you want a strictly real result (minus numerical noise).

Bern answered 19/4, 2012 at 22:50 Comment(6)
you can fft multi-channel data, you just need to use a 2d array and make sure the axis keyword is set correctly (-1 by default), and irfft(rfft(n)) should return n (within machine precision).Adjacent
* irfft(rfft(n)) seems to be best behaved if n.shape[axis] is even.Adjacent
@Bago - Sorry about taking so long to go about this, but could you expand a bit on what you mean? What do you mean by 'use a 2d array'? You mean a NumPy array, right?Rematch
@Bago - I think I'm doing it correctly, but it's difficult to tell. The array is being read in with wavfile.read('./TriLeftChannel.wav'), but the shape is (x, 2), with x being a high number of samples. So, it's already a 2 dimensional array. I'm specifying the axis when I use FFT and IFFT, but it didn't change the output...Rematch
@SolarLune, In cases like this a short example would really help. I think you'll find that if you give enough info on SO to help others reproduce your problem you'll get a lot more feedback.Adjacent
@Bago - You're right. I've posted my problem code above, and made a zip with the .py file and my audio file, in case that's helpful.Rematch

© 2022 - 2024 — McMap. All rights reserved.