MATLAB - Pitch Shifting an Audio Signal

Asked 19/11, 2013 at 15:59 Answered 19/11, 2013 at 23:36

algorithm matlab signals pitch pitch-shifting

My group is developing a simple MATLAB Graphical User Interface (GUI) that records audio from a microphone (plugged in or built in to the computer) and plays back the signal. So far we have that completed. Our GUI can also load a sample (a .wav file, etc.) and play it back using the same "Play" pushbutton. We have Play, Record, Load, and Save pushbuttons.

Now for the pitch shifting of loaded or recorded samples. We know we need a peak-picking algorithm to find the fundamental frequencies of the signals. We were then thinking that we could multiply each of those values by a constant to shift the pitch of all those frequencies. What we aim to do is use this algorithm and assign the separate shifts to different pushbuttons or radio buttons, so that we can load our sample, press a button to manipulate the pitch, then play it back.

Will using a peak-picking algorithm sufficiently shift the pitch of our signals, or will the signal be distorted during playback?

(THIS IS NOT REAL-TIME PROCESSING)

Cephalonia answered 19/11, 2013 at 15:59 Comment(2)

Pitch shifting can be achieved in a few ways: you can alter the phase of the signal, or you can downsample but continue to play it at a higher sampling rate. The first option will not distort time, but the second will. – Heartache 19/11, 2013 at 16:10

The first method is called phase vocoding. Phase Vocoder – Heartache 19/11, 2013 at 16:14

As mentioned in my comments above, there are really two approaches you can use: a phase vocoder or a higher sampling rate. The first method, the phase vocoder, keeps the signal length the same while shifting its frequency content higher. I am not going to go through the algorithm here, but code is openly available from Columbia University: http://www.ee.columbia.edu/ln/labrosa/matlab/pvoc/

The second method is simply writing the *.wav file with a higher sampling rate.

Say you have a 440 Hz signal that you want to be 880 Hz: simply double the sampling rate.

So instead of, say, wavwrite(signal,fs,'file'), use wavwrite(signal,2*fs,'file').

This, however, will shorten the duration of the audio by the same factor by which you increased the sampling rate.
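A minimal sketch of the sampling-rate trick, assuming a synthetic 440 Hz test tone and an 8 kHz rate (both chosen here purely for illustration). It only computes what a player would hear rather than writing a file. Newer MATLAB releases replace wavwrite with audiowrite, which takes the file name first, so the commented-out write uses that form with a placeholder file name.

```matlab
fs = 8000;                           % assumed original sampling rate (Hz)
f0 = 440;                            % assumed test-tone frequency (Hz)
signal = sin(2*pi*f0*(0:fs-1)/fs);   % one second of tone, row vector

newFs = 2*fs;                        % declare twice the rate; samples are untouched

% Each cycle still spans fs/f0 samples, but the samples now go by twice as
% fast, so the perceived frequency and the duration scale together.
cyclesPerSample = f0/fs;
playedFreq     = cyclesPerSample*newFs;   % frequency heard at the new rate
origDuration   = numel(signal)/fs;        % seconds at the original rate
playedDuration = numel(signal)/newFs;     % seconds at the new rate

fprintf('%g Hz for %g s -> %g Hz for %g s\n', ...
        f0, origDuration, playedFreq, playedDuration);

% To write the file ('shifted.wav' is a placeholder name):
% audiowrite('shifted.wav', signal, newFs);
```

The samples themselves never change here; only the rate stored alongside them does, which is why the pitch and the duration cannot be controlled separately with this method.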

Overall, I think the better and more impressive method is the vocoder. I would not recommend just blindly using the code from Columbia; take the time to understand it, so that you can explain the logic behind it mathematically.

Heartache answered 19/11, 2013 at 16:23 Comment(2)

"code is openly available": Page not found. The requested page could not be found. – Ramires 17/11, 2020 at 23:0

I think it was this one (new url, can't edit the previous comment anymore..): ee.columbia.edu/~dpwe/resources/matlab/pvoc – Ramires 17/11, 2020 at 23:7

Something a bit simpler than the Columbia algorithm (not as high performing, but maybe gives you an appreciation of how it works) would go something like this:

Take the FFT.
Use interp1 to resample the FFT on a finer grid; for example, to shift up by one full tone (two semitones), you could do

x = originalSignal(:).';     % real input, forced to a row vector so the concatenation works
N = numel(x);                % N is assumed to be even
F1 = fft(x);
F1a = F1(1:N/2);             % the lower half of the fft (DC up to just below Nyquist)
t1 = 1:N/2;                  % indices of the lower half
t2 = 1 + (t1-1) / 2^(2/12);  % finer sampling... will make peaks appear higher
F2a = interp1(t1, F1a, t2);  % resample the lower half
F2b = conj(F2a(end:-1:2));   % upper half: mirrored complex conjugate of the lower half
F2 = [F2a 0 F2b];            % put the two together again (Nyquist bin set to zero)
shiftedSignal = real(ifft(F2)); % inverse FFT; real() discards rounding-level imaginary parts

I did not do any windowing etc, so this is "approximate". In reality you want to process little overlapping chunks of data one at a time, rather than the entire file at once. So the above should be considered really "for illustration only", and not working code.
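To illustrate what "little overlapping chunks" could look like, here is a minimal overlap-add sketch. The frame length, the 50% overlap, the periodic Hann window, the test tone, and the processFrame placeholder are all assumptions made for this example. The placeholder returns each frame unchanged, so the sketch only shows the framing and reassembly. A real pitch shifter would modify each frame's spectrum at that step and keep the phase consistent between frames, which is the part a phase vocoder takes care of.

```matlab
fs = 8000;                            % assumed sampling rate (Hz)
x  = sin(2*pi*440*(0:fs-1)/fs);       % assumed test signal, row vector

frameLen = 512;                       % assumed frame length (even)
hop      = frameLen/2;                % 50% overlap between frames
% Periodic Hann window: copies shifted by frameLen/2 sum to exactly 1.
win = 0.5*(1 - cos(2*pi*(0:frameLen-1)/frameLen));

processFrame = @(frame) frame;        % hypothetical placeholder: identity

numFrames = floor((numel(x) - frameLen)/hop) + 1;
y = zeros(1, numel(x));
for k = 1:numFrames
    idx   = (k-1)*hop + (1:frameLen);       % samples covered by frame k
    frame = processFrame(win .* x(idx));    % window, then process
    y(idx) = y(idx) + frame;                % overlap-add into the output
end

% Samples covered by two frames are reconstructed exactly; the first and
% last half-frames are only covered once and stay attenuated.
interior = hop+1 : numFrames*hop;
maxErr = max(abs(y(interior) - x(interior)));
```

With the identity placeholder, maxErr is at rounding level, which confirms that any audible change later comes from the per-frame processing and not from the framing itself.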

Peripteral answered 19/11, 2013 at 17:52 Comment(2)

I have never tried this before but a definite +1 for introducing me to something slightly different. – Heartache 19/11, 2013 at 19:18

I don't like bringing the dead back from the grave, but there might be something wrong with the resampling here. The line that generates the new sample positions for interpolation seems to be inappropriate for pitch shifting by one full tone. It should be: t2 = t1/2^(2/12); – Doak 8/10, 2015 at 3:39

A defining characteristic of a pitch shift is that it changes the pitch without changing the speed of the sound. If you change the sample rate, the speed changes too, and you will need to resample your signal.
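A small sketch of why plain resampling is not enough by itself, assuming a synthetic 440 Hz tone, an 8 kHz rate, and a shift of two semitones (all values picked for illustration). It uses interp1 with linear interpolation and no anti-aliasing filter, so treat it as a demonstration only.

```matlab
fs = 8000;                              % assumed sampling rate (Hz)
x  = sin(2*pi*440*(0:fs-1)/fs);         % assumed one-second test tone
ratio = 2^(2/12);                       % two semitones up

% Read through the signal in steps of 'ratio' samples instead of 1.
tq = 1:ratio:numel(x);                  % fractional read positions
y  = interp1(1:numel(x), x, tq);        % linear interpolation

durIn  = numel(x)/fs;                   % duration before (s)
durOut = numel(y)/fs;                   % duration after, played at the same fs

% Estimate the dominant frequency of the result from its FFT peak.
Y = abs(fft(y));
[~, iPeak] = max(Y(1:floor(numel(y)/2)));
fPeak = (iPeak-1)*fs/numel(y);

fprintf('pitch: 440 Hz -> about %.1f Hz, duration: %.3f s -> %.3f s\n', ...
        fPeak, durIn, durOut);
```

The pitch rises by the requested ratio, but the duration shrinks by the same ratio. A true pitch shifter has to combine this kind of resampling with a time stretch that restores the original length.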

If your microphone input is always monophonic, you should consider the PSOLA method. It works in the time domain, and you can get nice results on voice signals.

Sacttler answered 19/11, 2013 at 23:36 Comment(0)
