Incrementally / gradually change pitch of signal over time using Octave / MATLAB code

clear, clc
[ya, fs, nbitsraw] = wavread('/tmp/original_signal.wav');

num_per_sec=2.4; %// Define total number of times we see the signal

%// Get total number of integer times we see the signal
num_whole = floor(num_per_sec);

%// Replicate signal
yb=repmat(ya,num_whole,1);

%// Determine how many samples the partial signal consists of
portion = floor((num_per_sec - num_whole)*length(ya));

%// Sample from the original signal and stack this on top of replicated signal
yb = [yb; ya(1:portion)];

%interpolation
xxo=linspace(0,1,length(yb))';
xxi=linspace(0,1,length(ya))';
yi_t=interp1(xxo,yb,xxi,'linear');

wavwrite([yi_t'] ,fs,16,strcat('/tmp/processed_signal.wav')); % export file

My answer doesn't give exactly the same result as the one you posted, but I think it's interesting and simple enough to show you the important concepts behind pitch stretching. I haven't found the method I'm proposing elsewhere on the web, but I can't imagine that no one has thought of it before, so it may well have a name already.

The first thing to realise is that if you want to transform the pitch over time, rather than offset it by the same amount over the entire timecourse, you need to work with pitch "features" that are defined at each time-point (e.g. time-frequency transforms), as opposed to ones that summarise the entire signal content (e.g. the Fourier transform).

This matters because it makes clear that we need quantities like the instantaneous frequency of your signal, which is defined as the derivative of the Hilbert phase (typically taken as (1/(2*pi)) * dPhi/dt, to work in Hz instead of rad/s).
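To make that definition concrete, here is a minimal sketch that estimates the instantaneous frequency of a test tone. The sampling rate and the 440 Hz tone are assumptions chosen for illustration only, and the derivative is approximated with a finite difference.

```octave
% hilbert needs the signal package in Octave (Signal Processing Toolbox in MATLAB)
if exist('OCTAVE_VERSION', 'builtin'), pkg load signal; end

fs = 8000;                        % sampling rate in Hz (assumed)
t  = (0:fs-1)' / fs;              % one second of time stamps
x  = cos(2*pi*440*t);             % 440 Hz test tone (assumed)

h     = hilbert(x);               % analytic signal
phi   = unwrap(angle(h));         % continuous Hilbert phase, in radians
f_ins = diff(phi) * fs / (2*pi);  % finite-difference derivative, in Hz

mid = f_ins(1000:end-1000);       % drop the ends to avoid edge effects
fprintf('median instantaneous frequency: %.1f Hz\n', median(mid));
```

For a pure tone like this one, the estimate should sit close to the tone's frequency at every time-point. For real audio it varies over time, which is exactly what we want to manipulate.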

Assuming that we can transform the instantaneous frequency of a signal, we can formally translate the idea of "increasing the pitch incrementally" into "adding a linearly increasing offset to the instantaneous frequency". The good news is that we can transform the instantaneous frequency of a signal quite easily using an analytic transform. Here is how:

function y = add_linear_pitch( x, fs, df )
%
% y = add_linear_pitch( x, fs, df )
%
% x: audio signal (1-dimensional)
% fs: sampling rate, in Hz
% df: frequency offset reached at the end of the signal, in Hz
%
% See also: hilbert
%

    x = x(:);
    n = numel(x); % number of timepoints
    m = mean(x); % average of the signal
    k = transpose(0:n-1); 

    h = hilbert( x - m ); % analytic signal
    e = abs(h); % envelope
    p = angle(h) + df*pi*k.^2/fs/n; % phase + linearly increasing offset
    y = m - imag(hilbert( e .* sin(p) )); % inverse-transform

end

The only non-obvious thing in the previous code is that we need to integrate the "linearly increasing pitch offset" (or whatever transformation of the instantaneous frequency) over time before adding it to the phase, and multiply it by 2*pi (to work in radians). In our case the offset grows linearly from 0 to df Hz over the signal's duration T = n/fs, so its integral is the quadratic df*t^2/(2*T); with t = k/fs and the 2*pi factor, that gives the df*pi*k.^2/fs/n term in the code. You can play with more complicated things :)
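As one way of playing with a more complicated offset, the integral can be taken numerically with cumsum instead of being worked out by hand. The sketch below is only an illustration under stated assumptions: the 440 Hz test tone, the sampling rate, and the vibrato-style offset f_off are all hypothetical choices, and the steps are written inline rather than as a function so that the script runs on its own.

```octave
% hilbert needs the signal package in Octave (Signal Processing Toolbox in MATLAB)
if exist('OCTAVE_VERSION', 'builtin'), pkg load signal; end

fs = 8000;                          % sampling rate in Hz (assumed)
t  = (0:fs-1)' / fs;                % one second of time stamps
x  = cos(2*pi*440*t);               % 440 Hz test tone (assumed)

f_off = 50 * sin(2*pi*2*t);         % hypothetical offset: +/-50 Hz vibrato at 2 Hz
p_off = 2*pi * cumsum(f_off) / fs;  % integrate Hz over time, convert to radians

m = mean(x);                        % average of the signal
h = hilbert(x - m);                 % analytic signal
p = angle(h) + p_off;               % phase + integrated offset
y = m - imag(hilbert(abs(h) .* sin(p)));  % same inverse step as add_linear_pitch

% Check: the instantaneous frequency of y should track 440 + f_off.
f_ins = diff(unwrap(angle(hilbert(y - mean(y))))) * fs / (2*pi);
err   = f_ins - (440 + f_off(2:end));
fprintf('largest deviation from target: %.3f Hz\n', max(abs(err(100:end-100))));
```

With f_off = df*k/n, the same cumsum approach approximates the quadratic phase term of add_linear_pitch, up to discretisation error.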
