How does the Ableton warp algorithm work exactly? [closed]

I'm looking for any documentation or definitive information on Ableton's warp feature. I understand that it has something to do with finding transients, aligning them with an even rhythm and shifting audio samples accordingly. I'm hoping to find ways to approximate warping with more basic audio editing tools.

I understand that this is Ableton's own proprietary feature, but really any information about how it works would be helpful.

So...does anyone have any 411?

Peril asked 31/3, 2012 at 5:14

The auto-warp feature in Ableton Live basically consists of two processing steps: detecting beats with an automatic beat detection algorithm and dynamically changing the tempo according to the beat information.

For the tempo detection, they licensed an older version of zplane aufTAKT.
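
aufTAKT itself is proprietary, so the following is only a generic sketch of how automatic onset/beat detection can work (a simple spectral-flux detector), not zplane's or Ableton's actual algorithm; the function name, frame/hop sizes and threshold are arbitrary illustrative choices:

```python
import numpy as np

def onset_times(x, sr, frame=2048, hop=512, threshold=1.5):
    """Rough spectral-flux onset detector (illustrative only).

    x: mono audio as a float array, sr: sample rate in Hz.
    Returns approximate onset times in seconds.
    """
    window = np.hanning(frame)
    # Magnitude spectrum of each overlapping analysis frame
    n_frames = 1 + (len(x) - frame) // hop
    mags = np.array([
        np.abs(np.fft.rfft(window * x[i * hop:i * hop + frame]))
        for i in range(n_frames)
    ])
    # Spectral flux: summed positive change in magnitude between frames
    flux = np.maximum(np.diff(mags, axis=0), 0.0).sum(axis=1)
    flux /= flux.max() + 1e-12
    # Crude peak picking: a frame counts as an onset if its flux is a
    # local maximum and exceeds `threshold` times the local average
    onsets = []
    for i in range(1, len(flux) - 1):
        local = flux[max(0, i - 10):i + 10].mean()
        if flux[i] > threshold * local and flux[i] >= flux[i - 1] and flux[i] >= flux[i + 1]:
            onsets.append((i + 1) * hop / sr)
    return onsets
```

A real beat tracker then fits a tempo grid to onset times like these; that grid is roughly the information the auto-warp step needs in order to place warp markers and adjust the tempo.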

Ableton Live offers several algorithms for time-stretching. Most of them work in the time domain (compare: overlap and add (OLA) algorithms). Two of them, "Complex" and "Complex Pro", are licensed from zplane as well (compare the zplane élastique algorithms). They are not time-domain algorithms. To learn more about frequency-domain algorithms, "Phase Vocoder" would be the best Google starting point. An excellent introduction to the theory of time stretching and pitch shifting can be found in Zölzer's DAFX book.
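
élastique itself is closed source, but the textbook phase vocoder mentioned above is easy to sketch. The following is a minimal, illustrative implementation of the standard STFT analysis / phase-propagation / overlap-add resynthesis scheme described in the DAFX literature, not the algorithm used in Live; the function name and parameters are arbitrary:

```python
import numpy as np

def phase_vocoder(x, stretch, frame=2048, hop=512):
    """Textbook phase-vocoder time stretch (illustrative sketch).

    x: mono float array longer than `frame`. stretch > 1 slows the
    audio down, stretch < 1 speeds it up; pitch is (ideally) preserved.
    """
    window = np.hanning(frame)
    synth_hop = int(round(hop * stretch))
    # Analysis: overlapping windowed frames and their spectra
    frames = [window * x[i:i + frame]
              for i in range(0, len(x) - frame, hop)]
    spectra = [np.fft.rfft(f) for f in frames]

    # Expected phase advance per bin over one analysis hop
    bins = np.arange(frame // 2 + 1)
    expected = 2 * np.pi * bins * hop / frame

    out = np.zeros(len(frames) * synth_hop + frame)
    phase = np.angle(spectra[0])
    prev = spectra[0]
    for n, spec in enumerate(spectra):
        if n > 0:
            # Deviation from the expected phase advance, wrapped to [-pi, pi]
            delta = np.angle(spec) - np.angle(prev) - expected
            delta -= 2 * np.pi * np.round(delta / (2 * np.pi))
            # Accumulate phase at the bin's true frequency, rescaled
            # to the (longer or shorter) synthesis hop
            phase += (expected + delta) * synth_hop / hop
            prev = spec
        # Resynthesize this frame with the accumulated phase and overlap-add
        y = np.fft.irfft(np.abs(spec) * np.exp(1j * phase), frame)
        start = n * synth_hop
        out[start:start + frame] += window * y
    return out
```

For example, `phase_vocoder(x, 2.0)` plays x back at half speed while (ideally) keeping the pitch, because each bin's magnitude is reused as-is while its phase is advanced at the bin's measured frequency over the longer synthesis hop.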

Englis answered 4/5, 2013 at 12:2

"Warping" the audio is to be able to change the speed of it without changing the pitch. Ableton Live has a handful of algorithms to do this, each optimized for different types of content. I'll explain how it works from a generic level.

Audio is usually captured and quantized as samples: the pressure level is measured at a point in time, and these measurements (samples) are taken and played back very rapidly (44.1 kHz for CD audio). This means the audio signal is in the time domain.

If we simply speed up something recorded in the time domain, we change its pitch as well, since playing the samples back faster raises every frequency proportionally. What we need to do is convert the audio from the time domain into the frequency domain. That is, rather than capturing the general pressure level for a sample, we will instead capture what frequencies are present.
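
A quick way to see that coupling (a hypothetical snippet, just to illustrate the point):

```python
import numpy as np

sr = 44100
t = np.arange(sr) / sr                 # one second of audio
tone = np.sin(2 * np.pi * 440 * t)     # 440 Hz sine (A4)

# "Speeding up" in the time domain = keeping every 2nd sample,
# i.e. playing the same data back in half the time
sped_up = tone[::2]

def peak_hz(x, sr):
    """Frequency of the strongest FFT bin."""
    spectrum = np.abs(np.fft.rfft(x))
    return np.argmax(spectrum) * sr / len(x)

print(peak_hz(tone, sr))      # ~440 Hz
print(peak_hz(sped_up, sr))   # ~880 Hz: twice as fast, an octave higher
```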

To do this, we first chop the signal into short analysis windows, usually around 10ms each. That gives us enough material to run a Fourier transform (usually implemented as an FFT) on the window and get fairly useful results. Low frequencies are poorly resolved (they don't fit within such a short window well), so various techniques are used to handle them, usually by also looking at neighbouring windows.
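
As a concrete (hypothetical) illustration of that analysis step, here is one ~10ms window being windowed and transformed; the Hann window, test tone and window position are arbitrary choices:

```python
import numpy as np

sr = 44100
n = 441                       # ~10 ms at 44.1 kHz
x = np.sin(2 * np.pi * 1000 * np.arange(sr) / sr)   # a 1 kHz test tone

start = 5000                                # pick some position in the signal
frame = x[start:start + n] * np.hanning(n)  # one windowed slice

spectrum = np.abs(np.fft.rfft(frame))
freqs = np.fft.rfftfreq(n, 1.0 / sr)        # bin centre frequencies, 100 Hz apart

print(freqs[np.argmax(spectrum)])           # ~1000 Hz: the frequency in this window
```

Note that with a 10ms window the FFT bins are 100 Hz apart, which is exactly why the low end is poorly resolved and needs the extra handling mentioned above.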

Anyway, what we end up with is the set of frequencies present in each window. That means, to speed up the audio, we just play back each window for a shorter time, and to slow down the audio, we play back each window for a longer time. Each window holds a little snapshot of the frequencies that are present within it.

There are also a lot of refinements to this basic method to make things sound better, but this is how it works in general.
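
The crudest version of "play each window back for a longer or shorter time" is a plain overlap-add loop: read windows at one spacing and write them out at another. This sketch deliberately omits those refinements (no phase correction at all), so it sounds choppy and phasey, but it shows the mechanics; the name and parameters are illustrative:

```python
import numpy as np

def naive_stretch(x, stretch, frame=2048, hop=512):
    """Naive overlap-add time stretch: same windows, different spacing.

    stretch > 1 slows down, < 1 speeds up. No phase correction, so
    expect audible artifacts; this only demonstrates the mechanics.
    """
    window = np.hanning(frame)
    synth_hop = int(round(hop * stretch))
    n_frames = max(0, (len(x) - frame) // hop)
    out = np.zeros(n_frames * synth_hop + frame)
    norm = np.zeros_like(out)            # track window overlap for normalization
    for n in range(n_frames):
        grain = window * x[n * hop:n * hop + frame]   # read at the analysis spacing
        start = n * synth_hop                          # write at the synthesis spacing
        out[start:start + frame] += grain
        norm[start:start + frame] += window
    return out / np.maximum(norm, 1e-6)
```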

Also note that MP3 encoding is built on a similar windowed, frequency-domain representation, although it uses it for compression rather than time-stretching.

Chubby answered 11/4, 2012 at 17:25
First of all, thanks for your answer. As an explanation of the basics of time-stretching, this seems clear. WARPING is a very similar, yet fundamentally different animal. While time-stretching is the process of speeding up and slowing down samples while maintaining the correct pitch, warping involves moving sounds within a sample and quantizing them to a particular beat. I'm also wondering how the spaces between the musical events are filled. – Peril
@pepperdreamteam, the part of the warp feature that speeds up and slows down a long sample to make it fit a specific time is exactly the same thing. Each warp point delineates an area where stretching should occur, and to what degree. Was your question about how those warp points are initially chosen? – Chubby
Hey, thanks Brad, this is a good answer. I think I can assume that the warp points are selected based on which transients are the loudest and most prominent in the audio sample. – Peril
@pepperdreamteam, yes. I don't know the details of how Ableton Live chooses its warp points, but I suspect it starts with a basic beat detection algorithm, and then they probably have a handful of tweaks to make their detection better. I also suspect that, given the inaccuracy of their detection, they don't do this at the sample level. When I DJ with Ableton Live, I always set my own points, at least to start with, as their detection algorithm is usually 50-100 ms off. – Chubby
@Brad, I believe the algorithms Ableton uses (implemented by zplane.de, by the way) are not based on frequency-shifting. Perhaps some of the more advanced algorithms use frequency shifting as well, but I believe the basis of most of the algorithms is granular (re-)synthesis. – Audacity
@Mattijs, I'm not saying they are changing the frequency. I'm saying that they take the sound, which is originally in the time domain, and convert it to the frequency domain. – Chubby
@Brad, I think I understand what you're saying: they would be resampling and then shifting the frequency back to match the original pitch. I don't believe that's what they're doing, though; I think they're mainly using a granular approach. I base this opinion on the artifacts I hear with the simpler algorithms like 'Beats'. When slowing down to 50% of the original tempo, you hear that typical choppy sound that comes with granular time stretching. – Audacity

Here's a simple, open-source version of such an algorithm implemented in Max/MSP:

http://cycling74.com/toolbox/kneppers-granular-stretcher/

Audacity answered 19/4, 2013 at 10:8
