How to use cepstral?
Asked Answered
H

3

5

Recently I asked this question: How to get the fundamental frequency from FFT? (you don't actually need to read it)

My doubt right now it: how to use the cepstral algorithm?

I just don't know how to use it because the only language that I know is ActionScript 3, and for this reason I have few references about the native functions found in C, Java and so on, and how I should implement them on AS. Most articles are about these languages =/ (althought, answers in other languages than AS are welcome, just explain how the script works please)

The articles I found about cepstral to find the fundamental frequency of a FFT result told me that I should do this:

signal → FT → abs() → square → log → FT → abs() → square → power cepstrum

mathematically: |F{log(|F{f(t)}|²)}|²

Important info:

  • I am developing a GUITAR TUNER in flash
  • This is the first time I am dealing with advanced sound
  • I am using an FFT to extract frequency bins from the signal that reaches user's microphone, but I got stuck in getting the fundamental frequency from it

I don't know:

  • How to apply a square in an ARRAY (I mean, the data that my FFT gives me is an array. Should I multiply it by itself? ActionScript's debug throws errors when I try to fftResults * fftResults)
  • How to apply the "log". I would not know how to apply it even if I had a single number.
  • What is the difference between complex cepstral and power cepstral. Also, what of them should I use? I am trying to develop a guitar tuner.

Thanks!

Haskins answered 7/2, 2011 at 19:5 Comment(0)
P
6

Note that the output of an FFT is an array of complex values, i.e. each bin = re + j*im. I think you can just combine the abs and square operations and calculate re*re + im*im for each bin. This gives you a single positive value for each bin, and obviously you can calculate the log value for each bin quite easily. You then need to do a second FFT on this log squared data and again using the output of this second FFT you will calculate re*re + im*im for each bin. You will then have an array of postive values which will have one or more peaks representing the fundamental frequency or frequencies of your input.

Protostele answered 7/2, 2011 at 20:6 Comment(6)
an example of an array that I get: [0.123123,0.4809,0.0498356,0.000231,82.31240987,0.1230987 ........................................ value 1020, value 1021, value 1022, value 1023, value 1024]. These values are just numbers. They are not complex, are they?Haskins
@Lucas: it depends on what particular FFT you are using as to how the input and output data are organised. Some will use a complex data type, some will interleave real and imaginary parts, and some will have all the real parts in the first half of the array and the imaginary parts in the second (two arrays, effectively) - you need to read and understand the docs for your chosen FFT.Protostele
@PaulR: Taking FFT on N samples gives N values which correspond to N/2 Complex numbers(bins). But, after calculating square magnitude, I am only left with N/2 values which on FFTing again gives only N/4 complex numbers. Is that true? How do we calculate the bin frequency after the second FFT? Can you correct/help me with this?Lorrielorrimer
@Ravi: no, N samples gives you N complex FFT bins. However if your input signal is purely real then N/2 of the complex FFT output bins are redundant as they have conjugate symmetry with the other N/2 bins and can therefore be ignored. They are still valid though and are required for the inverse FFT operation.Protostele
Thanks a lot Paul. Since the second half of the complex numbers are complex conjugates, duplicating the log magnitude values of the first N/2 values to the other N/2 values and using them for FFTing over second time (Cepstral analysis) should be fine right? I'm asking this as the FFT library I use automatically ignores the conjugate part of the values.Lorrielorrimer
@Ravi: yes, the log magnitudes will just be a mirror image about N/2 and can be simulated as you say.Protostele
R
2

The autocorrelation is the easiest and most logical approach, and the best place to start.

To get this working, start with a simple autocorrelation, and then, if necessary, improve it following the outline provided by YIN. (YIN is based on the autocorrelation with refinements. But whether or not you'll need these refinements depends on details of your situation.) This way also, you can learn as you go rather than trying to understand the whole thing in one shot.

Although FFT approaches can also work, they are a bit more confusing. The issue is that what you are really after is the period, and this isn't well represented by the FFT. The missing fundamental is a good example of this, where if you have 2Hz and 3Hz, the fundamental is 1Hz, but is nowhere in the FFT, while 1Hz is obvious in a time based representation (e.g. the autocorrelation). Add to this that overtones aren't necessarily harmonic, and noise, etc... and all of these issues make it usually best to start with a direct approach to the problem.

Ricotta answered 9/2, 2011 at 18:11 Comment(3)
Do you know some autocorrelation code to show me? Also, is autocorrelation based on FFT at some point?Haskins
The autocorrelation is very easy to implement, though exactly what you do will depend on what tools you have available, and I don't know actionscript. Most libraries that do math on arrays have a "correlate" function, and then just correlate the data with itself. If you don't have this type of thing, here's a C++ example (koders.com/cpp/…). But it's easy to write your own, just multiply the object with a shifted version of itself.Ricotta
For the second part of your question: It's sometimes faster to calculate the autocorrelation using an FFT, but for now, think of this purely as a calculation trick. One can do it either way, with the FFT or without it.Ricotta
F
1

There are many ways of finding fundamental frequency (F0).

For languages like Java etc there are many libraries with those type of algorithms already implemented (you can study their sources).

  • MFCC (based on cepstral) implemented in Comirva (Open source).
  • Audacity (beta version!) (Open source) presents cepstrum, autocorellation, enhanced autocorellation,
  • Yin based on autocorrelation (example )
  • Finding max signal values after FFT

All these algorithms may be be very helpful for you. However easiest way to get F0 (one value in Hz) would be to use Yin.

Formula answered 7/2, 2011 at 19:40 Comment(4)
Are these methods precise enought to make a guitar tuner? I mean... I need a really high precision!Haskins
It depends on many things - microphone quality, noises etc Sometimes some algorithms for denoising might be also needed. You can run that Yin example - it is application which captures sound from microphone and displays f0 - test and verify with normal (hardware?) tuner.Formula
I read half of THIS: recherche.ircam.fr/equipes/pcm/cheveign/ps/… it looks to be some kind of documentation, but it's frying my brain. I don't want to annoy you, but would you tell me how I should implement these equations? In the signal, in a fft result, and how? thanks very much for your help!Haskins
1. test that example application does it work with a guitar 2. if works study (understand) that applications code (Yin class) and try to rewrite it to action script. Doing that you will also learn java a bit ;)Formula

© 2022 - 2024 — McMap. All rights reserved.