How can I do FFT analysis of an audio file in Chrome without needing playback?

There isn't a single working example anywhere on the internet of how to perform FFT analysis of a sound file/buffer/AudioBuffer in the browser without needing playback. The Web Audio API has changed too much to still be able to use a library like https://github.com/corbanbrook/dsp.js, for example. All other leads I can find currently don't get me to a solution.

EDIT: I don't need to manipulate any of the data, just to read the frequency spectrum at different moments in time of the audio. The input to the solution can be any form of data (WAV file, ArrayBuffer, AudioBuffer, anything) but not a stream. The expected output would ideally be an array (moments in time) of arrays (frequency bin amplitudes).

Epigynous asked 16/2, 2019 at 18:34 Comment(13)
Are you trying to measure (sample) actual audio output or raw data? The question asks for analysis. The file is static; the data does not change without user action. The purpose of analysis would be to sample actual audio output, that is, the audio output is expected to be different, or at least the sampling record is expected to be unique or arbitrary for each media playback, correct?Complimentary
Raw data. I need to avoid audio output for a faster user experience.Epigynous
What is the raw data, a static file not being played back, analyzed for? What do you mean by "without need for playback"? What is the requirement and expected input and output? Are you trying to manipulate a file to filter certain audio output before playing the file or having to play the media back at all to perform the filtering?Complimentary
AFAIK, you are right. It would actually be slower not to use the C-accelerated Web Audio API tools and to try to mount and analyze the waveforms in pure userland JS. I don't even know how you would turn the MP3 into a wave. If you did go all JS, use web workers to avoid UI slowdown.Apollinaire
@Epigynous See TensorFlow; TensorFlow.jsComplimentary
I don't need to manipulate anything, just to read the frequency spectrum at different moments in time of the audio. The input can be any form of data (WAV file, ArrayBuffer, AudioBuffer, anything) but not a stream. The expected output would ideally be an array (moments in time) of arrays (frequency bin amplitudes).Epigynous
@Epigynous With the requirement being without playback? If you create an OfflineAudioContext, one or more source buffers can be created, merged, and analyzed. If you have the models, you can compare the resulting AudioBuffer or TypedArray data to the model data. Unless I am not gathering what the requirement is?Complimentary
I don't understand why I would need any models or machine learning for this task. If you can provide a working example of the tools you suggest (offline context, source buffer analysis), that would be awesome. I have already tried everything I could read on the Mozilla Web Audio API documentation.Epigynous
@Epigynous Still not certain what the expected output is? The closest I can gather as to what you are trying to achieve, based on my interpretation of the question and what you have tried, are Is it possible to mix multiple audio files on top of each other preferably with javascript and How to use Blob URL, MediaSource or other methods to play concatenated Blobs of media fragments?. You can implement your own analysis in the code. Still not sure what you mean by analysis (sampling) without playback.Complimentary
@Epigynous Web audio analyser node - run at regular interval "Yeah, you can't really use an analyzer. There's too much uncertainty in when it will get run, and you can't guarantee precisely when it will run. You're better off using a ScriptProcessor for now (AudioWorklet eventually), and doing the FFT (or other recognition code) yourself."Complimentary
@Epigynous At Chromium/Chrome it is possible to use Native Messaging to transfer the data to a shell script for processing, then transfer the output back to the browser window: How to programmatically send a unix socket command to a system server autospawned by browser or convert JavaScript to C++ source code for Chromium?; Chrome Native messaging with PHP. You can implement the shell script in any language that suits meeting the requirement.Complimentary
@Epigynous "why I would need any models or machine learning for this task." You can train models to match against raw data "In April 2017, *oogle published a paper, Tacotron: Towards End-to-End Speech Synthesis, where they present a neural text-to-speech model that learns to synthesize speech directly from (text, audio) pairs." tacotron. E.g., webkitSpeechRecognition implementation at Chromium records user audio, sends to remote service, returns transcript. See also ts-ebml ebml.Complimentary
You might also be interested in the source code of espeak-ng.Complimentary

If you must use WebAudio, the way to do it is to use an OfflineAudioContext. Then when you need to get the frequency data, call suspend(time). Something like the following:

const c = new OfflineAudioContext(....);  // channel count, length in frames, sample rate
const a = new AnalyserNode(c);
src.connect(a);  // src is the signal you want to analyze.

// Schedule the suspends before rendering starts.
c.suspend(t1)
  .then(() => {
    // array1 is a Float32Array of length a.frequencyBinCount.
    a.getFloatFrequencyData(array1);
  })
  .then(() => c.resume());

c.suspend(t2)
  .then(() => {
    a.getFloatFrequencyData(array2);
  })
  .then(() => c.resume());

// More suspends if needed

// Render everything now
c.startRendering()
  .then((buffer) => {
    // Maybe do something now that all the frequency data is available.
  });

However, I think only Chrome supports suspend with an offline context.
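
For completeness, here is a fuller sketch of the same idea, assuming you start from a fetched file and want one frequency snapshot every quarter second. The file URL, hop size, and fftSize are placeholders, and this still relies on suspend() being supported on the offline context:

// Sketch only: decode a file, then suspend the offline context at regular
// intervals to grab a frequency snapshot from an AnalyserNode.
async function analyzeFile(url, hopSeconds = 0.25) {
  const response = await fetch(url);
  const encoded = await response.arrayBuffer();

  // A throwaway AudioContext is used here just for decodeAudioData().
  const decodeCtx = new AudioContext();
  const audioBuffer = await decodeCtx.decodeAudioData(encoded);
  await decodeCtx.close();

  const ctx = new OfflineAudioContext(
    audioBuffer.numberOfChannels,
    audioBuffer.length,
    audioBuffer.sampleRate
  );

  const analyser = new AnalyserNode(ctx, { fftSize: 2048 });
  const src = new AudioBufferSourceNode(ctx, { buffer: audioBuffer });
  src.connect(analyser);
  analyser.connect(ctx.destination);

  const snapshots = [];  // array (moments in time) of Float32Array (bin magnitudes in dB)
  for (let t = hopSeconds; t < audioBuffer.duration; t += hopSeconds) {
    ctx.suspend(t).then(() => {
      const bins = new Float32Array(analyser.frequencyBinCount);
      analyser.getFloatFrequencyData(bins);
      snapshots.push(bins);
      ctx.resume();
    });
  }

  src.start();
  await ctx.startRendering();
  return snapshots;  // snapshots[i] is the spectrum at (i + 1) * hopSeconds
}

Note that the suspend times are quantized to render-quantum boundaries, and each suspension has to be scheduled before rendering reaches that point in time.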

Rochellerochemont answered 17/2, 2019 at 17:8 Comment(2)
Thank you, interesting to learn about this option. Unfortunately I need cross-browser functionality, and Firefox and Safari currently don't support the suspend method on an offline context, according to developer.mozilla.org/en-US/docs/Web/API/OfflineAudioContext/…Epigynous
Yeah, that's too bad. I did file a bug against Firefox about adding this. You could file a bug for Safari. Maybe it will get implemented some day.Rochellerochemont

You can do a lot with an OfflineAudioContext, but it will just run the whole node graph as fast as possible to render a resulting chunk of audio. I don't see how an AnalyserNode would even work in such a situation (since its audio output is useless).

Seems to me that you're correct in that you can't use the Web Audio API without actually playing the file in realtime. You would have to do the analysis yourself; there should be a lot of libraries available for that (since it's just number crunching). Web workers or WASM are probably the way to go, as in the sketch below.
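
As a rough illustration of the "do it yourself" route (not a recommendation of any particular library): a deliberately naive O(N²) DFT over one frame with a Hann window, plus a step-and-repeat loop. A real implementation would swap in a proper FFT and run inside a worker:

// Sketch only: magnitude spectrum of one frame of PCM samples (Float32Array in [-1, 1]).
// Naive DFT for clarity; use a real FFT (or WASM) for anything beyond tiny frames.
function magnitudeSpectrum(frame) {
  const N = frame.length;
  const bins = new Float32Array(N / 2);
  for (let k = 0; k < N / 2; k++) {
    let re = 0, im = 0;
    for (let n = 0; n < N; n++) {
      const w = 0.5 * (1 - Math.cos((2 * Math.PI * n) / (N - 1)));  // Hann window
      const x = frame[n] * w;
      const phi = (-2 * Math.PI * k * n) / N;
      re += x * Math.cos(phi);
      im += x * Math.sin(phi);
    }
    bins[k] = Math.sqrt(re * re + im * im) / N;
  }
  return bins;
}

// Step-and-repeat across a whole channel to get the array (moments in time)
// of arrays (frequency bin amplitudes) the question asks for.
function spectrogram(samples, frameSize = 1024, hop = 512) {
  const frames = [];
  for (let start = 0; start + frameSize <= samples.length; start += hop) {
    frames.push(magnitudeSpectrum(samples.subarray(start, start + frameSize)));
  }
  return frames;
}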

Outgrow answered 16/2, 2019 at 19:34 Comment(2)
I can only find 'thin wrapper for the Web Audio API' libraries. Do you have a personal recommendation for a library that has a documented solution for my problem?Epigynous
I don't have experience with what you want to do, but frequency analysis is just running through numbers and not very obscure, so there should be tons of existing code in JS. You shouldn't be searching for anything related to the Web Audio API, though; we both agreed that it probably wasn't going to work with that :)Outgrow

I have written browser-based DFT/FFT code that reads WAV files and doesn't use any other libraries, if anyone is still interested:

https://github.com/ObsessiveCompulsiveAudiophile/wave-file-FFT-in-your-browser

Obara answered 26/9, 2023 at 18:7 Comment(0)

You need 4 things:

  • Javascript code to read in a WAV file as a binary blob

  • Code to convert slices of that blob as 16-bit samples into suitable Javascript arrays of numeric samples for an FFT

  • A Javascript implementation of a DFT or FFT of suitable size arrays for the time and frequency resolution you desire

  • Code to estimate your desired frequency and magnitude parameters as you step-and-repeat the FFT across your data slices

The first three can be found via web searches (GitHub, here, et al.); a rough sketch of the first two steps is shown below.
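
As an illustration only, assuming a canonical 16-bit PCM WAV whose data starts at byte 44 (real files can carry extra chunks, so a proper parser should walk the RIFF chunk headers), steps 1 and 2 might look roughly like this:

// Rough sketch: read a WAV blob and convert the 16-bit samples to a Float32Array.
// Assumes the canonical 44-byte header; production code should parse chunks properly.
async function readWav(file) {                   // file: a File or Blob, e.g. from an <input>
  const buf = await file.arrayBuffer();
  const view = new DataView(buf);
  const numChannels = view.getUint16(22, true);  // little-endian header fields
  const sampleRate = view.getUint32(24, true);
  const pcm = new Int16Array(buf, 44);           // skip the canonical header

  // De-interleave, keep channel 0 only, and scale to [-1, 1] for the FFT step.
  const samples = new Float32Array(Math.floor(pcm.length / numChannels));
  for (let i = 0; i < samples.length; i++) {
    samples[i] = pcm[i * numChannels] / 32768;
  }
  return { sampleRate, samples };
}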

Phytobiology answered 16/2, 2019 at 21:32 Comment(1)
It's bullet point 3 that's the problem. Do you actually know of a JS implementation of the FFT that works in the browser? Because sure, then it becomes almost trivially easy (heck, even use an MP3 file with AudioContext.decodeAudioData(), since there's no need for giant PCM wave files if we're going to use pure JS instead of the audio API anyway).Odette

The already existing APIs would only give you a heavily processed DFT output. First, AnalyserNode applies the Blackman-Harris window function, then applies the DFT, then does exponential smoothing where α is smoothingTimeConstant, and finally converts the result to a decibel scale. This way you only get the magnitude and not the phase (in case you need it).
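
Roughly, the processing chain described above amounts to the following sketch (not browser source code; applyWindow and dftMagnitudes are hypothetical helpers standing in for the windowing and DFT steps, prev holds the previous frame's smoothed bins, and alpha plays the role of smoothingTimeConstant):

// Sketch of the pipeline: window -> DFT -> exponential smoothing -> decibels.
function analyserStyleBins(frame, prev, alpha) {
  const mags = dftMagnitudes(applyWindow(frame));       // linear magnitudes, phase discarded
  const db = new Float32Array(mags.length);
  for (let k = 0; k < mags.length; k++) {
    prev[k] = alpha * prev[k] + (1 - alpha) * mags[k];  // exponential smoothing across frames
    db[k] = 20 * Math.log10(prev[k]);                   // conversion to a decibel scale
  }
  return db;  // roughly what getFloatFrequencyData() hands back
}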

Sydel answered 18/5, 2023 at 15:19 Comment(0)
