How to downsample audio recorded from mic realtime in javascript?

p = pyaudio.PyAudio() stream = p.open(format="pyaudio.paInt16", channels=1, rate= 8000, input=True, frames_per_buffer=10) print("* recording, please speak") packet_size = int((30/1000)*8000) # normally 240 packets or 480 bytes frames = [] #while True: for i in range(0, 1000): packet = stream.read(packet_size) ws.send(packet, binary=True)

To do realtime downsampling follow these steps:

First get stream instance using this:

const stream = await navigator.mediaDevices.getUserMedia(constraints);

Create media stream source from this stream.

var input = audioContext.createMediaStreamSource(stream);

Create script Processor so that you can play with buffers. I am going to create a script processor which takes 4096 samples from the stream at a time, continuously, has 1 input channel and 1 output channel.
```
var scriptNode = audioContext.createScriptProcessor(4096, 1, 1);
```
Connect your input with scriptNode. You can connect script Node to the destination as per your requirement.
```
    input.connect(scriptNode);
    scriptNode.connect(audioContext.destination);
```

Now there is a function onaudioprocess in scriptProcessor where you can do whatever you want with 4096 samples. var downsample will contain (1/sampling ratio) number of packets. floatTo16BitPCM will convert that to your required format since the original data is in 32 bit float format.

   var inputBuffer = audioProcessingEvent.inputBuffer;
    // The output buffer contains the samples that will be modified and played
    var outputBuffer = audioProcessingEvent.outputBuffer;

    // Loop through the output channels (in this case there is only one)
    for (var channel = 0; channel < outputBuffer.numberOfChannels; channel++) {
        var inputData = inputBuffer.getChannelData(channel);
        var outputData = outputBuffer.getChannelData(channel);



        var downsampled = downsample(inputData);
        var sixteenBitBuffer = floatTo16BitPCM(downsampled);
      }

Your sixteenBitBuffer will contain the data you require.

Functions for downsampling and floatTo16BitPCM are explained in this link of Watson API:IBM Watson Speech to Text Api

You won't need MediaRecorder instance. Watson API is opensource and you can look for a better streamline approach on how they implemented it for their use case. You should be able to salvage important functions from their code.

Recommended topics

Hot tags