wave.Error: unknown format: 3 arises when trying to convert a wav file into text in Python

I need to record audio from the microphone and convert it into text. The conversion works fine with several audio clips that I downloaded from the web, but when I try to convert a clip I recorded from the microphone, it gives the following error.

Traceback (most recent call last):
  File "C:\Users\HP\AppData\Local\Programs\Python\Python37\lib\site-packages\speech_recognition\__init__.py", line 203, in __enter__
    self.audio_reader = wave.open(self.filename_or_fileobject, "rb")
  File "C:\Users\HP\AppData\Local\Programs\Python\Python37\lib\wave.py", line 510, in open
    return Wave_read(f)
  File "C:\Users\HP\AppData\Local\Programs\Python\Python37\lib\wave.py", line 164, in __init__
    self.initfp(f)
  File "C:\Users\HP\AppData\Local\Programs\Python\Python37\lib\wave.py", line 144, in initfp
    self._read_fmt_chunk(chunk)
  File "C:\Users\HP\AppData\Local\Programs\Python\Python37\lib\wave.py", line 269, in _read_fmt_chunk
    raise Error('unknown format: %r' % (wFormatTag,))
wave.Error: unknown format: 3

The code I am trying is as follows.

import speech_recognition as sr
import sounddevice as sd
from scipy.io.wavfile import write

# recording from the microphone
fs = 44100  # Sample rate
seconds = 3  # Duration of recording

myrecording = sd.rec(int(seconds * fs), samplerate=fs, channels=2)
sd.wait()  # Wait until recording is finished
write('output.wav', fs, myrecording)  # Save as WAV file
sound = "output.wav"
recognizer = sr.Recognizer()

with sr.AudioFile(sound) as source:
    recognizer.adjust_for_ambient_noise(source)
    print("Converting audio file to text...")
    audio = recognizer.listen(source)

    try:
        text = recognizer.recognize_google(audio)
        print("The converted text:" + text)

    except Exception as e:
        print(e)

I looked at similar questions that have been answered, and they say the file needs to be converted into a different WAV format. Can someone point me to code or a library that I can use for this conversion? Thank you in advance.

Ectype answered 22/2, 2020 at 13:57 Comment(2)
Share a link to the audio file. - Sucrase
This is the link for the audio file: drive.google.com/file/d/1OGsQbH2dqbiZ--4fI4atAhQC6S9YcUrd/… - Ectype

You wrote the file in float format:

soxi output.wav 

Input File     : 'output.wav'
Channels       : 2
Sample Rate    : 44100
Precision      : 25-bit
Duration       : 00:00:03.00 = 132300 samples = 225 CDDA sectors
File Size      : 1.06M
Bit Rate       : 2.82M
Sample Encoding: 32-bit Floating Point PCM

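and the wave module can't read it. Format tag 3 in the error message means IEEE floating point, while the wave module only reads integer PCM.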
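If soxi is not available, the format tag can also be read straight from the file header. The following is a minimal sketch using only the standard library; `wav_format_tag` and `tiny_float_wav` are hypothetical helper names, and the second one only builds a one-sample float file in memory so the example runs without a recording.

```python
import io
import struct
import wave


def wav_format_tag(fileobj):
    """Return the wFormatTag of a RIFF/WAVE stream (1 = integer PCM, 3 = IEEE float)."""
    riff, _size, wave_id = struct.unpack("<4sI4s", fileobj.read(12))
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    while True:
        header = fileobj.read(8)
        if len(header) < 8:
            raise ValueError("no fmt chunk found")
        chunk_id, chunk_size = struct.unpack("<4sI", header)
        if chunk_id == b"fmt ":
            # The format tag is the first 16-bit field of the fmt chunk.
            return struct.unpack("<H", fileobj.read(2))[0]
        # Skip any other chunk; chunks are padded to an even length.
        fileobj.seek(chunk_size + (chunk_size & 1), 1)


def tiny_float_wav():
    """Build a one-sample mono 32-bit float WAV in memory (demo data only)."""
    data = struct.pack("<f", 0.5)
    fmt = struct.pack("<HHIIHH", 3, 1, 44100, 44100 * 4, 4, 32)
    body = b"WAVE" + b"fmt " + struct.pack("<I", len(fmt)) + fmt
    body += b"data" + struct.pack("<I", len(data)) + data
    return b"RIFF" + struct.pack("<I", len(body)) + body


if __name__ == "__main__":
    print(wav_format_tag(io.BytesIO(tiny_float_wav())))
    try:
        wave.open(io.BytesIO(tiny_float_wav()), "rb")
    except wave.Error as exc:
        print(exc)  # the same error as in the question
```

For a file on disk, open it in binary mode first, for example `with open('output.wav', 'rb') as f: print(wav_format_tag(f))`.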

To store the file in int16 format, do this:

import numpy as np
myrecording = sd.rec(int(seconds * fs), samplerate=fs, channels=2)
sd.wait()  # Wait until recording is finished
write('output.wav', fs, myrecording.astype(np.int16))  # Save as WAV file in 16-bit format
Sucrase answered 22/2, 2020 at 20:49 Comment(1)
Hi! Your answer solved the wave error. But now I cannot hear anything in the output.wav file. Do you have any solution? - Ectype

Method 1

You can't hear anything because you cast the floating-point values to integers without scaling them, which is incorrect. Floating-point samples in a WAV file range from -1 to 1, while 16-bit PCM (integer) samples range from -32,768 to 32,767. A plain cast truncates every value between -1 and 1 to 0, so your signal was converted from something like
[-1.4240753e-05, 4.3602209e-05, 1.0526689e-06, ..., 1.7763522e-02, 1.6644333e-02, 6.7148944e-03]
to
[0, 0, 0, ..., 0, 0, 0]

The above conversion is incorrect.
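To make the difference concrete, here is a small pure-Python sketch of the same arithmetic applied to the sample values shown above. It assumes that `int()` stands in for the NumPy cast (both drop the fractional part), and `cast_only` and `scale_to_int16` are hypothetical names used only for this illustration.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def cast_only(samples):
    """Mimic a bare integer cast: drop the fractional part, no scaling."""
    return [int(s) for s in samples]


def scale_to_int16(samples):
    """Scale floats in [-1.0, 1.0] to the int16 range, clip, then truncate."""
    scaled = (s * 32768 for s in samples)
    return [int(max(INT16_MIN, min(INT16_MAX, v))) for v in scaled]


samples = [-1.4240753e-05, 4.3602209e-05, 1.0526689e-06,
           1.7763522e-02, 1.6644333e-02, 6.7148944e-03]

print(cast_only(samples))       # every sample collapses to 0
print(scale_to_int16(samples))  # the louder samples survive as non-zero integers
```

The first list is all zeros, which is the silent file; the second keeps the shape of the signal.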

To correctly convert the signal into integers (PCM format), you need to scale the values, not just cast them. One way of doing this is given below.

import numpy as np

def float2pcm(sig, dtype='int16'):
    sig = np.asarray(sig)
    dtype = np.dtype(dtype)
    i = np.iinfo(dtype)
    abs_max = 2 ** (i.bits - 1)
    offset = i.min + abs_max
    # Scale to the integer range, clip to avoid overflow, then cast
    return (sig * abs_max + offset).clip(i.min, i.max).astype(dtype)

Call it right after the sd.wait() line and write the converted array instead of the float one:

write('output.wav', fs, float2pcm(myrecording))
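If the float file has already been saved and re-recording is not an option, the same scaling can be applied to the file itself. The following is a rough sketch that uses only the standard library and assumes a 32-bit float WAV such as the one in the question; `read_float_wav` and `float_wav_to_pcm16` are hypothetical helper names, not part of any library.

```python
import struct
import wave


def read_float_wav(fileobj):
    """Return (channels, sample_rate, samples) from a 32-bit float WAV stream."""
    riff, _size, wave_id = struct.unpack("<4sI4s", fileobj.read(12))
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    fmt = None
    while True:
        header = fileobj.read(8)
        if len(header) < 8:
            raise ValueError("no data chunk found")
        chunk_id, size = struct.unpack("<4sI", header)
        payload = fileobj.read(size + (size & 1))[:size]  # chunks are even-padded
        if chunk_id == b"fmt ":
            fmt = struct.unpack("<HHIIHH", payload[:16])
        elif chunk_id == b"data":
            if fmt is None:
                raise ValueError("data chunk before fmt chunk")
            tag, channels, rate, _byte_rate, _align, bits = fmt
            if tag != 3 or bits != 32:
                raise ValueError("expected 32-bit IEEE float samples")
            count = len(payload) // 4
            samples = struct.unpack("<%df" % count, payload[:count * 4])
            return channels, rate, samples


def float_wav_to_pcm16(src, dst):
    """Rewrite the float WAV in `src` (binary file object) as 16-bit PCM in `dst`."""
    channels, rate, samples = read_float_wav(src)
    # Scale to the int16 range and clip before truncating, as in float2pcm.
    ints = [int(max(-32768.0, min(32767.0, s * 32768))) for s in samples]
    with wave.open(dst, "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        out.setframerate(rate)
        out.writeframes(struct.pack("<%dh" % len(ints), *ints))
```

A possible call would be `with open('output.wav', 'rb') as f: float_wav_to_pcm16(f, 'output_pcm16.wav')`, after which the new file can be passed to sr.AudioFile. The output name here is only an example.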

Method 2

Another, simpler way of solving your problem is to let the sounddevice library do the conversion internally by passing a dtype when recording:

import numpy as np
myrecording = sd.rec(int(seconds * fs), samplerate=fs, channels=2, dtype=np.int16)
Vedetta answered 5/6, 2020 at 12:46 Comment(1)
Both methods together worked for me. Make sure you have enabled your input microphone volume. On Ubuntu I had to go to Settings -> Sound -> Input, click the Volume button on the far right, and drag the slider all the way to the right. Then it will record your microphone for the duration specified in the code above. - Mag
