I have a big batch of files I'd like to run recognition on using CMU Sphinx 4. Sphinx requires the following format:
- 16 kHz
- 16 bit
- mono
- little-endian
My files are something like 44.1 kHz, 32-bit stereo MP3 files. I tried using Tritonus, and then its updated version JavaZoom, to convert them using code from bakuzen. However, `AudioSystem.getAudioInputStream(File)` throws an `UnsupportedAudioFileException`, and I haven't been able to figure out why, so I have moved on.
Now I am trying ffmpeg. The command `ffmpeg -i input.mp3 -ac 1 -ab 16 -ar 16000 output.wav` seems like it should do the trick (except for little-endian), but when I check the output with Audacity, it still labels it as "32-bit float". The command I found on this site also uses `-acodec pcm_s16le`, which from its name seems to output 16-bit little-endian; however, Audacity still tells me the output is 32-bit float.
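To check the converted file independently of what Audacity displays, the WAV header can be read with the standard Java Sound API. The following is only a minimal sketch under a few assumptions: the class name `SphinxFormatCheck` and the helper `isSphinxCompatible` are hypothetical names made up for illustration, the path to the converted file is passed as the first command-line argument, and the check simply mirrors the four requirements listed above (plus signed PCM encoding, which is an assumption about what "16 bit" means here).

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.File;

class SphinxFormatCheck {

    // Hypothetical helper: true if the format is 16 kHz, 16-bit,
    // mono, signed PCM, little-endian (the list from the question).
    static boolean isSphinxCompatible(AudioFormat f) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(f.getEncoding())
                && f.getSampleRate() == 16000f
                && f.getSampleSizeInBits() == 16
                && f.getChannels() == 1
                && !f.isBigEndian();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: SphinxFormatCheck <file.wav>");
            return;
        }
        // WAV is read by the built-in Java Sound providers; no MP3 plugin is needed here.
        try (AudioInputStream in = AudioSystem.getAudioInputStream(new File(args[0]))) {
            AudioFormat format = in.getFormat();
            // Print the format as stored in the file header.
            System.out.println(format);
            System.out.println(isSphinxCompatible(format) ? "matches target format" : "does not match target format");
        }
    }
}
```

Running it against `output.wav` prints the encoding, sample rate, bit depth, channel count, and byte order stored in the file itself, which can then be compared with what Audacity reports.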
Can anyone tell me how to convert audio files into the format required by CMU Sphinx 4?