CMUSphinx PocketSphinx - Recognize all (or large amount) of words

Before I tried to use PocketSphinx for Android, I used Google's voice recognition API. I didn't need to set a search name or a dictionary file. It just recognized every word that was spoken.

Now, in PocketSphinx, I need to do that. But I can only find how to set up recognition for one word, or how to set a dictionary (the ones available in the demo project have only a few words) whose words the recognizer treats as the only words that exist. This means that if someone says something similar, the recognizer thinks it is a word listed in the dictionary.

I just want to ask: how can I set several search names, or how can I make it recognize all available words (or at least a large number of them)? Maybe someone has a dictionary file with a large number of words?

Nikitanikki answered 20/9, 2014 at 13:28 Comment(1)
I also need the same. Did you find any such list of words? Slumberous

Before I tried to use PocketSphinx for Android, I used Google's voice recognition API. I didn't need to set a search name or a dictionary file. It just recognized every word that was spoken.

The Google API also recognizes a large but still limited set of words. For a long time it failed to recognize "Spotify". Google's offline speech recognizer uses about 50k words, as described in their publication.

I just want to ask: how can I set several search names, or how can I make it recognize all available words (or at least a large number of them)? Maybe someone has a dictionary file with a large number of words?

The demo includes large vocabulary speech recognition with a language model (the forecast part). There are bigger language models for English available for download, for example the En-US generic language model.

Simple code to run the recognition looks like this:

  // Configure the recognizer with an acoustic model and a pronunciation dictionary.
  recognizer = defaultSetup()
      .setAcousticModel(new File(assetsDir, "en-us-ptm"))
      .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
      .getRecognizer();
  recognizer.addListener(this);

  // Create a large-vocabulary search backed by an n-gram language model.
  recognizer.addNgramSearch(NGRAM_SEARCH, new File(assetsDir, "en-us.lm.bin"));

  // Start the search
  recognizer.startListening(NGRAM_SEARCH);

However, large models like this are not easy to fit onto a device and decode in real time. If you want to decode speech in real time with a large vocabulary, you need to stream audio to a server. Otherwise you need to restrict the vocabulary and language to some small subset of generic English. You can learn more about speech recognition in CMUSphinx in the tutorial.
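
One way to start restricting the vocabulary is to trim the pronunciation dictionary down to the words your application cares about. The following is only a minimal sketch in plain Java, not part of PocketSphinx: the class name `DictSubset` and the method `filter` are made up for illustration. It assumes the dictionary is in the usual CMUSphinx text format, one entry per line with the word first and its phonemes after it, and alternate pronunciations written as `word(2)`. A matching smaller language model or grammar would still be needed; that part is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not a PocketSphinx class): keeps only the
// dictionary entries whose word is in a wanted set.
class DictSubset {

    // dictLines: lines such as "hello HH AH L OW" or "hello(2) HH EH L OW".
    // wanted: lowercase words to keep.
    static List<String> filter(List<String> dictLines, Set<String> wanted) {
        List<String> kept = new ArrayList<>();
        for (String line : dictLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            // The first whitespace-separated token is the word, possibly
            // followed by a variant marker like "(2)".
            String head = trimmed.split("\\s+", 2)[0];
            int paren = head.indexOf('(');
            String word = paren >= 0 ? head.substring(0, paren) : head;
            if (wanted.contains(word.toLowerCase(Locale.ROOT))) {
                kept.add(trimmed);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> dict = Arrays.asList(
                "hello HH AH L OW",
                "hello(2) HH EH L OW",
                "world W ER L D");
        Set<String> wanted = new HashSet<>(Arrays.asList("hello"));
        // Prints the entries that survive the filter, one per line.
        for (String entry : filter(dict, wanted)) {
            System.out.println(entry);
        }
    }
}
```

The filtered lines could then be written to a file and passed to `setDictionary` in place of the full dictionary, under the assumption that the search you add only uses words that remain in it.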

Auscultate answered 20/9, 2014 at 18:15 Comment(2)
The link to the example En-US generic language model is broken. :-( Rizzo
Sorry, works for me. You can also explore sourceforge.net/projects/cmusphinx/files/… Auscultate
