Before I tried PocketSphinx for Android, I used Google's voice recognition API. I didn't need to set a search name or a dictionary file. It just recognized every word that was spoken.
The Google API also recognizes a large but still limited set of words. For a long time it failed to recognize "Spotify". Google's offline speech recognizer uses about 50k words, as described in their publication.
I just want to ask: how could I set several search names, or how could I set it to recognize all the available words (or at least a large number of them)? Maybe someone has a dictionary file with a large number of words?
The demo includes large-vocabulary speech recognition with a language model (the forecast part). Bigger language models for English are available for download, for example the En-US generic language model.
The simple code to run the recognition looks like this:
// defaultSetup() is a static method of edu.cmu.pocketsphinx.SpeechRecognizerSetup
recognizer = defaultSetup()
        .setAcousticModel(new File(assetsDir, "en-us-ptm"))
        .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
        .getRecognizer();
recognizer.addListener(this);
// Create a language model (n-gram) search
recognizer.addNgramSearch(NGRAM_SEARCH, new File(assetsDir, "en-us.lm.bin"));
// Start the search
recognizer.startListening(NGRAM_SEARCH);
However, large language models like these are not easy to fit onto a device and decode in real time. If you want to decode speech in real time with a large vocabulary, you need to stream audio to a server. Otherwise, you need to restrict the vocabulary and language to some small subset of generic English. You can learn more about speech recognition with CMUSphinx in the tutorial.
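As a rough illustration of restricting the vocabulary, the sketch below filters a pronunciation dictionary down to a chosen word list. It is not from the CMUSphinx demo: the class name DictionarySubset, the filter method, and the sample entries are hypothetical, and it assumes the plain-text dictionary layout where each line is a word followed by its phones and alternate pronunciations are marked like "hello(2)".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of PocketSphinx: trims a pronunciation
// dictionary to the words an application actually needs.
public class DictionarySubset {

    // Returns the dictionary lines whose headword is in `wanted` (lowercase words).
    static List<String> filter(List<String> dictLines, Set<String> wanted) {
        List<String> kept = new ArrayList<>();
        for (String line : dictLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            // The headword is everything before the first whitespace.
            String head = trimmed.split("\\s+", 2)[0];
            // Assumed format: alternates look like "hello(2)"; strip the marker.
            int paren = head.indexOf('(');
            String word = paren > 0 ? head.substring(0, paren) : head;
            if (wanted.contains(word.toLowerCase(Locale.ROOT))) {
                kept.add(trimmed);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample entries written in the assumed dictionary format.
        List<String> dict = Arrays.asList(
                "hello HH AH L OW",
                "hello(2) HH EH L OW",
                "weather W EH DH ER",
                "world W ER L D");
        Set<String> wanted = new HashSet<>(Arrays.asList("hello", "world"));
        for (String entry : filter(dict, wanted)) {
            System.out.println(entry);
        }
    }
}
```

A smaller dictionary is only part of the picture: the language model or grammar used for the search would presumably also need to be limited to the same words for the restriction to help.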