Voice recognition on Android with a recorded sound clip?

Asked 23/2, 2010 at 16:15 Answered 17/4, 2014 at 21:22

android speech-recognition voice voice-recognition

I've used the voice recognition feature on Android and I love it. It's one of my customers' most praised features. However, the format is somewhat restrictive: you have to call the recognizer intent, have it send the recording to Google for transcription, and wait for the text to come back.

Some of my ideas would require recording the audio within my app and then sending the clip to Google for transcription.

Is there any way I can send an audio clip to be processed with speech to text?

Godden answered 23/2, 2010 at 16:15 Comment(3)

Do you know if this has since been included in the API? If not did you find a workaround for sending your own recording to Google? – Gamba 22/1, 2011 at 20:35

I am wondering the same thing. I cannot believe Android is this high level, there seriously is a lack of API when it comes to media it seems. – Vidar 23/1, 2011 at 11:6

Android does not provide any library to do this. – Roping 19/2, 2013 at 5:11

I have a solution that works well for combining speech recognition and audio recording. Here is the link to a simple Android project I created to demonstrate it. I also put some screenshots inside the project to illustrate the app.

I'll briefly explain the approach I used. The project combines two features: the Google Speech API and FLAC recording.

Google Speech API is called through HTTP connections. Mike Pultz gives more details about the API:

"(...) the new [Google] API is a full-duplex streaming API. What this means, is that it actually uses two HTTP connections- one POST request to upload the content as a “live” chunked stream, and a second GET request to access the results, which makes much more sense for longer audio samples, or for streaming audio."

However, this API needs to receive a FLAC sound file to work properly. That brings us to the second part: FLAC recording.

I implemented FLAC recording in that project by extracting and adapting some code and libraries from an open source app called AudioBoo. AudioBoo uses native code to record and play the FLAC format.

Thus, it's possible to record a FLAC clip, send it to the Google Speech API, get the text back, and play the clip that was just recorded.
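
To make the two-connection pattern from the quote concrete, here is a minimal plain-Java sketch of how the upload and download requests might be prepared. It is not taken from the linked project. The endpoint URLs and the `pair` query parameter that ties the two requests together are placeholders chosen for illustration, and the `audio/x-flac; rate=...` content type is an assumption about what such a service expects. Nothing is sent over the network until a stream is opened on the returned connections.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;

class DuplexSpeechSketch {
    // Placeholder endpoints (assumption): replace with the real service URLs.
    static final String UP_URL = "https://speech.example.invalid/up";
    static final String DOWN_URL = "https://speech.example.invalid/down";

    /** Prepares, without connecting, the chunked POST that uploads FLAC audio. */
    static HttpURLConnection prepareUpload(String pairId, int sampleRateHz) {
        try {
            URL url = new URL(UP_URL + "?pair=" + pairId);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            // Stream the body in chunks instead of buffering the whole clip;
            // 0 selects the default chunk length.
            conn.setChunkedStreamingMode(0);
            // Assumed content type for FLAC audio at the given sample rate.
            conn.setRequestProperty("Content-Type", "audio/x-flac; rate=" + sampleRateHz);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Prepares, without connecting, the GET that reads the transcription. */
    static HttpURLConnection prepareDownload(String pairId) {
        try {
            URL url = new URL(DOWN_URL + "?pair=" + pairId);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In use, the FLAC bytes would be written to `prepareUpload(...).getOutputStream()` while the response is read from `prepareDownload(...).getInputStream()`, typically on separate background threads, since Android does not allow network I/O on the main thread.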

The project covers the basic principles and can be improved for specific situations. To use it in a different scenario, you need a Google Speech API key, which is obtained by joining the Google Chromium-dev group. I left one key in the project just to show that it works, but I'll remove it eventually. If anyone needs more information, let me know, because I can't put more than 2 links in this post.

Wheal answered 17/4, 2014 at 21:22 Comment(3)

@Isantsan I need to implement similar functionality, but I am finding the second part (recording in FLAC) really difficult. Can you help me? I have also looked into the AudioBoo project but didn't know where to start. – Leelah 15/1, 2016 at 12:1

Actually, the API has changed and the above code crashed when I tried to test it. Is it really possible to record voice and also do speech-to-text on Android? – Shophar 29/2, 2016 at 4:35

If the API has changed, the project might need some tweaks. I haven't kept up with the API for a while. However, when this answer was posted, everything worked as described. – Wheal 4/3, 2016 at 4:15

Unfortunately not at this time. The only interface currently supported by Android's voice recognition service is the RecognizerIntent, which doesn't allow you to provide your own sound data.

If this is something you'd like to see, file a feature request at http://b.android.com. This is also tangentially related to existing issue 4541 and issue 36915103.

Montemontefiascone answered 23/2, 2010 at 19:49 Comment(2)

Does Google provide any facility to evaluate the accuracy of the recognizer or language models? We typically evaluate recognizer accuracy by running prerecorded samples with known transcriptions. Is there a way I can test the Google recognizer to know if it is effective for my application? I'd also like to test the two language models against my prerecorded samples to determine which provides the better accuracy. Is there any way I can do this? – Marijuana 30/7, 2010 at 14:35

Isn't there any way a given sound could be looped back from the microphone ? Something like a socket/file/stream redirection ? – Atmo 24/11, 2013 at 22:10

As far as I know there is still no way to directly send an audio clip to Google for transcription. However, Froyo (API level 8) introduced the SpeechRecognizer class, which provides direct access to the speech recognition service. So, for example, you can start playback of an audio clip and have your Activity start the speech recognizer listening in the background, which will return results after completion to a user-defined listener callback method.

The following sample code should be defined within an Activity, since SpeechRecognizer's methods must be called on the main application thread. You will also need to add the RECORD_AUDIO permission to your AndroidManifest.xml.



    boolean available = SpeechRecognizer.isRecognitionAvailable(this);
    if (available) {
        SpeechRecognizer sr = SpeechRecognizer.createSpeechRecognizer(this);
        sr.setRecognitionListener(new RecognitionListener() {
            @Override
            public void onResults(Bundle results) {
                // process results here
            }
            // RecognitionListener is an interface, so implement its remaining
            // methods here (onReadyForSpeech, onError, onPartialResults, ...)
        });
        Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        // the following appears to be a requirement, but can be a "dummy" value
        intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, "com.dummy");
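        // the documentation lists EXTRA_LANGUAGE_MODEL as a required extra
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        // define any other intent extras you want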

        // start playback of audio clip here

        // this will start the speech recognizer service in the background
        // without starting a separate activity
        sr.startListening(intent);
    }
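
For the "process results here" step, a small helper like the following may be useful. It is a sketch in plain Java, and the class and method names are hypothetical. Inside `onResults` the candidate list would come from `results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)` and the scores from `results.getFloatArray(SpeechRecognizer.CONFIDENCE_SCORES)`, which may be null if the recognizer does not supply them.

```java
import java.util.List;

class RecognitionResults {
    /**
     * Returns the candidate with the highest confidence score. Falls back to
     * the first candidate when scores are missing or do not match the list,
     * and returns null when there are no candidates at all.
     */
    static String bestHypothesis(List<String> candidates, float[] scores) {
        if (candidates == null || candidates.isEmpty()) {
            return null;
        }
        if (scores == null || scores.length != candidates.size()) {
            return candidates.get(0);
        }
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return candidates.get(best);
    }
}
```

Keeping this logic free of Android types means it can be unit tested on a desktop JVM without a device or emulator.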

You can also define your own speech recognition service by extending RecognitionService, but that is beyond the scope of this answer :)

Modena answered 19/2, 2013 at 20:20 Comment(3)

Has anyone tried this and had success? Would you have to wait the entire duration of playback for a long audio file to get speech recognized? – Laryngology 26/6, 2013 at 21:11

Although this was posted a while ago, I've confirmed (with a lot more code) that this idea does work (on Android N). After calling startListening(), wait for the RecognitionListener.onReadyForSpeech() callback and play the audio clip (loudly!). – Caudell 8/11, 2016 at 12:26

Does anybody have a code reference that accepts an audio clip and converts it to text, as mentioned by @Caudell? – Larrabee 9/3, 2019 at 2:31
