Android speech recognition and audio recording at the same time

My application records audio using the MediaRecorder class in an AsyncTask, and also uses the Google speech-to-text API - RecognizerIntent - using the code from this question: How can I use speech recognition without the annoying dialog in android phones

I have also tried recording audio in a Thread, but that was a worse solution and caused more problems. My problem is that my application works properly on the emulator. But the emulator doesn't support speech recognition because it lacks voice recognition services. On my device, my application crashes when I start recording audio and recognizing speech at the same time - "has stopped unexpectedly". However, when Wi-Fi is turned off, the application works properly, just like on the emulator.

Recording audio requires this in the AndroidManifest:

<uses-permission android:name="android.permission.RECORD_AUDIO" />

and speech recognition requires:

<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.INTERNET" />

I suppose the problem is that there is only a single audio input? How can I resolve it? The Google speech recognizer has to run on the main UI thread, so I can't, for example, run it in an AsyncTask; that's why the audio recording is in an AsyncTask. I have no idea why this causes problems.
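To make the collision concrete, here is a minimal, hedged reconstruction of the setup the question describes (an illustration of the problem, not a fix; the class and file names are invented). On most real devices the second start fails, because both APIs want the device's single microphone input:

    import android.app.Activity;
    import android.content.Intent;
    import android.media.MediaRecorder;
    import android.os.Bundle;
    import android.speech.RecognitionListener;
    import android.speech.RecognizerIntent;
    import android.speech.SpeechRecognizer;
    import java.io.IOException;

    public class CollisionDemoActivity extends Activity {

        // Starts both audio paths the way the question describes.
        // Requires the RECORD_AUDIO (and, for recognition, INTERNET) permission.
        void startBoth() throws IOException {
            MediaRecorder recorder = new MediaRecorder();
            recorder.setAudioSource(MediaRecorder.AudioSource.MIC);         // claims the mic
            recorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP);
            recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);
            recorder.setOutputFile(getFilesDir() + "/recording.3gp");
            recorder.prepare();
            recorder.start();

            // Must be called on the main UI thread, and also wants the mic.
            SpeechRecognizer recognizer = SpeechRecognizer.createSpeechRecognizer(this);
            recognizer.setRecognitionListener(new RecognitionListener() {
                @Override public void onReadyForSpeech(Bundle params) {}
                @Override public void onBeginningOfSpeech() {}
                @Override public void onRmsChanged(float rmsdB) {}
                @Override public void onBufferReceived(byte[] buffer) {}
                @Override public void onEndOfSpeech() {}
                @Override public void onError(int error) {}                // typically fires here
                @Override public void onResults(Bundle results) {}
                @Override public void onPartialResults(Bundle partialResults) {}
                @Override public void onEvent(int eventType, Bundle params) {}
            });
            recognizer.startListening(new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH));
        }
    }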

I connected my device to Eclipse and used USB debugging. This is the exception I get in LogCat:

08-23 14:50:03.528: ERROR/ActivityThread(12403): Activity go.android.Activity has leaked ServiceConnection android.speech.SpeechRecognizer$Connection@48181340 that was originally bound here
08-23 14:50:03.528: ERROR/ActivityThread(12403): android.app.ServiceConnectionLeaked: Activity go.android.Activity has leaked ServiceConnection android.speech.SpeechRecognizer$Connection@48181340 that was originally bound here
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ActivityThread$PackageInfo$ServiceDispatcher.<init>(ActivityThread.java:1121)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ActivityThread$PackageInfo.getServiceDispatcher(ActivityThread.java:1016)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ContextImpl.bindService(ContextImpl.java:951)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.content.ContextWrapper.bindService(ContextWrapper.java:347)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.speech.SpeechRecognizer.startListening(SpeechRecognizer.java:267)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at go.android.Activity.startRecordingAndAnimation(Activity.java:285)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at go.android.Activity.onResume(Activity.java:86)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.Instrumentation.callActivityOnResume(Instrumentation.java:1151)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.Activity.performResume(Activity.java:3823)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ActivityThread.performResumeActivity(ActivityThread.java:3118)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ActivityThread.handleResumeActivity(ActivityThread.java:3143)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:2684)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ActivityThread.access$2300(ActivityThread.java:125)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2033)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.os.Handler.dispatchMessage(Handler.java:99)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.os.Looper.loop(Looper.java:123)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ActivityThread.main(ActivityThread.java:4627)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at java.lang.reflect.Method.invokeNative(Native Method)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at java.lang.reflect.Method.invoke(Method.java:521)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:858)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:616)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at dalvik.system.NativeStart.main(Native Method)

And after that another exception:

08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412): Failed to create session
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412): com.google.android.voicesearch.speechservice.ConnectionException: POST failed
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.SpeechServiceHttpClient.post(SpeechServiceHttpClient.java:176)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.SpeechServiceHttpClient.post(SpeechServiceHttpClient.java:88)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.ServerConnectorImpl.createTcpSession(ServerConnectorImpl.java:118)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.ServerConnectorImpl.createSession(ServerConnectorImpl.java:98)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.RecognitionController.runRecognitionMainLoop(RecognitionController.java:679)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.RecognitionController.startRecognition(RecognitionController.java:463)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.RecognitionController.access$200(RecognitionController.java:75)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.RecognitionController$1.handleMessage(RecognitionController.java:300)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at android.os.Handler.dispatchMessage(Handler.java:99)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at android.os.Looper.loop(Looper.java:123)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at android.os.HandlerThread.run(HandlerThread.java:60)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412): Caused by: java.net.SocketTimeoutException
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.harmony.luni.net.PlainSocketImpl.read(PlainSocketImpl.java:564)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.harmony.luni.net.SocketInputStream.read(SocketInputStream.java:88)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:103)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:191)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:82)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:174)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:179)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:235)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:259)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:279)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:121)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:410)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:555)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:487)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:465)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at android.net.http.AndroidHttpClient.execute(AndroidHttpClient.java:243)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.SpeechServiceHttpClient.post(SpeechServiceHttpClient.java:167)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     ... 10 more
08-23 14:50:08.000: ERROR/RecognitionController(12412): Ignoring error 2
Erosion answered 23/8, 2011 at 12:2 Comment(1)
I haven't tested this solution, but maybe there is a possibility. In developer.android.com/reference/android/speech/… there is a method void bufferReceived(byte[] buffer). A possible solution is to save this received buffer via the Android AudioRecord class. It has a method like read(byte[] audioData, int offsetInBytes, int sizeInBytes). So maybe it is possible to connect these two utilities this way? Problems might occur with configuring AudioRecord and with converting the result to mp3 or wav format after recording.Erosion

I got a solution that works well for doing speech recognition and audio recording at the same time. Here is the link to a simple Android project I created to show that the solution works. I also put some screenshots inside the project to illustrate the app.

I'll try to briefly explain the approach I used. I combined two features in that project: the Google Speech API and FLAC recording.

The Google Speech API is called through HTTP connections. Mike Pultz gives more details about the API:

"(...) the new [Google] API is a full-duplex streaming API. What this means, is that it actually uses two HTTP connections- one POST request to upload the content as a “live” chunked stream, and a second GET request to access the results, which makes much more sense for longer audio samples, or for streaming audio."

However, this API needs to receive a FLAC sound file to work properly. That brings us to the second part: FLAC recording.

I implemented FLAC recording in that project by extracting and adapting some pieces of code and libraries from an open-source app called AudioBoo. AudioBoo uses native code to record and play the FLAC format.

Thus, it's possible to record FLAC audio, send it to the Google Speech API, get the text back, and play the sound that was just recorded.
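For illustration, here is a minimal sketch of the two-connection pattern described in the quote above. The endpoint URLs, the "pair" token scheme, and the query parameters follow Mike Pultz's write-up of the unofficial API and are assumptions here; the service requires a key and may change or disappear at any time:

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class FullDuplexSketch {

        private static final String BASE = "https://www.google.com/speech-api/full-duplex/v1";

        /** Streams a recorded FLAC file up and prints the JSON results that come down. */
        public static void recognize(final File flac, final String apiKey, final int sampleRate)
                throws IOException {
            // Both connections carry the same random "pair" token so the server
            // can match the upload stream to the result stream.
            final String pair = Long.toHexString((long) (Math.random() * Long.MAX_VALUE));

            Thread uploader = new Thread(new Runnable() {
                @Override public void run() {
                    try {
                        URL up = new URL(BASE + "/up?key=" + apiKey + "&pair=" + pair
                                + "&lang=en-US&output=json"
                                + "&content-type=audio/x-flac;+rate=" + sampleRate);
                        HttpURLConnection c = (HttpURLConnection) up.openConnection();
                        c.setDoOutput(true);
                        c.setChunkedStreamingMode(0);   // upload as a "live" chunked stream
                        c.setRequestMethod("POST");
                        OutputStream out = c.getOutputStream();
                        InputStream in = new FileInputStream(flac);
                        byte[] buf = new byte[4096];
                        for (int n; (n = in.read(buf)) != -1; ) {
                            out.write(buf, 0, n);
                        }
                        in.close();
                        out.close();
                        c.getResponseCode();            // force the request to complete
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            });
            uploader.start();

            // Second connection: blocks and delivers JSON result lines as they arrive.
            URL down = new URL(BASE + "/down?pair=" + pair);
            HttpURLConnection c = (HttpURLConnection) down.openConnection();
            BufferedReader r = new BufferedReader(new InputStreamReader(c.getInputStream()));
            for (String line; (line = r.readLine()) != null; ) {
                System.out.println(line);               // e.g. {"result":[{"alternative":[...]}]}
            }
            r.close();
        }
    }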

The project I created demonstrates the basic principles and can be improved for specific situations. To make it work in a different scenario, you need to get a Google Speech API key, which is obtained by joining the Google chromium-dev group. I left one key in that project just to show that it works, but I'll remove it eventually. If someone needs more information about it, let me know, since I'm not able to put more than two links in this post.

Bithia answered 17/4, 2014 at 20:41 Comment(3)
Anyway, is this an Eclipse project?Trainbearer
Yeah, it's an Eclipse projectBithia
I am getting an exception in native code when clicking the record button. 03-03 00:28:45.374: A/art(20395): art/runtime/check_jni.cc:65] native: #08 pc 00040b27 /data/dalvik-cache/arm/data@[email protected]@[email protected] (Java_com_example_jni_FLACStreamEncoder_init__Ljava_lang_String_2III+126) 03-03 00:28:45.374: A/art(20395): art/runtime/check_jni.cc:65] at com.example.jni.FLACStreamEncoder.init(Native method) 03-03 00:28:45.374: A/art(20395): art/runtime/check_jni.cc:65] at com.example.jni.FLACStreamEncoder.<init>(FLACStreamEncoder.java:32)Or

Late answer, but for the first exception: you have to destroy your SpeechRecognizer once you are done with it, for example in onStop() or onDestroy(), or directly after you no longer need it:

    if (yourSpeechRecognizer != null)
    {
        yourSpeechRecognizer.stopListening();  // stop consuming the microphone
        yourSpeechRecognizer.cancel();         // drop any pending recognition
        yourSpeechRecognizer.destroy();        // unbind the service connection, avoiding the leak
    }
Indihar answered 24/3, 2013 at 17:22 Comment(0)

I have successfully accomplished this with the help of the Cloud Speech API. You can find its demo in Google's speech samples.

The API recognizes over 80 languages and variants, to support your global user base. You can transcribe the text of users dictating to an application’s microphone, enable command-and-control through voice, or transcribe audio files, among many other use cases. Recognize audio uploaded in the request, and integrate with your audio storage on Google Cloud Storage, by using the same technology Google uses to power its own products.

It uses an audio buffer to transcribe data with the help of the Google Speech API. I used this same buffer to store the audio recording with the help of AudioRecord.

So with this demo, we can transcribe the user's speech in parallel with recording the audio.

The demo starts and stops speech recognition based on voice activity. It also provides a SPEECH_TIMEOUT_MILLIS setting in VoiceRecorder.java, which works just like RecognizerIntent's EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS but is user controlled.

All in all, you can specify a silence timeout; based on it, recognition stops after the user finishes speaking and starts again as soon as the user speaks. A sketch of the pattern follows.
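Here is a hedged sketch of that idea: one AudioRecord read loop feeds the same buffer to a file and to the recognition stream, and a user-controlled silence timeout ends the utterance. The constant names mirror the demo's VoiceRecorder.java; the callback interface and threshold logic are simplified stand-ins, not the demo's actual code:

    import android.media.AudioFormat;
    import android.media.AudioRecord;
    import android.media.MediaRecorder;
    import java.io.FileOutputStream;
    import java.io.IOException;

    public class VoiceRecorderSketch {

        private static final int SAMPLE_RATE = 16000;
        private static final long SPEECH_TIMEOUT_MILLIS = 2000; // user-controlled silence limit
        private static final int AMPLITUDE_THRESHOLD = 1500;    // tune for your microphone

        public interface Callback {
            void onVoice(byte[] data, int size); // forward to the Speech API stream
            void onVoiceEnd();                   // silence lasted past the timeout
        }

        // Requires the RECORD_AUDIO permission.
        public void run(Callback callback, FileOutputStream pcmFile) throws IOException {
            int minBuf = AudioRecord.getMinBufferSize(SAMPLE_RATE,
                    AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
            AudioRecord record = new AudioRecord(MediaRecorder.AudioSource.MIC, SAMPLE_RATE,
                    AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, minBuf);
            byte[] buffer = new byte[minBuf];
            long lastVoiceAt = System.currentTimeMillis();
            record.startRecording();
            try {
                while (true) {
                    int size = record.read(buffer, 0, buffer.length);
                    if (size < 0) break;                // read error; stop the loop
                    pcmFile.write(buffer, 0, size);     // keep the recording
                    if (isHearingVoice(buffer, size)) {
                        lastVoiceAt = System.currentTimeMillis();
                        callback.onVoice(buffer, size); // transcribe in parallel
                    } else if (System.currentTimeMillis() - lastVoiceAt > SPEECH_TIMEOUT_MILLIS) {
                        callback.onVoiceEnd();          // stop after the user stops talking
                        break;
                    }
                }
            } finally {
                record.stop();
                record.release();
            }
        }

        // Crude voice-activity check: any 16-bit little-endian sample above the threshold.
        private boolean isHearingVoice(byte[] buf, int size) {
            for (int i = 0; i + 1 < size; i += 2) {
                int sample = (buf[i + 1] << 8) | (buf[i] & 0xFF);
                if (Math.abs(sample) > AMPLITUDE_THRESHOLD) return true;
            }
            return false;
        }
    }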

Illegal answered 23/9, 2016 at 13:16 Comment(1)
Expensive solutionMaller

Recent projects on 'google-speech' and 'android-opus' (opuslib) allow simple, concurrent recognition along with audio recording to an Opus file in Android external storage.

Looking at the VoiceRecorder in the speech project, with only a few extra lines of code after reading the microphone buffer, the buffer can also be consumed by a file sink (PCM16 to the Opus codec) in addition to the current speech observer, as sketched below.
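A hedged illustration of those "few extra lines": fan each microphone buffer out to both consumers. SpeechObserver and OpusFileSink are hypothetical stand-ins for the corresponding classes in the two linked projects:

    import android.media.AudioRecord;
    import java.io.IOException;

    interface SpeechObserver {                 // stand-in: the project's streaming-recognition consumer
        void onAudio(byte[] buffer, int size);
    }

    interface OpusFileSink {                   // stand-in: opuslib's PCM16-to-Opus file writer
        void write(byte[] buffer, int offset, int size) throws IOException;
    }

    class MicrophoneTee {
        static void pump(AudioRecord record, SpeechObserver observer, OpusFileSink sink)
                throws IOException {
            byte[] buffer = new byte[4096];
            int size;
            while ((size = record.read(buffer, 0, buffer.length)) > 0) {
                observer.onAudio(buffer, size); // existing path: speech recognition
                sink.write(buffer, 0, size);    // added path: encode to an .opus file
            }
        }
    }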

See a minimal merge of the two projects above in Google-speech-opus-recorder.

Postdiluvian answered 5/12, 2016 at 22:7 Comment(2)
Project is deprecatedMaller
Thanks for sharing. But I got this error: E/ApiFragment: Error calling the API. io.grpc.StatusRuntimeException: UNIMPLEMENTED: GRPC target method can't be resolved. Any thoughts?Uppercase

I haven't tested this solution yet, but maybe there is a possibility. In http://developer.android.com/reference/android/speech/RecognitionService.Callback.html there is a method void bufferReceived(byte[] buffer). A possible solution is to save this received buffer via the Android AudioRecord class. It has a method like read(byte[] audioData, int offsetInBytes, int sizeInBytes). So maybe it is possible to connect these two utilities this way? Problems might occur with configuring AudioRecord and with converting the result to mp3 or wav format after recording.
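On the client side, the counterpart of that service callback is RecognitionListener.onBufferReceived(byte[]). Here is an untested sketch of the idea, saving whatever PCM the recognizer forwards to a file; as the comment below notes, many devices never call this method at all, and the format is not formally defined:

    import android.os.Bundle;
    import android.speech.RecognitionListener;
    import java.io.FileOutputStream;
    import java.io.IOException;

    public class BufferSavingListener implements RecognitionListener {
        private final FileOutputStream out;

        public BufferSavingListener(FileOutputStream out) {
            this.out = out;
        }

        @Override
        public void onBufferReceived(byte[] buffer) {
            try {
                out.write(buffer);  // raw PCM, if delivered at all; add a WAV header afterwards if needed
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        // The remaining callbacks are left empty for brevity.
        @Override public void onReadyForSpeech(Bundle params) {}
        @Override public void onBeginningOfSpeech() {}
        @Override public void onRmsChanged(float rmsdB) {}
        @Override public void onEndOfSpeech() {}
        @Override public void onError(int error) {}
        @Override public void onResults(Bundle results) {}
        @Override public void onPartialResults(Bundle partialResults) {}
        @Override public void onEvent(int eventType, Bundle params) {}
    }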

Erosion answered 11/7, 2012 at 12:2 Comment(1)
Unfortunately this method is not called (at all) on every device, so whether you get any audio data at all is not guaranteed. This is a sad and frustrating result, because we tried to take advantage of this feature in our dictation-taking app, Dictator. Also, the format and sample rate of this data aren't formally defined, but it generally looks like 16-bit mono at 8 kHz (implementation-dependent).Dreyer
