Can't access microphone while running Dialog demo in sphinx4 5prealpha
I am trying to run the dialog demo of sphinx4 5prealpha, but it fails with errors.

I am creating a live speech application.

I imported the project using Maven and followed this guide on Stack Overflow: https://mcmap.net/q/1470332/-cmu-sphinx-4-5-pre-alpha-install-guide

The error mentions the 16 kHz sample rate and the mono channel, so it seems to be about the sampling settings. It also mentions the microphone.

I looked for a way to change the microphone settings to 16 kHz and 16 bit, but Windows 7 has no such option.

[Screenshot: the only options available in Windows 7]

The HelloWorld and dialog demos worked fine in sphinx4 1.06 beta, but after I switched to the latest release I get the following errors:

Exception in thread "main" java.lang.IllegalStateException: javax.sound.sampled.LineUnavailableException: line with format PCM_SIGNED 16000.0 Hz, 16 bit, mono, 2 bytes/frame, little-endian not supported.
    at edu.cmu.sphinx.api.Microphone.<init>(Microphone.java:38)
    at edu.cmu.sphinx.api.SpeechSourceProvider.getMicrophone(SpeechSourceProvider.java:18)
    at edu.cmu.sphinx.api.LiveSpeechRecognizer.<init>(LiveSpeechRecognizer.java:34)
    at edu.cmu.sphinx.demo.dialog.Dialog.main(Dialog.java:145)
Caused by: javax.sound.sampled.LineUnavailableException: line with format PCM_SIGNED 16000.0 Hz, 16 bit, mono, 2 bytes/frame, little-endian not supported.
    at com.sun.media.sound.DirectAudioDevice$DirectDL.implOpen(DirectAudioDevice.java:513)
    at com.sun.media.sound.AbstractDataLine.open(AbstractDataLine.java:121)
    at com.sun.media.sound.AbstractDataLine.open(AbstractDataLine.java:413)
    at edu.cmu.sphinx.api.Microphone.<init>(Microphone.java:36)
    ... 3 more

I can't figure out how to resolve the issue.
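One way to narrow this down is to ask Java Sound directly whether any installed mixer reports a capture line for the format named in the exception. The following is only a minimal sketch using the standard javax.sound.sampled API; the class name MicFormatCheck and its two helper methods are made up for illustration and are not part of sphinx4. A result of true would suggest the format itself is fine and the failure may come from something else, such as the line already being held.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;

// Hypothetical diagnostic helper, not part of sphinx4.
final class MicFormatCheck {
    private MicFormatCheck() {}

    /** Builds the format from the exception: 16 kHz, 16 bit, mono, signed, little-endian. */
    static AudioFormat sphinxFormat() {
        return new AudioFormat(16000f, 16, 1, true, false);
    }

    /** Returns true if some installed mixer reports a capture line for the format. */
    static boolean isCaptureSupported(AudioFormat format) {
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
        return AudioSystem.isLineSupported(info);
    }

    public static void main(String[] args) {
        AudioFormat format = sphinxFormat();
        System.out.println(format + " supported: " + isCaptureSupported(format));
    }
}
```

Note that a positive answer here does not guarantee that opening the line will succeed; it only rules out the format as the obvious culprit.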

Haematocryal answered 18/3, 2015 at 11:45 Comment(16)
@NikolayShmyrev The sound card is a Conexant SmartAudio HD. - Haematocryal
OK, I'm sorry, this is a bug in sphinx4: the recognizer doesn't release the resource properly, and Java on Windows doesn't allow the microphone to be opened a second time. - Wealthy
Relevant issue in our tracker: sourceforge.net/p/cmusphinx/bugs/412 - Wealthy
I'll try to fix it in the coming days. - Wealthy
@NikolayShmyrev The demo did not work even once... - Haematocryal
@NikolayShmyrev Please read my entire question. The issue is also about 16 kHz, 16 bit, and the mono channel; you removed that in the edited question. Let me know if you want me to add it again. - Haematocryal
No, it is not related to 16 kHz; we have had many reports of this issue before. It is about exclusive access. Note that the microphone was opened successfully the first time but failed the second time. - Wealthy
Let us continue this discussion in chat. - Haematocryal
@NikolayShmyrev You suggested that my question might already have an answer, but the links in that answer lead nowhere. So should I go back to sphinx4 1.06 beta instead of 5prealpha? - Haematocryal
Just wait over the weekend, we'll fix it. - Wealthy
@NikolayShmyrev Any solution to the problem? - Haematocryal
@NikolayShmyrev Should I switch back to using sphinx4 1.06 beta instead of 5prealpha? - Haematocryal
@NikolayShmyrev This is still broken. Do you expect a patch any time soon? Thanks! Edit: since swapping grammars requires having multiple recognizers, this is a blocker. Is exposing flush() the only thing required for a fix? - Vulgarian
@NikolayShmyrev Still not fixed, it seems. - Dire
Any hack/workaround for this for now? - Dire
I tried this solution too, but pointing to api/microphone, and now it fails because it can't find the digits.grxml resource, which it could find before. - Dire

If you modify SpeechSourceProvider to return a constant microphone reference, it won't try to create multiple microphone references, which is the source of the issue.

public class SpeechSourceProvider {
    private static final Microphone mic = new Microphone(16000, 16, true, false);

    Microphone getMicrophone() {
        return mic;
    }
}

The problem here is that you don't want multiple threads trying to access a single resource, but for the demo, the recognizers are stopped and started as needed so that they aren't all competing for the microphone.
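The same idea can also be written lazily, so that the microphone is only opened on first use instead of when the class is loaded. This is just a sketch of the pattern: SharedSource is a hypothetical generic helper, not a sphinx4 class, and the factory you pass in is whatever creates your microphone.

```java
import java.util.function.Supplier;

// Hypothetical helper: hands the same lazily created instance to every caller.
final class SharedSource<T> {
    private final Supplier<T> factory;
    private T instance;

    SharedSource(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Creates the instance on the first call, then returns the same reference. */
    synchronized T get() {
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }
}
```

In the setting above, the factory would be something like `() -> new Microphone(16000, 16, true, false)`, and getMicrophone() would return the result of get(). Sharing one reference still assumes that only one recognizer records at a time.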

Statis answered 27/12, 2015 at 12:33 Comment(1)
This worked for me too. As of 2019, I just cloned the GitHub repo and faced the same issue. - Centri

As Nickolay explains in the SourceForge forum (here), the microphone resource needs to be released by the recognizer currently using it before another recognizer can use the microphone. While the API is being fixed, I made the following changes to certain classes in the sphinx API as a temporary workaround. This is probably not the best solution, but until a better one is proposed, it will work.


I created a class named MicrophoneExtention with the same source code as the Microphone class, and added the following method:


    public void closeLine(){
        line.close();
    }

Similarly, I created a LiveSpeechRecognizerExtention class from the source code of the LiveSpeechRecognizer class and made the following changes:

  • Use the MicrophoneExtention class defined above:
    private final MicrophoneExtention microphone;
  • Inside the constructor:
    microphone = new MicrophoneExtention(16000, 16, true, false);
  • Add the following method:
    public void closeRecognitionLine(){
        microphone.closeLine();
    }



Finally I edited the main method of the DialogDemo.

    Configuration configuration = new Configuration();
    configuration.setAcousticModelPath(ACOUSTIC_MODEL);
    configuration.setDictionaryPath(DICTIONARY_PATH);
    configuration.setGrammarPath(GRAMMAR_PATH);
    configuration.setUseGrammar(true);

    configuration.setGrammarName("dialog");
    LiveSpeechRecognizerExtention recognizer =
    new LiveSpeechRecognizerExtention(configuration);

    recognizer.startRecognition(true);
    while (true) {
        System.out.println("Choose menu item:");
        System.out.println("Example: go to the bank account");
        System.out.println("Example: exit the program");
        System.out.println("Example: weather forecast");
        System.out.println("Example: digits\n");

        String utterance = recognizer.getResult().getHypothesis();

        if (utterance.startsWith("exit"))
            break;

        if (utterance.equals("digits")) {
            recognizer.stopRecognition();
            recognizer.closeRecognitionLine();
            configuration.setGrammarName("digits.grxml");
            recognizer=new LiveSpeechRecognizerExtention(configuration);
            recognizeDigits(recognizer);
            recognizer.closeRecognitionLine();
            configuration.setGrammarName("dialog");
            recognizer=new LiveSpeechRecognizerExtention(configuration);
            recognizer.startRecognition(true);
        }

        if (utterance.equals("bank account")) {
            recognizer.stopRecognition();
            recognizerBankAccount(recognizer);
            recognizer.startRecognition(true);
        }

        if (utterance.endsWith("weather forecast")) {
            recognizer.stopRecognition();
            recognizer.closeRecognitionLine();
            configuration.setUseGrammar(false);
            configuration.setLanguageModelPath(LANGUAGE_MODEL);
            recognizer=new LiveSpeechRecognizerExtention(configuration);
            recognizeWeather(recognizer);
            recognizer.closeRecognitionLine();
            configuration.setUseGrammar(true);
            configuration.setGrammarName("dialog");
            recognizer=new LiveSpeechRecognizerExtention(configuration);
            recognizer.startRecognition(true);
        }
    }

    recognizer.stopRecognition();

Obviously, the method signatures in the DialogDemo need changing as well, so that the helper methods accept a LiveSpeechRecognizerExtention. Hope this helps. On a final note, I am not sure whether what I did is exactly legal to start with. If I am doing something wrong, please be kind enough to point out my mistakes :D
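The repeated stop / close / recreate / start sequence in the loop above could be pulled into one helper so the order cannot get mixed up. What follows is only a sketch under assumptions: LineOwningRecognizer is a hypothetical interface standing in for the three methods the workaround uses (LiveSpeechRecognizerExtention would have to implement it), and RecognizerSwap is a made-up name.

```java
import java.util.function.Supplier;

// Hypothetical view of the three calls the workaround relies on.
interface LineOwningRecognizer {
    void startRecognition(boolean clear);
    void stopRecognition();
    void closeRecognitionLine();
}

// Hypothetical helper, not part of sphinx4.
final class RecognizerSwap {
    private RecognizerSwap() {}

    /**
     * Stops the current recognizer, releases its line, and only then
     * builds and starts the replacement. Assumes current is still running.
     */
    static <R extends LineOwningRecognizer> R swap(R current, Supplier<R> next) {
        current.stopRecognition();      // 1. stop recognition
        current.closeRecognitionLine(); // 2. close the line
        R replacement = next.get();     // 3. create the new recognizer
        replacement.startRecognition(true); // 4. start recognition
        return replacement;
    }
}
```

A call would look like `recognizer = RecognizerSwap.swap(recognizer, () -> new LiveSpeechRecognizerExtention(configuration));`, with the IOException from the constructor handled inside the lambda. If a helper such as recognizeDigits has already stopped the recognizer, the stop step should be skipped rather than repeated.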

Dibrin answered 17/10, 2015 at 17:6 Comment(8)
I followed your modifications, and when I run it now I get this error: "java.lang.IllegalStateException: Expected state READY actual state DEALLOCATED at edu.cmu.sphinx.recognizer.Recognizer.checkState(Recognizer.java:134)" - Phenylalanine
I am not sure where the problem could have occurred; mine works just fine. What is on the call stack? If I am reading this right, that exception is thrown by the recognizer's decoder, not the line. - Dibrin
Exception in thread "main" java.lang.IllegalStateException: Expected state READY actual state DEALLOCATED at edu.cmu.sphinx.recognizer.Recognizer.checkState(Recognizer.java:134) at edu.cmu.sphinx.recognizer.Recognizer.recognize(Recognizer.java:103) at edu.cmu.sphinx.recognizer.Recognizer.recognize(Recognizer.java:122) at edu.cmu.sphinx.api.AbstractSpeechRecognizer.getResult(AbstractSpeechRecognizer.java:63) at edu.cmu.sphinx.demo.dialog.DialogDemo.main(DialogDemo.java:153) - Phenylalanine
This is the console output I got. - Phenylalanine
Make sure the order of the statements is correct: 1. stop recognition, 2. close line, 3. reassign recognizer, 4. start recognition. As I said, this works for me just fine. - Dibrin
Hmm... I copied your code as it is and modified it as you listed, but I am still getting that issue. - Phenylalanine
String utterance = recognizer.getResult().getHypothesis(); - Phenylalanine
From what I have learnt, the startRecognition() method allocates the resources in the recognizer and moves the recognizer's state from DEALLOCATED to READY. When getResult() is called, it checks whether the recognizer's resources have been allocated by checking the state, and throws the exception you got if the state is not the expected one. So I am guessing your problem is in the startRecognition() method of your LiveSpeechRecognizerExtention. - Dibrin

The answer by aetherwalker worked for me. In more detail, I replaced the following classes with my own implementations, in which I only changed the SpeechSourceProvider being used:

First one is the AbstractSpeechRecognizer:

public class MaxAbstractSpeechRecognizer {
protected final Context context;
protected final Recognizer recognizer;

protected ClusteredDensityFileData clusters;

protected final MaxSpeechSourceProvider speechSourceProvider;

/**
 * Constructs recognizer object using provided configuration.
 * @param configuration initial configuration
 * @throws IOException if IO went wrong
 */
public MaxAbstractSpeechRecognizer(Configuration configuration)
    throws IOException
{
    this(new Context(configuration));
}

protected MaxAbstractSpeechRecognizer(Context context) throws IOException {
    this.context = context;
    recognizer = context.getInstance(Recognizer.class);
    speechSourceProvider = new MaxSpeechSourceProvider();
} .......................

Then the LiveSpeechRecognizer:

public class MaxLiveSpeechRecognizer extends MaxAbstractSpeechRecognizer {

private final Microphone microphone;

/**
 * Constructs new live recognition object.
 *
 * @param configuration common configuration
 * @throws IOException if model IO went wrong
 */
public MaxLiveSpeechRecognizer(Configuration configuration) throws IOException
{
    super(configuration);
    microphone = speechSourceProvider.getMicrophone();
    context.getInstance(StreamDataSource.class)
        .setInputStream(microphone.getStream());
}......................

And last but not least the SpeechSourceProvider:

import edu.cmu.sphinx.api.Microphone;

public class MaxSpeechSourceProvider {

private static final Microphone mic = new Microphone(16000, 16, true, false);

Microphone getMicrophone() {
    return mic;
}
}
Baba answered 30/3, 2016 at 19:0 Comment(0)

With this change, things were fine for me as long as I stayed within CMUSphinx: the line could be reused many times. But when I tried to reuse the mic for other work (such as recording), it was not available!
I saw that the stream was opened in the Microphone class but never closed.

So first, in class Microphone, I changed the following attributes from static to dynamic:

private TargetDataLine line;
private InputStream inputStream;

Then I changed the stopRecording method to close the stream before stopping the line:

/**
 * Closes the stream, then stops the line.
 */
public void stopRecording() {

    if (inputStream != null )
        try {
            inputStream.close();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }

    line.stop();

}

Now, with no further changes (class SpeechSourceProvider is the original), I can use the mic alternately for CMUSphinx and for another recording task.
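For code outside sphinx4 that records from the same mic, the same ordering can be sketched as a small standalone helper. This is an illustration only: CaptureRelease is a made-up class, and the extra line.close() call is my assumption for the case where you are completely done with the line (the stopRecording change above only calls line.stop()).

```java
import java.io.IOException;
import java.io.InputStream;
import javax.sound.sampled.TargetDataLine;

// Hypothetical helper, not part of sphinx4.
final class CaptureRelease {
    private CaptureRelease() {}

    /** Closes the stream first, then stops and closes the capture line. */
    static void release(InputStream stream, TargetDataLine line) {
        try {
            if (stream != null) {
                stream.close();
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } finally {
            // Assumption: the caller no longer needs the line at all.
            if (line != null) {
                line.stop();
                line.close();
            }
        }
    }
}
```

Both arguments may be null, so the helper can be called from a cleanup path even when opening the mic failed halfway.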

Hephaestus answered 19/2, 2020 at 9:4 Comment(0)
