I have a PHP web application and am looking for an open-source, high-accuracy speech-to-text implementation that lets users open web pages with voice commands. Examples: "Make Sales" (opens the Create Sales PHP page), "Make Purchase Order", "Open End-of-Day Reports", etc.
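To make the goal concrete, here is a minimal sketch of the mapping step I have in mind once a transcript is available. It is written in Python purely for illustration (the real application is PHP), and the page paths, the `COMMAND_PAGES` table, and the `route_command` function are hypothetical placeholders, not existing code.

```python
import re
from typing import Optional

# Hypothetical command phrases mapped to hypothetical PHP page paths.
COMMAND_PAGES = {
    "make sales": "/create_sales.php",
    "make purchase order": "/create_purchase_order.php",
    "open end of day reports": "/end_of_day_reports.php",
}


def normalize(transcript: str) -> str:
    """Lowercase the text, turn punctuation and hyphens into spaces, and trim."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", transcript.lower())
    return cleaned.strip()


def route_command(transcript: str) -> Optional[str]:
    """Return the page for a recognized command, or None if nothing matches."""
    return COMMAND_PAGES.get(normalize(transcript))
```

An exact-match lookup like this is only a starting point; since recognition output will presumably contain errors, some fuzzy matching against the known phrases would likely be needed in practice.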
My questions:
Can Mozilla DeepSpeech be used to take .wav audio recorded in a Firefox browser and return the transcribed text? If so, what would the flow look like, from recording the user's voice through the microphone in Firefox to converting it to text with the DeepSpeech engine?
How can I implement a wake word (similar to "OK Google") so that the application is ready to listen for commands?