Google speech API [closed]

1

25

I'm working on a project and am about to build a Siri-like application for desktop computers. Is the Google Speech API reliable and accurate for speech recognition? Can you suggest which speech API is the most accurate at speech recognition, preferably a free one? Thank you.

Comeau answered 4/10, 2012 at 6:24 Comment(1)
You may want to consider running your own speech recognizer. CMU Sphinx provides specific acoustic models and build instructions for use on mobile devices: cmusphinx.sourceforge.net/wiki/building - Condottiere
37

While the Google speech API is free, it is not an official public API. Some people have reverse engineered it, as is discussed in this blog. If you are planning on accessing the API directly for a commercial product, I would not recommend it, because Google can drop it or change it without warning, breaking your product. This recently happened to developers who used the Google Weather API. If, on the other hand, you are accessing it through a Chrome browser using x-webkit-speech, you are probably safe, since that is supported by Google.

Google's speech recognition is right up there with a lot of the more popular commercial solutions. They have a lot of experience with it from other projects like Google Voice and the now defunct Google 411, and they have some of the top speech scientists working for them.

The only other free alternative I can think of is Sphinx, which is an open source project out of Carnegie Mellon University. It has a steep learning curve, and if you want it set up as a service you will have to develop that yourself.

Nuance is the other big player in the speech recognition market (I believe that is what Siri uses), and they do have solutions that offer speech recognition as a service. But they are pricey.

Update on Answer From Comments on Language Support

Windows Speech Recognition supports other languages, as do most speech recognition systems. The caveat is that you have to tell the system what language to use, and it has to support the language in question. Each vendor has a list of languages it supports, and they are specific to a region. For example, a vendor may support Mexican Spanish, American Spanish, and Spain Spanish, which are slightly different dialects. But the speech recognition engine can only support one language/dialect at a time per user. A user cannot speak multiple languages to a speech recognition system without first requesting that it change to that language.
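
As a rough illustration of the "one language/dialect at a time" point, here is a minimal JavaScript sketch. The `SUPPORTED` list, the `pickLanguage` helper, and the fallback rule (same language, different region) are assumptions made for this example, not any vendor's actual list or behavior; only the BCP 47 style tags such as `es-MX` are standard.

```javascript
// Hypothetical list of language/region tags a vendor might support.
const SUPPORTED = ["en-US", "es-MX", "es-US", "es-ES"];

// Resolve a requested tag to a single supported tag, or null if the
// language is not supported at all. Matching is case-insensitive.
function pickLanguage(requested, supported = SUPPORTED) {
  const wanted = requested.toLowerCase();

  // Prefer an exact language + region match.
  const exact = supported.find((tag) => tag.toLowerCase() === wanted);
  if (exact) return exact;

  // Assumed fallback: the first supported dialect of the same language.
  const primary = wanted.split("-")[0];
  const sameLanguage = supported.find(
    (tag) => tag.toLowerCase().split("-")[0] === primary
  );
  return sameLanguage || null;
}
```

The application would call something like `pickLanguage` once when the user switches languages and then pass the single resulting tag to the recognizer, rather than expecting the engine to detect the language on its own.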

Updated 3/17/2014

The x-webkit-speech input field is being deprecated due to lack of support in other browsers. It will be replaced by the Web Speech API, which is a JavaScript API. You can find an example of how to use it here.
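
For readers without the linked example, a minimal sketch of the Web Speech API might look like the following. The helper name `createRecognizer` and the `onTranscript` callback are hypothetical names chosen for this illustration; the `SpeechRecognition` / `webkitSpeechRecognition` constructor and the `lang`, `continuous`, `interimResults`, and `onresult` members come from the API itself. Browser support varies, so the sketch returns `null` when the constructor is missing.

```javascript
// Sketch: look up the Web Speech API constructor on a global object
// (normally `window`) and configure a recognizer for one language.
function createRecognizer(globalObj, lang, onTranscript) {
  // Chrome exposes the constructor with a webkit prefix.
  const Ctor = globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition;
  if (!Ctor) {
    return null; // this environment has no Web Speech API support
  }

  const recognizer = new Ctor();
  recognizer.lang = lang;            // language tag, e.g. "en-US"
  recognizer.continuous = false;     // stop after a single utterance
  recognizer.interimResults = false; // deliver final results only

  recognizer.onresult = (event) => {
    // Each result holds ranked alternatives; take the top one of the latest.
    const last = event.results[event.results.length - 1];
    onTranscript(last[0].transcript);
  };
  return recognizer;
}

// Browser usage (not executed here):
// const r = createRecognizer(window, "en-US", (text) => console.log(text));
// if (r) r.start();
```

Passing the global object in as a parameter is only a convenience so the function can be exercised outside a browser; in a page you would pass `window`.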

Adinaadine answered 4/10, 2012 at 13:3 Comment(4)
I'm in complete agreement with Kevin on the Google API. I'd just add one more suggestion, since the question was for a desktop app. Windows provides free speech recognition for both its desktop and server operating systems. See https://mcmap.net/q/539145/-sapi-and-windows-7-problem and https://mcmap.net/q/539146/-text-to-speech-voice-generation-and-speech-to-text-voice-recognition-apis for more info. - Colcothar
I'm having trouble with the accuracy of Windows Speech Recognition, maybe because it requires speaking in English. I'm also asking which API is the most accurate at speech recognition and can also adapt to other diction. Thank you, Michael Levy and Kevin Junghans. - Comeau
Thank you so much for all of your responses, sir. I am using the Google speech API for my project now. Its accuracy is good, but I may switch next time if another speech API offers better accuracy; it depends on the project I'm going to use it for. Thank you. - Comeau
The x-webkit-speech input field is deprecated. - Pietro
