How can we send text input to the Google Assistant?

2

7

Currently, the Google Assistant SDK accepts voice input, so my question is fairly simple: I want to converse with the Google Assistant through text chat rather than voice. This is certainly possible, for instance in Google Allo. Has Google exposed an API for text input?

Antiparticle answered 1/5, 2017 at 16:14 Comment(1)
It's not yet possible, but I encourage you to join this discussion on the Google+ community about it. - Majuscule
4

This is now supported in the v1alpha2 version of the Google Assistant SDK Service.

Erv answered 29/12, 2017 at 19:49 Comment(1)
Here's a sample: github.com/googlesamples/assistant-sdk-python/blob/master/… - Voluminous
2

It doesn't look like the SDK accepts text, but it does accept an audio file as input. It can even write its response to an audio file:

python -m pushtotalk -i somefile.wav -o outputfile.wav

This got me thinking and I wrote a script:

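#!/bin/bash
# Write the question to a file (overwrite, so a leftover file can't pile up queries).
echo "$1" > query.txt
# Synthesize the question as speech.
espeak -f query.txt -w audio_query.wav
# Send the audio to the Assistant and save the spoken reply.
python -m pushtotalk -i audio_query.wav -o audio_response.wav &> pushtotalk.log
# Transcribe the reply back to text.
pocketsphinx_continuous -infile audio_response.wav 2> pocketsphinx.log > response.txt
cat response.txt

# Clean up the temporary files.
rm response.txt query.txt audio_query.wav audio_response.wav pocketsphinx.log pushtotalk.log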

This is just a shell script, but it could likely be converted to Python too. To use it, save the script as pushtotalk_script.sh and run ./pushtotalk_script.sh "how tall is mount kilimanjaro?". I'm using espeak to turn the text into a WAV file, then the Assistant SDK to get a response. You could stop here and simply play the response. Pocketsphinx is an audio transcription engine created by CMU; here it turns the response audio back into text. You can find packages for these tools using apt-get, but if you're on OS X, the pocketsphinx package doesn't work and you'll need to tap these formulas. Also, here's a Python module to use espeak. And there's a repo for pocketsphinx as a Python module, but I can't link more than two links.
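For anyone who wants the Python version mentioned above, here is a rough sketch of the same three-step pipeline using subprocess. It assumes espeak, pocketsphinx_continuous and the pushtotalk sample are installed and invoked exactly as in the shell script. The function name ask_assistant and its run parameter are my own and hypothetical; run exists only so the pipeline can be tried with a stand-in instead of the real tools. Treat it as a starting point, not a tested integration.

```python
import os
import subprocess
import tempfile


def ask_assistant(question, run=subprocess.run):
    """Speak `question` to the Assistant and return a transcript of the reply.

    `run` is a hypothetical hook (defaults to subprocess.run) so the
    pipeline can be exercised without the real command-line tools.
    """
    # A temporary directory replaces the manual `rm` at the end of the script.
    with tempfile.TemporaryDirectory() as workdir:
        query_txt = os.path.join(workdir, "query.txt")
        audio_query = os.path.join(workdir, "audio_query.wav")
        audio_response = os.path.join(workdir, "audio_response.wav")

        # Step 1: write the question to a file and synthesize it with espeak.
        with open(query_txt, "w") as f:
            f.write(question + "\n")
        run(["espeak", "-f", query_txt, "-w", audio_query], check=True)

        # Step 2: send the audio to the Assistant, same invocation as the
        # shell script; its console output is discarded.
        run(["python", "-m", "pushtotalk", "-i", audio_query, "-o", audio_response],
            check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

        # Step 3: transcribe the spoken reply; pocketsphinx logs to stderr
        # and prints its transcript to stdout.
        result = run(["pocketsphinx_continuous", "-infile", audio_response],
                     check=True, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
                     universal_newlines=True)
        return result.stdout.strip()
```

With the tools installed, print(ask_assistant("how tall is mount kilimanjaro?")) should behave like the shell script. Because check=True is set, a failing step raises subprocess.CalledProcessError instead of silently continuing to the next one.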

Google's Assistant doesn't seem to have much trouble understanding the output from espeak. Pocketsphinx, however, usually has some trouble transcribing the response audio, though it works well for simple responses. Depending on the length of the question and the response audio files, the whole process takes about 5 to 10 seconds.

Mcmorris answered 30/6, 2017 at 16:12 Comment(2)
Also remember to give the script permission to run using chmod (for example, chmod +x pushtotalk_script.sh). - Mcmorris
This feels inelegant. I don't know; I had thought of synthesizing the speech too. - Antiparticle
