Using Google Text-To-Speech API to save speech audio

I am trying to implement the methods discussed in this question to write a PHP function that downloads an audio file for a given string, but I can't get around Google's abuse protection. Results are sporadic: sometimes I get an audio file, and other times I get a 2KB mp3 with no audio because the response was "Our systems have detected unusual traffic from your computer network". Here is what I've got so far (note that $file has a location in my code, but I've omitted it here):

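function downloadMP3( $url, $file ){
    $curl = curl_init();

    curl_setopt( $curl, CURLOPT_URL, $url );
    // Return the response body as a string instead of printing it.
    curl_setopt( $curl, CURLOPT_RETURNTRANSFER, true );
    // Mimic a request coming from the Google Translate page on Android.
    curl_setopt( $curl, CURLOPT_REFERER, 'http://translate.google.com/' );
    curl_setopt( $curl, CURLOPT_USERAGENT, 'stagefright/1.2 (Linux;Android 5.0)' );

    $output = curl_exec( $curl );

    curl_close( $curl );

    // curl_exec() returns false only on a transport error; an HTML error page still passes this check.
    if( $output === false ) {
        return false;
    }

    $fp = fopen( $file, 'wb' );
    if( $fp === false ) {
        return false;
    }
    fwrite( $fp, $output );
    fclose( $fp );

    return true;
}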

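$word = "Test";

// Cache each word under a file name derived from its MD5 hash.
$file = md5( $word ) . '.mp3';

if ( !file_exists( $file ) ) {
    // urlencode() keeps spaces and special characters in $word from breaking the query string.
    $url = 'http://translate.google.com/translate_tts?q=' . urlencode( $word ) . '&tl=en&client=t';
    downloadMP3( $url, $file );
}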
Mila answered 7/12, 2015 at 13:5 Comment(12)
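One way to at least stop the error page from being cached as a fake mp3 is to inspect the response before writing it to disk. The following is only a sketch and does not get around the abuse protection itself. The names isAudioResponse and downloadMP3Checked are made up for illustration, and it assumes that a successful reply is served with an audio/* content type while the "unusual traffic" page is served as HTML or with a non-200 status.

```php
<?php
// Hypothetical helper (not from the original post): decides whether a
// response looks like real audio rather than an HTML error page.
function isAudioResponse(int $httpCode, ?string $contentType, string $body): bool
{
    if ($httpCode !== 200 || $body === '') {
        return false;
    }
    // Assumption: real audio arrives with an audio/* content type
    // (for example audio/mpeg); the block page does not.
    return $contentType !== null && stripos($contentType, 'audio/') === 0;
}

// Hypothetical variant of downloadMP3() that refuses to save non-audio replies.
function downloadMP3Checked(string $url, string $file): bool
{
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl, CURLOPT_REFERER, 'http://translate.google.com/');
    curl_setopt($curl, CURLOPT_USERAGENT, 'stagefright/1.2 (Linux;Android 5.0)');

    $body = curl_exec($curl);
    // Both values describe the last transfer made on this handle.
    $httpCode = (int) curl_getinfo($curl, CURLINFO_HTTP_CODE);
    $contentType = curl_getinfo($curl, CURLINFO_CONTENT_TYPE);
    curl_close($curl);

    // Treat a transport failure or a non-audio reply as "no file".
    if ($body === false || !isAudioResponse($httpCode, $contentType ?: null, $body)) {
        return false;
    }

    return file_put_contents($file, $body) !== false;
}
```

With this in place, a blocked request returns false and leaves no file behind, so the file_exists() check will retry the download on the next call instead of serving the 2KB error page.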
Hi Julius, I'll take a look at this within the next couple of hours. At first glance, this looks like it should work, though you're missing the ie=UTF-8 in the query string. Try adding that, but I'll be back in a few hours in any case. Schuh
I tried that and it doesn't seem to make a difference. It seems to be sporadic in that it occasionally works and then it stops working. Any ideas? Mila
I just tested the curl command on OSX at my university, and it works just fine. This makes me think that there's something wrong with your PHP code, or you're having network issues (maybe you're in a place that might spam Google a lot?). Unfortunately I have no knowledge of PHP, nor of how to run scripts on OSX or Ubuntu, so I can't really help debug your code... do string-type variables need to be in " characters instead of ' ? After a quick Google search, it looks like you might need CURLOPT_BINARYTRANSFER, as seen here. Schuh
In your tests, how are you calling the curl command if not with PHP? Does it work indefinitely for you, or are you cut off after about five requests? I tried CURLOPT_BINARYTRANSFER, but it has no effect as of PHP 5.1.3. I'm stumped and can only assume this simply isn't going to work. Mila
Test out the curl command in my previous answer. It will work in any *nix terminal (OSX, Linux distros, etc.), or you can install curl for Windows. It's a one-line command: curl 'http://translate.google.com/translate_tts?ie=UTF-8&q=Hello&tl=en&client=t' -H 'Referer: http://translate.google.com/' -H 'User-Agent: stagefright/1.2 (Linux;Android 5.0)' > google_tts.mp3. If the command doesn't work for you, it's definitely a network issue. If it does work, however, your PHP code needs some work. Probably a missing or misconfigured header. Schuh
In fact, it seems like using curl in PHP is actually a bad option. PHP has an http_get function (provided by the pecl_http extension rather than by core PHP). That looks like a better solution than using curl, and you can also set HTTP headers with that function. I would try that! Schuh
Thanks for your efforts, Chris. But I tried some local command-line tests in the terminal and got the same results: it manages one or two requests, but then fails and just saves the error message into a non-playable mp3. Basically, I want to make sure that this would work with no cut-off point, as I am planning to use it on a website where I can't predict the volume of traffic. When you try it, are you able to consistently fetch different mp3s when you make a series of back-to-back requests? Mila
Julius, yes, I have similar code working in a production Android app used by hundreds of thousands of regular users. It seems to me that you have some kind of network issue here. Perhaps your requests are coming from a region that often spams Google services. I can't say for sure... do you have experience using Wireshark? It would be helpful to capture your requests and responses to see what exactly is going on. Schuh
I am having the same issue. Have you managed to find a solution for this? When I try curl from the command line with the client=t option, it downloads the mp3 file, but it does not play. If I don't use the client=t option, it still downloads the file, but this time the file size is 0. Either way the file is not playable. I am doing this on Windows. I wonder how they do this here: soundoftext.com Chalutz
@ChrisCirefice your curl command isn't working for me either; I tested it on 5 servers in 3 different data centers. Mireyamiriam
You currently need to send a token along (check out Google Translate, then press speak and look at the network request). I'm trying to find a way to fix this. Shaughn
As @RobQuist noted, Google now requires a token (the tk parameter in the query string). However, if you check the GET request made when using translate.google.com, it generates a valid one that you can then use in a cURL command. Please see my edit to my answer on the other post, which has the cURL command working. You can add the tk parameter to your PHP code and it should work. Your $url should look like this now: $url = 'http://translate.google.com/translate_tts?q=' . $word . '&tl=en&tk=995126.592330&client=t'; Schuh
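Pulling the comments together, the query string now carries ie, q, tl, tk and client. A minimal sketch of assembling it with http_build_query() is below, so that the text is encoded correctly and the token is passed in rather than hard-coded. The function name buildTtsUrl is made up for illustration, and the token is assumed to be a tk value copied from a real translate.google.com request. Nothing in this thread establishes whether a token copied for one piece of text stays valid for other text or for later requests.

```php
<?php
// Hypothetical helper: builds the translate_tts URL from its parts.
// The parameter names (ie, q, tl, tk, client) come from the comments above;
// $token is an assumption: a tk value copied from a real browser request.
function buildTtsUrl(string $text, string $token, string $lang = 'en'): string
{
    $params = [
        'ie'     => 'UTF-8',
        'q'      => $text,
        'tl'     => $lang,
        'tk'     => $token,
        'client' => 't',
    ];

    // Pass the separator and encoding explicitly so the result does not
    // depend on php.ini settings; RFC 3986 encodes spaces as %20.
    $query = http_build_query($params, '', '&', PHP_QUERY_RFC3986);

    return 'http://translate.google.com/translate_tts?' . $query;
}
```

For example, buildTtsUrl('Hello world', '995126.592330') yields a URL whose q parameter is Hello%20world, which can then be passed to downloadMP3().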

Try another service. I just found one that works even better than Google Translate: Google Text-To-Speech API

Shaughn answered 30/12, 2015 at 20:9 Comment(0)
