Google Translate API outputs HTML entities
L

5

11

ENGLISH: Sale ID prefix is a required field

FRENCH: Vente préfixe d&#39;ID est un champ obligatoire

Is there a way to have Google Translate output the actual character (') instead of the HTML entity?

CODE: (SEE translateTo)

#!/usr/bin/php
<?php
$languages = array('english' => 'en', 'spanish' => 'es', 'indonesia' => 'id', 'french' => 'fr', 'italian' => 'it', 'dutch' => 'nl', 'portugues' => 'pt', 'arabic' => 'ar');

fwrite(STDOUT, "Please enter file: ");
$file = trim(fgets(STDIN));

//Run until user kills it
while(true)
{
    fwrite(STDOUT, "Please enter key: ");
    $key = trim(fgets(STDIN));

    fwrite(STDOUT, "Please enter english value: ");
    $value = trim(fgets(STDIN));

    foreach($languages as $folder=>$code)
    {
        $path = dirname(__FILE__).'/../../application/language/'.$folder.'/'.$file;
        $transaltedValue = translateTo($value, $code);

        $current_file_contents = file_get_contents($path); 

        //If we have already translated, update it
        if (preg_match("/['\"]{1}${key}['\"]{1}/",$current_file_contents))
        {
            $find_existing_translation = "/(\[['\"]{1})(${key}['\"]{1}[^=]+=[ ]*['\"]{1})([^'\"]+)(['\"]{1};)/";
            $new_file_contents = preg_replace($find_existing_translation, '${1}${2}'.$transaltedValue.'${4}', $current_file_contents);
            file_put_contents($path, $new_file_contents);
        }
        else //We haven't translated: Add
        {
            $pair = "\$lang['$key'] = '$transaltedValue';";
            file_put_contents($path, str_replace('?>', "$pair\n?>", $current_file_contents));
        }
    }


    fwrite(STDOUT, "Quit? (y/n): ");
    $quit = strtolower(trim(fgets(STDIN)));

    if ($quit == 'y' || $quit == 'yes')
    {
        exit(0);
    }
}

function translateTo($value, $language_key)
{
    if ($language_key == 'en')
    {
        return $value;
    }

    $api_key = 'MY_API_KEY';
    $value = urlencode($value);

    $url ="https://www.googleapis.com/language/translate/v2?key=$api_key&q=$value&source=en&target=$language_key";

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
    $body = curl_exec($ch);
    curl_close($ch);

    $json = json_decode($body);

    return $json->data->translations[0]->translatedText;
}
?>
Linville answered 10/11, 2014 at 19:26 Comment(2)
Have you tried specifying the format as text? According to the API documentation, this defaults to HTML. I understand that this parameter specifies the format of the text that is to be translated, but it is worth considering that the response will be in the same format as the request. - Bidden
That did it! Please post it as an answer so I can award you points! - Linville
B
18

According to the Google Translate documentation, you can choose the format of the text that is to be translated (see format in the query parameters). The format defaults to HTML if not specified.

Set this query parameter to text to indicate that you are sending plain text, since Google will likely return the translated text in the same format it received.

So your PHP code could become:

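$baseUrl = "https://www.googleapis.com/language/translate/v2";
// $value is assumed to be URL-encoded already, as in translateTo() in the question
$params = "?key=$api_key&q=$value&source=en&target=$language_key&format=text";
$ch = curl_init();
// PHP concatenates strings with ".", not "+"
curl_setopt($ch, CURLOPT_URL, $baseUrl . $params);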
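
As a variation on the snippet above, the query string can be built with http_build_query, which handles the URL-encoding so the raw text can be passed in. This is only a sketch: buildTranslateUrl is a hypothetical helper name, and the parameter names are the ones already used in the question's URL plus format.

```php
<?php
// Hypothetical helper (not part of any Google library): builds the v2
// request URL with format=text so the response is not HTML-escaped.
function buildTranslateUrl(string $apiKey, string $text, string $target, string $source = 'en'): string
{
    $baseUrl = 'https://www.googleapis.com/language/translate/v2';

    // http_build_query URL-encodes every value, so $text is passed raw.
    $query = http_build_query([
        'key'    => $apiKey,
        'q'      => $text,
        'source' => $source,
        'target' => $target,
        'format' => 'text',
    ], '', '&');

    return $baseUrl . '?' . $query;
}

// Print the URL that would be handed to curl_setopt($ch, CURLOPT_URL, ...).
echo buildTranslateUrl('MY_API_KEY', 'Sale ID prefix is a required field', 'fr'), PHP_EOL;
```

Passing '&' explicitly as the separator keeps the result independent of the arg_separator.output ini setting.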
Bidden answered 15/11, 2014 at 13:34 Comment(2)
The query parameters page is now here: cloud.google.com/translate/docs/reference/… - Uniform
I was struggling a bit with this issue in JavaScript. Thanks for the answer! - Schock
T
3

For anyone working in Java, there is a format method in Translate.TranslateOption.

Right now you might have a translate call like this:

YourTranslateObject.translate(yourTextToBeTranslated, Translate.TranslateOption.targetLanguage(yourTargetLanguageCode))

All you need to do is add a third parameter:

YourTranslateObject.translate(yourTextToBeTranslated, Translate.TranslateOption.targetLanguage(yourTargetLanguageCode), Translate.TranslateOption.format("text"))

Since HTML is the default, this switches it to text.

Tendency answered 15/7, 2020 at 1:18 Comment(0)
B
2

If you use the Google Translate client library for Python, pass format_ (with a trailing underscore) to the translate method, not format.

Bendite answered 6/7, 2018 at 4:28 Comment(0)
D
1

If you specify the text format, content inside HTML tags will be translated as well. Assume your input is:

This is a <a href="https://example.com/path">link</a>

Then example and path will be translated as well, which breaks the link.

To avoid this and still fix your problem, stick with the HTML format and unescape the text you receive back from Google Translate. In PHP you can use html_entity_decode.
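
A minimal sketch of that decoding step, using the French string from the question as sample input. decodeTranslation is a hypothetical wrapper name. The flags are passed explicitly because, before PHP 8.1, the default flags did not include ENT_QUOTES, so &#39; would have been left untouched.

```php
<?php
// Hypothetical wrapper: turns the HTML-escaped text returned by the API
// (when format is left at its html default) back into plain characters.
function decodeTranslation(string $translated): string
{
    // ENT_QUOTES makes sure single-quote entities such as &#39; are decoded.
    return html_entity_decode($translated, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Sample input: the escaped French translation from the question.
echo decodeTranslation('Vente préfixe d&#39;ID est un champ obligatoire'), PHP_EOL;
```

In the question's script this would wrap the return value of translateTo before it is written to the language file.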

Dena answered 2/3, 2019 at 10:12 Comment(0)
C
0

If you're using a client library with the v3 service to translate text, you will need to set the mimeType to text/plain.

Docs - https://cloud.google.com/translate/docs/advanced/translating-text-v3#translating_input_strings
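
For a raw REST call rather than a client library, the same setting goes in the JSON request body. The sketch below only builds that body. buildV3RequestBody is a hypothetical helper, and the field names (contents, mimeType, sourceLanguageCode, targetLanguageCode) reflect my reading of the v3 translateText reference, so check them against the docs linked above before relying on them.

```php
<?php
// Hypothetical helper: builds a JSON body for a v3 translateText request.
// Field names are an assumption based on the v3 REST reference.
function buildV3RequestBody(string $text, string $target, string $source = 'en'): string
{
    return json_encode([
        'contents'           => [$text],
        'mimeType'           => 'text/plain', // ask for plain text instead of HTML
        'sourceLanguageCode' => $source,
        'targetLanguageCode' => $target,
    ], JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES);
}

// Print the body that would be sent as the POST payload.
echo buildV3RequestBody('Sale ID prefix is a required field', 'fr'), PHP_EOL;
```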

Cohobate answered 16/3 at 10:22 Comment(0)
