Approach mapping Wiktionary using POS tags, related terms and Google-translated word.
TL;DR
The question is titled 'get-multiple-variations-from-google-translate-api', but in short, you (still) currently can't do this by using Google's service alone (as of Sept. 2022). It seems most companies, such as Google, want to continue charging for this service. This answer provides an approach using a (free) service as a pivot to get the term, related terms, and their POS (Parts of Speech) e.g. noun, verb, etc. before translating those terms and then re-querying the service.
This alternative creates a small pipeline that queries Wiktionary before (on the source language), and after (on the translated terms target language) the translation (using Google).
The small pipeline is written in python and bash.
Rationale
We could get word senses, for each POS (Part of Speech) and corresponding synonyms, then translate for each word sense since Google only translates word to word, and then match word senses for the corresponding target language using a tool such as Wiktionary.
Wiktionary
Fortunately, someone has already created a python library to query Wiktionary for multiple languages.
Script to get definitions / synonyms from Wiktionary (using python):
(requires wiktionaryparser
)
e.g. python -m pip install wiktionaryparser
import sys;
import json;
from wiktionaryparser import WiktionaryParser;
parser = WiktionaryParser()
# sys.argv[1] is a language e.g. 'english'
parser.set_default_language(sys.argv[1])
print(
json.dumps(
[
[
{
'pos': d.get('partOfSpeech'),
'text':d.get('text'),
'examples':[e for e in d.get('examples')][0] if d.get('examples') else [],
'related': d.get('relatedWords')
} for d in w.get('definitions')
] for w in parser.fetch(sys.argv[2])
],
indent=2
)
)
Google translate + Wiktionary
The bash script below gets Wiktionary definitions, splits on synonym lists and correlates translations based on POS (Part of Speech).
To be honest this script is a bit convoluted, it uses a lot of utils, but it works. It could be refactored into python like the wiktionary part by anyone wanting to make something a bit more robust.
This github post provided some of the below script that call the free Google translate api.
#!/bin/bash
sl=$1
tl=$2
wiki_sl=$3
wiki_tl=$4
string=$5
ua='Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36'
#echo "$string"
result="{\"${sl}\":[],\"${tl}\":[]}"
#set -x
while IFS= read line; do
# line could be better named 'synonym' here
pos="$(echo ${line} | jq -r ".pos")"
sl_result="$(echo $line | jq . -c)"
tl_result=""
opt_single="single?client=gtx&sl=${sl}&tl=${tl}&dt=t&q=${string//[[:blank:]]/+}"
full_url="http://translate.googleapis.com/translate_a/${opt_single}"
response=$(curl -sA "${ua}" "${full_url}")
tl_word="$(echo ${response} | jq -r '.[[0][0]][] | .[0:1][0]')"
echo "${tl_word}" | grep -q " " && continue 1
tl_result_new="$(python ./get_wiki.py "${wiki_tl}" "${tl_word}" | jq -r -c --arg POS "$pos" '.[][] | select(.pos==$POS)'),"
# making json
tl_result="[${tl_result_new}"
# iterate over synonyms
while IFS= read qry; do
opt_single="single?client=gtx&sl=${sl}&tl=${tl}&dt=t&q=${qry//[[:blank:]]/+}"
full_url="http://translate.googleapis.com/translate_a/${opt_single}"
response=$(curl -sA "${ua}" "${full_url}")
tl_word="$(echo ${response} | jq -r '.[[0][0]][] | .[0:1][0]')"
echo "${tl_word}" | grep -q " " && continue 1
tl_result_new="$(python ./get_wiki.py "${wiki_tl}" "${tl_word}" | jq -r -c --arg POS "$pos" '.[][] | select(.pos==$POS)'),"
# adding to json
tl_result="${tl_result},${tl_result_new}"
done< <(echo "${line}" | jq -c -r ' .related[].words[]' | \
sed -e 's/.*://;s/"//g;s/^ *//g;s/ *$//g' | tr ',' '\n')
tl_result="$(echo "${tl_result_new}" | sed 's/,$//g')"
[ -z "${tl_result}" ] && tl_result=null
[ -z "${sl_result}" ] && sl_result=null
result="{\"${sl}\":${sl_result},\"${tl}\":${tl_result}}"
echo "$result" | jq "."
done< <(python ./get_wiki.py "$wiki_sl" "$string" | \
jq -c -r '.[][]|select(.related[].relationshipType=="synonyms")') 2> /dev/null | jq -c '[.]'
How to use:
The first 2 arguments used are for google (source language, and target language in that order which are two-letter codes.
The second 2 arguments used are for Wiktionary (source language, a full word - e.g. 'English', 'French', etc.)
The final (fifth) argument is the single word to be translated.
./translate.sh en pt english portuguese help
In fact, the python 'wiktionaryparser' lib occasionally breaks and can throw an error, due to the fact that it is a webscraping library, which is why I add 2> /dev/null
to silence stderr on output.
./translate.sh en pt english portuguese help 2> /dev/null
This script isn't perfect, but it is a starting point and a proof-of-concept to show you this is possible using a free tool such as wiktionary.
English to Portuguese
$ ./translate.sh en pt english portuguese help 2> /dev/null
Output:
[
{
"en": {
"pos": "noun",
"text": [
"help (usually uncountable, plural helps)",
"(uncountable) Action given to provide assistance; aid.",
"(usually uncountable) Something or someone which provides assistance with a task.",
"Documentation provided with computer software, etc. and accessed using the computer.",
"(usually uncountable) One or more people employed to help in the maintenance of a house or the operation of a farm or enterprise.",
"(uncountable) Correction of deficits, as by psychological counseling or medication or social support or remedial training."
],
"examples": "I need some help with my homework.",
"related": [
{
"relationshipType": "synonyms",
"words": [
"(action given to provide assistance): aid, assistance"
]
}
]
},
"pt": {
"pos": "noun",
"text": [
"assistência f (plural assistências)",
"assistance, aid, help",
"protection"
],
"examples": [],
"related": [
{
"relationshipType": "related terms",
"words": [
"assistir"
]
}
]
}
}
]
[
{
"en": {
"pos": "verb",
"text": [
"help (third-person singular simple present helps, present participle helping, simple past helped or (archaic) holp, past participle helped or (archaic) holpen)",
"(transitive) To provide assistance to (someone or something).",
"(transitive) To assist (a person) in getting something, especially food or drink at table; used with to.",
"(transitive) To contribute in some way to.",
"(intransitive) To provide assistance.",
"(transitive) To avoid; to prevent; to refrain from; to restrain (oneself). Usually used in nonassertive contexts with can."
],
"examples": "Risk is everywhere. […] For each one there is a frighteningly precise measurement of just how likely it is to jump from the shadows and get you. “The Norm Chronicles” […] aims to help data-phobes find their way through this blizzard of risks.",
"related": [
{
"relationshipType": "synonyms",
"words": [
"(provide assistance to): aid, assist, come to the aid of, help out; See also Thesaurus:help",
"(contribute in some way to): contribute to",
"(provide assistance): assist; See also Thesaurus:assist"
]
}
]
},
"pt": {
"pos": "verb",
"text": [
"ajudar (first-person singular present indicative ajudo, past participle ajudado)",
"to help, aid; to assist"
],
"examples": "Ajude-me! ― Help me!",
"related": [
{
"relationshipType": "related terms",
"words": [
"ajuda",
"ajudante"
]
}
]
}
}
]
English to Latin
$ ./translate.sh en la english latin body | jq '.'
[
{
"en": {
"pos": "noun",
"text": [
"body (countable and uncountable, plural bodies)",
"Physical frame.",
"Main section.",
"Coherent group.",
"Material entity.",
"(printing) The shank of a type, or the depth of the shank (by which the size is indicated).",
"(geometry) A three-dimensional object, such as a cube or cone."
],
"examples": "I saw them walking from a distance, their bodies strangely angular in the dawn light.",
"related": [
{
"relationshipType": "synonyms",
"words": [
"See also Thesaurus:body",
"See also Thesaurus:corpse"
]
}
]
},
"la": {
"pos": "noun",
"text": [
"cadāver n (genitive cadāveris); third declension",
"A corpse, cadaver, carcass"
],
"examples": [],
"related": []
}
}
]
When it doesn't work
Sometimes there is no output at all.
Shortcomings of this approach, and going further
Despite a lot of words being on Wiktionary, and a lot of synonyms being present, they are not always inside the 'related' field, sometimes synonyms are in the 'text' field, which gives word senses. I suspect that the partial information wiktionaryparser provides is the same on the Wiktionary site.
One could use any dictionary tool, or online thesaurus, such as wordnet, to first get possible POS tags and a word's synsets, or query a fasttext model to get a word's nearest neighbors, then filter only words that are nearest neighbors from the 'text' field in wiktionary.