Python script to translate via google translate

I'm trying to learn Python, so I decided to write a script that translates text using Google Translate. So far I have written this:

import sys
from BeautifulSoup import BeautifulSoup
import urllib2
import urllib

data = {'sl':'en','tl':'it','text':'word'} 
request = urllib2.Request('http://www.translate.google.com', urllib.urlencode(data))

request.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11')
opener = urllib2.build_opener()
feeddata = opener.open(request).read()
#print feeddata
soup = BeautifulSoup(feeddata)
print soup.find('span', id="result_box")
print request.get_method()

And now I'm stuck. I can't see any bugs in it, but it still doesn't work: the script runs, but it won't translate the word.

Does anyone know how to fix it? (Sorry for my poor English)

Deathly answered 22/2, 2012 at 23:3 Comment(6)
What errors do you get if any?Gerome
As I said, I don't get any errors, everything seems to work, but in the end I get: <span id="result_box" class="short_text"></span> There should be 'something' in this span tag.Deathly
In the end you get what? You are asking it to print the request method. What are you aiming to return?Gerome
Perhaps because Google Translate has an API you should use if you want to programmatically translate text?Artieartifact
The translation should show up in this span tag. I was going to expose it using BeautifulSoup, but now I'm trying to get any translation.Deathly
check this modern tool: github.com/nidhaloff/deep_translatorSheriff
E
6

Google Translate is meant to be used with a GET request and not a POST request. However, urllib2 will automatically submit a POST if you add any data to your request.

The solution is to construct the url with a querystring so you will be submitting a GET.
You'll need to alter the request = urllib2.Request('http://www.translate.google.com', urllib.urlencode(data)) line of your code.

Here goes:

querystring = urllib.urlencode(data)
request = urllib2.Request('http://www.translate.google.com' + '?' + querystring )

And you will get the following output:

<span id="result_box" class="short_text">
    <span title="word" onmouseover="this.style.backgroundColor='#ebeff9'" onmouseout="this.style.backgroundColor='#fff'">
        parola
    </span>
</span>
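
To see the GET versus POST difference without touching the network, here is a minimal sketch. It assumes Python 3, where urllib.request and urllib.parse take the place of urllib2 and urllib; the variable names are only illustrative. It builds both requests and asks each one which method it would use:

```python
# Python 3 sketch; urllib.request and urllib.parse replace urllib2 and urllib.
from urllib.parse import urlencode
from urllib.request import Request

data = {'sl': 'en', 'tl': 'it', 'text': 'word'}
base_url = 'http://www.translate.google.com'

# Passing a body as the second argument turns the request into a POST.
post_request = Request(base_url, urlencode(data).encode('ascii'))

# Putting the same parameters in the query string leaves the body empty,
# so the request stays a GET.
get_request = Request(base_url + '?' + urlencode(data))

print(post_request.get_method())  # POST
print(get_request.get_method())   # GET
print(get_request.full_url)
```

Nothing is sent here; the requests are only constructed, which is enough to check the method before calling the opener.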

By the way, you're kinda breaking Google's terms of service; look into them if you're doing more than hacking a little script for training.

Using requests

I strongly advise you to stay away from urllib if possible, and use the excellent requests library, which will allow you to efficiently use HTTP with Python.
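
As a rough sketch of what that could look like (assuming the requests package is installed; the shortened User-Agent value is just a placeholder), the params argument puts the dictionary into the query string, so the request goes out as a GET:

```python
import requests

data = {'sl': 'en', 'tl': 'it', 'text': 'word'}

# Build the request without sending it, so it can be inspected offline.
prepared = requests.Request(
    'GET',
    'http://www.translate.google.com',
    params=data,                           # encoded into the query string
    headers={'User-Agent': 'Mozilla/5.0'}, # placeholder value
).prepare()

print(prepared.method)
print(prepared.url)

# To actually send it (needs network access, and Google's terms still apply):
# response = requests.Session().send(prepared, timeout=10)
# print(response.status_code)
```

For a one-off call, requests.get(url, params=data) does the same thing in a single step.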

Eserine answered 22/2, 2012 at 23:19 Comment(2)
Thank you so much, it works :) And I didn't realize that it was against Google's terms of service, just wanted to learn something, and it seemed quite interesting. I'll definitely check this requests library, thanks again :)Deathly
Glad I helped. Regarding the terms, consider the fact that google translate is now a paid service: code.google.com/intl/fr-FR/apis/language/translate/v2/…Eserine
B
10

I made this script if you want to check it: https://github.com/mouuff/Google-Translate-API : )

Blacktail answered 13/10, 2012 at 18:27 Comment(2)
You can add another hack to your code to fake the user agent in every request from fake_useragent import UserAgent ua = UserAgent() def getuseragent(): while True: try: return {'User-Agent': ua.random.encode()} except: passViscus
@Arnaud-Aliès do you happen to know the requests limit for translate.google.com? I was using your module and in less than 30min I received HTTP Status code 429: too many requests. Most of the online answers talk about request limits for the Translate API but I was wondering if you had that data for Python's requests moduleEcho
D
5

Yes, their documentation is not so easy to uncover.

Here's what you do:

  1. In the Google Cloud Platform Console:

    1.1 Go to the Projects page and select or create a new project

    1.2 Enable billing for your project

    1.3 Enable the Cloud Translation API

    1.4 Create a new API key in your project; make sure to restrict its usage by IP or other means available there.


  2. On the machine where you want to run the client:

    pip install --upgrade google-api-python-client


  3. Then you can write this to send translation requests and receive responses:

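import json
from googleapiclient.discovery import build  # "apiclient" is the older alias of this package

query = 'this is a test to translate english to spanish'
target_language = 'es'

# Build a client for version 2 of the Translate API using your API key.
service = build('translate', 'v2', developerKey='INSERT_YOUR_APP_API_KEY_HERE')

collection = service.translations()
request = collection.list(q=query, target=target_language)
response = request.execute()

# JSON string form of the response, handy for logging or storing.
response_json = json.dumps(response)

translated_text = response['translations'][0]['translatedText']

# Same text with every non-ASCII character dropped.
ascii_translation = translated_text.encode('utf-8').decode('ascii', 'ignore')

# UTF-8 encoded form of the translation (a bytes object on Python 3).
utf_translation = translated_text.encode('utf-8')

# print() with a single argument works on both Python 2 and Python 3.
print(response)
print(ascii_translation)
print(utf_translation)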
Dunstable answered 17/1, 2017 at 14:22 Comment(0)
