How to query an advanced search with google customsearch API?

How can I use the Google Python client library to programmatically run an advanced search against a Google Custom Search API search engine, so that it returns a list of the first n links for the terms and advanced-search parameters I queried?

I checked the documentation (I did not find any example) and this answer. However, the latter did not work, since the AJAX API is no longer supported. So far I have tried this:

from googleapiclient.discovery import build
import pprint

my_cse_id = "test"

def google_search(search_term, api_key, cse_id, **kwargs):
    service = build("customsearch", "v1",developerKey="<My developer key>")
    res = service.cse().list(q=search_term, cx=cse_id, **kwargs).execute()
    return res['items']

results = google_search('dogs', my_api_key, my_cse_id, num=10)

for result in results:
    pprint.pprint(result)

And this:

import pprint

from googleapiclient.discovery import build


def main():
  service = build("customsearch", "v1",developerKey="<My developer key>")

  res = service.cse().list(q='dogs').execute()
  pprint.pprint(res)

if __name__ == '__main__':
  main()

So, any idea how to do an advanced search with Google's search engine API? This is how my credentials look in the Google console:

credentials

Workday answered 8/12, 2016 at 5:37 Comment(3)
What error do you get? - Singleton
@EugeneLisitsky, I did not get any error. The issue is that I do not understand how to make an advanced search with Google's API. For example, how can I programmatically query Google for all the URLs that contain "the best dog food", in English, in the UK? - Workday
Here is the documentation; it is complete: developers.google.com/custom-search/v1/reference/rest/v1/cse/… - Dysteleology

First you need to define a custom search engine as described here, then make sure your my_cse_id matches the ID (cx) of that Google custom search engine, e.g.

cx='017576662512468239146:omuauf_lfve'

is a search engine which only searches for domains ending with .com.

Next we need our developerKey.

from googleapiclient.discovery import build
service = build("customsearch", "v1", developerKey=dev_key)

Now we can execute our search.

res = service.cse().list(q=search_term, cx=my_cse_id).execute()

We can add further search parameters, such as language or country, by using the arguments described here, e.g.

res = service.cse().list(q="the best dog food", cx=my_cse_id, cr="countryUK", lr="lang_en").execute()

would search for "the best dog food" in English, restricted to sites from the UK.
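
The other fields of the advanced search form can be passed the same way, as further keyword arguments to list(). The following is only a minimal sketch, assuming the parameter names listed in the Custom Search JSON API reference (exactTerms, orTerms, excludeTerms, siteSearch, lr, cr). The helper advanced_search_params and its argument names are hypothetical and not part of the client library, and the mapping to the form fields is approximate:

```python
def advanced_search_params(all_words, exact_phrase=None, any_words=None,
                           none_words=None, site=None, language=None,
                           country=None):
    """Map advanced-search form fields to cse().list() keyword arguments.

    Hypothetical helper; only the keys of the returned dict are API
    parameter names.
    """
    candidates = {
        "q": all_words,              # all these words
        "exactTerms": exact_phrase,  # this exact word or phrase
        "orTerms": any_words,        # any of these words
        "excludeTerms": none_words,  # none of these words
        "siteSearch": site,          # site or domain
        "lr": language,              # e.g. "lang_en"
        "cr": country,               # e.g. "countryUK"
    }
    # Drop unset fields so that only the requested filters are sent.
    return {name: value for name, value in candidates.items()
            if value is not None}


params = advanced_search_params(
    "dog food",
    exact_phrase="the best dog food",
    none_words="cat",
    language="lang_en",
    country="countryUK",
)
# With a service object built as shown earlier:
# res = service.cse().list(cx=my_cse_id, **params).execute()
```

Building the arguments as a plain dict keeps the query definition separate from the API call, so the same dict can be inspected or logged before it is sent.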


The following modified code worked for me. api_key was removed since it was never used.

from googleapiclient.discovery import build

my_cse_id = "012156694711735292392:rl7x1k3j0vy"
dev_key = "<Your developer key>"

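def google_search(search_term, cse_id, **kwargs):
    # Extra keyword arguments (num, cr, lr, ...) are passed straight to list().
    service = build("customsearch", "v1", developerKey=dev_key)
    res = service.cse().list(q=search_term, cx=cse_id, **kwargs).execute()
    # 'items' is missing from the response when the search has no results.
    return res.get('items', [])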

results = google_search('boxer dogs', my_cse_id, num=10, cr="countryCA", lr="lang_en")
for result in results:
    print(result.get('link'))

Output

http://www.aboxerworld.com/whiteboxerfaqs.htm
http://boxerrescueontario.com/?section=available_dogs
http://www.aboxerworld.com/abouttheboxerbreed.htm
http://m.huffpost.com/ca/entry/10992754
http://rawboxers.com/aboutraw.shtml
http://www.tanoakboxers.com/
http://www.mondlichtboxers.com/
http://www.tanoakboxers.com/puppies/
http://www.landosboxers.com/dogs/puppies/puppies.htm
http://www.boxerrescuequebec.com/
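
Since num only accepts values from 1 to 10, collecting the first n links means requesting several pages with the start parameter. This is a rough sketch rather than a tested recipe: first_n_links is a hypothetical helper, it expects a service object built as above, and the cap of 100 assumes the documented limit that the JSON API returns no results beyond the first 100:

```python
def first_n_links(service, search_term, cse_id, n, **kwargs):
    """Collect up to n result links, at most 10 per request (hypothetical helper)."""
    n = min(n, 100)  # assumed API limit: nothing beyond the first 100 results
    links = []
    start = 1  # start is the 1-based index of the first result on a page
    while len(links) < n:
        page_size = min(10, n - len(links))
        res = service.cse().list(q=search_term, cx=cse_id, num=page_size,
                                 start=start, **kwargs).execute()
        items = res.get("items", [])  # no 'items' key when nothing is found
        links.extend(item.get("link") for item in items)
        start += len(items)
        if len(items) < page_size:
            break  # a short page means there are no further results
    return links[:n]
```

A call such as first_n_links(service, 'boxer dogs', my_cse_id, 30, cr="countryCA", lr="lang_en") would then issue three requests of ten results each.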
Thermal answered 11/12, 2016 at 14:30 Comment(10)
Thanks for the help! However, my question was about making an advanced search (i.e. a Google query with specific phrases, words, region, domain, language, etc.). My main objective is to do an advanced search programmatically. - Workday
Also, what I do not understand is why your code sample returns only CS lecture links instead of dog links. Could you show us how to make an advanced search for all the URLs about boxer dogs in Seattle, in English? - Workday
Thanks for the clarification! See the updated answer: boxer dogs in Canada, in English. - Thermal
Thanks, that's what I was looking for. Now several questions arise from the above sample. Why, when I set num=90, do I get: HttpError: <HttpError 400 when requesting https://www.googleapis.com/customsearch - Workday
Also, what about the other parameters of the advanced search form (e.g. none of these words, any of these words, this exact word or phrase, language, site or domain)? How can I pass them to google_search()? - Workday
From the documentation: Valid values are integers between 1 and 10, inclusive. All the parameters are here: developers.google.com/custom-search/json-api/v1/reference/cse/… - Thermal
I see. My main objective is to make advanced search queries on Google in order to retrieve some interesting links and finally store them. Is this the right way to do this? - Workday
Hey. So cx='017576662512468239146:omuauf_lfve' searches just the .com domain. What should the cx be to search the entire web (.org, .uk, etc.) and not just .com? - Rambouillet
@DigvijaySawant: You need to define your own custom search engine, save it and then use the created ID. - Thermal
@MaximilianPeters Is there a tutorial that I could read? I am completely new to this and still figuring out how to do it. - Rambouillet

An alternative using the Python requests library, if you do not want to use the Google discovery API:

import pprint

import requests

query = 'italy'
api_key = 'AIzaSyCs.....................'

response = requests.get(
    'https://content.googleapis.com/customsearch/v1',
    params={'cx': '013027958806940070381:dazyknr8pvm', 'q': query, 'key': api_key},
)
pprint.pprint(response.json())
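
To get only the links out of that JSON, and to pass extra search parameters, something along these lines may work. It is a sketch under assumptions: extract_links and search_links are hypothetical helpers, the endpoint is the one used above, and the timeout value is arbitrary. requests URL-encodes the values in params, so a multi-word query can be passed as a plain string:

```python
def extract_links(payload):
    """Return the result URLs from a decoded JSON response."""
    # 'items' is absent when the search has no results.
    return [item["link"] for item in payload.get("items", [])]


def search_links(query, api_key, cse_id, **extra):
    """Hypothetical helper: query the endpoint used above and return the links."""
    import requests  # third-party; imported here so extract_links works without it

    params = {"key": api_key, "cx": cse_id, "q": query}
    params.update(extra)  # e.g. lr="lang_en", cr="countryUK"
    response = requests.get("https://content.googleapis.com/customsearch/v1",
                            params=params, timeout=10)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
    return extract_links(response.json())
```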
Floorwalker answered 6/5, 2017 at 13:25 Comment(1)
Thanks, it works, but why is nothing retrieved when we pass a query of multiple words, like "valencia party"? - Ferrin

This is late but hopefully it helps someone...

For advanced search use

response = service.cse().list(
    q="mysearchterm",
    cx="017576662512468239146:omuauf_lfve",
).execute()

The list() method takes more arguments to refine your search; check them here: https://developers.google.com/custom-search/json-api/v1/reference/cse/list

Randirandie answered 16/3, 2017 at 20:10 Comment(0)
