Is there a programmatic way to return the "about N results" number from Google Search?
I'd like to get the "about N results" number for an arbitrary Google Search term. Google is fairly resistant to scrapers, so while scraping might work with some effort, I'm specifically asking whether there's a better way of doing this. Is there an existing API provided by Google that would fulfill this need?

Kempis answered 18/8, 2016 at 16:37 Comment(0)
E
1

I would not attempt scraping, as it most likely has legal ramifications. I would use the Google Custom Search API instead. You'll need an API key as well as a CX id (the id of a custom search engine that you set up in your Google account).

Once you have access to the API and your CX id, you can submit queries to the cse.list method and get the number you're looking for in the response under totalResults.
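As a rough illustration of the request described above, here is a minimal Python sketch. It assumes the JSON endpoint is https://www.googleapis.com/customsearch/v1 and that the count is returned as a string under searchInformation.totalResults, which is my understanding of the response layout; check it against the reference page linked below. The function names are hypothetical helpers, not part of any Google library.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for the Custom Search JSON API (cse.list).
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key: str, cx: str, query: str) -> str:
    """Build the request URL for one query (hypothetical helper)."""
    params = urllib.parse.urlencode({"key": api_key, "cx": cx, "q": query})
    return f"{API_ENDPOINT}?{params}"


def extract_total_results(response: dict) -> int:
    """Read the estimated result count from a parsed response.

    Assumes the count is a string under searchInformation.totalResults,
    so it is converted to an int here.
    """
    return int(response["searchInformation"]["totalResults"])


def fetch_total_results(api_key: str, cx: str, query: str) -> int:
    """Call the API; needs network access and valid credentials."""
    with urllib.request.urlopen(build_search_url(api_key, cx, query)) as resp:
        return extract_total_results(json.load(resp))
```

With a real key and CX id, fetch_total_results("YOUR_API_KEY", "YOUR_CX_ID", "some search term") would return the estimate as an integer. The parsing step is kept separate from the network call so it can be tried on a saved response first.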

When setting up and customizing your custom search engine, you'll have to define the sites you want to search. Fortunately, you can add wildcards like *.com, *.net, etc., or follow the instructions on this page to search the entire web: https://support.google.com/customsearch/answer/2631040?hl=en

I've included all the links you'll need to get moving on this below. Try out the cse.list method explorer once you have a CX id. It will give you real-time response data that you can inspect and play around with.

Google Custom Search API

https://developers.google.com/custom-search/

This is the method/endpoint you'll want to use:

https://developers.google.com/custom-search/json-api/v1/reference/cse/list

cse.list method explorer:

https://developers.google.com/apis-explorer/#p/customsearch/v1/search.cse.list

Set up and manage your custom search engine

https://cse.google.com/cse/manage/all

Note: Results may vary a bit depending on how you have your search engine configured. I have a test set up to search the entire web with emphasis on *.com and *.net domains, and I'm getting a larger number than what Google shows in "About N Results". I'm not sure if you need that exact number, but Google describes it as "About", so it can't be an exact figure anyway. The point is that with CSE you have a lot of control over the configuration, and you should be able to get very close.

Evangelineevangelism answered 7/9, 2016 at 2:7 Comment(5)
This does not give the actual count of results, only a very restricted subset. See here: jsfiddle.net/gh/gist/library/pure/6130833 - Congelation
What domains do you have configured on your search engine with the cx id you used? It looks like it's only searching 'developers.google.com'. You have to add wildcard domains like I mentioned above to broaden the search. - Evangelineevangelism
That's a Google example, but what I am saying is that even a lot of wildcard domains would still miss a large fraction of the web. - Congelation
I added a note to the answer. I'm typically seeing more results in my CSE search than what Google shows in "About N Results", not fewer. - Evangelineevangelism
With the caveats mentioned in your note, I agree that this can suffice for most needs of the OP. - Congelation
G
0

Assuming that's your custom search API, have you tried conditionally removing the totalResults property from the JSON response body?

You can achieve that by checking the query parameter (let's say q):

if (q === "your string") {
    var keyName = "totalResults";
    // Parse the JSON response body into an object so the property can be deleted.
    var resp = JSON.parse(response);
    // "<APIkey>" remains a placeholder for the key under queries that holds totalResults.
    delete resp.queries["<APIkey>"][keyName];
}

NOTE: The structure to locate the keyName: totalResults has been derived from here

Gavel answered 8/9, 2016 at 11:16 Comment(0)
