From the HTTP server's perspective: I captured a Google crawler request in my ASP.NET application, and here is what the crawler's signature looks like.
Requesting IP: 66.249.71.113
Client: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
My logs show many different IPs for the Google crawler in the 66.249.71.* range. All of these IPs are geo-located in Mountain View, CA, USA.
A simple way to check whether a request comes from the Google crawler is to verify that its User-Agent header contains Googlebot and http://www.google.com/bot.html. Since many different IPs show up with the same requesting client, I would not recommend checking IPs; that is where client identity comes into the picture, so verify the client identity instead.
Here's a sample in C# (note that Request.UserAgent can be null, so guard against that before calling string methods on it):

string userAgent = (Request.UserAgent ?? "").ToLowerInvariant();
if (userAgent.Contains("googlebot") || userAgent.Contains("google.com/bot.html"))
{
    // Yes, it's Googlebot.
}
else
{
    // No, it's something else.
}
It's important to note that any HTTP client can easily fake this.
You can read the official Verifying Googlebot page.
Quoting the page here:
You can verify that a bot accessing your server really is Googlebot (or another Google user-agent) by using a reverse DNS lookup, verifying that the name is in the googlebot.com domain, and then doing a forward DNS lookup using that googlebot name. This is useful if you're concerned that spammers or other troublemakers are accessing your site while claiming to be Googlebot.
For example:
> host 66.249.66.1
1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.
> host crawl-66-249-66-1.googlebot.com
crawl-66-249-66-1.googlebot.com has address 66.249.66.1
Google doesn't post a public list of IP addresses for webmasters to whitelist. This is because these IP address ranges can change, causing problems for any webmasters who have hard coded them. The best way to identify accesses by Googlebot is to use the user-agent (Googlebot).
You can now perform an IP address check against Googlebot's published IP address list at https://developers.google.com/search/apis/ipranges/googlebot.json
From the docs:
You can identify Googlebot by IP address by matching the crawler's IP address to the list of Googlebot IP addresses. For all other Google crawlers, match the crawler's IP address against the complete list of Google IP addresses.
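The published list contains CIDR ranges rather than individual addresses, so matching a requester's IP means a subnet check, not string equality. A minimal IPv4 sketch of that check, assuming you have already fetched and parsed googlebot.json into an array of CIDR strings (ipToInt, ipInCidr, and isGooglebotIp are hypothetical names):

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip.split('.').reduce((acc, octet) => ((acc << 8) + Number(octet)) >>> 0, 0);
}

// Test whether an IPv4 address falls inside a CIDR range like "66.249.64.0/19".
function ipInCidr(ip, cidr) {
  const [base, bitsStr] = cidr.split('/');
  const bits = Number(bitsStr);
  const mask = bits === 0 ? 0 : (~0 << (32 - bits)) >>> 0;
  return ((ipToInt(ip) & mask) >>> 0) === ((ipToInt(base) & mask) >>> 0);
}

// Given the parsed list of ranges, check a requester's IP.
function isGooglebotIp(ip, ranges) {
  return ranges.some((cidr) => ipInCidr(ip, cidr));
}
```

For instance, ipInCidr('66.249.71.113', '66.249.64.0/19') is true, since 71 and 64 share the same top three bits of the third octet; whether that particular range appears in googlebot.json at any given time is up to Google.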
If you're using the Apache web server, you could have a look at the access log (e.g. logs/access.log). Then load Google's IPs from http://www.iplists.com/nw/google.txt and check whether one of those IPs appears in your log.
Based on __curious_geek's solution, here's the JavaScript version:
if (window.navigator.userAgent.match(/googlebot|google\.com\/bot\.html/i)) {
    // Yes, it's Googlebot.
}
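Note that window.navigator.userAgent is the browser's own User-Agent, so the snippet above only works client-side, and a crawler that doesn't execute JavaScript will never run it. A server-side sketch of the same regex applied to the incoming request's header (isGoogleBotUA is a hypothetical helper name):

```javascript
// Same pattern as above, applied to the request's User-Agent header
// instead of the browser's own UA string. Tolerates a missing header.
function isGoogleBotUA(userAgent) {
  return /googlebot|google\.com\/bot\.html/i.test(userAgent || '');
}

// Example with a plain Node.js http server:
// const http = require('node:http');
// http.createServer((req, res) => {
//   if (isGoogleBotUA(req.headers['user-agent'])) {
//     // Yes, it's (at least claiming to be) Googlebot.
//   }
//   res.end();
// }).listen(3000);
```

As with the C# version, this only checks what the client claims; pair it with the DNS or IP-range checks above if spoofing matters.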
To verify that a web request is coming from Google's crawler, you can check whether its IP address falls within the IP ranges published by Google, which can be found here:
https://developers.google.com/search/apis/ipranges/googlebot.json
Alternatively, you can also do a reverse DNS lookup and check if the domain matches one of Google's domains.
Note: you can also check the User-Agent string, but because it can be spoofed, it's wiser to use one of the methods mentioned above instead.
You can use the NPM package crawl-bot-verifier to verify Google, Bing, Baidu, and many other crawlers. The library does a DNS lookup, which is reliable, and it has a very nice API. You can find the package here:
© 2022 - 2025 — McMap. All rights reserved.