How to recognize Facebook User-Agent

11

59

When one of my pages is shared on FB, I want to display something different. However, I would rather not use the og: elements; I want to recognize the FB user-agent instead.

What is it? I can't find it.

Shcherbakov answered 24/12, 2011 at 20:49 Comment(1)
if (strpos($_SERVER['HTTP_USER_AGENT'], 'facebookexternalhit') !== false) { ... } - Perceptible
110

For a list of user-agent strings, look here. The most used, as of September 2015, are facebookexternalhit/* and Facebot. As you haven't stated which language you're trying to recognize the user-agent in, I can't be more specific. If you want to recognize the Facebook bot in PHP, use:

// The header is not set when the client does not send it, so default to ""
$userAgent = $_SERVER["HTTP_USER_AGENT"] ?? "";
if (
    strpos($userAgent, "facebookexternalhit/") !== false ||
    strpos($userAgent, "Facebot") !== false
) {
    // it is probably Facebook's bot
} else {
    // that is not Facebook
}

UPDATE: Facebook has added Facebot to the list of their possible user-agent strings, so I've updated my code to reflect the change. The code is now also more robust against possible future changes.

Embassy answered 24/12, 2011 at 20:54 Comment(3)
You can check out Facebook's best practices page for more up-to-date details on how to detect its crawlers and scrapers. Note that Facebot has been added to the list of user-agent strings. - Dormie
@donut's link no longer includes the right information. The updated URL is: developers.facebook.com/docs/sharing/webmasters/crawler - Skean
Also, FWIW, I'm using the following more future-proof code: if(strpos($_SERVER["HTTP_USER_AGENT"], "facebookexternalhit/") !== false || strpos($_SERVER["HTTP_USER_AGENT"], "Facebot") !== false) { /* It's probably Facebook's bot */ } - Skean
16

"Facebook's user-agent string is facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)..."

Hi

A small yet important correction: Facebook external hit uses 2 different user agents:

facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php) 

Setting your filter to 1.1 only may cause filtering issues with version 1.0.
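
One way to avoid pinning a single version is to match the product token and capture whatever version follows it. This is only a minimal sketch: the function name facebook_hit_version is made up for illustration, and the pattern assumes the version keeps the "digits.digits" shape seen in the two strings above.

```php
<?php
// Hypothetical helper (not part of any library): returns the version that
// follows "facebookexternalhit/" in the user-agent, or null if the token
// is not present.
function facebook_hit_version(string $userAgent): ?string
{
    // Assumption: the version keeps the "digits.digits" form (1.0, 1.1, ...).
    // The i flag makes the match case-insensitive.
    if (preg_match('~facebookexternalhit/(\d+\.\d+)~i', $userAgent, $matches)) {
        return $matches[1];
    }
    return null;
}
```

A filter built this way treats 1.0 and 1.1 alike, for example by testing facebook_hit_version($agent) !== null, and would keep matching if a later version number appeared in the same format.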

For more information about the Facebook bot (and other bots), please refer to Botopedia.org, a community-sourced bot directory powered by Incapsula.

Besides user-agent data, the directory also offers an IP verification option, allowing you to cross-verify an IP/User-Agent, thus helping to prevent impersonation attempts.
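
To illustrate what cross-verifying an IP against a user-agent could look like in PHP, here is a rough sketch. Both function names are hypothetical, it handles IPv4 only, and it assumes a 64-bit PHP build. The CIDR list is supplied by the caller: the range in the usage note is a documentation placeholder (TEST-NET-1), not a real Facebook range, so the real list would have to come from a source you trust.

```php
<?php
// Hypothetical helper: true when IPv4 address $ip lies inside CIDR block $cidr.
function ipv4_in_cidr(string $ip, string $cidr): bool
{
    $parts = explode('/', $cidr, 2);
    // Reject anything that is not "address/prefix-length".
    if (count($parts) !== 2 || !ctype_digit($parts[1])) {
        return false;
    }
    $bits = (int) $parts[1];
    $ipLong = ip2long($ip);
    $netLong = ip2long($parts[0]);
    if ($ipLong === false || $netLong === false || $bits > 32) {
        return false;
    }
    // Build the netmask; a /0 block matches every address.
    $mask = $bits === 0 ? 0 : (~0 << (32 - $bits)) & 0xFFFFFFFF;
    return ($ipLong & $mask) === ($netLong & $mask);
}

// Hypothetical helper: the user-agent must name a Facebook crawler AND the
// IP must fall inside one of the caller-supplied ranges.
function looks_like_verified_facebook(string $userAgent, string $ip, array $cidrs): bool
{
    if (stripos($userAgent, 'facebookexternalhit') === false
        && stripos($userAgent, 'Facebot') === false) {
        return false;
    }
    foreach ($cidrs as $cidr) {
        if (ipv4_in_cidr($ip, $cidr)) {
            return true;
        }
    }
    return false;
}
```

A call might look like looks_like_verified_facebook($agent, $_SERVER['REMOTE_ADDR'], ['192.0.2.0/24']), with the placeholder range swapped for real ones. A request that claims a Facebook user-agent but comes from outside the listed ranges is then treated as a possible impersonation attempt.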

Graaf answered 14/8, 2012 at 14:6 Comment(0)
15

Here are the Facebook crawler's user agents:

FacebookExternalHit/1.1
FacebookExternalHit/1.0

or

facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)

Note that the version numbers might change. So use a regular expression to find the crawler name and then display your content.

Update:

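You can use this code in PHP to check for the Facebook user agent:

// The header may be missing, so fall back to an empty string
$agent = $_SERVER['HTTP_USER_AGENT'] ?? '';
if (preg_match('/^FacebookExternalHit\//i', $agent)) {
    print "Facebook User-Agent";
    // process here for Facebook
}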

Here is ASP.NET code. You can use this function to check if the userAgent is Facebook's useragent.

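public static bool IsFacebook(string userAgent)
{
    // A missing or empty header cannot be Facebook's crawler
    if (string.IsNullOrEmpty(userAgent)) return false;
    // Lower-case first so the substring check is case-insensitive
    return userAgent.ToLowerInvariant().Contains("facebookexternalhit");
}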

Note:

Why would you need to do that? When you share a link to your site on Facebook, Facebook crawls the page and parses it to get the thumbnail, title and some content to display, but the preview still links back to your site.

Also, I think this would amount to cloaking, i.e. showing different data to users and to crawlers. Cloaking is not considered good practice, and search engines and sites may take note of it.

Update: Facebook also added a new useragent as of May 28th, 2014

Facebot

You can read more about the facebook crawler on https://developers.facebook.com/docs/sharing/webmasters/crawler

Oblivious answered 24/12, 2011 at 21:1 Comment(2)
Read Facebook's Privacy Policy first! - Callihan
@msec: if Facebook does not crawl the page, how does it know the details of the page, like the title, thumbnails, etc.? - Jody
4

Please note that sometimes the agent is visionutils/0.2. You should check for it too.

Disallow answered 18/5, 2014 at 8:25 Comment(3)
Is there any evidence for this? - Crosscheck
When I was writing a script to detect Facebook and show it different content, the user agent was sometimes visionutils/0.2. - Disallow
Presumably that's a face-recognition bot coming around to scrape any images that might have people in them. - Caa
4

Facebook User-Agents are:

FacebookExternalHit/1.1
FacebookExternalHit/1.0
facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
facebookexternalhit/1.0 (+https://www.facebook.com/externalhit_uatext.php)
facebookexternalhit/1.1 (+https://www.facebook.com/externalhit_uatext.php)

I'm using the code below to detect FB User-Agent in PHP and it works as intended:

$agent = $_SERVER['HTTP_USER_AGENT'];
if(stristr($agent, 'FacebookExternalHit')){
    //Facebook User-Agent
}else{
    //Other User-Agent
}
Latchet answered 7/5, 2015 at 1:38 Comment(0)
3

A short solution is to check for the pattern, so the Facebook-specific markup is not sent to every user each time:

<?php
    # Facebook optimized stuff
    if(strstr($_SERVER['HTTP_USER_AGENT'],'facebookexternalhit')) {
        $buffer.='<link rel="image_src" href="images/site_thumbnail.png" />';
    }
?>
Shilling answered 13/4, 2013 at 12:23 Comment(1)
Don't forget a !empty($_SERVER['HTTP_USER_AGENT']) check, since $_SERVER['HTTP_USER_AGENT'] is not set when the client does not send this header. - Flocky
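
The missing-header case can be folded into a small helper so the check lives in one place. This is a sketch only: is_facebook_request is a hypothetical name, and the two tokens it looks for are the ones mentioned elsewhere in this thread.

```php
<?php
// Hypothetical helper: takes the server array (normally $_SERVER) and reports
// whether its user-agent names one of Facebook's crawlers.
function is_facebook_request(array $server): bool
{
    // The key is absent when the client sends no User-Agent header.
    if (empty($server['HTTP_USER_AGENT'])) {
        return false;
    }
    $userAgent = $server['HTTP_USER_AGENT'];
    // stripos is case-insensitive, so "FacebookExternalHit" matches too.
    return stripos($userAgent, 'facebookexternalhit') !== false
        || stripos($userAgent, 'Facebot') !== false;
}
```

Calling is_facebook_request($_SERVER) then replaces the inline strstr test. Taking the array as a parameter also makes the function easy to exercise with hand-built input.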
2

Given that Facebook may change its user-agent strings, it is probably safer to use a regex like this:

<?php
if (preg_match("/facebook|facebot/i", $_SERVER['HTTP_USER_AGENT'])){
   do_something();
}
?>

You can find more information about the Facebook crawler in their docs: https://developers.facebook.com/docs/sharing/webmasters/crawler

Electroluminescence answered 29/4, 2015 at 8:36 Comment(0)
1

First, you should not use in_array, because it requires the full user-agent string rather than a substring, so it will break as soon as the string changes (e.g. a version 1.2 from Facebook would not match if you follow the currently preferred answer). Iterating through an array is also slower than using a regex pattern.

You will no doubt want to look for more bots later, so the example below uses two bot names separated in the pattern by the pipe symbol |. The /i at the end makes the match case-insensitive.

Also, you should not use $_SERVER['HTTP_USER_AGENT'] directly; filter it first in case someone has put something nasty in there.

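$pattern = '/(FacebookExternalHit|GoogleBot)/i';
// URL-encode the header so unexpected characters are neutralized before matching
$agent = filter_input(INPUT_SERVER, 'HTTP_USER_AGENT', FILTER_SANITIZE_ENCODED);
// filter_input returns null or false when the header is missing or filtering fails
if (is_string($agent) && preg_match($pattern, $agent)) {
    echo "found one of the patterns";
}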

A bit safer and faster code.

Ensanguine answered 6/1, 2014 at 11:38 Comment(0)
1

And if you want to block the Facebook bot from accessing your website (assuming you're using Apache), add this to your .htaccess file:

<Limit GET POST>
BrowserMatchNoCase "Feedfetcher-Google" feedfetcher
BrowserMatchNoCase "facebookexternalhit" facebook
order deny,allow
deny from env=feedfetcher
deny from env=facebook
</Limit>

It also blocks Google's Feedfetcher, which can also be used for cheap DDoSing.

Broadleaf answered 27/4, 2014 at 10:30 Comment(0)
1

You already have the answer for Facebook above, but one way to get any user agent is to place a script on your site that will mail you when there is a visit to it. For example, create this file on your domain at, say, https://example.com/user-agent.php :

<?php
    mail('[email protected]', 'User Agent', $_SERVER['HTTP_USER_AGENT']);

Then, visit Facebook, and type the link to the script there, and hit space bar. You don't actually have to share anything, just typing the link in and a space will cause Facebook to fetch a preview. You should then get an email with Facebook's user agent.


Enter the link on Facebook

Get an email with the user agent

Aboutface answered 16/7, 2019 at 8:56 Comment(0)
0

Another generic approach in PHP:

$agent = $_SERVER['HTTP_USER_AGENT'];
$agent = trim($agent);
$agent = strtolower($agent);
if (
strpos($agent,'facebookexternalhit/1.1')===0
|| strpos($agent,'facebookexternalhit/1.0')===0
){
    //probably facebook
}else{
    //probably not facebook
}
Tolidine answered 13/3, 2014 at 10:9 Comment(0)
