How can I retrieve the favicon of a website with XSLT or JSP?
G

6

28

I want to list featured websites on my website, and I thought it would be cool to honor them by using their favicons. How do I get the favicon for the domain of an arbitrary URL in either JSP or XSLT? I can fire off PHP or JavaScript, but XSLT is the preferred method.

Gelatinous answered 2/1, 2010 at 3:9 Comment(1)
to get a favicon one can use this: google.com/s2/favicons?domain=domain_nameDeferential
B
29

To get the favicon of a website, you need to load the index HTML of each featured website and check for any of the following:

HTML:

<link rel="icon" type="image/vnd.microsoft.icon" href="http://example.com/image.ico">
<link rel="icon" type="image/png" href="http://example.com/image.png">
<link rel="icon" type="image/gif" href="http://example.com/image.gif">

XHTML:

<link rel="icon" type="image/vnd.microsoft.icon" href="/somepath/image.ico" />
<link rel="icon" type="image/png" href="/somepath/image.png" />
<link rel="icon" type="image/gif" href="/somepath/image.gif" />

Internet Explorer may use a slightly different format:

<link rel="SHORTCUT ICON" href="http://www.example.com/myicon.ico" />

Also note that most web browsers do not need the HTML link to retrieve a favicon: they request favicon.ico from the website's document root on their own. If none of the above link references are found, you should therefore check for favicon.ico in the document root as well.
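Since the question asks about JSP, the lookup described above can be sketched in plain Java that a JSP page or servlet could call. This is a minimal sketch, not a definitive implementation: the class name FaviconLocator and the method locate are hypothetical, the HTML is assumed to be already downloaded into a string, and the regular expressions only handle quoted attributes (a real HTML parser would be more robust).

```java
import java.net.URI;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds the favicon URL for a page whose HTML is already in memory.
public class FaviconLocator {
    // Matches a whole <link ...> tag; its attributes are inspected separately.
    private static final Pattern LINK_TAG =
        Pattern.compile("<link\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    private static final Pattern REL =
        Pattern.compile("\\brel\\s*=\\s*[\"']([^\"']*)[\"']", Pattern.CASE_INSENSITIVE);
    private static final Pattern HREF =
        Pattern.compile("\\bhref\\s*=\\s*[\"']([^\"']*)[\"']", Pattern.CASE_INSENSITIVE);

    // Returns the first icon link resolved against pageUrl, else <root>/favicon.ico.
    public static String locate(String pageUrl, String html) {
        URI base = URI.create(pageUrl);
        Matcher tags = LINK_TAG.matcher(html);
        while (tags.find()) {
            String tag = tags.group();
            Matcher rel = REL.matcher(tag);
            Matcher href = HREF.matcher(tag);
            if (rel.find() && href.find() && isIconRel(rel.group(1))) {
                // Handles absolute URLs and paths such as /somepath/image.png alike.
                return base.resolve(href.group(1)).toString();
            }
        }
        // No link element found: fall back to the document root.
        return base.resolve("/favicon.ico").toString();
    }

    // Accepts rel="icon" as well as rel="SHORTCUT ICON" (case-insensitive).
    private static boolean isIconRel(String rel) {
        for (String token : rel.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.equals("icon")) {
                return true;
            }
        }
        return false;
    }
}
```

The returned URL is only a candidate; you would still want to request it and confirm that the server actually returns an image.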

With PHP, it is easy to get the HTML contents of a web page by using file_get_contents($url):

$url = 'http://www.example.com';
$output = file_get_contents($url);
Bifocals answered 2/1, 2010 at 3:14 Comment(1)
EXCELLENT! Thanks for the detail Daniel. I will check out the PHP tutorial and let you know how it works out.Gelatinous
F
82

You could use Google's cache of the favicon:

https://s2.googleusercontent.com/s2/favicons?domain_url=https://example.com
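If you build these URLs on the server side (for example from a JSP page), the site URL should be percent-encoded before it goes into the query string. The sketch below assumes the www.google.com/s2/favicons form and the sz parameter as they are quoted in the comments on this answer; I have not verified how the service currently behaves, and the class and method names are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds a favicon-service URL for a given site.
public class GoogleFaviconUrl {
    // Endpoint as quoted in this thread; treat it as an assumption, not a documented API.
    private static final String ENDPOINT = "https://www.google.com/s2/favicons";

    // size <= 0 means "do not send the sz parameter at all".
    public static String build(String siteUrl, int size) {
        // URLEncoder.encode(String, Charset) requires Java 10 or later.
        String encoded = URLEncoder.encode(siteUrl, StandardCharsets.UTF_8);
        StringBuilder url = new StringBuilder(ENDPOINT)
            .append("?domain_url=")
            .append(encoded);
        if (size > 0) {
            url.append("&sz=").append(size);
        }
        return url.toString();
    }
}
```

The result can be dropped straight into the src attribute of an img element, which keeps the lookup out of your own server entirely.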

Fame answered 25/4, 2011 at 8:35 Comment(5)
But in that case you are not doing the job. Google does.Cirro
16x16px is horrible quality.Bonded
is there any way to get a 32x32?Diabolo
@Diabolo you can use the following similar method and specify a size parameter: https://t0.gstatic.com/faviconV2?client=SOCIAL&type=FAVICON&fallback_opts=TYPE,SIZE,URL&url=https://aaronmeese.com&size=32Rhodes
@Diabolo Add size parameter, e.g. https://www.google.com/s2/favicons?sz=32&domain_url=yahoo.com. It seems the latest addition to it.Valvate
C
1

Here is my attempt at it. It uses several strategies to cover the many possible cases:

<?php
/*
  nws-favicon : Get site's favicon using various strategies

  This script is part of NWS
  https://github.com/xaccrocheur/nws/

*/


function CheckImageExists($imgUrl) {
    if (@GetImageSize($imgUrl)) {
        return true;
    } else {
        return false;
    };
};

function getFavicon ($url) {

$fallback_favicon = "/var/www/favicon.ico";    
// $fallback_favicon = "http://stackoverflow.com/favicon.ico";


    $dom = new DOMDocument();
    // loadHTML() expects markup, not a URL; loadHTMLFile() fetches and parses the page
    @$dom->loadHTMLFile($url);
    $links = $dom->getElementsByTagName('link');
    $l = $links->length;
    $favicon = "/favicon.ico";
    for( $i=0; $i<$l; $i++) {
        $item = $links->item($i);
        if( strcasecmp($item->getAttribute("rel"),"shortcut icon") === 0) {
            $favicon = $item->getAttribute("href");
            break;
        }
    }

    $u = parse_url($url);

    $subs = explode( '.', $u['host']);
    $domain = $subs[count($subs) -2].'.'.$subs[count($subs) -1];

    $file = "http://".$domain."/favicon.ico";
    $file_headers = @get_headers($file);

    if($file_headers[0] == 'HTTP/1.1 404 Not Found' || $file_headers[0] == 'HTTP/1.1 404 NOT FOUND' || $file_headers[0] == 'HTTP/1.1 301 Moved Permanently') {

        $fileContent = @file_get_contents("http://".$domain);

        // loadHTML() is an instance method; calling it statically fails on PHP 8
        $dom = new DOMDocument();
        @$dom->loadHTML($fileContent);
        $xpath = new DOMXpath($dom);

        // "head" is not a child of the document node, so search from anywhere
        $elements = $xpath->query("//head/link/@href");

        $hrefs = array();

        foreach ($elements as $link) {
            $hrefs[] = $link->value;
        }

        $found_favicon = array();
        foreach ( $hrefs as $key => $value ) {
            if( substr_count($value, 'favicon.ico') > 0 ) {
                $found_favicon[] = $value;
                $icon_key = $key;
            }
        }

        $found_http = array();
        foreach ( $found_favicon as $key => $value ) {
            if( substr_count($value, 'http') > 0 ) {
                $found_http[] = $value;
                $favicon = $hrefs[$icon_key];
                $method = "xpath";
            } else {
                $favicon = $domain.$hrefs[$icon_key];
                if (substr($favicon, 0, 4) != 'http') {
                    $favicon = 'http://' . $favicon;
                    $method = "xpath+http";
                }
            }
        }

        if (isset($favicon)) {
            if (!CheckImageExists($favicon)) {
                $favicon = $fallback_favicon;
                $method = "fallback";
            }
        } else {
            $favicon = $fallback_favicon;
            $method = "fallback";
        }

    } else {
        $favicon = $file;
        $method = "classic";

        if (!CheckImageExists($file)) {
            $favicon = $fallback_favicon;
            $method = "fallback";
        }

    }
    return $favicon;
}

?>
Cirro answered 11/9, 2013 at 18:51 Comment(3)
Nice. But, doesn't seem to pick up the favicon for plenty of urls I tried throwing at it.Maddocks
Wow, this is old code ; My latest version is here: npmjs.com/package/favratCirro
Nice and interesting that you went for a js approach. I threw together a quick and dirty solution in PHP. I prefer the PHP, as it allows me also easily cache the results.Maddocks
G
0

For Firefox you could use https://addons.mozilla.org/en-US/firefox/addon/httpfox/. Load a website then press F10 > ... > "open HttpFox in own Window" then look for "image/x-icon"; in the column to the right is the URL.

Gaziantep answered 26/6, 2017 at 9:49 Comment(0)
H
0
  • Using IE, bookmark the site

  • Drag the shortcut from your bookmarks menu onto your desktop

  • Open the resulting .URL using a (real) text editor

  • There will be a line in the file for IconFile, which will point to the favicon file on the web server

  • Browse to the file... voilà!
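If you have such a shortcut file on hand and want to read the icon location programmatically rather than in a text editor, the steps above can be sketched in Java as follows. This assumes the file is plain text with one key=value pair per line and a key named IconFile, as described in the list; the class name and the sample content in the test are hypothetical.

```java
import java.util.Optional;

// Hypothetical helper: pulls the IconFile value out of the text of a .URL shortcut.
public class UrlShortcutIcon {
    private static final String KEY = "IconFile=";

    public static Optional<String> iconFile(String shortcutText) {
        // \R matches any line break, so Windows and Unix line endings both work.
        for (String line : shortcutText.split("\\R")) {
            String trimmed = line.trim();
            // Case-insensitive prefix check for "IconFile=".
            if (trimmed.regionMatches(true, 0, KEY, 0, KEY.length())) {
                String value = trimmed.substring(KEY.length()).trim();
                return value.isEmpty() ? Optional.empty() : Optional.of(value);
            }
        }
        return Optional.empty();
    }
}
```

An empty result means the shortcut has no IconFile line, in which case one of the other approaches in this thread is needed.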

Harrod answered 2/4, 2019 at 23:29 Comment(0)
G
-1

Open the page source (right-click, then View page source), find a line like the one below, and click the images/favicon.png link.

<link rel="icon" href="images/favicon.png" type="image/png" sizes="16x16">
Greatcoat answered 16/8, 2017 at 11:14 Comment(1)
The question specifically says "in either JSP or XSLT", indicating that the user wanted to dynamically retrieve the favicon without manually navigating through their DOMRhodes
