How do search engines deal with AngularJS applications?
I see two issues with AngularJS applications regarding search engines and SEO:

1) What happens with custom tags? Do search engines ignore the whole content within those tags? For example, suppose I have

<custom>
  <h1>Hey, this title is important</h1>
</custom>

would <h1> be indexed despite being inside custom tags?


2) Is there a way to stop search engines from indexing {{}} bindings literally? For example:

<h2>{{title}}</h2>

I know I could do something like

<h2 ng-bind="title"></h2>

but what if I want to actually let the crawler "see" the title? Is server-side rendering the only solution?

Nellanellda answered 21/11, 2012 at 17:44 Comment(8)
all of these "solutions" just make me want to steer away from technologies like AngularJS, at least until google et al. have more intelligent crawlers.Cyclothymia
@Cyclothymia : Yes, one would wonder why AngularJS, of all things, which is a product of Google, has not come up with a built-in solution for this.. Weird actually..Undermanned
Actually, Misko wrote Angular before he worked for Google. Google now sponsors the project, but they aren't the originators.Counterbalance
Perhaps someone here can/should update the Wikipedia article on SPA which states "SPAs are commonly not used in a context where search engine indexing is either a requirement, or desirable." en.wikipedia.org/wiki/Single-page_application [# Search engine optimization] Theres a huge paragraph about an (obscure) java-based framework called IsNat but no suggestion that SEO has been addressed by the likes of Angularjs.Cabalistic
Just an update from April 2016 - NONE of my AngularJS sites were indexed. I know others are having luck by seems like google bot doesn't understand sites with angular-ui-routerDilator
@Dilator I can confirm this as well. Interestingly, a site I made in React and react-router is fully indexed, no problem. I really wish I knew what the differentiating factor was between Angular sites and my React one.Simulated
@Simulated i kindof having second thoughts, because Google started to index my pages. But Chrome hangs when i try to open cache. Quite screwed up..Dilator
@Roy M J - Why does no one see the intent? PageSpeed, Angular, etc. are all enemies of natural, organic listings on the SERPs. Purposely. When you have a huge business model based on Pay-Per-Clicks... how better to force people to pay for their listings than creating an entire toolbox that will give them no option, but to do so? Instead of building quality web sites filled with valuable content, this industry is now overflowing with cheats and solutions that don't achieve or solve squat diddly.Juggins
P
406

Update May 2014

Google's crawlers now execute JavaScript - you can use the Google Webmaster Tools to better understand how your sites are rendered by Google.

Original answer
If you want to optimize your app for search engines there is unfortunately no way around serving a pre-rendered version to the crawler. You can read more about Google's recommendations for ajax and javascript-heavy sites here.

If this is an option I'd recommend reading this article about how to do SEO for Angular with server-side rendering.

I’m not sure what the crawler does when it encounters custom tags.

Petromilli answered 23/11, 2012 at 0:17 Comment(17)
Thanks for the article! I was aware of this. If there's no other option, then I guess I'll have to do it this way...Nellanellda
custom tags shouldn't matter because they get rendered into whatever the directive transcludes it as (try looking at the actual html for any rendered page). so far only oldIEs have a problem with custom tags.Illa
This answer is quite old now. Please see my below answer for the latest in Angular SEO.Smitty
This is no longer current. You should now use pushState instead. There is no need to serve a separate static version of the site.Counterbalance
even with the google update, ng-view will not be rendered correctly, as i can see in Google Webmaster toolsFestination
Yeah just because they execute javascript doesn't mean that your page will be indexed properly. The safest way is to detect the google bot useragent, use a headless browser like phantomjs, obtain page.content and return static html.Wop
I realize this question is specific to SEO, but keep in mind that other crawlers (Facebook, Twitter, etc.) aren't yet able to evaluate JavaScript. Sharing pages on social media sites, for example, would still be a problem without a server-side rendering strategy.Deron
Googling 'Angularjs seo' gets me this: prerender.io Any use to this answer?Dalhousie
Please, can someone give an example of AngularJS site correctly indexed without implementing the Google crawling scheme specification?Universality
FYI googlewebmastercentral.blogspot.ca/2015/10/…Varney
@check_ca: Without escaped fragment -- with escaped fragment. Example search on MixCloud. Example search on illegalcartoon.Rode
@AllanBogh The indexed content of MixCloud is rendered on server-side.Universality
Google doesn't parse Angular websitesDilator
orchardmile.com is served without pre-rendered content to Google. However we are using a pre-rendered service (prerender.io) in order to work with other bots like Slack, Skype, Facebook, etc.Maddalena
@JonathanMuszkat, Then how you are managing Google SEO, if you are not serving pre-rendered content to Google crawlers? How you have got site indexing in Google search. I am very excited to hear from you :) and any small lead would help me to understand.Malefactor
@NeerajSingh Google is SPA capable. So for Google you don't need to do anything else for SEO apart for a simple code to change the metadata based on the routing.Maddalena
@JonathanMuszkat, Thanks John for your comment :). I was wondering, if there are some another ways. Because Google is not fully ready to understand complex SPA - especially which has slow page speed and lots of dynamic data rendering. As it's use headless browsers which is the not real scenario. But yes! most the medium level SPA could be crawled by Google itself without any issues.Malefactor
C
472

(2022) Use Server Side Rendering if possible, and generate URLs with Pushstate

Google can and will run JavaScript now so it is very possible to build a site using only JavaScript provided you create a sensible URL structure. However, pagespeed has become a progressively more important ranking factor and typically pages built clientside perform poorly on initial render.

Serverside rendering (SSR) can help by allowing your pages to be pre-generated on the server. Your HTML contains the div that will be used as the page root, but this is not an empty div; it contains the HTML that the JavaScript would have generated if it were allowed to run.

The client downloads the HTML and renders it, giving a very fast initial load. It then executes the JavaScript, which takes over the content of the root div (attaching to, or regenerating, the server-rendered markup) in a process known as hydration.
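As a rough sketch of the idea (not tied to any particular framework), the server can call the same render function the client would use and place its output inside the root div. The names `renderApp` and `renderDocument`, the `app` id and the `/app.js` bundle are all assumptions made up for illustration.

```javascript
// Escape text so it is safe to place inside HTML element content.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical view function: the same code could run on server and client.
function renderApp(state) {
  return '<h1>' + escapeHtml(state.title) + '</h1>';
}

// Server side: the root div is shipped already filled with rendered markup.
function renderDocument(state) {
  return [
    '<!doctype html>',
    '<html>',
    '<head><title>' + escapeHtml(state.title) + '</title></head>',
    '<body>',
    '<div id="app">' + renderApp(state) + '</div>',
    '<script src="/app.js"></script>',
    '</body>',
    '</html>'
  ].join('\n');
}
```

A crawler that never runs `/app.js` still receives the `<h1>` inside the root div, which is the whole point of the technique.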

Many newer frameworks come with SSR built in, notably NextJS.

(2015) Use PushState and Precomposition

The current (2015) way to do this is using the JavaScript pushState method.

PushState changes the URL in the top browser bar without reloading the page. Say you have a page containing tabs. The tabs hide and show content, and the content is inserted dynamically, either using AJAX or by simply setting display:none and display:block to hide and show the correct tab content.

When the tabs are clicked, use pushState to update the URL in the address bar. When the page is rendered, use the value in the address bar to determine which tab to show. Angular routing will do this for you automatically.
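Outside Angular, the tab behaviour described above might look roughly like this. It is only a sketch: the tab names, the `/product` base path and the `data-tab-link` / `data-tab-panel` attributes are assumptions, not anything from a real site.

```javascript
// Hypothetical tab names; the first one is the default.
const TABS = ['overview', 'details', 'reviews'];

// Map a URL path such as "/product/details" to the tab that should be shown.
function tabFromPath(pathname) {
  const last = pathname.split('/').filter(Boolean).pop();
  return TABS.includes(last) ? last : TABS[0];
}

// Build the URL to push when a tab is clicked.
function pathForTab(basePath, tab) {
  return basePath.replace(/\/+$/, '') + '/' + tab;
}

// Browser-only wiring; skipped when no DOM is available.
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
  const show = function (tab) {
    document.querySelectorAll('[data-tab-panel]').forEach(function (panel) {
      panel.style.display = panel.dataset.tabPanel === tab ? 'block' : 'none';
    });
  };
  document.querySelectorAll('[data-tab-link]').forEach(function (link) {
    link.addEventListener('click', function (event) {
      event.preventDefault();
      const tab = link.dataset.tabLink;
      // Update the address bar without reloading the page.
      history.pushState({ tab: tab }, '', pathForTab('/product', tab));
      show(tab);
    });
  });
  // Back/forward buttons fire popstate; re-read the URL to pick the tab.
  window.addEventListener('popstate', function () {
    show(tabFromPath(location.pathname));
  });
  // Direct hit (or crawler): choose the tab from the address bar.
  show(tabFromPath(location.pathname));
}
```

The important part is that `tabFromPath` runs on every page load, so a direct hit on `/product/details` shows the same content a click would have.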

Precomposition

There are two ways to hit a PushState Single Page App (SPA)

  1. Via PushState, where the user clicks a PushState link and the content is AJAXed in.
  2. By hitting the URL directly.

The initial hit on the site will involve hitting the URL directly. Subsequent hits will simply AJAX in content as the PushState updates the URL.

Crawlers harvest links from a page, then add them to a queue for later processing. This means that for a crawler, every hit on the server is a direct hit; they don't navigate via PushState.

Precomposition bundles the initial payload into the first response from the server, possibly as a JSON object. This allows the Search Engine to render the page without executing the AJAX call.
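One way to picture precomposition, as a sketch only: the server injects the initial data into the template before sending it. `window.__INITIAL_DATA__` and the function names are hypothetical, not an Angular convention.

```javascript
// Serialize data for inlining in a <script> tag; "<" is escaped so a value
// containing "</script>" cannot end the tag early.
function serializeForScript(data) {
  return JSON.stringify(data).replace(/</g, '\\u003c');
}

// Build the first response: the shared template plus the initial payload.
// window.__INITIAL_DATA__ is a made-up name used for illustration.
function precompose(templateHtml, initialData) {
  const payload =
    '<script>window.__INITIAL_DATA__ = ' +
    serializeForScript(initialData) +
    ';</script>';
  // A function replacement avoids "$" patterns in the payload being
  // interpreted by String.prototype.replace.
  return templateHtml.replace('</head>', function () {
    return payload + '</head>';
  });
}
```

On the client, the app would read `window.__INITIAL_DATA__` for its first render and only fall back to AJAX for later navigation.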

There is some evidence to suggest that Google might not execute AJAX requests. More on this here:

https://web.archive.org/web/20160318211223/http://www.analog-ni.co/precomposing-a-spa-may-become-the-holy-grail-to-seo

Search Engines can read and execute JavaScript

Google has been able to parse JavaScript for some time now, it's why they originally developed Chrome, to act as a full featured headless browser for the Google spider. If a link has a valid href attribute, the new URL can be indexed. There's nothing more to do.

If clicking a link in addition triggers a pushState call, the site can be navigated by the user via PushState.

Search Engine Support for PushState URLs

PushState is currently supported by Google and Bing.

Google

Here's Matt Cutts responding to Paul Irish's question about PushState for SEO:

http://youtu.be/yiAF9VdvRPw

Here is Google announcing full JavaScript support for the spider:

http://googlewebmastercentral.blogspot.de/2014/05/understanding-web-pages-better.html

The upshot is that Google supports PushState and will index PushState URLs.

See also Google webmaster tools' fetch as Googlebot. You will see your JavaScript (including Angular) is executed.

Bing

Here is Bing's announcement of support for pretty PushState URLs dated March 2013:

http://blogs.bing.com/webmaster/2013/03/21/search-engine-optimization-best-practices-for-ajax-urls/

Don't use HashBangs #!

Hashbang URLs were an ugly stopgap requiring the developer to provide a pre-rendered version of the site at a special location. They still work, but you don't need to use them.

Hashbang URLs look like this:

domain.example/#!path/to/resource

This would be paired with a metatag like this:

<meta name="fragment" content="!">

Google will not index them in this form, but will instead pull a static version of the site from the corresponding _escaped_fragment_ URL and index that.

Pushstate URLs look like any ordinary URL:

domain.example/path/to/resource

The difference is that Angular handles them for you by intercepting the change to document.location and dealing with it in JavaScript.

If you want to use PushState URLs (and you probably do) take out all the old hash style URLs and metatags and simply enable HTML5 mode in your config block.

Testing your site

Google Webmaster tools now contains a tool which will allow you to fetch a URL as Google, and render JavaScript as Google renders it.

https://www.google.com/webmasters/tools/googlebot-fetch

Generating PushState URLs in Angular

To generate real URLs in Angular, rather than # prefixed ones, set HTML5 mode on your $locationProvider object.

$locationProvider.html5Mode(true);
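For context, a fuller config block might look something like the sketch below. The module name, route, template and controller are placeholders. Note that from AngularJS 1.3 on, HTML5 mode also expects a `<base href="/">` tag in the page head by default.

```javascript
// Config function; AngularJS injects the providers named in $inject.
function configureRoutes($locationProvider, $routeProvider) {
  // Use real paths (pushState) instead of # URLs.
  $locationProvider.html5Mode(true);

  $routeProvider
    .when('/path/to/resource', {
      templateUrl: '/partials/resource.html', // hypothetical template
      controller: 'ResourceCtrl'              // hypothetical controller
    })
    .otherwise({ redirectTo: '/' });
}
configureRoutes.$inject = ['$locationProvider', '$routeProvider'];

// Register only when AngularJS is loaded on the page. This creates the
// module; if 'myApp' already exists, use angular.module('myApp') instead.
if (typeof angular !== 'undefined') {
  angular.module('myApp', ['ngRoute']).config(configureRoutes);
}
```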

Server Side

Since you are using real URLs, you will need to ensure the same template (plus some precomposed content) gets shipped by your server for all valid URLs. How you do this will vary depending on your server architecture.
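If your server happens to be Node, the "same template for all valid URLs" rule could be sketched like this. The shell markup, the `/api/` prefix and the file-extension check are assumptions; other stacks would do the equivalent with a catch-all route or rewrite rule.

```javascript
// The single template shipped for every application URL (assumed markup).
const SHELL_HTML =
  '<!doctype html><html ng-app="myApp"><head><base href="/"></head>' +
  '<body><div ng-view></div><script src="/app.js"></script></body></html>';

// Decide what kind of response a request path should get.
function classify(pathname) {
  if (pathname.startsWith('/api/')) return 'api';
  if (/\.[a-z0-9]+$/i.test(pathname)) return 'asset'; // e.g. /app.js, /logo.png
  return 'shell';
}

// Node-style request handler: every non-API, non-asset URL gets the shell.
function serveShell(req, res) {
  const pathname = req.url.split('?')[0];
  if (classify(pathname) === 'shell') {
    res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
    res.end(SHELL_HTML);
    return;
  }
  // API and static files would be handled elsewhere; out of scope here.
  res.writeHead(404, { 'Content-Type': 'text/plain; charset=utf-8' });
  res.end('Not found');
}
```

`serveShell` has the `(req, res)` shape that Node's `http.createServer` accepts, so a direct hit on `/path/to/resource` returns the template and Angular takes it from there.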

Sitemap

Your app may use unusual forms of navigation, for example hover or scroll. To ensure Google is able to drive your app, I would probably suggest creating a sitemap, a simple list of all the URLs your app responds to. You can place this at the default location (/sitemap or /sitemap.xml), or tell Google about it using webmaster tools.

It's a good idea to have a sitemap anyway.
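A sitemap is simple enough to generate from a list of routes. Here is a minimal sketch using the sitemaps.org `urlset` format; the function names and the example origin are placeholders.

```javascript
// Escape characters that are special in XML text.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build a sitemap document from an origin and a list of application paths.
function buildSitemap(origin, paths) {
  const entries = paths.map(function (path) {
    return '  <url><loc>' + escapeXml(origin + path) + '</loc></url>';
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
  ].concat(entries, ['</urlset>']).join('\n');
}
```

For example, `buildSitemap('http://domain.example', ['/path/to/resource'])` yields one `<url>` entry per path, which you would then serve at /sitemap.xml.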

Browser support

PushState works in IE10 and later. In older browsers, Angular will automatically fall back to hash-style URLs.

A demo page

The following content is rendered using a pushstate URL with precomposition:

http://html5.gingerhost.com/london

As can be verified, at this link, the content is indexed and is appearing in Google.

Serving 404 and 301 Header status codes

Because the search engine will always hit your server for every request, you can serve header status codes from your server and expect Google to see them.
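As an illustration, the server can decide the status code per URL before any JavaScript runs. The route tables below are invented for the example; a real app would derive them from its own data.

```javascript
// Hypothetical route tables; a real app would derive these from its data.
const KNOWN_PATHS = new Set(['/', '/london', '/paris']);
const MOVED = new Map([['/londres', '/london']]);

// Pick the HTTP status (and redirect target, if any) for a request path.
function statusFor(pathname) {
  if (MOVED.has(pathname)) {
    return { status: 301, location: MOVED.get(pathname) };
  }
  if (KNOWN_PATHS.has(pathname)) {
    return { status: 200 };
  }
  return { status: 404 };
}

// Node-style handler applying the decision above.
function respond(req, res) {
  const result = statusFor(req.url.split('?')[0]);
  if (result.status === 301) {
    res.writeHead(301, { Location: result.location });
    res.end();
  } else if (result.status === 404) {
    res.writeHead(404, { 'Content-Type': 'text/html; charset=utf-8' });
    res.end('<h1>Not found</h1>');
  } else {
    res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
    res.end('<!-- application template goes here -->');
  }
}
```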

Counterbalance answered 23/4, 2014 at 13:10 Comment(33)
I have to look into this - thanks for the explanation. One thing I keep wondering is, does google now run the javascript before indexing the page?Mimas
According to Google they do execute the JavaScript and index the result. They do this to prevent cloaking, and to let them spider single page apps.Counterbalance
But will Google crawl the actual angular links?Piscina
.pushState by itself does not actually seem to do anything. There must be something else that triggers it. (I am not using Angular) See my flower painting site as an example: tinyurl.com/mau55bn .. the text doesn't change at all from the default description meta tag. Any ideas? Edit: perhaps the description tag trumps generated content?Burgoo
I'm not sure I understand. Pushstate changes the address in the address bar. After that it's up to you.Counterbalance
@Jamie, According to Google they will, provided you have told Angular to generate real links using pushState.Counterbalance
"PushState changes the URL in the top browser bar without reloading the page... When the tabs are clicked, use pushState to update the url in the address bar. When the page is rendered, use the value in the address bar to determine which tab to show. Angular routing will do this for you automatically." Lightbulb!Perissodactyl
@superluminary, could you, please, explain the subject a little bit deeper? Especially the 'Server side' section. I'm using angularjs + angularjs-route + locationProvider.html5Mode + api + dynamic navigation (not the static one like on html5.gingerhost.com. URLs are displayed well, however the content does not seem to be indexed. Do I have to serve somehow a static content while accessing a page by a direct url? I'm actually confused by this message: >>you will need to ensure the same template gets shipped by your server for all valid URLs. Could you please explain it? Thanks in advance.Latoyalatoye
@sray, yes I can. Visit your home page, now navigate to a sub page by clicking a link that's handled by Angular. PushState causes the URL to change, and Angular notices the change and responds appropriately. Your server is not hit, except possibly by an AJAX call. Angular has handled it all without troubling the browser. - OK, now copy that URL into another tab and press enter, what do you see?Counterbalance
@sray - If every URL on your site is serving the same template, the browser will be able to grab the template, and Angular will be able to take it from there, by inspecting the URL and rendering the correct content. If hitting that URL directly on the server returns a 404 or a 500 then you have a problem, direct links will not work, bookmarks will not work and you will not be indexed. Do you see now?Counterbalance
@superluminary, thank you for your reply. Are you saying that it is enough to configure the server to server the same static page and no need to feed pre-rendered content for search bots like escaped_content? If so, I'm confused, I configured server, it generates a dynamic pages and dynamic urls, but they're not getting fixed. Please check my ticket with details: #26201240 I will be glad if you could help me. Thanks in advance.Latoyalatoye
Google webdev tools (fetch as google), still sees empty content of my site, dynamic content are being indexed or not?Sophiasophie
@calmbird - no, if Google webmaster tools is seeing a blank page you have an issue with your site and will not be indexed. It could be configuration, it could be JavaScript, it could even be robots.txt.Counterbalance
no :) fetch as google doesn't interprate <meta tags>. I can see blank page in fetch as google, but my page is still being indexed.Sophiasophie
@calmbird - you need to fix the blank page issue. It's hard to say what is causing it without more information.Counterbalance
@Counterbalance - as I said, fetch as google doesn't interprate metatag that says webpage has dynamic content, thats why fetch as google see empty side. To check if everything is ok just put ?_escaped_fragment_, in fetch as google url.Sophiasophie
@calmbird - no, forget escaped fragments, take that stuff out and the metatag that supports it. Reread my post, use pushstate with real urls, not escaped fragments and hashbangs. There must be no # characters in your url, except for in page links.Counterbalance
@Counterbalance dynamic titles and description tags are not indexed. But content does. How will we do the titles for SEO?Mallett
My SPA content is loaded via AJAX. I can see in search results only {{something.property}} as if JS is disabled. And also all the pages I submitted via sitemap the same {{event.description}} instead of actual value.Dilator
Hi @toolkit - it sounds like Google is not correctly executing your script. What does it look like in Webmaster Tools?Counterbalance
@Counterbalance in GWT it displays correctly when i do Fetch and Render. Should I preload pages myself and detect Googlebot and other bots?Dilator
No, don't do any of that. Can you share the url of your site?Counterbalance
well better not. But why can't I preload?Dilator
Just a note that you should not abandon server rendering if you care about SEO. Read my comment here: reddit.com/r/angularjs/comments/2wcwt9/…Subito
How do i implement this on my SPA hosted in Github? it includes only one page with no links, only anchors to jump through different places in a long page. Can some refer me to github docs, if any?Patois
@Patois - On Github static pages you will need to create an actual page for every URL your site will respond to. Painful if you have more than one page.Counterbalance
I don't see how that would help. My AngularJS pages would still be uncrawlable.Patois
@Patois - I'm sorry, can you say more, why would your Angular pages be uncrawlable? Google will hit URLs directly, you just have to make sure there's content there for it to get. Pushstate URLs are essentially real URLs, and the crawler will treat them as such.Counterbalance
I understand from you that Google will crawl JS on my page. Since my entire content is served using Angular on one long page (reading JSON data), My content will be crawled. I don't even need to split to pages. Is this correct?Patois
@Patois - You should have one URL for every page your site will respond to. If your site only needs to respond to one URL with one set of content, you don't need any routing at all. This is fine for a simple site. If however your site brings in different data (via JSON) for different URLs it would make sense to use routing. Because Github static pages is file based, you would need an actual html file backing each URL in this instance. There is no rule that a website has to be file based though, and if you use an alternative platform you can serve the same template for multiple URLs.Counterbalance
Sitelinks show in Google search results page JSON: google.co.uk/… - see the screenshot: i.imgur.com/fr1PAgm.png Is there a way to fix that?Media
I've managed to set up my Angular SPA without hashbangs but I'm unable to index both title and meta description. I've explained the problem here Can you please help?Streamway
This link is dead: analog-ni.co/…Plum
T
108

Let's get definitive about AngularJS and SEO

Google, Yahoo, Bing, and other search engines crawl the web in traditional ways using traditional crawlers. They run robots that crawl the HTML on web pages, collecting information along the way. They keep interesting words and look for links to other pages (the number of these links comes into play with SEO).

So why don't search engines deal with javascript sites?

The answer has to do with the fact that the search engine robots work through headless browsers and they most often do not have a javascript rendering engine to render the javascript of a page. This works for most pages as most static pages don't care about JavaScript rendering their page, as their content is already available.

What can be done about it?

Luckily, the crawlers of the larger search engines have started to implement a mechanism that allows us to make our JavaScript sites crawlable, but it requires us to implement a change to our site.

If we change our hashPrefix to be #! instead of simply #, then modern search engines will change the request to use _escaped_fragment_ instead of #!. (With HTML5 mode, i.e. where we have links without the hash prefix, we can implement this same feature by looking at the User Agent header in our backend).

That is to say, instead of a request from a normal browser that looks like:

http://www.ng-newsletter.com/#!/signup/page

A search engine will search the page with:

http://www.ng-newsletter.com/?_escaped_fragment_=/signup/page
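To make the mapping concrete, here is a small sketch of the rewrite a crawler performs under this scheme. It is a simplification: only ASCII special characters are escaped, and the function name is made up.

```javascript
// Convert a #! URL into the _escaped_fragment_ form a crawler would request.
// URLs without "#!" are returned unchanged.
function toEscapedFragmentUrl(url) {
  const marker = url.indexOf('#!');
  if (marker === -1) return url;
  const base = url.slice(0, marker);
  const fragment = url.slice(marker + 2);
  // Percent-escape control characters, space, and # % & + in the fragment.
  const escaped = fragment.replace(/[\x00-\x20#%&+\x7f]/g, function (ch) {
    return '%' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0');
  });
  // Append to an existing query string if there is one.
  const separator = base.includes('?') ? '&' : '?';
  return base + separator + '_escaped_fragment_=' + escaped;
}
```

Calling it with the `#!/signup/page` URL above produces the `?_escaped_fragment_=/signup/page` URL shown, which is the request your backend needs to recognise.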

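We can set the hash prefix of our Angular apps using the built-in $locationProvider (only providers can be injected into a config block, not the $location service itself):

angular.module('myApp', [])
.config(['$locationProvider', function($locationProvider) {
  // Generate #! URLs instead of plain # URLs
  $locationProvider.hashPrefix('!');
}]);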

And, if we're using html5Mode, we will need to implement this using the meta tag:

<meta name="fragment" content="!">

Reminder, we can set html5Mode() with the $locationProvider:

angular.module('myApp', [])
.config(['$locationProvider',
function($locationProvider) {
  // Use real (pushState) URLs without a hash prefix
  $locationProvider.html5Mode(true);
}]);

Handling the search engine

We have a lot of opportunities to determine how we'll deal with actually delivering content to search engines as static HTML. We can host a backend ourselves, we can use a service to host a back-end for us, we can use a proxy to deliver the content, etc. Let's look at a few options:

Self-hosted

We can write a service that crawls our own site with a headless browser, like phantomjs or zombiejs, taking a snapshot of each page with its rendered data and storing it as HTML. Whenever we see the query string ?_escaped_fragment_ in a request, we can deliver the static HTML snapshot we took of the page instead of the normal page, which is rendered only through JS. This requires us to have a backend that delivers our pages with conditional logic in the middle. We can use something like prerender.io's backend as a starting point to run this ourselves. Of course, we still need to handle the proxying and the snippet handling, but it's a good start.
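The conditional logic in the middle could be sketched roughly as follows. The `snapshots/` directory layout and both function names are assumptions for illustration; taking the snapshots themselves is left out.

```javascript
// If the request carries ?_escaped_fragment_=..., return the route the
// crawler is asking about; otherwise return null (serve the normal app).
function crawlerRoute(requestUrl) {
  // The base is only needed so relative request URLs can be parsed.
  const parsed = new URL(requestUrl, 'http://localhost');
  if (!parsed.searchParams.has('_escaped_fragment_')) return null;
  const fragment = parsed.searchParams.get('_escaped_fragment_');
  // With html5Mode plus the meta tag the fragment is empty: use the path.
  return fragment === '' ? parsed.pathname : fragment;
}

// Map a route to a snapshot file name (hypothetical layout: snapshots/*.html).
function snapshotFile(route) {
  const name = route.replace(/^\/+|\/+$/g, '').replace(/[^a-zA-Z0-9_-]+/g, '_');
  return 'snapshots/' + (name || 'index') + '.html';
}
```

A request handler would call `crawlerRoute(req.url)` first and, when it returns a route, send back the file named by `snapshotFile` instead of the JavaScript-driven page.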

With a paid service

The easiest and fastest way to get content into search engines is to use a service. Brombone, seo.js, seo4ajax, and prerender.io are good examples of services that will host the above content rendering for you. This is a good option for the times when we don't want to deal with running a server/proxy. Also, it's usually super quick.

For more information about Angular and SEO, we wrote an extensive tutorial on it at http://www.ng-newsletter.com/posts/serious-angular-seo.html and we detailed it even more in our book ng-book: The Complete Book on AngularJS. Check it out at ng-book.com.

Tolle answered 24/12, 2013 at 20:5 Comment(6)
SEO4Ajax is also a good example of paid service (free during the beta). Unfortunately, it looks like I'm not allowed to edit this response to add it in the list.Universality
@Tolle Do you still recommend this approach? The newer top voted comment seems to discourage this approach.Whitt
This is a great example about why we should never say things like "definitive guide" in CS :). Major search engines now execute Javascript, so this answer need to be rewritten or deleted altogether.Animate
@seb this is still needed for let's say open graph tags that need to be in the page when robots are crawling it. For example Facebook or Twitter cards need it. But this answer should be updated to focus on HTML5 pushstate instead of hashbang that is deprecated now.Wadi
@Grsmto you are right! Then I guess it should get rewritten because it says that major search engines don't execute JS, which is not true anymore.Animate
This is outdated info. Google no longer recommends. Consider this approach deprecated at best.Distillery
S
58

You should really check out the tutorial on building an SEO-friendly AngularJS site on the year of moo blog. He walks you through all the steps outlined on Angular's documentation. http://www.yearofmoo.com/2012/11/angularjs-and-seo.html

Using this technique, the search engine sees the expanded HTML instead of the custom tags.

Showker answered 27/11, 2012 at 21:55 Comment(1)
@Brad Green even so the question was closed (for whatever reasons) you might be the position to answer it. I guess I must be missing something: #16224885Kast
E
42

This has drastically changed.

http://searchengineland.com/bing-offers-recommendations-for-seo-friendly-ajax-suggests-html5-pushstate-152946

If you use: $locationProvider.html5Mode(true); you are set.

No more rendering pages.

Epp answered 19/2, 2014 at 22:35 Comment(6)
This should be top answer now. We are in 2014 and answer by @Petromilli is no longer optimal.Scutellation
This is incorrect. That article (from March 2013) says nothing about Bing executing javascript. Bing simply gives a recommendation to use pushstate instead of their previous recommendation to use #!. From the article: "Bing tells me that while they still support the #! version of crawlable AJAX originally launched by Google, they’re finding it’s not implemented correctly much of the time, and they strongly recommend pushState instead." You still have to render the static HTML and serve it for _escaped_fragment_ URLs. Bing/Google will not execute the javascript/AJAX calls.Byer
You still need _escaped_fragment_ and render pure html pages. This solves nothing mate.Scutellation
Still google robot can't see dynamic content of my site, only empty page.Sophiasophie
search site:mysite.com shows {{staff}}, not the content loaded via AngularJS. As if Google crawler never heard of JavaScript. What can I do?Dilator
again me - $locationProvider.html5Mode(true) doesn't work, google index is showing {{....}}Dilator
S
17

Things have changed quite a bit since this question was asked. There are now options to let Google index your AngularJS site. The easiest option I found was to use the free http://prerender.io service, which will generate the crawlable pages for you and serve them to the search engines. It is supported on almost all server-side web platforms. I have recently started using them and the support is excellent too.

I do not have any affiliation with them, this is coming from a happy user.

Smitty answered 26/11, 2013 at 16:25 Comment(3)
The code for prerender.io is on github (github.com/collectiveip/prerender) so anyone can run it on its own servers.Marchal
This is now outdated as well. See @user3330270's answer below.Ander
This is not outdated. @user3330270's answer is incorrect. The article they link to simply says to use pushstate instead of the #!. You still have to render static pages for the crawlers because they do not execute javascript.Byer
A
9

Angular's own website serves simplified content to search engines: http://docs.angularjs.org/?_escaped_fragment_=/tutorial/step_09

Say your Angular app is consuming a Node.js/Express-driven JSON api, like /api/path/to/resource. Perhaps you could redirect any requests with ?_escaped_fragment_ to /api/path/to/resource.html, and use content negotiation to render an HTML template of the content, rather than return the JSON data.

The only thing is, your Angular routes would need to match 1:1 with your REST API.

EDIT: I'm realizing that this has the potential to really muddy up your REST api and I don't recommend doing it outside of very simple use-cases where it might be a natural fit.

Instead, you can use an entirely different set of routes and controllers for your robot-friendly content. But then you're duplicating all of your AngularJS routes and controllers in Node/Express.

I've settled on generating snapshots with a headless browser, even though I feel that's a little less-than-ideal.

Antimatter answered 21/11, 2013 at 1:25 Comment(0)
A
8

A good practice can be found here:

http://scotch.io/tutorials/javascript/angularjs-seo-with-prerender-io?_escaped_fragment_=tag

Adust answered 16/3, 2014 at 13:56 Comment(0)
E
7

As of now Google has changed their AJAX crawling proposal.

Times have changed. Today, as long as you're not blocking Googlebot from crawling your JavaScript or CSS files, we are generally able to render and understand your web pages like modern browsers.

tl;dr: [Google] are no longer recommending the AJAX crawling proposal [Google] made back in 2009.

Emanuelemanuela answered 15/10, 2015 at 10:0 Comment(7)
@Dilator what do you mean?Emanuelemanuela
Googlebot is NOT able to parse Angular websitesDilator
@Dilator you're talking absolute hoop, my full Angular site has been indexed by Google with dynamic meta data without any issuesSerenaserenade
@Serenaserenade you have faulty logic: you mean that if one (your) Angular website was indexed, all were. Well, I have a surprise for you. NONE of mine were indexed. Maybe because I use angular-ui-router, or who knows why. Not even the main pages without any AJAX dataDilator
@Dilator If not even your static HTML pages are indexed, this has nothing to do with Google's ability to crawl JS files. If you are saying that Google can't crawl anything properly... well, I think you are wrongValencia
Facing the same issue with a ReactJS SPA too, no indexing.Volva
Some pages of mine can be fetched by Google, and some can't. It's due to a particular library (Draft JS). So no, Google can't render all JS like modern browsers.Logos
I
6

Google's Crawlable Ajax Spec, as referenced in the other answers here, is basically the answer.

If you're interested in how other search engines and social bots deal with the same issues, I wrote up the state of the art here: http://blog.ajaxsnapshots.com/2013/11/googles-crawlable-ajax-specification.html

I work for https://ajaxsnapshots.com, a company that implements the Crawlable Ajax Spec as a service. The information in that report is based on observations from our logs.

If answered 21/1, 2014 at 22:53 Comment(1)
Link is down in the listed blog.ajaxsnapshots.comPrent
M
4

I have found an elegant solution that would cover most of your bases. I wrote about it initially here and answered another similar Stack Overflow question here which references it.

FYI this solution also includes hard coded fallback tags in case JavaScript isn't picked up by the crawler. I haven't explicitly outlined it, but it is worth mentioning that you should be activating HTML5 mode for proper URL support.

Also note: these aren't the complete files, just the relevant parts. I can't help with writing the boilerplate for directives, services, etc.

app.example

This is where you provide the custom metadata for each of your routes (title, description, etc.)

$routeProvider
   .when('/', {
       templateUrl: 'views/homepage.html',
       controller: 'HomepageCtrl',
       metadata: {
           title: 'The Base Page Title',
           description: 'The Base Page Description' }
   })
   .when('/about', {
       templateUrl: 'views/about.html',
       controller: 'AboutCtrl',
       metadata: {
           title: 'The About Page Title',
           description: 'The About Page Description' }
   })

metadata-service.js (service)

Sets the custom metadata options or uses defaults as fallbacks.

var self = this;

// Set custom options or use provided fallback (default) options
self.loadMetadata = function(metadata) {
  // Routes defined without a metadata block pass undefined here
  metadata = metadata || {};
  self.title = document.title = metadata.title || 'Fallback Title';
  self.description = metadata.description || 'Fallback Description';
  self.url = metadata.url || $location.absUrl();
  self.image = metadata.image || 'fallbackimage.jpg';
  self.ogpType = metadata.ogpType || 'website';
  self.twitterCard = metadata.twitterCard || 'summary_large_image';
  self.twitterSite = metadata.twitterSite || '@fallback_handle';
};

// Route change handler, sets the route's defined metadata
$rootScope.$on('$routeChangeSuccess', function (event, newRoute) {
  // newRoute can be undefined when no route matches, so guard the lookup
  self.loadMetadata(newRoute && newRoute.metadata);
});

metaproperty.js (directive)

Packages the metadata service results for the view.

return {
  restrict: 'A',
  scope: {
    metaproperty: '@'
  },
  link: function postLink(scope, element, attrs) {
    scope.default = element.attr('content');
    scope.metadata = metadataService;

    // Watch for metadata changes and set content
    scope.$watch('metadata', function (newVal, oldVal) {
      setContent(newVal);
    }, true);

    // Set the content attribute with new metadataService value or back to the default
    function setContent(metadata) {
      var content = metadata[scope.metaproperty] || scope.default;
      element.attr('content', content);
    }

    setContent(scope.metadata);
  }
};

index.html

Complete with the hardcoded fallback tags mentioned earlier, for crawlers that can't pick up any JavaScript.

<head>
  <title>Fallback Title</title>
  <meta name="description" metaproperty="description" content="Fallback Description">

  <!-- Open Graph Protocol Tags -->
  <meta property="og:url" content="fallbackurl.example" metaproperty="url">
  <meta property="og:title" content="Fallback Title" metaproperty="title">
  <meta property="og:description" content="Fallback Description" metaproperty="description">
  <meta property="og:type" content="website" metaproperty="ogpType">
  <meta property="og:image" content="fallbackimage.jpg" metaproperty="image">

  <!-- Twitter Card Tags -->
  <meta name="twitter:card" content="summary_large_image" metaproperty="twitterCard">
  <meta name="twitter:title" content="Fallback Title" metaproperty="title">
  <meta name="twitter:description" content="Fallback Description" metaproperty="description">
  <meta name="twitter:site" content="@fallback_handle" metaproperty="twitterSite">
  <meta name="twitter:image:src" content="fallbackimage.jpg" metaproperty="image">
</head>

This should help dramatically with most search engine use cases. If you want fully dynamic rendering for social network crawlers (which are iffy on JavaScript support), you'll still have to use one of the pre-rendering services mentioned in some of the other answers.

Mauser answered 15/12, 2015 at 16:56 Comment(2)
I am also following this solution, and had thought along the same lines before, but I want to ask: will search engines read the contents of custom tags?Amuse
@RavinderPayal can you check this solution with seoreviewtools.com/html-headings-checkerDesdamona
P
3

With Angular Universal, you can generate landing pages for the app that look like the complete app and then load your Angular app behind them.
Angular Universal generates pure HTML (that is, JavaScript-free pages) on the server side and serves them to users without delay. That way you can handle any crawler, bot, or user (including users with limited CPU and network speed). You can then direct them, via links or buttons, to your actual Angular app, which has already loaded behind the page. This solution is recommended by the official site. -More info about SEO and Angular Universal-

Predella answered 16/5, 2017 at 22:30 Comment(0)
G
2

Use something like PreRender; it makes static pages of your site so search engines can index them.

Here you can find out which platforms it is available for: https://prerender.io/documentation/install-middleware#asp-net

Galwegian answered 23/2, 2015 at 10:31 Comment(1)
Is Angular meant to ease the work, or does it just make operations costlier and more time-consuming?Amuse
K
1

Crawlers (or bots) are designed to crawl the HTML content of web pages, but AJAX operations that fetch data asynchronously make this a problem, because it takes some time to render the page and show its dynamic content. AngularJS also uses an asynchronous model, which creates the same problem for Google's crawlers.

Some developers create basic HTML pages with real data and serve these pages from the server side at crawl time. We can render the same pages with PhantomJS on the server side for requests that carry _escaped_fragment_ (because Google looks for #! in our site URLs, then takes everything after the #! and adds it to the _escaped_fragment_ query parameter). For more detail please read this blog.
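
To make the #! rewriting concrete, here is a rough sketch of the URL a crawler following that scheme would request. `toEscapedFragmentUrl` is a made-up helper name, and the escaping is deliberately simplified (an assumption, not the scheme's exact escaping rules): the fragment is percent-encoded and the slashes are then restored.

```javascript
// Convert a "pretty" hashbang URL into the "ugly" URL requested by a crawler
// that follows the _escaped_fragment_ scheme described above.
function toEscapedFragmentUrl(prettyUrl) {
  var marker = prettyUrl.indexOf('#!');
  if (marker === -1) {
    return prettyUrl; // no hashbang, nothing to rewrite
  }
  var base = prettyUrl.slice(0, marker);
  var fragment = prettyUrl.slice(marker + 2);
  // Simplified escaping (assumption): encode everything, keep "/" readable.
  var escaped = encodeURIComponent(fragment).replace(/%2F/g, '/');
  // Append to an existing query string if there is one.
  var separator = base.indexOf('?') === -1 ? '?' : '&';
  return base + separator + '_escaped_fragment_=' + escaped;
}

console.log(toEscapedFragmentUrl('http://example.com/#!/tutorial/step_09'));
// prints "http://example.com/?_escaped_fragment_=/tutorial/step_09"
```

The server then only has to look at the `_escaped_fragment_` parameter to know which route to render with PhantomJS.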

Kirst answered 5/10, 2015 at 12:0 Comment(1)
This is no longer true as of Oct 2017: this income tax calculator, income-tax.co.uk, is built with pure AngularJS (even the titles are like <title>Tax Calculator for £{{earningsSliders.yearly | number : 0 }} salary</title>, which renders like "tax calculator for £30000 salary"), and Google indexes them and ranks them on the first page for hundreds of keywords. Just build your websites for humans, make them awesome, and Google will take care of the rest ;)Antonio
F
0

Crawlers do not need a feature-rich, prettily styled GUI; they only want to see the content, so you do not need to give them a snapshot of a page that was built for humans.

My solution: give the crawler what the crawler wants.

You must think about what the crawler wants, and give it only that.

TIP: don't mess with the back end. Just add a small server-side front view that uses the same API.
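
As a rough illustration of such a front view, the sketch below turns one API resource into bare HTML. The resource shape (`title`, `body`) and the function names are assumptions made up for this example; a real view would follow whatever your API actually returns.

```javascript
// Escape text so API data cannot break or inject into the markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Render one API resource (hypothetical shape: { title, body }) as plain,
// unstyled HTML that contains only what a crawler needs: the content.
function renderForCrawler(resource) {
  return '<!DOCTYPE html>\n' +
    '<html><head><title>' + escapeHtml(resource.title) + '</title></head>\n' +
    '<body><h1>' + escapeHtml(resource.title) + '</h1>\n' +
    '<p>' + escapeHtml(resource.body) + '</p></body></html>';
}
```

Because the view reads the same data the Angular app consumes, the content crawlers see stays in step with what humans see, without duplicating the client-side templates.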

Fahy answered 12/2, 2014 at 7:26 Comment(0)
