Can LinkedIn crawler read SPA pages?

I am using PhantomJS along with the Angular-seo package.

I managed to configure it to work with Facebook Open Graph, but it seems that LinkedIn doesn't support the _escaped_fragment_ format: it simply ignores the route after the hashbang and requests the application's index.html page instead of myapp.com/?_escaped_fragment_=client_side_path.

What can I do to resolve this?

Eyecup answered 20/10, 2013 at 9:56 Comment(0)

Unfortunately, the only way to resolve this is to check the bot's user agent and send it the static version. According to this, the user agent of the LinkedIn bot is:

LinkedInBot/1.0 (compatible; Mozilla/5.0; Jakarta Commons-HttpClient/3.1 +http://www.linkedin.com)
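As a minimal sketch of that user-agent check (the middleware shape and the snapshot path are assumptions in the style of an Express handler, not part of any library):

```javascript
// Detect the LinkedIn crawler by its user-agent string.
// The bot identifies itself as "LinkedInBot/1.0 (...)".
function isLinkedInBot(userAgent) {
  return /LinkedInBot/i.test(userAgent || "");
}

// Hypothetical Express-style middleware: serve a prerendered snapshot
// to the bot, and fall through to the normal SPA shell for everyone else.
// The "/snapshots/..." path is a placeholder.
function seoMiddleware(req, res, next) {
  if (isLinkedInBot(req.headers["user-agent"])) {
    res.sendFile("/snapshots/" + encodeURIComponent(req.path) + ".html");
  } else {
    next();
  }
}
```

The regex is deliberately loose; matching on the `LinkedInBot` token alone survives minor version changes in the rest of the string.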
Eldreda answered 9/11, 2013 at 22:59 Comment(5)
Yeah, I figured this, but in my case I'm running a highly dynamic application, and serving a snapshot to the bot would be inefficient compared to having a PhantomJS instance running in the background (or prerender.io, for that matter).Eyecup
If you're already serving content using PhantomJS and Angular-SEO, you can just send the same content when the LinkedIn bot requests your site, so no extra work. Unless I'm misunderstanding something?Eldreda
The LinkedIn bot disregards any URL parameters after the hashbang, because from its point of view anything after the hashbang is not part of the HTTP spec (it's client-side routing). So to begin with, the bot asks for the application's index.html without any parameters (no escaped fragment). Hope that clarifies things for youEyecup
Is there a reason you can't use HTML5 mode with the fallback for older browsers? This means your urls will look normal, and google will request the full URL to your server with the _escaped_fragment_ parameter present but empty.Eldreda
Yes, there is: the hashbang in the URL is what tells the Google/Bing bots to make a request with an escaped fragment. There is an alternative method of placing the hashbang in a meta tag, but it didn't work for meEyecup

As of today, LinkedIn does not render JS and will only process the static HTML content of your SPA.

Since your application is highly dynamic, you could redirect LinkedIn crawler requests to an endpoint that dynamically generates the HTML the crawler needs (a quick win: a CDN with a rules engine and serverless functions).
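As a sketch of that routing idea (hypothetical edge-function logic; both origin URLs are placeholders, not real endpoints):

```javascript
// Hypothetical CDN/edge routing rule: pick an origin based on the
// user agent. Crawler requests go to a server-side rendering endpoint;
// everyone else gets the normal SPA shell.
function pickOrigin(userAgent) {
  const isCrawler = /LinkedInBot|facebookexternalhit|Twitterbot/i.test(
    userAgent || ""
  );
  return isCrawler
    ? "https://render.example.com" // generates HTML for crawlers
    : "https://spa.example.com";   // serves the client-side app
}
```

In a real rules engine this decision would typically live in the CDN configuration or an edge worker, with the request proxied to the chosen origin.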

If you don't need to feed the crawler real-time information, you may consider using:

  • a prerendering service (e.g. prerender.io)
  • a static site generator to create prerendered pages of your SPA on your own
Anglican answered 3/4, 2021 at 22:6 Comment(0)
