How to make a sitemap link in the page head pass the W3C validator?
W

5

13

I'm trying to pass a page through the W3C Validator. The validation fails on the sitemap, which I'm including like this:

<link rel="sitemap" type="application/xml" title="Sitemap" href="../sitemap.xml" />

The error I'm getting is:

Bad value sitemap for attribute rel on element link: Not an absolute IRI. The string sitemap is not a registered keyword or absolute URL.

I have been trying forever to fix it, but nothing I try seems to work. On top of that, this is the layout recommended by Google and HTML5 Boilerplate.

Is there anything wrong with my syntax? Seems correct, but why is it not passing?

Warsle answered 11/11, 2012 at 0:38 Comment(1)
Deleted my comment because I wanted to double check my facts. But it said: It's not your fault. It's the validator that's at fault. It's required to pass as valid names listed as "proposed" on this page: microformats.org/wiki/… but it's out of date and not recognising "sitemap" as it should do.Matrilineal
B
12

The short answer is that you cannot.

HTML 5 defines the values that you are allowed to use in rel, and sitemap is not one of those the validator recognises.

The error message does say that you can register a new link type on a wiki, but sitemap is already there so you just have to wait for the validator developers to update the validator to reflect the new state of the wiki (assuming nobody deletes the entry).

(The basic problems here are that having the specification use a wiki page as a normative resource is nuts, that HTML 5 is still a draft, and that the HTML 5 validator is still considered experimental).

Bassinet answered 11/11, 2012 at 1:13 Comment(2)
This would explain a lot. I didn't notice he was talking about HTML5 (and didn't know the w3c validator wouldn't work on it either)Shirashirah
Using a wiki page as a normative resource is nuts, but there's a Working group decision for it and no formal objection in place against it so I guess we're stuck with it.Matrilineal
U
21

Dropping in from the future (June 2021).

The entry:

<link rel="sitemap" type="application/xml" title="Sitemap" href="/sitemap.xml">

is now accepted by the W3C HTML5 Validator.

That is to say:

rel="sitemap"

is now a valid attribute + value.

Validating the following HTML file:

<!DOCTYPE html>
<html lang="en-gb">
<head>
<meta charset="utf-8">
<title>My Rel Sitemap Test</title>
<link rel="sitemap" type="application/xml" title="Sitemap" href="/sitemap.xml">
</head>

<body>
<h1>My Rel Sitemap Test</h1>
<p>This is my Rel Sitemap Test.</p>
<p>The document passes.</p>
<p>This document is valid HTML5 + ARIA + SVG 2 + MathML 3.0</p>
</body>
</html>

here: https://validator.w3.org/nu/

returns the response:

Document checking completed. No errors or warnings to show.
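If you would rather script that check than paste into the web form, here is a minimal Python sketch. It assumes the checker's HTTP interface as I understand it (POST the document body to the same URL with `?out=json`, and read a `messages` list from the JSON response). The function names and the `User-Agent` string are my own, not part of the validator.

```python
import json
import urllib.request

# Same checker as above, asking for JSON output instead of an HTML report.
CHECKER_URL = "https://validator.w3.org/nu/?out=json"


def build_request(html: str) -> urllib.request.Request:
    """Build a POST request that carries the HTML document as its body."""
    return urllib.request.Request(
        CHECKER_URL,
        data=html.encode("utf-8"),
        headers={
            "Content-Type": "text/html; charset=utf-8",
            # Hypothetical client name, chosen for this example only.
            "User-Agent": "rel-sitemap-check/0.1",
        },
    )


def count_problems(response_body: str) -> dict:
    """Count errors and warnings in the checker's JSON response."""
    messages = json.loads(response_body).get("messages", [])
    errors = sum(1 for m in messages if m.get("type") == "error")
    warnings = sum(
        1
        for m in messages
        if m.get("type") == "info" and m.get("subType") == "warning"
    )
    return {"errors": errors, "warnings": warnings}


def validate(html: str) -> dict:
    """Send the document to the checker (needs network access)."""
    with urllib.request.urlopen(build_request(html), timeout=30) as resp:
        return count_problems(resp.read().decode("utf-8"))
```

Calling `validate(...)` with the test document above should come back with zero errors if the checker still behaves as it did when I ran it by hand.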

Underlay answered 24/6, 2021 at 14:46 Comment(4)
It might pass validation (which was the question), but anything else in rel also passes so this isn't really indicative of correct HTML.Beguile
No worries, @MikeLewis. rel="sitemap" is listed on HTML5 link type extensions according to the WHAT-WG requirements.Underlay
Its status is listed as "proposed" there, just so you know. Not that browsers won't accept it anyway, but it has not been officially accepted as far as I can tell.Beguile
@MikeLewis - Yes, "Proposed" rather than "Ratified", you're right. But then WHAT-WG states: "Conformance checkers must use the information given on the microformats page for existing rel values to establish if a value is allowed or not: values defined in this specification or marked as "proposed" or "ratified" must be accepted" (my bold)Underlay
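To make the rule quoted in that last comment concrete, here is a rough Python sketch of the decision a conformance checker is supposed to make. The sets below are tiny illustrative samples, not the real registry: `sitemap` is "proposed" as discussed above, while the `old-example` entry and its "dropped" status are hypothetical.

```python
# Statuses that the quoted rule says a checker must accept.
ACCEPTED_STATUSES = {"proposed", "ratified"}

# A few keywords defined by the HTML specification itself (sample only).
SPEC_DEFINED = {"alternate", "icon", "stylesheet"}

# Sample of the extensions page: keyword -> status.
# "old-example" is a made-up entry used only to show a rejected status.
REGISTRY = {"sitemap": "proposed", "old-example": "dropped"}


def is_rel_allowed(value: str, spec_defined: set, registry: dict) -> bool:
    """Return True if a single rel keyword should be accepted."""
    keyword = value.lower()  # rel keywords are compared case-insensitively
    if keyword in spec_defined:
        return True
    return registry.get(keyword) in ACCEPTED_STATUSES


print(is_rel_allowed("sitemap", SPEC_DEFINED, REGISTRY))
print(is_rel_allowed("made-up-value", SPEC_DEFINED, REGISTRY))
```

Under this reading, a "proposed" status is enough for the validator to accept the value, which is a separate question from whether any browser or crawler acts on it.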
M
3

If you only need the W3C validator to pass, perhaps you could detect its user agent and modify the output of your application so that it passes. I think of strict validation as more of a marketing benefit than anything when it comes to minor issues like this. If other developers use the W3C validator to say your client's web site is full of errors, then that is annoying.

You can check if the HTTP_USER_AGENT contains "W3C_Validator" and remove the non-standard code.

In CFML, I wrote code like this to make my Google Authorship link still able to validate on w3c validator:

<cfif cgi.HTTP_USER_AGENT CONTAINS "W3C_Validator">data-</cfif>rel="publisher"
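For anyone not on CFML, the same idea can be sketched in Python. This is only an illustration of the check above: the function name is mine, and the lowercase comparison is there because, as far as I know, CFML's CONTAINS is case-insensitive.

```python
def rel_attribute(user_agent: str, rel_value: str = "publisher") -> str:
    """Return the rel attribute, renamed to data-rel for the W3C validator."""
    # "W3C_Validator" is the substring checked in the CFML snippet above.
    is_validator = "w3c_validator" in user_agent.lower()
    prefix = "data-" if is_validator else ""
    return f'{prefix}rel="{rel_value}"'


# Ordinary visitors get the real attribute; the validator gets data-rel.
print(rel_attribute("Mozilla/5.0"))
print(rel_attribute("W3C_Validator/1.3"))
```

Keep in mind this serves different markup to the validator than to everyone else, so a passing result no longer describes the page your visitors receive.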

I just posted a question on the Google forum asking whether they could begin supporting data-rel, or confirm whether Google search already supports it. The structured data testing tool they provide doesn't parse data-rel when I tested it just now. http://www.google.com/webmasters/tools/richsnippets

Hopefully, someone will follow up: https://groups.google.com/a/googleproductforums.com/d/msg/webmasters/-/g0RDfpFwmqAJ

Macaw answered 24/2, 2013 at 17:59 Comment(1)
Thanks for chipping in. Still haven't found a nice solution to this, so let's see what happensWarsle
S
1

The string sitemap is not a registered keyword or absolute URL

Your problem is right here:

href="../sitemap.xml" 

You are using a relative URL to indicate where your sitemap is. Try to put something like this:

<link rel="sitemap" type="application/xml" title="Sitemap" href="/myfolder/sitemap.xml" />

EDIT

Since robots crawl your root directory first, the best approach is indeed to keep your sitemap.xml file in your root directory:

<link rel="sitemap" type="application/xml" title="Sitemap" href="/sitemap.xml" />

or

<link rel="sitemap" type="application/xml" title="Sitemap" href="http://yoursite.com/sitemap.xml" /> <!-- No www -->

Also, make sure your link tag is a child of your head tag.

Shirashirah answered 11/11, 2012 at 0:42 Comment(7)
Yes. BTW, the robots search first at the root of your website, so something like /sitemap.xml would indeed be betterShirashirah
Just tried /sitemap.xml. Still does not work. I also tried https://www.msite.com/sitemap.xml, same result. Are you sure it's the href and not the rel attribute?Warsle
BTW, care to post your sitemap.xml file? You can't use relative URLs in that eitherShirashirah
The xml file only has full links https://www.mysite.com/.... It passed Google and Bing so I guess it is ok. Thanks for the www. Trying that.Warsle
Where are you running your site? The only thing left is the SSL, have you set your validator to accept SSL like described here: cpansearch.perl.org/src/GAAS/libwww-perl-6.04/README.SSL source: validator.w3.org/docs/install.htmlShirashirah
ok. good idea. It is an SSL site, but I have to contact my admin on Monday, because something seems wrong with the certificate. Hold up while I'm trying SSLWarsle
Nice, I'm gonna keep track of your issue by favoriting your question. See you MondayShirashirah
O
-2

Try this!

<link rel="alternate" type="application/xml" title="Site Map" href="http://yoursite.com/sitemap.xml" />

The rel value alternate is also recognized for RSS and Atom feeds. I personally use it for all XML documents.

Officiate answered 26/6, 2016 at 22:16 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.