How detailed should my sitemap be for a multilingual site?

I have a one-page website that includes an English main page and a French main page. The site can be accessed through the following URLs:

ENGLISH VERSION OF MAIN PAGE

  • www.example.org
  • www.example.org/index.html
  • example.org
  • example.org/index.html

FRENCH VERSION OF MAIN PAGE

  • www.example.org/fr
  • www.example.org/fr/index.html
  • example.org/fr
  • example.org/fr/index.html

For optimal search engine indexing, should I include all of these URLs in my sitemap (with both http:// and https://)? If not, which URLs should I include in my sitemap.xml file?

Dichroism answered 15/1, 2016 at 17:1

You should include each unique page in your sitemap exactly once.

All of the different URLs you listed are just different ways of accessing the same page/content, just as most PHP applications can be accessed via site.org/ or site.org/index.php. Your sitemap should include just one reference to each page.

Chery answered 15/1, 2016 at 17:12

The best practice is to have one canonical URL per document, and each canonical URL should be added to your sitemap (if you have one).

So in your case you may want to use one URL for the English main page and one URL for the French main page, and redirect (with HTTP status code 301) from the other URLs to the canonical ones. In addition, you can declare the canonical URL with the canonical link relation.

If you need to provide HTTP in addition to HTTPS (instead of enforcing HTTPS), you would of course need to have two URLs per document (one with HTTP, one with HTTPS). But you [should only list one variant in the sitemap](http://www.sitemaps.org/faq.html#faq_http_vs_https "Sitemaps.org FAQ: 'My site has both "http" and "https" versions of URLs. Do I need to list both?'"), and you should only declare one as canonical (ideally the same one that you added to the sitemap).

Which URLs to choose can depend on various factors (usability, SEO, your backend, …), but it seems safe to assume that index.html is ballast. You’d have to decide whether to use the www subdomain (a common convention) or not. Assuming that you choose to omit it, you could have these canonical URLs:

https://example.org/
https://example.org/fr

And you would redirect the following URLs with 301 to the canonical URLs listed above:

https://example.org/index.html
https://www.example.org/
https://www.example.org/index.html
https://example.org/fr/index.html
https://www.example.org/fr
https://www.example.org/fr/index.html
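
The mapping from the redirecting URLs to the canonical ones can be sketched as a small Python function. It is a minimal sketch under stated assumptions, not a server configuration: `canonical_url` is a hypothetical helper, and it assumes that HTTPS is enforced, that the www prefix is dropped, and that a trailing `index.html` or slash is removed (except for the root). A server would answer with a 301 to the returned URL whenever it differs from the requested one.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url):
    """Map any variant of a page URL to its canonical form.

    Hypothetical helper; assumes HTTPS-only, no www, no index.html.
    """
    parts = urlsplit(url)

    # Drop the www subdomain (assumption: the bare domain is canonical).
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]

    # Treat ".../index.html" as the directory itself.
    path = parts.path
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]

    # Keep "/" for the root, strip the trailing slash everywhere else.
    if path != "/":
        path = path.rstrip("/")
    if not path:
        path = "/"

    # Always use HTTPS (assumption); keep the query string, drop the fragment.
    return urlunsplit(("https", host, path, parts.query, ""))


for variant in [
    "https://example.org/index.html",
    "https://www.example.org/",
    "https://www.example.org/fr",
    "https://www.example.org/fr/index.html",
]:
    print(variant, "->", canonical_url(variant))
```

Every URL in the list above maps to either `https://example.org/` or `https://example.org/fr`, and the canonical URLs map to themselves, so they are never redirected.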

Taddeusz answered 17/1, 2016 at 5:18