How to implement a sitemap.xml file for a single page app using Firebase?
I was reading Google's guidelines about SEO and I found this.

Help Google find your content

The first step to getting your site on Google is to be sure that Google can find it. The best way to do that is to submit a sitemap. A sitemap is a file on your site that tells search engines about new or changed pages on your site. Learn more about how to build and submit a sitemap.

Note: my web app is an e-commerce site with a blog. The shop lists the products I sell, and the blog section is where I post content about those products.

So, each product has a product page, and each blog post has a blogPost page.

Then I went looking for some examples of Sitemaps from websites like mine that have good SEO ranking.

And I've found this good example:

robots.txt

User-Agent: *
Disallow: ... // SOME ROUTES

Sitemap: https://www.website.com/sitemap.xml

I.e., the crawler apparently finds the sitemap location from the robots.txt file.

And I've also found out that they keep separate sitemap files for blogPost and product pages.

sitemap.xml

<sitemapindex xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd">
  <sitemap>
    <loc>https://www.website.com/blogPosts-sitemap.xml</loc> <!-- for posts -->
    <lastmod>2019-09-10T05:00:14+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://www.website.com/products-sitemap.xml</loc> <!-- for products -->
    <lastmod>2019-09-10T05:00:14+00:00</lastmod>
  </sitemap>
</sitemapindex>

blogPosts-sitemap.xml

// HUGE LIST WITH AN <url> FOR EACH BLOGPOST URL

<url>
  <loc>
    https://www.website.com/blog/some-blog-post-slug
  </loc>
  <lastmod>2019-09-03T18:11:56.873+00:00</lastmod>
  <changefreq>weekly</changefreq>
  <priority>0.8</priority>
</url>

products-sitemap.xml

// HUGE LIST WITH AN <url> FOR EACH PRODUCT URL

<url>
  <loc>
    https://www.website.com/gp/some-product-slug
  </loc>
  <lastmod>2019-09-08T07:00:16+00:00</lastmod>
  <changefreq>yearly</changefreq>
  <priority>0.3</priority>
</url>
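
For reference, the `<url>` lists above can be generated from plain data. This is only a minimal sketch: `escapeXml`, `buildUrlset`, and the entry shape `{ loc, lastmod, changefreq, priority }` are hypothetical names chosen for illustration, not part of any library.

```javascript
// Escape the characters that must be entity-encoded inside XML text.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build a complete <urlset> document.
// `entries` is an assumed shape: { loc, lastmod?, changefreq?, priority? }.
function buildUrlset(entries) {
  const urls = entries.map((entry) => {
    const lines = ['  <url>', `    <loc>${escapeXml(entry.loc)}</loc>`];
    if (entry.lastmod) lines.push(`    <lastmod>${escapeXml(entry.lastmod)}</lastmod>`);
    if (entry.changefreq) lines.push(`    <changefreq>${escapeXml(entry.changefreq)}</changefreq>`);
    if (entry.priority !== undefined) lines.push(`    <priority>${entry.priority}</priority>`);
    lines.push('  </url>');
    return lines.join('\n');
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    '</urlset>',
  ].join('\n');
}
```

Calling `buildUrlset([{ loc: 'https://www.website.com/blog/some-blog-post-slug', changefreq: 'weekly' }])` returns one XML string that can be written to a file or sent as a response body.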

QUESTION

How can I keep sitemap files like these up to date if my web app is a Single Page App with client-side routing?

Since I'm using Firebase as my hosting, what I've thought about doing is:

OPTION #1 - Keep sitemap.xml in Firebase Hosting

From the question "Upload single file to firebase hosting via CLI or other without deleting existing ones?":

Frank van Puffelen says:

Update (December 2018): Firebase Hosting now has a REST API. While this still doesn't officially allow you to deploy a single file, you can use it creatively to get what you want. See my Gist here: https://gist.github.com/puf/e00c34dd82b35c56e91adbc3a9b1c412

I could use his Gist to update the sitemap.xml file and run this script once a day, or whenever I want. This would work for my current project, but it would not work for a project whose dynamic pages change more frequently, such as a news portal or a marketplace.

OPTION #2 - Keep sitemap.xml in Firebase Storage

Keep the sitemap files in my Storage bucket and update them as often as I need via an admin script or a scheduled cloud function.

Set a rewrite in my firebase.json and specify a function to respond and serve the sitemap files from the bucket, when requested.

firebase.json

"hosting": {
 // ...

 // Add the "rewrites" attribute within "hosting"
 "rewrites": [ {
   "source": "/sitemap.xml",
   "function": "serveSitemapFromStorageBucket"
 } ]
}
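
A rough sketch of what the function behind that rewrite could look like. The reader function is injected so the handler itself does not depend on the storage backend. `makeServeSitemap` and `readSitemap` are hypothetical names, and the commented firebase-admin wiring assumes the file is stored as `sitemap.xml` in the default bucket.

```javascript
// Returns an Express-style handler. `readSitemap` is an injected async
// function that resolves to the sitemap XML as a string.
function makeServeSitemap(readSitemap) {
  return async function serveSitemapFromStorageBucket(req, res) {
    try {
      const xml = await readSitemap();
      res.set('Content-Type', 'application/xml');
      res.status(200).send(xml);
    } catch (err) {
      // A missing or unreadable file should not look like an empty sitemap.
      res.status(500).send('Sitemap unavailable');
    }
  };
}

// Possible wiring with firebase-admin (assumed setup, not run here):
// const admin = require('firebase-admin');
// const functions = require('firebase-functions');
// admin.initializeApp();
// const readFromBucket = async () => {
//   const [contents] = await admin.storage().bucket().file('sitemap.xml').download();
//   return contents.toString('utf8');
// };
// exports.serveSitemapFromStorageBucket =
//   functions.https.onRequest(makeServeSitemap(readFromBucket));
```

Keeping the bucket read behind one small function also makes it easy to swap in a cached copy later without touching the handler.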

FINAL QUESTION

I'm leaning towards OPTION #2. I'd like to know whether it will work for this purpose or whether I'm missing something.

Cardiac answered 10/9, 2019 at 11:44 Comment(4)
Hi, I have the same issue as you, and I wonder whether your solution works with Google Search Console? - Epi
@JimmyLin I have a cloud function that generates the sitemap.xml on the fly. For example, https://www.mywebsite.com/sitemap.xml is rewritten to an HTTP cloud function that builds the file and responds. This way, the sitemap "file" does not exist. It is generated on demand and is always up to date with the latest data. - Cardiac
@JimmyLin I've posted an answer. - Cardiac
We're going in the wrong direction when something this simple ends up being so complex. - Citizenship
C
4

I ended up creating a cloud function to build the sitemap file on-demand.

firebase.json

"rewrites": [
  {
    "source": "/sitemap.xml",
    "function": "buildSitemap"
  }
]

buildSitemap.js (this is a cloud function)

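import * as admin from 'firebase-admin';

// HTTP cloud function: answers /sitemap.xml through the rewrite above.
async function buildSitemap(req, res) {

  // Use firebase-admin to gather the necessary data
  // and build the sitemap file string from it.
  // Placeholder value: an empty sitemap, to be replaced by the generated XML.
  const SITEMAP_STRING =
    '<?xml version="1.0" encoding="UTF-8"?>' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"></urlset>';

  // Send it back as XML
  res.set('Content-Type', 'text/xml');
  res.status(200).send(SITEMAP_STRING);

}

export default buildSitemap;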
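
To make the "gather data, build the string, send it back" steps more concrete, here is one possible shape. It is a sketch under assumptions: `makeBuildSitemap`, `loadSlugs`, and the `{ slug, updatedAt }` records are hypothetical, and the `/blog/` and `/gp/` paths are taken from the example URLs in the question. In a real function, `loadSlugs` would query the database through firebase-admin.

```javascript
const BASE_URL = 'https://www.website.com'; // placeholder domain from the question

// `loadSlugs` is an injected async function (hypothetical) resolving to
// { posts: [{ slug, updatedAt }], products: [{ slug, updatedAt }] }.
function makeBuildSitemap(loadSlugs) {
  return async function buildSitemap(req, res) {
    const { posts, products } = await loadSlugs();

    // One <url> element per item; lastmod is normalized to ISO 8601.
    const toUrl = (path, item) =>
      `  <url><loc>${BASE_URL}/${path}/${encodeURIComponent(item.slug)}</loc>` +
      `<lastmod>${new Date(item.updatedAt).toISOString()}</lastmod></url>`;

    const body = [
      '<?xml version="1.0" encoding="UTF-8"?>',
      '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
      ...posts.map((post) => toUrl('blog', post)),
      ...products.map((product) => toUrl('gp', product)),
      '</urlset>',
    ].join('\n');

    res.set('Content-Type', 'text/xml');
    res.status(200).send(body);
  };
}
```

Because the data loader is passed in, the same handler can be exercised locally with a stub before it is connected to real data.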

Cardiac answered 17/7, 2020 at 10:23 Comment(6)
Are you still using this method? I have a similar approach, but I feel it has a few disadvantages: you can only store up to 50,000 URLs per sitemap, there is a potential for lots of unnecessary reads (I fetch all post IDs from Firestore), and it takes a few seconds to build the sitemap from scratch each time. - Tracee
I'm still using it. So far it's working fine. I only have around 100 URLs, though. I know that you can create a sitemap index and break it into multiple sitemap files, so you get 50k in each one. You can also cache for a day to avoid too many reads. - Cardiac
Thanks for the quick response. Yes, I'm currently trying the sitemap-index approach and splitting the URLs across several files. I can let you know if I manage to implement it. One last thing: do you perhaps have a reference or a snippet on how I can cache the sitemap result for a day? - Tracee
@Tracee It depends on your implementation details. If your hosting provider has a CDN, you can cache it on the CDN by setting a Cache-Control: s-maxage=SOME_VALUE_IN_SECONDS header. Or if you are not behind a CDN, you can cache directly on your server. - Cardiac
I'm using Firebase Hosting, so adding res.set('Cache-Control', 'public, max-age=86400, s-maxage=86400'); should do the trick, thank you. - Tracee
That's it. You can either set it in the firebase.json config file or you can set it on your server / cloud function. Be aware that firebase.json will override what you set on your server. See this question. At least, those were the results I got after testing this behavior. - Cardiac
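
The two ideas from this comment thread (splitting at 50,000 URLs behind a sitemap index, and caching the response for a day) could be sketched roughly as follows. The helper names and the `/sitemap-N.xml` naming scheme are hypothetical; the Cache-Control value is the one quoted in the comments.

```javascript
const MAX_URLS_PER_SITEMAP = 50000; // per-file limit mentioned in the comments

// Split a flat list into consecutive chunks of at most `size` items.
function chunk(items, size = MAX_URLS_PER_SITEMAP) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Build a <sitemapindex> that points at one child sitemap per chunk.
// The `/sitemap-N.xml` naming is an assumed convention.
function buildSitemapIndex(baseUrl, chunkCount) {
  const children = [];
  for (let n = 1; n <= chunkCount; n += 1) {
    children.push(`  <sitemap><loc>${baseUrl}/sitemap-${n}.xml</loc></sitemap>`);
  }
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...children,
    '</sitemapindex>',
  ].join('\n');
}

// Send XML with a one-day cache lifetime for browsers and the CDN.
function sendCachedXml(res, xml) {
  res.set('Content-Type', 'text/xml');
  res.set('Cache-Control', 'public, max-age=86400, s-maxage=86400');
  res.status(200).send(xml);
}
```

With this layout, the hosting rewrite would also have to route the child sitemap paths to a function, not only `/sitemap.xml`.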
V
0

Remove "src/sitemap.xml" from the "assets" array in angular.json, so that the array no longer lists it:

"assets": [
  "src/assets",
  "src/favicon.ico",
  "src/manifest.json",
  "src/robots.txt"
],
Vi answered 3/2, 2022 at 1:16 Comment(0)
A
0

Put your sitemap.xml inside public, firebase deploy, and that's it. It works.

If you're coming from a Nuxt.js question, then I do have a little bonus for you regarding this comment, this commit and this commit. Remember to use yarn generate, not yarn build, before deployment; otherwise it won't work, as explained here.

Aragon answered 7/3, 2023 at 17:2 Comment(0)
