1. Should I generate slug once and store within PostModel schema or generate on each post showing?
Both methods are valid, and each has its advantages:
- Database: Faster, since the slug is generated once and never recomputed when it is needed.
- On-the-fly: You don't need to regenerate a whole table if you decide to change your pattern or algorithm (which should be avoided anyway). It also uses less space in the database and transfers less data between your database and your application. Generation shouldn't take long unless your slug algorithm performs poorly, so generation time is rarely an issue.
In both cases, you'll have to choose a pattern and define an algorithm that generates slugs matching it.
I personally almost always store the slug in the database, which lets you specify a custom slug for a specific post. You may never need to do so, but if the case comes up, you're ready.
For example, if the generated slug for a specific post would be awesome-post and you want it to be best-awesome-post, you can easily change it when the slug is stored in the database. Otherwise you'll have to adjust your algorithm for each "special" case, which becomes a nightmare once you have several of them.
Another point in favor of storing it: as soon as you publish a post, the slug becomes part of the permalink to that post and should be considered immutable. I'm not a fan of regenerating immutable data over and over when I can avoid it.
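A minimal sketch of the store-once approach, assuming a hypothetical createPost helper and a deliberately simplistic makeSlug function (neither comes from a real library). The slug is computed when the post is created, a manual override wins if one is supplied, and later title edits leave the stored slug untouched:

```javascript
// Hypothetical, simplistic slug generator used only for illustration.
function makeSlug(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse anything non-alphanumeric into '-'
    .replace(/^-+|-+$/g, '');    // trim leading/trailing '-'
}

// Build the object that would be saved to the database.
// `customSlug` is optional and overrides the generated value.
function createPost(title, customSlug) {
  return { title: title, slug: customSlug || makeSlug(title) };
}

// Changing the title later does not touch the stored slug (the permalink).
function renamePost(post, newTitle) {
  return { title: newTitle, slug: post.slug };
}

const generated = createPost('Awesome Post');
const overridden = createPost('Awesome Post', 'best-awesome-post');
const renamed = renamePost(generated, 'Even More Awesome Post');

console.log(generated.slug);  // awesome-post
console.log(overridden.slug); // best-awesome-post
console.log(renamed.slug);    // awesome-post
```

With on-the-fly generation, the override and the rename would both need special handling inside the algorithm itself.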
2. How to generate slug based on title (which existing node modules resolve this task) for non ASCII characters?
As you said, multiple Node modules exist to generate slugs based on one or more fields such as a title; some are even integrated with MongoDB/Mongoose, like mongoose-url-slugs.
In most slugs, accented characters are converted to their non-accented counterparts, everything is lowercased, punctuation is removed, spaces are replaced with - (for example), and so on.
Regarding the non-ASCII part of your question: if you look at the code of mongoose-url-slugs, for instance, it calls a removeDiacritics function when generating a slug, which strips these special characters and replaces them with slug-friendly equivalents.
One example that needs special treatment to be handled correctly is the German word for "road": "Straße".
The function identifies the Eszett character (\u00DF) and replaces it with the letter 's'.
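The steps described above can be sketched with native string methods only. This is an illustrative approximation, not the code used by mongoose-url-slugs or slug; the slugify name and the exact character rules are assumptions:

```javascript
// Minimal slug sketch; real modules cover many more characters.
function slugify(title) {
  return title
    .replace(/\u00DF/g, 's')          // Eszett has no decomposition, so map it by hand
    .normalize('NFD')                 // split accented letters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, '')     // remove punctuation and other symbols
    .trim()
    .replace(/[\s-]+/g, '-');         // runs of whitespace or '-' become a single '-'
}

console.log(slugify('Crème Brûlée, à la carte!')); // creme-brulee-a-la-carte
console.log(slugify('Straße'));                    // strase
```

Note that this sketch simply drops anything outside a-z and 0-9, so a title written entirely in a non-Latin script would produce an empty slug.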
If you want to go a step further, use a slug module that handles Unicode and UTF-8, such as slug, which conforms to RFC 3986 on Uniform Resource Identifiers (URI).
It transforms a title like i ♥ my title into i-love-my-title, and so on.
3. Which place should I use to redirect queries from http://www.example.local/posts/571f78d077b4454bafcfcced to http://www.example.local/posts/571f78d077b4454bafcfcced/how-to-make-and-store-slug-for-title (nodejs, nginx, client-side).
If you're storing the slug in your database for the reasons given above, the slug should be generated only once and then saved. After that, no regeneration should happen on either the server side or the client side.
When displaying links on the client side, you can always safely use the slug you previously generated, for example http://www.example.local/posts/571f78d077b4454bafcfcced/how-to-make-and-store-slug-for-title, to build a link following the pattern you want.
When a client uses a URL with no slug or only a partial slug, like http://www.example.local/posts/571f78d077b4454bafcfcced/how-to-make, you need to redirect to the correct URL with the full slug. Stack Overflow is a good example on this very question: it simply sends a 301 redirect to the correct URL.
They handle these special cases on the server, as it should be, since your server application is the only one (if you're saving the slug in the database) with authority on the matter. Your application knows the correct slug for a post because it's in the database, so if the slug is missing or only partial, which is easy to detect, you can safely trigger a 301 redirect to the correct URL with the correct slug, like http://www.example.local/posts/571f78d077b4454bafcfcced/how-to-make-and-store-slug-for-title.
You should handle these cases in your Node application (I assume you're using Node as you mentioned it in the question) and redirect to the correct URL when needed.
For example:
res.writeHead(301, { "Location": `http://www.example.local/posts/${postId}/${postSlug}` });
res.end(); // finish the response so the redirect is actually sent
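To make the detection step concrete, here is a rough sketch using only Node's built-in http module. The posts object is a hypothetical in-memory stand-in for the database, and redirectTarget and handler are illustrative names, not part of any library:

```javascript
const http = require('http');

// Hypothetical in-memory stand-in for the database lookup.
const posts = {
  '571f78d077b4454bafcfcced': { slug: 'how-to-make-and-store-slug-for-title' }
};

// Returns the canonical path to redirect to, or null when the requested
// path is already canonical, is not a post URL, or the post is unknown.
function redirectTarget(path) {
  const match = /^\/posts\/([0-9a-f]{24})(?:\/([^\/]*))?\/?$/.exec(path);
  if (!match) return null;
  const post = posts[match[1]];
  if (!post) return null;
  const canonical = '/posts/' + match[1] + '/' + post.slug;
  return path === canonical ? null : canonical;
}

function handler(req, res) {
  const path = req.url.split('?')[0]; // ignore the query string
  const target = redirectTarget(path);
  if (target) {
    // Missing, partial, or wrong slug: send a permanent redirect.
    res.writeHead(301, { Location: target });
    res.end();
    return;
  }
  // Placeholder: render the post (or a 404) here.
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('post page');
}

const server = http.createServer(handler);
```

Calling server.listen(3000) would start it locally. The same comparison fits in an Express route or any other framework, since all it needs is the id from the URL and the stored slug.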
Since the same content is then reachable through multiple URLs, you should also use the canonical link element to specify the "canonical" URL that search engines should use, which avoids duplicate-content problems.
<link rel="canonical" href="http://www.example.local/posts/571f78d077b4454bafcfcced/how-to-make-and-store-slug-for-title">
Regarding your edit about the Stack Exchange Data Explorer, I think they're omitting the field from the results since it's not really that important. According to a comment from Nick Craver, Software Developer and Systems Administrator for Stack Exchange, they do check whether the slugified title they have in the database matches the one in the query, and if not, they redirect.
Edit regarding Russian characters in URLs:
If you want to keep Russian characters, for example, that's no problem as long as you stick with UTF-8. Your link example displays Russian characters, but behind the scenes the URL is "percent-encoded" (or "url-encoded"). You can check this yourself by right-clicking the link in your browser and choosing Inspect: you'll see that the URL is actually something like http://ru.https://mcmap.net/q/174019/-how-do-i-pass-parameters-to-a-jar-file-at-the-time-of-execution%D0%BE%D1%88%D0%B8%D0%B1%D0%BA%D0%B0-%D0%BF%D1%80%D0%B8-%D1%81%D0%BE%D0%B7%D0%B4%D0%B0%D0%BD%D0%B8%D0%B8-%D0%B2%D0%B8%D1%80%D1%82%D1%83%D0%B0%D0%BB%D1%8C%D0%BD%D0%BE%D0%B3%D0%BE-%D1%83%D1%81%D1%82%D1%80%D0%BE%D0%B9%D1%81%D1%82%D0%B2%D0%B0. Your browser knows it's url-encoded and displays it properly with Russian characters.
There are of course Node.js modules, and even native JavaScript methods, to url-encode any URL you want.
If you're also wondering about SEO and search engines, Google, for instance, says: "we can generally keep up with UTF-8 encoded URLs and we’ll generally show them to users in our search results (but link to your server with the URLs properly escaped)", so no problem at all.
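As a quick illustration of the native methods, here is a small sketch; the slug value is just a sample:

```javascript
const slug = 'genymotion-ошибка';

// encodeURIComponent escapes a single path segment or query value.
const encodedSlug = encodeURIComponent(slug);
console.log(encodedSlug);
// genymotion-%D0%BE%D1%88%D0%B8%D0%B1%D0%BA%D0%B0

// encodeURI escapes a whole URL and leaves ':' and '/' intact.
const encodedUrl = encodeURI('http://www.example.local/posts/' + slug);
console.log(encodedUrl);
// http://www.example.local/posts/genymotion-%D0%BE%D1%88%D0%B8%D0%B1%D0%BA%D0%B0

// decodeURIComponent reverses the encoding.
console.log(decodeURIComponent(encodedSlug) === slug); // true
```

Use encodeURIComponent for individual pieces such as the slug, and encodeURI only when you are handing it a complete URL.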
Most of the "slugifier" modules will remove these characters, so if you actually want to keep them, you'll have to use something more specific like arSlugify:
var ars = require('arslugify');
var title = 'genymotion ошибка при создании виртуального устройства';
var slug = ars(title);
var url = 'www.example.local/posts/571f78d077b4454bafcfcced/' + slug;
// Encode only the slug so the '/' path separators stay intact
var encodedUrl = 'www.example.local/posts/571f78d077b4454bafcfcced/' + encodeURIComponent(slug);
console.log(url);
// www.example.local/posts/571f78d077b4454bafcfcced/genymotion-ошибка-при-создании-виртуального-устройства
console.log(encodedUrl);
// www.example.local/posts/571f78d077b4454bafcfcced/genymotion-%D0%BE%D1%88%D0%B8%D0%B1%D0%BA%D0%B0-%D0%BF%D1%80%D0%B8-%D1%81%D0%BE%D0%B7%D0%B4%D0%B0%D0%BD%D0%B8%D0%B8-%D0%B2%D0%B8%D1%80%D1%82%D1%83%D0%B0%D0%BB%D1%8C%D0%BD%D0%BE%D0%B3%D0%BE-%D1%83%D1%81%D1%82%D1%80%D0%BE%D0%B9%D1%81%D1%82%D0%B2%D0%B0
mongoose-url-slugs stores slug in MongoDB document? – Alee
PostSchema.plugin(URLSlugs('title', {field: 'myslug'})); – Moraine