Database design for tags or tagging
Asked Answered
U

1

6

How are tags of an item stored in a database hassle-free?

Each item has multiple tags with it. I have read a few answers on an efficient way to do so :

  1. What is the most efficient way to store tags in a database?
  2. Recommended SQL database design for tags or tagging

But I think there is a better solution to this. Why can't we simply include tags as a long string for each item?

 Table : Brand_Shops
 Columns : brand_id, brand_name, content, tags

Example :

1 || Nike ||  shoes bags sports football soccer t-shirts track-pants
2 ||  GAP || wallets t-shirts jeans shoes perfumes

This does not have atomicity but completely suits the purpose of tagging. If a new brand has to be added, new tags can simply be added along with it. Because of this it will be very easy to fetch it as well. I don't understand why this is not an efficient solution.

Uncanonical answered 11/6, 2018 at 8:15 Comment(3)
Define efficient. Efficient for writing? efficient for reading? efficient for code maintenance?Apples
Efficient for code maintainance and ease of useUncanonical
Well IMHO, good code maintenance = simple code = don't use complicated code to split and append strings into one row. Just do a simple insertApples
S
5

I don't understand why this is not an efficient solution.

It is inefficient because you have to retrieve and break/search that string for every query.

When you do something like (as mentioned in your links) Three tables (one for storing all items, one for all tags, and one for the relation between the two) then you can use the real power of a relational database, the index.

Instead of breaking each string into a tag or set of tags... that's already done; you just get the ones you want. So, if you're searching for "shoes" then it goes straight there (using the index probably log n or faster) and returns both Nike and GAP. It will do this no matter how many tags you have, no matter how many companies you have.

With the 3-table system you do all of the hard work up front and then just do lookups.

If you intend to run this locally or with a limited number of users your solution may be fine. It is also easier to code.
Once your queries start taking more than a few seconds you'll probably want to update your tagging system. If you do it this way, write the search code separately in case you need to rip it out.


Question from comment:

Can you give an example of a 3 table system that is normalized with atomicity

Sure.
You've basically asked for Third Normal Form which is my usual goal. (I admit I often don't make 3NF because I optimize; e.g. storing a postal code with the addres - if you're out of school, that's a better choice)

--Sample SQL https://mcmap.net/q/1753871/-database-design-for-tags-or-tagging/50818392
IF NOT EXISTS (SELECT * FROM sys.schemas WHERE name = N'ChrisC') 
BEGIN
 EXEC sys.sp_executesql N'CREATE SCHEMA [ChrisC] AUTHORIZATION [dbo]'
 IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[ChrisC].[Brands]') AND type in (N'U'))
     AND NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[ChrisC].[BrandTags]') AND type in (N'U'))
     AND NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[ChrisC].[Tags]') AND type in (N'U'))
 BEGIN
  CREATE TABLE [ChrisC].[Brands]([pkBrand] [int] IDENTITY(101,1) NOT NULL,[Name] [varchar](40) NULL) ON [PRIMARY]
  INSERT INTO [ChrisC].[Brands]([Name])VALUES('Nike'),('GAP')
  CREATE TABLE [ChrisC].[BrandTags]([pk] [int] IDENTITY(1,1) NOT NULL,[Brand] [int] NULL,[Tag] [int] NULL) ON [PRIMARY]
  INSERT INTO [ChrisC].[BrandTags]([Brand],[Tag])VALUES
    (101,201),(101,202),(101,203),(101,204),(101,205),(101,206),(101,207),
    (102,208),(102,209),(102,203),(102,207),(102,210)
  CREATE TABLE [ChrisC].[Tags]([pkTag] [int] IDENTITY(201,1) NOT NULL,[Tag] [varchar](40) NULL) ON [PRIMARY]
  INSERT INTO [ChrisC].[Tags]([Tag])VALUES
    ('bags'),('football'),('shoes'),('soccer'),('sports'),('track-pants'),('t-shirts'),('jeans'),('perfumes'),('wallets')
  SELECT b.[Name], t.Tag 
  FROM chrisc.Brands b 
  LEFT JOIN chrisc.BrandTags bt ON pkBrand = Brand
  LEFT JOIN chrisc.Tags t ON bt.Tag = t.pkTag
  WHERE b.[Name] = 'Nike'
  -- Stop execution here to see the tables with data
  DROP TABLE [ChrisC].[Brands]
  DROP TABLE [ChrisC].[BrandTags]
  DROP TABLE [ChrisC].[Tags]
 END
 IF  EXISTS (SELECT * FROM sys.schemas WHERE name = N'ChrisC') DROP SCHEMA [ChrisC]
END
Satirist answered 12/6, 2018 at 13:25 Comment(2)
Can you show with one example of how the tags would look like with a 3 table system in the Nike example. Because I feel, the rule of atomicity will be broken no matter what. And the database won't be normalized.Uncanonical
Done. Let me know if you have questions (I used a custom schema which I hope isn't confusing)Satirist

© 2022 - 2024 — McMap. All rights reserved.