Should I use ArrayField or ManyToManyField for tags

I am trying to add tags to a model for a Postgres database in Django, and I found two solutions:

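Using a many-to-many relation to a separate Tag model:

from django.db import models

class Post(models.Model):
    # The target must be the model name ('Tag'), not 'tags'
    tags = models.ManyToManyField('Tag')
    ...

class Tag(models.Model):
    name = models.CharField(max_length=140)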

Using an array field:

from django.contrib.postgres.fields import ArrayField
from django.db import models

class Post(models.Model):
    tags = ArrayField(models.CharField(max_length=140))
    ...

Assuming that I don't care about supporting other database backends in my code, which solution is recommended?

Overwind answered 6/6, 2017 at 6:18 Comment(5)
If you don't want to reinvent the wheel: django-taggit.readthedocs.io/en/latest - Dislocate
@rayy My question is not about using an external module (which I am aware of) but about implementation choices. I wanted to know more about the pros and cons of one field type over the other. - Overwind
Possible duplicate of Django JSONField inside ArrayField - Conics
@Conics While it's related, I'm not asking about JSONField. - Overwind
Yeah, I think I used the wrong dup target by accident, but basically you shouldn't be using the ArrayField. Simple as that. - Conics
M
12

If you use an ArrayField,

  • Each row in your DB is going to be somewhat larger, so Postgres is more likely to push those values out to TOAST storage.

  • Every time you fetch the row, you pay the cost of loading all those values, unless you specifically defer the field or otherwise exclude it from the query via only, values, or similar. If that's what you need, then so be it.

  • Filtering on values in that array, while possible, isn't going to be as nice, and the Django ORM doesn't make it as obvious as it does for M2M tables.

If you use an M2M field,

  • You can filter more easily on those related values.

  • Those fields are postponed by default; you can use prefetch_related if you need them, and then get fancy if you want only a subset of those values loaded.

  • Total storage in the DB is going to be slightly higher with M2M because of the keys and extra id fields.

  • The cost of the joins in this case is typically negligible because they run on indexed key columns.
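
To make the storage and join points a bit more concrete, here is a rough sketch of the relational layout that a many-to-many tag field implies. It uses Python's built-in sqlite3 module purely for illustration: SQLite stands in for Postgres, and the table names, column names, and the titles_tagged helper are assumptions, not the exact DDL or queries Django would emit.

```python
import sqlite3

# Illustrative only: SQLite stands in for Postgres, and every table and
# column name here is an assumption rather than Django-generated DDL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE tag  (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    -- The join table is the extra storage: its own id plus two foreign keys.
    CREATE TABLE post_tags (
        id      INTEGER PRIMARY KEY,
        post_id INTEGER NOT NULL REFERENCES post(id),
        tag_id  INTEGER NOT NULL REFERENCES tag(id),
        UNIQUE (post_id, tag_id)
    );
    CREATE INDEX post_tags_tag_id ON post_tags (tag_id);
""")

conn.executemany(
    "INSERT INTO post (id, title) VALUES (?, ?)",
    [(1, "Intro to Django"), (2, "Postgres arrays"), (3, "Tagging models")],
)
conn.executemany(
    "INSERT INTO tag (id, name) VALUES (?, ?)",
    [(1, "django"), (2, "postgres")],
)
conn.executemany(
    "INSERT INTO post_tags (post_id, tag_id) VALUES (?, ?)",
    [(1, 1), (2, 2), (3, 1), (3, 2)],
)


def titles_tagged(name):
    """Return titles of posts carrying the given tag, via two keyed joins."""
    rows = conn.execute(
        """
        SELECT post.title
        FROM post
        JOIN post_tags ON post_tags.post_id = post.id
        JOIN tag       ON tag.id = post_tags.tag_id
        WHERE tag.name = ?
        ORDER BY post.id
        """,
        (name,),
    ).fetchall()
    return [title for (title,) in rows]


print(titles_tagged("django"))
```

Both joins go through primary-key or indexed columns, which is the reason the join cost tends to stay small; the price is the extra post_tags rows.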

That said, the answer above isn't mine. I ran into this dilemma a while ago when I was learning Django, and I found it in this question: Django Postgres ArrayField vs One-to-Many relationship.

Hope you get what you were looking for.

Macrobiotic answered 6/6, 2017 at 9:59 Comment(0)
S
4

If you want the tags to be tracked (e.g. how many tags there are, how many times a particular tag is used, etc.), then go for the first option, as you can add more fields to the Tag model, which adds richness to the app.

On the other hand, if you just want a plain list of strings for display or minimal processing, go for the ArrayField option.
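
As a rough illustration of the tracking point, per-tag usage counts fall out of a single GROUP BY once tags live in their own table. The sketch below uses Python's built-in sqlite3 module rather than Postgres or the Django ORM, and the table names, sample rows, and the tag_usage helper are all hypothetical.

```python
import sqlite3

# Illustrative only: hypothetical tables approximating a Tag model plus
# the join table of a many-to-many field, with SQLite instead of Postgres.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE post_tags (
        post_id INTEGER NOT NULL,
        tag_id  INTEGER NOT NULL REFERENCES tag(id)
    );
""")
conn.executemany(
    "INSERT INTO tag (id, name) VALUES (?, ?)",
    [(1, "red"), (2, "green"), (3, "unused")],
)
conn.executemany(
    "INSERT INTO post_tags (post_id, tag_id) VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 1)],
)


def tag_usage():
    """Map each tag name to the number of posts using it (0 if unused)."""
    rows = conn.execute(
        """
        SELECT tag.name, COUNT(post_tags.post_id)
        FROM tag
        LEFT JOIN post_tags ON post_tags.tag_id = tag.id
        GROUP BY tag.id, tag.name
        ORDER BY tag.name
        """
    ).fetchall()
    return dict(rows)


print(tag_usage())
```

The LEFT JOIN keeps tags that no post uses, which a per-row array of strings cannot represent at all.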

But if you wish to save time and add richness to the app, you can use this

https://github.com/alex/django-taggit

It is as simple as this to initialise:

from django.db import models

from taggit.managers import TaggableManager

class Food(models.Model):
    # ... fields here

    tags = TaggableManager()

and can be used in the following way:

>>> apple = Food.objects.create(name="apple")
>>> apple.tags.add("red", "green", "delicious") 
>>> apple.tags.all()
[<Tag: red>, <Tag: green>, <Tag: delicious>]
Shaper answered 6/6, 2017 at 6:59 Comment(3)
Thanks, but my question is more about the difference between my choices (why A is better than B). taggit is a good choice, but my question is not exactly about a tagging module. The first part of your answer is what I was looking for. - Overwind
Oh I see. Was I able to answer your question or do you have any other questions? - Shaper
Waiting for something more detailed :-) - Overwind
