Unique model field in Django and case sensitivity (postgres)
Asked Answered
D

10

31

Consider the following situation: -

Suppose my app allows users to create the states / provinces in their country. Just for clarity, we are considering only ASCII characters here.

In the US, a user could create the state called "Texas". If this app is being used internally, let's say the user doesn't care if it is spelled "texas" or "Texas" or "teXas"

But importantly, the system should prevent creation of "texas" if "Texas" is already in the database.

If the model is like the following:

class State(models.Model):
    name = models.CharField(max_length=50, unique=True)

The uniqueness would be case-sensitive in postgres; that is, postgres would allow the user to create both "texas" and "Texas" as they are considered unique.

What can be done in this situation to prevent such behavior. How does one go about providing case-insenstitive uniqueness with Django and Postgres

Right now I'm doing the following to prevent creation of case- insensitive duplicates.

class CreateStateForm(forms.ModelForm):
    def clean_name(self):
        name = self.cleaned_data['name']
        try:
            State.objects.get(name__iexact=name)
        except ObjectDoesNotExist:
            return name
        raise forms.ValidationError('State already exists.')

    class Meta:
        model = State

There are a number of cases where I will have to do this check and I'm not keen on having to write similar iexact checks everywhere.

Just wondering if there is a built-in or better way? Perhaps db_type would help? Maybe some other solution exists?

Defunct answered 7/12, 2009 at 4:31 Comment(4)
I think you're already doing it right. At least that's the way i do it with tags, so my tag cloud doesn't end up with "tag", "Tag", and "TAG" all right next to each other.Glyceric
Yes but here Django's built-in unique is not being of much help, is it? After all I'm having to do the unique checks myself everywhere.Defunct
I would code the app to ensure that the string entered the database only in lowercase. Then check against that.Lighting
I would rather let the user keep whatever case he wants to keep.Defunct
E
32

You could define a custom model field derived from models.CharField. This field could check for duplicate values, ignoring the case.

Custom fields documentation is here http://docs.djangoproject.com/en/dev/howto/custom-model-fields/

Look at http://code.djangoproject.com/browser/django/trunk/django/db/models/fields/files.py for an example of how to create a custom field by subclassing an existing field.

You could use the citext module of PostgreSQL https://www.postgresql.org/docs/current/static/citext.html

If you use this module, the the custom field could define "db_type" as CITEXT for PostgreSQL databases.

This would lead to case insensitive comparison for unique values in the custom field.

Estonian answered 15/12, 2009 at 8:20 Comment(3)
This is an interesting solution, and is seems more Django-istic than other solutions mentioned here.Defunct
see my answer below for the explicit stepsDitter
As of Django 1.11, CICharField is now part of Django. See https://mcmap.net/q/268720/-case-insensitive-unique-model-fields-in-djangoFlowers
X
8

Alternatively you can change the default Query Set Manager to do case insensitive look-ups on the field. In trying to solve a similar problem I came across:

http://djangosnippets.org/snippets/305/

Code pasted here for convenience:

from django.db.models import Manager
from django.db.models.query import QuerySet

class CaseInsensitiveQuerySet(QuerySet):
    def _filter_or_exclude(self, mapper, *args, **kwargs):
        # 'name' is a field in your Model whose lookups you want case-insensitive by default
        if 'name' in kwargs:
            kwargs['name__iexact'] = kwargs['name']
            del kwargs['name']
        return super(CaseInsensitiveQuerySet, self)._filter_or_exclude(mapper, *args, **kwargs)

# custom manager that overrides the initial query set
class TagManager(Manager):
    def get_query_set(self):
        return CaseInsensitiveQuerySet(self.model)

# and the model itself
class Tag(models.Model):
    name = models.CharField(maxlength=50, unique=True, db_index=True)

    objects = TagManager()

    def __str__(self):
        return self.name
Xe answered 31/10, 2010 at 16:46 Comment(1)
This approach won't work for compound queries like name__in=[] or related__name=.Floret
K
7

a very simple solution:

class State(models.Model):
    name = models.CharField(max_length=50, unique=True)

    def clean(self):
        self.name = self.name.capitalize()
Klayman answered 5/10, 2013 at 4:51 Comment(1)
If the name contains two words (e.g. "South Dakota"), this would convert the second to lowercase. With a name, you probably want to use title().Mentholated
V
7

Explicit steps for Mayuresh's answer:

  1. in postgres do: CREATE EXTENSION citext;

  2. in your models.py add:

    from django.db.models import fields
    
    class CaseInsensitiveTextField(fields.TextField):
        def db_type(self, connection):
            return "citext"
    

    reference: https://github.com/zacharyvoase/django-postgres/blob/master/django_postgres/citext.py

  3. in your model use: name = CaseInsensitiveTextField(unique=True)

Viens answered 4/10, 2014 at 11:49 Comment(0)
M
6

On the Postgres side of things, a functional unique index will let you enforce unique values without case. citext is also noted, but this will work with older versions of PostgreSQL and is a useful technique in general.

Example:

# create table foo(bar text);
CREATE TABLE
# create unique index foo_bar on foo(lower(bar));
CREATE INDEX
# insert into foo values ('Texas');
INSERT 0 1
# insert into foo values ('texas');
ERROR:  duplicate key value violates unique constraint "foo_bar"
Moina answered 16/12, 2009 at 9:26 Comment(3)
I have tried this and can confirm it works. But the answer by Mayuresh allows me to live within Django.Defunct
Well, you should always enforce your constraints in the database as well.Moina
What's a good way to enforce this 'automatically' at database level after Django has created models. I'm asking for a scripted way of doing this instead of manually editing db after django's syncdbBrody
P
5

Besides already mentioned option to override save, you can simply store all text in lower case in database and capitalize them on displaying.

class State(models.Model):
    name = models.CharField(max_length=50, unique=True)

    def save(self, force_insert=False, force_update=False):
        self.name = self.name.lower()
        super(State, self).save(force_insert, force_update)
Pairs answered 14/12, 2009 at 17:16 Comment(1)
I'm getting an error now when I have the same text, how can I instead display the error on the page, instead of getting "IntegrityError at /add_category/ UNIQUE constraint failed: theblog_category.name"?Trashy
D
3

You can use lookup='iexact' in UniqueValidator on serializer, like this:

class StateSerializer(serializers.ModelSerializer): 
    name = serializers.CharField(validators=[
    UniqueValidator(
        queryset=models.State.objects.all(),lookup='iexact'
    )]

django version: 1.11.6

Dogmatize answered 27/12, 2017 at 22:44 Comment(0)
B
2

If you don't want to use a postgres-specific solution, you can create a unique index on the field with upper() to enforce uniqueness at the database level, then create a custom Field mixin that overrides get_lookup() to convert case-sensitive lookups to their case-insensitive versions. The mixin looks like this:

class CaseInsensitiveFieldMixin:
    """
    Field mixin that uses case-insensitive lookup alternatives if they exist.
    """

    LOOKUP_CONVERSIONS = {
        'exact': 'iexact',
        'contains': 'icontains',
        'startswith': 'istartswith',
        'endswith': 'iendswith',
        'regex': 'iregex',
    }

    def get_lookup(self, lookup_name):
        converted = self.LOOKUP_CONVERSIONS.get(lookup_name, lookup_name)
        return super().get_lookup(converted)

And you use it like this:

from django.db import models


class CICharField(CaseInsensitiveFieldMixin, models.CharField):
    pass


class CIEmailField(CaseInsensitiveFieldMixin, models.EmailField):
    pass


class TestModel(models.Model):
    name = CICharField(unique=True, max_length=20)
    email = CIEmailField(unique=True)

You can read more about this approach here.

Balfore answered 27/10, 2018 at 16:3 Comment(0)
C
1

You can do this by overwriting the Model's save method - see the docs. You'd basically do something like:

class State(models.Model):
    name = models.CharField(max_length=50, unique=True)

    def save(self, force_insert=False, force_update=False):
        if State.objects.get(name__iexact = self.name):
            return
        else:
            super(State, self).save(force_insert, force_update)

Also, I may be wrong about this, but the upcoming model-validation SoC branch will allow us to do this more easily.

Chemo answered 7/12, 2009 at 8:40 Comment(3)
This is essentially the same as what I am already doing. In fact, they way I am doing it right now is better than handling it within save.Defunct
On a second read, you're right. AFAIK, using the forms' validation would be the best way to go (as of now) - unless the data is not being inserted via a form :).Chemo
It's totally possible for data to not be inserted via a form! especially if you are using a third-party app.Floret
D
0

Solution from suhail worked for me without the need to enable citext, pretty easy solution only a clean function and instead of capitalize I used upper(). Mayuresh's solution also works but changed the field from CharField to TextField.

class State(models.Model):

    name = models.CharField(max_length=50, unique=True)

    def clean(self):
        self.name = self.name.upper()
Divulgence answered 2/6, 2015 at 21:17 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.