Case insensitive unique model fields in Django?
C

11

75

Basically, I have a username that must be unique (case-insensitively), but its case matters when it is displayed, so it should be kept as provided by the user.

I have the following requirements:

  • field is CharField compatible
  • field is unique, but case insensitive
  • field needs to be searchable ignoring case (avoid using iexact, easily forgotten)
  • field is stored with case intact
  • preferably enforced on database level
  • preferably avoid storing an extra field

Is this possible in Django?

The only solutions I came up with are to "somehow" override the model manager, use an extra field, or always use 'iexact' in searches.

I'm on Django 1.3 and PostgreSQL 8.4.2.

Catholicon answered 14/10, 2011 at 20:41 Comment(1)
possible duplicate of Unique model field in Django and case sensitivity (postgres)Cantonment
S
31

Store the original mixed-case string in a plain text column. Use the data type text or varchar without a length modifier rather than varchar(n). They are essentially the same, but with varchar(n) you have to set an arbitrary length limit, which can be a pain if you want to change it later. Read more about that in the manual or in this related answer by Peter Eisentraut @serverfault.SE.

Create a functional unique index on lower(string). That's the major point here:

CREATE UNIQUE INDEX my_idx ON mytbl(lower(name));

If you try to INSERT a mixed case name that's already there in lower case you get a unique key violation error.
For fast equality searches use a query like this:

SELECT * FROM mytbl WHERE lower(name) = 'foo' --'foo' is lower case, of course.

Use the same expression you have in the index (so the query planner recognizes the compatibility) and this will be very fast.
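The behaviour of such an index can be tried out without a PostgreSQL server. The following is only a sketch under a stated assumption: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (SQLite also accepts unique indexes on expressions) and reuses the table and index names from above. SQLite's lower() only folds ASCII letters, so this illustrates the idea and is not a substitute for testing against PostgreSQL.

```python
import sqlite3

# In-memory SQLite database, used here as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytbl (name TEXT NOT NULL)")
conn.execute("CREATE UNIQUE INDEX my_idx ON mytbl (lower(name))")

# The original case is stored untouched.
conn.execute("INSERT INTO mytbl (name) VALUES (?)", ("FooBar",))

# A second row that differs only by case violates the unique index.
try:
    conn.execute("INSERT INTO mytbl (name) VALUES (?)", ("foobar",))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Search with the same expression as the index and lower-case the parameter too.
row = conn.execute(
    "SELECT name FROM mytbl WHERE lower(name) = lower(?)", ("FOOBAR",)
).fetchone()
stored_name = row[0]
row_count = conn.execute("SELECT count(*) FROM mytbl").fetchone()[0]
```

The duplicate insert is rejected by the database itself, and the lookup returns the name in the case the user originally typed.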


As an aside: you may want to upgrade to a more recent version of PostgreSQL. There have been lots of important fixes since 8.4.2. More on the official Postgres versioning site.

Stonewort answered 14/10, 2011 at 21:1 Comment(9)
Thank you for the solution. I ended up using this one and the one below so now you can't just work around the code.Catholicon
Great solution. Is there a way to do this using Django ORM? Or should I do it in PostgreSQL directly?Talie
@fcrazy: I am no expert with Django, but a single raw SQL call for the CREATE UNIQUE INDEX ... statement should do the job.Stonewort
@ErwinBrandstetter Thanks Erwin, I did my own research, and it seems that a good place to do this in Django is the file <appname>/sql/<modelname>.sql, where <appname> is the given app, as explained here: docs.djangoproject.com/en/1.5/ref/django-admin/…Talie
Minor suggestion: Django maps iexact to UPPER. Shouldn't the functional index recommended for Django be UPPER?Salsify
INDEXES in postgres need to be maintained, shouldn't that have been mentioned in the original post? Please correct me if I'm wrong, but eventually wouldn't using the query in your answer have to be rebuilt at some point?Primine
@Dre: All indexes, once created, are maintained automatically in Postgres. (This adds some cost to write operations and occupies storage accordingly.) The query can stay the same. Not sure what you mean by "rebuilding query".Stonewort
@ErwinBrandstetter for example, if you have a database of 50 users and then scale to 500,000 users. Will the index still be used / will it cause fragmentation?Primine
@Dre The number of (concurrent) users or transactions has no adverse effect on index usage. Indexes do not "cause fragmentation". Maybe you mean index bloat? Can be a thing. I suggest you start a new question with all details to clarify your concern.Stonewort
R
47

As of Django 1.11, you can use CITextField, a Postgres-specific Field for case-insensitive text backed by the citext type.

from django.db import models
from django.contrib.postgres.fields import CITextField

class Something(models.Model):
    foo = CITextField()

Django also provides CIEmailField and CICharField, which are case-insensitive versions of EmailField and CharField.

Razzledazzle answered 5/5, 2017 at 19:25 Comment(2)
nice! but, note that you must install a postgres extension (citext) to use it.Bibliotheca
I still can make "gYm foOd" and then I can add "gYM FOOD", unique=True doesn't give me an error.Swath
A
37

As of December 2021, with the help of Django 4.0 UniqueConstraint expressions you can add a Meta class to your model like this:

# At the top of models.py: from django.db.models.functions import Lower

class Meta:
    constraints = [
        models.UniqueConstraint(
            Lower('<field name>'),
            name='<constraint name>'
        ),
    ]

I'm by no means a professional Django developer, and I don't know the technical considerations, such as performance, of this solution. I hope others will comment on that.

Anhydrite answered 12/12, 2021 at 10:47 Comment(1)
This is the correct way for Django 4.0 or higher, as it does what the accepted answer points out, CREATE UNIQUE INDEX my_idx ON mytbl(LOWER(name)), but within the ORM.Taipan
P
20

With overriding the model manager, you have two options. First is to just create a new lookup method:

class MyModelManager(models.Manager):
   def get_by_username(self, username):
       return self.get(username__iexact=username)

class MyModel(models.Model):
   ...
   objects = MyModelManager()

Then, you use get_by_username('blah') instead of get(username='blah'), and you don't have to worry about forgetting iexact. Of course that then requires that you remember to use get_by_username.

The second option is much hackier and more convoluted. I'm hesitant to even suggest it, but for completeness' sake, I will: override filter and get such that if you forget iexact when querying by username, it will add it for you.

class MyModelManager(models.Manager):
    def filter(self, **kwargs):
        if 'username' in kwargs:
            kwargs['username__iexact'] = kwargs['username']
            del kwargs['username']
        return super(MyModelManager, self).filter(**kwargs)

    def get(self, **kwargs):
        if 'username' in kwargs:
            kwargs['username__iexact'] = kwargs['username']
            del kwargs['username']
        return super(MyModelManager, self).get(**kwargs)

class MyModel(models.Model):
   ...
   objects = MyModelManager()
Pediatrics answered 14/10, 2011 at 22:1 Comment(3)
I like the hackier version better than custom method version +1 for hackiness!Pipkin
I prefer this method, especially the hackier version, over the accepted answer because this is DBMS-agnostic. It makes you stick with Django's case-insensitive QuerySet methods in the end, so Django can still generate the SQL statements with the proper collation coercion, regardless of the DBMS backend.Ento
It may be database agnostic, but it doesn't prevent you from inserting the same value with different case. So it is not a complete solution to case-insensitive unique model fields. You could always convert to lower case before storing the object in the database, but then you lose the original case, which is not necessarily acceptable.Sad
A
6

If usernames are always lowercase in your application, you can use a custom lowercase model field in Django. For ease of access and tidiness, create a new file fields.py in your app folder.

from django.db import models


# Custom lowercase CharField.
# Note: models.SubfieldBase and django.utils.six no longer exist in current
# Django releases; a plain CharField subclass is enough here.
class LowerCharField(models.CharField):
    def __init__(self, *args, **kwargs):
        self.is_lowercase = kwargs.pop('lowercase', False)
        super().__init__(*args, **kwargs)

    def get_prep_value(self, value):
        value = super().get_prep_value(value)
        # Leave None untouched so nullable fields do not raise AttributeError.
        if self.is_lowercase and value is not None:
            return value.lower()
        return value

Usage in models.py

from django.db import models
from your_app_name.fields import LowerCharField

class TheUser(models.Model):
    username = LowerCharField(max_length=128, lowercase=True, null=False, unique=True)

End note: you can use this method to store lowercase values in the database and not worry about __iexact.

Alpestrine answered 12/3, 2016 at 18:49 Comment(0)
U
3

You can use the citext PostgreSQL type instead and no longer bother with any sort of iexact. Just make a note in the model that the underlying field is case-insensitive. It is a much easier solution.

Utica answered 26/6, 2015 at 21:10 Comment(0)
V
1

You can use lookup='iexact' in UniqueValidator on serializer, like this: Unique model field in Django and case sensitivity (postgres)

V2 answered 28/12, 2017 at 16:42 Comment(0)
P
1

I liked Chris Pratt's answer, but it didn't work for me, because the models.Manager class doesn't have the get(...) or filter(...) methods. I had to take an extra step via a custom QuerySet:

from django.contrib.auth.base_user import BaseUserManager
from django.db.models import QuerySet

class CustomUserManager(BaseUserManager):

    # Use the custom QuerySet where get and filter will change 'email'
    def get_queryset(self):
        return UserQuerySet(self.model, using=self._db)

    def create_user(self, email, password, **extra_fields):
        ...

    def create_superuser(self, email, password, **extra_fields):
        ...

class UserQuerySet(QuerySet):

    def filter(self, *args, **kwargs):
        if 'email' in kwargs:
            # Probably also have to replace...
            #   email__contains -> email__icontains,
            #   email__exact -> email__iexact,
            #   etc.
            kwargs['email__iexact'] = kwargs['email']
            del kwargs['email']
        return super().filter(*args, **kwargs)

    def get(self, *args, **kwargs):
        if 'email' in kwargs:
            kwargs['email__iexact'] = kwargs['email']
            del kwargs['email']
        return super().get(*args, **kwargs)

I have only tried this in a very simple case, but it has worked well so far.
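The comment inside UserQuerySet.filter hints that other lookups on the field would need the same treatment. One possible way to handle them in a single place is a small helper that rewrites the keyword arguments before they reach super().filter() or super().get(). This is a hedged sketch, not part of Django: the helper name make_case_insensitive and the CASE_INSENSITIVE_LOOKUPS table are hypothetical, while the lookup names themselves (iexact, icontains, istartswith, iendswith, iregex) are Django's built-in ones.

```python
# Case-sensitive Django lookups and their case-insensitive counterparts.
CASE_INSENSITIVE_LOOKUPS = {
    "exact": "iexact",
    "contains": "icontains",
    "startswith": "istartswith",
    "endswith": "iendswith",
    "regex": "iregex",
}


def make_case_insensitive(kwargs, field="email"):
    """Return a copy of kwargs with lookups on `field` made case-insensitive.

    Hypothetical helper for illustration; it only touches keyword lookups.
    """
    rewritten = {}
    for key, value in kwargs.items():
        name, _, lookup = key.partition("__")
        if name == field:
            # A bare field name means an implicit "exact" lookup.
            lookup = CASE_INSENSITIVE_LOOKUPS.get(lookup or "exact", lookup)
            key = f"{name}__{lookup}"
        rewritten[key] = value
    return rewritten
```

Inside the QuerySet, both overrides could then call super().filter(*args, **make_case_insensitive(kwargs)) and super().get(*args, **make_case_insensitive(kwargs)). Lookups without a case-insensitive counterpart (such as email__in) pass through unchanged, and conditions passed as Q objects in args are not rewritten at all.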

Pettish answered 2/2, 2022 at 20:29 Comment(0)
S
0

You can also override get_prep_value() and reuse it through inheritance.

class LowerCaseField:
    def get_prep_value(self, value):
        value = super().get_prep_value(value)
        if value:
            value = value.strip().lower()
        return value


class LowerSlugField(LowerCaseField, models.SlugField):
    pass


class LowerEmailField(LowerCaseField, models.EmailField):
    pass


class MyModel(models.Model):
    email = LowerEmailField(max_length=255, unique=True)

This way, if you ever want to reuse this field in another model, you can use the same consistent strategy.

From Django Docs:

get_prep_value(value)

value is the current value of the model’s attribute, and the method should return data in a format that has been prepared for use as a parameter in a query.

See Converting Python objects to query values for usage.
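To see how the mixin cooperates with the field class it is combined with, here is a Django-free sketch. StubField is a hypothetical stand-in for a real field such as models.EmailField (an assumption made only so the snippet runs on its own); the mixin body is the same as above.

```python
class StubField:
    """Stand-in for a Django field class (assumption, not Django's code)."""

    def get_prep_value(self, value):
        return value


class LowerCaseField:
    def get_prep_value(self, value):
        # super() resolves to the next class in the MRO, here StubField.
        value = super().get_prep_value(value)
        if value:
            value = value.strip().lower()
        return value


class LowerStubField(LowerCaseField, StubField):
    pass


field = LowerStubField()
prepared = field.get_prep_value("  Alice@Example.COM ")
untouched = field.get_prep_value(None)
```

Because the mixin is listed first, its get_prep_value runs before the value is handed to the database, and None passes through unchanged. Note that the value is lower-cased before it is stored, so the original case is lost; this does not meet the question's requirement of storing the case intact.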

Syncarpous answered 20/11, 2018 at 9:27 Comment(0)
F
0

One of the best options is to create a form and pass it to the admin in place of the default admin form, like this:

from django import forms
from django.contrib import admin

from .models import YourModel  # assuming YourModel lives in models.py


class HandMadeFormForAdminPage(forms.ModelForm):
    class Meta:
        model = YourModel
        fields = "__all__"

    def clean(self):
        cleaned_data = super().clean()

        field_value = cleaned_data.get("YourFieldName")
        if field_value:
            # Look for another row whose value differs at most by case;
            # exclude the object being edited so it does not match itself.
            value_pair = (
                YourModel.objects.filter(YourFieldName__iexact=field_value)
                .exclude(pk=self.instance.pk)
                .first()
            )
            if value_pair:
                self.add_error("YourFieldName", f"{value_pair} already exists")
        return cleaned_data


class YourModelAdmin(admin.ModelAdmin):
    form = HandMadeFormForAdminPage
Farfetched answered 19/10, 2023 at 23:30 Comment(0)
R
0

By overriding db_type, you could also do something like:

from django.db import models
from django.utils.translation import gettext_lazy as _


class CaseInsensitiveCharField(models.CharField):
    description = _("Case insensitive character")

    def db_type(self, connection):
        return "citext"
Remove answered 24/11, 2023 at 17:15 Comment(0)
