Django custom for complex Func (sql function)
Asked Answered
B

3

16

In the process of finding a solution for Django ORM order by exact, I created a custom django Func:

from django.db.models import Func

class Position(Func):
    function = 'POSITION'
    template = "%(function)s(LOWER('%(substring)s') in LOWER(%(expressions)s))"
    template_sqlite = "instr(lower(%(expressions)s), lower('%(substring)s'))"

    def __init__(self, expression, substring):
        super(Position, self).__init__(expression, substring=substring)

    def as_sqlite(self, compiler, connection):
        return self.as_sql(compiler, connection, template=self.template_sqlite)

which works as follows:

class A(models.Model):
    title = models.CharField(max_length=30)

data = ['Port 2', 'port 1', 'A port', 'Bport', 'Endport']
for title in data:
    A.objects.create(title=title)

search = 'port'
qs = A.objects.filter(
        title__icontains=search
    ).annotate(
        pos=Position('title', search)
    ).order_by('pos').values_list('title', flat=True)
# result is
# ['Port 2', 'port 1', 'Bport', 'A port', 'Endport'] 

But as @hynekcer commented:

"It crashes easily by ') in '') from myapp_suburb; drop ... expected that the name of the app is "myapp and autocommit is enabled."

The main problem is that extra data (substring) got into the template without sqlescape which leaves the app vulnerable to SQL injection attacks.

I cannot find which is the Django way to protect from that.


I created a repo (djposfunc) where you can test any solution.

Banjermasin answered 26/9, 2017 at 12:51 Comment(1)
Excuse the delay, I wrote the answer to this security question before any answer locally and waited for a solution of the issue. Now I wrote also a normal answer to the original question.Sports
S
13

TL;DR: All examples with Func() in Django docs can be easily used to safely implement other similar SQL functions with one argument. All builtin Django database fuctions and conditional functions that are descendants of Func() are also safe by design. Application beyond this limit needs comment.


The class Func() is the most general part of Django Query expressions. It allows to implement almost any function or operator into Django ORM some way. It is like a Swiss Army knife, very universal, but one must be little more attentive to not cut himself, than with a specialized tool (like an electric cutter with optical barrier). It is still much more secure then to forge an own tool by hammer from piece of iron, if once an "upgraded" "secure" pocket knife does not fit into pocket.


Security notes

  • The short documentation for Func(*expressions, **extra) with examples should be read first. (I recommend here the development docs for Django 2.0 where is recently added more security information, including Avoiding SQL injection, related exactly to your example.)

  • All positional arguments in *expressions are compiled by Django, that is Value(string) are moved to parameters, where they are correctly escaped by database driver.

  • Other strings are interpreted as field names F(name), then prefixed by right table_name. alias dot, eventually a join to that table is added and names are treated by quote_name() function.
  • The problem is that the documentation in 1.11 is still simple, the seductive parameters **extra and **extra_context are documented vaguely. They can be used only for simple parameters that will be never "compiled" and never go through SQL params. Numbers or simple strings with safe characters without apostrophe, backslash or percent are good. It can not be a field name, because it will be not unambiguous, neither joined. It is safe for previously checked numbers and fixed strings like 'ASC'/'DESC', timezone names and other values like from a drop down list. There is still a weak point. Drop down list values must be checked at the server side. Also numbers must be verified that they are numbers, not a numeric string like '2' because all database functions silently accept an omitted numeric string instead of number. If a false "number" is passed '0) from my_app.my_table; rogue_sql; --' then the injection is over. Note that the rogue string doesn't contain any very prohibitive character in this case. User supplied numbers must be checked specifically or the value must be passed through positional expressions.
  • It is safe to specify function name and arg_joiner string attributes of Func class or the same function and arg_joiner parameters of Func() call. The template parameter should never contain apostrophes around substituted parameter expressions inside parentheses: ( %(expressions)s ), because apostrophes are added by the database driver if necessary, but additional apostrophes can cause that it usually doesn't work correctly, but sometimes it could be overlooked and that would lead to another security issue.

Notes not related to security

  • Many simple builtin functions with one argument do not look as simple as possible because they are derived from multi-purpose descendants of Func. For example Length is a function that can be used also as lookup Transform.

    class Length(Transform):
        """Return the number of characters in the expression."""
        function = 'LENGTH'
        output_field = fields.IntegerField()  # sometimes specified the type
        # lookup_name = 'length'  # useful for lookup not for Func usage
    

    Lookup transformation applies the same function to the left and right side of lookup.

    # I'm searching people with usernames longer than mine 
    qs = User.objects.filter(username__length__gt=my_username)
    
  • The same keyword arguments that can be specified in Func.as_sql(..., function=..., template=..., arg_joiner=...) can be specified already in Func.__init__() if not overwritten in custom as_sql() or they can be set as attributes of a custom descendant class of Func.

  • Many SQL database functions have a verbose syntax like POSITION(substring IN string) because it simplifies readability if named parameters are not supported like POSITION($1 IN $2) and a brief variant STRPOS(string, substring) (por postgres) or INSTR(string, substring) (for other databases) that is easier implemented by Func() and the readability is fixed by the Python wrapper with __init__(expression, substring).

  • Also very complicated functions can be implemented by a combination of more nested functions with simple arguments safe way: Case(When(field_name=lookup_value, then=Value(value)), When(...),... default=Value(value)).

Sports answered 4/11, 2017 at 20:59 Comment(0)
C
7

Usually, what leaves you vulnerable to an SQL injection attack are the "stray" single quotes '.
Everything contained between a single quote pair will be processed as it should, but an unpaired single quote may end the string and allow the rest of the entry to act as executable piece of code.
That is exactly the case on @hynekcer's example.

Django provides the Value method to prevent the above:

The value will be added into the SQL parameter list and properly quoted.

So if you make sure to pass every user input through the Value method you will be fine:

from django.db.models import Value

search = user_input
qs = A.objects.filter(title__icontains=search)
              .annotate(pos=Position('title', Value(search)))
              .order_by('pos').values_list('title', flat=True)

EDIT:

As stated in the comments, that doesn't seem to work as expected in the above setting. But if the call is as follows it works:

pos=Func(F('title'), Value(search), function='INSTR')

As a side note: Why mess with the templates in the first place?

You can find the function you want to use from any database language (ex: SQLite, PostgreSQL, MySQL etc) and use it explicitly:

class Position(Func):
    function = 'POSITION' # MySQL default in your example

    def as_sqlite(self, compiler, connection):
        return self.as_sql(compiler, connection, function='INSTR')

    def as_postgresql(self, compiler, connection):
        return self.as_sql(compiler, connection, function='STRPOS')

    ...

EDIT:

You can use other functions (like the LOWER function) inside a Func call as follows:

pos=Func(Lower(F('title')), Lower(Value(search)), function='INSTR')
Cherice answered 29/9, 2017 at 14:26 Comment(3)
do you try your solution, in my test it does not work. about SQL function it will not work for current need case insensitive. Why mess with the templates in the first place in other way i can said it needs as solution for complex functionBanjermasin
I run some tests in your code @BearBrown and if I run that pos=Func(F('title'), Value(search), function='INSTR') instead of Position('title', search) I pass the tests but if I do pos=Position('title', Value(search)) it does fail. Maybe some django bug?Cherice
@BearBrown I would guess that a mixup may happen because both Value and Func have a as_sql method. Let me know in order to edit my answer!Cherice
B
5

basis on the John Moutafis ideas, final function is (inside the __init__ method we use Values for safety result.)

from django.db.models import Func, F, Value
from django.db.models.functions import Lower


class Instr(Func):
    function = 'INSTR'

    def __init__(self, string, substring, insensitive=False, **extra):
        if not substring:
            raise ValueError('Empty substring not allowed')
        if not insensitive:
            expressions = F(string), Value(substring)
        else:
            expressions = Lower(string), Lower(Value(substring))
        super(Instr, self).__init__(*expressions)

    def as_postgresql(self, compiler, connection):
        return self.as_sql(compiler, connection, function='STRPOS')
Banjermasin answered 29/9, 2017 at 23:36 Comment(1)
That's nice @BearBrown :)Cherice

© 2022 - 2024 — McMap. All rights reserved.