Why is using thread locals in Django bad?
Asked Answered
E

4

61

I am using thread locals to store the current user and request objects. This way I can have easy access to the request from anywhere in the programme (e.g. dynamic forms) without having to pass them around.

To implement the thread locals storage in a middleware, I followed a tutorial on the Django site: https://web.archive.org/web/20091128195932/http://code.djangoproject.com:80/wiki/CookBookThreadlocalsAndUser

This document has since been modified to suggest avoiding this technique: https://web.archive.org/web/20110504132459/http://code.djangoproject.com/wiki/CookBookThreadlocalsAndUser

From the article:

From a design point of view, threadlocals are essentially global variables, and are subject to all the usual problems of portability and predictability that global variables usually entail.

More importantly, from a security point of view, threadlocals pose a huge risk. By providing an data store that exposes the state of other threads, you provide a way for one thread in your web server to potentially modify the state of another thread in the system. If the threadlocal data contains descriptions of users or other authentication-related data, that data could be used as the basis for an attack that grants access to an unauthorized user, or exposes private details of a user. While it is possible to build a threadlocal system that is safe from this sort of attack, it's a lot easier to be defensive and build a system that isn't subject to any such vulnerability in the first place.

I understand why global variables can be bad, but in this case I'm running my own code on my own server so I can't see what danger two global variables pose.

Can someone explain the security issue involved? I have asked many people how they would hack my application if they read this article and know I'm using thread locals, yet no one has been able to tell me. I am starting to suspect that this is an opinion held by hair-splitting purists who love to pass objects explicitly.

Epidiascope answered 12/7, 2010 at 9:21 Comment(3)
By the way - do you have the original example? It's deleted now and I want to use that...Silberman
This snippet is pretty similar to the deleted page: djangosnippets.org/snippets/2179Epidiascope
GlobalRequest middleware: djangosnippets.org/snippets/2853Structural
P
65

I disagree entirely. TLS is extremely useful. It should be used with care, just as globals should be used with care; but saying it shouldn't be used at all is just as ridiculous as saying globals should never be used.

For example, I store the currently active request in TLS. This makes it accessible from my logging class, without having to pass the request around through every single interface--including many that don't care about Django at all. It lets me make log entries from anywhere in the code; the logger outputs to a database table, and if a request happens to be active when a log is made, it logs things like the active user and what was being requested.

If you don't want one thread to have the capability of modifying another thread's TLS data, then set your TLS up to prohibit this, which probably requires using a native TLS class. I don't find that argument convincing, though; if an attacker can execute arbitrary Python code as your backend, your system is already fatally compromised--he could monkey patch anything to be run later as a different user, for example.

Obviously, you'll want to clear any TLS at the end of a request; in Django, that means clearing it in process_response and process_exception in a middleware class.

Peruse answered 12/7, 2010 at 10:6 Comment(5)
Thanks for your confirmation. I now feel less insane =). If an attacker can read the threadlocal data, then he must be able to SSH into my machine anyway.Epidiascope
Not necessarily SSH, but at least to have some kind of control over the Python backend. The whole argument seems pretty contrived, anyway.Peruse
I'm curious what, exactly, you mean by "clear any TLS at the end of a request". Clear it how? Deleting the local object itself, or just the attribute on the local object in which you stored the session? Though, to be honest, I'm not even sure if there's a relevant difference between those.Blackfish
@CoreDumpError Just delete the data. If you delete the object, you'd wipe out the whole data storage, which other threads are probably still using.Peruse
One huge caveat - I found that how you import your module (absolute vs relative) has impact on if the module will be loaded again (with its own set of global variables) or not - resulting in very subtle bugs especially around global variables and threading.local ! ie. from path.to.package import global_event will result in different instance of from .package import global_eventShitty
M
15

Despite the fact that you could mix up data from different users, thread locals should be avoided because they hide a dependency. If you pass arguments to a method you see and know what you're passing. But a thread local is something like a hidden channel in the background and you may wonder, that a method doesn't work correctly in some cases.

There are some cases where thread locals are a good choice, but they should be used rarely and carefully!

Maite answered 12/7, 2010 at 9:36 Comment(1)
But this isn't a problem because I know that thread locals are used anywhere I call get_current_user(). I have only two variables in the thread locals anyway.Epidiascope
F
13

A quick example on how to create a TLS middleware compatible with the latest Django 1.10:

# coding: utf-8
# Copyright (c) Alexandre Syenchuk (alexpirine), 2016

try:
    from threading import local
except ImportError:
    from django.utils._threading_local import local

_thread_locals = local()

def get_current_request():
    return getattr(_thread_locals, 'request', None)

def get_current_user():
    request = get_current_request()
    if request:
        return getattr(request, 'user', None)

class ThreadLocalMiddleware(object):
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        _thread_locals.request = request
        return self.get_response(request)
Falchion answered 4/9, 2016 at 5:22 Comment(6)
This doesn'xt work when using UWSGI. local() is sharedHellish
Could you please explain further, @JavierBuzzi ? Do you mean that local() is shared across threads when using UWSGI? Do you have an explanation for why is that?Falchion
Because of the way UWSGI works, it shares everything among all threads to speed things up. pythonanywhere.com/forums/topic/710 -- its easy enough to test: pip install uwsgi and test it out for yourself -- as a side note, i had to create a middleware that creates cache at the beginning of the request and clear it out at the end of the request.. just to make sure -- otherwise i was getting all sorts of errors when running it inside uwsgiHellish
threading.local and Django works perfectly fine with uwsgi. Django framework itself uses it internally. I personally tested it with Django 1.11 and uwsgi 2.0.17. Of course for sensitive data you would have to know what you are doing and test very well. Because Django is running sync, one request means one thread. Because it uses a pool of threads clean-up for the threading.local() has to be done at the end of a request. This is a design problem thou, because you tightly couple components by using globals, but other than that it works.Tournament
@SebastianBrestin Sounds off, if you look at the uwsgi code you’d see that they patch threading.Hellish
@JavierBuzzi can you please provide more details about your cache implementation?Caprine
R
3

This question is really old, but I just saw someone referring to it, so I just want to note that the wiki page cited by this question stopped recommending threadlocal storage in 2010 and then was deleted altogether by 2012.

Rhigolene answered 6/10, 2019 at 20:1 Comment(1)
Fair enough... which leaves me wondering how the heck to solve the perennial problem of knowing the request.user in model.save() ??Fungiform

© 2022 - 2024 — McMap. All rights reserved.