How to decorate a Python object with a mutex

I'm new to Python and am currently trying to learn threading. I'm wary of using locks to make my resources thread-safe because they aren't inherently tied to the resource, so I'm bound to forget to acquire and/or release them every time my code interacts with the resource. Instead, I'd like to be able to "wrap" (or decorate?) an object so that all of its methods and attribute getters/setters are atomic. Something like this:

state = atomicObject(dict())

# the following is atomic/thread-safe
state["some key"] = "some value"

Is this possible? If so, what's the "best practices" way of implementing it?

EDIT: A good answer to the above question is available in How to make built-in containers (sets, dicts, lists) thread safe?. However, as abarnert and jsbueno have both demonstrated, the solution I proposed (automating locks) is not generally a good idea, because determining the proper granularity of atomic operations requires some intelligence and is likely difficult (or impossible) to automate properly.

The problem still remains that locks are not bound in any way to the resources they are meant to protect, so my new question is: What's a good way to associate a lock with an object?

Proposed solution #2: I imagine there might be a way to bind a lock to an object such that trying to access that object without first acquiring the lock throws an error, but I can see how that could get tricky.

EDIT: The following code is not very relevant to the question. I posted it to demonstrate that I had tried to solve the problem myself and gotten lost before posting this question.

For the record, I wrote the following code, but it doesn't work:

import threading    
import types
import inspect

class atomicObject(object):

    def __init__(self, obj):
        self.lock = threading.RLock()
        self.obj = obj

        # keep track of function handles for lambda functions that will be created
        self.funcs = []

        # loop through all the attributes of the passed in object
        # and create wrapped versions of each attribute
        for name in dir(self.obj):
            value = getattr(self.obj, name)
            if inspect.ismethod(value):
                # this is where things get really ugly as i try to work around the
                # limitations of lambda functions and use eval()... I'm not proud of this code
                eval("self.funcs.append(lambda self, *args, **kwargs: self.obj." + name + "(*args, **kwargs))")
                fidx = str(len(self.funcs) - 1)
                eval("self." + name + " = types.MethodType(lambda self, *args, **kwargs: self.atomize(" + fidx + ", *args, **kwargs), self)")

    def atomize(self, fidx, *args, **kwargs):
        with self.lock:
            return self.functions[fidx](*args, **kwargs)

I can create an atomicObject(dict()), but when I try to add a value to the object, I get the error: "atomicObject does not support item assignment".

Teenager answered 11/4, 2013 at 23:56 Comment(6)
Also, this code doesn't even come close to running. You're missing a colon after the with statement, you've got the wrong name for the Lock type, and I have no idea what else might be wrong beyond that. How do you expect us to debug it for you if we can't even get started?Sapotaceous
Also, this is usually not as good an idea as it sounds. For example, if I create d = atomicObject(dict()), then d['abc'] = 3, then d['abc'] += 1, that isn't atomic—it atomically reads d['abc'], then it releases the lock, then it atomically writes d['abc'], overwriting any other write made in the intervening time. (Imagine that d was a counter, and you had 20 threads all trying to do +1 at the same time. Instead of going up +20, it would likely only go up +1 or +2 or so.)Sapotaceous
I'm sorry about the sloppy code. It's the product of torture and frustration. I think I corrected some of the errors, but I mostly included my code as a courtesy. I was hoping someone could point me in the right direction since I'm obviously lost. Your answer was an enormous help. Thanks!Teenager
I think you'll find this answer to the question How to make built-in containers (sets, dicts, lists) thread safe? interesting.Soccer
@Soccer that answer is fantastic! Is there a way to mark this as a duplicate question and refer to that link?Teenager
@arachnivore: Glad you liked the linked answer (I did, too ;-). Anyway, yes, I can vote to close this question because I think it's a duplicate.Soccer

It's very hard to tell from your non-running example and your mess of eval code, but there's at least one obvious error.

Try this in your interactive interpreter:

>>> d = dict()
>>> inspect.ismethod(d.__setitem__)

As the docs say, ismethod:

Return true if the object is a bound method written in Python.

A method-wrapper written in C (or .NET, Java, the next workspace down, etc. for other Python implementations) is not a bound method written in Python.

You probably just wanted callable or inspect.isroutine here.

I can't say whether this is the only problem, because if I fix the syntax errors and name errors and this bug, the second eval line generates illegal code like this:

self.__cmp__ = types.MethodType(lambda self, *args, **kwargs: self.atomize(0, *args, **kwargs) self)

… and I'm not sure what you were trying to do there.


You really shouldn't be trying to create and eval anything. To assign attributes dynamically by name, use setattr. And you don't need complicated lambdas. Just define the wrapped function with a normal def; the result is a perfectly good local value that you can pass around, exactly like a lambda except that it has a name.
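For instance, assigning an attribute whose name is only known as a string:

```python
class Box:
    pass

b = Box()
setattr(b, "value", 42)      # equivalent to b.value = 42, but the name is a string
print(getattr(b, "value"))   # 42
```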

On top of that, trying to wrap methods statically at creation time is difficult, and has some major downsides. (For example, if the class you're wrapping has any dynamically-generated methods, you won't wrap them.) Most of the time, you're better off doing it dynamically, at call time, with __getattr__. (If you're worried about the cost of creating the wrapper functions every time they're called… First, don't worry unless you actually profile and find that it's a bottleneck, because it probably won't be. But, if it is, you can easily add a cache of generated functions.)
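If the cost ever does matter, a cached variant of the dynamic approach could look like this (a sketch; AtomicProxy is just an illustrative name):

```python
import threading

class AtomicProxy(object):
    """Wrap the target's callables lazily in __getattr__, caching the wrappers."""

    def __init__(self, obj):
        self._lock = threading.RLock()
        self._obj = obj
        self._cache = {}

    def __getattr__(self, name):
        # only reached when normal lookup fails, i.e. for the target's attributes
        if name in self._cache:
            return self._cache[name]
        attr = getattr(self._obj, name)
        if not callable(attr):
            return attr
        def atomized(*args, **kwargs):
            with self._lock:
                return attr(*args, **kwargs)
        self._cache[name] = atomized
        return atomized

p = AtomicProxy(dict())
p.update({'a': 1})
print(p.get('a'))   # 1
```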

So, here's a much simpler, and working, implementation of what I think you're trying to do:

import threading

class atomicObject(object):

    def __init__(self, obj):
        self.lock = threading.Lock()
        self.obj = obj

    def __getattr__(self, name):
        # only called when normal lookup fails, so this sees the
        # wrapped object's attributes, not atomicObject's own
        attr = getattr(self.obj, name)
        if callable(attr):
            def atomized(*args, **kwargs):
                with self.lock:
                    return attr(*args, **kwargs)  # don't swallow the return value
            return atomized
        return attr

However, this isn't going to actually do what you want. For example:

>>> d = atomicObject(dict())
>>> d.update({'a': 4}) # works
>>> d['b'] = 5
TypeError: 'atomicObject' object does not support item assignment

Why does this happen? You've got a __setitem__, and it works:

>>> d.__setitem__
<method-wrapper '__setitem__' of dict object at 0x100706830>
>>> d.__setitem__('b', 5) # works

The problem is that, as the docs imply, special methods are looked up on the class, not the object. And the atomicObject class doesn't have a __setitem__ method.

In fact, this means you can't even usefully print out your object, because you just get the default __str__ and __repr__ from object:

>>> d
<__main__.atomicObject object at 0x100714690>
>>> print(d)
<__main__.atomicObject object at 0x100714690>
>>> d.obj #cheating
{'a': 4, 'b': 5}

So, the right thing to do here is to write a function that defines a wrapper class for any class, then do:

>>> AtomicDict = make_atomic_wrapper(dict)
>>> d = AtomicDict()
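A sketch of such a factory (make_atomic_wrapper is an illustrative name; classmethods such as dict.fromkeys and staticmethods are not handled, so this is not a complete implementation):

```python
import functools
import threading

def make_atomic_wrapper(cls):
    """Build a subclass of cls whose methods, including the special
    methods that Python looks up on the class, hold a shared lock."""
    skipped = {'__new__', '__init__', '__class__', '__init_subclass__',
               '__subclasshook__', '__getattribute__', '__setattr__',
               '__delattr__', '__class_getitem__'}
    namespace = {}
    for name in dir(cls):
        attr = getattr(cls, name)
        if name in skipped or not callable(attr):
            continue
        def make_wrapper(method):
            @functools.wraps(method)
            def wrapper(self, *args, **kwargs):
                with self._lock:
                    return method(self, *args, **kwargs)
            return wrapper
        namespace[name] = make_wrapper(attr)

    def __init__(self, *args, **kwargs):
        self._lock = threading.RLock()  # re-entrant, in case wrapped calls nest
        cls.__init__(self, *args, **kwargs)

    namespace['__init__'] = __init__
    return type('Atomic' + cls.__name__, (cls,), namespace)

AtomicDict = make_atomic_wrapper(dict)
d = AtomicDict()
d['a'] = 4   # goes through the wrapped __setitem__ on the class
print(d)     # printing works too, via the wrapped __repr__
```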

But, even after you do all of that… this is rarely as good an idea as it sounds.

Consider this:

d = AtomicDict()
d['abc'] = 0
d['abc'] += 1

That last line is not atomic. There's an atomic __getitem__, then a separate atomic __setitem__.

That may not sound like a big deal, but imagine that d is being used as a counter. You've got 20 threads all trying to do d['abc'] += 1 at the same time. The first one to get in on the __getitem__ will get back 0. And if it's the last one to get in on the __setitem__, it'll set it to 1.

Try running this example. With proper locking, it should always print out 2000. But on my laptop, it's usually closer to 125.
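A sketch in the same spirit as that example (not the original pastebin code), shown here with the lock correctly covering the whole read-modify-write so the result is deterministic:

```python
import threading

counter = 0
lock = threading.Lock()

def incr():
    global counter
    for _ in range(100):
        # the lock covers the whole read-modify-write,
        # not just the read or the write individually
        with lock:
            counter += 1

threads = [threading.Thread(target=incr) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # always 2000 with the lock in place
```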

Sapotaceous answered 12/4, 2013 at 0:29 Comment(2)
Thank you. This was extremely helpful. That confusing code that was produced by the eval statement was borrowed from here (second answer, 5th comment). You make a good point about the atomicObject() idea being sketchy. I'll have to reconsider this idea.Teenager
@abarnet - I've used your example code bellow - just to let you knowXylem

Coming back to this years later: I think a context manager is the ideal solution to my original problem. I know Locks support context management, but you're still left with the problem of enforcing the relationship between the lock and the locked resource. Instead, I imagine something like the following would work well:

class Locked:
    def __init__(self, obj):
        super().__init__()
        self.__obj = obj
        self.lock = threading.RLock()

    def __enter__(self):
        self.lock.acquire()
        return self.__obj

    def __exit__(self, *args, **kwargs):
        self.lock.release()


guard = Locked(dict())

with guard as resource:
    do_things_with(resource)
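A quick sanity check of the pattern with several threads (the class is repeated so the snippet is self-contained):

```python
import threading

class Locked:
    def __init__(self, obj):
        self.__obj = obj
        self.lock = threading.RLock()

    def __enter__(self):
        self.lock.acquire()
        return self.__obj

    def __exit__(self, *args):
        self.lock.release()

guard = Locked({'count': 0})

def worker():
    for _ in range(1000):
        with guard as d:
            d['count'] += 1  # the whole read-modify-write happens under the lock

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

with guard as d:
    print(d['count'])  # 8000
```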
Teenager answered 3/6, 2016 at 5:37 Comment(0)

I had put some thought into your question, and it would be somewhat tricky: you have to proxy not only all of the object's methods with your Atomic class (which can be done properly by writing a __getattribute__ method), but for the operators to work as well, you'd also have to give the proxied object a class that provides the same "magic double underscore" methods as the original object's class. That is, you have to dynamically create a proxied class; otherwise the operator usage itself won't be atomic.

It is doable, but since you are new to Python, you can run import this at the interactive prompt, and among the several guidelines that show up you will see: "If the implementation is hard to explain, it's a bad idea." :-)

Which brings us to: using threads in Python is generally a bad idea. Except for quasi-trivial code with lots of blocking I/O, you will prefer another approach, as threading in Python does not allow ordinary Python code to make use of more CPU cores; there is only a single thread of Python code running at once (search for "Python GIL" to learn why). The exception is when a lot of your time is spent in computationally intensive native code, such as NumPy functions.

Instead, you'd rather write your program to use asynchronous calls with one of the various frameworks available for that, or, to easily take advantage of more than one core, use multiprocessing instead of threading, which basically creates one process per "thread" and requires all sharing to be done explicitly.

Xylem answered 12/4, 2013 at 1:45 Comment(5)
My motive for using threading is not performance. I understand that it's only psudo-threading. I'm just trying to make things asynchronous. I'm working on a text-based multi-user dungeon, so performance is not a huge concern. So far, threading seems to be rather straight-forward (especially with the use of Queues), so I'm curious why you consider it a bad idea?Teenager
@arachnivore: A lot of people seem to believe that "threading is bad" (either "in Python", or "full stop"), but that's because they're overgeneralizing. For CPU parallelism, Python threads are useless, and you have to use processes. For massively-concurrent (c10k) servers, threads are too heavy-weight, and you have to use explicit event loops and callbacks/greenlets/generator coroutines. But for network clients, servers that are only meant to handle a dozen users, GUI apps, etc., threads are often the best solution.Sapotaceous
@arachnivore: Also: dealing with (mutable) shared data is a hard problem, and people often use that as an argument against threads—but it applies just as much to other forms of concurrency. (There's a reason everything from twisted to multiprocessing recommends avoiding it.)Sapotaceous
@arachnivore: indeed, threads may work in your case - you are not falling into the scenarios where they are far from the best solution in Python - but I'd think using an event loop with callbacks would be easier.Xylem
@jsbueno: An event loop is almost always harder than threading, because you can't write your sequential code sequentially anymore; you have to manually rewrite "the rest of this function" as a separate function and pass it as a callback. I said "almost" because with @inlineCallbacks/monocle/gevent/etc., you can write sequential code—but then you're effectively writing threaded code anyway, just with cooperative threads. (And it's still slightly harder than threads, because you have to either annotate your suspension points, or recognize the set of automatic ones.)Sapotaceous

The wrapt module contains a @synchronized decorator that does this.

https://pypi.python.org/pypi/wrapt

A talk describing the decorator and how it works can be found at:

Foulk answered 31/5, 2019 at 2:15 Comment(0)

Despite my other answer (which has valid considerations on Python threading and on how to turn an existing object into an "atomically" locked object), if you are defining the class of the object you want to lock atomically yourself, the whole thing is an order of magnitude simpler.

One can write a function decorator that makes a function run while holding a lock in about four lines. With that, it is possible to build a class decorator that atomically locks all methods and properties of a given class.

The code below works with Python 2 and 3 (I used @abarnet's example for the function calls, and relied on my "printing debug" for the class example).

import threading
from functools import wraps

#see https://mcmap.net/q/1681562/-how-to-decorate-a-python-object-with-a-mutex/15961762#15960881

printing = False

lock = threading.Lock()
def atomize(func):
    @wraps(func)
    def wrapper(*args, **kw):
        with lock:
            if printing:
                print ("atomic")
            return func(*args, **kw)
    return wrapper

def Atomic(cls):
    new_dict = {}
    for key, value in cls.__dict__.items():
        if hasattr(value, "__get__"):
            def get_atomic_descriptor(desc):
                class Descriptor(object):
                    @atomize
                    def __get__(self, instance, owner):
                        return desc.__get__(instance, owner)
                    if hasattr(desc, "__set__"):
                        @atomize
                        def __set__(self, instance, value):
                            return desc.__set__(instance, value)
                    if hasattr(desc, "__delete__"):
                        @atomize
                        def __delete__(self, instance):
                            return desc.__delete__(instance)
                return Descriptor()
            new_dict[key] = get_atomic_descriptor(value)
        elif callable(value):
            new_dict[key] = atomize(value)
        else:
            new_dict[key] = value
    return type.__new__(cls.__class__, cls.__name__, cls.__bases__, new_dict)


if __name__ == "__main__": # demo:
    printing = True

    @atomize
    def sum(a,b):
        return a + b

    print (sum(2,3))

    @Atomic
    class MyObject(object):
        def _get_a(self):
            return self.__a

        def _set_a(self, value):
            self.__a = value +  1

        a = property(_get_a, _set_a)

        def smurf(self, b):
            return self.a + b

    x = MyObject()
    x.a = 5
    print(x.a)
    print (x.smurf(10))

    # example of atomized function call - based on
    # @abarnet's code at http://pastebin.com/MrtR6Ufh
    import time, random
    printing = False
    x = 0

    def incr():
        global x
        for i in range(100):
            xx = x
            xx += 1
            time.sleep(random.uniform(0, 0.02))
            x = xx

    def do_it():
        threads = [threading.Thread(target=incr) for _ in range(20)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    do_it()
    print("Unlocked Run: ", x)

    x = 0
    incr = atomize(incr)
    do_it()
    print("Locked Run: ", x)

NB: although "eval" and "exec" are available in Python, serious code will seldom, and I mean seldom, need either. Even complex decorators that recreate functions can do so through introspection rather than relying on string compilation through eval.

Xylem answered 12/4, 2013 at 2:25 Comment(6)
Good job tying together all the separate pieces. One last thing to point out: atomizing the whole incr function solves the too-fine-grained locks allowing races, but only by using too-coarse-grained locks serializing all of the threads. (Each thread holds the lock for the entire course of its run, so they can only run one after another.) What you really need to lock here is in the inside of the for loop (even that means most of the time, 19 threads will be waiting on 1 thread to sleep for 10ms, but there's no better option), and there's no obvious place to add that lock from outside.Sapotaceous
More generally: Finding the right lock granularity is the hardest challenge in threaded programming, and there's no magic bullet. Except, of course, to get rid of the need for locks at the application level—only share immutable data (turn mutable shared data into immutable transformations), switch to message-passing actors, use transactions, etc.Sapotaceous
@abarnet: in respect to the 1st comment: locking the whole methods and attribute accesses is what the O.P. asked for. I agree it is bad, and running the snippet above is "worth 1000 words" :-)Xylem
I suppose what I'm really looking for is some way to enforce the use of locks. There doesn't seem to be any connection between a lock and the resource it is supposed to protect. If I create a locked resource, an error should be thrown if I try to use the resource without acquiring the lock first, and another error should be thrown if I don't release the lock. That seems like a more reasonable goal.Teenager
@arachnivore: The point is that what you're looking for is impossible. Lock discipline can't be enforced completely programmatically, because in general, the things you want to be atomic—the transactions—are not basic operations on a single object. x+=1 is just the simplest example of that; for a more complex example, think of moving money from one account to another. You either use a higher-level abstraction (transactional memory, messaging, etc.), or you do the locking manually.Sapotaceous
@abarnet "enforce the use of locks" is different from "automate the use of locks" (I understand your point that automating locks is a bad idea). What I meant was: If I create a locked resource, an error should be thrown if I try to use the resource without previously acquiring the lock (some time before using the resource), and another error should be thrown if I don't ever release the lock (which would be unnecessary if you use the "with lock:" syntax) . I'm not sure if such a thing is doable, but it is very different from automating locks.Teenager
