Clearing lru_cache of certain methods when an attribute of the class is updated?

I have an object with a method/property multiplier. This method is called many times in my program, so I've decided to use lru_cache() on it to improve the execution speed. As expected, it is much faster.

However, the cached value goes stale when an attribute it depends on changes. The following code shows the problem:

from functools import lru_cache

class MyClass(object):
    def __init__(self):
        self.current_contract = 201706
        self.futures = {201706: {'multiplier': 1000},
                        201712: {'multiplier': 25}}

    @property
    @lru_cache()
    def multiplier(self):
        return self.futures[self.current_contract]['multiplier']

CF = MyClass()
assert CF.multiplier == 1000

CF.current_contract = 201712
assert CF.multiplier == 25

The second assert fails: the cached value is still 1000, because lru_cache() keys the result on self only and is unaware that the underlying attribute current_contract has changed.
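
The stale hit can be confirmed from the cache statistics. The following is a minimal sketch that reuses the class from the question; the names first, second and info are only illustrative. lru_cache() builds its key from the call arguments, which here is just self, and the default hash of an instance does not depend on its attributes.

```python
from functools import lru_cache

class MyClass(object):
    def __init__(self):
        self.current_contract = 201706
        self.futures = {201706: {'multiplier': 1000},
                        201712: {'multiplier': 25}}

    @property
    @lru_cache()
    def multiplier(self):
        return self.futures[self.current_contract]['multiplier']

CF = MyClass()
first = CF.multiplier           # computed, then cached under a key built from CF
CF.current_contract = 201712
second = CF.multiplier          # same key, so the cached value is returned

# The property object lives on the class; its .fget is the lru_cache wrapper.
info = vars(MyClass)['multiplier'].fget.cache_info()
print(first, second, info.hits, info.misses)
```

The second access is counted as a cache hit, which is why the getter body never runs again after current_contract changes.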

Is there a way to clear the cache when self.current_contract is updated?

Thanks!

Bloodstained answered 24/7, 2017 at 14:1 Comment(0)

Yes, quite simply: make current_contract a read/write property and clear the cache in the property's setter:

from functools import lru_cache

class MyClass(object):
    def __init__(self):
        self.futures = {201706: {'multiplier': 1000},
                        201712: {'multiplier': 25}}
        self.current_contract = 201706

    @property
    def current_contract(self):
        return self._current_contract

    @current_contract.setter
    def current_contract(self, value):
        self._current_contract = value
        type(self).multiplier.fget.cache_clear()

    @property
    @lru_cache()
    def multiplier(self):
        return self.futures[self.current_contract]['multiplier']

NB: I assume your real use case involves costly computations rather than a mere dict lookup; otherwise lru_cache might be a bit overkill ;)
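
A quick usage check of this approach might look like the sketch below. It repeats the class so that it runs on its own; the variables cf, other, cache, size_before and size_after are illustrative names, not part of the answer. It also shows why the setter reaches the cache through type(self): instance access yields the computed value, while class access yields the property object.

```python
from functools import lru_cache

class MyClass(object):
    def __init__(self):
        self.futures = {201706: {'multiplier': 1000},
                        201712: {'multiplier': 25}}
        self.current_contract = 201706   # goes through the setter below

    @property
    def current_contract(self):
        return self._current_contract

    @current_contract.setter
    def current_contract(self, value):
        self._current_contract = value
        # Class-level access returns the property object itself;
        # its .fget is the lru_cache wrapper, which provides cache_clear().
        type(self).multiplier.fget.cache_clear()

    @property
    @lru_cache()
    def multiplier(self):
        return self.futures[self.current_contract]['multiplier']

cf = MyClass()
assert cf.multiplier == 1000
cf.current_contract = 201712
assert cf.multiplier == 25

# Instance access runs the getter; class access gives the property object.
assert isinstance(cf.multiplier, int)
assert isinstance(type(cf).multiplier, property)

# The cache belongs to the class, so any setter call empties it for everyone.
cache = type(cf).multiplier.fget
size_before = cache.cache_info().currsize
other = MyClass()                # __init__ calls the setter, which clears the cache
size_after = cache.cache_info().currsize
print(size_before, size_after)
```

Note that the cache is shared by all instances, so setting current_contract on one object also discards the cached values of every other object.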

Grind answered 24/7, 2017 at 14:33 Comment(6)
Actually it's really a mere dict lookup, but it's called hundreds of thousands of times in my program and using lru_cache made a big difference. I'll test it again with the new code. Many thanks for your help! It solved the problem. - Bloodstained
If you have such a need for optimizations, you may want to use self._current_contract instead of self.current_contract in multiplier (to avoid the property call / method call / attribute resolution overhead), and possibly just make multiplier a plain attribute that gets set in the current_contract setter. Note that I haven't done any benchmarking, so you may want to timeit first to find out which solution is actually the fastest. - Grind
That's what I did, will test it soon. Thanks for the advice. - Bloodstained
@brunodesthuilliers can you please give me some color on why you have to use type(self).multiplier.fget.cache_clear() instead of self.multiplier.fget.cache_clear()? Thanks. - Sommer
Because otherwise you'd trigger the property mechanism: self.multiplier runs the getter and evaluates to the computed value (here an int), which has no fget attribute, whereas type(self).multiplier is the property object itself. You can read the official doc about descriptors (the general mechanism that supports computed attributes) to get the details. - Grind
Hi everyone! Is there a way to clear the lru_cache for only one instance of the class? Suppose a change to the attributes of one object requires a cache flush, but another object doesn't require one. Is there any way to do that? - Buttons
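
On the per-instance question in the last comment: lru_cache() applied to a method keeps a single cache for the whole class, and cache_clear() empties all of it. One possible approach, which is not from this thread, is functools.cached_property (Python 3.8+). It stores the computed value in the instance's own __dict__, so it can be dropped for one object only. A minimal sketch under that assumption:

```python
from functools import cached_property  # available from Python 3.8

class MyClass:
    def __init__(self):
        self.futures = {201706: {'multiplier': 1000},
                        201712: {'multiplier': 25}}
        self.current_contract = 201706

    @property
    def current_contract(self):
        return self._current_contract

    @current_contract.setter
    def current_contract(self, value):
        self._current_contract = value
        # cached_property saves its result in the instance __dict__ under the
        # attribute name; removing that entry invalidates this instance only.
        self.__dict__.pop('multiplier', None)

    @cached_property
    def multiplier(self):
        return self.futures[self._current_contract]['multiplier']

a = MyClass()
b = MyClass()
assert a.multiplier == 1000 and b.multiplier == 1000

a.current_contract = 201712
assert a.multiplier == 25            # recomputed for a
assert 'multiplier' in b.__dict__    # b's cached value was left alone
assert b.multiplier == 1000
```

Whether this is faster than the lru_cache version for a plain dict lookup is something to measure with timeit rather than assume.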

Short Answer

Don't clear the cache when self.current_contract is updated. That is working against the cache and throws away information.

Instead, just add methods for __eq__ and __hash__. That will teach the cache (or any other mapping) which attributes are important for influencing the result.

Worked out example

Here we add __eq__ and __hash__ to your code. That tells the cache (or any other mapping) that current_contract is the relevant independent variable:

from functools import lru_cache

class MyClass(object):
    def __init__(self):
        self.current_contract = 201706
        self.futures = {201706: {'multiplier': 1000},
                        201712: {'multiplier': 25}}

    def __hash__(self):
        return hash(self.current_contract)

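    def __eq__(self, other):
        # Defer to the other operand when it is not a MyClass instance.
        if not isinstance(other, MyClass):
            return NotImplemented
        return self.current_contract == other.current_contract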

    @property
    @lru_cache()
    def multiplier(self):
        return self.futures[self.current_contract]['multiplier']

An immediate advantage is that as you switch between contract numbers, previous results are kept in the cache. Try switching between 201706 and 201712 a hundred times and you will get 98 cache hits and 2 cache misses:

cf = MyClass()
for i in range(50):
    cf.current_contract = 201712
    assert cf.multiplier == 25
    cf.current_contract = 201706 
    assert cf.multiplier == 1000
print(vars(MyClass)['multiplier'].fget.cache_info())

This prints:

CacheInfo(hits=98, misses=2, maxsize=128, currsize=2)
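
One consequence of defining equality this way may be worth checking before adopting it: two instances with the same current_contract compare equal and hash alike, so they share cache entries even if their futures tables differ. The sketch below illustrates this. The futures constructor argument is a hypothetical addition, made only so that two instances can hold different data.

```python
from functools import lru_cache

class MyClass(object):
    def __init__(self, futures):
        # `futures` as a constructor argument is an assumption for this demo.
        self.current_contract = 201706
        self.futures = futures

    def __hash__(self):
        return hash(self.current_contract)

    def __eq__(self, other):
        return self.current_contract == other.current_contract

    @property
    @lru_cache()
    def multiplier(self):
        return self.futures[self.current_contract]['multiplier']

a = MyClass({201706: {'multiplier': 1000}})
b = MyClass({201706: {'multiplier': 5}})

first = a.multiplier    # cache miss: computed from a.futures
second = b.multiplier   # cache hit: b == a, so a's cached result is returned

info = vars(MyClass)['multiplier'].fget.cache_info()
print(first, second, info.hits, info.misses)
```

If instances can differ in anything other than current_contract, that state would need to take part in __eq__ and __hash__ as well. Since the hash now changes whenever current_contract changes, such objects should also not be mutated while they are stored in a set or used as dict keys.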
Ibbetson answered 5/6, 2021 at 5:34 Comment(0)
