Is there a decorator to simply cache function return values?
Consider the following:

@property
def name(self):
    if not hasattr(self, '_name'):
        # expensive calculation
        self._name = 1 + 1
    return self._name

I'm new to Python, but I think the caching could be factored out into a decorator; I just didn't find one like it ;)

PS the real calculation doesn't depend on mutable values

Rosie answered 2/5, 2009 at 16:15 Comment(2)
There may be a decorator out there that has some capability like that, but you haven't thoroughly specified what you want. What kind of caching backend are you using? And how will the value be keyed? I'm assuming from your code that what you are really asking for is a cached read-only property.Nonconformist
There are memoizing decorators that perform what you call "caching"; they typically work on functions as such (whether meant to become methods or not) whose results depend on their arguments (not on mutable things such as self!-) and so keep a separate memo-dict.Apron
61

Python 3.8 functools.cached_property decorator

https://docs.python.org/dev/library/functools.html#functools.cached_property

cached_property from Werkzeug was mentioned at: https://mcmap.net/q/108557/-is-there-a-decorator-to-simply-cache-function-return-values but a supposedly derived version will be merged into 3.8, which is awesome.

This decorator can be seen as caching @property, or as a cleaner @functools.lru_cache for when you don't have any arguments.

The docs say:

@functools.cached_property(func)

Transform a method of a class into a property whose value is computed once and then cached as a normal attribute for the life of the instance. Similar to property(), with the addition of caching. Useful for expensive computed properties of instances that are otherwise effectively immutable.

Example:

import statistics
from functools import cached_property

class DataSet:
    def __init__(self, sequence_of_numbers):
        self._data = sequence_of_numbers

    @cached_property
    def stdev(self):
        return statistics.stdev(self._data)

    @cached_property
    def variance(self):
        return statistics.variance(self._data)

New in version 3.8.

Note: This decorator requires that the __dict__ attribute on each instance be a mutable mapping. This means it will not work with some types, such as metaclasses (since the __dict__ attributes on type instances are read-only proxies for the class namespace), and those that specify __slots__ without including __dict__ as one of the defined slots (as such classes don't provide a __dict__ attribute at all).

Planospore answered 12/5, 2019 at 9:15 Comment(1)
Can this be combined with the standard library shelve? I want properties cached to disk. Is there anything like that?Karilynn
303

Starting from Python 3.2 there is a built-in decorator:

@functools.lru_cache(maxsize=100, typed=False)

Decorator to wrap a function with a memoizing callable that saves up to the maxsize most recent calls. It can save time when an expensive or I/O bound function is periodically called with the same arguments.

Example of an LRU cache for computing Fibonacci numbers:

from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    if n < 2:
        return n
    return fib(n-1) + fib(n-2)

>>> print([fib(n) for n in range(16)])
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610]

>>> print(fib.cache_info())
CacheInfo(hits=28, misses=16, maxsize=None, currsize=16)

If you are stuck with Python 2.x, here's a list of other compatible memoization libraries:

Colloid answered 12/3, 2012 at 20:28 Comment(7)
Backport code.activestate.com/recipes/…Bowlder
the backport can now be found here: pypi.python.org/pypi/backports.functools_lru_cacheDisestablish
@gerrit in theory it works for hashable objects in general - although some hashable objects are only equal if they are the same object (like user-defined objects without an explicit __hash__() function).Hite
@Hite It works, but wrongly. If I pass a hashable, mutable argument, and change the value of the object after the first call of the function, the second call will return the changed, not the original, object. That is almost certainly not what the user wants. For it to work for mutable arguments would require lru_cache to make a copy of whatever result it's caching, and no such copy is being made in the functools.lru_cache implementation. Doing so would also risk creating hard-to-find memory problems when used to cache a large object.Mithridatism
@Mithridatism Would you mind following up here: #44583881 ? I didn't entirely follow your example.Hite
@Hite I'm not sure now. I suspect my argument applies to my own attempt to make a cache function that works with NumPy arrays, not to the built-in lru_cache function. I think mutable hashable objects are pretty rare anyway. Sorry for the confusion!Mithridatism
1) Where is the cache saved? RAM or disk? 2) For how long the cache is saved?Bankable
59

functools.cache has been released in Python 3.9 (docs):

from functools import cache

@cache
def factorial(n):
    return n * factorial(n-1) if n else 1

In previous Python versions, one of the early answers is still a valid solution: Using lru_cache as an ordinary cache without the limit and lru features. (docs)

If maxsize is set to None, the LRU feature is disabled and the cache can grow without bound.

Here is a prettier version of it:

from functools import lru_cache

cache = lru_cache(maxsize=None)

@cache
def func(param1):
    pass
Cassiodorus answered 26/10, 2020 at 12:26 Comment(3)
I have tested it, and it only makes execution time grow! Besides, the second, third, ... executions take the same time...Miscellanea
It will help if you call the same function with the same parameters again. Are you sure that parameters are the same?Cassiodorus
Oh, I see now!! (what a shame ~_~) For "for _ in range(1500): factorial(496)" it took 0.0941 uncached and only 0.0003 in the cached variant. I thought this would be cached across runs of the file, not just during one program run, meh...Miscellanea
40

It sounds like you're not asking for a general-purpose memoization decorator (i.e., you're not interested in the general case where you want to cache return values for different argument values). That is, you'd like to have this:

x = obj.name  # expensive
y = obj.name  # cheap

while a general-purpose memoization decorator would give you this:

x = obj.name()  # expensive
y = obj.name()  # cheap

I submit that the method-call syntax is better style, because it suggests the possibility of expensive computation while the property syntax suggests a quick lookup.

[Update: The class-based memoization decorator I had linked to and quoted here previously doesn't work for methods. I've replaced it with a decorator function.] If you're willing to use a general-purpose memoization decorator, here's a simple one:

def memoize(function):
  memo = {}
  def wrapper(*args):
    if args in memo:
      return memo[args]
    else:
      rv = function(*args)
      memo[args] = rv
      return rv
  return wrapper

Example usage:

@memoize
def fibonacci(n):
  if n < 2: return n
  return fibonacci(n - 1) + fibonacci(n - 2)

Another memoization decorator with a limit on the cache size can be found here.

Cypress answered 2/5, 2009 at 16:42 Comment(11)
None of the class-based decorators mentioned in the answers work for methods, probably because only self gets passed. The others work fine, but it's crufty to store values in functions.Rosie
I think you may run into a problem if args is not hashable.Rotberg
@Rotberg Yes, the first decorator that I quoted here is limited to hashable types. The one at ActiveState (with the cache size limit) pickles the arguments into a (hashable) string which is of course more expensive but more general.Cypress
@vanity Thanks for pointing out the limitations of the class-based decorators. I've revised my answer to show a decorator function, which works for methods (I actually tested this one).Cypress
You might use an MD5 digest of the args to make it hash-able. Not sure if that's super performant or not.Colicroot
There is also the problem of handling calls with keyword arguments: this solution fails in this case.Gradygrae
Can you explain how the cache work? Because you initialize the memo = {} in function memorize. Then when you call different fibonacci , the fibonacci s call the same decorator memorize?Paulo
@SiminJie The decorator is only called once, and the wrapped function it returns is the same one used for all the different calls to fibonacci. That function always uses the same memo dictionary.Cypress
Sorry about the downvote. I don't know when it happened, but I must have clicked unintentionally.Newmann
@NathanKitchen Minor doubt: memo = {} is defined in a local scope. Shouldn't that be garbage collected when the decorator returns? (And hence the subsequent calls to the decorated function should return an error. Why does it not?)Namara
@Namara The wrapper function has a reference to memo, so it's not garbage-collected.Cypress
28
class memoize(dict):
    def __init__(self, func):
        self.func = func

    def __call__(self, *args):
        return self[args]

    def __missing__(self, key):
        result = self[key] = self.func(*key)
        return result

Sample uses:

>>> @memoize
... def foo(a, b):
...     return a * b
>>> foo(2, 4)
8
>>> foo
{(2, 4): 8}
>>> foo('hi', 3)
'hihihi'
>>> foo
{(2, 4): 8, ('hi', 3): 'hihihi'}
Neath answered 21/3, 2014 at 7:27 Comment(4)
Strange! How does this work? It does not seem like other decorators I've seen.Interstratify
This solution returns a TypeError if one uses keyword arguments, e.g. foo(3, b=5)Strontia
The problem with this solution is that it doesn't have a memory limit. As for named arguments, you can just add them to __call__ and __missing__ as **kwargsPreserve
This doesn't seem to work for class functions, because there a TypeError is raised in __missing__: missing 1 required positional argument: 'self'Flame
13

Werkzeug has a cached_property decorator (docs, source)

Obtrusive answered 14/3, 2011 at 5:48 Comment(2)
Yes. This is worthwhile to distinguish from the general memoization case, as standard memoization doesn't work if the class isn't hashable.Maciemaciel
Now in Python 3.8: docs.python.org/dev/library/…Planospore
10

I coded this simple decorator class to cache function responses. I find it VERY useful for my projects:

from datetime import datetime, timedelta 

class cached(object):
    def __init__(self, *args, **kwargs):
        self.cached_function_responses = {}
        self.default_max_age = kwargs.get("default_cache_max_age", timedelta(seconds=0))

    def __call__(self, func):
        def inner(*args, **kwargs):
            max_age = kwargs.get('max_age', self.default_max_age)
            if not max_age or func not in self.cached_function_responses or (datetime.now() - self.cached_function_responses[func]['fetch_time'] > max_age):
                if 'max_age' in kwargs: del kwargs['max_age']
                res = func(*args, **kwargs)
                self.cached_function_responses[func] = {'data': res, 'fetch_time': datetime.now()}
            return self.cached_function_responses[func]['data']
        return inner

The usage is straightforward:

import time
from datetime import datetime, timedelta

@cached
def myfunc(a):
    print("in func")
    return (a, datetime.now())

@cached(default_cache_max_age=timedelta(seconds=6))
def cacheable_test(a):
    print("in cacheable test: ")
    return (a, datetime.now())


print(cacheable_test(1, max_age=timedelta(seconds=5)))
print(cacheable_test(2, max_age=timedelta(seconds=5)))
time.sleep(7)
print(cacheable_test(3, max_age=timedelta(seconds=5)))
Wenger answered 7/6, 2015 at 21:53 Comment(2)
Your first @cached is missing parenthesis. Else it will only return the cached object in place of myfunc and when called as myfunc() then inner will always be returned as a return valueAcromegaly
also cache only on function returning the same response for different argumentsLeptosome
9

Try joblib https://joblib.readthedocs.io/en/latest/memory.html

from joblib import Memory

cachedir = './joblib_cache'  # any writable directory

# customize the decorator
memory = Memory(cachedir, verbose=0)

@memory.cache
def f(x):
    print('Running f(%s)' % x)
    return x
Reger answered 3/3, 2018 at 8:19 Comment(2)
Note cache gets wiped out upon code edits. ai2-tango tries to get around this by adding a version property instead of relying on source code edits. Docs: ai2-tango.readthedocs.io/en/latest/api/components/…Sharynshashlik
and ai2-tango.readthedocs.io/en/latest/…Sharynshashlik
8

DISCLAIMER: I'm the author of kids.cache.

You should check kids.cache; it provides a @cache decorator that works on Python 2 and Python 3. No dependencies, ~100 lines of code. It's very straightforward to use; with your code in mind, for instance, you could use it like this:

pip install kids.cache

Then

from kids.cache import cache
...
class MyClass(object):
    ...
    @cache            # <-- That's all you need to do
    @property
    def name(self):
        return 1 + 1  # supposedly expensive calculation

Or you could put the @cache decorator after the @property (same result).

Using a cache on a property is called lazy evaluation; kids.cache can do much more (it works on functions with any arguments, properties, any type of method, and even classes). For advanced users, kids.cache supports cachetools, which provides fancy cache stores for Python 2 and Python 3 (LRU, LFU, TTL, RR caches).

IMPORTANT NOTE: the default cache store of kids.cache is a standard dict, which is not recommended for long-running programs with ever-different queries, as it would lead to an ever-growing cache store. For that usage you can plug in other cache stores, for instance with @cache(use=cachetools.LRUCache(maxsize=2)) to decorate your function/property/class/method.

Unstick answered 27/4, 2015 at 8:46 Comment(5)
This module seems to result in a slow import time on python 2 ~0.9s (see: pastebin.com/raw/aA1ZBE9Z). I suspect that this is due to this line github.com/0k/kids.cache/blob/master/src/kids/__init__.py#L3 (c.f setuptools entry points). I am creating an issue for this.Kinlaw
Here is an issue for the above github.com/0k/kids.cache/issues/9 .Kinlaw
This would leads to memory leak.Souza
@Unstick create an instance c of MyClass, and inspect it with objgraph.show_backrefs([c], max_depth=10), there is a ref chain from the class object MyClass to c. That is to say, c would never been released until the MyClass been released.Souza
@TimothyZhang you are invited and welcome to add your concerns in github.com/0k/kids.cache/issues/10 . Stackoverflow is not the right place to have a proper discussion on that. And further clarification are needed. Thank you for your feedback.Unstick
7

Ah, just needed to find the right name for this: "Lazy property evaluation".

I do this a lot too; maybe I'll use that recipe in my code sometime.
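A minimal sketch of such a lazy property (the name lazy_property is my own; functools.cached_property in Python 3.8+ does essentially this): compute on first access, stash the result on the instance, and return the stashed value afterwards.

```python
import functools

def lazy_property(func):
    """Compute the property on first access, then cache it on the instance."""
    attr = '_lazy_' + func.__name__

    @property
    @functools.wraps(func)
    def wrapper(self):
        if not hasattr(self, attr):
            # first access: run the expensive calculation once
            setattr(self, attr, func(self))
        return getattr(self, attr)

    return wrapper

class Circle:
    def __init__(self, radius):
        self.radius = radius

    @lazy_property
    def area(self):
        return 3.14159 * self.radius ** 2
```

Unlike lru_cache, the cached value lives on the instance itself, so it is garbage-collected together with the object.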

Sculpin answered 3/5, 2009 at 3:25 Comment(0)
5

There is yet another example of a memoize decorator at Python Wiki:

http://wiki.python.org/moin/PythonDecoratorLibrary#Memoize

That example is a bit smart: it won't cache the results if the parameters are mutable. (Check that code; it's very simple and interesting!)
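The idea can be sketched like this (a simplified, hypothetical version, not the wiki code itself): try the dict lookup, and when the arguments are unhashable (i.e. mutable), fall back to calling the function uncached.

```python
import functools

def memoize(func):
    """Cache results keyed on the positional arguments; skip caching
    when the arguments are unhashable (i.e. mutable)."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        try:
            hash(args)  # a tuple containing a mutable object raises TypeError
        except TypeError:
            return func(*args)  # mutable arguments: call through, uncached
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper

@memoize
def concat(items, sep):
    return sep.join(items)
```

Calling concat(('a', 'b'), '-') gets cached; concat(['a', 'b'], '-') still works but is recomputed each time.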

Antifederalist answered 14/1, 2010 at 0:40 Comment(0)
4

If you are using the Django framework, it provides @cache_page(time) to cache a view or an API response, and there are other options as well.

Example:

@cache_page(60 * 15, cache="special_cache")
def my_view(request):
    ...

More details can be found here.

Kylander answered 9/5, 2013 at 10:0 Comment(0)
4

There is fastcache, which is "C implementation of Python 3 functools.lru_cache. Provides speedup of 10-30x over standard library."

Same as chosen answer, just different import:

from fastcache import lru_cache
@lru_cache(maxsize=128, typed=False)
def f(a, b):
    pass

Also, it comes installed in Anaconda, unlike functools which needs to be installed.

Ingham answered 8/11, 2018 at 13:6 Comment(2)
functools is part of the standard library, the link you've posted is to a random git fork or something else...Marque
This is completely outdated, the standard library is now much fasterSaprophagous
3

Along with the Memoize Example I found the following python packages:

  • cachepy: allows setting a TTL and/or a maximum number of calls for cached functions; one can also use an encrypted file-based cache...
  • percache
Lor answered 26/4, 2016 at 6:36 Comment(0)
3

@lru_cache does not play well with default keyword arguments: calls that rely on a default and calls that pass the same value explicitly are cached as separate entries.

my @mem decorator:

import inspect
from copy import deepcopy
from functools import lru_cache, wraps
from typing import Any, Callable, Dict, Iterable


# helper
def get_all_kwargs_values(f: Callable, kwargs: Dict[str, Any]) -> Iterable[Any]:
    default_kwargs = {
        k: v.default
        for k, v in inspect.signature(f).parameters.items()
        if v.default is not inspect.Parameter.empty
    }

    all_kwargs = deepcopy(default_kwargs)
    all_kwargs.update(kwargs)

    for key in sorted(all_kwargs.keys()):
        yield all_kwargs[key]


# the best decorator
def mem(func: Callable) -> Callable:
    cache = dict()

    @wraps(func)
    def wrapper(*args, **kwargs) -> Any:
        all_kwargs_values = get_all_kwargs_values(func, kwargs)
        params = (*args, *all_kwargs_values)
        _hash = hash(params)

        if _hash not in cache:
            cache[_hash] = func(*args, **kwargs)

        return cache[_hash]

    return wrapper


# some logic
def counter(*args) -> int:
    print(f'* not_cached:', end='\t')
    return sum(args)


@mem
def check_mem(a, *args, z=10) -> int:
    return counter(a, *args, z)


@lru_cache
def check_lru(a, *args, z=10) -> int:
    return counter(a, *args, z)


def test(func) -> None:
    print(f'\nTest {func.__name__}:')

    print('*', func(1, 2, 3, 4, 5))
    print('*', func(1, 2, 3, 4, 5))
    print('*', func(1, 2, 3, 4, 5, z=6))
    print('*', func(1, 2, 3, 4, 5, z=6))
    print('*', func(1))
    print('*', func(1, z=10))


def main():
    test(check_mem)
    test(check_lru)


if __name__ == '__main__':
    main()

output:

Test check_mem:
* not_cached:   * 25
* 25
* not_cached:   * 21
* 21
* not_cached:   * 11
* 11

Test check_lru:
* not_cached:   * 25
* 25
* not_cached:   * 21
* 21
* not_cached:   * 11
* not_cached:   * 11
Scathe answered 24/12, 2018 at 23:57 Comment(0)
3

Create your own decorator and use it

from django.core.cache import cache
import functools

def cache_returned_values(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key = "choose a unique key here"  # e.g. derive it from func.__name__ and the call arguments
        results = cache.get(key)
        if results is None:  # "if not results" would re-run the function for falsy cached values
            results = func(*args, **kwargs)
            cache.set(key, results)
        return results

    return wrapper

Now at the function side

@cache_returned_values
def get_some_values(args):
  return x
Neocene answered 2/12, 2022 at 1:51 Comment(0)
2

I implemented something like this, using pickle for persistence and sha1 for short, almost-certainly-unique IDs. Basically, the cache hashed the code of the function and the hash of the arguments to get a sha1, then looked for a file with that sha1 in its name. If it existed, it opened it and returned the result; if not, it called the function and saved the result (optionally only saving if it took a certain amount of time to process).

That said, I'd swear I found an existing module that did this and find myself here trying to find that module... The closest I can find is this, which looks about right: http://chase-seibert.github.io/blog/2011/11/23/pythondjango-disk-based-caching-decorator.html

The only problem I see with that is it wouldn't work well for large inputs since it hashes str(arg), which isn't unique for giant arrays.

It would be nice if there were a unique_hash() protocol that had a class return a secure hash of its contents. I basically manually implemented that for the types I cared about.
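A rough sketch of the approach described above (all names are hypothetical; keyed on a sha1 of the function's source plus a repr of the arguments, with pickle for persistence):

```python
import functools
import hashlib
import inspect
import os
import pickle
import tempfile

CACHE_DIR = tempfile.gettempdir()  # assumption: any writable directory would do

def disk_cache(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Key on the function's code plus a repr of the arguments.
        try:
            code = inspect.getsource(func)
        except (OSError, TypeError):  # source unavailable (e.g. interactive session)
            code = func.__qualname__
        key = code + repr(args) + repr(sorted(kwargs.items()))
        digest = hashlib.sha1(key.encode()).hexdigest()
        path = os.path.join(CACHE_DIR, 'diskcache_' + digest + '.pkl')
        if os.path.exists(path):
            with open(path, 'rb') as f:
                return pickle.load(f)  # cache hit: read the pickled result
        result = func(*args, **kwargs)
        with open(path, 'wb') as f:
            pickle.dump(result, f)  # cache miss: compute and persist
        return result
    return wrapper

@disk_cache
def square(x):
    return x * x
```

As noted, repr() of the arguments is not a unique key for giant arrays; a real implementation would need something like the unique_hash() protocol described above.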

Suzisuzie answered 12/9, 2013 at 21:1 Comment(0)
2

If you are using Django and want to cache views, see Nikhil Kumar's answer.


But if you want to cache ANY function results, you can use django-cache-utils.

It reuses Django caches and provides easy to use cached decorator:

from cache_utils.decorators import cached

@cached(60)
def foo(x, y=0):
    print('foo is called')
    return x + y
Showboat answered 15/3, 2018 at 23:0 Comment(0)
2

Function cache simple solution

with ttl (time to live) and max_entries

  • doesn't work when the decorated function takes unhashable types as input (e.g. dicts)
  • optional parameter: ttl (time to live for every entry)
  • optional parameter: max_entries (caps the number of cached argument combinations so the storage doesn't get cluttered)
  • make sure the function has no important side effects

Example use

import time
from datetime import timedelta

@cache(ttl=timedelta(minutes=3), max_entries=300)
def add(a, b):
    time.sleep(2)
    return a + b

@cache()
def subtract(a, b):
    time.sleep(2)
    return a - b

a = 5
# function is called with argument combinations the first time -> it takes some time
for i in range(5):
    print(add(a, i))

# function is called with same arguments again? -> will answer from cache
for i in range(5):
    print(add(a, i))

Copy the decorator code

from datetime import datetime, timedelta

def cache(**kwargs):
    def decorator(function):
        # static function variable for the cache, lazily initialized
        if not hasattr(function, 'cache'):
            function.cache = {}

        def wrapper(*args):
            # if nothing valid is in the cache for these args, insert something
            if args not in function.cache or datetime.now() > function.cache[args]['expiry']:
                if 'max_entries' in kwargs:
                    max_entries = kwargs['max_entries']
                    if max_entries is not None and len(function.cache) >= max_entries:
                        now = datetime.now()
                        # delete the first expired entry that can be found (lazy deletion)
                        for key in function.cache:
                            if function.cache[key]['expiry'] < now:
                                del function.cache[key]
                                break
                        # if nothing expired is deletable, delete the oldest entry
                        if len(function.cache) >= max_entries:
                            del function.cache[next(iter(function.cache))]
                function.cache[args] = {
                    'result': function(*args),
                    'expiry': datetime.max if 'ttl' not in kwargs else datetime.now() + kwargs['ttl'],
                }

            # answer from cache
            return function.cache[args]['result']
        return wrapper
    return decorator
Maniac answered 16/5, 2021 at 9:46 Comment(1)
Adding a TTL is a good idea. However, the time complexity is O(max_iters) if max_iters is set because of the for key in function.cache.keys() operation. You may think of a way to remove expired items when they are requested (lazy) or when the dict is full (remove the first one in the dict. dict keeps insertion order in Python 3.7+. You can use OrderedDict for older versions)Cassiodorus
1
from functools import wraps


def cache(maxsize=128):
    cache = {}

    def decorator(func):
        @wraps(func)
        def inner(*args, no_cache=False, **kwargs):
            if no_cache:
                return func(*args, **kwargs)

            key_base = "_".join(str(x) for x in args)
            key_end = "_".join(f"{k}:{v}" for k, v in kwargs.items())
            key = f"{key_base}-{key_end}"

            if key in cache:
                return cache[key]

            res = func(*args, **kwargs)

            if len(cache) >= maxsize:
                # evict the oldest entry to make room
                del cache[list(cache.keys())[0]]
            cache[key] = res

            return res

        return inner

    return decorator


def async_cache(maxsize=128):
    cache = {}

    def decorator(func):
        @wraps(func)
        async def inner(*args, no_cache=False, **kwargs):
            if no_cache:
                return await func(*args, **kwargs)

            key_base = "_".join(str(x) for x in args)
            key_end = "_".join(f"{k}:{v}" for k, v in kwargs.items())
            key = f"{key_base}-{key_end}"

            if key in cache:
                return cache[key]

            res = await func(*args, **kwargs)

            if len(cache) >= maxsize:
                # evict the oldest entry to make room
                del cache[list(cache.keys())[0]]
            cache[key] = res

            return res

        return inner

    return decorator

Example use

import asyncio
import aiohttp


# Removes the aiohttp ClientSession instance warning.
class HTTPSession(aiohttp.ClientSession):
    """ Abstract class for aiohttp. """
    
    def __init__(self, loop=None) -> None:
        super().__init__(loop=loop or asyncio.get_event_loop())

    def __del__(self) -> None:
        if not self.closed:
            self.loop.run_until_complete(self.close())
            self.loop.close()

session = HTTPSession()

@async_cache()
async def query(url, method="get", res_method="text", *args, **kwargs):
    async with getattr(session, method.lower())(url, *args, **kwargs) as res:
        return await getattr(res, res_method)()


async def get(url, *args, **kwargs):
    return await query(url, "get", *args, **kwargs)
 

async def post(url, *args, **kwargs):
    return await query(url, "post", *args, **kwargs)

async def delete(url, *args, **kwargs):
    return await query(url, "delete", *args, **kwargs)
Metamorphosis answered 21/1, 2022 at 0:10 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.Ferriage
