How to implement memoization with "pure" Functional Programming in JavaScript
Asked Answered
A

5

6

In all the examples I can find on memoization / internal function cache in Functional Programming in JavaScript, the examples are either mutating or reassigning the cache.

Here's an example taken from https://scotch.io/tutorials/understanding-memoization-in-javascript#toc-a-functional-approach

function memoizer(fun){
    let cache = {}
    return function (n){
        if (cache[n] != undefined ) {
          return cache[n]
        } else {
          let result = fun(n)
          cache[n] = result
          return result
        }
    }
}

The only tiny improvement I can come up with is to use reassigning instead of mutating the cache object:

function memoizer(fun){
    let cache = {}
    return function (n){
        if (cache[n] != undefined ) {
          return cache[n]
        } else {
          let result = fun(n)
          cache = {... cache, [n]: result}
          return result
        }
    }
}

But my question is how to do this in a "pure" functional way, without mutations or reassigning a let/var?

Afterdamp answered 7/12, 2020 at 20:4 Comment(5)
Re-creating the whole cache seems pretty inefficient.Antifreeze
You would need to wrap the function so that it takes a tuple of old cache and arguments and returns another tuple of new cache and result. This makes it very cumbersome to use, but it's possible.Sulfonmethane
I'm not saying that there's no place on SO for interesting puzzle questions (not downvoting or VTC this), but on a practical level isn't this kinda just pure wankery? Your function (and by construction all the functions you feed it) will always return the same value for the same input.Afc
I'd use a lazy unfold that represents the entire result set (codomain) of your function but only lazily, i.e. on demand evaluated. I think this is the mechanism used in purely functional programming. However, I never implemented it so its just a teaser.Auscultation
@scriptum Thanks, but can't really find much about "lazy unload" but did find this about "unload" ramdajs.com/docs/#unfold and nabilhassein.github.io/blog/unfold - but you're talking about doing it lazily / on demand. Can you give a little more info on what you had in mind?Afterdamp
A
2

This partially answers the question how to memoize all kind of functions in a purely functional manner. The solution is completely different from your imperative one and the code is not production ready but just a proof of concept. Please note that I am also a rookie on this subject.

The answer is partial, because I am only going to show a memoize function whose domain (arguments) are integers. A more advanced polymorphic memoization function can handle all kinds of argument types and also recursive functions.

The basic idea is to have a function that operates on an infinite list of integers. A list can be only infinite if it is non-strictly evaluated, that is to say only evaluated when needed and only just enough. We will later see that non-strictness is not enough for memoization, we need proper lazy evaluation.

In the first example I am going to use explicit thunks (e.g. nullary functions of shape () => "do something" for the sake of simplicity:

// INFINITE LIST

// we only mimick such a type

const iterate = f => {
  const go = x =>
    [x, () => go(f(x))];

  return go;
};

// Functor

const map = f => {
  const go = ([x, thunk]) =>
    [log(f(x)), () => go(thunk())];

  return go;
};

// element lookup

const lookup = ([x, thunk]) => i =>
  i === 0
    ? x
    : lookup(thunk()) (i - 1);

// memoization

const memoize = f =>
  lookup(
    map(f)
      (iterate(x => x + 1) (0)));

// auxiliary function

const log = x => (console.log(x), x);

const fib = n => {
  return n > 1
    ? fib(n - 1) + fib(n - 2)
    : n;
}

// MAIN

const foo = memoize(fib);

console.log("yields:", foo(6)); // logs 1, 1, 2, 3, 5, 8
console.log("yields:", foo(6)); // logs 1, 1, 2, 3, 5, 8

This works with an infinite sequence of fibonacci numbers and yields the expected result, but there is still no memoization. Right now it is just a cumbersome way to calculate places from the fibunacci sequence.

While this algorithm is non-strict, we need proper lazy evaluation as already mentioned. Lazy evaluation means non-strict evaluation with sharing of once computed thunks. We can mimic lazy evaluation with the native Proxy type in Javascript. Please note that the explicit thunks from the above example are now replaced with implicit ones, i.e. you don't need to call (thunk()) but just use them as if they were ordinary values. Here is a working sketch:

const NULL = "null";

const THUNK = "thunk";

const thunk = thunk =>
  new Proxy(thunk, new ThunkProxy());

class ThunkProxy {
  constructor() {
    this.memo = NULL;
  }

  apply(g, that, args) {
    if (this.memo === NULL) {
      this.memo = g();

      while (this.memo && this.memo[THUNK] === true)
        this.memo = this.memo.valueOf();
    }

    return this.memo(...args);
  }

  get(g, k) {
    if (k === THUNK)
      return true;

    else if (this.memo === NULL) {
      this.memo = g();
      
      while (this.memo && this.memo[THUNK] === true)
        this.memo = this.memo.valueOf();
    }

    if (k === "valueOf")
      return () => this.memo

    else if (k === "toString")
      return () => this.memo.toString();

    else if (k === Symbol.toStringTag)
      return Object.prototype.toString.call(this.memo).slice(8, -1);

    while (this.memo[k] && this.memo[k] [THUNK] === true)
      this.memo[k] = this.memo[k].valueOf();

    if (typeof this.memo[k] === "function")
      return this.memo[k].bind(this.memo);

    else return this.memo[k];
  }

  has(g, k) {
    if (k === THUNK)
      return true;

    else if (this.memo === NULL) {
      this.memo = g();

      while (this.memo && this.memo[THUNK] === true)
        this.memo = this.memo.valueOf();
    }

    return k in this.memo;
  }
}

const iterate = f => {
  const go = x =>
    [x, thunk(() => go(f(x)))];

  return go;
};

const lookup = ([head, tail]) => i =>
  i === 0
    ? head
    : lookup(tail) (i - 1);

const map = f => {
  const go = ([head, tail]) =>
    [log(f(head)), thunk(() => go(tail))];

  return go;
};

const memoize = f =>
  lookup(
    map(f)
      (iterate(x => x + 1) (0)));

const log = x => (console.log(x), x);

const fib = n => {
  return n > 1
    ? fib(n - 1) + fib(n - 2)
    : n;
}

const foo = memoize(fib);

console.log("yields:", foo(6)); // logs 0, 1, 1, 2, 3, 5, 8, "yields: 8"
console.log("yields:", foo(6)); // logs "yields: 8"
console.log("yields:", foo(10)); // logs 13, 21, 34, 55, "yields: 55"

Finally we've achieved the desired memoization effect, but the way I mimic lazy evaluation in Javascript is impure as well - I am just cheating. In a purely functional language like Haskell, however, lazy evaluation is an implementation detail of the language and thus not part of the syntax you actually use. This way the language itself remains pure.

Please note that fib is a recursive function and memoize does not handle the recursive steps but only intermediate and the final result.

Auscultation answered 8/12, 2020 at 13:29 Comment(3)
First of all thank you so much for taking the time to write this great answer. When I saw you were using a class I was first a bit skeptical, but the point you state in the 2nd last paragraph made me finally understand that the reason you can't do this 100% purely functional in JS is that it is an implementation details in a pure language such as Haskell. Thinking about that opens a new world of possibilities. I'll need some time to read and re-read your answer, to fully understand it. Thanks again, much appreciated!!Afterdamp
I've used the ThunkProxy class in more complex scenarios (a lazy list monad transformer, for instance) and it works quite reliable. There is an issue with the weak head normal form (WHNF), though. AFAIK every bare thunk (unevaluated sub-expression) in Haskell is immediately evaluated into WHNF, i.e. [1, thunk] remains as is, but thunk is automatically evaluated regarding the outermost layer. So thunk is evaluated to [1, thunk], for instance, but not any further. ThunkProxy does not mimic this behaior correctly. You have to do it manually sometimes. It is too lazy so to speak.Auscultation
Please also note that I used a simplified version of ThunkProxy. The full implementation is available on Github via my profile page.Auscultation
P
6

With the definition of a "pure function" being:

In computer programming, a pure function is a function that has the following properties:[1][2]

Its return value is the same for the same arguments (no variation with local static variables, non-local variables, mutable reference arguments or input streams from I/O devices). Its evaluation has no side effects (no mutation of local static variables, non-local variables, mutable reference arguments or I/O streams).


I think we can see that what you have is a pure function, if the same values are passed to the function, you will always get the same result; there is nothing in that method that would be affected by any external state beyond it –– that is unless the passed fun argument isn't pure, but that isn't an issue with your memoizer method itself.

EDIT Because the cache object is not a static variable, mutating it does not violate any of the rules of what makes a pure function pure. Definition of "side effects" as used by Wikipedia in the explanation above:

In computer science, an operation, function or expression is said to have a side effect if it modifies some state variable value(s) outside its local environment, that is to say has an observable effect besides returning a value (the main effect) to the invoker of the operation.

Pope answered 7/12, 2020 at 20:10 Comment(5)
There's mutation of the cache variable, that violates the "no mutation" restriction.Expatriate
"Because the cache object is not a static variable, mutating it does not violate any of the rules of what makes a pure function pure." - no, that has nothing to do with it. Mutating any non-local state is problematic, regardless whether it's "static" or "global" or something else.Sulfonmethane
@Sulfonmethane I take your point, I was trying to bridge the gap between what Barmar has said and what was on the wikipedia definition.Pope
The function might be pure, but like @Expatriate mentioned it violates the "no mutation" restriction. My question is how to change this function so that it does not violate any of the functional programming principles.Afterdamp
@Afterdamp I think people are misunderstanding what the "no mutation" restriction is. That is referring to mutating things outside the scope of the function – thereby potentially providing a different result with the same inputs. Please see this heavily upvotted SO post about function purity.Pope
A
2

This partially answers the question how to memoize all kind of functions in a purely functional manner. The solution is completely different from your imperative one and the code is not production ready but just a proof of concept. Please note that I am also a rookie on this subject.

The answer is partial, because I am only going to show a memoize function whose domain (arguments) are integers. A more advanced polymorphic memoization function can handle all kinds of argument types and also recursive functions.

The basic idea is to have a function that operates on an infinite list of integers. A list can be only infinite if it is non-strictly evaluated, that is to say only evaluated when needed and only just enough. We will later see that non-strictness is not enough for memoization, we need proper lazy evaluation.

In the first example I am going to use explicit thunks (e.g. nullary functions of shape () => "do something" for the sake of simplicity:

// INFINITE LIST

// we only mimick such a type

const iterate = f => {
  const go = x =>
    [x, () => go(f(x))];

  return go;
};

// Functor

const map = f => {
  const go = ([x, thunk]) =>
    [log(f(x)), () => go(thunk())];

  return go;
};

// element lookup

const lookup = ([x, thunk]) => i =>
  i === 0
    ? x
    : lookup(thunk()) (i - 1);

// memoization

const memoize = f =>
  lookup(
    map(f)
      (iterate(x => x + 1) (0)));

// auxiliary function

const log = x => (console.log(x), x);

const fib = n => {
  return n > 1
    ? fib(n - 1) + fib(n - 2)
    : n;
}

// MAIN

const foo = memoize(fib);

console.log("yields:", foo(6)); // logs 1, 1, 2, 3, 5, 8
console.log("yields:", foo(6)); // logs 1, 1, 2, 3, 5, 8

This works with an infinite sequence of fibonacci numbers and yields the expected result, but there is still no memoization. Right now it is just a cumbersome way to calculate places from the fibunacci sequence.

While this algorithm is non-strict, we need proper lazy evaluation as already mentioned. Lazy evaluation means non-strict evaluation with sharing of once computed thunks. We can mimic lazy evaluation with the native Proxy type in Javascript. Please note that the explicit thunks from the above example are now replaced with implicit ones, i.e. you don't need to call (thunk()) but just use them as if they were ordinary values. Here is a working sketch:

const NULL = "null";

const THUNK = "thunk";

const thunk = thunk =>
  new Proxy(thunk, new ThunkProxy());

class ThunkProxy {
  constructor() {
    this.memo = NULL;
  }

  apply(g, that, args) {
    if (this.memo === NULL) {
      this.memo = g();

      while (this.memo && this.memo[THUNK] === true)
        this.memo = this.memo.valueOf();
    }

    return this.memo(...args);
  }

  get(g, k) {
    if (k === THUNK)
      return true;

    else if (this.memo === NULL) {
      this.memo = g();
      
      while (this.memo && this.memo[THUNK] === true)
        this.memo = this.memo.valueOf();
    }

    if (k === "valueOf")
      return () => this.memo

    else if (k === "toString")
      return () => this.memo.toString();

    else if (k === Symbol.toStringTag)
      return Object.prototype.toString.call(this.memo).slice(8, -1);

    while (this.memo[k] && this.memo[k] [THUNK] === true)
      this.memo[k] = this.memo[k].valueOf();

    if (typeof this.memo[k] === "function")
      return this.memo[k].bind(this.memo);

    else return this.memo[k];
  }

  has(g, k) {
    if (k === THUNK)
      return true;

    else if (this.memo === NULL) {
      this.memo = g();

      while (this.memo && this.memo[THUNK] === true)
        this.memo = this.memo.valueOf();
    }

    return k in this.memo;
  }
}

const iterate = f => {
  const go = x =>
    [x, thunk(() => go(f(x)))];

  return go;
};

const lookup = ([head, tail]) => i =>
  i === 0
    ? head
    : lookup(tail) (i - 1);

const map = f => {
  const go = ([head, tail]) =>
    [log(f(head)), thunk(() => go(tail))];

  return go;
};

const memoize = f =>
  lookup(
    map(f)
      (iterate(x => x + 1) (0)));

const log = x => (console.log(x), x);

const fib = n => {
  return n > 1
    ? fib(n - 1) + fib(n - 2)
    : n;
}

const foo = memoize(fib);

console.log("yields:", foo(6)); // logs 0, 1, 1, 2, 3, 5, 8, "yields: 8"
console.log("yields:", foo(6)); // logs "yields: 8"
console.log("yields:", foo(10)); // logs 13, 21, 34, 55, "yields: 55"

Finally we've achieved the desired memoization effect, but the way I mimic lazy evaluation in Javascript is impure as well - I am just cheating. In a purely functional language like Haskell, however, lazy evaluation is an implementation detail of the language and thus not part of the syntax you actually use. This way the language itself remains pure.

Please note that fib is a recursive function and memoize does not handle the recursive steps but only intermediate and the final result.

Auscultation answered 8/12, 2020 at 13:29 Comment(3)
First of all thank you so much for taking the time to write this great answer. When I saw you were using a class I was first a bit skeptical, but the point you state in the 2nd last paragraph made me finally understand that the reason you can't do this 100% purely functional in JS is that it is an implementation details in a pure language such as Haskell. Thinking about that opens a new world of possibilities. I'll need some time to read and re-read your answer, to fully understand it. Thanks again, much appreciated!!Afterdamp
I've used the ThunkProxy class in more complex scenarios (a lazy list monad transformer, for instance) and it works quite reliable. There is an issue with the weak head normal form (WHNF), though. AFAIK every bare thunk (unevaluated sub-expression) in Haskell is immediately evaluated into WHNF, i.e. [1, thunk] remains as is, but thunk is automatically evaluated regarding the outermost layer. So thunk is evaluated to [1, thunk], for instance, but not any further. ThunkProxy does not mimic this behaior correctly. You have to do it manually sometimes. It is too lazy so to speak.Auscultation
Please also note that I used a simplified version of ThunkProxy. The full implementation is available on Github via my profile page.Auscultation
M
2
(define (fib-memo n memo k)
  (let ((p (assv n memo)))
    (if p
        (let ((v (cdr p)))
          (k v memo))
        (fib-memo (- n 1) memo
                  (lambda (v1 memo^)
                    (fib-memo (- n 2) memo^
                              (lambda (v2 memo^^)
                                (let ((v (+ v1 v2)))
                                  (k v (cons (cons n v) memo^^))))))))))
(define (fib n)
  (fib-memo n '((1 . 1) (0 . 0)) (lambda (v memo) v)))

The Scheme code above is a purely functional memoized version of fibonacci using continuation-passing style to accumulate information as the computation progresses.

Milner answered 26/1, 2022 at 20:57 Comment(0)
L
0

As long as the variable cache remains private to the function I can't think of anything that could go wrong by mutating it. Incidentally recreating a new cache object isn't helpful because you do re-assign to the same reference anyway.

You have to update the cache somehow for your memoizer to be useful. You just need to make sure you're doing it in a safe way. Don't read or write from a global cache for example.

I'm sure there are ways to do this with monads (or other functional programming constructs) but I'd don't feel qualified enough to talk about it. Keeping your function pure (and by the look of it it is) is already a big win in my opinion.

Note that in the event where applying the memoized function to n does return undefined (or null) this would never be cached because with cache[n] != undefined you cannot differentiate between a call you didn't make yet and a call that genuinely returned undefined. You should perhaps do something like cache.hasOwnProperty(n) instead.

See also Is mutating accumulator in reduce function considered bad practice?


An alternative implementation of memoizer could use Function#bind but admittedly this may not be a better way of doing it.

const memoizer =
  fn =>
    ((cache, n) =>
      n in cache
        ? cache[n]
        : cache[n] = fn(n)).bind(null, {});
Lodging answered 7/12, 2020 at 20:23 Comment(3)
bind is not better, just more confusing.Sulfonmethane
"Don't read or write from a global cache for example." - why do you think that would be unsafe?Sulfonmethane
@Sulfonmethane If the cache is publicly available that would be unsafe in my opinion as you cannot guarantee (or not easily guarantee) that it hadn't been tampered with. (If that's a useful addition to make to the post I can certainly add that.) As for bind I do agree (and I did mention it) but I shared the alternative in case OP finds it useful.Lodging
C
0

In a pure functional language we are not allowed to mutate values. However, it is possible to change the state of a variable from uninitialized to initialized during execution. This can be leveraged for memorization.

We provide an example of how to implement the Fibonacci Recursion in the pure functional language nix below.

As lists are lazily evaluated in nix, we can incrementally initialize values in a list, and compute them on first access. Subsequent accesses do not re-trigger computation but used the "cached" value:

let 
  # iota n = [1 2 3 ... n]
  iota = n: if n == 0 then [] else (iota (n - 1)) ++ [n];
  fib = n: 
    let
      results = map (x: g x) ([0] ++ (iota n));
      f = x:
        if x <= 2 then 1 else
          (elemAt results (x - 1)) + (elemAt results (x - 2));
      g = x: trace "Called f ${toString x}" (f x);
    in
      g n;
in 
  fib 5;
; nix repl -f fib.nix
trace: Called f 5
trace: Called f 4
trace: Called f 3
trace: Called f 2
trace: Called f 1
5

Code example with tests is available on GitHub

Cotsen answered 24/3, 2024 at 12:37 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.