How are nested functions and lexical scope compiled in JVM languages?
Asked Answered
P

1

4

As a concrete example for my question, here's a snippet in Python (which should be readable to the broadest number of people and which has a JVM implementation anyway):

def memo(f):
  cache = {}
  def g(*args):
    if args not in cache:
      cache[args] = f(*args)
    return cache[args]
  return g

How do industrial-strength languages compile a definition like this, in order to realize static scope? What if we only have nested definition but no higher order function-value parameters or return values, à la Pascal (and hence no need for closures)? I'm guessing that calculating static links is out, since you can't access the stack frame of a method call. So what is commonly done? Do they build anonymous inner classes? Lambda lifting? Something else?

Sorry if this is a question that's been asked before; it seems like it must be, but I haven't found anything that's quite right.

Pippo answered 20/12, 2019 at 16:20 Comment(2)
AFAIK (note I may be wrong) what Scala does for inner methods is that they get translated to normal methods at the class scope but they are marked private so they are not visible on Java (but using reflection you can reach them) and the scala compiler ensures that they are only accessible inside their enclosing scope. If you return such a method then eta-expansion kick is and creates a function from it, since 2.12 onwards they are encoded the same way a JVM lambdas (which is basically runtime anonymous classes).Methyl
in clojure, as far as i know, functions are compiled to plain java inner classes, implementing IFn interface, so they simply capture the scope variables, just as inner classes do. Don't know exactly about Jython, but pretty sure it does something like that. geeksforgeeks.org/inner-class-javaBoozer
C
6

I will be answering your question from the standpoint of Clojure, the only JVM language whose translation strategy I know intimately. For concreteness, I have translated your Python to the following Clojure (not idiomatic or thread-safe, but this is not important here):

(defn memo [f]
  (let [cache (atom {})]
    (fn g [& args]
      (when-not (contains? (@cache args))
        (swap! cache assoc args (apply f args)))
      (get @cache args))))

Inner classes (mentioned in the question and the comments) are a convenience for programmers, and the compiler doesn't need them1. Each Clojure function definition (not function invocation!) corresponds to a single top-level class implementing clojure.lang.IFn (usually through some abstract helper class). In that class, each closed-over lexical variable is saved as a field; these are initialized in the constructor. So this function definition expands to something like:

class memo extends AFunction {
  // static constants...
  public Object invoke(Object f) {
    Object cache = ...;
    return new memo$g__1723(cache);
  }
}

class memo$g__1723 extends RestFn {
  static Object swap_BANG_ = RT.var("clojure.core", "swap!");
  static Object assoc = RT.var("clojure.core", "assoc");
  static Object apply = RT.var("clojure.core", "apply");
  // ... more static constants for each function used ...

  Object f;
  Object cache;

  public memo$g__1723(Object f, Object cache) {
    this.f = f;
    this.cache = cache;
  }

  public int getRequiredArity() { return 0;}

  public Object applyTo(ISeq args) {
    Object cache = this.cache;
    if (/*...*/) {
      ((IFn)swap_BANG_).invoke(cache, assoc, args,
          ((IFn)apply).invoke(this.f, args));
    }
    return /*...*/;
  }
}

1In fact, in the version of Java that Clojure targets, inner classes don't exist at the JVM level - they are a fiction that the java compiler translates into separate top-level classes with secret access mechanisms, much as Clojure translates nested functions to top-level classes. In more recent versions of Java, the VM itself does actually understand nested classes.

For completeness, the full disassembled bytecode for memo and its inner function follows below.

$ javap -c -p 'tmp$memo' 'tmp$memo$g__1723'
Compiled from "tmp.clj"
public final class tmp$memo extends clojure.lang.AFunction {
  public static final clojure.lang.Var const__0;

  public tmp$memo();
    Code:
       0: aload_0
       1: invokespecial #9                  // Method clojure/lang/AFunction."<init>":()V
       4: return

  public static java.lang.Object invokeStatic(java.lang.Object);
    Code:
       0: getstatic     #15                 // Field const__0:Lclojure/lang/Var;
       3: invokevirtual #21                 // Method clojure/lang/Var.getRawRoot:()Ljava/lang/Object;
       6: checkcast     #23                 // class clojure/lang/IFn
       9: getstatic     #29                 // Field clojure/lang/PersistentArrayMap.EMPTY:Lclojure/lang/PersistentArrayMap;
      12: invokeinterface #32,  2           // InterfaceMethod clojure/lang/IFn.invoke:(Ljava/lang/Object;)Ljava/lang/Object;
      17: astore_1
      18: new           #34                 // class tmp$memo$g__1723
      21: dup
      22: aload_1
      23: aconst_null
      24: astore_1
      25: aload_0
      26: aconst_null
      27: astore_0
      28: invokespecial #37                 // Method tmp$memo$g__1723."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V
      31: areturn

  public java.lang.Object invoke(java.lang.Object);
    Code:
       0: aload_1
       1: aconst_null
       2: astore_1
       3: invokestatic  #42                 // Method invokeStatic:(Ljava/lang/Object;)Ljava/lang/Object;
       6: areturn

  public static {};
    Code:
       0: ldc           #45                 // String clojure.core
       2: ldc           #47                 // String atom
       4: invokestatic  #53                 // Method clojure/lang/RT.var:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Var;
       7: checkcast     #17                 // class clojure/lang/Var
      10: putstatic     #15                 // Field const__0:Lclojure/lang/Var;
      13: return
}
Compiled from "tmp.clj"
public final class tmp$memo$g__1723 extends clojure.lang.RestFn {
  java.lang.Object cache;

  java.lang.Object f;

  public static final clojure.lang.Var const__0;

  public static final clojure.lang.Var const__1;

  public static final clojure.lang.Var const__2;

  public static final clojure.lang.Var const__3;

  public static final clojure.lang.Var const__4;

  public tmp$memo$g__1723(java.lang.Object, java.lang.Object);
    Code:
       0: aload_0
       1: invokespecial #13                 // Method clojure/lang/RestFn."<init>":()V
       4: aload_0
       5: aload_1
       6: putfield      #15                 // Field cache:Ljava/lang/Object;
       9: aload_0
      10: aload_2
      11: putfield      #17                 // Field f:Ljava/lang/Object;
      14: return

  public java.lang.Object doInvoke(java.lang.Object);
    Code:
       0: getstatic     #23                 // Field const__0:Lclojure/lang/Var;
       3: invokevirtual #29                 // Method clojure/lang/Var.getRawRoot:()Ljava/lang/Object;
       6: checkcast     #31                 // class clojure/lang/IFn
       9: getstatic     #34                 // Field const__1:Lclojure/lang/Var;
      12: invokevirtual #29                 // Method clojure/lang/Var.getRawRoot:()Ljava/lang/Object;
      15: checkcast     #31                 // class clojure/lang/IFn
      18: aload_0
      19: getfield      #15                 // Field cache:Ljava/lang/Object;
      22: invokeinterface #37,  2           // InterfaceMethod clojure/lang/IFn.invoke:(Ljava/lang/Object;)Ljava/lang/Object;
      27: checkcast     #31                 // class clojure/lang/IFn
      30: aload_1
      31: invokeinterface #37,  2           // InterfaceMethod clojure/lang/IFn.invoke:(Ljava/lang/Object;)Ljava/lang/Object;
      36: invokeinterface #37,  2           // InterfaceMethod clojure/lang/IFn.invoke:(Ljava/lang/Object;)Ljava/lang/Object;
      41: dup
      42: ifnull        56
      45: getstatic     #43                 // Field java/lang/Boolean.FALSE:Ljava/lang/Boolean;
      48: if_acmpeq     57
      51: aconst_null
      52: pop
      53: goto          102
      56: pop
      57: getstatic     #46                 // Field const__2:Lclojure/lang/Var;
      60: invokevirtual #29                 // Method clojure/lang/Var.getRawRoot:()Ljava/lang/Object;
      63: checkcast     #31                 // class clojure/lang/IFn
      66: aload_0
      67: getfield      #15                 // Field cache:Ljava/lang/Object;
      70: getstatic     #49                 // Field const__3:Lclojure/lang/Var;
      73: invokevirtual #29                 // Method clojure/lang/Var.getRawRoot:()Ljava/lang/Object;
      76: aload_1
      77: getstatic     #52                 // Field const__4:Lclojure/lang/Var;
      80: invokevirtual #29                 // Method clojure/lang/Var.getRawRoot:()Ljava/lang/Object;
      83: checkcast     #31                 // class clojure/lang/IFn
      86: aload_0
      87: getfield      #17                 // Field f:Ljava/lang/Object;
      90: aload_1
      91: invokeinterface #55,  3           // InterfaceMethod clojure/lang/IFn.invoke:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
      96: invokeinterface #58,  5           // InterfaceMethod clojure/lang/IFn.invoke:(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
     101: pop
     102: getstatic     #34                 // Field const__1:Lclojure/lang/Var;
     105: invokevirtual #29                 // Method clojure/lang/Var.getRawRoot:()Ljava/lang/Object;
     108: checkcast     #31                 // class clojure/lang/IFn
     111: aload_0
     112: getfield      #15                 // Field cache:Ljava/lang/Object;
     115: invokeinterface #37,  2           // InterfaceMethod clojure/lang/IFn.invoke:(Ljava/lang/Object;)Ljava/lang/Object;
     120: aload_1
     121: aconst_null
     122: astore_1
     123: invokestatic  #63                 // Method clojure/lang/RT.get:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
     126: areturn

  public int getRequiredArity();
    Code:
       0: iconst_0
       1: ireturn

  public static {};
    Code:
       0: ldc           #70                 // String clojure.core
       2: ldc           #72                 // String contains?
       4: invokestatic  #76                 // Method clojure/lang/RT.var:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Var;
       7: checkcast     #25                 // class clojure/lang/Var
      10: putstatic     #23                 // Field const__0:Lclojure/lang/Var;
      13: ldc           #70                 // String clojure.core
      15: ldc           #78                 // String deref
      17: invokestatic  #76                 // Method clojure/lang/RT.var:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Var;
      20: checkcast     #25                 // class clojure/lang/Var
      23: putstatic     #34                 // Field const__1:Lclojure/lang/Var;
      26: ldc           #70                 // String clojure.core
      28: ldc           #80                 // String swap!
      30: invokestatic  #76                 // Method clojure/lang/RT.var:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Var;
      33: checkcast     #25                 // class clojure/lang/Var
      36: putstatic     #46                 // Field const__2:Lclojure/lang/Var;
      39: ldc           #70                 // String clojure.core
      41: ldc           #82                 // String assoc
      43: invokestatic  #76                 // Method clojure/lang/RT.var:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Var;
      46: checkcast     #25                 // class clojure/lang/Var
      49: putstatic     #49                 // Field const__3:Lclojure/lang/Var;
      52: ldc           #70                 // String clojure.core
      54: ldc           #84                 // String apply
      56: invokestatic  #76                 // Method clojure/lang/RT.var:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Var;
      59: checkcast     #25                 // class clojure/lang/Var
      62: putstatic     #52                 // Field const__4:Lclojure/lang/Var;
      65: return
}
Countersign answered 20/12, 2019 at 20:22 Comment(1)
Thank you! It's going to take me awhile to digest everything in your answer, but this is just the sort of explanation I was looking for.Pippo

© 2022 - 2024 — McMap. All rights reserved.