MethodHandles or LambdaMetafactory?
At my job we have a DSL for specifying mathematical formulas that we later apply to a lot of points (in the millions).

As of today, we build an AST of the formula and visit each node to produce what we call an "Evaluator". We then pass that evaluator the arguments of the formula, and for each point it does the computation.

For instance, take this formula: x * (3 + y)

           ┌────┐
     ┌─────┤mult├─────┐
     │     └────┘     │
     │                │
  ┌──v──┐          ┌──v──┐
  │  x  │      ┌───┤ add ├──┐
  └─────┘      │   └─────┘  │
               │            │
            ┌──v──┐      ┌──v──┐
            │  3  │      │  y  │
            └─────┘      └─────┘

Our evaluator will emit "Evaluate" objects for each step.

This method is easy to program, but not very efficient.
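To make the setup concrete, here is a minimal sketch of what such an interpreted evaluator tree could look like (the interface and method names are hypothetical placeholders, not our actual DSL code):

```java
import java.util.Map;

public class EvaluatorDemo {
    // Hypothetical evaluator node: each AST node becomes one of these.
    public interface Evaluator {
        double evaluate(Map<String, Double> args);
    }

    public static Evaluator constant(double value) {
        return args -> value;
    }

    public static Evaluator variable(String name) {
        return args -> args.get(name);
    }

    public static Evaluator add(Evaluator left, Evaluator right) {
        return args -> left.evaluate(args) + right.evaluate(args);
    }

    public static Evaluator mult(Evaluator left, Evaluator right) {
        return args -> left.evaluate(args) * right.evaluate(args);
    }

    public static void main(String[] args) {
        // x * (3 + y), built by visiting the AST bottom-up
        Evaluator formula = mult(variable("x"),
                                 add(constant(3), variable("y")));
        System.out.println(formula.evaluate(Map.of("x", 5.0, "y", 4.0))); // 35.0
    }
}
```

Each evaluation of a point walks the whole tree and pays an interface dispatch per node, which is the overhead we were hoping to remove.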

So lately I started looking into method handles, to build up a "composed" method handle and speed things up.

Something along these lines: I have my "Arithmetics" class with:

public class Arithmetics {

  public static double add(double a, double b){
      return a+b;
  }

  public static double mult(double a, double b){
      return a*b;
  }

}

And when building my AST I use MethodHandles.lookup() to directly get a handle on those and compose them. Something along these lines, but in a tree:

Method add = Arithmetics.class.getDeclaredMethod("add", double.class, double.class);
Method mult = Arithmetics.class.getDeclaredMethod("mult", double.class, double.class);
MethodHandles.Lookup lookup = MethodHandles.lookup();
MethodHandle mh_add = lookup.unreflect(add);
MethodHandle mh_mult = lookup.unreflect(mult);
double plus_arg = 3;
// bind the constant as the first argument of add: g(y) = 3 + y
MethodHandle mh_add_3 = MethodHandles.insertArguments(mh_add, 0, plus_arg);
// feed g's result into mult's second parameter
MethodHandle formula = MethodHandles.collectArguments(mh_mult, 1, mh_add_3); // formula is f(x,y) = x * (3 + y)
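For reference, a self-contained, runnable version of that composition looks like this (class and method names are placeholders; findStatic is used instead of unreflect purely for brevity):

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class HandleDemo {
    public static double add(double a, double b)  { return a + b; }
    public static double mult(double a, double b) { return a * b; }

    // Builds f(x, y) = x * (3 + y) out of the two handles above.
    public static MethodHandle buildFormula() throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType binOp = MethodType.methodType(double.class, double.class, double.class);
        MethodHandle mhAdd  = lookup.findStatic(HandleDemo.class, "add",  binOp);
        MethodHandle mhMult = lookup.findStatic(HandleDemo.class, "mult", binOp);
        // Bind the constant 3 as the first argument of add: g(y) = 3 + y
        MethodHandle mhAdd3 = MethodHandles.insertArguments(mhAdd, 0, 3d);
        // Feed g's result into mult's second parameter: f(x, y) = x * (3 + y)
        return MethodHandles.collectArguments(mhMult, 1, mhAdd3);
    }

    public static void main(String[] args) throws Throwable {
        MethodHandle formula = buildFormula();
        System.out.println((double) formula.invokeExact(5.0, 4.0)); // 35.0
    }
}
```

Note that the bound value goes into argument position 0 (or 1) of the two-argument add handle; the resulting unary handle then replaces exactly one parameter of mult.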

Sadly, I'm quite disappointed by the results. For instance, the actual construction of the method handle takes a long time (due to the calls to MethodHandles::insertArguments and other such composition functions), and the speedup in evaluation only starts to make a difference after over 600k iterations.

At 10M iterations, the method handle really starts to shine, but millions of iterations is not (yet?) a typical use case. We are more around 10k-1M, where the results are mixed.

Also, the actual computation is sped up, but not by much (~2-10 times). I was expecting it to run quite a bit faster.

So anyway, I started scouring StackOverflow again and saw LambdaMetafactory threads like this one: https://mcmap.net/q/280633/-faster-alternatives-to-java-39-s-reflection-closed

And I'm itching to start trying this. But before that, I'd like your input on some questions:

  • I need to be able to compose all those lambdas. MethodHandles provides lots of (admittedly slowish) ways to do it, but I feel like lambdas have a stricter "interface", and I can't yet wrap my head around how to do that. Do you know how?

  • Lambdas and method handles are quite interconnected, and I'm not sure that I will get a significant speedup. I see these results for simple lambdas: direct: 0.02s, lambda: 0.02s, mh: 0.35s, reflection: 0.40s. But what about composed lambdas?

Thanks guys!

Toein answered 8/6, 2016 at 10:19
Is there a dynamic component suggesting such approaches? An ordinary tree of (preferably immutable) objects isn’t that bad.Bowline
Our tree of objects actually uses an interface, and we discussed the cost of all the dynamic dispatch and figured we would probably be better off with a custom-built method handle. My tests show that this is the case for big trees, or with lots of iterations for small trees. Now I'm wondering if lambdas could enhance this a bit more.Toein
Did you perform actual benchmarks regarding the “cost of all the dynamic dispatch”? If you have an immutable tree, HotSpot is usually very good at doing aggressive inlining. And you will have to rely on that capability when using lambdas too, as they are designed around interfaces as well.Bowline
I think (but haven't tested it) one advantage of the LambdaMetaFactory is that it creates a new class, which means the JVM optimizes it much more aggressively than method handles held in local or instance fields. The only other way to achieve the same effect would be using Unsafe to generate anonymous classes. So maybe just keep your MH tree and convert it to a class with the LMF as a final step? If those things are created dynamically you might also want to tune JVM parameters so that the JIT can kick in sooner. It is normally tuned for reaching steady state and then changing little.Antimacassar
@the8472: the problem is that the current implementation of LambdaMetaFactory doesn’t support composed method handles. So you can only generate instances which delegate to an existing method, which implies that you have to generate the code for combining nodes yourself. This is basically what my answer is about: once you are aware that you need these methods, you can create the nodes without reflective operations, using Java 8 source-level features. Then, LMF can still be used to integrate particular specialized leaf nodes.Bowline
For what it's worth: a benchmark performance comparison of reflection, static/non-static method handles, LambdaMetaFactory and generated code...Forgive

I think, for most practical cases, an immutable evaluation tree consisting of nodes fulfilling a particular interface or inheriting from a common evaluator base class, is unbeatable. HotSpot is capable of performing (aggressive) inlining, at least for subtrees, but has the freedom to decide how many nodes it will inline.

In contrast, generating explicit code for an entire tree imposes the risk of exceeding the JVM’s thresholds; then you have code that has no dispatch overhead, for sure, but might run interpreted all the time.

A tree of adapted MethodHandles starts like any other tree, but with a higher overhead. Whether its own optimization is capable of beating HotSpot’s own inlining strategy is debatable. And as you noticed, it takes a large number of invocations before that self-tuning kicks in. It seems the thresholds accumulate in an unfortunate way for composed method handles.

To name one prominent example of the evaluation-tree pattern: when you use Pattern.compile to prepare a regex matching operation, neither bytecode nor native code is generated, although the method’s name may mislead you into thinking in that direction. The internal representation is just an immutable tree of nodes representing the combinations of the different kinds of operations. It’s up to the JVM’s optimizer to generate flattened code for it where that is considered beneficial.

Lambda expressions don’t change the game. They allow you to generate (small) classes fulfilling an interface and invoking a target method. You can use them to build an immutable evaluation tree, and while this is unlikely to perform differently than explicitly programmed evaluation node classes, it allows much simpler code:

import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleBinaryOperator;
import java.util.function.DoubleUnaryOperator;

public class Arithmetics {
    public static void main(String[] args) {
        // x * (3 + y)
        DoubleBinaryOperator func=op(MUL, X, op(ADD, constant(3), Y));
        System.out.println(func.applyAsDouble(5, 4));
        PREDEFINED_UNARY_FUNCTIONS.forEach((name, f) ->
            System.out.println(name+"(0.42) = "+f.applyAsDouble(0.42)));
        PREDEFINED_BINARY_FUNCTIONS.forEach((name, f) ->
            System.out.println(name+"(0.42,0.815) = "+f.applyAsDouble(0.42,0.815)));
        // sin(x)+cos(y)
        func=op(ADD,
            op(PREDEFINED_UNARY_FUNCTIONS.get("sin"), X),
            op(PREDEFINED_UNARY_FUNCTIONS.get("cos"), Y));
        System.out.println("sin(0.6)+cos(0.5) = "+func.applyAsDouble(0.6, 0.5));
    }
    public static DoubleBinaryOperator ADD = Double::sum;
    public static DoubleBinaryOperator SUB = (a,b) -> a-b;
    public static DoubleBinaryOperator MUL = (a,b) -> a*b;
    public static DoubleBinaryOperator DIV = (a,b) -> a/b;
    public static DoubleBinaryOperator REM = (a,b) -> a%b;

    public static DoubleBinaryOperator op(
        DoubleUnaryOperator op, DoubleBinaryOperator arg1) {
        return (x,y) -> op.applyAsDouble(arg1.applyAsDouble(x,y));
    }
    public static DoubleBinaryOperator op(
        DoubleBinaryOperator op, DoubleBinaryOperator arg1, DoubleBinaryOperator arg2) {
        return (x,y)->op.applyAsDouble(arg1.applyAsDouble(x,y),arg2.applyAsDouble(x,y));
    }
    public static DoubleBinaryOperator X = (x,y) -> x, Y = (x,y) -> y;
    public static DoubleBinaryOperator constant(double value) {
        return (x,y) -> value;
    }

    public static final Map<String,DoubleUnaryOperator> PREDEFINED_UNARY_FUNCTIONS
        = getPredefinedFunctions(DoubleUnaryOperator.class,
            MethodType.methodType(double.class, double.class));
    public static final Map<String,DoubleBinaryOperator> PREDEFINED_BINARY_FUNCTIONS
        = getPredefinedFunctions(DoubleBinaryOperator.class,
            MethodType.methodType(double.class, double.class, double.class));

    private static <T> Map<String,T> getPredefinedFunctions(Class<T> t, MethodType mt) {
        Map<String,T> result=new HashMap<>();
        MethodHandles.Lookup l=MethodHandles.lookup();
        for(Method m:Math.class.getMethods()) try {
            MethodHandle mh=l.unreflect(m);
            if(!mh.type().equals(mt)) continue;
            result.put(m.getName(), t.cast(LambdaMetafactory.metafactory(
                l, "applyAsDouble", MethodType.methodType(t),
                mt, mh, mt).getTarget().invoke()));
        }
        catch(RuntimeException|Error ex) { throw ex; }
        catch(Throwable ex) { throw new AssertionError(ex); }
        return Collections.unmodifiableMap(result);
    }
}

This is everything you need to compose evaluators for expressions made of basic arithmetic operators and the functions found in java.lang.Math, the latter collected dynamically to address that aspect of your question.

Note that technically,

public static DoubleBinaryOperator MUL = (a,b) -> a*b;

is just a short-hand for

public static DoubleBinaryOperator MUL = Arithmetics::mul;
public static double mul(double a, double b){
    return a*b;
}

I added a main method containing some examples. Keep in mind that these functions behave like compiled code right from the first invocation; in fact, they consist of compiled code only, just composed of multiple functions.

Bowline answered 8/6, 2016 at 14:28
This is a good answer, thanks Holger. I still have grey areas (how to have a variadic number of parameters), but it is a good start.Toein
