What are the different techniques to make megamorphic call sites more efficient

Asked 12/3, 2015 at 22:20 Answered 25/6, 2023 at 21:30

performance optimization smalltalk jit vm-implementation

Preamble

This is about improving message send efficiency in a JIT compiler. Despite referring to Smalltalk, this question applies to most dynamic JIT-compiled languages.

Problem

Given a message send site, it can be classified as monomorphic, polymorphic or megamorphic. If the receiver of the message send is always of the same type, it is a monomorphic send, as in

10 timesRepeat: [Object new].

where the receiver of new is always Object. For this kind of sends JITs emit monomorphic inline caches.

Sometimes a given send site refers to a few different object types, like:

#(1 'a string' 1.5) do: [:element | element print]

In this case, print is sent to different types of objects. For these cases, JITs usually emit polymorphic inline caches.

Megamorphic message sends occur when a message is sent to not just a few but a lot of different object types in a same place. One of the most prominent examples is this:

Behavior>>#new
    ^self basicNew initialize

Here, basicNew creates the object, then initialize does initialization. You could do:

Object new
OrderedCollection new
Dictionary new

and they will all execute the same Behavior>>#new method. As the implementation of initialize is different in a lot of classes, the PIC will quickly fill. I'm interested in this kind of send sites, knowing they only occur unfrequently (only 1% of sends are megamorphic).

Question

What are the possible and specific optimizations for megamorphic send sites to avoid doing a lookup?

Tumor answered 12/3, 2015 at 22:20 Comment(0)

I imagine a few, and want to know more. After a PIC gets full, we'll have to call the lookup (being it full or the global cached one), but to optimize we can:

Recycle the PIC, throwing away all entries (many entries could be old and not used frequently).
Call some sort of specific megamorphic lookup (i.e. one that would cache all previously dispatched types in an array accesed by the type hash).
Inline the containing method (when inlined, the send site may stop being megamorphic)

Tumor answered 12/3, 2015 at 22:23 Comment(3)

What about promoting megamorphic selectors (such as #initialize) to special locations attached to the receiver's MethodDictionary, so to short-circuit their lookup? – Selfevident 13/3, 2015 at 3:23

What would these special locations be? and how could they be attached to the receiver's MethodDictionary? – Tumor 13/3, 2015 at 17:34

I don't know exactly. But hey, dictionaries offer plenty of possibilities to attach things to them! For instance, I would consider adding an association to the MD with some reserved key (say, a SmallInteger) and copy there the megamorphic method. Since all selectors are Symbols (and not SmallIntegers) there won't be any collisions with "regular" messages. But again, this is just an idea that requires more thinking and some experimentation. – Selfevident 13/3, 2015 at 20:24

As the following paper suggests, megamorphic sends could be changed to use a vtable lookup.

Given that very few selectors are megamorphic, assigning selector indexes, where each selector is specialized to know its index, should yield compact jump vectors which could be lazily added to each class whose megamorphic method is used.

Manuel Serrano and Marc Feeley, Property caches revisited. In International Conference on Compiler Construction (CC'19), February 2019.

http://www.iro.umontreal.ca/~feeley/papers/SerranoFeeleyCC19.pdf

Dewittdewlap answered 23/5, 2023 at 3:7 Comment(0)

We know the call site is megamorphic. We know that only 1% of sends are megamorphic. We know "we'll have to call the lookup".

Which means we're already doing something like

methodDictionary at: selector ifAbsent: [and we're in here]

so we know this is the first time we've been here, and we know there aren't many megamorphic cases, so if we're here, we should check to see if this is a "special" selector, and if it is, we can make sure we don't come here again

methodDictuionary at: selector put: (result of the lookup)

which will expand the dictionaries, but only for selectors that are "special", and only for the top most lookup (in the receiver's class). Of course, every lookup for a selector that isn't defined in the receiver would incur a little more overhead. (Every such case already incurs the overhead of a lookup).

Q1: Are all megamorphic selectors "special"?

Q2: Can we define the "special" set so as to get enough speed without using too much space?

Q3: Can we do the "special" test fast enough?

Q4: Is this anything, and if so, is it enough?

Tamah answered 25/6, 2023 at 21:30 Comment(0)

Recommended topics

Hot tags