Stimulate code-inlining

Asked 13/12, 2016 at 11:15 Answered 30/3, 2023 at 21:4

Unlike C++, where you can explicitly mark a function inline, Go leaves the decision to the compiler, which automatically detects functions that are candidates for inlining (C++ compilers can do this too, but Go offers only the automatic route). There is also a debug option to show which functions get inlined, yet very little is documented online about the exact logic the Go compiler(s) use for this.

Let's say I need to rerun a big loop over a set of data at regular intervals:

func Encrypt(password []byte) ([]byte, error) {
    return bcrypt.GenerateFromPassword(password, 13)
}

for id, data := range someDataSet {
    newPassword, _ := Encrypt([]byte("generatedSomething"))
    data["password"] = newPassword
    someSaveCall(id, data)
}

If I want Encrypt, for example, to be inlined properly, what logic do I need to take into consideration for the compiler?

I know from C++ that passing by reference increases the likelihood of automatic inlining without the explicit inline keyword, but it's not easy to understand exactly how the Go compiler decides whether or not to inline. Scripting languages like PHP, for example, suffer immensely if you call a function like addSomething($a, $b) in a loop: benchmarked over a billion iterations, its cost compared to the inline $a + $b is almost ridiculous.

Tufa answered 13/12, 2016 at 11:15 Comment(1)

Don't worry too much, especially as there is not much you can do, and what you can do changes every 6 months with a new compiler (which can do more and better inlining). – Metacarpal 13/12, 2016 at 11:58

Until you have performance problems, you shouldn't care. Inlined or not, it will do the same.

If performance does matter and inlining makes a noticeable and significant difference, then don't rely on current (or past) inlining conditions: "inline" it yourself (do not put the code in a separate function).

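The rules can be found in the $GOROOT/src/cmd/compile/internal/inline/inl.go file. You may control the inliner's aggressiveness with the 'l' debug flag (passed to the compiler, for example through -gcflags). The header comment of that file reads: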

// The inlining facility makes 2 passes: first caninl determines which
// functions are suitable for inlining, and for those that are it
// saves a copy of the body. Then InlineCalls walks each function body to
// expand calls to inlinable functions.
//
// The Debug.l flag controls the aggressiveness. Note that main() swaps level 0 and 1,
// making 1 the default and -l disable. Additional levels (beyond -l) may be buggy and
// are not supported.
//      0: disabled
//      1: 80-nodes leaf functions, oneliners, panic, lazy typechecking (default)
//      2: (unassigned)
//      3: (unassigned)
//      4: allow non-leaf functions
//
// At some point this may get another default and become switch-offable with -N.
//
// The -d typcheckinl flag enables early typechecking of all imported bodies,
// which is useful to flush out bugs.
//
// The Debug.m flag enables diagnostic output.  a single -m is useful for verifying
// which calls get inlined or not, more is for debugging, and may go away at any point.

Also check out the blog post Dave Cheney - Five things that make Go fast (2014-06-07), which covers inlining (it's a long post; the relevant part is around the middle, search for the word "inline").

There is also an interesting discussion about inlining improvements (maybe Go 1.9?): cmd/compile: improve inlining cost model #17566

Sheliasheline answered 13/12, 2016 at 11:27 Comment(2)

Thank you for your explanation, icza. I will definitely read those articles as soon as I get back home! I'm aware from C++ (the GNU docs also state this) that inlining can have the opposite effect and bloat the code. I +1'd your post, but will first read the articles later today and accept your answer based on the criteria I find :). I know this is sometimes the withy optimisation-methodology, but I'm just interested in the logic itself too, beyond the "performance gain" per se. – Tufa 13/12, 2016 at 11:57

Note that the compiler does inline functions that call other functions if those functions in turn have been inlined, and together the resulting function still satisfies the criteria. In particular, the "budget" starts becoming tight, as the function and any inlined functions it calls must fit within the same budget, of 40 operations or whatever it is in the latest version. When talking about non-leaf or "mid stack" functions, people are referring to functions that call other functions that aren't inlined. – Houseyhousey 11/9, 2017 at 10:39

Better still, don’t guess, measure! You should trust the compiler and avoid trying to guess its inner workings as it will change from one version to the next. There are far too many tricks the compiler, the CPU or the cache can play to be able to predict performance from source code.

What if inlining makes your code bigger to the point that it doesn’t fit in the cache line anymore, making it much slower than the non-inlined version? Cache locality can have a much bigger impact on performance than branching.
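One way to measure rather than guess is a small benchmark. The sketch below uses testing.Benchmark from the standard library, which can run a benchmark function outside of go test. The functions add, addNoInline and the sink variable are hypothetical names for this example, and the numbers it prints depend entirely on the machine and compiler version.

```go
package main

import (
	"fmt"
	"testing"
)

// add is small enough that the compiler may choose to inline it.
func add(a, b int) int {
	return a + b
}

// addNoInline is identical, but the directive forbids inlining,
// so every call pays the function-call overhead.
//
//go:noinline
func addNoInline(a, b int) int {
	return a + b
}

// sink receives the loop results so the compiler cannot discard the work.
var sink int

func benchAdd(b *testing.B) {
	s := 0
	for i := 0; i < b.N; i++ {
		s = add(s, i)
	}
	sink = s
}

func benchAddNoInline(b *testing.B) {
	s := 0
	for i := 0; i < b.N; i++ {
		s = addNoInline(s, i)
	}
	sink = s
}

func main() {
	// testing.Benchmark picks b.N itself and returns timing results.
	fmt.Println("add:        ", testing.Benchmark(benchAdd))
	fmt.Println("addNoInline:", testing.Benchmark(benchAddNoInline))
}
```

A micro-benchmark like this only shows the cost of the call itself. For a function such as the question's Encrypt, the real workload should be benchmarked, since the work done inside the function may dwarf any call overhead.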

Indict answered 13/12, 2016 at 11:30 Comment(1)

I understand your point, Franck. I use the testing library a lot, yet it doesn't always literally explain whether inlining actually happened. It's also harder to determine by testing if the function is used in several locations. I already cache a lot of data sets, so there's not much overhead there. My question was mostly about inlining on its own. Your point is generally valid though, and I totally agree :). – Tufa 13/12, 2016 at 11:53

You are fighting an uphill battle. Go is not made for what you are trying to do. Go is not very tweakable and is designed for moderate performance. It values simplicity over performance, so it is not a good fit where you need precise control over behavior like inlining. Languages that value performance more have explicit mechanisms for inlining. Check out Rust, C++, C#.

Marienthal answered 30/3, 2023 at 21:4 Comment(0)
