Why does using LTO increase the size of my Rust binary?

Asked 12/9, 2018 at 8:43 Answered 12/9, 2018 at 14:38

Introduction

I finished a small Rust project (about 300 lines of code) with the following dependencies:

Problem

When using cargo build --release without further configuration, a 2.942.744 bytes (= 2,8 MiB) binary is generated. I tried to optimize this by enabling Link Time Optimization (LTO) in my Cargo.toml:

[profile.release]
lto = true

To my surprise, the binary grows, with a new size of 3.848.288 bytes (= 3,7 MiB).

How can this be explained? Is there any mistake I made configuring Cargo?

Outbalance answered 12/9, 2018 at 8:43 Comment(4)

you may be interested in this IRLO thread – Congius 12/9, 2018 at 9:22

Why do you expect the binary size to go down? – Biconcave 12/9, 2018 at 12:16

@MatthieuM. In fact, because of this. I didn't know that LTO doesn't just optimize binary size but performance, too. – Outbalance 12/9, 2018 at 12:23

@PEAR: I see! It's a bit more complicated than that, actually. – Biconcave 12/9, 2018 at 14:27

What is LTO?

LTO stands for Link-Time Optimization. It is generally set up to run the regular optimization passes, the ones used to produce object files, at link time, either instead of or in addition to running them per object file.

Why does it matter?

A compiler does not inherently optimize for speed over size or for size over speed, and therefore neither does LTO.

Instead, when invoking the compiler, the user selects a profile. For rustc, this is done with -C opt-level:

O0, O1, O2 and O3 (opt-level 0 to 3) favor speed, with O0 performing no optimization at all.
Os and Oz (opt-level s and z) optimize for size.

LTO can be combined on top of any optimization level, and will follow the selected profile.
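In a Cargo project, the level is normally chosen through the opt-level key of a profile rather than by passing flags to rustc by hand. A minimal sketch of that setting, with the accepted values summarized in comments as described in the Cargo documentation:

```toml
[profile.release]
# 0: no optimizations, 1: basic, 2: some, 3: all (speed-oriented)
# "s": optimize for binary size
# "z": optimize for binary size, and also turn off loop vectorization
opt-level = 3
```

The value 3 shown here is what the release profile already uses when the key is omitted.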

So why did the size increase?

By default, the release profile ([profile.release]) instructs cargo to invoke rustc with O3 (opt-level = 3), which optimizes for speed over size.

In particular, O3 can rely quite heavily on inlining. Inlining is all about giving more context to the optimizer, and therefore more optimization opportunities... LTO offers more chances to apply inlining (more known functions), and here it appears that more inlining happened.

So why did this blog post claim it reduced size?

It also reduces size. Possibly.

Given more context, the optimizer/linker can determine that some portions of the code or of its dependencies are never used, and can therefore remove them.

If using Os or Oz, the size is near certain to go down.

If using O2 or O3, unused code is removed while inlining adds more code, so it's quite unpredictable whether the end result is bigger or smaller.

So, LTO?

LTO gives the optimizer more opportunities to optimize, so it's a good default for release builds.

Just remember that cargo leans toward speed over size by default, and if this does not suit you, you may want to select another optimization direction.
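For instance, a size-oriented release profile could look like the sketch below. It assumes that binary size matters more than speed for the project. Whether "s" or "z" produces the smaller binary varies from project to project, so it is worth measuring both.

```toml
[profile.release]
opt-level = "z"  # optimize for size instead of the default speed-oriented level
lto = true       # keep LTO so that unused code can still be removed across crates
```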

Biconcave answered 12/9, 2018 at 14:38 Comment(0)

Probably because of inlining, which can increase code size to increase speed.

Paranoia answered 12/9, 2018 at 8:44 Comment(3)

Is LTO actually able to inline functions? Because that needs a recompilation of that particular function. Can you please provide some documentation? – Hatband 12/9, 2018 at 9:11

What do you mean by recompilation? It is possible to cache the generated object files of external crates and optimize every inlined function one by one at every call site. – Ctenidium 22/1, 2021 at 18:8

@EkremDinçel: I think "recompilation" means "code generation" here. But yes, inlining is one of the very purposes of LTO. eklitzke.org/how-gcc-lto-works – Paranoia 22/1, 2021 at 23:11