Can Rust optimise away the bit-wise copy during move of an object someday?

Consider the snippet

struct Foo {
    dummy: [u8; 65536],
}

fn bar(foo: Foo) {
    println!("{:p}", &foo)
}

fn main() {
    let o = Foo { dummy: [42u8; 65536] };
    println!("{:p}", &o);
    bar(o);
}

A typical result of the program is

0x7fffc1239890
0x7fffc1229890

where the addresses are different.

Apparently, the large array dummy has been copied, as expected from the compiler's implementation of moves as bit-wise copies. Unfortunately, this can have a non-trivial performance impact, since dummy is a very large array. This impact can push people to pass the argument by reference instead, even when the function conceptually "consumes" the argument.
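
For illustration, the by-reference workaround would look roughly like this (a sketch of the alternative, not part of the snippet above):

// Passing by reference avoids the bit-wise copy, but the signature
// no longer expresses that `bar` conceptually consumes its argument.
fn bar(foo: &Foo) {
    println!("{:p}", foo)
}

// Call site: bar(&o);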

Since Foo does not derive Copy, object o is moved. Since Rust forbids access to a moved object, what is preventing bar from "reusing" the original object o, instead of forcing the compiler to generate a potentially expensive bit-wise copy? Is there a fundamental difficulty, or will we see the compiler optimise away this bit-wise copy someday?

Chablis answered 25/7, 2016 at 15:3 Comment(4)
Rustc does optimize moves. It isn't doing so in this case, probably because LLVM didn't inline bar. This might even be because you are trying to observe the pointer values, and LLVM isn't sure whether that is safe to optimize. I tried it without the :p prints and used test::black_box instead, and the copy vanishes from the assembly; see the sketch after these comments.Nickeliferous
@Nickeliferous bar is getting inlined. LLVM is just bad at removing moves of large arrays.Diba
The issues with the NRVO tag are related to this: github.com/rust-lang/rust/labels/A-mir-opt-nrvoChablis
Is o guaranteed to be dropped in this case? Given that it was moved out to bar(), at what point would o's memory be freed up?Schoolman
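
A rough sketch of the experiment described in the first comment, using std::hint::black_box (the stable successor of the unstable test::black_box) in place of the :p prints; the struct is the one from the question:

use std::hint::black_box;

struct Foo {
    dummy: [u8; 65536],
}

fn bar(foo: Foo) {
    // Pretend to use `foo` without observing its address.
    black_box(&foo);
}

fn main() {
    let o = Foo { dummy: [42u8; 65536] };
    black_box(&o);
    bar(o);
}

Comparing the release-mode assembly of this version with the original is how one would check whether the copy of dummy disappears.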

Given that in Rust (unlike C or C++) the address of a value is not considered to matter, there is nothing at the language level that prevents eliding the copy.

However, today rustc does not optimize anything: all optimizations are delegated to LLVM, and it seems you have hit a limitation of the LLVM optimizer here (it's unclear whether this limitation is due to LLVM being close to C's semantics or is just an omission).

So, there are two avenues for improving code generation here:

  • teaching LLVM to perform this optimization (if possible)
  • teaching rustc to perform this optimization (optimization passes are coming to rustc now that it has MIR)

but for now you might simply want to avoid allocating such large objects on the stack; you can Box them, for example.
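
A minimal sketch of the boxing suggestion, reusing the struct from the question: the large buffer lives on the heap, and moving the Box only copies a pointer, so both prints should show the same address.

struct Foo {
    dummy: [u8; 65536],
}

fn bar(foo: Box<Foo>) {
    // Only the pointer was moved; the buffer itself stayed put.
    println!("{:p}", &foo.dummy)
}

fn main() {
    let o = Box::new(Foo { dummy: [42u8; 65536] });
    println!("{:p}", &o.dummy);
    bar(o);
}

(As the comments below note, in Debug builds Box::new may still build the value on the stack before copying it to the heap.)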

Turney answered 25/7, 2016 at 15:19 Comment(5)
Speaking of MIR optimization passes, the first one would be a simple move destination propagation pass: github.com/rust-lang/rust/pull/34693. The tracking issue is github.com/rust-lang/rust/issues/32966.Butterwort
Instead of just avoiding stack allocation, it would be better to assume the move will be optimized and only box things later if it wasn't. Most of the time in Rust you shouldn't be thinking about trying to avoid copying things.Chronometer
@MichaelYounkin: I partially agree. The problem is that large objects copied a few times on the stack easily lead to stack-overflow, especially with Debug targets where optimizations do not occur. If the buffer is very large, the cost of the dynamic allocation should be dwarfed by the cost of initializing the buffer itself anyway.Turney
@MatthieuM allocating it on the heap is all very well, but in my experience, even writing Box::new(BigStruct::new()) first allocates the BigStruct on the stack (in BigStruct::new), then copies it to the heap (in Box::new). Or am I missing something?Lanai
@Pierre-Antoine: In Debug, yes, for now; this is why placement new is so sought after. In Release, the stack copy should hopefully be optimized out anyway, but this may lead to Stack Overflows in Debug that prevent you from testing your code :(Turney
