What are move semantics in Rust?

Asked 17/5, 2015 at 15:46 Answered 5/2, 2024 at 21:16

In Rust, there are two ways to take a reference:

Borrow, i.e., take a reference but don't allow mutating the referenced value. The & operator borrows a value from its owner.
Borrow mutably, i.e., take a reference through which the referenced value may be mutated. The &mut operator mutably borrows a value from its owner.

The Rust documentation about borrowing rules says:

First, any borrow must last for a scope no greater than that of the owner. Second, you may have one or the other of these two kinds of borrows, but not both at the same time:

one or more references (&T) to a resource,

exactly one mutable reference (&mut T).
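A minimal sketch of the two quoted rules, assuming a hypothetical helper named borrow_demo (it is not part of the Rust documentation): several shared borrows may coexist, while a mutable borrow has to be the only live borrow.

```rust
// Hypothetical helper, for illustration only.
fn borrow_demo() -> (usize, usize) {
    let mut v = vec![1, 2, 3];

    // Any number of shared borrows (&T) may be alive at the same time.
    let a = &v;
    let b = &v;
    let shared_len = a.len() + b.len();

    // `a` and `b` are not used past this point, so exactly one
    // mutable borrow (&mut T) is now allowed.
    let m = &mut v;
    m.push(4);
    // Creating another `&v` here and then using `m` again would not compile.

    (shared_len, v.len())
}

fn main() {
    let (shared_len, final_len) = borrow_demo();
    println!("{} {}", shared_len, final_len);
}
```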

I believe that taking a reference is creating a pointer to the value and accessing the value by the pointer. This could be optimized away by the compiler if there is a simpler equivalent implementation.

However, I don't understand what move means and how it is implemented.

For types implementing the Copy trait it means copying e.g. by assigning the struct member-wise from the source, or a memcpy(). For small structs or for primitives this copy is efficient.

And for move?

This question is not a duplicate of What are move semantics? because Rust and C++ are different languages and move semantics are different between the two.

Chokebore answered 17/5, 2015 at 15:46 Comment(3)

You may be interested in Move vs Copy in Rust or How does Rust provide move semantics?. – Pibroch 17/5, 2015 at 15:49

It looks like you already found your answer, but I am learning this stuff now as well and I found these resources to be very helpful doc.rust-lang.org/book/ownership.html and youtube.com/watch?v=WQbg6ZMQJvQ – Merganser 17/5, 2015 at 16:59

According to the book, doc.rust-lang.org/book/…. A move is a shallow copy + invalidation: "If you’ve heard the terms shallow copy and deep copy while working with other languages, the concept of copying the pointer, length, and capacity without copying the data probably sounds like making a shallow copy. But because Rust also invalidates the first variable, instead of calling it a shallow copy, it’s known as a move" – Spang 22/2, 2022 at 10:11

Semantics

Rust implements what is known as an Affine Type System:

Affine types are a version of linear types imposing weaker constraints, corresponding to affine logic. An affine resource can be used at most once, while a linear one must be used exactly once.

Types that are not Copy, and are thus moved, are Affine Types: you may use them either once or never, nothing else.
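As a rough sketch of "once or never", assuming a hypothetical non-Copy type Token and a hypothetical function consume (neither appears in the original answer):

```rust
// Hypothetical non-Copy type, for illustration only.
struct Token {
    id: u32,
}

// Taking `Token` by value consumes it: the caller gives up ownership.
fn consume(token: Token) -> u32 {
    token.id
}

fn main() {
    let used_once = Token { id: 1 };
    // Never consumed: also allowed, the value is simply dropped at the end of main.
    let _never_used = Token { id: 2 };

    let id = consume(used_once);
    println!("consumed token {}", id);
    // consume(used_once); // a second use would not compile: use of moved value
}
```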

Rust qualifies this as a transfer of ownership in its Ownership-centric view of the world (*).

(*) Some of the people working on Rust are much more qualified than I am in CS, and they knowingly implemented an Affine Type System; however contrary to Haskell which exposes the math-y/cs-y concepts, Rust tends to expose more pragmatic concepts.

Note: it could be argued that Affine Types returned from a function tagged with #[must_use] are actually Linear Types from my reading.

Implementation

It depends. Please keep in mind that Rust is a language built for speed, and there are numerous optimization passes at play here which will depend on the compiler used (rustc + LLVM, in our case).

Within a function body (playground):

fn main() {
    let s = "Hello, World!".to_string();
    let t = s;
    println!("{}", t);
}

If you check the LLVM IR (in Debug), you'll see:

%_5 = alloca %"alloc::string::String", align 8
%t = alloca %"alloc::string::String", align 8
%s = alloca %"alloc::string::String", align 8

%0 = bitcast %"alloc::string::String"* %s to i8*
%1 = bitcast %"alloc::string::String"* %_5 to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %1, i8* %0, i64 24, i32 8, i1 false)
%2 = bitcast %"alloc::string::String"* %_5 to i8*
%3 = bitcast %"alloc::string::String"* %t to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %3, i8* %2, i64 24, i32 8, i1 false)

Underneath the covers, rustc invokes a memcpy from the result of "Hello, World!".to_string() to s and then to t. While it might seem inefficient, checking the same IR in Release mode you will realize that LLVM has completely elided the copies (realizing that s was unused).

The same situation occurs when calling a function: in theory you "move" the object into the function stack frame, however in practice if the object is large the rustc compiler might switch to passing a pointer instead.

Another situation is returning from a function, but even then the compiler might apply "return value optimization" and build directly in the caller's stack frame -- that is, the caller passes a pointer into which to write the return value, which is used without intermediary storage.

The ownership/borrowing constraints of Rust enable optimizations that are difficult to reach in C++ (which also has RVO but cannot apply it in as many cases).

So, the digest version:

moving large objects is inefficient, but there are a number of optimizations at play that might elide the move altogether
moving involves a memcpy of std::mem::size_of::<T>() bytes, so moving a large String is efficient: it copies only the handle (24 bytes in the IR above), whatever the size of the allocated buffer it holds onto
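The second point can be checked with a small sketch. The helper name string_handle_size is hypothetical, and the "three machine words" layout is what current rustc produces rather than a documented guarantee:

```rust
use std::mem::{size_of, size_of_val};

// Number of bytes a move of `String` has to copy: the handle only.
fn string_handle_size() -> usize {
    size_of::<String>()
}

fn main() {
    let small = String::from("hi");
    let large = "x".repeat(1_000_000);

    // In practice the handle is pointer + length + capacity,
    // whatever the size of the heap buffer.
    assert_eq!(string_handle_size(), 3 * size_of::<usize>());
    assert_eq!(size_of_val(&small), size_of_val(&large));

    // Moving `large` copies only the handle; the heap buffer is not touched.
    let moved = large;
    println!("{} bytes moved for {} bytes of text", string_handle_size(), moved.len());
}
```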

Piave answered 17/5, 2015 at 17:48 Comment(11)

What is the use case for affine types? Why is it useful to prevent multiple use of a value? – Goatherd 14/8, 2017 at 20:25

@weberc2: Imagine that you have a String, whose content is stored on the heap. By moving it, you avoid the need for reference counting or garbage collection, since no one can access the "old" binding. Well, turns out that there are other usecases for when you'd like to avoid the "old" binding being reused. For example, in a state machine, once you transition out of a state it doesn't make sense to transition again for that state. Move semantics help model this at compile-time. – Piave 15/8, 2017 at 6:31

Regarding your string example, moving creates the use-after-free problem, it doesn't solve it. Normally, the order of events would be: allocate the String, call func1 with the string, call func2 with the string, destroy string. Moving just moves the destructor into func1, thereby preventing func2 from safely running. This is almost never what I want, and it precludes huge swaths of trivially-proven, valid programs. Maybe there are uses, but they seem few and far between--in particular, I'm pretty sure I could make a statically-guaranteed state-machine in a language without affine types. – Goatherd 15/8, 2017 at 19:4

@weberc2: in particular, I'm pretty sure I could make a statically-guaranteed state-machine in a language without affine types => if you do manage it, I'd really like to know how (I've got to use C++ at work, and could use more static guarantees). As for the func1, func2, note that with an affine type system you get a compile-time error, so there's no use-after-free issue; the solution is to pass by reference, or have func1 return the string. The former requires borrow-checking to be safe, the latter some contortions. – Piave 16/8, 2017 at 6:26

If I understand correctly, you can implement the state machine as functions which take an event as input and return another event-handling function as output. Regarding func1, func2, I agree that an affine type system prevents you from calling both functions, but the use-after-free can be statically avoided without precluding such a large swath of valid programs. If precluding those programs offered some greater safety, I could probably appreciate an affine type system, but it's not obvious to me that this is the case. – Goatherd 16/8, 2017 at 14:45

@weberc2: I know only two ways to preclude use-after-free in a performant manner statically: affine types and memory regions (see the Cyclone language for the latter). And the latter does not work with unions/sum types. For example, in Rust &mut T are affine, not Copy, to be able to guarantee memory safety, and I don't readily see how to do without. As for the state machine: one type per state, one function per transition (consume the state, maybe other arguments, and produces a new state). – Piave 16/8, 2017 at 17:13

All you have to do is that the free happens after the uses. Assuming an immutable x, invoking foo(x); bar(x); is guaranteed not to use-after-free so long as foo and bar don't spawn new threads. Also, I don't dispute that affine types permit safe state machines, I'm just not convinced that it's the only way, and if it is, I'm not convinced that it's a sufficiently useful property to justify the many inconveniences. – Goatherd 16/8, 2017 at 17:52

@weberc2: I am afraid that you are forgetting all the cases where a reference to x is stowed away by either foo or bar (global variables, modifying other arguments, etc...). If arguments never escaped functions, of course such analysis would be easy. Unfortunately, such a language would probably be quite unwieldy. And when arguments can escape functions, then your analysis is too trivial I fear. – Piave 16/8, 2017 at 18:11

Maybe. What is meant by "modifying other arguments"? How would this affect an immutable argument's lifetime? Regarding global variables, that certainly complicates the analysis, but it should still be fairly easy. Are there more complex cases where arguments escape functions? – Goatherd 16/8, 2017 at 18:52

@weberc2: The term you are looking for is Escape Analysis. It used to be thought as "certainly, that can't be hard", but nowadays the common wisdom is "let's do our best, can't cover all cases anyway". – Piave 17/8, 2017 at 6:46

I agree, and I'm not proposing covering every case, but to cover no cases makes the language prohibitively difficult IMHO. I don't mean this to be disparaging; I really like a lot about Rust and I want to use it, I just can't justify the time I spend fighting unnecessary compiler errors. – Goatherd 17/8, 2017 at 16:35

When you move an item, you are transferring ownership of that item. That's a key component of Rust.

Let's say I had a struct, and then I assign the struct from one variable to another. By default, this will be a move, and I've transferred ownership. The compiler will track this change of ownership and prevent me from using the old variable any more:

pub struct Foo {
    value: u8,
}

fn main() {
    let foo = Foo { value: 42 };
    let bar = foo;

    println!("{}", foo.value); // error: use of moved value: `foo.value`
    println!("{}", bar.value);
}

how it is implemented.

Conceptually, moving something doesn't need to do anything. In the example above, there wouldn't be a reason to actually allocate space somewhere and then move the allocated data when I assign to a different variable. I don't actually know what the compiler does, and it probably changes based on the level of optimization.

For practical purposes though, you can think that when you move something, the bits representing that item are duplicated as if via memcpy. This helps explain what happens when you pass a variable to a function that consumes it, or when you return a value from a function (again, the optimizer can do other things to make it efficient, this is just conceptually):

// Ownership is transferred from the caller to the callee
fn do_something_with_foo(foo: Foo) {} 

// Ownership is transferred from the callee to the caller
fn make_a_foo() -> Foo { Foo { value: 42 } }

"But wait!", you say, "memcpy only comes into play with types implementing Copy!". This is mostly true, but the big difference is that when a type implements Copy, both the source and the destination are valid to use after the copy!

One way of thinking of move semantics is the same as copy semantics, but with the added restriction that the thing being moved from is no longer a valid item to use.

However, it's often easier to think of it the other way: The most basic thing that you can do is to move / give ownership away, and the ability to copy something is an additional privilege. That's the way that Rust models it.
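A small sketch of "Copy as an additional privilege". The type names Moved and Copied and the function sum_after_copy are hypothetical, chosen only for this example:

```rust
// Without `Copy`, assignment moves.
struct Moved {
    value: u8,
}

// Opting in to `Copy` (which also requires `Clone`) keeps the source usable.
#[derive(Clone, Copy)]
struct Copied {
    value: u8,
}

fn sum_after_copy() -> u8 {
    let a = Copied { value: 20 };
    let b = a; // bitwise copy; `a` stays valid
    a.value + b.value
}

fn main() {
    let m = Moved { value: 1 };
    let n = m; // move; `m` can no longer be used
    // println!("{}", m.value); // would not compile: `m` was moved
    println!("{} {}", n.value, sum_after_copy());
}
```

The bits are duplicated in both cases; the derive only changes whether the compiler still accepts the source afterwards.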

This is a tough question for me! After using Rust for a while the move semantics are natural. Let me know what parts I've left out or explained poorly.

Pibroch answered 17/5, 2015 at 16:4 Comment(9)

It seems that the concept comes from C++11. I found a document about C++ move semantics: open-std.org/jtc1/sc22/wg21/docs/papers/2002/…. Please note «From a client code point of view, choosing move instead of copy means that you don't care what happens to the state of the source.». In Rust it means you are forbidden to use the source after a move. – Chokebore 17/5, 2015 at 16:10

I'd also add that not only does moving out from a variable prevent its further usage, it also disables running a destructor on this variable. This is an important difference from C++ where AFAIK types must be explicitly designed to allow moves because their destructors will always be run, and so the move constructor has to make sure that the destructor won't do anything stupid. – Chose 17/5, 2015 at 16:19

@Chokebore yes, C++11 certainly used the name "move semantics" and formally introduced it to the language, but the concept of move semantics has existed for a very long time. However, it was something that programmers had to track manually. The fact that Rust promotes this sticky topic to a first-class citizen and makes it harder to shoot yourself in the foot is one of the very intriguing parts of the language to me! – Pibroch 17/5, 2015 at 16:19

@VladimirMatveev By doing nothing (see my answer) it also does not run a destructor on this variable. – Chokebore 17/5, 2015 at 16:59

@Shepmaster: Yes, contrary to C++, Rust implements an Affine Type System, values may be used (or "consumed") at most once. Thanks to that, Rust allows implementing "state machines" that are type checked by the compiler, and I have seen a number of libraries leveraging this already. – Piave 17/5, 2015 at 17:28

I like this answer because it's good and because of your SO profile picture. ;) However, it should be mentioned that moves don't always copy. Although I don't often use Rust, I know that, in C++, if a vector's move constructor gets called, the whole point is that the possibly large internal array won't get copied. – Entablature 17/5, 2015 at 22:50

@Entablature being pedantic, the move always copies values (with the caveats about the optimizer I say in the post). However, values that have heap-allocated components (Box, Vec, String, so on) are built with a structure that conceptually has a pointer to the data. The pointer is copied, the pointed-at data is not (that's the realm of the Clone trait). Your point that the large allocation is not moved is correct though. – Pibroch 17/5, 2015 at 23:47

"When you move an item, you are transferring ownership of that item." I keep reading this, but it's not obvious why this is a desirable property. – Goatherd 14/8, 2017 at 20:32

@Goatherd Inherently, it's not. It's more about how everything works together and what it enables. Since every value has a single owner, you know at compile time who is responsible for deallocating it. You don't need a garbage collector because the scope of the value is sufficient. Moving is simply the manifestation of how that ownership can be transferred. – Pibroch 14/8, 2017 at 20:47

Please let me answer my own question. I had trouble, but by asking a question here I did Rubber Duck Problem Solving. Now I understand:

A move is a transfer of ownership of the value.

For example, the assignment let x = a; transfers ownership: at first a owned the value. After the let, it is x that owns the value. Rust forbids using a thereafter.

In fact, if you do println!("a: {:?}", a); after the let, the Rust compiler says:

error: use of moved value: `a`
println!("a: {:?}", a);
                    ^

Complete example:

#[derive(Debug)]
struct Example { member: i32 }

fn main() {
    let a = Example { member: 42 };
    let x = a; // the struct is moved from a to x
    println!("a: {:?}", a); // error: use of moved value: `a`
    println!("x: {:?}", x);
}

And what does this move mean?

It seems that the concept comes from C++11. A document about C++ move semantics says:

From a client code point of view, choosing move instead of copy means that you don't care what happens to the state of the source.

Aha. C++11 does not care what happens to the source. So in this vein, Rust is free to decide to forbid using the source after a move.

And how is it implemented?

I don't know. But I can imagine that Rust does literally nothing. x is just a different name for the same value. Names usually are compiled away (except of course debugging symbols). So it's the same machine code whether the binding has the name a or x.

It seems C++ does the same in copy constructor elision.

Doing nothing is the most efficient possible.

Chokebore answered 17/5, 2015 at 16:54 Comment(1)

Actually, Rust might do something. When you pass by value, in Rust, it moves the value into the function frame, in which case the move might end up being physical (memcpy). – Piave 17/5, 2015 at 17:24

Rust's move keyword always bothered me, so I decided to write down the understanding I reached after discussing it with my colleagues.

I hope this might help someone.

let x = 1;

In the above statement, x is a variable whose value is 1. Now,

let y = || println!("y is a variable whose value is a closure");

So, the move keyword is used to transfer the ownership of a variable to the closure.

In the example below, without move, x is not owned by the closure. Hence x is not owned by y and remains available for further use.

let x = 1;
let y = || println!("this is a closure that prints x = {}", x);

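On the other hand, in this next case, x is moved into the closure, so it is owned by y. (Because an integer is Copy, the closure receives its own copy and x itself stays usable; with a non-Copy type such as String, x would no longer be available for further use.)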

let x = 1;
let y = move || println!("this is a closure that prints x = {}", x);

By owning I mean containing as a member variable. The example cases above are in the same situation as the following two cases. The explanation below can also be read as roughly how the Rust compiler expands the cases above.

The former (without move; i.e. no transfer of ownership),

struct ClosureObject<'a> {
    x: &'a u32
}

let x = 1;
let y = ClosureObject {
    x: &x
};

The latter (with move; i.e. transfer of ownership),

struct ClosureObject {
    x: u32
}

let x = 1;
let y = ClosureObject {
    x: x
};
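The difference is easier to observe with a non-Copy capture. This is a sketch under the assumption of a hypothetical function make_greeter; it is not taken from the answer above:

```rust
// Returns a closure that owns its captured String. `move` is required here
// because the closure outlives the local variable `name`.
fn make_greeter(name: String) -> impl Fn() -> String {
    move || format!("hello, {}", name)
}

fn main() {
    let s = String::from("world");

    // Without `move`, the closure only borrows `s`.
    let borrowing = || s.len();
    println!("{} {}", borrowing(), s); // `s` is still usable

    // With `move`, ownership of `s` is transferred into the closure.
    let owning = move || s.len();
    println!("{}", owning());
    // println!("{}", s); // would not compile: `s` was moved into `owning`

    let greet = make_greeter(String::from("Rust"));
    println!("{}", greet());
}
```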

Aciculum answered 8/6, 2021 at 1:42 Comment(0)

Passing a value to a function also results in a transfer of ownership; it is very similar to the other examples:

struct Example { member: i32 }

fn take(ex: Example) {
    // 2) Now ex is pointing to the data a was pointing to in main
    println!("a.member: {}", ex.member);
    // 3) When ex goes out of scope, so does access to the data it
    // was pointing to, so Rust frees that memory.
}

fn main() {
    let a = Example { member: 42 }; 
    take(a); // 1) The ownership is transferred to the function take
             // 4) We can no longer use a to access the data it pointed to

    println!("a.member: {}", a.member);
}

Hence the expected error:

post_test_7.rs:12:30: 12:38 error: use of moved value: `a.member`
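If the value is still needed after the call, two common ways out are to lend it instead of giving it away, or to have the function hand it back. A minimal sketch, assuming hypothetical functions inspect and take_and_return:

```rust
struct Example {
    member: i32,
}

// Option 1: borrow instead of taking ownership.
fn inspect(ex: &Example) -> i32 {
    ex.member
}

// Option 2: take ownership and hand the value back to the caller.
fn take_and_return(ex: Example) -> Example {
    ex
}

fn main() {
    let a = Example { member: 42 };
    println!("{}", inspect(&a)); // `a` is only borrowed, so it stays usable
    let a = take_and_return(a); // ownership goes in and comes back out
    println!("a.member: {}", a.member);
}
```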

Merganser answered 17/5, 2015 at 17:25 Comment(0)

let s1: String = String::from("hello");
let s2: String = s1;

To ensure memory safety, Rust invalidates s1, so instead of being a shallow copy, this is called a move.

fn main() {
    // Each value in Rust has a variable that is called its owner.
    // There can only be one owner at a time.
    let s = String::from("hello");
    take_ownership(s);
    // println!("{}", s);
    // Error: borrow of moved value: `s`. The value is borrowed here after the move,
    // so s cannot be used once it has been moved.
    // Passing a parameter to a function is the same as assigning s to another
    // variable: s is moved into `my_string`, then `println!("{}", my_string)` runs
    // and prints the string. When that scope ends, my_string is dropped.

    let x: i32 = 2;
    makes_copy(x);
    // Instead of being moved, integers are copied, so we can still use x after the call.
    // Primitive types are Copy; they are stored on the stack because their size
    // is known at compile time.
    println!("{}", x);
}

fn take_ownership(my_string: String) {
    println!("{}", my_string);
}

fn makes_copy(some_integer: i32) {
    println!("{}", some_integer);
}

Salish answered 21/3, 2022 at 20:49 Comment(0)

In general the concept of moves applies to types that "own" a resource. Ownership can be divided into Unique ownership and Shared ownership. Types with Unique ownership include Box, Vec and String which own memory on the heap, but also types like File which own file handles. Types with Shared ownership include Rc and Arc.

A type representing unique ownership cannot safely be shallow* copied; doing so would violate the basic invariants of the type. A type representing shared ownership can be copied, but doing so is non-trivial (and so requires "Clone" rather than "Copy") as bookkeeping information must be updated.

On the other hand these types can generally be moved around. As long as the number of references doesn't change the invariant is upheld.

Mechanically, a Rust move is very similar to a trivial copy. Semantically, the data representing the type is copied from the old location to the new one. The difference between a "move" and a "copy" is that after a move the source is no longer considered to be a valid object of the type in question. Operations of the type in question, including dropping it, are no longer valid on the source location.
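A short sketch of what "the data representing the type is copied" means for a String, assuming a hypothetical helper buffer_stays_put: only the handle changes location, while the heap buffer keeps its address.

```rust
// Returns true if the heap buffer has the same address before and after a move.
fn buffer_stays_put() -> bool {
    let s = String::from("Hello, World!");
    let before = s.as_ptr(); // raw address of the heap buffer, not a borrow
    let t = s; // move: only the handle (pointer, length, capacity) is copied
    let after = t.as_ptr();
    before == after
}

fn main() {
    println!("{}", buffer_stays_put());
}
```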

For local variables, the compiler keeps track of whether a variable is in a valid state or not. If the variable might not be in a valid state then it will not let you read from the variable. If the value is not in a valid state then it will not call drop when the variable is reassigned or goes out of scope.
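The drop behaviour can be observed with a counter. This is a sketch with a hypothetical type Noisy whose destructor increments a Cell; the names are made up for this example:

```rust
use std::cell::Cell;

// Hypothetical type that records every time its destructor runs.
struct Noisy<'a> {
    drops: &'a Cell<usize>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Counts how many destructors run for one value that is moved once.
fn drops_after_move() -> usize {
    let drops = Cell::new(0);
    {
        let a = Noisy { drops: &drops };
        let _b = a; // `a` is now moved-from; no destructor will run for it
    } // only `_b` is dropped here
    drops.get()
}

fn main() {
    println!("destructors run: {}", drops_after_move());
}
```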

For values stored in other places, the mechanism of moving them may differ. For example Vec has a pop() method which removes the last item from the Vec and returns it to the caller.
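A sketch of moving values out of a Vec, using a hypothetical helper move_out_demo. Besides pop(), it uses std::mem::take, which the answer does not mention; it is shown here as one other standard-library way to move a value out of a place while leaving a valid default value behind.

```rust
fn move_out_demo() -> (Option<String>, String, Vec<String>) {
    let mut names = vec![String::from("a"), String::from("b")];

    // let first = names[0]; // would not compile: cannot move out of an index

    // pop() moves the last element out and shrinks the Vec.
    let last = names.pop();

    // std::mem::take moves the element out and leaves an empty String in its slot.
    let first = std::mem::take(&mut names[0]);

    (last, first, names)
}

fn main() {
    let (last, first, rest) = move_out_demo();
    println!("{:?} {:?} {:?}", last, first, rest);
}
```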

This differs somewhat from the C++ concept of a move. In a C++ move the origin needs to be left in a valid state where, at the very least, it is safe to call the destructor.

This difference has several consequences.

Moves in Rust are always "trivial", a simple copy of the fields, whose cost depends only on the size of the type. Moves in C++ basically always involve a "move constructor" or "move assignment" because they must put the source into a safe state.
Because moves are trivial, the compiler has more freedom in terms of how values are passed to and returned from functions. Rust types can often be passed in registers where the corresponding C++ type must be passed on the stack.
It is practical to define smart pointers as "never null". A Box in Rust always owns a valid T. This is not practical in C++ because the "moved from" state for a smart pointer is essentially a null state.

* Depending on the type of resource in question, it may be possible to "deep copy" a type representing unique ownership by copying the resource that the type owns, but this may be an expensive operation.

Waterproof answered 5/2, 2024 at 21:16 Comment(0)
