Ownership tracking in Rust: Difference between Box<T> (heap) and T (stack)

Experimenting with Rust, I found that the compiler can track a move of a single field of a struct on the stack very accurately (it knows exactly which field has moved). However, when I put one part of the structure into a Box (i.e. onto the heap), the compiler can no longer determine field-level moves for anything that happens after the dereference of the box. It assumes that the whole structure "inside the box" has moved. Let's first see an example where everything is on the stack:

struct OuterContainer {
    inner: InnerContainer
}

struct InnerContainer {
    val_a: ValContainer,
    val_b: ValContainer
}

struct ValContainer {
    i: i32
}


fn main() {
    // Note that the whole structure lives on the stack.
    let structure = OuterContainer {
        inner: InnerContainer {
            val_a: ValContainer { i: 42 },
            val_b: ValContainer { i: 100 }
        }
    };

    // Move just one field (val_a) of the inner container.
    let move_me = structure.inner.val_a;

    // We can still borrow the other field (val_b).
    let borrow_me = &structure.inner.val_b;
}

And now the same example but with one minor change: We put the InnerContainer into a box (Box<InnerContainer>).

struct OuterContainer {
    inner: Box<InnerContainer>
}

struct InnerContainer {
    val_a: ValContainer,
    val_b: ValContainer
}

struct ValContainer {
    i: i32
}


fn main() {
    // Note that OuterContainer itself lives on the stack; only its inner field is boxed.
    let structure = OuterContainer {
        inner: Box::new(InnerContainer {
            val_a: ValContainer { i: 42 },
            val_b: ValContainer { i: 100 }
        })
    };

    // Move just one field (val_a) of the inner container.
    // Note that now, the inner container lives on the heap.
    let move_me = structure.inner.val_a;

    // We can no longer borrow the other field (val_b).
    let borrow_me = &structure.inner.val_b; // error: "value used after move"
}

I suspect that it has something to do with the nature of the stack vs. the nature of the heap, where the former is static (per stack frame at least) and the latter is dynamic. Maybe the compiler needs to play it safe for some reason I cannot identify or articulate well enough.

Enterostomy answered 20/5, 2017 at 4:20 Comment(3)
i32 is a Copy type, so the data gets copied, not moved. – Wakerife
But I am operating on the surrounding struct (ValContainer), not on the contained integer. And custom struct types are not Copy by default, to my knowledge. – Enterostomy
Yup, you were right. I did not read your code properly. – Wakerife

In the abstract, a struct on the stack is kind of just a bunch of variables under a common name. The compiler knows this, and can break a structure into a set of otherwise independent stack variables. This lets it track the movement of each field independently.

It can't do that with a Box, or any other kind of custom allocation, because the compiler doesn't control Boxes. Box is just some code in the standard library, not an intrinsic part of the language. Box has no way of reasoning about different parts of itself suddenly becoming invalid. When it comes time to destroy a Box, its Drop implementation only knows to destroy everything.

To put it another way: on the stack, the compiler is in full control, and can thus do fancy things like breaking structures up and moving them piecemeal. As soon as custom allocation enters the picture, all bets are off, and the compiler has to back off and stop trying to be clever.
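
One way to get at both fields without relying on the compiler tracking moves through the Box is to move the whole InnerContainer out of the Box first and only then take it apart as an ordinary local. The following is a minimal sketch under that assumption; the `split` helper is hypothetical and not part of the question's code.

```rust
struct ValContainer {
    i: i32,
}

struct InnerContainer {
    val_a: ValContainer,
    val_b: ValContainer,
}

struct OuterContainer {
    inner: Box<InnerContainer>,
}

// Hypothetical helper: splits a boxed InnerContainer into its two fields.
fn split(structure: OuterContainer) -> (ValContainer, ValContainer) {
    // Dereferencing the Box by value moves the whole InnerContainer
    // out of the heap allocation and into a local variable.
    let inner: InnerContainer = *structure.inner;

    // `inner` is now a plain local, so its fields can be moved out one by one.
    let InnerContainer { val_a, val_b } = inner;
    (val_a, val_b)
}

fn main() {
    let structure = OuterContainer {
        inner: Box::new(InnerContainer {
            val_a: ValContainer { i: 42 },
            val_b: ValContainer { i: 100 },
        }),
    };

    let (move_me, other) = split(structure);
    println!("{} {}", move_me.i, other.i);
}
```

The cost is that the Box's contents are gone afterwards, so this only fits when the boxed value does not need to stay in place.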

Backstretch answered 20/5, 2017 at 4:34 Comment(8)
What do you mean by boxes control boxes? I also don't quite understand what Drop has to do with it all. Drop is called when the Box value is destroyed, and the compiler knows exactly when that happens. So why does it lose control? – Enterostomy
@MightyNicM: I've tried to clarify the "control boxes" part. The reason Drop is important is that it's what destroys the contents of the Box (or any other type). If the compiler were to move one part out of a Box, the Box has no way of knowing that. There's just suddenly a hole in the middle of its allocation, which could cause problems when it tries to destroy said hole. It's not a problem of when to drop, it's a problem of what to drop. The compiler can deal with "holes" in the stack, but nowhere else. – Backstretch
I see. I never truly understood the term "move". Somewhere, I read that the compiler will still copy the value bit by bit, but just no longer allows the moved-from part to be used in any way. So I guess when using the term "hole", you just mean the old copy of the data that shouldn't be used (or destroyed) anymore? – Enterostomy
@MightyNicM: Right. Once the compiler moves something, the bits left behind in the old position are invalid and must not be touched. If they are, you can end up with double-frees and other bad behaviour. – Backstretch
So, let me then ask the more general question: Is it in principle even possible to track ownership on the heap, assuming the compiler managed Boxes itself? Since the heap is dynamic, it can contain cyclic and recursive structures, possibly aliases. So is it computable even in theory? – Enterostomy
@MightyNicM: Rust cannot track heap lifetimes. It might be possible in theory, but I don't believe it's even hypothetically planned for. – Backstretch
The important distinction is the destructor, not the stack versus the heap. If you take the non-boxed version and implement Drop for either OuterContainer or InnerContainer, then partial moves are disallowed. – Dignify
Some minor pedantry: the compiler does in fact know about Box, and has historically enabled you to do the sort of things with a Box that you could do with structs. The only one I remember that still exists is that you can move out of a dereference of a Box, which isn't something anything else can do. Most of this stuff was significantly rolled back for 1.0, because we didn't want the compiler to give one type a bunch of special analysis (why not Rc, Vec, etc.?). There have since been some efforts to come up with general mechanisms to expose the old Box analysis in a principled way. – Olivette
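
To illustrate the comment about destructors: below is a rough sketch of the non-boxed InnerContainer with a Drop implementation added. The direct field move is left commented out because the compiler rejects it (error E0509), and `take_val_a` is a hypothetical helper showing one common workaround, swapping a placeholder value into the field with `std::mem::replace` so the struct stays fully valid for its destructor.

```rust
use std::mem;

struct ValContainer {
    i: i32,
}

struct InnerContainer {
    val_a: ValContainer,
    val_b: ValContainer,
}

// Adding a destructor is what rules out partial moves here,
// even though the value still lives on the stack.
impl Drop for InnerContainer {
    fn drop(&mut self) {
        // The destructor may read every field, so all of them must be valid.
        println!("dropping ({}, {})", self.val_a.i, self.val_b.i);
    }
}

// Hypothetical helper: takes val_a out by swapping in a placeholder,
// so no field is ever left in a moved-from state.
fn take_val_a(inner: &mut InnerContainer) -> ValContainer {
    mem::replace(&mut inner.val_a, ValContainer { i: 0 })
}

fn main() {
    let mut inner = InnerContainer {
        val_a: ValContainer { i: 42 },
        val_b: ValContainer { i: 100 },
    };

    // let move_me = inner.val_a; // rejected: cannot move out of a type that implements Drop

    let move_me = take_val_a(&mut inner);
    let borrow_me = &inner.val_b;
    println!("{} {}", move_me.i, borrow_me.i);
}
```

Whether a placeholder like `ValContainer { i: 0 }` is acceptable depends on the type; wrapping the field in an Option and calling `take()` is another option when no sensible placeholder exists.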
