How does Rust know whether to run the destructor during stack unwind?

The documentation for mem::uninitialized points out why it is dangerous/unsafe to use that function: calling drop on uninitialized memory is undefined behavior.

So this code should be, I believe, undefined:

let a: TypeWithDrop = unsafe { mem::uninitialized() };
panic!("=== Testing ==="); // Destructor of `a` will be run (U.B)

However, I wrote this piece of code which works in safe Rust and does not seem to suffer from undefined behavior:

#![feature(conservative_impl_trait)]

trait T {
    fn disp(&mut self);
}

struct A;
impl T for A {
    fn disp(&mut self) { println!("=== A ==="); }
}
impl Drop for A {
    fn drop(&mut self) { println!("Dropping A"); }
}

struct B;
impl T for B {
    fn disp(&mut self) { println!("=== B ==="); }
}
impl Drop for B {
    fn drop(&mut self) { println!("Dropping B"); }
}

fn foo() -> impl T { return A; }
fn bar() -> impl T { return B; }

fn main() {
    let mut a;
    let mut b;

    let i = 10;
    let t: &mut T = if i % 2 == 0 {
        a = foo();
        &mut a
    } else {
        b = bar();
        &mut b
    };

    t.disp();
    panic!("=== Test ===");
}

It always seems to execute the right destructor while ignoring the other one. If I try to use a or b directly (like a.disp() instead of t.disp()), the compiler correctly rejects the code, saying I might be using uninitialized memory. What surprised me is that while panicking, it always runs the right destructor (printing the expected string) no matter what the value of i is.

How does this happen? If the runtime can determine which destructor to run, should the requirement that memory must be initialized for types implementing Drop be removed from the documentation of mem::uninitialized() linked above?

Ventage answered 28/9, 2016 at 14:45 Comment(4)
As Raymond Chen is fond of pointing out, because "undefined behavior" means "anything can happen and still be valid," one of the valid consequences is for everything to appear to run correctly. - Mislike
A.K.A. "mistaking absence of evidence for evidence of absence", as in the infamous Black Swan. - Fauch
@mickeyf @MasonWheeler: Sorry if I didn't understand you, but in Rust I would expect that if I don't use any unsafe code (which I don't in the main example above), whatever behavior is observed (even on the first run) is well defined. That is one of the main reasons many would choose Rust in the first place (at least I did). - Ventage
@Ventage I don't know Rust at all, so I cannot say whether this is "undefined behavior" or not, but to spell out what Mason and I are saying: if what you show is in fact "undefined behavior", then even if it seemed to work this time, or even the next 100 times, that could be because you got lucky, and it may not work every time. - Fauch

Using drop flags.

Rust (up to and including version 1.12) stores a boolean flag in every value whose type implements Drop (and thus increases that type's size by one byte). That flag decides whether to run the destructor. So when you do b = bar() it sets the flag for the b variable, and thus only runs b's destructor. Vice versa with a.

Note that starting from Rust version 1.13 (the beta compiler at the time of this writing), that flag is no longer stored in the type but on the stack, for every variable or temporary. This is made possible by the advent of the MIR in the Rust compiler. The MIR significantly simplifies the translation of Rust code to machine code, and thus made it feasible to move drop flags to the stack. Optimizations will usually eliminate the flag if they can figure out at compile time which object will be dropped and when.
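
As a rough illustration of the observable effect (not of how the compiler implements it), here is a minimal sketch for current stable Rust. It records destructor calls in a thread-local log and uses std::panic::catch_unwind so the unwind can be inspected afterwards. The names DROPS, Noisy and dropped_during_unwind are hypothetical and exist only for this example.

```rust
use std::cell::RefCell;
use std::panic;

thread_local! {
    // Hypothetical log of destructor calls, only here to make drops observable.
    static DROPS: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

struct Noisy(&'static str);

impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.with(|d| d.borrow_mut().push(self.0));
    }
}

// Initializes exactly one of two locals, then panics.
// Returns the names whose destructors ran during unwinding.
fn dropped_during_unwind(i: u32) -> Vec<&'static str> {
    DROPS.with(|d| d.borrow_mut().clear());
    let result = panic::catch_unwind(|| {
        let a; // declared but not yet initialized
        let b; // declared but not yet initialized
        let chosen: &Noisy = if i % 2 == 0 {
            a = Noisy("a"); // only `a` becomes initialized on this path
            &a
        } else {
            b = Noisy("b"); // only `b` becomes initialized on this path
            &b
        };
        let _ = chosen.0;
        panic!("=== Test ===");
    });
    assert!(result.is_err());
    DROPS.with(|d| d.borrow().clone())
}

fn main() {
    println!("{:?}", dropped_during_unwind(10));
    println!("{:?}", dropped_during_unwind(11));
}
```

Only the variable that was actually assigned shows up in the log, whichever branch is taken. The panic message is still printed to stderr by the default panic hook.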

You can "observe" this flag in a Rust compiler up to version 1.12 by looking at the size of the type:

struct A;

struct B;

impl Drop for B {
    fn drop(&mut self) {}
}

fn main() {
    println!("{}", std::mem::size_of::<A>());
    println!("{}", std::mem::size_of::<B>());
}

This prints 0 and 1 respectively before stack flags, and 0 and 0 with stack flags.

Using mem::uninitialized is still unsafe, however, because the compiler still sees the assignment to the a variable and sets the drop flag. Thus the destructor will be called on uninitialized memory. Note that in your example the Drop impl does not access any memory of your type (except for the drop flag, but that is invisible to you). Therefore you are not accessing the uninitialized memory (which is zero bytes in size anyway, since your type is a zero-sized struct). To the best of my knowledge that means that your unsafe { std::mem::uninitialized() } code is actually safe, because afterwards no memory unsafety can occur.

Quiet answered 28/9, 2016 at 14:54 Comment(2)
Right - my drop() impl is trivial, just to show the printouts of which ones are executed, but I was talking about the general case, of course. One question though: why is mem::uninitialized() seen as an assignment that sets the drop flag? If I were setting it to a particular location (like operator new()) then it would make sense, but assigning a random memory location (wherever it is on the stack at that time) should not set the drop flag, should it? I mean, when would that be meaningful? (It could set it on the next assignment, just like for b when i is odd.) - Ventage
Well... the difference is that let x = mem::uninitialized() allows you to take references to x, while let x; doesn't allow this. Therefore you can pass that reference to a function that only writes to that memory. A common example is let mut x: [i8; 42] = mem::uninitialized(); some_slice_function(&mut x) - Quiet

There are two questions hidden here:

  1. How does the compiler track which variable is initialized or not?
  2. Why may initializing with mem::uninitialized() lead to Undefined Behavior?

Let's tackle them in order.


How does the compiler track which variable is initialized or not?

The compiler injects so-called "drop flags": for each variable for which Drop must run at the end of the scope, a boolean flag is injected on the stack, stating whether this variable needs to be disposed of.

The flag starts off "no", moves to "yes" if the variable is initialized, and back to "no" if the variable is moved from.

Finally, when the time comes to drop this variable, the flag is checked and the variable is dropped if necessary.

This is unrelated to whether the compiler's flow analysis complains about potentially uninitialized variables: code is only generated once the flow analysis is satisfied.
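
The "yes"/"no" transitions described above can be made visible with a small sketch. This is only an illustration of the behavior, assuming current stable Rust; the flag itself is not accessible from user code, and the names LOG, log, Tracked, consume and scenario are hypothetical.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical event log used to observe when destructors run.
    static LOG: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

fn log(msg: &str) {
    LOG.with(|l| l.borrow_mut().push(msg.to_string()));
}

struct Tracked(&'static str);

impl Drop for Tracked {
    fn drop(&mut self) {
        log(&format!("drop {}", self.0));
    }
}

// Takes ownership; `t` is dropped when this function returns.
fn consume(t: Tracked) {
    log(&format!("consume {}", t.0));
}

fn scenario(move_it: bool) -> Vec<String> {
    LOG.with(|l| l.borrow_mut().clear());
    {
        let x = Tracked("x"); // conceptually: flag for `x` becomes "yes"
        if move_it {
            consume(x); // conceptually: flag for `x` goes back to "no"
        }
        log("end of scope");
        // `x` is dropped here only if it was not moved out above.
    }
    LOG.with(|l| l.borrow().clone())
}

fn main() {
    println!("{:?}", scenario(true));
    println!("{:?}", scenario(false));
}
```

Either way the destructor runs exactly once: inside consume when the value was moved, or at the end of the scope when it was not. Because the move is conditional, the compiler has to decide at runtime whether x still needs dropping.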


Why may initializing with mem::uninitialized() lead to Undefined Behavior?

When using mem::uninitialized() you make a promise to the compiler: don't worry, I'm definitely initializing this.

As far as the compiler is concerned, the variable is therefore fully initialized, and the drop flag is set to "yes" (until you move out of it).

This, in turn, means that Drop will be called.

Using an uninitialized object is Undefined Behavior, and the compiler calling Drop on an uninitialized object on your behalf counts as "using it".
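
For comparison, later Rust versions (1.36 and up) added std::mem::MaybeUninit, and mem::uninitialized has since been deprecated. It is not part of the original discussion, so take the following as a hedged sketch of the contrast: a MaybeUninit<T> never runs T's destructor on its own, so the compiler does not drop a value you never initialized. The names DROPS, TypeWithDrop (borrowed from the question), payload and the two helper functions are assumptions for illustration.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;

thread_local! {
    // Hypothetical counter of destructor calls.
    static DROPS: Cell<usize> = Cell::new(0);
}

struct TypeWithDrop {
    payload: Vec<u8>,
}

impl Drop for TypeWithDrop {
    fn drop(&mut self) {
        DROPS.with(|d| d.set(d.get() + 1));
    }
}

// The slot is never initialized: no destructor of TypeWithDrop runs.
fn drops_when_never_initialized() -> usize {
    let before = DROPS.with(|d| d.get());
    {
        let _slot: MaybeUninit<TypeWithDrop> = MaybeUninit::uninit();
    }
    DROPS.with(|d| d.get()) - before
}

// The slot is written first and only then treated as a real value.
fn drops_when_initialized() -> usize {
    let before = DROPS.with(|d| d.get());
    {
        let mut slot = MaybeUninit::<TypeWithDrop>::uninit();
        slot.write(TypeWithDrop { payload: vec![1, 2, 3] });
        // SAFETY: `slot` was fully initialized by the `write` call above.
        let value = unsafe { slot.assume_init() };
        assert_eq!(value.payload.len(), 3);
        // `value` is an ordinary TypeWithDrop and is dropped at scope end.
    }
    DROPS.with(|d| d.get()) - before
}

fn main() {
    println!("{}", drops_when_never_initialized());
    println!("{}", drops_when_initialized());
}
```

The promise "I'm definitely initializing this" is made explicitly at the assume_init call rather than at the point where the storage is created.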


Bonus:

In my tests, nothing weird happened!

Note that Undefined Behavior means that anything can happen; anything, unfortunately, also includes "seems to work" (or even "works as intended despite the odds").

In particular, if you do NOT access the object's memory in Drop::drop (just printing), then it's very likely that everything will just work. If you do access it, however, you might see weird integers, pointers pointing into the wild, etc...

And if the optimizer is clever, even without accessing it, it might do weird things! Since we are using LLVM, I invite you to read "What Every C Programmer Should Know About Undefined Behavior" by Chris Lattner (LLVM's father).

Trela answered 28/9, 2016 at 15:16 Comment(4)
Further question: if i is even and the compiler can track that b is uninitialized and not call its destructor on unwind, why can't it track that something like let u: Type = mem::uninitialized(); is uninitialized too (in fact I'm explicitly specifying it as such) and not call its destructor during stack unwind unless it was assigned? - Ventage
On the bonus part though: U.B. should not be possible without unsafe code, so if it appears to work it had better be defined behavior, since I haven't used any unsafe code in the main example in the O.P. (in case you are referring to it). - Ventage
(B.t.w. your answer is pretty explanatory/helpful as well, but I accepted the other one just because it was posted first and was equally helpful.) - Ventage
@ustulation: UB only occurs if you use mem::uninitialized(), which requires unsafe code. I was merely commenting that "nothing weird happening" did not mean that you had steered clear of UB :) As for accepting another answer: you are free to accept whatever you wish! :) - Trela

First, there are drop flags - runtime information for tracking which variables have been initialized. If a variable was not assigned to, drop() will not be executed for it.

In stable, the drop flag is currently stored within the type itself. Writing uninitialized memory to it can cause undefined behavior as to whether drop() will or will not be called. This information will soon be out of date, because in nightly the drop flag has been moved out of the type itself.

In nightly Rust, if you assign uninitialized memory to a variable, it would be safe to assume that drop() will be executed. However, any useful implementation of drop() will operate on the value. There is no way to detect if the type is properly initialized or not within the Drop trait implementation: it could result in trying to free an invalid pointer or any other random thing, depending on the Drop implementation of the type. Assigning uninitialized memory to a type with Drop is ill-advised anyway.
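
To make the "free an invalid pointer" case concrete, here is a hedged sketch of a Drop implementation that operates on its value. OwnedInt, FREED and round_trip are hypothetical names; the example only ever drops a properly initialized value, since doing otherwise would be undefined behavior.

```rust
use std::cell::Cell;

thread_local! {
    // Hypothetical counter of how many times the destructor freed memory.
    static FREED: Cell<usize> = Cell::new(0);
}

// Hypothetical type that owns a heap allocation through a raw pointer.
struct OwnedInt {
    ptr: *mut i32,
}

impl OwnedInt {
    fn new(v: i32) -> Self {
        OwnedInt { ptr: Box::into_raw(Box::new(v)) }
    }

    fn get(&self) -> i32 {
        // SAFETY: `ptr` was produced by Box::into_raw in `new` and is still live.
        unsafe { *self.ptr }
    }
}

impl Drop for OwnedInt {
    fn drop(&mut self) {
        // SAFETY: sound only if `ptr` really came from Box::into_raw.
        // Nothing here can check that: with uninitialized contents, `ptr`
        // would be garbage and this call would free an arbitrary address.
        unsafe { drop(Box::from_raw(self.ptr)) };
        FREED.with(|f| f.set(f.get() + 1));
    }
}

// Creates a value, reads it, lets it drop; returns (value read, frees seen).
fn round_trip(v: i32) -> (i32, usize) {
    let before = FREED.with(|f| f.get());
    let read = {
        let owned = OwnedInt::new(v);
        owned.get()
    };
    (read, FREED.with(|f| f.get()) - before)
}

fn main() {
    println!("{:?}", round_trip(7));
}
```

The destructor has to trust the bytes it finds in self, which is exactly why running it on uninitialized memory is a problem for any non-trivial type.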

Ailin answered 28/9, 2016 at 15:07 Comment(0)
