Why does modifying a mutable reference's value through a raw pointer not violate Rust's aliasing rules?

I don't have a particularly solid understanding of Rust's aliasing rules (and from what I've heard they're not solidly defined), but I'm having trouble understanding what makes this code example in the std::slice documentation okay. I'll repeat it here:

let x = &mut [1, 2, 4];
let x_ptr = x.as_mut_ptr();

unsafe {
    for i in 0..x.len() {
        *x_ptr.offset(i as isize) += 2;
    }
}
assert_eq!(x, &[3, 4, 6]);

The problem I see here is that x, being an &mut reference, can be assumed to be unique by the compiler. The contents of x get modified through x_ptr, and then read back via x, and I see no reason why the compiler couldn't just assume that x hadn't been modified, since it was never modified through the only existing &mut reference.

So, what am I missing here?

  • Is the compiler required to assume that *mut T may alias &mut T, even though it's normally allowed to assume that &mut T never aliases another &mut T?

  • Does the unsafe block act as some sort of aliasing barrier, where the compiler assumes that code inside it may have modified anything in scope?

  • Is this code example broken?

If there is some kind of stable rule that makes this example okay, what exactly is it? What is its extent? How much should I worry about aliasing assumptions breaking random things in unsafe Rust code?

State answered 28/9, 2018 at 4:11 Comment(4)
I think that it's LLVM that handles this: as x and x_ptr hold addresses of the same type, LLVM must reload x. – Kinnon
@Kinnon Really? I was under the impression that type-based alias analysis allowed LLVM to make stronger assumptions about the disjointness of objects of the same type in memory. – State
@Mylin: From memory, TBAA is opt-in (the front-end needs to emit specific attributes) and rustc doesn't opt in. Instead it uses per-variable annotations. – Thuythuya
Indeed, Rust does NOT do any reasoning based on pointee type (except for checking for interior mutability). So what @Kinnon wrote is incorrect for Rust. – Culbertson

Disclaimer: there is no formal memory model yet.¹

First of all, I'd like to address:

The problem I see here is that x, being an &mut reference, can be assumed to be unique by the compiler.

Yes... and no. x can only be assumed to be unique if not borrowed, an important distinction:

fn doit<T>(x: &mut T) {
    let _y = &mut *x;
    // x is re-borrowed at this point: it cannot be used while _y is live.
}

Therefore, currently, I would work with the assumption that deriving a pointer from x will temporarily "borrow" x in some sense.
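
As a minimal sketch of that reborrow discipline (the names `bump` and `value` are hypothetical, chosen only for illustration): while the reborrow `y` is live, `x` cannot be used, and once `y` has seen its last use, `x` is unique again. For references the compiler enforces this ordering; for a raw pointer derived from `x` nothing is checked, so the same ordering has to be maintained by hand.

```rust
fn bump(x: &mut i32) {
    let y = &mut *x; // reborrow: `x` is inaccessible while `y` is live
    *y += 1;
    // *x += 1; *y += 1; // would not compile: `x` used while `y` is still live
    *x += 1; // fine: `y` is never used again, so `x` is unique once more
}

fn main() {
    let mut value = 0;
    bump(&mut value);
    assert_eq!(value, 2);
}
```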

This is all wishy-washy in the absence of a formal model, of course, and it is part of the reason why rustc is not too aggressive with aliasing optimizations yet: until a formal model is defined and code is checked against it, optimizations have to be conservative.

¹ The RustBelt project is all about establishing a formally proven memory model for Rust. The latest news from Ralf Jung was about a Stacked Borrows model.


From Ralf (comments): the key point in the above example is that there is a clear transfer from x to x_ptr and back to x again. So the x_ptr is a scoped borrow in a sense. Should the usage go x, x_ptr, back to x and back to x_ptr, then the latter would be Undefined Behavior:

fn main() {
    let x = &mut [1, 2, 4];
    let x_ptr = x.as_mut_ptr(); // x_ptr borrows the right to mutate

    unsafe {
        for i in 0..x.len() {
            *x_ptr.offset(i as isize) += 2; // Fine use of raw pointer.
        }
    }
    assert_eq!(x, &[3, 4, 6]);  // x is back in charge, x_ptr invalidated.

    unsafe { *x_ptr += 1; }     // BÄM! Used no-longer-valid raw pointer.
}
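
One way to stay on the safe side of that rule, sketched here under the "scoped borrow" reading given above (the helper `add_to_all` is a hypothetical name, not something from the standard library), is to derive a fresh raw pointer from x each time control passes back to raw-pointer code, rather than keeping an old one around:

```rust
// Hypothetical helper: adds `delta` to every element through a raw pointer.
fn add_to_all(x: &mut [i32], delta: i32) {
    let len = x.len(); // read the length before creating the raw pointer
    let x_ptr = x.as_mut_ptr(); // derived from `x`; `x` is not touched until the loop is done
    unsafe {
        for i in 0..len {
            // In bounds: `i < len`, and `x_ptr` points to `len` elements.
            *x_ptr.add(i) += delta;
        }
    }
}

fn main() {
    let x = &mut [1, 2, 4];

    add_to_all(x, 2);
    assert_eq!(x, &[3, 4, 6]); // x is back in charge; the old raw pointer no longer exists

    add_to_all(x, 1); // a fresh raw pointer is derived from x for the second round
    assert_eq!(x, &[4, 5, 7]);
}
```

Keeping the raw pointer local to a function also means it cannot outlive the borrow it was derived from, so there is no stale pointer left to misuse.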
Thuythuya answered 28/9, 2018 at 9:44 Comment(6)
Indeed, the key point is that x_ptr is derived from x AND x has not been used since x_ptr was created. Both of these have to be true for this code to be correct. – Culbertson
Might be worth adding an example like play.rust-lang.org/… showing that using x_ptr again after x was used is not allowed. – Culbertson
@RalfJung It's """not allowed""", yet assert_eq!(x, &[3, 4, 6]); right after the last line fails and reports that the value has changed to 4, 4, 6... So are we back to having the issues Rust was built to avoid, by simply not even defining what's correct and what's not? If the compiler currently holds back on optimizing this so as not to break it (I built in release mode, same result; I get the results I would expect from C), then what's even the point of these arbitrary rules? This is a major pain point for me: it's impossible to figure out what's wrong and what's okay to do... – Homopterous
To be completely clear: &mut + &mut = compile error, this is the only thing that's obvious... I got here by trying to figure out whether &mut + *mut = wrong (here we claim that it's wrong), and whether *mut + *mut = wrong (I have yet to find anything mentioning it). If there are no clear rules set yet, then is this UB, but not UB(tm)? – Homopterous
It's UB, and exploited in some cases. Some optimizations are temporarily disabled because of LLVM bugs. But just because the compiler does not currently exploit your code to the maximum possible extent doesn't mean it won't get better in the future. You can't expect the compiler to do all the most aggressive optimizations from the start. In C the usual approach seems to be "optimize until someone complains it's wrong"; we'd like to first be sure we know what's right. – Culbertson
@Sahsahae Also, unsafe Rust does indeed share many of the problems of Undefined Behavior with C and C++. The value in Rust lies in the ability to seal unsafety behind an abstraction, and localize it. Compare std::vector in C++ and Vec in Rust: their implementations are very similar, and equally dangerous in both languages. But as a user there is a huge difference: in C++ you have to worry about iterator invalidation etc. all the time, while in Rust you know the compiler has got your back. – Culbertson
