Why doesn't std::cell::Ref use a reference instead of NonNull?

The std::cell::Ref struct in Rust is defined as follows:

pub struct Ref<'b, T: ?Sized + 'b> {
    // NB: we use a pointer instead of `&'b T` to avoid `noalias` violations, because a
    // `Ref` argument doesn't hold immutability for its whole scope, only until it drops.
    // `NonNull` is also covariant over `T`, just like we would have with `&T`.
    value: NonNull<T>,
    borrow: BorrowRef<'b>,
}

The // NB comment (I assume Nota bene / Nasty Bug or something?) implies that the following definition would not work, because it would be a noalias violation (do they mean the LLVM attributes in the backend?):

pub struct Ref2<'b, T: ?Sized + 'b> {
    value: &'b T,
    borrow: BorrowRef<'b>,
}

I don't understand this point, as I was under the impression that non-lexical lifetime semantics were properly preserved in code generation. Otherwise the following simple example (which of course compiles) would also be illegal, right?

struct Foo<'a> {
    v: &'a i32,
}
fn foo(x: &mut i32) {
    let f = Foo { v: x };
    *x = 5; // value modified while the `noalias` f.v pointer is still in scope
}

Could somebody with more knowledge about the internals shed some light on this for me? I fear that I am misunderstanding something critical here, leading to potential issues in my own unsafe code.

Gadoid answered 2/12, 2023 at 14:8 Comment(3)
Your last example works because f is no longer in scope when you modify x thanks to Non-Lexical Lifetimes. (Pontefract)
@Jmb: please re-read the paragraph right before that code example :). (Gadoid)
Well, the whole point of RefCell is that the RefCell is still in scope while you access the mutable reference returned by borrow_mut, so if the RefCell held a mutable reference at the same time, that would violate Rust's aliasing rules (never mind LLVM). (Pontefract)

Non-lexical lifetimes have never been a property of code generation. They are purely a borrow checker property, and borrow checking never impacts code generation.

The example you provided is not illegal according to LLVM. The problem with noalias only manifests with function parameters, because only they get a noalias attribute (at least currently). So the only way to write problematic code without unsafe would be:

fn foo(reference: &String, mut data: String) {
    // Do not use `reference` anymore here.
    
    // Write to data.
    data = String::new();
}

fn main() {
    let mut v = String::new();
    foo(&v, v);
}

Except it doesn't compile, because you're moving a borrowed v. So there was actually no way to trigger a miscompilation.

With RefCell however, we can do that:

use std::cell::{RefCell, Ref};

fn foo<'a>(reference: Ref<'a, i32>, data: &'a RefCell<i32>) {
    drop(reference);
    // Do not use `reference` anymore here.
    
    *data.borrow_mut() = 0;
}

fn main() {
    let v = RefCell::new(0);
    foo(v.borrow(), &v);
}

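This would be LLVM UB if Ref used a reference: the noalias parameter would promise that nothing writes to the value during the whole execution of foo, yet borrow_mut writes to it after the Ref has been dropped.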
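To make the raw-pointer approach more concrete, here is a minimal sketch of a guard that stores a NonNull instead of a reference. Everything in it (MyCell, MyRef, set, foo_raw, demo) is a hypothetical name invented for this illustration, and the cell is deliberately simplified: it only counts shared borrows and has no mutable guard. It is not how std implements RefCell, just the same idea under those assumptions.

```rust
use std::cell::{Cell, UnsafeCell};
use std::ops::Deref;
use std::ptr::NonNull;

/// Hypothetical toy cell (NOT std's RefCell): it only counts shared borrows.
pub struct MyCell<T> {
    shared: Cell<usize>,
    value: UnsafeCell<T>,
}

/// Hypothetical guard. It stores a raw `NonNull<T>` rather than `&'b T`, so the
/// field makes no promise that the value stays unmodified after the guard is dropped.
pub struct MyRef<'b, T> {
    value: NonNull<T>,
    shared: &'b Cell<usize>,
}

impl<T> MyCell<T> {
    pub fn new(value: T) -> Self {
        MyCell { shared: Cell::new(0), value: UnsafeCell::new(value) }
    }

    /// Hands out a shared guard and bumps the borrow counter.
    pub fn borrow(&self) -> MyRef<'_, T> {
        let count = self.shared.get().checked_add(1).expect("too many borrows");
        self.shared.set(count);
        MyRef {
            // `UnsafeCell::get` never returns null, so this `expect` cannot fail.
            value: NonNull::new(self.value.get()).expect("pointer is non-null"),
            shared: &self.shared,
        }
    }

    /// Overwrites the value; panics while any `MyRef` is still alive.
    pub fn set(&self, value: T) {
        assert_eq!(self.shared.get(), 0, "value is still borrowed");
        // SAFETY: the counter is zero, so no guard (and no `&T` derived from one) exists.
        let old = unsafe { std::ptr::replace(self.value.get(), value) };
        drop(old);
    }
}

impl<T> Deref for MyRef<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: while this guard is alive the counter is non-zero,
        // so `set` panics instead of writing to the value.
        unsafe { self.value.as_ref() }
    }
}

impl<T> Drop for MyRef<'_, T> {
    fn drop(&mut self) {
        // Release the shared borrow.
        self.shared.set(self.shared.get() - 1);
    }
}

/// Same shape as `foo` above: the guard is dropped, then the cell is written.
pub fn foo_raw<'a>(reference: MyRef<'a, i32>, data: &'a MyCell<i32>) -> i32 {
    let before = *reference;
    drop(reference);
    data.set(before + 1);
    *data.borrow()
}

pub fn demo() -> i32 {
    let v = MyCell::new(41);
    let out = foo_raw(v.borrow(), &v);
    out
}
```

Calling demo() from a main function returns 42: the guard is read, dropped, and only then is the cell overwritten, which is exactly the pattern that a `&'b T` field marked noalias would not allow.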

Calvinna answered 2/12, 2023 at 16:41 Comment(5)
I didn't think immutable references got noalias. (Grind)
@Grind They do. noalias has a double meaning: for mutable references it means nobody will write through anything but this reference, and for shared references it also means nobody will write through any other reference, and since this reference is immutable, it's a promise that nobody will write at all. (Calvinna)
Thank you for the great answer. Just one point: I didn't say that NLLs were a property of code generation, but that their semantics would have to be preserved (respected / mirrored) by the assembly that the backend generates for the Rust code you wrote. This is apparently not the case, at least when unsafe is involved, which is super interesting. (Gadoid)
@ChayimFriedman As a follow-up to your answer: AFAICS there is nothing that prevents library code from implementing its own MyRef and doing it wrong. Where in the usage of UnsafeCell would MyRef fail to adhere to the contract? Also: inside foo, data can't be mutated through either reference or data before drop(reference). So while both MyRef and data would be noalias, UB can't actually manifest, can it? (Trolley)
@Trolley If you follow the discussion about Ref, you will see that there was talk about language changes. I'm not sure which documented behavior this code breaks (if any), but it's certainly non-intuitive (and thus people wanted to change it). As for Rust AM UB, I'm not sure whether there is any. There may be none, as you say (in which case this would be LLVM UB only), but I'm not sure. (Calvinna)

As I found this very interesting, here's a slightly more in-depth answer to complement the one by @Chayim Friedman:

The LLVM Language Reference defines the noalias parameter attribute as follows:

[...] indicates that memory locations accessed via pointer values based on the argument or return value are not also accessed, during the execution of the function, via pointer values not based on the argument or return value. [...]

This leads to the problem: If a Ref passed in as a function parameter got this attribute, it would be a lie. Once the Ref is dropped, it's possible to get a mutable reference to the same memory location by calling borrow_mut on the original RefCell.

Note that this is a parameter attribute. It is only placed on function parameters, not arbitrary variables. That's why the example from the question is not problematic, and it's in general not possible to cause this issue in safe Rust.
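The point about borrow_mut becoming available once the Ref is dropped can be observed directly in safe Rust. The following is a small sketch using the real std::cell API; the function names observe and demo_observe are made up for this illustration.

```rust
use std::cell::{Ref, RefCell};

/// Returns (could we borrow mutably while the guard was alive,
///          could we borrow mutably after dropping it,
///          the value written afterwards).
pub fn observe(guard: Ref<'_, i32>, cell: &RefCell<i32>) -> (bool, bool, i32) {
    // While `guard` is alive, the RefCell refuses a mutable borrow.
    let while_alive = cell.try_borrow_mut().is_ok();

    drop(guard);

    // After the drop, still inside the same function call, a mutable borrow succeeds.
    let after_drop = cell.try_borrow_mut().is_ok();

    // So the memory the guard pointed to can be written during this call.
    *cell.borrow_mut() = 7;
    let value = *cell.borrow();

    (while_alive, after_drop, value)
}

pub fn demo_observe() -> (bool, bool, i32) {
    let cell = RefCell::new(0);
    let out = observe(cell.borrow(), &cell);
    out
}
```

demo_observe() returns (false, true, 7). The write happens during the execution of observe, through a pointer that is not based on the guard argument, which is the situation the noalias definition quoted above rules out.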


If we look at the LLVM IR generated for this example (in release mode):

use std::cell::{Cell, RefCell, Ref};
pub struct Ref2<'b, T: ?Sized + 'b> {
    value: &'b T,
    borrow_flag: &'b Cell<isize>,
}

pub fn foo<'a>(myref: Ref<'a, i32>, mydata: &'a RefCell<i32>) {
    drop(myref);    
    *mydata.borrow_mut() = 0;
}

pub fn foo2<'a>(myref: Ref2<'a, i32>, mydata: &'a RefCell<i32>) {
    drop(myref);    
    *mydata.borrow_mut() = 0;
}

We can see the difference:

define void @example::foo(
    ptr nocapture noundef nonnull readnone %myref.0,
    ptr nocapture noundef nonnull align 8 %myref.1,
    ptr nocapture noundef nonnull align 8 %mydata
) 
define void @example::foo2(
    ptr noalias nocapture noundef readonly align 4 dereferenceable(4) %myref.0,
    ptr nocapture noundef nonnull readnone align 8 %myref.1,
    ptr nocapture noundef nonnull align 8 %mydata
)

Only in foo2 does the %myref.0 ptr (our value) contain a noalias function parameter attribute.

Gadoid answered 4/12, 2023 at 14:31 Comment(0)
