How to safely get an immutable byte slice from a `&mut [u32]`?

In a rather low level part of a project of mine, a function receives a mutable slice of primitive data (&mut [u32] in this case). This data should be written to a writer in little endian.

Now, this alone wouldn't be a problem, but all of this has to be fast. I measured my application and identified this as one of the critical paths. In particular, if the endianness doesn't need to be changed (since we're already on a little endian system), there shouldn't be any overhead.

This is my code (Playground):

use std::{io, mem, slice};

fn write_data(mut w: impl io::Write, data: &mut [u32]) -> Result<(), io::Error> {
    adjust_endianness(data);

    // Is this safe?
    let bytes = unsafe {
        let len = data.len() * mem::size_of::<u32>();
        let ptr = data.as_ptr() as *const u8;
        slice::from_raw_parts(ptr, len)
    };

    w.write_all(bytes)
}

fn adjust_endianness(_: &mut [u32]) {
    // implementation omitted
}

adjust_endianness changes the endianness in place (which is fine, since a wrong-endian u32 is garbage, but still a valid u32).
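For illustration only, the omitted body could look something like the sketch below. It is not the actual implementation from the project; it assumes that "adjusting" means converting each element from native byte order to little endian, using the standard `u32::to_le` method.

```rust
/// Converts every element from native byte order to little endian, in place.
/// Hypothetical stand-in for the omitted implementation.
fn adjust_endianness(data: &mut [u32]) {
    for x in data.iter_mut() {
        // `u32::to_le` returns the value unchanged on little-endian targets
        // and byte-swaps it on big-endian targets.
        *x = x.to_le();
    }
}
```

After the call, the in-memory bytes of each element are in little-endian order regardless of the host, so on a little-endian machine the loop changes nothing.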

This code works, but the critical question is: Is this safe? In particular, at some point, data and bytes both exist, being one mutable and one immutable slice to the same data. That sounds very bad, right?

On the other hand, I can do this:

let bytes = &data[..];

That way, I also have those two slices. The difference is just that data is now borrowed.

Is my code safe or does it exhibit UB? Why? If it's not safe, how to safely do what I want to do?

Shrum answered 2/5, 2019 at 10:29 Comment(10)
maybe doc.rust-lang.org/std/primitive.slice.html#method.align_to play.rust-lang.org/… ?Catchweight
Seeing as there is no definitive specification of behavior for unsafe Rust, it seems an answer could demonstrate unsafety but could not demonstrate safety.Interlace
I think you're in the clear, because bytes is derived from data -- it's not that different from just doing let foo = &*data, which also gives you a & slice from a &mut slice. Whether one can prove that is sound according to Rust's execution model, I'm not sure.Nonchalant
Why is it a mut slice in the first place? As trentcl already mentioned, it is more reasonable to think about turning one immutable slice into another immutable slice: a &[u32] into a &[u8].Scruple
@Scruple It's mutable because the endianness should be changed in-place. That way we can make sure to call write_all only once. With as much data as possible.Shrum
@MatthieuM. Yeah, I worried that would be a problem :/ We really need the unsafe code guidelines. Soon.Shrum
My suspicion is that you do indeed need to have let bytes = &data[..]; before casting to a raw pointer. In particular, your cast to a raw pointer drops the lifetime, and allows your &mut [u32] to alias with an &[u8], and that seems like UB to me. It would be nice to get @RalfJung's take on this.Manvel
@BurntSushi5: Is it still UB if you don't use data as mutable during the lifespan of the derived pointer :) ? I would intuitively have said that it's fine because "time-wise" there's no overlap.Interlace
Don't know. My guess would be "yes," but that's mostly just me being conservative. Having let bytes = &data[..]; before the raw pointer cast feels like it puts this pretty firmly in "that's safe" territory.Manvel
My $0.02: See Rob Pike's classic blog post The Byte Order Fallacy.Assignee

In general, creating slices that violate Rust's safety rules, even briefly, is unsafe. If you cheat the borrow checker and create independent slices that borrow the same data as & and &mut at the same time, Rust will pass incorrect aliasing information to LLVM, and this may lead to genuinely miscompiled code. Miri doesn't flag this case, because you're not using data afterwards, but the exact details of what is unsafe are still being worked out.

To be safe, you should explain the sharing situation to the borrow checker:

let shared_data = &data[..];

data will be temporarily reborrowed as shared/read-only for as long as shared_data is in use. In this case that shouldn't impose any limitations, and data becomes mutable again once shared_data is no longer used.

Then you'll have a &[u32], but you need a &[u8]. Fortunately, this conversion is safe to do, because both are shared and u8 has a smaller alignment requirement than u32 (if it were the other way around, you'd have to use align_to).

let shared_data = &data[..];
let bytes = unsafe {
    let len = shared_data.len() * mem::size_of::<u32>();
    let ptr = shared_data.as_ptr() as *const u8;
    slice::from_raw_parts(ptr, len)
};
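Putting the pieces together, a complete version of the function might look like the sketch below. The body of `adjust_endianness` is an assumption (the question omits it); here it simply converts each element to little endian with `u32::to_le`. The raw pointer is derived from the shared reborrow rather than from the mutable slice.

```rust
use std::{io, mem, slice};

/// Hypothetical stand-in for the omitted implementation: converts each
/// element from native byte order to little endian, in place.
fn adjust_endianness(data: &mut [u32]) {
    for x in data.iter_mut() {
        *x = x.to_le();
    }
}

fn write_data(mut w: impl io::Write, data: &mut [u32]) -> io::Result<()> {
    adjust_endianness(data);

    // Reborrow as shared first, so the borrow checker knows that `data`
    // is read-only for as long as `bytes` is alive.
    let shared_data: &[u32] = &data[..];

    // SAFETY: the pointer comes from a valid shared slice, `len` covers
    // exactly the same memory, u8 has no alignment requirement beyond 1,
    // and every bit pattern is a valid u8.
    let bytes: &[u8] = unsafe {
        let len = shared_data.len() * mem::size_of::<u32>();
        let ptr = shared_data.as_ptr() as *const u8;
        slice::from_raw_parts(ptr, len)
    };

    w.write_all(bytes)
}
```

Any `io::Write` implementor can serve as the writer, for example a `&mut Vec<u8>` when testing, or a `BufWriter` around a file.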
Perigynous answered 24/7, 2019 at 10:9 Comment(1)
I've missed the u32 twist the first time. Updated answer.Perigynous
