How to transmute a u8 buffer to struct in Rust?
4

15

I have a byte buffer of unknown size, and I want to create a local struct variable pointing to the memory of the beginning of the buffer. Following what I'd do in C, I tried a lot of different things in Rust and kept getting errors. This is my latest attempt:

use std::mem::{size_of, transmute};

#[repr(C, packed)]
struct MyStruct {
    foo: u16,
    bar: u8,
}

fn main() {
    let v: Vec<u8> = vec![1, 2, 3];
    let buffer = v.as_slice();
    let s: MyStruct = unsafe { transmute(buffer[..size_of::<MyStruct>()]) };
}

I get an error:

error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
   --> src/main.rs:12:42
    |
12  |     let s: MyStruct = unsafe { transmute(buffer[..size_of::<MyStruct>()]) };
    |                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
    |
    = help: the trait `std::marker::Sized` is not implemented for `[u8]`
    = note: to learn more, visit <https://doc.rust-lang.org/book/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait>
Marindamarinduque answered 28/2, 2017 at 2:23 Comment(1)
You won't be able to do this because transmute is required to know the sizes at compile time. Your solution with *mut pointers looks like how you have to do it. - Alkyne
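To illustrate the comment above: transmute rejects `[u8]` because a slice has no compile-time size, but a fixed-size array does. A minimal sketch, assuming it is acceptable to copy the bytes out of the buffer; the helper name `from_prefix` is made up for this example and is not part of the question:

```rust
use std::mem::{size_of, transmute};

#[repr(C, packed)]
#[derive(Debug, Copy, Clone)]
struct MyStruct {
    foo: u16,
    bar: u8,
}

// Hypothetical helper: copies the first size_of::<MyStruct>() bytes into a
// fixed-size array, which transmute accepts because its size is known.
fn from_prefix(buffer: &[u8]) -> Option<MyStruct> {
    const N: usize = size_of::<MyStruct>();
    let mut bytes = [0u8; N];
    // `get` returns None instead of panicking when the buffer is too short.
    bytes.copy_from_slice(buffer.get(..N)?);
    // Both types are exactly N bytes, and any bit pattern is a valid u16/u8.
    Some(unsafe { transmute::<[u8; N], MyStruct>(bytes) })
}

fn main() {
    let v: Vec<u8> = vec![1, 2, 3];
    let s = from_prefix(&v).expect("buffer too short");
    println!("{:?}", s);
}
```

The value of foo depends on the host's byte order, since the bytes are reinterpreted as-is.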
19

If you don't want to copy the data to the struct but instead leave it in place, you can use slice::align_to. This creates a &MyStruct instead:

#[repr(C, packed)]
#[derive(Debug, Copy, Clone)]
struct MyStruct {
    foo: u16,
    bar: u8,
}

fn main() {
    let v = vec![1u8, 2, 3];

    // I copied this code from Stack Overflow
    // without understanding why this case is safe.
    let (head, body, _tail) = unsafe { v.align_to::<MyStruct>() };
    assert!(head.is_empty(), "Data was not aligned");
    let my_struct = &body[0];

    println!("{:?}", my_struct);
}

Here, it's sound to use align_to to reinterpret the bytes as MyStruct because repr(C, packed) fixes the field layout with no padding and gives the struct an alignment of 1, and every field (u16, u8) accepts any bit pattern as a valid value.

Jerrylee answered 11/12, 2019 at 18:59 Comment(5)
Nice, I guess this fits the question more closely. I don't remember what I was trying to do with this back in 2017, something about interfacing with asn1c code. - Marindamarinduque
@Marindamarinduque align_to didn't exist in 2017, so the equivalent of this answer would have been a lot uglier then... - Jerrylee
Heh. I was using an outdated Rust version even back then too. - Marindamarinduque
This relies on non-guaranteed align_to implementation behavior: "[align_to] may make the middle slice the greatest length possible [but] it is permissible for all of the input data to be returned as the prefix or suffix slice." (Though at least that'll cause a panic, not UB.) :-/ - Gangplank
@SørenLøvborg it's true, and Miri has even made that choice before (no idea if it still does). However, I expect that that flexibility will rarely be exercised in practice. - Jerrylee
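If the non-guaranteed split discussed in the comments is a concern, one possible alternative is to check length and alignment by hand and cast the pointer directly. This is only a sketch under that assumption; `view_prefix` is a hypothetical name, not something from the answer:

```rust
use std::mem::{align_of, size_of};

#[repr(C, packed)]
#[derive(Debug, Copy, Clone)]
struct MyStruct {
    foo: u16,
    bar: u8,
}

// Hypothetical helper: borrows the start of `bytes` as a MyStruct without
// copying. Returns None if the slice is too short or misaligned.
fn view_prefix(bytes: &[u8]) -> Option<&MyStruct> {
    if bytes.len() < size_of::<MyStruct>() {
        return None;
    }
    let ptr = bytes.as_ptr();
    // For a packed struct the alignment is 1, so this check always passes;
    // it is kept so the function stays correct if `packed` is ever removed.
    if (ptr as usize) % align_of::<MyStruct>() != 0 {
        return None;
    }
    // The returned reference borrows from `bytes`, so it cannot outlive it.
    Some(unsafe { &*(ptr as *const MyStruct) })
}

fn main() {
    let v = vec![1u8, 2, 3];
    match view_prefix(&v) {
        Some(my_struct) => println!("{:?}", my_struct),
        None => println!("buffer too short or misaligned"),
    }
}
```

Unlike the assert in the answer, this reports a bad buffer as None rather than panicking.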
19

You can use methods on raw pointers and functions in std::ptr to directly read/write objects in place.

In your case:

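fn main() {
    let v: Vec<u8> = vec![1, 2, 3];
    // The read copies size_of::<MyStruct>() bytes, so the buffer must be at least that long.
    assert!(v.len() >= std::mem::size_of::<MyStruct>());
    // MyStruct is repr(C, packed), so its alignment is 1 and the pointer is always aligned.
    // Printing with {:?} assumes MyStruct also derives Debug, Copy, and Clone.
    let s: MyStruct = unsafe { std::ptr::read(v.as_ptr() as *const MyStruct) };
    println!("here is the struct: {:?}", s);
}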

I would encourage you to wrap this in a reusable method and perform a length check on the source buffer before attempting the read.

Senary answered 28/2, 2017 at 12:23 Comment(2)
Thanks, didn't know about those read and write functions in std::ptr. I was looking in std::mem for something like that. Kinda strange IMO that those are separate modules. - Marindamarinduque
@sudo: I agree that the division isn't clear to me either :( - Senary
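The reusable method with a length check suggested in the answer above could look roughly like this. It is a sketch, not the answerer's code: the name `read_struct` is an assumption, and it uses std::ptr::read_unaligned so that it would stay valid even if the struct were not packed:

```rust
use std::mem::size_of;
use std::ptr;

#[repr(C, packed)]
#[derive(Debug, Copy, Clone)]
struct MyStruct {
    foo: u16,
    bar: u8,
}

// Hypothetical helper: copies a MyStruct out of the front of `buf`,
// or returns None when there are not enough bytes.
fn read_struct(buf: &[u8]) -> Option<MyStruct> {
    if buf.len() < size_of::<MyStruct>() {
        return None;
    }
    // read_unaligned places no alignment requirement on the source pointer.
    Some(unsafe { ptr::read_unaligned(buf.as_ptr() as *const MyStruct) })
}

fn main() {
    let v: Vec<u8> = vec![1, 2, 3];
    match read_struct(&v) {
        Some(s) => println!("here is the struct: {:?}", s),
        None => println!("buffer too short"),
    }
}
```

The function is deliberately not generic: for an arbitrary T (a bool or an enum, say) not every byte pattern is a valid value, so a generic version would not be sound.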
1

I gave up on the transmute stuff. *mut (raw pointers) in Rust are pretty similar to C pointers, so this was easy:

#[repr(C, packed)] // necessary
#[derive(Debug, Copy, Clone)] // not necessary
struct MyStruct {
    foo: u16,
    bar: u8,
}

fn main() {
    let v: Vec<u8> = vec![1, 2, 3];
    let buffer = v.as_slice();
    let mut s_safe: Option<&MyStruct> = None;
    let c_buf = buffer.as_ptr();
    let s = c_buf as *mut MyStruct;
    unsafe {
        let ref s2 = *s;
        s_safe = Some(s2);
    }
    println!("here is the struct: {:?}", s_safe.unwrap());
}

The unsafe tag there is no joke, but the way I'm using this, I know my buffer is filled and take the proper precautions involving endianness later on.

Marindamarinduque answered 28/2, 2017 at 2:23 Comment(2)
You don't need the first two lines inside the unsafe block to be there. It's better to move them out in order to minimise the amount of potentially unsafe code. Also, given you don't know the size of the buffer, you need to make sure that it is big enough or else it could panic. Here is an example. - Alkyne
Thanks, that example is nicer than mine. I forgot to take advantage of Rust's knowledge of the vector size to make mine safer. - Marindamarinduque
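On the endianness precautions mentioned in the answer above: when the byte order of the data is known, the fields can also be decoded explicitly with no unsafe code at all. A minimal sketch, assuming the buffer is little-endian (the question does not say); `parse_le` is a made-up name and the struct here is a plain one, not repr(C, packed):

```rust
#[derive(Debug, PartialEq)]
struct MyStruct {
    foo: u16,
    bar: u8,
}

// Hypothetical helper: decodes the first three bytes as a little-endian
// u16 followed by a u8. Returns None if fewer than three bytes are present.
fn parse_le(buf: &[u8]) -> Option<MyStruct> {
    match buf {
        [a, b, c, ..] => Some(MyStruct {
            foo: u16::from_le_bytes([*a, *b]),
            bar: *c,
        }),
        _ => None,
    }
}

fn main() {
    let v: Vec<u8> = vec![1, 2, 3];
    println!("here is the struct: {:?}", parse_le(&v));
}
```

This copies the data instead of pointing into the buffer, but the result is the same on every platform. For big-endian data, u16::from_be_bytes would be used instead.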
1

I am currently using:

unsafe { transmute::<[u8; 0x70], Header>(header_data) };

It's unsafe of course, but works well.

let mut header_data = [0; 0x70];
reader.seek(SeekFrom::Start(0))?;
reader.read_exact(&mut header_data)?;
let header = unsafe { transmute::<[u8; 0x70], Header>(header_data) };
Deficit answered 5/3, 2023 at 12:37 Comment(0)
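For anyone who wants to run the snippet above, here is a self-contained variant. The answer does not show the Header type or the reader, so both are assumptions: Header is a hypothetical 8-byte layout invented for illustration, and std::io::Cursor stands in for the real file:

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};
use std::mem::{size_of, transmute};

// Hypothetical header layout, used only for illustration.
// repr(C) gives it a defined layout: 4 bytes of magic, then a u32.
#[repr(C)]
#[derive(Debug, Clone, Copy)]
struct Header {
    magic: [u8; 4],
    version: u32,
}

const HEADER_LEN: usize = size_of::<Header>();

fn read_header<R: Read + Seek>(reader: &mut R) -> io::Result<Header> {
    let mut header_data = [0u8; HEADER_LEN];
    reader.seek(SeekFrom::Start(0))?;
    // read_exact returns an error if fewer than HEADER_LEN bytes are available.
    reader.read_exact(&mut header_data)?;
    // transmute checks at compile time that both types have the same size.
    // It takes the array by value, so the array's alignment does not matter.
    Ok(unsafe { transmute::<[u8; HEADER_LEN], Header>(header_data) })
}

fn main() -> io::Result<()> {
    let mut reader = Cursor::new(vec![b'R', b'U', b'S', b'T', 1, 0, 0, 0, 0xFF]);
    let header = read_header(&mut reader)?;
    println!("{:?}", header);
    Ok(())
}
```

This is only sound when every field of the header accepts any bit pattern (integers and byte arrays do; bool, char, and enums do not), and the integer fields come out in the host's byte order.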
