How to expose a Rust `Vec<T>` to FFI?

I'm trying to construct a pair of elements:

  • array: *mut T
  • array_len: usize

array is intended to own the data.

However, Box::into_raw will return *mut [T]. I cannot find any info on converting raw pointers to slices. What is its layout in memory? How do I use it from C? Should I convert to *mut T? If so, how?

Lucretialucretius answered 30/8, 2016 at 10:23 Comment(0)

If you just want some C function to mutably borrow the Vec, you can do it like this:

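extern "C" {
    // C side: void some_c_function(int32_t *ptr, size_t len);
    // The libc crate defines size_t as usize, so usize is used here.
    fn some_c_function(ptr: *mut i32, len: usize);
}

fn safe_wrapper(a: &mut [i32]) {
    // The pointer is only lent to C for the duration of this call.
    unsafe {
        some_c_function(a.as_mut_ptr(), a.len());
    }
}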

Of course, the C function must not keep this pointer after it returns, because the mutable borrow ends with the call and a stored copy would break Rust's aliasing assumptions.
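
To try the borrowing pattern without linking any C code, here is a minimal sketch in which a Rust function with the C ABI stands in for the foreign function. The names double_all, double_in_place and demo_borrow are made up for illustration; they are not part of the answer.

```rust
/// Hypothetical stand-in for a C function with the signature
/// `void double_all(int32_t *ptr, size_t len);`. It doubles every element.
unsafe extern "C" fn double_all(ptr: *mut i32, len: usize) {
    // Safety: the caller guarantees that ptr points to len valid i32 values
    // that nobody else accesses during the call.
    let slice = unsafe { std::slice::from_raw_parts_mut(ptr, len) };
    for x in slice.iter_mut() {
        *x *= 2;
    }
}

/// Safe wrapper: the buffer is only lent out while the call runs.
fn double_in_place(a: &mut [i32]) {
    unsafe { double_all(a.as_mut_ptr(), a.len()) }
}

/// The Vec still owns (and later frees) its buffer after the call.
fn demo_borrow() -> Vec<i32> {
    let mut v = vec![1, 2, 3];
    double_in_place(&mut v);
    v
}
```

Because the wrapper takes &mut [i32], it works for a Vec<i32>, a Box<[i32]> or a plain array alike.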

If you want to "pass ownership" of the data to C code, you'd do something like this:

use std::mem;

extern "C" {
    // C side: void c_sink(int32_t *ptr, size_t len);
    fn c_sink(ptr: *mut i32, len: usize);
}

fn sink_wrapper(mut vec: Vec<i32>) {
    vec.shrink_to_fit();
    // shrink_to_fit does not promise an exact fit, so check it:
    // the deallocation side will assume capacity == len.
    assert!(vec.len() == vec.capacity());
    let ptr = vec.as_mut_ptr();
    let len = vec.len();
    mem::forget(vec); // prevent deallocation in Rust
                      // The array is still there but no Rust object
                      // feels responsible. We only have ptr/len now
                      // to reach it.
    unsafe {
        c_sink(ptr, len);
    }
}

Here, the C function "takes ownership" in the sense that we expect it to eventually return the pointer and length to Rust, for example, by calling a Rust function to deallocate it:

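/// This is intended for the C code to call for deallocating the
/// Rust-allocated i32 array.
///
/// Safety: ptr and len must be exactly the values that were handed
/// to the C code, and the buffer must not be used after this call.
#[no_mangle]
pub unsafe extern "C" fn deallocate_rust_buffer(ptr: *mut i32, len: usize) {
    // The capacity is assumed to equal the length.
    drop(Vec::from_raw_parts(ptr, len, len));
}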

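Because Vec::from_raw_parts expects three parameters (a pointer, a length and a capacity), we either have to keep track of the capacity as well somehow, or we use Vec's shrink_to_fit before passing the pointer and length to the C function, so that the capacity equals the length. This might involve a reallocation, though. Since shrink_to_fit is not documented to produce an exact fit, the wrapper asserts that len and capacity match before giving the buffer away.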
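
For the other option, keeping track of the capacity, a minimal sketch could bundle all three values in a C-compatible struct. RawI32Buf, into_raw_buf and from_raw_buf are hypothetical names chosen for this illustration, and the matching C struct would have to be declared by hand on the C side.

```rust
use std::mem::ManuallyDrop;

/// Hypothetical C-compatible handle carrying all three parts of a Vec<i32>.
/// Assumed C counterpart: struct { int32_t *ptr; size_t len; size_t cap; }.
#[repr(C)]
pub struct RawI32Buf {
    pub ptr: *mut i32,
    pub len: usize,
    pub cap: usize,
}

/// Gives up ownership of the Vec without shrinking or reallocating it.
pub fn into_raw_buf(vec: Vec<i32>) -> RawI32Buf {
    // ManuallyDrop keeps the destructor from running, like mem::forget.
    let mut vec = ManuallyDrop::new(vec);
    RawI32Buf {
        ptr: vec.as_mut_ptr(),
        len: vec.len(),
        cap: vec.capacity(),
    }
}

/// Rebuilds the Vec so that Rust frees the buffer again.
///
/// Safety: buf must come from into_raw_buf, with ptr and cap unchanged,
/// and must not be used again afterwards.
pub unsafe fn from_raw_buf(buf: RawI32Buf) -> Vec<i32> {
    unsafe { Vec::from_raw_parts(buf.ptr, buf.len, buf.cap) }
}
```

This avoids the possible reallocation, at the cost of one more field that the C code has to pass back unchanged.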

Philina answered 1/9, 2016 at 11:53 Comment(3)
That's what I ended up using: <github.com/maidsafe/safe_core/pull/321/…>, except for the assert!, which I was thinking about using, but I wasn't confident/convinced enough. - Lucretialucretius
I have no clue why but if you have a Vec with 0 length/capacity/size, and you do .as_mut_ptr(), it returns 0x1 (aka an invalid pointer). - Frequentation
The docs for Vec::shrink_to_fit suggest that your assertion assert!(vec.len() == vec.capacity()); is not always correct (the docs only assert capacity() >= len()). This other question concerns essentially the same problem, so I mentioned this answer there, but that question is specifically about the possibility that shrink_to_fit isn't exact, which doesn't seem to be addressed here but does seem to be relevant. - Neddie

You could use [T]::as_mut_ptr to obtain the *mut T pointer directly from Vec<T>, Box<[T]> or any other type that implements DerefMut to a slice.

use std::mem;

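// `vector` is the Vec<T> to hand over. into_boxed_slice discards any
// excess capacity, so the length alone describes the allocation.
let mut boxed_slice: Box<[T]> = vector.into_boxed_slice();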

let array: *mut T = boxed_slice.as_mut_ptr();
let array_len: usize = boxed_slice.len();

// Prevent the slice from being destroyed (Leak the memory).
mem::forget(boxed_slice);
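
On the question's concern that Box::into_raw returns *mut [T]: a minimal sketch, assuming a hypothetical helper named leak_parts, of how that fat pointer can be narrowed to a plain *mut T with an `as` cast. The length has to be read before the box is given up.

```rust
/// Splits a Vec<T> into (pointer, length) and leaks the allocation.
/// Hypothetical helper for illustration only.
pub fn leak_parts<T>(vector: Vec<T>) -> (*mut T, usize) {
    let boxed: Box<[T]> = vector.into_boxed_slice();
    let len = boxed.len();
    // Box::into_raw yields a fat pointer: data address plus length.
    let fat: *mut [T] = Box::into_raw(boxed);
    // The cast drops the length and keeps the address of the first element.
    let thin: *mut T = fat as *mut T;
    (thin, len)
}
```

The thin pointer and the length are what a C function taking (T *array, size_t array_len) would receive.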
Oilcan answered 30/8, 2016 at 10:28 Comment(16)
Will array be valid once the vector is destroyed? The idea is that array will own the data (I'll update the question). - Lucretialucretius
@vinipsmaker: No. Therefore, keep the vector from being destroyed by forgetting it. See update. - Oilcan
Sorry, I'm confused about how the deallocation code would work with this approach. into_boxed_slice will return... what's the layout of that in memory? Is boxed_slice.as_mut_ptr() guaranteed to return a pointer to the first char? How do I convert back to Box<[T]> so I can deallocate? - Lucretialucretius
@vinipsmaker: (1) The layout is not specified. The current implementation uses (ptr, len). (2) Perhaps you should ask a new question. But you could try slice::from_raw_parts_mut and Box::from_raw, or use Vec::from_raw_parts, but then you need to pass the capacity as well. - Oilcan
How would slice::from_raw_parts + Box::from_raw be okay here? Isn't Box::from_raw getting a pointer to the stack-allocated slice and not the original slice? Or is Box<[T]> a special case? - Lucretialucretius
@vinipsmaker: The result of slice::from_raw_parts_mut(array, array_len) will refer to your heap-allocated memory. There is no stack allocation for any array involved here. And Box::from_raw will take ownership of it again so it'll be properly deallocated when it's dropped. But all this basically boils down to Vec::from_raw_parts(array, array_len, array_len) with the vec taking ownership of it again. BTW: into_boxed_slice is important here as it reduces the capacity to match the size. Otherwise you'd need to remember the original capacity as well. - Philina
@sellibitze: Can you link me to a reference? Box::drop is different from slice::Drop (I know slice is just a view and has no need to drop...). As far as I can see, unless Box gives special treatment to slice, it cannot know the slice's heap-allocated content (pointers stored internally by slice). All Box should see is slice as an opaque box. - Lucretialucretius
@vinipsmaker: To be honest, I have trouble understanding where you think a problem is. Box::from_raw cannot know what the given pointer refers to. It just expects the pointer to come from some other Box that is now gone without having deallocated the pointee. So, you could feed it with some other pointer and violate this assumption. That's why this function is marked unsafe. - Philina
If I understood correctly, Box is a smart pointer and it'll manage the heap-allocated memory. It'll use Rust's move semantics to move data from stack to heap without problems (there is no shared mutable state), but it's Box that will call deallocate. Box<T>::into_raw returns a pointer to T and it doesn't care about T's internal structure. T, in my case, is slice and AFAIK Box won't deallocate the region pointed to by the slice. slice doesn't have drop. Only if Box gives special treatment to slice can I see how this works. - Lucretialucretius
I just can't see the guarantee I want in the Rust docs for Box<T>: <doc.rust-lang.org/std/boxed/struct.Box.html>. C++ smart pointers are better documented. I can know exactly who is responsible for deallocating memory: <en.cppreference.com/w/cpp/memory/unique_ptr>. I'm using an unsafe API, so there is a reason why I'm worried. And Rust sees memory leaks as safe (but realistically you'd only need to worry if you need an unsafe API as well). - Lucretialucretius
@vinipsmaker: Box::from_raw is like unique_ptr's constructor taking a raw pointer. Box::into_raw is like unique_ptr::release. So, given one box b you can do Box::from_raw(b.into_raw()) which is similar to C++'s unique_ptr<int>(otherptr.release()) if otherptr is also a unique_ptr<int>. What your T is doesn't really matter. Both unique_ptr and Box support arrays as well: unique_ptr<int[]> or Box<[i32]>. This is properly deallocated/destroyed, too. You may be confusing this with a borrowed slice &[i32] which is just a reference and dropping it doesn't do anything. - Philina
@vinipsmaker: Also, a borrowed reference/slice coerces to a raw pointer. So, given a "borrowed slice" of type &mut [i32] that's not really borrowed but the result of slice::from_raw_parts_mut you can stick it into Box::from_raw since it automatically turns into a *mut [i32]. The functions involved are all unsafe, of course, because it's easy to misuse. You must not use Box::from_raw with something that's already owned by something else or with something that requires a different kind of deallocation than what Box would be doing to release the memory. - Philina
"What your T is doesn't really matter"... this is not true. T does matter. If the heap-allocated memory is pointed to by an internal member of T, I don't even know what Box is managing (unless it gives special treatment to T, which would solve the problem). What Box<T> does know is a pointer to slice. It doesn't know the memory pointed to by slice, which is the actual heap-allocated memory. Unless it gives special treatment to slice, which I cannot find in the reference documentation. - Lucretialucretius
"You may be confusing this with a borrowed slice &[i32]." This actually can be an answer to my question. Thanks for your patience in discussing this with me. Gonna read more about it. - Lucretialucretius
@sellibitze: That's an excellent explanation. Would you mind adding these comments to your answer? - Lucretialucretius
I created a new question just for the boxed slice discussion: #39331841 - Lucretialucretius
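
The round trip discussed in the comments above (slice::from_raw_parts_mut followed by Box::from_raw) can be sketched as follows. reclaim_boxed_slice and round_trip_demo are hypothetical names, and the sketch assumes the pointer and length originally came from a Box<[T]>, for example via into_boxed_slice.

```rust
use std::slice;

/// Rebuilds the Box<[T]> so that dropping it frees the allocation.
///
/// Safety: ptr and len must describe a leaked Box<[T]>, and nothing else
/// may own or use that memory afterwards.
pub unsafe fn reclaim_boxed_slice<T>(ptr: *mut T, len: usize) -> Box<[T]> {
    unsafe {
        // Re-attach the length; the &mut [T] coerces to the fat *mut [T].
        let fat: *mut [T] = slice::from_raw_parts_mut(ptr, len);
        // Box takes ownership of the heap memory behind the pointer again.
        Box::from_raw(fat)
    }
}

/// Leaks a boxed slice to (ptr, len) and takes it back again.
pub fn round_trip_demo() -> usize {
    let boxed: Box<[i32]> = vec![1, 2, 3].into_boxed_slice();
    let len = boxed.len();
    let ptr = Box::into_raw(boxed) as *mut i32;
    let back = unsafe { reclaim_boxed_slice(ptr, len) };
    back.len() // back is dropped here, which frees the buffer
}
```

The slice type [T] only describes how many elements sit behind the pointer; the Box is what owns and frees them, which is the distinction from a borrowed &[T].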
