How to use SmallVec with Cow
Asked Answered
J

1

6

I want to use SmallVec with Cow. I tried this:

use smallvec::SmallVec;
use std::borrow::Cow;

fn main() {
    let s = "hello world".to_owned();
    let mut s = Cow::Borrowed(s.as_bytes());
    clear_subslice(&mut s, 2, 6);
}

fn clear_subslice(text: &mut Cow<'_, [u8]>, start: usize, end: usize) {
    match text {
        Cow::Borrowed(v) => {
            if !v[start..end].iter().all(|&c| c == b' ') {
                let mut v = SmallVec::from_slice(v);
                v[start..end].iter_mut().for_each(|c| *c = b' ');
                *text = Cow::Owned(v);
            }
        }
        Cow::Owned(v) => {
            v[start..end].iter_mut().for_each(|c| *c = b' ');
        }
    }
}
error[E0271]: type mismatch resolving `<[u8] as std::borrow::ToOwned>::Owned == smallvec::SmallVec<_>`
  --> src/main.rs:16:25
   |
16 |                 *text = Cow::Owned(v);
   |                         ^^^^^^^^^^^^^ expected struct `std::vec::Vec`, found struct `smallvec::SmallVec`
   |
   = note: expected type `std::vec::Vec<u8>`
              found type `smallvec::SmallVec<_>`

It works only with types that have ToOwned implemented to a particular type. In this case, &[u8] has ToOwned implemented with target Vec.

I tried to implement ToOwned with target as SmallVec but without success.

Is it possible to use SmallVec with Cow?

One solution I know of is using a custom Cow enum:

pub enum SmallCow<'a, A: Array> {
    Borrowed(&'a [A::Item]),
    Owned(SmallVec<A>),
}

Is there any other way?

Josefajosefina answered 3/10, 2019 at 4:44 Comment(2)
@FrenchBoiethios: But you would need to add a impl ToOwned for &[u8]. The SmallVec team would still suffer the orphan rule. And even if they weren't, there is already an implementation for this trait in std.Erleena
@Erleena Oh, you're right... What was I thinking...Brumby
M
2

The fact of the matter is that Cow<'a, T> requires T to implement ToOwned and the owned version of the Cow<'a, T> is the associated type Owned of ToOwned. Moreover, Owned, has to implement Borrow<T>. As it stands, Cow<'a, [u8]> can only use Vec<u8> as its owned variant because [T] implements ToOwned with Vec<T> as the Owned associated type.

I see two options for you. Either you can make your own implementation of Cow that uses different trait bounds (or as you suggested, simply specialize to your exact use-case), or you can use newtypes to wrap [u8] and SmallVec<A> and implement ToOwned on the wrapper for [u8] and Borrow<SliceWrapper<u8>> on the wrapper for SmallVec<A>. I'll concentrate on the latter since you seem to already have the former covered.

A newtype is a wrapper that, in essence, declares a new type which is equivalent to the original type, but doesn't have any traits or methods. The usual way to do this is with a tuple struct.

use small_vec::{Array, SmallVec};

struct SmallVecWrap<A: Array>(SmallVec<A>);

struct SliceWrap<T>([T]);

Notice that SliceWrap<T> is an unsized type since [T] is, so we'll always use it behind a pointer. It's important that we do this, since when we implement Borrow on SmallVecWrap<A>, it'll be Borrow<SliceWrap<T>>, rather than Borrow<&SliceWrap<T>>. That is, Borrow uses the unsized type as its type parameter (I suppose it might be possible to do without that, but you'd have an extra layer of indirection and you wouldn't be able to use mutating methods on the slice).

The one major issue I ran into with this approach is that there doesn't seem to be a way to turn &[u8] into &SliceWrap<u8> without an unsafe block. This does make a certain amount of sense since without any extra information, the two types might be semantically different. For example, NonZeroU8 is in a similar situation but it doesn't make sense to convert a u8 into NonZeroU8 without checking if it's zero. RFC #1909, unsized rvalues, may help with this, but I couldn't get it to work. I'll note that MIRI didn't find any problems when run on your testcase.

The other problem with this approach is that you have to either always defer to the wrapped type (e.g. v.0 in the example code) and then potentially rewrap the returned value or else reimplement all the traits and methods that you need. This same problem applies to the SmallCow<'a, A> approach, but you only need to implement Cow<'a, T>'s traits and methods, and there aren't as many of those.

If you decide to always defer to the wrapped type's methods, you'll probably want to make the newtypes' fields public (e.g. SliceWrap<T>(pub [T])) so you can use them outside of this one module.

The final issue with this approach is again with ToOwned. ToOwned requires a single type to convert into but SmallVecWrap<A> is not a single type, even if the type of the elements of A is fixed. For example, &[u8] could validly be converted into SmallVecWrap<[u8, 1]>, SmallVecWrap<[u8, 2]>, etc. One possible workaround is to attach the type A to SliceWrap<T>:

struct SliceWrap<T, A: Array> {
    array: std::marker::PhantomData<A>,
    slice: [T],
}

Then you could implement ToOwned for SliceWrap<T, A> with Owned as SmallVecWrap<A>.

Anyway, here's the complete example.

use smallvec::{Array, SmallVec}; // 0.6.10
use std::borrow::{Borrow, Cow, ToOwned};

struct SmallVecWrap<A: Array>(SmallVec<A>);

#[repr(transparent)]
struct SliceWrap<T>([T]);

impl<T> SliceWrap<T> {
    // for convenience
    fn from_slice(slice: &[T]) -> &Self {
        // As far as I can tell, there's no way to do this without unsafe.
        // This should be safe since SliceWrap<T> is transparently a [T].
        // All we're doing is changing a (fat) pointer to a [T]
        // into a (fat) pointer to SliceWrap<T>.
        // I won't claim expertise on this, though.
        unsafe { &*((slice as *const [T]) as *const SliceWrap<T>) }
        //          ^                   ^
        // These parentheses aren't needed, but it's clearer this way
    }

    // I guess we didn't need this
    #[allow(dead_code)]
    fn from_mut_slice(slice: &mut [T]) -> &mut Self {
        // Same caveats apply
        unsafe { &mut *((slice as *mut [T]) as *mut SliceWrap<T>) }
    }
}

impl<A: Array> Borrow<SliceWrap<A::Item>> for SmallVecWrap<A> {
    fn borrow(&self) -> &SliceWrap<A::Item> {
        SliceWrap::from_slice(self.0.borrow())
    }
}

// Note: We have to choose a particular array size
// to use for the owned SmallVec<A>.
const OWNED_ARRAY_SIZE: usize = 4;
impl<T: Clone> ToOwned for SliceWrap<T> {
    type Owned = SmallVecWrap<[T; OWNED_ARRAY_SIZE]>;

    fn to_owned(&self) -> SmallVecWrap<[T; OWNED_ARRAY_SIZE]> {
        SmallVecWrap(self.0.into())
    }
}

fn main() {
    let s = "hello world".to_owned();
    let mut s = Cow::Borrowed(SliceWrap::from_slice(s.as_bytes()));
    clear_subslice(&mut s, 2, 6);
}

fn clear_subslice(text: &mut Cow<'_, SliceWrap<u8>>, start: usize, end: usize) {
    match text {
        Cow::Borrowed(v) => {
            if !v.0[start..end].iter().all(|&c| c == b' ') {
                let mut v = SmallVec::from_slice(&v.0);
                v[start..end].iter_mut().for_each(|c| *c = b' ');
                *text = Cow::Owned(SmallVecWrap(v));
            }
        }
        Cow::Owned(v) => {
            v.0[start..end].iter_mut().for_each(|c| *c = b' ');
        }
    }
}

(playground)


A third option exists for you: don't use SmallVec<A> unless you've benchmarked and determined that these small allocations are significantly slowing down your program.

Mcgruter answered 8/10, 2019 at 8:6 Comment(1)
Very nice answer. Thank you. I actually ran some benchmarks using the custom setup with SmallVec and it's actually slower (though lesser heap allocations) than the regular Vec case.Josefajosefina

© 2022 - 2024 — McMap. All rights reserved.