How can I take an item from a Vec in Rust?
Asked Answered
F

3

40

I'm looking for a method that consumes a Vec and returns one element, without the overhead of restoring Vec's invariants the way remove and swap_remove do:

fn take<T>(vec: Vec<T>, index: usize) -> Option<T>

However, I can't find such a method. Am I missing something? Is this actually unsafe or impossible?

This is a different question from Built in *safe* way to move out of Vec<T>? There the goal was a remove method that didn't panic on out of bounds access and returned a Result. I'm looking for a method that consumes a Vec and returns one of the elements. None of the answers to the above question address my question.

Functional answered 21/8, 2017 at 19:2 Comment(3)
Do you mean specifically Option<T> rather than Option<&T> that you can get from vec.get(index), or did you miss that .get exists?Sherillsherilyn
@Sherillsherilyn I specifically mean Option<T> like it says in my question. What I want is similar to option.take() if that makes sense?Functional
Note: if you find yourself throwing Vec on a regular basis, you may want to see if you can avoid materializing them to start with. Not doing anything is always faster than doing something, no matter how efficient you are at doing it.Galatians
S
34

You can write your function like this:

fn take<T>(mut vec: Vec<T>, index: usize) -> Option<T> {
    if vec.get(index).is_none() {
        None
    } else {
        Some(vec.swap_remove(index))
    }
}

The code you see here (get and swap_remove) is guaranteed O(1).

However, kind of hidden, vec is dropped at the end of the function and this drop operation is likely not O(1), but O(n) (where n is vec.len()). If T implements Drop, then drop() is called for every element still inside the vector, meaning dropping the vector is guaranteed O(n). If T does not implement Drop, then the Vec only needs to deallocate the memory. The time complexity of the dealloc operation depends on the allocator and is not specified, so we cannot assume it is O(1).


To mention another solution using iterators:

fn take<T>(vec: Vec<T>, index: usize) -> Option<T> {
    vec.into_iter().nth(index)
}

While Iterator::nth() usually is a linear time operation, the iterator over a vector overrides this method to make it a O(1) operation. Of course, if T implements Drop, this again is an O(n) function as n elements need to be dropped.

Sylvestersylvia answered 21/8, 2017 at 19:8 Comment(5)
Ah, this is a bit of an ugly solution, but I think it should work! I still think a take method on Vec would be great...Functional
@Functional Maybe you should open a RFCChrysoprase
How about changing mut vec: Vec<T> to vec: &mut Vec<T> in first solution?Auckland
@lukwol: Why? The OP explicitly asked something that consumes the vector.Galatians
I don't know what was the case when this answer was written, but as of today, into_iter().nth() is O(1), unless the items implement Drop - but that is also true about the "swap_remove() then drop the Vec" approach.Exhilarate
G
8

The reason fn take<T>(vec: Vec<T>, index: usize) -> Option<T> does not exist in the standard library is that it is not very useful in general. For example, supposing that you have a Vec<String> of length 10, it means throwing away 9 strings and only using 1. This seems wasteful.

In general, the standard library will try to provide an API that is useful in a maximum of scenarios, and in this instance it would be more logical to have a fn take<T>(vec: &mut Vec<T>, index: usize) -> Option<T>.

The only question is how to preserve the invariant, of course:

  • it can be preserved by exchanging with the last element, which is what Vec::swap_remove does,
  • it can be preserved by shifting the successor elements in, which is what Vec::drain does.

Those are very flexible, and can be adapted to fill more specific scenarios, such as yours.


Adapting swap_remove:

fn take<T>(mut vec: Vec<T>, index: usize) -> Option<T> {
    if index < vec.len() {
        Some(vec.swap_remove(index))
    } else {
        None
    }
}

Adapting drain:

fn take<T>(mut vec: Vec<T>, index: usize) -> Option<T> {
    if index < vec.len() {
        vec.drain(index..index+1).next()
    } else {
        None
    }
}

Noting that the former is more efficient: it's O(1).


I'm looking for a method that consumes the Vec and returns one element, without the overhead of restoring Vec's invariants the way remove and swap_remove do.

This reeks of premature micro-optimization to me.

First of all, note that it is necessary to destroy the elements of the vector; you can accomplish this in two ways:

  1. swap_remove, then iterate over each element to destroy them,
  2. Iterate over each element to destroy them, skipping the specific index.

It is not clear to me that the latter would be faster than the former; if anything it looks more complicated, with more branches (I advise two loops), which may throw off the predictor and may be less amenable to vectorization.

Secondly, before complaining about the overhead of restoring the Vec's invariant, have you properly profiled the solution?

If we look at the swap_remove variant, there are 3 steps:

  1. swap_remove (O(1)),
  2. destroy each remaining element (O(N)),
  3. free the backing memory.

Step 2 may be optimized out if the element has no Drop implementation, but otherwise I would be it's a toss whether (2) or (3) is dominating the cost.

TL;DR: I am afraid that you are fighting ghost issues, profile before trying to optimize.

Galatians answered 22/8, 2017 at 9:36 Comment(2)
I'm not necessarily only complaining about the overhead of restoring Vec's invariants (although the code I'm working with is in a tight inner loop, so speed really does matter to me), I also want to write code that follows what I mean. In this case, I really want to express the take operation, not a remove operation. Even if this was just a speed issue, I need an alternative to profile against anyway.Functional
@Others: "Tight inner loop" and "dropping Vec" don't sound too good together. Once you've written the code, I encourage you to post it to Code Reviews or reddit and ask how to optimize it; hopefully there'll be a way to avoid a memory deallocation in your inner loop as those are costly.Galatians
G
0

If your type T has the Default trait implemented, what is for most Rust standard library types the case - except for rare cases eg. tuples more than twelve items long - std::mem::take corresponds pretty much to your requirement:

fn take<T: std::default::Default>(mut vec: Vec<T>, index: usize) -> Option<T> {
    if index < vec.len() {
        Some(std::mem::take(&mut vec[index]))
    }
    else {
        None
    }
}
Gynarchy answered 2/6 at 9:20 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.