Cost of cloning iterator originating from into_iter() in Rust?

Asked 23/10, 2023 at 13:26 Answered 23/10, 2023 at 13:35

I'm trying to figure out the cost of cloning iterators that originate from into_iter() in Rust, but I can't find anything meaningful.

Consider code like this:

let v = vec![1,2,3,4,...]; // Some large vector
let iter = v.into_iter().map(...some closure...);
let another_iter = iter.clone(); // What is copied here??

Since I've moved the vector into an iterator, iter now owns the internal buffer with vector values. This is exactly what I want to achieve to abstract the container type.

However, what happens when I call iter.clone()? Does it copy the whole internal buffer with data (could be very expensive) or just copy the iterator state while referring to the same buffer (cheap)?

Is there an idiomatic way of storing and cheaply cloning such iterators originating from into_iter()?

Brockman answered 23/10, 2023 at 13:26 Comment(0)

As each IntoIterator implementation can define its own type IntoIter: Iterator<Item = Self::Item>, the answer depends entirely on the iterator produced by into_iter.

For std::vec::IntoIter, clone copies the internal buffer (more precisely, the items that have not been yielded yet), as its Clone implementation shows:

impl<T: Clone, A: Allocator + Clone> Clone for IntoIter<T, A> {
    #[cfg(not(test))]
    fn clone(&self) -> Self {
        self.as_slice().to_vec_in(self.alloc.deref().clone()).into_iter()
    }
    #[cfg(test)]
    fn clone(&self) -> Self {
        crate::slice::to_vec(self.as_slice(), self.alloc.deref().clone()).into_iter()
    }
}
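To check this behaviour without reading the standard library source, here is a minimal sketch that compares the buffer addresses of an iterator and its clone. The helper name clone_partially_consumed and the sample values are made up for illustration; only into_iter(), clone() and as_slice() come from std.

```rust
/// Hypothetical helper: advances a `vec::IntoIter` by `skip` items, clones it,
/// and reports whether the clone reads from a different buffer, together with
/// the items that the original and the clone go on to yield.
fn clone_partially_consumed(v: Vec<u32>, skip: usize) -> (bool, Vec<u32>, Vec<u32>) {
    let mut iter = v.into_iter();
    for _ in 0..skip {
        iter.next();
    }

    // Allocates a new buffer and clones every remaining item into it.
    let cloned = iter.clone();

    // `as_slice` exposes the remaining items; comparing the start addresses
    // shows whether the two iterators share storage.
    let separate = iter.as_slice().as_ptr() != cloned.as_slice().as_ptr();

    (separate, iter.collect(), cloned.collect())
}

fn main() {
    let (separate, original, cloned) = clone_partially_consumed(vec![1, 2, 3, 4, 5], 2);
    println!("separate buffers: {separate}");
    println!("original yields {original:?}, clone yields {cloned:?}");
}
```

The clone only contains the items that were still pending, so cloning late in the iteration is cheaper than cloning at the start, but it is still a copy proportional to the number of remaining items.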

Airlift answered 23/10, 2023 at 13:32 Comment(3)

Thanks, this clarifies things! However, I wonder what the rationale for this design is. Since this is an immutable iterator, what prevents several copies from sharing the same buffer? – Brockman 23/10, 2023 at 13:39

@Brockman vec.into_iter() moves items out of the vec, which prevents access from where they were stored originally. If you want to have multiple iterators share the same storage, use vec.iter(), which returns references instead. – Osprey 23/10, 2023 at 13:49
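A minimal sketch of the vec.iter() suggestion, assuming the vector can stay alive for as long as the iterators are used. The function names shares_storage and doubled_twice and the doubling closure are illustrative stand-ins, not part of the original question.

```rust
/// Hypothetical helper: clones a borrowing iterator and checks that both
/// copies point at the same underlying storage.
fn shares_storage(v: &[u32]) -> bool {
    let iter = v.iter();
    let another = iter.clone(); // copies the iterator state only, not the items
    iter.as_slice().as_ptr() == another.as_slice().as_ptr()
}

/// Hypothetical helper: the same pattern with a `map` adapter on top.
/// `Map` is `Clone` when both the inner iterator and the closure are `Clone`.
fn doubled_twice(v: &[u32]) -> (Vec<u32>, Vec<u32>) {
    let iter = v.iter().map(|x| x * 2);
    let another = iter.clone();
    (iter.collect(), another.collect())
}

fn main() {
    let v = vec![1, 2, 3, 4];
    println!("shared storage: {}", shares_storage(&v));

    let (first, second) = doubled_twice(&v);
    println!("{first:?} {second:?}");
}
```

The trade-off is that the iterators now borrow from v, so they cannot outlive it, and the items are references rather than owned values.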

@Brockman there is nothing immutable about vec::IntoIter: it owns the items, so it can do anything it likes with them. Also, the iterator has to be mutable to get items out of it, and it returns owned items as well. Remember that you can always write let mut x = x; to be able to mutate owned items even if the original binding is immutable. – Airlift 23/10, 2023 at 13:52

It clones the entire buffer. This is essentially equivalent to vec.clone(), except that it discards already-iterated items. You can see the code.

You can use Itertools::tee() instead, but it will be more efficient only if the Vec is very big and you keep both iterators close to the same position, so that neither lags far behind the other. Even its docs warn:

Note: If the iterator is clonable, prefer using that instead of using this method. Cloning is likely to be more efficient.
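If the goal is an owning iterator whose clones share one buffer, another option is to put the items behind a reference-counted slice and keep only an index per iterator. The sketch below is one possible way to do that with std alone; SharedIter and its methods are hypothetical names, and it assumes the items are Clone and that single-threaded Rc is sufficient (Arc would be the thread-safe counterpart).

```rust
use std::rc::Rc;

/// Hypothetical iterator type: owns a share of a reference-counted buffer
/// plus its own position. Cloning it bumps the reference count and copies
/// the position; the items themselves are not copied.
#[derive(Clone)]
struct SharedIter<T> {
    buf: Rc<[T]>,
    pos: usize,
}

impl<T> SharedIter<T> {
    /// Moves the vector's items into a shared buffer (a one-time move of the
    /// items into a new allocation).
    fn new(v: Vec<T>) -> Self {
        SharedIter { buf: Rc::from(v), pos: 0 }
    }

    /// True if both iterators read from the same allocation.
    fn shares_buffer_with(&self, other: &Self) -> bool {
        Rc::ptr_eq(&self.buf, &other.buf)
    }
}

impl<T: Clone> Iterator for SharedIter<T> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        // Items stay in the shared buffer, so each one is cloned on the way out.
        let item = self.buf.get(self.pos)?.clone();
        self.pos += 1;
        Some(item)
    }
}

fn main() {
    let mut a = SharedIter::new(vec![10, 20, 30]);
    a.next();
    let b = a.clone(); // cheap: no per-item work
    println!("same buffer: {}", a.shares_buffer_with(&b));
    println!("{:?} {:?}", a.collect::<Vec<_>>(), b.collect::<Vec<_>>());
}
```

This moves the cost from clone() to next(): each yielded item is cloned individually, which is cheap for small Copy-like types but not free for heavier ones.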

Bankrupt answered 23/10, 2023 at 13:35 Comment(0)
