Efficiency of flattening and collecting slices
Asked Answered
C

1

2

If one uses the standard .flatten().collect::<Box<[T]>>() on an Iterator<Item=&[T]> where T: Copy, does it:

  • perform a single allocation; and
  • use memcpy to copy each item to the destination

or does it do something less efficient?

Crier answered 26/10, 2019 at 14:21 Comment(1)
Do you mean .flatten().copied().collect::<Box<[T]>>()?Threap
I
3

Box<[T]> does not implement FromIterator<&T>, so I'll assume your actual inner iterator is something that yields owned Ts.

FromIterator<T> for Box<[T]> forwards to Vec<T>, which uses size_hint() to reserve space for lower + 1 items, and reallocates as it grows beyond that (moving elements as necessary). So the question is, what does Flatten<I> return for size_hint?

The implementation of Iterator::size_hint for Flatten<I> forwards to the internal struct FlattenCompat<I>, which is a little complicated because it supports double-ended iteration, but ultimately returns (0, None) if the outer iterator has not been advanced or exhausted.

So the answer to your question is: it does something less efficient. Namely, (unless you have already called next or next_back on the iterator at least once) it creates an empty Vec<T> and grows it progressively according to whatever growth strategy Vec uses (which is unspecified, but guaranteed by the documentation to result in O(1) amortized push).

This isn't an artificial limitation; it is fundamental to the way Flatten works. The only way you could pre-calculate the size of the flattened iterator is by exhausting the outer iterator and adding up all the inner size_hints. This is a bad idea both because it doesn't always work (the inner iterators may not return useful size_hints) and because you also have to find a way to keep the inner iterators around after exhausting the outer one; there's no solution that would be acceptable for a general purpose iterator adapter.

If you know something about your particular iterator that enables you to know what the final size should be, you can reserve the allocation yourself by calling Vec::with_capacity and use Extend to fill it from the flattened iterator, rather than using collect.

Izawa answered 26/10, 2019 at 15:33 Comment(1)
Awesome answer! I was already doing exactly what you suggested in your final para, but on review wondered whether flattening and collecting would be a more eloquent way of achieving the same; alas, I shall leave as-is. Thank you.Crier

© 2022 - 2025 — McMap. All rights reserved.