What's the most performant way to create a sorted copy of a C++ vector?

Asked 20/3, 2017 at 23:23 Answered 25/1 at 16:39

Given a C++ vector (let's say it's of doubles, and let's call it unsorted), what is the most performant way to create a new vector sorted which contains a sorted copy of unsorted?

Consider the following naïve solution:

std::vector<double> sorted = unsorted;
std::sort(sorted.begin(), sorted.end());

This solution has two steps:

1. Create an entire copy of unsorted.
2. Sort it.

However, there is potentially a lot of wasted effort in the initial copy of step 1, particularly for a large vector that is (for example) already mostly sorted.

Were I writing this code by hand, I could combine the first pass of my sorting algorithm with step 1, by having the first pass read the values from the unsorted vector while writing them, partially sorted as necessary, into sorted. Depending on the algorithm, subsequent steps might then work just with the data in sorted.

Is there a way to do such a thing with the C++ standard library, Boost, or a third-party cross-platform library?

One important point would be to ensure that the memory for the sorted C++ vector isn't needlessly initialized to zeroes before the sorting begins. Many sorting algorithms would require immediate random-write access to the sorted vector, so using reserve() and push_back() won't work for that first pass, yet resize() would waste time initializing the vector.
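One known workaround for the zero-initialization cost is an allocator adaptor that default-initializes elements instead of value-initializing them. The sketch below is only an illustration: the names default_init_allocator and raw_double_vector are hypothetical, and the resulting type is not std::vector<double> with the default allocator, so it may not fit code that expects exactly that type.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Hypothetical allocator adaptor: construct() with no arguments performs
// default-initialization, which leaves a double's value indeterminate
// instead of writing 0.0 to it.
template <typename T, typename A = std::allocator<T>>
class default_init_allocator : public A {
    using traits = std::allocator_traits<A>;

public:
    template <typename U>
    struct rebind {
        using other =
            default_init_allocator<U, typename traits::template rebind_alloc<U>>;
    };

    using A::A;  // inherit the base allocator's constructors

    // Called by vector(n) and resize(n): no zero-fill for trivial types.
    template <typename U>
    void construct(U* ptr) {
        ::new (static_cast<void*>(ptr)) U;
    }

    // Every other form of construction is forwarded to the base allocator.
    template <typename U, typename... Args>
    void construct(U* ptr, Args&&... args) {
        traits::construct(static_cast<A&>(*this), ptr,
                          std::forward<Args>(args)...);
    }
};

// Hypothetical alias: a vector of doubles whose resize(n) does not zero-fill.
using raw_double_vector = std::vector<double, default_init_allocator<double>>;
```

With this alias, raw_double_vector sorted(n) gives n writable slots with indeterminate values, so every element must be written before it is read.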

Edit: Since the answers and comments don't all see why the "naïve solution" is inefficient, consider the case where the unsorted vector is in fact already in sorted order (or needs just a single swap to become sorted). In that case, regardless of the sort algorithm, the naïve solution must read every value at least twice: once while copying and once while sorting. A copy-while-sorting solution could potentially halve the number of reads, and thus approximately double the performance. A similar situation arises, regardless of the data in unsorted, with sorting algorithms that are more performant than std::sort, some of which run in O(n) rather than O(n log n) time.

Uticas asked 20/3, 2017 at 23:23 Comment(6)

Copy first, then sort the copy. The std::vector has an efficient copy method. – Fulllength 20/3, 2017 at 23:25

Naïve solution's pretty solid in this case. – Shaer 20/3, 2017 at 23:28

The link below is not a general algorithm, but it is worth reading: probablydance.com/2016/12/27/i-wrote-a-faster-sorting-algorithm – Ulaulah 21/3, 2017 at 1:5

It would be nice if an insertion_sort function were added to the STL so that this task could be done cleanly. – Matteo 29/10, 2021 at 6:13

Are you asking for general case correctness, or can we assume particular things to be more performant? – Lupine 25/1 at 17:16

You can create a sorted "view" of the vector without copying by creating a new sorted vector of std::reference_wrappers. – Purgative 25/1 at 17:25
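A minimal sketch of that idea, assuming a hypothetical helper named sorted_view. The view refers to the elements of unsorted, so it is only valid while that vector is alive and not reallocated:

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical helper: returns references to the elements of `unsorted`,
// ordered by value. The doubles themselves are neither copied nor moved.
std::vector<std::reference_wrapper<const double>>
sorted_view(const std::vector<double>& unsorted) {
    std::vector<std::reference_wrapper<const double>> view(unsorted.begin(),
                                                           unsorted.end());
    // Sorting rebinds the wrappers; the referenced doubles stay in place.
    std::sort(view.begin(), view.end(),
              [](const double& a, const double& b) { return a < b; });
    return view;
}
```

Note that on typical 64-bit platforms a reference_wrapper is pointer-sized, the same size as a double, so for doubles this saves no memory traffic and adds an indirection. It is more likely to pay off when the elements are large or expensive to copy.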

The standard library - on purpose - doesn't have a sort-while-copying function, because the copy is O(n) while std::sort is O(n log n).

So the sort will totally dominate the cost for any large value of n. (And if n is small, it doesn't matter anyway.)

Cloutman answered 20/3, 2017 at 23:30 Comment(3)

While std::sort is O(n log n) in the general case, it isn't for all possible input values. Consider data that is almost entirely sorted but only two values need to be swapped: "sort-while-copy" would be roughly twice the speed of "copy-then-sort." Similarly, consider linear-time sorting algorithms, such as a radix sort or a bucket sort. Boost contains spreadsort, which works with integers and floats, but requires an initial copy operation. – Uticas 21/3, 2017 at 0:40

If you have very specific knowledge of your data, you can of course often do something better than the general algorithms. If you always have exactly two elements out of order, you could for example swap those, use the sorted array as input to some function, and then swap them back again. That might be even faster. But, no, the standard library doesn't have something like that. And if you have this specific knowledge, a general-purpose sort might be the worst choice: a naive quicksort can reach O(n^2) for already sorted data (std::sort itself has been required to be O(n log n) in the worst case since C++11). – Cloutman 21/3, 2017 at 11:8

@Uticas radix sort and bucket sort don't generalise to arbitrary data. In the domains I've worked in, they've never even been applicable. – Lupine 25/1 at 17:22

Assuming the vector of doubles doesn't contain special values like NaN or infinity, the doubles can be treated as 64-bit sign + magnitude integers, which can then be converted to 64-bit unsigned integers for use with a radix sort, typically the fastest option. These macros convert back and forth; SM stands for sign + magnitude, ULL for unsigned long long (uint64_t). The macros assume the bits of each double have been reinterpreted as unsigned long long (a bit-for-bit copy, not a numeric conversion):

#define SM2ULL(x) ((x)^(((~(x) >> 63)-1) | 0x8000000000000000ull))
#define ULL2SM(x) ((x)^((( (x) >> 63)-1) | 0x8000000000000000ull))

Note that using these macros will treat negative zero as less than positive zero, but this is normally not an issue.
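As a sketch of how the conversion might be applied, the functions below do the same bit manipulation as the two macros, written as inline functions with std::memcpy for the bit copy. The names double_to_key and key_to_double are hypothetical, and the code assumes 64-bit IEEE 754 doubles:

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes 64-bit doubles");

// Hypothetical helper: maps a double to an unsigned key whose integer
// ordering matches the numeric ordering of the doubles.
inline std::uint64_t double_to_key(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);  // bit copy, not a numeric cast
    // Negative values: flip every bit. Non-negative: flip only the sign bit.
    return bits ^ ((bits >> 63) ? ~0ull : 0x8000000000000000ull);
}

// Hypothetical helper: inverse of double_to_key.
inline double key_to_double(std::uint64_t key) {
    // Top bit set means the original was non-negative: flip the sign bit
    // back. Otherwise the original was negative: flip every bit back.
    const std::uint64_t bits =
        key ^ ((key >> 63) ? 0x8000000000000000ull : ~0ull);
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}
```

Comparing keys as unsigned integers then gives the same order as comparing the original doubles, with negative zero ordered before positive zero as noted above.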

Radix sort needs an initial read pass to generate a matrix of counts (which are then converted into the starting or ending indices of logical bucket boundaries), so in this case the initial read pass can double as the copy pass that also generates the matrix of counts. A base 256 sort would use a matrix of size [8][256], and after the copy, 8 radix sort passes would be performed. If the vector is much larger than the cache, the dominant time factor will be the random access writes during each radix sort pass.
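A rough sketch of that scheme follows. The function name radix_sorted_copy is hypothetical, and this is an untuned illustration rather than a benchmarked implementation: it uses two temporary key buffers plus the output vector, all of which std::vector still zero-initializes, and it assumes 64-bit IEEE 754 doubles with no NaNs.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical function: returns a sorted copy of `unsorted` using a
// base 256 LSD radix sort. The input is read exactly once, in the copy pass.
std::vector<double> radix_sorted_copy(const std::vector<double>& unsorted) {
    const std::size_t n = unsorted.size();
    std::vector<std::uint64_t> a(n), b(n);
    std::array<std::array<std::size_t, 256>, 8> counts{};  // [pass][digit]

    // Copy pass: convert each double to a sortable key, store it, and
    // count every one of its 8 bytes.
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t bits;
        std::memcpy(&bits, &unsorted[i], sizeof bits);
        const std::uint64_t key =
            bits ^ ((bits >> 63) ? ~0ull : 0x8000000000000000ull);
        a[i] = key;
        for (int p = 0; p < 8; ++p) ++counts[p][(key >> (8 * p)) & 0xFF];
    }

    // Turn the counts into starting indices (exclusive prefix sums).
    for (auto& row : counts) {
        std::size_t sum = 0;
        for (auto& c : row) {
            const std::size_t count = c;
            c = sum;
            sum += count;
        }
    }

    // Eight stable passes, least significant byte first, from a into b.
    for (int p = 0; p < 8; ++p) {
        for (std::size_t i = 0; i < n; ++i) {
            const std::uint64_t key = a[i];
            b[counts[p][(key >> (8 * p)) & 0xFF]++] = key;
        }
        a.swap(b);  // the output of this pass is the input of the next
    }

    // Convert the sorted keys back to doubles.
    std::vector<double> sorted(n);
    for (std::size_t i = 0; i < n; ++i) {
        const std::uint64_t bits =
            a[i] ^ ((a[i] >> 63) ? 0x8000000000000000ull : ~0ull);
        std::memcpy(&sorted[i], &bits, sizeof bits);
    }
    return sorted;
}
```

Whether this beats copy-then-std::sort depends on the size of the vector and on the cache behavior of the scattered writes, so it is worth measuring on real data.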

Ditmore answered 21/3, 2017 at 0:39 Comment(2)

I am indeed aware of this trick, and it's a good one, and I am indeed familiar with sorting implementations that use it (see my comment on @BoPersson's answer). Unfortunately, this doesn't help with the issue of that wasteful initial copy while using the Standard Library, since all the implementations I can find seem to be sort-in-place. I am essentially looking for a pre-written sort implementation that would work along the lines of std::vector<double> sorted = fast_sorted_copy(unsorted); so that I don't need to roll my own. – Uticas 21/3, 2017 at 0:47

@Uticas - I'm not aware of a standard library implementation of radix sort. The first pass could read, convert, write and also create a matrix of counts that get converted into indices. example of radix sort base 256 – Ditmore 21/3, 2017 at 0:53

There is std::partial_sort_copy() in <algorithm>
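For reference, a minimal sketch of how std::partial_sort_copy() could be used to produce a full sorted copy. The wrapper name sorted_copy is hypothetical. Note that the destination vector must already have its final size, so it is still value-initialized before the sort:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical wrapper: when the destination range is as large as the
// source range, std::partial_sort_copy yields a fully sorted copy.
std::vector<double> sorted_copy(const std::vector<double>& unsorted) {
    std::vector<double> sorted(unsorted.size());  // zero-initialized here
    std::partial_sort_copy(unsorted.begin(), unsorted.end(),
                           sorted.begin(), sorted.end());
    return sorted;
}
```

This reads the source directly rather than copying it first, but whether it is faster than copy-then-std::sort for a full-size destination is something to benchmark, since implementations are commonly heap-based.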

Catalogue answered 25/1 at 16:39 Comment(1)

As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center. – Glottis 25/1 at 17:20
