Why does C++11 require std::sort to have WCET O(n log n)?

Since C++11, the C++ Standard Library (cf. Section 25.4.1.1 of a draft version of the standard) requires the std::sort algorithm to have asymptotic worst-case execution time O(n log n), rather than only average-case O(n log n).

Following the change, a plain quicksort, for example, no longer complies with the specification. This was pointed out in a bug report for LLVM's libc++. Instead, algorithms such as introsort or pdqsort, which have worst-case running time O(n log n), are usually used.
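For illustration, introsort's core idea can be sketched as follows: run quicksort with a recursion-depth budget of roughly 2·log2(n); if a pathological input exhausts the budget, fall back to heapsort, which is O(n log n) in the worst case. This is a simplified sketch with invented function names, not the actual libc++ or libstdc++ code:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Simplified introsort sketch (hypothetical code, not a real library's
// implementation): quicksort with a recursion-depth budget; once the budget
// is exhausted, fall back to heapsort, bounding the worst case by O(n log n).
template <typename It>
void introsort_impl(It first, It last, int depth_limit) {
    while (last - first > 1) {
        if (depth_limit-- == 0) {
            // Depth budget exhausted: heapsort the remaining range.
            std::make_heap(first, last);
            std::sort_heap(first, last);
            return;
        }
        // Middle-element pivot; copy it, since partitioning moves elements.
        auto pivot = *(first + (last - first) / 2);
        // Three-way partition: [first, mid) < pivot, [mid, mid2) == pivot,
        // [mid2, last) > pivot. The middle group is nonempty (it contains the
        // pivot), so both remaining subranges are strictly smaller and the
        // recursion always makes progress.
        It mid  = std::partition(first, last,
                                 [&](const auto& x) { return x < pivot; });
        It mid2 = std::partition(mid, last,
                                 [&](const auto& x) { return !(pivot < x); });
        introsort_impl(first, mid, depth_limit);
        first = mid2;  // tail-iterate on the "greater than pivot" part
    }
}

template <typename It>
void introsort(It first, It last) {
    auto n = last - first;
    if (n <= 1) return;
    introsort_impl(first, last, 2 * static_cast<int>(std::log2(n)));
}
```

Real implementations additionally switch to insertion sort for small subranges; the 2·log2(n) depth budget is the constant Musser's paper suggests.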

Is there any documentation on the motivation for that change? Is there some anecdote or incident that led to that change?

Slacks answered 17/3, 2021 at 8:47 Comment(18)
Because it can I guess. Why ask for less when you can ask for more?Charades
where can I find the standard or a draft thereof?Slacks
@MaxHaslbeck I use this site https://github.com/timsong-cpp/cppwpIntricate
#82156Tsarina
Independently of the actual reasons: specifying the worst case is formally stronger than specifying only the average case in terms of algorithm robustness.Connett
@Connett I think intuitively you are right. For your claim it is important which kind of distribution of input arrays is used to determine the average case complexity of a sorting algorithm. I think it might be hard to prove your claim rigorously. :)Slacks
No, that isn't the question. It was the question, until C++11 got rid of it.Neutretto
@MaxHaslbeck did you misread Secundis comment? It is in line with the point you raise. Making a statement about worst case is stronger than making a statement for average case, also for the reason you mentionVociferous
Well, in the linked resources I can't see "worst case" anywhere. Is it implied? Or is it poor wording?Dhar
@TobiasBrösamle complexity must always be the same or better than specified (so yes, it is implicit), while average means that it can be worse for specific casesVociferous
@largest_prime_is_463035818 I guess I misread. @Connett pointed out an important aspect. Sorry for diverting the discussion. I'd like to focus on the motivation for the change. Were there maybe some DoS attacks exploiting some patterns that made std::sort quadratic?Slacks
@TobiasBrösamle in that document (I think) "average case" is noted explicitly, and if not explicitly stated, "worst case" is consideredSlacks
no, well I am sorry for interfering. Question is clear enough, we just have to wait for someone to know the answer and write it down :)Vociferous
The 1998 standard probably specified O(n log n) average complexity because algorithms offering that were relatively mature, while sorting algorithms with O(n log n) WCET were less so (e.g. introsort was created in 1997). By 2011, more algorithms with O(n log n) WCET had become available and mature, so the standard could be updated. That lag is normal, and allows for rigorous verification of claims about the performance of algorithms (rather than introducing and later having to retrospectively remove a requirement because claims about newer algorithms don't hold up)Sheaves
The 1999 STL documentation from SGI contains a note about adopting Introsort but doesn't give an exact date.Athletic
It's not surprising that Introsort was adopted into some implementation of the STL fairly early because David Musser and Alex Stepanov had collaborated on generic programming since the 1980s.Athletic
Demo of quadratic number of operation in Clang's std::sort (from the bug report): gcc.godbolt.org/z/KdfT3rY1KPettis
This is issue 713 of the standard library here.Charades

IIUC, this was LWG issue 713. In that issue (from 2007) Matt Austern wrote:

The complexity of sort() is specified as "Approximately N log(N) (where N == last - first) comparisons on the average", with no worst case complexity specified. The intention was to allow a median-of-three quicksort implementation, which is usually O(N log N) but can be quadratic for pathological inputs. However, there is no longer any reason to allow implementers the freedom to have a worst-case-quadratic sort algorithm. Implementers who want to use quicksort can use a variant like David Musser's "Introsort" (Software Practice and Experience 27:983-993, 1997), which is guaranteed to be O(N log N) in the worst case without incurring additional overhead in the average case. Most C++ library implementers already do this, and there is no reason not to guarantee it in the standard.
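To see the quadratic worst case concretely, here is a hypothetical demo (not any standard library's actual code): a textbook quicksort that always pivots on the first element, instrumented to count comparisons. Already-sorted input drives it to n(n-1)/2 comparisons; median-of-three fixes this particular input but is still quadratic on crafted "median-of-three killer" sequences.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical demo: quicksort with a first-element pivot, counting
// comparisons. On already-sorted input every partition step is maximally
// unbalanced (the pivot is the minimum), so the recursion shrinks the range
// by only one element per level, giving n(n-1)/2 comparisons in total.
std::size_t comparisons = 0;

void naive_quicksort(std::vector<int>& v, std::ptrdiff_t lo, std::ptrdiff_t hi) {
    if (lo >= hi) return;
    int pivot = v[lo];  // worst possible pivot choice for sorted input
    std::ptrdiff_t i = lo;
    for (std::ptrdiff_t j = lo + 1; j <= hi; ++j) {
        ++comparisons;
        if (v[j] < pivot) std::swap(v[++i], v[j]);
    }
    std::swap(v[lo], v[i]);
    naive_quicksort(v, lo, i - 1);
    naive_quicksort(v, i + 1, hi);
}
```

On a sorted input of size 1000 (e.g. filled with std::iota) this performs exactly 1000·999/2 = 499500 comparisons, matching the quadratic bound the quote describes.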

Apparently the issue was discussed at Sophia Antipolis on Wednesday 2008-06-10. The very short minutes are viewable only by WG21 members, but don't say anything significant anyway; it seems like the proposed resolution was accepted pretty much without discussion, modulo one editorial tweak.

Effervescent answered 10/8, 2024 at 2:29 Comment(0)
