How to correctly (yet efficiently) implement something like "vector::insert"? (Pointer aliasing)
Asked Answered
D

3

6

Consider this hypothetical implementation of vector:

template<class T>  // ignore the allocator
struct vector
{
    typedef T* iterator;
    typedef const T* const_iterator;

    template<class It>
    void insert(iterator where, It begin, It end)
    {
        ...
    }

    ...
}

Problem

There is a subtle problem we face here:
There is the possibility that begin and end refer to items in the same vector, after where.

For example, if the user says:

vector<int> items;
for (int i = 0; i < 1000; i++)
    items.push_back(i);
items.insert(items.begin(), items.end() - 2, items.end() - 1);

If It is not a pointer type, then we're fine.
But we don't know, so we must check that [begin, end) does not refer to a range already inside the vector.

But how do we do this? According to C++, if they don't refer to the same array, then pointer comparisons would be undefined!
So the compiler could falsely tell us that the items don't alias, when in fact they do, giving us unnecessary O(n) slowdown.

Potential solution & caveat

One solution is to copy the entire vector every time, to include the new items, and then throw away the old copy.

But that's very slow in scenarios such as in the example above, where we'd be copying 1000 items just to insert 1 item, even though we might clearly already have enough capacity.

Is there a generic way to (correctly) solve this problem efficiently, i.e. without suffering from O(n) slowdown in cases where nothing is aliasing?

Dip answered 20/8, 2012 at 2:52 Comment(15)
You can use the predicates std::less etc, which are guaranteed to give a total order, even when the raw pointer comparisons do not.Davena
That said, I'm pretty sure (I haven't really though about this problem before) that the vectors defined in the standard have undefined behaviour if you try to do this, because the iterators become invalidated by the insert. (Not to discourage an implementation which extends/modifies std::vector to support this).Davena
@Mankarse: Whoa, really?! So less(p1, p2) doesn't engage in UB, whereas p1 < p2 does?? I didn't know that!Dip
Yep! [comparisons]/8: For templates greater, less, greater_equal, and less_equal, the specializations for any pointer type yield a total order, even if the built-in operators <, >, <=, >= do not.Davena
In the best case you would suffer an extra if. Why would you suffer O(n) slowdown all the time?Vang
@Mankarse: Please post that as an answer! It's an amazing find! :DDip
@Dave: Well, if you copy the entire thing, which was my initially proposed solution, it's O(n). Like I mentioned, the naive if implementation doesn't work correctly.Dip
@Mehrdad Hmmm. If there's not enough capacity you copy into new memory and it's easy. If there is and the request is to insert a piece of itself, copy that section to the end of the data and swap each element with the elements starting at where. If there is enough capacity and the request isn't to insert a piece of itself, copy where to end() to the end of the data and copy begin to end to where. Sorry if the way I wrote it is confusing, but I would consider what I just wrote the naive if implementation.Vang
@Dave: Hmm, I don't think swapping works by itself -- I think you'd need std::rotate, right?Dip
@Mehrdad std::rotate is basically a loop of swaps right? I've never used it. But actually std::rotate_copy might be all the swapping logic I described already written for you.Vang
@Dave: Right, but how many swaps? It looks O(n) to me, if I'm looking at it right. (You could argue it's always within a constant factor, etc. but it's still practically much slower than it needs to be.)Dip
@Mehrdad Well, it's O(n) but O is a crappy descriptor. You only have to loop through where to end() and begin to end in either case that you aren't allocating new memory. Not begin() to where. How can you avoid that? You can use SSE to optimize small types I guess.Vang
@Dave: Oh hmm, yeah, thinking about it more, you'd still have to touch just as many elements, so it's O(n) anyway, my bad. (Although it's swaps instead of copy, so it's slower.) But I have another question: You said, "If there is and the request is to insert a piece of itself"... but how do you test for that in the first place?Dip
@Mehrdad Based on what you've said the standard says about pointer comparisons, you can't do it in a generic cross platform way. Realistically you can do the obvious pointer comparison and it will work on every platform known to man, but pedantically you can't I guess.Vang
@Dave: Yeah, that was the whole issue to begin with. I don't buy the "it will work on every platform" thing, because of (if nothing else) bugs like this. Also realize that, if the compiler is somehow able to tell that the only value you ever assign to a pointer is the beginning or the end of an array, then it can always optimize away comparisons... it's hard, but not completely out of the question.Dip
D
6

You can use the predicates std::less etc, which are guaranteed to give a total order, even when the raw pointer comparisons do not.

From the standard [comparisons]/8:

For templates greater, less, greater_equal, and less_equal, the specializations for any pointer type yield a total order, even if the built-in operators <, >, <=, >= do not.

Davena answered 20/8, 2012 at 3:31 Comment(2)
What does total order mean?Vang
Also related, in the case of VC++ (I filed this): connect.microsoft.com/VisualStudio/feedback/details/758683/…Dip
P
0

But how do we do this? According to C++, if they don't refer to the same array, then pointer comparisons would be undefined!

Wrong. The pointer comparisons are unspecified, not undefined. From C++03 §5.9/2 [expr.rel]:

[...] Pointers to objects or functions of the same type (after pointer conversions) can be compared, with a result defined as follows:

[...]
-Other pointer comparisons are unspecified.

So it's safe to test if there is an overlap before doing the expensive-but-correct copy.

Interestingly, C99 differs from C++ in this, in that pointer comparisons between unrelated objects is undefined behavior. From C99 §6.5.8/5:

When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. [...] In all other cases, the behavior is undefined.

Psychosis answered 20/8, 2012 at 3:43 Comment(6)
How is an unspecified result going to be useful the check for overlap? If it is unspecified, then any result has no meaning (it is just guaranteed to not reformat your hard-drive or whatever).Davena
Also, it's worth noting that, in the case of Visual C++, MSDN specifically says that comparing unrelated pointers is "not guaranteed", so it's undefined on some implementations, anyway. :PDip
@Mankarse: An unspecified result is useful because if you have an array [a, a+length] and a pointer p, you can test if p points into the array with if (p >= a && p < a+length). You don't care which condition is false, as long as one is.Psychosis
@Mehrdad: Do you have a citation for that? If that's true, then that just means that Visual C++ is not a strictly conforming C++ implementation. Although I'd be very, very surprised if Visual C++ could generate code for any of its supported ISAs that didn't have defined (yet unspecified) behavior.Psychosis
@AdamRosenfield: You could've just searched it yourself, but here: msdn.microsoft.com/en-us/library/ya5wt12d.aspx "Comparison of pointers is guaranteed valid only when the pointers refer to objects in the same array or to the location one past the end of the array."Dip
That wording leaves some room for interpretation. "...is guaranteed valid only..." doesn't exactly imply either unspecified or undefined.Psychosis
H
0

Actually, this would be true even if they were regular iterators. There's nothing stopping anyone doing

std::vector<int> v; 
// fill v 
v.insert(v.end() - 3, v.begin(), v.end());

Determining if they alias is a problem for any implementation of iterators.

However, the thing you're missing is that you're the implementation, you don't have to use portable code. As the implementation, you can do whatever you want. You could say "Well, in my implementation, I follow x86 and < and > are fine to use for any pointers.". And that would be fine.

Harriettharrietta answered 20/8, 2012 at 3:57 Comment(1)
Well, x86 has nothing to do with this... aside from the fact that we'd be ignoring segmentation, the compiler can optimize the code away anyway. :-) And for my case (VC++), MSDN says it specifically does not guarantee correct behavior here, so it would be unwise to assume otherwise, especially since I don't want to tie my code to VC++. Regarding iterators: Yeah, but when you're the implementation, it boils down to using either pointers or offsets (which wouldn't have the same problem, but which would be slower in general), and pointers exhibit this problem.Dip

© 2022 - 2024 — McMap. All rights reserved.