C++ aliasing rules
Asked Answered
c++
P

2

23

Just wondering if someone would confirm a few aliasing rules for me.

I know that aliasing (i.e load-store issues) could cause the following type of code to be suboptimal, because we can't assume that x, y, z don't overlap:

// case 1:
void plus(size_t n, double *x, double *y, double *z)
{
    for (size_t i = 0; i != n; ++i)
        z[i] = x[i] + y[i];
} 

I know that there's a C keyword __restrict that hints to the compiler that it shouldn't consider the overlapping case, and hence potentially generate better code:

// case 2:
void plus(size_t n, double *__restrict x, double *__restrict y, double *__restrict z)
{ // as above... }

But how does aliasing work with C++ style code, where we would be dealing with container objects passed by reference, rather than the C-like examples above with raw pointers??

For instance, I'm assuming that there would be aliasing issues if we did the following:

// case 3:
void plus(std::vector<double> &x, std::vector<double> &y, std::vector<double> &z)
{ // similar to above... }

And to move to a less trivial example, does it make any difference if the underlying data types in the containers are different?? At the implementation level most containers dynamically manage storage with pointers, so it's not clear to me how the compiler could ensure that the following doesn't alias:

// case 4:
void foo(std::vector<mytype1> &x, std::vector<mytype2> &y)
{ // interwoven operations on x, y... }

I'm not trying to micro-optimise, but I'm wondering if it's currently better to pass restricted pointers to containers around, rather than references.

EDIT: To clear up some terminology, as pointed out: restrict is the C99 keyword. There's also __restrict and __restrict__ in various compilers, but they all do the same thing.

Phocomelia answered 12/6, 2011 at 7:42 Comment(5)
For the container example, unless &x == &y I don't believe there's any way they could overlap, assuming standards-compliant vector implementations.Hophead
@Sven: Distinct std::vector's don't overlap, but in general you could pass any object type by reference, potentially ones that do overlap. So I would expect that compilers need to ensure correctness, with conservative assumptions about aliasing. That's why __restrict came in...Phocomelia
@Darran: Are there actually compilers that will do vector optimizations or something when passed an object that overloads operator[], rather than an actual array or pointer? Somehow I doubt it.Hophead
@Sven: As I understand it, the load store issue is not just about whether loops can be unrolled into sse/sse2/etc vector instructions, but more to do with the fact that that inside a loop, after a store op, all of the variables accessed through pointers that may alias need to be re-loaded, which effects register use. As for std::vector, I would hope that operator[] would inline on most compilers/implementations, otherwise there could be big time performance issues, aliasing or not...Phocomelia
restrict is a C keyword (since C99), but __restrict is not. As the __ indicates, it is a compiler extension.Showboat
S
9

According to the strict-aliasing rule, you are not allowed to alias the same memory with pointers to different types (except char* and friends), so case 4 could only apply if one of the types was a char*.

Case 3 though isn't all that different from case 1, as references are implemented as pointers on all compilers I know, though the standard doesn't demand that and an implementation is free to come up with something else.

Song answered 12/6, 2011 at 7:53 Comment(4)
I agree that case 1 and 3 are essentially the same as far as aliasing goes, but if you're working with references as per case 3, you can't use __restrict, that's why I was thinking that passing containers via restricted pointers might be better than references...Phocomelia
@Darren: The C++ standard doesn't even mention the __restrict keyword, it's a MSVC extension. Maybe you should add that bit to the question.Song
Thanks for the strict alias info, that was new to me. Case 4 is probably the one that occurs most often in real code, so aliasing issues may not occur as often as I thought they might. And _restrict, or restrict, _restrict, etc (as I pointed out) is a C (99) keyword, but GCC and MSVC at least support it for C++Phocomelia
except char* and friends: Can you help to add a specific list of types allowed to make your answer more complete? Thanks. =)Vicechancellor
S
5

It's not specific at all to C++. Consider this C99 bit:

struct vector {
    double* data;
    size_t n;
};

void
plus(struct vector* restrict x, struct vector* restrict y, struct vector* restrict z)
{
    // same deal as ever
}

Here, restrict buys us very little: x->data, y->data and z->data are all double* and are allowed to alias. This is exactly like case 1, even when using restrict.

If there were a restrict keyword in C++ (or when using an extension), the best bet would probably to do plus(vecA.size(), &vecA[0], &vecB[0], &vecB[0]), using the same plus as in case 2. And in fact it's possible to do this right now, using a C89-style interface without restrict but that uses the keyword under the covers.

Showboat answered 12/6, 2011 at 9:17 Comment(2)
Could you define something like struct vector { double *restrict data; //etc }; in this sort of case, to prevent aliasing??Phocomelia
@Darren You can and it would work in this case but I think that given void f(struct vector* x, struct vector* y) then f(p, p) is problematic: x->data will be double* restrict but will alias y->data (IIUC). restrict really seems more like a powerful tool to reserve for special situations than a panacea.Showboat

© 2022 - 2024 — McMap. All rights reserved.