Is this unsafe usage of a braced initializer list in a range-based for loop?
Asked Answered
S

1

12

This is very similar to a question I asked earlier today. However, the example I cited in that question was incorrect; by mistake, I was looking at the wrong source file, not the one that actually had the error I described. Anyway, here's an example of my problem:

struct base { };
struct child1 : base { };
struct child2 : base { };

child1 *c1;
child2 *c2;

// Goal: iterate over a few derived class pointers using a range-based for loop.
// initializer_lists are convenient for this, but we can't just use { c1, c2 }
// here because the compiler can't deduce what the type of the initializer_list
// should be. Therefore, we have to explicitly spell out that we want to iterate
// over pointers to the base class.
for (auto ptr : std::initializer_list<base *>({ c1, c2 }))
{
   // do something
}

This was the site of some subtle memory corruption bugs in my application. After experimenting a bit, I found that changing the above to the following made the crashes that I was observing go away:

for (auto ptr : std::initializer_list<base *>{ c1, c2 })
{
   // do something
}

Note that the only change was an extra set of parentheses around the braced initializer list. I'm still trying to wrap my mind around all of the forms of initialization; am I right in surmising that in the first example, the inner braced initializer is a temporary that doesn't get its lifetime extended for the lifetime of the loop, due to the fact that it is an argument to the std::initializer_list copy constructor? Based on this previous discussion of what the equivalent statement for the range-based for is, I think that's the case.

If this is true, then does the second example succeed because it uses direct list-initialization of the std::initializer_list<base *>, and therefore the contents of the temporary braced list get their lifetime extended to the duration of the loop?

Edit: After some more work, I now have a self-contained example that illustrates the behavior that I'm seeing in my application:

#include <initializer_list>
#include <memory>

struct Test
{
    int x;
};

int main()
{
    std::unique_ptr<Test> a(new Test);
    std::unique_ptr<Test> b(new Test);
    std::unique_ptr<Test> c(new Test);
    int id = 0;
    for(auto t : std::initializer_list<Test*>({a.get(), b.get(), c.get()}))
        t->x = id++;
    return 0;
}

If I compile this on macOS with Apple LLVM version 8.1.0 (clang-802.0.42), as follows:

clang++ -O3 -std=c++11 -o crash crash.cc

then the resulting program dies with a segfault when I run it. Using an earlier version of clang (8.0.0) doesn't exhibit the issue. Likewise, my tests with gcc on Linux did not have any issues either.

I'm trying to determine whether this example illustrates undefined behavior due to the temporary lifetime issues I mentioned above (in which case my code would be in the wrong, and the newer clang version would be vindicated), or whether this is potentially just a bug in this particular clang release.

Edit 2: The test is live here. It looks like the change in behavior occurred between mainline clang v3.8.1 and v3.9.0. Previous to v3.9, the effective behavior of the program was just int main() { return 0; }, which is reasonable as there are no side effects to the code. In later versions, however, the compiler seems to optimize out the calls to operator new but keeps the writes to t->x in the loop, hence trying to access an uninitialized pointer. I suppose that might point to a compiler bug.

Saree answered 26/4, 2017 at 3:17 Comment(1)
I can't see any issue with the code. Even if it is not a good idea to use std::initializer list, there is nothing wrong with it. And teh lifetime of the data from the initializer list during the range base for loop is valid. So I can't see any reason in using the list as a source of your problem. BTW: If you simple generate your pointers with Base* ptr1 = new Child1 ... you can directly write for( auto ptr: { ptr1, ptr2 })... As all ptr have the same type ( the base ) there is no problem with type deduction. The same if using smart pointers with base type.Illinois
E
2

I'll answer in two parts: A meta-answer and specific one.

The general (and more relevant) answer: Don't use initializer_list's unless you have to.

Is this unsafe usage of a braced initializer list?

Of course it is. Generally, you shouldn't construct initializer lists yourself. It's an ugly piece of detail on the seam between the language and the standard library. You should ignore its very existence unless you absolutely have to work with it, and even then - do so because some other code made you, don't put yourself into holes from which you need to be a language lawyer to climb out. The definition of std::initializer_list has a lot of fine print full of weasel words like "proxy", "provides access to" (as opposed to "is"), "may be implemented as" (but also may not) and "is not guaranteed to exist after". Ugh.

More specifically, you're using a ranged-based for loop. Whatever made you want to explicitly instantiate an std::initializer_list? What's wrong with the following:

struct Test { int x; };

int main() {
    std::unique_ptr<Test> a(new Test);
    std::unique_ptr<Test> b(new Test);
    std::unique_ptr<Test> c(new Test);
    int id = 0;
    for( auto t : {a.get(), b.get(), c.get()} )
         t->x = id++;
}

? That's the way your example should be written.

Now, you might be asking: "But what's the type of that brace-enclosed initializer list?"

Well, to that I would say:

  1. What's it to you? The intent is clear, and there is no reason to believe any segfault should occur here. If this code had not worked, you could (and should) have complained straight to the standard committee about how the language was brain-dead.
  2. I don't know. Or I rather, I make it a point not to know unless I really have to.
  3. (Ok, just between you and me - it's std::initializer_list<Test *> &&, but really, forget that fact.)

About your specific final version of the code

Your loop,

for(auto t : std::initializer_list<Test*>({a.get(), b.get(), c.get()})) { t->x = id++; }

is equivalent to the following code:

auto && range = std::initializer_list<Test*>({a.get(), b.get(), c.get()});
for (auto begin = std::begin(range), end = range.end(); begin != end; ++begin) {
    auto t = *begin;
    t->x = id++;
}

The initializer list definition says:

The underlying array is not guaranteed to exist after the lifetime of the original initializer list object has ended.

I am not a language lawyer, but my guess is as follows: It seems like you're initializing the outer initializer_list with an inner one, and thus not getting a guaranteed lifetime extension of the inner one beyond the constructor of the outer initializer_list. It is therefore up to the compiler's whim whether to keep it alive or not, and it's entirely possible that different versions of different compilers behave differently in this respect.

Enthusiastic answered 3/1, 2020 at 22:29 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.