Constructing a vector of structs (with some custom constructors) from exactly two string literals crashes. Why?
Asked Answered
H

1

25

Can you guess the output of this trivial program?

#include <vector>
#include <string>
#include <exception>
#include <iostream>

int main()
{
    try { 
        struct X {
            explicit X(int) {}
            X(std::string) {} // Just to confuse you more...
        };
        std::vector<X>{"a", "b"};
    } catch (std::exception& x) {
        std::cerr << x.what();
    }
}

Well, I couldn't, which cost me a day of "research", before landing here, with finally having it distilled from some complex real-life code (with type aliases everywhere, anon. unions with non-POD members & hand-orchestrated ctors/dtors etc., just for the vibe).

And... I still can't see what's going on! Can someone please give a gentle hint? (Hopefully just a blind spot. I no longer do C++ professionally.)

Note: clean compile* with both (latest) MSVC /W4 and GCC -Wall; same output in both (semantically).

* Even without the "confuse-the-reader" line. Which I think I'm gonna have nightmares from.


(Please bear with me for trying not to spoiler it by spelling everything out even more — after all, this truly is self-explanatory as-is, right? Except, the exact opposite for me...)

Happy answered 30/8, 2023 at 21:46 Comment(30)
Change it to vector<X>{"a", "b", "c"}; to understand the problem. Change it to vector<X>{X{"a"}, X{"b"}}; to fix the problem.Putman
@MooingDuck -- that was my original intention, indeed! :) But then I realized I still didn't understand it properly. :)Happy
coliru.stacked-crooked.com/a/830465cc50cf110a I'm confused. What problem? This appears to work exactly how I'd expect it to. It constructs a vector with two Xs, one from "a" and one from "b". I can't figure out what else you would be expecting to happen?Spectacles
@Eljay, I have a whole suite of variations I tried, including those you mentioned, ineed. But that blind spot I mentioned still seems to prevent me from seeing the solution.Happy
@MooingDuck the problem is, obviously then, my wrong expectation. I just can't see how those string literals make any sense, in that vector's init list, especially with X's string ctor removed. As I said: it must be a blind spot, and I'll probably facepalm myself afterwards...Happy
Xs string constructor isn't removed. Why would you think it's removed?Spectacles
@MooingDuck: "Even without the "confuse-the-reader" line. Which I think I'm gonna have nightmares from." Try it without the string ctor, for the extra fun, as I hinted!Happy
Debuggers are really cool. Here you could have stepped in and seen the program had landed in the wrong vector constructor.Globuliferous
coliru.stacked-crooked.com/a/6c5042098aaab902 aha, Fascinating. I was also totally wrong about what was going on.Spectacles
@user4581301, yes, I suspected that it's the wrong ctor, but CPPReference didn't help me out, and I still don't get why. (Besides, my VStudio install is broken, can't start the debugger at the moment.) And I've been in total disbelief all along, and thought I'd understand it in a minute anyway. :)Happy
You'll love this one...Globuliferous
@user4581301, ahh, a classic, indeed! Believe it or not, I did lose another day for that one too, earlier this year! :)Happy
@user4581301: Meh, that one's easy. Can't bind a non-const lvalue reference to a temporary.Ailey
@BenVoigt: works also with std::string by value Demo though ;-)Boob
Fascinating to see that there's already a Close vote on this...Happy
There's been many cornercases to think of when evolving the language when it comes to initialization / initalizer lists etc. I don't think the question deserved a downvote and am a bit surprised it only got one up.Flita
Whups. That reference wasn't supposed to be there. that was a typo on my part.Globuliferous
One of many trip-wires introduced by "uniform initialization" . IMHO it was a mistake to re-use curly braces for this, and only added to the problems ; especially when it comes to aggregate initialization which is semantically different to non-aggregate, but now cannot be distinguished just from the syntax.Paralogism
Similar (not duplicate) - #46666414Paralogism
@Paralogism (re braces etc.) Indeed. And this is a wonderful set of landmines even beyond that (the new range ctors masquerading as aggreg. initializers): the implicit conversions to horrible effect, the red herring from the string ctor, or the lack of even a single warning whatsoever, and the explicit keyword mumbling "not my job"...Happy
@Paralogism The stl tag was there for a reason actually, because this isn't just a simple vector issue: it comes from the interop. of iterators, ranges and containers in general. I just happened to use a vector. The extra tags are welcome, thx.Happy
Note that the stl tag refers to the Standard Template Library, a very influential library that formed the ideological basis (and no small amount of the initial implementations) of the C++ Standard Library back in 1998. Odds are that you are not using the STL today, so removing the tag is justified. std would be more appropriate, but probably still too broad since this is about std library containers. Not sure if there's a tag for that, though.Globuliferous
Note that vector<X> v = {"a", "b"}; works correctly. It only breaks without =.Scutcheon
@M.M: Even more similar (one of the examples has the same basic problem): Initializing vector<string> with double curly bracesAlexandriaalexandrian
@user3840170, is there a way to revert your changes (I can see no rollback for your edit)? Because adding std:: for such a short, simple example, where clarity and brevity is paramount, is mere OCD, and lowers the readability of the question for no benefits. For the title, I'm open to some changes (albeit I'm planning to add a summary answer, where Google will be able to easily find everything), but that must be a separate change.Happy
@Sz.: It's more readable when the line of code you care about tells you it's std::vector causing the issue. The changes improved the clarity. Previously there was "action at a distance" from using namespace std; (FWIW, removing the using namespace also made the code shorter)Ailey
@Globuliferous I did not realise that the STL was an independent library before being incorporated into the standard. Any chance you know of a resource to take a look at the last major revision before incorporation? The term being used interchangeably with the standard library collections has made it very hard to google.Anthropography
@Chuu: Try "SGI Template Library". This may be the latest revision: stlport.org/doc/sgi_stl.htmlAiley
@Scutcheon vector<X> v = {"a", "b"}; would still call the same constructor. Try vector<X> v = {"a", "b", "c"}; and it won't compileFlita
@Happy Here's a funny version: std::vector<X> v{"ab", "b"}; and compile with optimization. It still has undefined behavior, but will most likely create one X (using the int constructor) from 'a'. The reason is that the compiler most likely will just store "ab\0" in memory so the string literal "b" is just a pointer to "ab" + 1.Flita
F
30
std::vector<X>{"a", "b"};

This creates a vector from two iterators of type const char* using the constructor that takes two iterators:

template< class InputIt >
constexpr vector( InputIt first, InputIt last,
                  const Allocator& alloc = Allocator() );

Constructs the container with the contents of the range [first, last). This overload participates in overload resolution only if InputIt satisfies LegacyInputIterator, to avoid ambiguity with the overload (3). (below)

constexpr vector( size_type count,
                  const T& value,
                  const Allocator& alloc = Allocator() );

It's just bad luck that the decay of the two const char[]s becomes perfect iterators that fulfills the LegacyInputIterator requirement.

The iterators do not point to an array/contiguous area and the program therefore has undefined behavior.

What happens under the hood is most likely that it'll try to get from the first const char* to the second and run out of bounds as soon as its passing the null terminator after the 'a'.

A similar construction that would actually work:

const char* arr = "working";

struct X {
    explicit X(int i) {
        std::cout << static_cast<char>(i); 
    }
};

const char* first = arr;     // begin iterator
const char* last = arr + 7;  // end iterator

std::vector<X>{first, last}; // prints "working"
Flita answered 30/8, 2023 at 21:53 Comment(14)
But why? Why does it even go ahead and nonchalantly create an X vector from those (supposedly unrelated) const char* iterators? (With zero warnings, too.)Happy
@Sz.: would you expect warning from std::vector<X>{first, last};? which one?Boob
@Sz.: From outside std::vector::vector(begin, end) the compiler doesn't know the two pointers need to be related. Inside, the compiler doesn't know that they aren't related.Ailey
Under the wood, there are probably const data "a\0b" so "b" happens to be "a" + 2.Boob
@Happy I updated it a bit with an explanationFlita
@Boob there's nothing to prevent the compiler from putting b before a.Starlastarlene
Ted, thanks! Could you please add a note about the source of my main confusion: i.e. why the iterator types (which are pretty strictly matched everywhere else in the STL, traditionally) don't seem to help here at all, so I can accept your answer?Happy
@MarkRansom: links in comment got the surprising result of size of the resulting vector is 2, as OP expects (but not using std::string constructor). but we reason about UB anyway here ;-)Boob
@Happy if those strings are being interpreted as iterators, they're iterators to char which probably triggers your int constructor. They aren't pointers to char* at that point.Starlastarlene
@Happy It's just unfortunate that the decay of a const char[] becomes a perfect iterator that fulfills the LegacyInputIterator requirement.Flita
Upvoted for sure, but I'd really like that in the answer, to save some others from this pitfall, possibly, if you don't mind.Happy
@Happy Added that to the answer too.Flita
Another footnote: I think that the explicit for the int ctor in the original example is another nicely reinforcing red herring here: it may well be surprising that even that can't stop the conspiracy of those implicit conversions...Happy
@Happy Just add X(char) = delete; so no one falls into the same pitfall again.Cassady

© 2022 - 2024 — McMap. All rights reserved.