Using C-style arrays in std::vector - MSVC bug?
Asked Answered
S

2

16

I'm currently trying to get a legacy codebase to build under C++20, and I encountered something like this:

size_t someCount; // value comes from somewhere else
…
std::vector<const char *[2]> keyValues(someCount);

I can't trivially change this to something like std::vector<std:array<const char *, 2>> because this is later passed to some API outside of my control. The above abomination compiles fine with Clang and GCC, and even MSVC as long as I don't enable C++20, but it breaks in MSVC in C++20, as you can see here at Godbolt.

I assume this is related to the DefaultInsertable requirement on T if the above constructor is used (which is indeed the only requirement mandated by the standard). According to cppreference (see previous link), STL implementations up to C++17 used placement new to default-construct the elements, and starting in C++20, std::construct_at is being used for DefaultInsertable types. This may trigger the regression from C++17 to C++20 for MSVC.

The standard says that a type is DefaultInsertable if this expression is well-formed:

allocator_traits<A>::construct(m, p)

So in my case, that would be:

const char * dummy[2];
using Allocator = std::allocator<const char *[2]>;
Allocator a;
// This must be well-formed:
std::allocator_traits<Allocator>::construct(a, dummy);

This compiles fine and without warnings in GCC, Clang and MSVC, so I'll go ahead and assume that const char *[2] is a DefaultInsertable type. But that means that the constructor call in my first example should compile.

Is this an MSVC bug?

The compiler error is:

C:/data/msvc/14.34.31931-Pre/include\xutility(218): error C2440: 'return': cannot convert from 'const char **' to '_Ty (*)'
        with
        [
            _Ty=const char *[2]
        ]

C:/data/msvc/14.34.31931-Pre/include\xutility(218): note: Types pointed to are unrelated; conversion requires reinterpret_cast, C-style cast or parenthesized function-style cast
C:/data/msvc/14.34.31931-Pre/include\xmemory(673): note: see reference to function template instantiation '_Ty (*std::construct_at<_Objty,,0x0>(_Ty (*const )) noexcept)' being compiled
        with
        [
            _Ty=const char *[2],
            _Objty=const char *[2]
        ]
Stlouis answered 24/2, 2023 at 15:34 Comment(12)
What's the story with the "API outside of my control"? Is it something that's explicitly demanding to get a std::vector<const char *[2]>, or no soup for you?Specialism
@SamVarshavchik more or less. It's CORBA's ORB_init, which is declared roughly like this: ORB_init(…, const char* options[][2]=0). The std::vector is used to dynamically build the argument. We could of course also manually allocate that with new, but I'd rather not make it worse…Stlouis
All you need to do is allocate a single array full of const char *[2]s. Given the situation, I would not object, too much, to using std::array, and then using reinterpret_cast to make the whole problem go away.Specialism
I managed to extract the exact compiler error from the link, and added it to the questionCervine
@SamVarshavchik You mean having a std::vector<std::array<const char *, 2>> and then reinterpret_casting its data() to const char *[2]? That has to be UB, right? This feels like a very clear violation of the strict aliasing rule. Also, we have no guarantees about the memory layout of std::array, right?Stlouis
Actually, the layout of std::array is explicitly spelled out. There are some pretty good guarantees about it. This most likely violates strict aliasing, but I see it as the lesser of all evils, and this is what unit tests are for.Specialism
@LukasBarth: "That has to be UB, right?" ... so what?Claudieclaudina
@SamVarshavchik: "Actually, the layout of std::array is explicitly spelled out." Not well enough to make it guarantee-ably well-defined. array could have padding at the end. An array<char, 2> may be 4-bytes in size. You could static_assert to ensure that a particular array instantiation doesn't have padding though.Claudieclaudina
This question is tagged as language-lawyer, so 'This is UB, so what?' is not what I'm going for. I try to write well-formed C++ programs. Everything else is completely unmaintainable in my experience. YMMV.Stlouis
libc++ also rejects this. demoLorrinelorry
godbolt.org/z/Me81ar1GT 'initializing': cannot convert from 'const char **' to 'T (*)'. Apparently new T() returns a const char** when T is a const char*[2], which makes little sense to me, but is the crux of the issueCervine
Possible duplicate: #67735565Dogcatcher
H
14

Before C++20, using array types in std::vector with the default allocator std::allocator was not supported, because std::allocator::destroy would try a simple (pseudo)-destructor call that is ill-formed for anything but class and scalar types. Thereby array types did not satisfy the Erasable requirement with the default allocator, which std::vector requires to be satisfied.

Since C++20 std::allocator is defaulted through std::allocator_traits to use std::destroy_at which in turn has a special case for array types that calls the destructor individually on each element. So Erasable is now satisfied with the default allocator.

However, C++20 also defaulted std::allocator's construct through std::allocator_traits to use std::construct_at. In contrast to the old std::allocator::construct, std::construct_at returns the result of the placement-new expression and has its return-type declared as T* instead of void. T* is also the type of the first argument, i.e. T is the element type.

The issue here is that the placement-new expression inside std::construct_at will not return a pointer-to-array for array types, but instead return a pointer to the first element of the array. If T is an array type U[N], then the return type of the new expression is therefore U*, not U(*)[N] as construct_at is declared to return. Therefore the construct_at instantiation is ill-formed and array types do not satisfy the DefaultInsertable requirement with the default allocator.

It is probably unintended that std::construct_at doesn't work with bounded array types. There is an open LWG issue 3436 for that.

Hyetal answered 24/2, 2023 at 20:48 Comment(1)
Thanks for the answer. Probably you should duplicate this answer here as well: #4612773, because it is asserted there that You cannot store arrays in a vectorDogcatcher
C
4

https://en.cppreference.com/w/cpp/memory/construct_at says that all it requires is ::new(std::declval<void*>()) T(std::declval<Args>()...), which does compile in MSVC v19, so it appears that std::construct_at indeed has some bug where it accidentally coerces the T*[] into a T** and then can't convert it back.

Cervine answered 24/2, 2023 at 16:41 Comment(1)
Yes, that seems like the most likely variant. The standard does not formulate the requirement in terms of std::construct_at (but via the allocator as in my question), but I think that std::construct_at should have more or less the same behavior as allocator_traits<A>::construct(…).Stlouis

© 2022 - 2024 — McMap. All rights reserved.