Is layout-compatibility in the c++11 (working draft) standard too weak?
Of course, the answer is "no", because the people who wrote it thought really hard about it. However, I want to know why.

Considering that (template-less) classes are often defined in header files, which are then included in several separately compiled files, consider these two files:

file1.cc

#include <cstddef>

struct Foo {
public:
   int pub;
private:
   int priv;
};

std::size_t getsize1(Foo const &foo) {
  return sizeof(foo);
}

file2.cc

#include <cstddef>

struct Foo {
public:
   int pub;
private:
   int priv;
};

std::size_t getsize2(Foo const &foo) {
  return sizeof(foo);
}

In general Foo will be defined in a header file and included in both, but the effect is as shown above. (That is, including a header is not magic; it just puts the header's content on that line.) We can compile both and link them with the following:

main.cc

#include <cstddef>
#include <iostream>

struct Foo {
public:
   int pub;
private:
   int priv;
};

// Defined in file1.cc and file2.cc respectively.
std::size_t getsize1(Foo const &);
std::size_t getsize2(Foo const &);

int main() {
    Foo foo;
    std::cout << getsize1(foo) << ", " << getsize2(foo) << ", " << sizeof(foo) << '\n';
}

One way to do so is with g++:

g++ -std=c++11 -c -Wall file1.cc 
g++ -std=c++11 -c -Wall file2.cc 
g++ -std=c++11 -c -Wall main.cc 
g++ -std=c++11 -Wall *.o -o main

And (on my architecture and environment) this prints 8, 8, 8. The sizeof results are the same for each compilation of file1.cc, file2.cc and main.cc.

But does the C++11 standard guarantee this? Is it really OK to expect layout compatibility among all three Foos? Foo contains both private and public fields, hence it is not a standard-layout struct as defined in Clause 9, par. 7 of the C++11 standard (working draft):

A standard-layout class is a class that:

  • has no non-static data members of type non-standard-layout class (or array of such types) or reference,
  • has no virtual functions (10.3) and no virtual base classes (10.1),
  • has the same access control (Clause 11) for all non-static data members,
  • has no non-standard-layout base classes,
  • either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
  • has no base classes of the same type as the first non-static data member.

Since we are using structs, and to be thorough, the next par says:

A standard-layout struct is a standard-layout class defined with the class-key struct or the class-key class. A standard-layout union is a standard-layout class defined with the class-key union.
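
The access-control bullet can be checked directly with the C++11 trait std::is_standard_layout from <type_traits>. The following is only a minimal sketch: MixedAccess and UniformAccess are hypothetical names, with MixedAccess mirroring the Foo above.

```cpp
#include <type_traits>

// Same shape as the Foo in the question: one public and one private member.
struct MixedAccess {
public:
   int pub;
private:
   int priv;
};

// Hypothetical counterpart whose members all have the same access control.
struct UniformAccess {
   int a;
   int b;
};

// Mixed access control violates the third bullet of the definition.
static_assert(!std::is_standard_layout<MixedAccess>::value,
              "mixed access control: not standard-layout");
static_assert(std::is_standard_layout<UniformAccess>::value,
              "uniform access control: standard-layout");
```

Both assertions are evaluated at compile time, so the snippet produces no output; it simply fails to compile if either expectation is wrong.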

To the best of my knowledge, the standard only defines layout compatibility between standard-layout structs (Clause 9.2, par. 18):

Two standard-layout struct (Clause 9) types are layout-compatible if they have the same number of non-static data members and corresponding non-static data members (in declaration order) have layout compatible types (3.9).

So is it guaranteed that all three Foos are layout-compatible, and more importantly, why?

Why would a (non-deterministic) compiler, which creates different layouts for Foo during compilation, not be a c++11 compiler?

Manor answered 23/10, 2014 at 13:58 Comment(14)
I can't give an authoritative or definite answer, but I would say that yes it would be expected to have the same layout of the three different (but equal) classes, otherwise having them in header file would not work (as you notice).Septilateral
Common sense would suggest that, but I was looking for something in the C++11 standard that enforces it. It is similar to how it is nice for different compiler versions to end up with the same ABI, even though technically that guarantee does not exist, which formally breaks the assumption that all kinds of binary package managers rely on. However, in that case the compiler developer has the responsibility; the case described in the OP should, IMHO, be handled by the language.Manor
This has nothing at all to do with the notion of "layout-compatible", and everything to do with the One Definition Rule. All three definitions of Foo refer to the same entity, which must have the same set of properties in all translation units. To that end, all the definitions must be token-for-token identical, and the tokens must have the same meaning; otherwise, the program is ill-formed, no diagnostic required. See 3.2/5 for details.Beatnik
Another argument of common sense (though not backed by the standard AFAIK): Given the same input and options, the compiler must produce the same layout, otherwise the binary layout of any program/shared object risks changing with every compilation. That would completely break any hope of dynamic linking.Canula
That seems very reasonable, the ODR text in the standard is too long to read well in a few minutes; hence I will postpone this for a bit. However, if I understand correctly, if both classes had different names, the compiler is allowed to compile them differently?Manor
@Canula the standard does not promise any dynamic linking.Zeke
@Cameron: I am aware of the problems that exist with such a non-deterministic compiler, I was just wondering how the standard described this deterministic behavior.Manor
"if both classes had different names, the compiler is allowed to compile them differently?" Yes (unless they are standard-layout types).Zeke
"Of course, the answer is 'no', because the people who wrote it though really hard about it" I find this argument vacuous. Were it true then no prior standard wording would ever be changed.Cantina
@LightnessRacesinOrbit: True. However, I wrote that to denote that I was not doubting the C++11 standard implementers, just my own ability to understand it.Manor
@n.m. Translation units are not 'aware' of each other, so in order to have different compilation outputs for two classes which are token-wise the same except for the name, the compiler would need to use the class's name as a hint for a different class layout and implementation, which is ludicrous, but allowed :) It can't use randomness, because of ODR.Manor
"ludicrous, but allowed" Why not? Hash the name, seed a PRNG with the result, compute a random permutation of the members, arrange members in that order. That'll show them. Don't rely on stuff that the standard doesn't promise!Zeke
@Herbert: Speaking cynically, I find that a little healthy scepticism levelled at the ISO working group is not always a bad thing ;)Cantina
@n.m. Can't do random permutation of the members, only of the groups of members separated by access control keywords. There's a guarantee that members within each section are laid out at increasing addresses. Sections could be rearranged though, and random padding could be inserted between members (as long as alignment requirements are satisfied).Beatnik
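
The ordering guarantee mentioned in the last comment can be observed with a small check. This is a sketch only: Sections, public_members_ascend and check_member_order are hypothetical names, and std::less is used because it is guaranteed to give a total order on pointers.

```cpp
#include <cassert>
#include <functional>

// Hypothetical struct: a and b share one access section, c sits in another.
struct Sections {
public:
   int a;
   int b;   // declared after a, with the same access control
private:
   int c;   // where c lands relative to a and b is not specified
};

// True when b is placed at a higher address than a.
inline bool public_members_ascend(const Sections& s) {
   std::less<const int*> before;
   return before(&s.a, &s.b);
}

// Runtime self-check; nothing is asserted about c on purpose.
inline void check_member_order() {
   Sections s = Sections();   // value-initialized, so all members are zero
   assert(public_members_ascend(s));
}
```

Only the relative order of a and b is tested, since that is the part the comment describes as guaranteed.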

The three Foos are layout-compatible because they are the same type, struct ::Foo.

[basic.types]

11 - If two types T1 and T2 are the same type, then T1 and T2 are layout-compatible types.

The classes are the same type because they have the same (fully-qualified) name and have external linkage:

[basic]

9 - A name used in more than one translation unit can potentially refer to the same entity in these translation units depending on the linkage (3.5) of the name specified in each translation unit.

Class names declared at namespace scope that are not declared (recursively) within an unnamed namespace have external linkage:

[basic.link]

2 - A name is said to have linkage when it might denote the same [...] type [...] as a name introduced by a declaration in another scope:
— When a name has external linkage , the entity it denotes can be referred to by names from scopes of other translation units or from other scopes of the same translation unit. [...]
4 - An unnamed namespace or a namespace declared directly or indirectly within an unnamed namespace has internal linkage. All other namespaces have external linkage. A name having namespace scope that has not been given internal linkage above has the same linkage as the enclosing namespace if it is the name of [...]
— a named class (Clause 9), or an unnamed class defined in a typedef declaration in which the class has the typedef name for linkage purposes (7.1.3) [...]

Note that it is allowed to have multiple definitions of a class type appearing in different translation units, as long as the definitions consist of the same token sequence:

[basic.def.odr]

6 - There can be more than one definition of a class type (Clause 9) [...] in a program provided that each definition appears in a different translation unit, and provided [...] each definition [...] shall consist of the same sequence of tokens [...]

So if the Foos had different names, they would not be the same type; if they appeared within an anonymous namespace or within a function definition (except an inline function; see [dcl.fct.spec]/4) they would not have external linkage and so would not be the same type. In either case they would be layout-compatible only if they were standard-layout.


Some examples:

// tu1.cpp
struct Foo { private: int i; public: int j; };

// tu2.cpp
struct Foo { private: int i; public: int j; };

The two Foos are the same type.

// tu1.cpp
struct Foo { private: int i; public: int j; };

// tu2.cpp
struct Foo { private: int i; public: int k; };

ODR violation; undefined behavior.

// tu1.cpp
struct Foo { private: int i; public: int j; };

// tu2.cpp
struct Bar { private: int i; public: int j; };

Different names, so different types. Not layout-compatible.
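
Within a single translation unit, the "different types" part can be confirmed with std::is_same. A minimal sketch, assuming both definitions are placed in one file; the namespace other is hypothetical and only illustrates that the fully-qualified name is what counts.

```cpp
#include <type_traits>

struct Foo { private: int i; public: int j; };
struct Bar { private: int i; public: int j; };

// Hypothetical namespace: same unqualified name, different qualified name.
namespace other { struct Foo { private: int i; public: int j; }; }

static_assert(!std::is_same<Foo, Bar>::value,
              "different names, so different types");
static_assert(!std::is_same<Foo, other::Foo>::value,
              "::Foo and other::Foo are distinct types");
// Neither type is standard-layout, so no layout-compatibility rule applies.
static_assert(!std::is_standard_layout<Foo>::value &&
              !std::is_standard_layout<Bar>::value,
              "mixed access control: not standard-layout");
```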

// tu1.cpp
struct Foo { int i; int j; };

// tu2.cpp
struct Bar { int i; int j; };

Different names, different types, but layout-compatible (since standard-layout).
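
One place where this layout compatibility becomes usable is the common-initial-sequence rule for standard-layout unions in [class.mem]. The sketch below assumes both structs are visible in the same file; FooOrBar, read_i_through_bar and check_common_initial_sequence are hypothetical names.

```cpp
#include <cassert>

struct Foo { int i; int j; };
struct Bar { int i; int j; };

// Standard-layout union holding two layout-compatible standard-layout structs.
union FooOrBar {
   Foo foo;
   Bar bar;
};

// Stores a Foo, then inspects the shared leading member through bar.
inline int read_i_through_bar(int value) {
   FooOrBar u = {{value, 0}};   // initializes the first member, foo
   return u.bar.i;              // allowed: i is in the common initial sequence
}

// Runtime self-check of the round trip.
inline void check_common_initial_sequence() {
   assert(read_i_through_bar(42) == 42);
}
```

The same read would not be covered by this rule if either struct mixed access control, because the structs would then no longer be standard-layout.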

// tu1.cpp
namespace { struct Foo { private: int i; public: int j; }; }

// tu2.cpp
namespace { struct Foo { private: int i; public: int j; }; }

Internal linkage; different types.

// tu1.cpp
static void f() { struct Foo { private: int i; public: int j; }; }

// tu2.cpp
static void f() { struct Foo { private: int i; public: int j; }; }

No linkage; different types.

// tu1.cpp
inline void f() { struct Foo { private: int i; public: int j; }; }

// tu2.cpp
inline void f() { struct Foo { private: int i; public: int j; }; }

Same type by [dcl.fct.spec]/4.

Eliaeliades answered 23/10, 2014 at 14:28 Comment(3)
Could you add that it is crucial that both classes have the same name for them to be of the same type? (see n.m.'s comment)Manor
So, wrap Foo in an anonymous namespace, and all bets are off?Conway
@Yakk absolutely, unless the Foos are standard-layout.Eliaeliades
