Does public and private have any influence on the memory layout of an object? [duplicate]

Asked 22/3, 2016 at 8:29 Answered 22/3, 2016 at 8:54

This is a followup to another question of mine: What is the optimal order of members in a class?

Does it change anything (except visibility) if I organize the members in such a way that public, protected and private take turns?

class Example
{
public:
  SomeClass m_sc; 
protected:
  char m_ac[32];      
  SomeClass * m_scp;
private:
  char * m_name;
public:
  int m_i1;
  int m_i2;
  bool m_b1;
  bool m_b2;
private:
  bool m_b3;
};

Is there a difference at runtime between this class and a class where I make all members public? I want to follow the rule of ordering the types from large to small (if readability is not seriously harmed).

I assume that it does not affect the compiled program at all, just like const is only checked during compilation. Is this correct?

Kain answered 22/3, 2016 at 8:29 Comment(2)

Avoid protected, avoid naked pointers. – Mooring 22/3, 2016 at 12:59

Practically a duplicate: https://mcmap.net/q/23193/-do-these-members-have-unspecified-ordering/560648 Please search before asking. – Reseau 22/3, 2016 at 15:6

The answer depends on the language version, because this has changed from C++03 to C++11.

In C++03, the rule was:

Members within the same access control block (that is, from one of public, protected, private keywords to the next one from that set) are to be allocated in order of declaration within class, not necessarily contiguously.

In C++11, the rule was changed to this:

Members with the same access control level (public, protected, private) are to be allocated in order of declaration within class, not necessarily contiguously.

So in C++03, you could guarantee this (I use @ to mean the offset of a member within the class):

@m_ac < @m_scp
@m_i1 < @m_i2 < @m_b1 < @m_b2

In C++11, you have a few more guarantees:

@m_ac < @m_scp
@m_sc < @m_i1 < @m_i2 < @m_b1 < @m_b2
@m_name < @m_b3

In both versions, the compiler can re-order the members in different chains as it sees fit, and it can even interleave the chains.
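As a rough sketch of how these chains could be checked on a given compiler, the class from the question can be given helper member functions that compare member offsets. The helpers `off`, `cxx03_chains_hold` and `cxx11_chains_hold` and the definition of `SomeClass` are assumptions made for this illustration; they are not part of the question's code, and being non-virtual member functions they should not change the layout.

```cpp
#include <cassert>
#include <cstddef>

struct SomeClass { int value; };  // hypothetical stand-in for the question's SomeClass

class Example
{
public:
  SomeClass m_sc;
protected:
  char m_ac[32];
  SomeClass * m_scp;
private:
  char * m_name;
public:
  int m_i1;
  int m_i2;
  bool m_b1;
  bool m_b2;
private:
  bool m_b3;

  // Byte offset of a member from the start of this object (hypothetical helper).
  std::ptrdiff_t off(const void * member) const
  {
    return static_cast<const char *>(member) - reinterpret_cast<const char *>(this);
  }

public:
  // Chains guaranteed by C++03: order within one access control block.
  bool cxx03_chains_hold() const
  {
    return off(&m_ac) < off(&m_scp)
        && off(&m_i1) < off(&m_i2)
        && off(&m_i2) < off(&m_b1)
        && off(&m_b1) < off(&m_b2);
  }

  // Additional chains guaranteed by C++11: order within one access control level.
  bool cxx11_chains_hold() const
  {
    return cxx03_chains_hold()
        && off(&m_sc) < off(&m_i1)
        && off(&m_name) < off(&m_b3);
  }
};
```

Calling `Example().cxx11_chains_hold()` on a C++11 (or later) compiler should return true. The check says nothing about how the three chains are placed relative to each other, which is exactly the part left to the compiler.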

Note that there is one more mechanism which can enter into the picture: standard-layout classes.

A class is standard-layout if it has no virtual functions or virtual base classes, if all its non-static data members have the same access control, if it has no non-standard-layout base classes and no non-static data members of non-standard-layout or reference type, and if at most one class in its inheritance chain has any non-static data members (i.e. it cannot both define its own non-static data members and inherit some from a base class).

If a class is standard-layout, there is an additional guarantee that the address of its first non-static data member is identical to that of the class object itself (which just means that padding cannot be present at the beginning of the class layout).

Note that the conditions on being standard-layout, along with practical compilers not making pessimising choices, effectively mean that in a standard-layout class, members will be arranged contiguously in order of declaration (with padding for alignment interspersed as necessary).
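The standard-layout conditions can be checked at compile time with `std::is_standard_layout` from `<type_traits>`. The following is a minimal sketch; the types `AllPublic`, `MixedAccess` and `WithVirtual` and the helper `first_member_at_object_address` are made up for this illustration.

```cpp
#include <cassert>
#include <type_traits>

// All members share one access level: standard-layout.
struct AllPublic
{
  int first;
  char second;
};

// Mixed access levels: not standard-layout.
class MixedAccess
{
public:
  int first;
  char get_second() const { return second; }
private:
  char second;
};

// A virtual function also rules out standard-layout.
struct WithVirtual
{
  virtual ~WithVirtual() {}
  int first;
};

static_assert(std::is_standard_layout<AllPublic>::value, "same access control throughout");
static_assert(!std::is_standard_layout<MixedAccess>::value, "mixed access control");
static_assert(!std::is_standard_layout<WithVirtual>::value, "has a virtual function");

// For a standard-layout object, the first non-static data member
// lives at the address of the object itself.
inline bool first_member_at_object_address(const AllPublic & obj)
{
  return static_cast<const void *>(&obj) == static_cast<const void *>(&obj.first);
}
```

For an `AllPublic` object the helper should always return true. For `MixedAccess` the standard gives no such promise, even though typical compilers happen to lay it out the same way.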

Artema answered 22/3, 2016 at 8:36 Comment(2)

Do you have any data on how compilers usually order the members in practice? – Radio 22/3, 2016 at 9:21

@JanDvorak No, sorry. I don't program systems where data member ordering would matter that much, so I never had a reason to look it up. I would say that platform ABI documents and compiler docs might be good starting points for finding out. – Artema 22/3, 2016 at 9:33

I would argue that this ordering rule is not always the best one. The only thing you gain by following it is avoiding "padding".
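To make the padding effect concrete, here is a small sketch comparing two hypothetical structs with the same members in different orders. The exact sizes depend on the platform's alignment rules, so the code only relies on the large-to-small version never being bigger.

```cpp
#include <cassert>
#include <cstddef>

// Small members separated by a large one: padding is typically inserted
// after c1 (to align d) and after c2 (to round up the struct size).
struct Interleaved
{
  char c1;
  double d;
  char c2;
};

// Same members, ordered from large to small.
struct LargeToSmall
{
  double d;
  char c1;
  char c2;
};

// Bytes of T that are not occupied by its three members, i.e. padding.
template <typename T>
constexpr std::size_t padding_bytes()
{
  return sizeof(T) - (sizeof(double) + 2 * sizeof(char));
}

static_assert(sizeof(LargeToSmall) <= sizeof(Interleaved),
              "large-to-small ordering never needs more padding here");
```

On a typical 64-bit platform one would expect `sizeof(Interleaved)` to be 24 and `sizeof(LargeToSmall)` to be 16, but those numbers are an assumption about the ABI rather than something the language guarantees.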

However, another "rule" to follow is placing your "hottest" members at the top so that they fit together in a single cache line, which is typically 64 bytes.

Imagine you have a loop that checks a flag of your class whose offset is 1, while another member is at offset 65 and a third one at offset 200. With 64-byte cache lines, these three members live in three different lines, so you can get several cache misses.

int count = 0;

// obj is a pointer to the object (class is a keyword and cannot be used as a name)
for (int i = 0; i < 10; i++)
{
     if (obj->flag /*offset:1*/ && obj->flag2 /*offset:65*/)
             count += obj->n; /*offset: 200*/
}

That loop can be a lot slower than a cache-friendly version like:

int count = 0;

for (int i = 0; i < 10; i++)
{
     if (obj->flag /*offset:1*/ && obj->flag2 /*offset:2*/)
             count += obj->n; /*offset: 3*/
}

The latter loop only needs to read a single cache line per iteration, provided the object does not straddle a cache line boundary. As far as memory access goes, it can't get much better than that.
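One way to keep such a layout from silently regressing is to assert at compile time that the hot members stay inside the first cache line. This is only a sketch: the type `Widget`, the constant `kCacheLine`, the helper `in_first_line` and the 64-byte line size are assumptions for illustration, and the real offsets will differ from the ones in the comments above.

```cpp
#include <cassert>
#include <cstddef>

// Assumed cache line size; 64 bytes is common but not universal.
constexpr std::size_t kCacheLine = 64;

// Hypothetical type: hot members first, rarely used data afterwards.
// alignas makes every Widget start on a cache line boundary.
struct alignas(64) Widget
{
  bool flag;
  bool flag2;
  int n;
  char cold_data[256];  // rarely touched
};

// True if a member occupying [offset, offset + size) lies entirely
// within the first cache line of the object.
constexpr bool in_first_line(std::size_t offset, std::size_t size)
{
  return offset + size <= kCacheLine;
}

static_assert(in_first_line(offsetof(Widget, flag), sizeof(bool)), "flag left the hot line");
static_assert(in_first_line(offsetof(Widget, flag2), sizeof(bool)), "flag2 left the hot line");
static_assert(in_first_line(offsetof(Widget, n), sizeof(int)), "n left the hot line");

// The loop from above, reading only members that share one cache line.
int count_hits(const Widget * obj, int iterations)
{
  int count = 0;
  for (int i = 0; i < iterations; i++)
  {
    if (obj->flag && obj->flag2)
      count += obj->n;
  }
  return count;
}
```

`offsetof` is used here because `Widget` is standard-layout. For a class with mixed access levels its use is only conditionally supported, so the check would have to be written differently.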

Tumer answered 22/3, 2016 at 8:54 Comment(2)

Having the "hot" members packed together is a micro-optimization that is just not worth it in general. In the case of concurrent data structures, it's also a pessimization (because of false sharing), which is an admittedly rare case but does reinforce the fact that it is not a universal rule. – Mooring 22/3, 2016 at 12:58

@MatthieuM. You are right about false sharing, that's why I said "rule" (tried to make it sound that it's not a real rule just like the one he was using) but failed I guess. I work in a place where delays are not allowed so It's an optimization I use, not everyone needs it tho, I agree. If you know multiple threads will write/read some shared data very concurrently, of course you gonna use a different design. – Tumer 22/3, 2016 at 18:29
