Why is address of char data not displayed?

#include <iostream>
#include <string>
using namespace std;

class Address {
      int i ;
      char b;
      string c;
      public:
           void showMap ( void ) ;
};

void Address :: showMap ( void ) {
            cout << "address of int    :" << &i << endl ;
            cout << "address of char   :" << &b << endl ;
            cout << "address of string :" << &c << endl ;
}

The output is:

         address of int    :  something
         address of char   :     // nothing, blank area, that is nothing displayed
         address of string :  something 

Why?

Another interesting thing: if the int, char and string members are public, then the output is

  ... int    :  something 
  ... char   :   
  ... string :  something_2

something_2 - something is always equal to 8. Why? (not 9)

Tremblay answered 1/2, 2011 at 9:19 Comment(0)
F
104

When you are taking the address of b, you get char *. operator<< interprets that as a C string, and tries to print a character sequence instead of its address.

Try cout << "address of char :" << (void *) &b << endl; instead.

[EDIT] Like Tomek commented, a more proper cast to use in this case is static_cast, which is a safer alternative. Here is a version that uses it instead of the C-style cast:

cout << "address of char   :" << static_cast<void *>(&b) << endl;
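To make the difference between the two overloads concrete, here is a minimal sketch. The helper names as_address and as_c_string are made up for illustration; they only capture what operator<< writes for a const void* and for a const char*.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Formats a pointer the way operator<< does for const void*: as an address.
// The exact text of the address is implementation-defined.
std::string as_address(const void* p) {
    std::ostringstream out;
    out << p;
    return out.str();
}

// Formats a pointer the way operator<< does for const char*: as a C string.
// The pointed-to data must be NUL-terminated, otherwise this reads out of bounds.
std::string as_c_string(const char* p) {
    assert(p != nullptr);
    std::ostringstream out;
    out << p;
    return out.str();
}
```

With char b = '\0', as_c_string(&b) is an empty string, which matches the blank output in the question, while as_address(&b) yields the address text.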
Fond answered 1/2, 2011 at 9:21 Comment(4)
Can you illustrate why static_cast is a safer alternative here maybe by giving examples? I don't understand what's the harm in using void * here.Oversold
@VishalSharma There isn't really one here if you know what b is. But a C++ cast gives you additional safety when b isn't what you think it is; a C cast will just blindly do what you tell it without really caring, which is ungood.Tamtama
Is the objective here just to instill good practice because I'm still not getting how cout<<(void *)&b is not good even when I don't know what b is? In any case it should just print the address, shouldn't it?Oversold
@VishalSharma Yes, avoiding C-style casts is a good practice. The behaviour in this particular case will be the same. In other cases, & could have been overloaded (so you don't get "the address") or you could be doing something where const/volatile-correctness matters, or, or, orTamtama
D
37

There are 2 questions:

  • Why it does not print the address for the char:

Printing pointers prints the address for the int* and the string*, but for a char* it prints the pointed-to contents rather than the address, because operator<< has a special overload for char*. If you want the address, use: static_cast<const void *>(&b);

  • Why the address difference between the int and the string is 8

On your platform sizeof(int) is 4 and sizeof(char) is 1, so the real question is why the difference is 8 rather than 5. The reason is that the string is aligned on (at least) a 4-byte boundary. Machines work with words rather than bytes, and they work faster when a word is not "split" across boundaries, a few bytes here and a few bytes there. This is called alignment.

Your system probably aligns to 4-byte boundaries. If you had a 64-bit system with 64-bit integers the difference would be 16.

(Note: 64-bit system generally refers to the size of a pointer, not an int. So a 64-bit system with a 4-byte int would still have a difference of 8 as 4+1 = 5 but rounds up to 8. If sizeof(int) is 8 then 8+1 = 9 but this rounds up to 16)
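The rounding described above can be sketched as follows. This is only an illustration: Sample is a hypothetical stand-in for the class in the question, with a pointer-sized member in place of std::string so that offsetof is well defined, and it assumes the compiler inserts no more padding than alignment requires (which is what mainstream compilers do).

```cpp
#include <cassert>
#include <cstddef>

// Rounds offset up to the next multiple of align.
std::size_t round_up(std::size_t offset, std::size_t align) {
    assert(align != 0);
    return (offset + align - 1) / align * align;
}

// Hypothetical stand-in for the class in the question.
struct Sample {
    int   i;
    char  b;
    void* c;  // replaces std::string; needs pointer alignment
};

// Offset at which c is expected: just past b, rounded up to c's alignment.
std::size_t expected_offset_of_c() {
    return round_up(offsetof(Sample, b) + sizeof(char), alignof(void*));
}
```

For example, round_up(5, 4) is 8 and round_up(9, 8) is 16, which are the two cases mentioned in the note.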

Daugava answered 7/2, 2011 at 17:4 Comment(0)
E
14

When you stream the address of a char to an ostream, it interprets that as being the address of the first character of an ASCIIZ "C-style" string, and tries to print the presumed string. You don't have a NUL terminator, so the output will keep trying to read from memory until it happens to find one or the OS shuts it down for trying to read from an invalid address. All the garbage it scans over will be sent to your output.

You can probably get it to display the address you want by casting it, as in (void*)&b.

Re the offsets into the structure: you observed the string is placed at offset 8. This is probably because you have 32-bit ints, then an 8-bit char, then the compiler chooses to insert 3 more 8-bit chars so that the string object will be aligned at a 32-bit word boundary. Many CPUs/memory-architectures need pointers, ints etc. to be on word-size boundaries to perform efficient operations on them, and would otherwise have to do many more operations to read and combine multiple values from memory before being able to use the values in an operation. Depending on your system, it may be that every class object needs to start on a word boundary, or it may be that std::string in particular starts with a size_t, pointer or other type that requires such alignment.

Ertha answered 1/2, 2011 at 9:22 Comment(0)
C
11

Because when you pass a char* to std::ostream it will print the C-style string (i.e. char array, char*) it points to.

Remember that "hello" is a char*.

Corpulent answered 1/2, 2011 at 9:21 Comment(4)
"hello" is a const char[6].Gunthar
@MSalters: no. It's char[6] and decays in char* when used.Corpulent
It is char[6] only in C, but in C++ it is const char[6]. Interestingly enough it can still decay to char * though (backwards compatibility with C).Fond
@hrnt: That was deprecated in C++03 and removed altogether in C++11.Tamtama
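The type of the literal can be checked at compile time. A small sketch, assuming a C++11 or later compiler; literal_length is a made-up helper, not a standard function.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// In C++ the literal "hello" is an lvalue of type const char[6] (5 letters + NUL).
static_assert(std::is_same<decltype("hello"), const char (&)[6]>::value,
              "string literal is an array of 6 const char");

// When used as a value, the array decays to const char*, not char*.
static_assert(std::is_same<std::decay<decltype("hello")>::type, const char*>::value,
              "string literal decays to const char*");

// Only a 6-element const char array binds here; returns its length without the NUL.
std::size_t literal_length(const char (&text)[6]) {
    assert(text[5] == '\0');  // the terminator is part of the array
    return sizeof(text) - 1;
}
```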
C
4

The address of the char is treated as a pointer to a NUL-terminated string, so the stream displays the contents at that address, which is probably undefined behaviour but in this case happens to be an empty string. If you cast the pointers to void *, you will get the results you desire.

The difference of 8 between something_2 and something is due to alignment: the compiler is free to insert padding between the members so that each one starts on a suitable boundary.

Commute answered 1/2, 2011 at 9:23 Comment(3)
Since there is no Constructor, isn't a default constructor automatically created, which will set b = 0 therefore automatic null termination ? Also +1Ezzell
@Muggen: The code isn't complete above, so who knows what constructor is provided.Commute
@Muggen: No, the generated default ctor will not zero-initialize b. You must explicitly do that; e.g. Address() (as a temporary), new Address() (compare to new Address), Address var = Address(), (in 0x) Address var {}; (I believe, need to double check), or an Address object with static storage duration (function/namespace/global static).Semiconductor
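A short sketch of the value-initialization forms mentioned in the last comment, assuming C++11. Plain is a hypothetical cut-down version of the class with no user-provided constructor.

```cpp
#include <cassert>

// Hypothetical cut-down version of the class: no user-provided constructor.
struct Plain {
    int  i;
    char b;
};

// Plain p{} value-initializes a local object: both members are zeroed.
char value_initialized_b() {
    Plain p{};
    assert(p.i == 0);
    return p.b;
}

// new Plain() zeroes the members too; new Plain (no parentheses)
// would leave b with an indeterminate value.
char heap_value_initialized_b() {
    Plain* p = new Plain();
    char b = p->b;
    delete p;
    return b;
}
```

By contrast, a plain local declaration such as Plain p; leaves b uninitialized, so streaming &p.b as a char* could print anything.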
W
2

For the second issue - the compiler by default will pad structure members. The default padding is to sizeof(int), 4 bytes (on most architectures). This is why an int followed by a char will take 8 bytes in the structure, so the string member is at offset 8.

To disable or change padding, use the compiler-specific #pragma pack(x), where x is the maximum member alignment in bytes.
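A sketch of the effect of packing, assuming a compiler that supports the #pragma pack(push, n) / #pragma pack(pop) extension (GCC, Clang and MSVC do). Padded and Packed are hypothetical stand-ins in which an int replaces the string member so that offsetof is well defined.

```cpp
#include <cassert>
#include <cstddef>

struct Padded {
    int  i;
    char b;
    int  c;  // typically preceded by 3 bytes of padding
};

#pragma pack(push, 1)  // compiler extension: pack members on 1-byte boundaries
struct Packed {
    int  i;
    char b;
    int  c;  // follows b directly, so it is misaligned
};
#pragma pack(pop)      // restore the previous packing

std::size_t padded_offset_of_c() { return offsetof(Padded, c); }
std::size_t packed_offset_of_c() { return offsetof(Packed, c); }

// Checks that packing removed the gap between b and c.
void check_layout() {
    assert(packed_offset_of_c() == sizeof(int) + sizeof(char));
    assert(padded_offset_of_c() >= packed_offset_of_c());
}
```

Packing trades speed (and on some CPUs, safety) for size, since access to a misaligned member can be slower or can fault, so it is usually reserved for matching an external binary layout.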

Whiskey answered 1/2, 2011 at 9:25 Comment(12)
I doubt that packing would put the address of the string at a five byte offset (on many compilers), due to alignment requirements.Absolutely
Isn't data alignment platform-specific ? Also, AFAIK it is not in standard for int to be 4 bytes.Ezzell
@Muggen - Data alignment is indeed platform-specific, but most often it is to sizeof(int) - the native CPU size. On a 32 bit CPU this is 4 bytes.Whiskey
@Christopher - The offset is not 5 bytes, but 3. The int is from address 0 to 3. The char should be from 4 to 5, but instead is from 4 to 7. Finally the string starts from 8.Whiskey
@Eli: The char is at byte 4. Bytes 5 through 7 are padding, not part of the char, which by definition has sizeof(char)==1. I was referring to an offset of 5 relative to the beginning of the enclosing object.Absolutely
@KitsuneYMG: That is actually a decision of the compiler writer and the runtime environment author, not of the CPU. (Case in point: sizeof(long) on the same hardware is different between 64 bit linux and 64 bit windows.)Absolutely
@Kitsune - I've checked this, and you appear to be right. I haven't had much experience working with 64 bit architectures... Thanks for the heads-upWhiskey
@Christopher - of course, this is exactly what I've meant - the char is padded to 4 bytes.Whiskey
@Eli: I'm pretty sure we both mean the same thing, but I'm a bit in nitpicking mode right now: The char is not padded. The enclosing object is padded and the padding (of three bytes) is inserted after the char.Absolutely
@Christopher - I see what you mean and I agree.Whiskey
I believe the definitions go (in bytes): [unsigned|signed]char >=1, short>char, long>short, short <=int<=long. This is why I almost always use the exact-size types from cstdint if size matters.Sleepwalk
@KistuneYMG: C99 dictates in 6.5.3.4.3 that sizeof(unsigned char) == sizeof(signed char) == sizeof(char) == 1.Absolutely
K
2

Your syntax should be

cout << (void*) &b
Kneehigh answered 1/2, 2011 at 10:24 Comment(0)
R
0

hrnt is right about the reason for the blank: &b has type char*, and so gets printed as a string until the first zero byte. Presumably b is 0. If you set b to, say, 'A', then you should expect the printout to be a string starting with 'A' and continuing with garbage until the next zero byte. Use static_cast<void*>(&b) to print it as an address.

For your second question, the addresses of c and i differ by 8 because the size of an int is 4, the char is 1, and the string starts at the next 8-byte boundary (you are probably on a 64-bit system). Each type has a particular alignment, and C++ aligns the fields in the struct according to it, adding padding appropriately. (The rule of thumb is that a primitive field of size N is aligned to a multiple of N.) In particular you can add 3 more char fields after b without affecting the address &c.

Rachellerachis answered 13/2, 2011 at 20:31 Comment(0)
