Why are the addresses of argc and argv 12 bytes apart?

Asked 8/2, 2020 at 15:34 Answered 8/2, 2020 at 16:8

I ran the following program on my computer (64-bit Intel running Linux).

#include <stdio.h>

void test(int argc, char **argv) {
    printf("[test] Argc Pointer: %p\n", &argc);
    printf("[test] Argv Pointer: %p\n", &argv);
}

int main(int argc, char **argv) {
    printf("Argc Pointer: %p\n", &argc);
    printf("Argv Pointer: %p\n", &argv);
    printf("Size of &argc: %lu\n", sizeof (&argc));
    printf("Size of &argv: %lu\n", sizeof (&argv));
    test(argc, argv);
    return 0;
}

The output of the program was

$ gcc size.c -o size
$ ./size
Argc Pointer: 0x7fffd7000e4c
Argv Pointer: 0x7fffd7000e40
Size of &argc: 8
Size of &argv: 8
[test] Argc Pointer: 0x7fffd7000e2c
[test] Argv Pointer: 0x7fffd7000e20

The size of the pointer &argv is 8 bytes. I expected the address of argc to be address of (argv) + sizeof (argv) = 0x7fffd7000e40 + 0x8 = 0x7fffd7000e48, but it is 0x7fffd7000e4c, so there is a 4-byte padding between them. Why is this the case?

My guess is that it could be due to memory alignment, but I am not sure.

I notice the same behaviour with the functions I call as well.

Edik answered 8/2, 2020 at 15:34 Comment(6)

Why not? They could be 174 bytes apart. An answer will depend on your operating system and/or a wrapper library that does setup for main. – Cession 8/2, 2020 at 15:38

@aschepler: It should not depend on any wrapper that does setup for main. In C, main can be called as a regular function, so it needs to receive arguments like a regular function and must obey the ABI. – Betoken 8/2, 2020 at 15:52

@aschepler: I notice the same behaviour for other functions as well. – Edik 8/2, 2020 at 15:55

It's an interesting 'thought experiment', but really, it should be nothing more than an 'I wonder why'. These addresses can change depending on the OS, compiler, compiler version, and processor architecture, and in no way should be depended upon in 'real life'. – Foetation 9/2, 2020 at 12:31

the result of sizeof must be printed using %zu – Pearcy 10/2, 2020 at 3:22

@phuclv: For small objects, one could also cast the result of sizeof to unsigned and then format with %u. – Filature 11/2, 2020 at 22:1

On your system, the first few integer or pointer arguments are passed in registers and have no addresses. When you take their addresses with &argc or &argv, the compiler has to fabricate addresses by writing the register contents to stack locations and giving you the addresses of those stack locations. In doing so, the compiler chooses, in a sense, whatever stack locations happen to be convenient for it.

Betoken answered 8/2, 2020 at 15:53 Comment(1)

Note that this could happen even if they are passed on the stack; the compiler has no obligation to use the incoming-value slot on the stack as the storage for the local objects the values go into. It might make sense to do this if the function is eventually going to tail-call and needs the current values of these objects to produce the outgoing arguments for the tail-call. – Apothecium 8/2, 2020 at 16:39

Why are the addresses of argc and argv 12 bytes apart?

From the perspective of the language standard, the answer is "no particular reason". C does not specify or imply any relationship between the addresses of function parameters. @EricPostpischil describes what is probably happening in your particular implementation, but those details would be different for an implementation in which all arguments are passed on the stack, and that is not the only alternative.

Moreover, I'm having trouble coming up with a way in which such information could be useful within a program. For example, even if you "know" that the address of argv is 12 bytes before the address of argc, there's still no defined way to compute one of those pointers from the other.

Laurentian answered 8/2, 2020 at 16:8 Comment(18)

Computing one from the other via conversion through uintptr_t is well-defined provided uintptr_t is defined. The upcoming "provenance" changes make a mess of this and make it difficult to do in a well-defined way, I think, but in all past versions of C it's been formally well-defined. – Apothecium 8/2, 2020 at 16:41

@R..GitHubSTOPHELPINGICE: Computing one from the other is partially defined, not well defined. The C standard is not strict on how the conversion to uintptr_t is performed, and it certainly does not define relationships between the addresses of parameters or where arguments are passed. – Betoken 8/2, 2020 at 17:4

@EricPostpischil: If you already computed that the difference between (uintptr_t)&a and (uintptr_t)&b is d, then (void*)((uintptr_t)&a+d)==&b. That's what I mean by well-definedness here. – Apothecium 8/2, 2020 at 23:12

@R..GitHubSTOPHELPINGICE: No such property is defined in the C standard. C 2018 7.20.1.4, which specifies the uintptr_t type, says that “any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer.” The passage on converting pointers to integer types generally, 6.3.2.3 6, says “Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined.”… – Betoken 9/2, 2020 at 0:14

… A non-normative footnote says “The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.” Nothing specifies requirements on what the results will be if you convert, do arithmetic, and convert back. In a flat address space with plain pointers, then conforming to the intent, as expressed in the footnote, would give that result. But as we see from the normative text, this is not fully specified by the standard. It is not well defined. – Betoken 9/2, 2020 at 0:15

@EricPostpischil: I don't follow what you're claiming. As you cited, 7.20.1.4 specifies that you can round-trip pointers through uintptr_t. As such, (void *)((uintptr_t)&a + ((uintptr_t)&b-(uintptr_t)&a)) == (void *)(uintptr_t)&b == (void *)&b. The first equality is the definedness of the cast to (void *) as a function of the operand value; the second is the round-trip property. – Apothecium 9/2, 2020 at 0:36

@R..GitHubSTOPHELPINGICE: The fact that you can round-trip means that g(f(x)) = x, where x is a pointer, f is convert-pointer-to-uintptr_t, and g is convert-uintptr_t-to-pointer. Mathematically and logically, it does not imply that g(f(x)+4) = x+4. For example, if f(x) were x² and g(y) were sqrt(y), then g(f(x)) = x (for real non-negative x), but g(f(x)+4) ≠ x+4, in general. In the case of pointers, the conversion to uintptr_t might give an address in the high 24 bits and some authentication bits in the low 8 bits. Then adding 4 just screws up the authentication; it does not update… – Betoken 9/2, 2020 at 0:43

… the address bits. Or the conversion to uintptr_t might give a base address in the high 16 bits and an offset in the low 16 bits, and adding 4 to the low bits might carry into the high bits, but the scaling is wrong (because the address represented is not base•65536+offset but rather is base•64+offset, as it was in some systems). Quite simply, the uintptr_t you get from a conversion is not necessarily a simple address. – Betoken 9/2, 2020 at 0:45

@R..GitHubSTOPHELPINGICE from my reading of the standard, there is only a weak guarantee that (void *)(uintptr_t)(void *)p will compare equal to (void *)p. And it is worthwhile to note that the committee has commented on nearly this exact issue, concluding that "implementations ... may also treat pointers based on different origins as distinct even though they are bitwise identical." – Physician 9/2, 2020 at 8:21

@R..GitHubSTOPHELPINGICE: Sorry, I missed that you were adding a value calculated as the difference of two uintptr_t conversions of addresses rather than a difference of pointers or a “known” distance in bytes. Sure, that is true, but how is it useful? It remains true that “there's still no defined way to compute one of those pointers from the other,” as the answer states: that calculation does not calculate b from a but rather calculates b from both a and b, since b must be used in the subtraction to calculate the amount to add. Computing one from the other is not defined. – Betoken 9/2, 2020 at 12:12

@RyanAvella: That's what I referred to as the "provenance mess". But it's really not about treating pointers that are bitwise identical differently. It's about treating integers which are equal as values (not to mention equal bitwise) differently, which has always been explicitly forbidden by the C language (e.g. C requires negative zero integer, if it exists, to behave identically to normal zero in all expressions, and requires padding bits not to affect resulting value of expressions). So provenance absolutely is a change to the language semantics. – Apothecium 9/2, 2020 at 16:30

@EricPostpischil: Note that there's no way to even talk about "12 bytes apart" in the abstract machine with pointers that aren't into the same array, except under a conversion to an integer type or other numeric representation (e.g. %p) under an implementation-defined definition that preserves byte distances within arrays and extends that difference operation to the whole domain of the integer type. – Apothecium 10/2, 2020 at 3:27

@RyanAvella: The authors of the Standard have never reached a consensus as to what kinds of non-portable constructs are within the Standard's jurisdiction. If the Standard would explicitly state that support for certain constructs that are useful but non-portable is a quality-of-implementation issue outside the Standard's jurisdiction, that would resolve a lot of issues. I doubt that that will ever happen, though, because it would paint clang and gcc in a rather bad light. – Filature 10/2, 2020 at 5:12

@RyanAvella: In clang, the act of comparing for equality two integer values that are derived from pointers can knock execution off the rails even if the integers are never converted to pointers. Why people view clang as a quality compiler when it makes fundamentally unsound optimizations is beyond me. – Filature 10/2, 2020 at 5:20

@RyanAvella: See godbolt.org/z/WSV8uH for an example of clang getting knocked off the rails by an integer comparison. Sure, clang is being "clever" by inferring that since (uintptr_t)(y+i) happens to be coincidentally equal to (uintptr_t)(x+5), it can't possibly equal y, but that's unsound. If x happens to have five elements, y happens to immediately follow it, and i happens to equal zero, then (x+5), (y+i), and y could all legitimately represent the same address, even though the former pointer couldn't access the same storage as the other two. – Filature 10/2, 2020 at 5:28

@Filature The implementation is allowed to embed metadata in the least significant bits of a pointer (or even add padding bits with this metadata, if it wants). Then even though the two pointers may point to the same byte of memory, they can still compare as unequal. Any time you rely on the underlying implementation of a pointer (e.g. assuming it is a flat memory model) you are basically asking for undefined behavior. – Physician 11/2, 2020 at 5:54

@RyanAvella: An integer comparison should always do one of two things: yield 0 with no side effects, or yield 1 with no side effects. In the function as written, if i is zero, having the comparison yield 0 should result in y[0] and the return value both being equal to 1; having the comparison yield 1 should result in y[0] and the return value both being equal to 2. In the code generated by clang, however, if the comparison happens to yield 1, then y[0] would equal 2, but the function would return 1. While there would be no requirement that the comparison ever yield 1... – Filature 11/2, 2020 at 15:43

...an implementation that is going to behave erroneously if the comparison yields 1 would be required to ensure that it always yields 0 in all UB-free scenarios. The fact that a comparison might yield 0 or 1, chosen in Unspecified fashion, would allow the compiler to either execute or skip the branch at its leisure, but each individual execution of the function must behave in a manner consistent with one or the other choice. – Filature 11/2, 2020 at 15:48
