Why does const int main = 195 result in a working program but without the const it ends in a segmentation fault?

Consider the following C program:

const int main = 195;

I know that in the real world no programmer writes code like this, because it serves no useful purpose and doesn't make any sense. But when I remove the const keyword from the above program, it immediately results in a segmentation fault. Why? I am eager to know the reason behind this.

GCC 4.8.2 gives the following warning when compiling it:

warning: 'main' is usually a function [-Wmain]
const int main = 195;
          ^

Why does the presence or absence of the const keyword change the behavior of the program?

Unbidden answered 23/10, 2015 at 15:2 Comment(5)
According to the standard, this is simply undefined behavior. - Refund
@machine_1 195 is the encoding for the opcode ret (return from function) on 8086 and its successors. You can guess what happens when you put that in a variable and call that variable as a function. - Pileup
It is probably relevant to link to "How can a program with a global variable called main instead of a main function work?" - Nagano
Did you choose the value on purpose to coincide with the ret instruction? - Homeopathic
@Homeopathic If you do some searching you can find various versions of this in several places. On the Stack Exchange network this was one of the older references. In my answer to the link above we can find a 1984 IOCCC entry that does something similar but is much more sophisticated. - Nagano

Observe how the value 195 (0xC3 in hexadecimal) corresponds to the ret (return from function) instruction on 8086 compatibles. This definition of main thus behaves as if you had defined it as int main() {} when executed.
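
To make the correspondence concrete, here is a minimal sketch that checks a byte against 0xC3 and extracts the byte stored at the lowest address of an int. The helper names are hypothetical, and the claim that 195 is laid out with 0xC3 first assumes a little-endian machine such as x86.

```c
#include <string.h>

/* Hypothetical helper: nonzero if `byte` is 0xC3, the one-byte
   near-return (ret) opcode on x86. */
int is_x86_near_ret(unsigned char byte)
{
    return byte == 0xC3;
}

/* Hypothetical helper: returns the byte stored at the lowest address of
   an int. On a little-endian machine this is the least significant byte. */
unsigned char first_byte_of(int value)
{
    unsigned char bytes[sizeof value];

    memcpy(bytes, &value, sizeof value);
    return bytes[0];
}
```

On x86, is_x86_near_ret(first_byte_of(195)) is nonzero: a 4-byte int holding 195 is stored as C3 00 00 00, so a jump to its address executes ret first and the remaining bytes are never reached.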

On some platforms, const data is loaded into an executable but not writeable memory region whereas mutable data (i.e. data not qualified const) is loaded into a writeable but not executable memory region. For this reason, the program “works” when you declare main as const but not when you leave off the const qualifier.

Traditionally, binaries contained three segments, with a fourth, the stack, existing only at run time:

  • The text segment is (if supported by the architecture) write-protected and executable, and contains executable code, variables of static storage duration qualified const, and string literals.
  • The data segment is writeable and cannot be executed. It contains variables not qualified const with static storage duration and (at runtime) objects with allocated storage duration.
  • The bss segment is similar to the data segment but is initialized to all zeroes. It contains variables of static storage duration not qualified const that have been declared without an initializer.
  • The stack segment is not present in the binary and contains variables with automatic storage duration.
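
As a rough illustration of the list above, the following sketch declares one object of each kind. The comments give the traditional placement only; the identifiers are hypothetical and an actual toolchain may place them differently (for example in a separate rodata section).

```c
#include <stdlib.h>

const int read_only_value = 195; /* const, static storage: text (or rodata) */
int initialized_value = 195;     /* non-const, initialized: data */
int zeroed_value;                /* non-const, no initializer: bss */

/* Hypothetical helper that touches one object from each category and
   returns their sum, or -1 if the allocation fails. */
int sum_of_examples(void)
{
    int automatic_value = 1;                                /* automatic: stack */
    int *allocated_value = malloc(sizeof *allocated_value); /* allocated storage */
    int sum;

    if (allocated_value == NULL)
        return -1;
    *allocated_value = 2;

    sum = read_only_value + initialized_value + zeroed_value
        + automatic_value + *allocated_value;
    free(allocated_value);
    return sum;
}
```

Inspecting the compiled object with a symbol-listing tool for your platform is one way to see where each of the three file-scope variables actually ended up.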

Removing the const qualifier from the variable main causes it to be moved from the text to the data segment, which isn't executable, causing the segmentation violation you observe.

Modern platforms often have further segments (e.g. a rodata segment for data that is neither writeable nor executable) so please don't take this as an accurate description of your platform without consulting platform-specific documentation.

Please understand that defining main as anything other than a function is usually incorrect, although technically a platform could allow main to be declared as a variable; cf. ISO 9899:2011 §5.1.2.2.1 ¶1, in particular its final clause:

1 The function called at program startup is named main. The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters (...) or with two parameters (...) or equivalent; or in some other implementation-defined manner.
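
For comparison, the two forms the quoted paragraph spells out look like this. Because a program can define main only once, this sketch uses hypothetical stand-in names instead of main itself.

```c
/* Stand-in for the first form: int main(void) { ... } */
int example_main_no_params(void)
{
    return 0; /* explicitly sets the return value, unlike a bare ret */
}

/* Stand-in for the second form: int main(int argc, char *argv[]) { ... } */
int example_main_two_params(int argc, char *argv[])
{
    (void)argc; /* unused in this sketch */
    (void)argv; /* unused in this sketch */
    return 0;
}
```

Anything else, including a variable named main, falls under the "implementation-defined manner" clause at best.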

Pileup answered 23/10, 2015 at 15:7 Comment(5)
Some good points. zwol touched on some of these in the comments to my answer to a similar C++ version of this question. - Nagano
Please include your comment on the encoding for the opcode ret in your answer. This is key to understanding the behaviour described. - Gudrunguelderrose
@user19474 Better this way? - Pileup
@FUZxxl: Nice answer. But your answer doesn't explain why the program exits with a garbage value as its return status instead of 0. It would be better if you could explain the reason behind it. - Unbidden
@PravasiMeet The ret instruction exits the current function. It does not set a return value. Thus the program exits with whatever was in the eax register at the time main returned, i.e. a random garbage value. - Pileup

In C, main at global scope is almost always a function.

To use main as a variable at global scope makes the behaviour of the program undefined.

(It just might be the case that when you write const the compiler optimises out the variable to a constant and so your program behaviour is different. But the program behaviour is still undefined).

Barth answered 23/10, 2015 at 15:6 Comment(4)
Nope. A platform may allow main to be declared “in an implementation defined manner.” - Pileup
I'm too old to trawl through the standard but I imagine it must always be a function! - Barth
Cf. ISO 9899:2011 §5.1.2.2.1 “The function called at program startup is named main. The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters: (...) or with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared): (...) or equivalent; or in some other implementation-defined manner.” - Pileup
OK. Good standard reference! I've amended. - Barth