Does the '#' character have to be at the start of a line in the C preprocessor? [duplicate]
I have programmed in C for quite a while now. During this time I have learned that it is a common convention to put the # character that introduces a preprocessor directive in column one.

Example:

 #include <stdio.h>

 int main(void) {
 #ifdef MACRO1
 #ifdef MACRO2
      puts("defined(MACRO1) && defined(MACRO2)");
 #else
      puts("defined(MACRO1)");
 #endif
 #else
      puts("!defined(MACRO1)");
 #endif
      return 0;
 }

When people indent their preprocessor directives they usually do it like this:

 #include <stdio.h>

 int main(void) {
 #ifdef MACRO1
 # ifdef MACRO2
     puts("defined(MACRO1) && defined(MACRO2)");
 # else
     puts("defined(MACRO1)");
 # endif
 #else
     puts("!defined(MACRO1)");
 #endif
     return 0;
 }

I do not think that I have ever seen anyone format it like this:

 #include <stdio.h>

 int main(void) {
 #ifdef MACRO1
  #ifdef MACRO2
     puts("defined(MACRO1) && defined(MACRO2)");
  #else
     puts("defined(MACRO1)");
  #endif
 #else
     puts("!defined(MACRO1)");
 #endif
     return 0;
 }

My question is whether the C language standard requires the # character to be in column one.

So is the third option above even legal?

If all of the above cases are legal, then I want to know whether this is legal too:

 #include <stdio.h>

 int main(void) {
 #ifdef MACRO
     puts("defined(MACRO)");
 /* Now there are other characters before the `#` */ #endif
     return 0;
 }

Here the #endif is no longer at the "start" of the line, because other non-whitespace characters come before it.

What seems odd about the last example is that the Vim text editor does not highlight the #endif that comes after the comment.

All of the examples I have given compile without any warnings using gcc with the -Wall and -pedantic flags turned on (including the last one, with a comment before #endif).

Note that I am just curious about the syntax. I always put the # character in column one like everyone else when I program. I would never write things like ++i; #endif in serious projects.

Lovage answered 9/1, 2015 at 23:42 Comment(12)
I have indented #defines for a long time and never had any problems with GCC. You can also have whitespace after the hash and before the directive name. As for Vim: I call "bug". - Q
I indent preprocessor directives all the time. But I never put whitespace between the # and the directive name; I always make sure the # and the directive name are attached, and just put whitespace in front of the # instead. It works fine. - Tutor
I downvoted because this would have taken about 30 minutes to test. The OP has all the details; they just need to write a .c file with an #ifdef, some visible action, and an #endif in it, run the program, then edit the column location of the #ifdef and run it again. - Broken
@Broken I disagree: This is a typical language-lawyer question that cannot be answered by checking compilation with a compiler. This kind of question must be answered by quoting the standard. Pity, the question already has five tags, I would have added the language-lawyer tag otherwise... - Legwork
@Broken It's not a good reason to downvote, and I upvoted because even if one compiler accepts it, that does not mean all compilers will. You have to quote the Standard here. - Hebdomad
I swapped your c-preprocessor tag for language-lawyer, as you're asking a specific question regarding the standard. The pair of c and preprocessor should suffice, and the added tag might reduce the tendency to downvote and/or vote to close. If you object to the change, please feel free to roll it back. - Antinomy
Imma downvoting too. It's too easy to test. - Wickliffe
@MartinJames: It's only easy to test whether your particular compiler allows directives to be indented. The question is what the language standard guarantees. - Materfamilias
@KenWhite: I don't agree that it's a language-lawyer question (though I don't disagree strongly enough to remove the tag myself). It's a question about what the language standard guarantees, but it's not particularly obscure, and it has very practical implications for real-world programming. - Materfamilias
@Keith: As I said, if someone objects to the change, they should roll it back. I made it because a) the question asks specifically what the standard says, and b) the post has already gotten one downvote because it appears the OP could have tested quickly to see if it was acceptable or not and didn't do so (I was not that downvoter; I upvoted this question). It seemed the tag was better than a repeated preprocessor tag. I don't see anything in the tag wiki for the lawyer tag that says obscure, but I could be wrong. :-) - Antinomy
As a developer who does a lot of enhancement/maintenance work, I WANT such directives at the start of a line. It's difficult enough sorting out someone else's code as it is, without 'hiding' directives that have a huge effect on the functionality. - Wickliffe
@Broken @Martin James: If you read my whole question you will see that I have tested this code with gcc using -Wall and -pedantic. It compiled fine without any warnings. As Keith pointed out, I want to know what the standard guarantees, not what works with gcc. - Lovage
J
21

Some pre-standard C preprocessors (meaning before 1989) only recognized # at the beginning of the line.

Since the C89/C90 standards required the preprocessor to recognize the # as the first non-blank character on the line (and the C99 and C11 standards do too), it is now perfectly legitimate to indent the directives, and it has been practical for even portable code to do so for all of this millennium.

In ISO/IEC 9899:2011 (the C11 standard), Section 6.10 Preprocessing directives says:

A preprocessing directive consists of a sequence of preprocessing tokens that satisfies the following constraints: The first token in the sequence is a # preprocessing token that (at the start of translation phase 4) is either the first character in the source file (optionally after white space containing no new-line characters) or that follows white space containing at least one new-line character.

The translation phases are defined in Section 5.1.1.2 Translation phases. The two that matter here are phases 3 and 4:

  3. The source file is decomposed into preprocessing tokens and sequences of white-space characters (including comments). A source file shall not end in a partial preprocessing token or in a partial comment. Each comment is replaced by one space character. New-line characters are retained. Whether each nonempty sequence of white-space characters other than new-line is retained or replaced by one space character is implementation-defined.

  4. Preprocessing directives are executed, macro invocations are expanded, and _Pragma unary operator expressions are executed. If a character sequence that matches the syntax of a universal character name is produced by token concatenation (6.10.3.3), the behavior is undefined. A #include preprocessing directive causes the named header or source file to be processed from phase 1 through phase 4, recursively. All preprocessing directives are then deleted.
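Because each comment has already been replaced by a space before directives are executed, a # that follows only a comment on its line still counts as the start of a directive. A minimal sketch of the three placements the quoted text allows (the macro and function names are made up for illustration, and I would not recommend this layout in real code):

```c
/* This comment becomes one space, so '#' still leads the line. */ #define AFTER_COMMENT 1
    #define INDENTED 2          /* white space before '#' */
#   define SPACE_AFTER_HASH 3   /* white space between '#' and the directive name */

/* Compiles only if all three lines above were treated as directives. */
int directive_sum(void) {
    return AFTER_COMMENT + INDENTED + SPACE_AFTER_HASH;
}
```

If any of the three lines were not recognized as a directive, the corresponding identifier would be undeclared and the function would fail to compile; on a conforming implementation, directive_sum() should return 6.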

Occasionally, you will find coding standards originating from the 1980s which still stipulate '# at start of line'.

I usually don't indent preprocessor directives, but it is legitimate to do so.

Jilly answered 10/1, 2015 at 0:0 Comment(4)
If I'm reading phase 3 right (and just to be explicit), this means that /* Now there are other characters before the '#' */ #endif is valid, correct? - Exaggerate
@Cornstalks: That's how I read it, but if anyone offered me code to review with comments at the start of a preprocessing directive line, I'd send them back to fix it before approving it, and probably give them a lecture on sanitary coding practices too! - Jilly
Oh certainly! Like many things, just because it's valid/legal doesn't make it a good idea. - Exaggerate
While most uses of the preprocessor don't need indentation (include guards, includes, constant definitions), I find it good practice to indent #if ... #else ... #endif blocks for readability. I also find it good practice to indent macro definitions that are meant to be used within the scope of one function only to match the indentation of normal code (and add the corresponding #undef at the end of the function). Not that I would use it much, but when these things are necessary, proper indentation does help readability. - Legwork
E
6

No, and here's a quote from the C standard to go with it (from section 6.10):

A preprocessing directive consists of a sequence of preprocessing tokens that satisfies the following constraints: The first token in the sequence is a # preprocessing token that (at the start of translation phase 4) is either the first character in the source file (optionally after white space containing no new-line characters) or that follows white space containing at least one new-line character.

So it's a # at the start of the file or after some whitespace that contains at least one new-line character.

This means:

# define foo
  # define bar

foo's definition is fine because the # is the first token in the file. bar's definition is fine because the # "follows white space containing at least one new-line character."
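One way to check this for yourself is to ask the preprocessor whether both names ended up defined. This is only a sketch: both_defined is a hypothetical helper name, not anything from the standard.

```c
# define foo
  # define bar

/* Returns 1 if both lines above were recognized as directives, 0 otherwise. */
int both_defined(void) {
#if defined(foo) && defined(bar)
    return 1;
#else
    return 0;
#endif
}
```

On a conforming implementation, both_defined() should return 1, since neither the space after the # nor the indentation before it stops the line from being a directive.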

Exaggerate answered 10/1, 2015 at 0:3 Comment(0)
