How are if statements in C syntactically unambiguous?
I don't know a whole lot about C, but I understand the basics and as far as I can tell:

int main() {
  if (1 == 1) printf("Hello World!\n");
  return 0;
}

and

int main() {
  if (1 == 1) 
    printf("Hello World!\n");
  return 0;
}

and

int main() {
  if (1 == 1) {
    printf("Hello World!\n");
  }
  return 0;
}

are all precisely syntactically equivalent. The statement is true; the string is printed; the braces are (apparently) optional.

Sometimes, especially here on SO, I see something like the following:

int main() {
  if (1 == 1)
    printf("one is one\n");
  printf("is this inside the if statement??/who kn0WS\n");
  return 0;
}

By the power vested in CodeGolf, I have been led to believe that C is whitespace-agnostic; the lexical analyser breaks the tokens up into their component parts and strips whitespace outside strings.

(I mean, the whole reason for the semicolons-on-every-statement-thing is so the parser can strip \n, \t, literal spaces and still know where each statement ends, right??)

So how is it possible to unambiguously parse the previous bit of code (or perhaps someone can come up with a better example of what I mean), if whitespace is to be disregarded?

If C programmers want to write in whitespace-dependent Pythonic syntax, why do they write C, and why is it taught wherever C is taught that it's okay to write lexically ambiguous (both to me, a programmer, and the computer) statements like this?

Abattoir answered 16/1, 2016 at 20:8 Comment(1)
Why don't you read the C11 draft standard, 6.8.4 Selection statements (possibly also 6.8.2 Compound statement), and see whether you can figure it out? – S
C
5
if (1 == 1)
  printf("one is one\n");
printf("is this inside the if statement??/who kn0WS\n");

The second printf() is never part of the if statement; it always executes.

The reason is that the first printf() line ends with a semicolon, which ends the single statement that the if controls.

(I mean, the whole reason for the semicolons-on-every-statement-thing is so the parser can strip \n, \t, literal spaces and still know where each statement ends, right??)

So how is it possible to unambiguously parse the previous bit of code (or perhaps someone can come up with a better example of what I mean), if whitespace is to be disregarded?

Parsing example:

if (1 == 1) // if - ( and ) - statements (or block) follow, skip all whitespace

// no { found -> single statement, scan until ; (outside quotes / comments)

printf("one is one\n"); // ; encountered, end of if-block

Without braces, only one statement belongs to the if-block.
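
As a minimal sketch of that rule (count_executed is a hypothetical helper, not something from the question), the function below counts how many of the two statements following the if actually run:

```c
/* Hypothetical helper: returns how many of the two increments ran. */
int count_executed(int condition)
{
    int executed = 0;

    if (condition)
        executed++;   /* the one statement controlled by the if */
    executed++;       /* not part of the if, however it is indented */

    return executed;
}
```

With this sketch, count_executed(0) returns 1 and count_executed(1) returns 2: the second increment runs either way.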

But, as said already, it's a good habit to use braces. If you later add a statement (a quick temporary printf() for example), it will always be inside the block.

Special case:

int i = 0;
while(i++ < 10);
    printf("%d", i);

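Here printf() executes only once, after the loop has finished, and prints 11. Note the ; at the end of the while() line: it is an empty statement, and it forms the entire loop body.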

In case of an empty statement, it's better to use:

while(i++ < 10)
    ;

to make the intention clear (or, as an alternative, an empty block {} can be used as well).
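
A small sketch of the same loop, wrapped in a hypothetical helper (body_runs_after_empty_loop is my own name) so the effect can be checked rather than read off the screen:

```c
/* Hypothetical helper: counts how often the statement after the loop runs
   and reports the final value of i through final_i. */
int body_runs_after_empty_loop(int *final_i)
{
    int i = 0;
    int runs = 0;

    while (i++ < 10)
        ;        /* empty statement: this is the whole loop body */
    runs++;      /* executes once, after the loop has ended */

    *final_i = i;
    return runs;
}
```

The function returns 1 and stores 11 in *final_i: the last test compares 10 < 10, fails, and the post-increment still bumps i to 11.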

Corena answered 16/1, 2016 at 20:11 Comment(3)
So braceless if-blocks can consist of a maximum of one statement? – Abattoir
Yes - but best to use braces (as the saying goes - always use braces then you never get caught with your trousers down). – Moorehead
An alternative to a line break before the ; is an empty block {}. – Flats
B
5

In C, an if statement takes exactly one statement after the condition, regardless of indentation. Normally this statement is indented for clarity, but C ignores indentation. In any case, there is no ambiguity in any of your examples.

What is ambiguous, in C and in many other languages, is the "dangling else". For instance, suppose you have a nested if statement with a single else after the second one. It could group as:

if (expr)
    if (expr)
        statement
    else
        statement

Or it could group as:

if (expr)
    if (expr)
        statement
else
    statement

The only difference between these two is how they're indented, which C ignores. In this case, the ambiguity is resolved by using the first interpretation, i.e., the else binds to the nearest preceding if that does not already have an else. To achieve the second interpretation, curly braces are needed:

if (expr) {
    if (expr)
        statement
}
else
    statement

However, even in the first case, it's a good idea to include the curly braces, even though they aren't required:

if (expr) {
    if (expr)
        statement
    else
        statement
}
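
To make the two groupings concrete, here is a hedged sketch with two hypothetical helpers (dangling_else and braced_else are my own names). Each returns 0 if no branch ran, 1 for the inner "then" branch and 2 for the else branch. Compilers such as GCC and Clang may warn about the first function, which is the point of the example:

```c
/* The else pairs with the nearest if, i.e. "if (inner)". */
int dangling_else(int outer, int inner)
{
    int taken = 0;

    if (outer)
        if (inner)
            taken = 1;
        else
            taken = 2;   /* reached only when outer is true and inner is false */

    return taken;
}

/* Braces force the else to pair with "if (outer)". */
int braced_else(int outer, int inner)
{
    int taken = 0;

    if (outer) {
        if (inner)
            taken = 1;
    } else {
        taken = 2;       /* reached whenever outer is false */
    }

    return taken;
}
```

With outer false, dangling_else returns 0 while braced_else returns 2; with outer true and inner false, it is the other way round (2 and 0).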
Burrill answered 16/1, 2016 at 20:23 Comment(3)
Coming mostly from golang, I wish they were required. – Abattoir
@cat: …and Python avoids the problem by requiring locally consistent indentation. There's more than one way to skin the … oh, sorry. There's more than one approach to this issue. – Erickson
@JonathanLeffler indeed, it does; now if only I could have Python with braces and optional static typing (also, thanks for the laugh :P ) – Abattoir
C
3

tl;dr The only ambiguity is in how difficult it is for a human to read. From the compiler's perspective, the syntax is perfectly unambiguous.

There are only two (compilable and syntactically acceptable) possibilities after an if statement:

  1. Braces, as in

    if(x) {
        DoFoo();
    }
    // or
    if(x) { DoFoo(); }
    

    In this case, whatever is in the {...} will execute if the condition is met.

  2. No braces, as in

    if(x)
        DoFoo();
    // or
    if(x) DoFoo();
    

    In this case, only the next statement will execute if the condition is met.

You are correct that C is whitespace-agnostic. As a result, omitting the braces can lead to some tricky bugs. For example, in this code, DoBar() will execute whether or not the condition is met:

if(x)
    DoFoo();
    DoBar();

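Omitting braces can also easily result in invalid code. For example, this looks valid (at a glance) from a human perspective, but it does not compile: the if statement ends after DoFoo();, so DoBar(); is a separate statement and the else has no if left to attach to: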

if(x)
    DoFoo();
    DoBar();
else
    DoBaz();

None of the examples you posted are ambiguous from the compiler's perspective, but the no-braces versions are confusing from a human perspective. Leaving out the braces frequently leads to hard-to-find bugs.

Cycad answered 16/1, 2016 at 20:15 Comment(1)
@TomKarzes Fixed. Thanks. – Cycad
M
2

Without braces, the if controls just the next statement. The whitespace does not matter.

It is good practice, and makes life easier, to always use braces, along with good indentation. The code is then easy to read and does not lead to errors when people add or remove statements after the if.

Moorehead answered 16/1, 2016 at 20:12 Comment(0)
T
2

Readability is the only ambiguity in the statement:

if (1 == 1)
  printf("one is one\n");
printf("is this inside the if statement??/who kn0WS\n");

The first statement following if (...) executes only when the condition evaluates to true; the statement after it executes regardless.

Braces {...} help remove the readability ambiguity,

if (1 == 1)
{
     printf("one is one\n");
}
printf("is this inside the if statement??/who kn0WS\n");

but the syntax rules are still the same.

Opinions vary, but I always choose to use braces.
Not using them is fine at the time code is written. But down the road, you just know someone will come along and add another statement under the first and expect it to be executed.

Teetotalism answered 16/1, 2016 at 20:18 Comment(0)
A
2

In general, when an if statement or a loop controls a single statement, the curly brackets are optional; if it has to control two or more statements, you must add them. For example:

for(i = 0; i < 2; i++)
   for(j = 0; j < 4; j++)
      if(...)
         printf(..);
      else
         printf(..);

is equivalent to:

for(i = 0; i < 2; i++)
{
    for(j = 0; j < 4; j++)
    {
        if(...)
        {
            printf(..);
        }
        else
        {
            printf(..);
        }
    }
}

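As you may note, this is mostly a matter of how the code is laid out. A whole if/else, like a whole for loop, counts as a single statement, which is why the nested loops above need no braces. Personally, I don't use curly brackets when there is a single statement, as leaving them out makes the code shorter and cleaner.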
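
A sketch to check the equivalence, assuming a made-up condition (j == 0) in place of the elided one and two hypothetical helpers that count how often each branch runs instead of printing:

```c
/* Hypothetical helper: nested loops and if/else without any braces. */
void count_unbraced(int *first, int *rest)
{
    int i, j;

    *first = 0;
    *rest = 0;
    for (i = 0; i < 2; i++)
        for (j = 0; j < 4; j++)
            if (j == 0)          /* assumed condition, for illustration only */
                (*first)++;
            else
                (*rest)++;
}

/* Hypothetical helper: the same logic with every body braced. */
void count_braced(int *first, int *rest)
{
    int i, j;

    *first = 0;
    *rest = 0;
    for (i = 0; i < 2; i++) {
        for (j = 0; j < 4; j++) {
            if (j == 0) {
                (*first)++;
            } else {
                (*rest)++;
            }
        }
    }
}
```

Both functions report 2 for the first branch and 6 for the second over the 8 iterations, so the braces change nothing here.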

Alphonsealphonsine answered 16/1, 2016 at 20:28 Comment(5)
If you used K&R-style on the braces (eg. if (cond) { and } else { ) it wouldn't be nearly so long... – Aviary
@JohnHascall Do you mean, for example, if(..) {..} on a single line? – Alphonsealphonsine
Hard to convey successfully in the comment format, but a short if (cond) statement; on one line is an example. An if (cond) { statement1; statement2; } on four lines is another. An if (cond) { statement1; } else { statement2; } on five lines is another. – Aviary
@JohnHascall I prefer the style I wrote: at a glance you can see whether a curly bracket is missing and which statements each pair of brackets encloses. By the way, I'm aware of the K&R style but, as I wrote, in these simple cases I think personal preference leads the discussion. – Alphonsealphonsine
And I prefer K&R because it allows you to see more code at a time, which I find useful for understanding in some cases. But reasonable people can disagree agreeably... – Aviary
A
2

Another reason to use braces is that a simple typo can bite you hard:

#include <stdio.h>
int main(void) {
  if (0 == 1)
    printf("zero is one\n"),
  printf("is this inside the if statement?? /who kn0WS\n");
  return 0;
}

Look carefully...
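
For anyone who does not spot it: the first printf() line ends with a comma, not a semicolon, so both calls form one expression statement joined by the comma operator, and the if controls both. A minimal sketch of the same typo, using a hypothetical counter function in place of the printf() calls:

```c
/* Hypothetical helper: returns how many of the two increments ran. */
int count_executed_with_comma(int condition)
{
    int executed = 0;

    if (condition)
        executed++,   /* comma, not semicolon: the expression continues */
    executed++;       /* ...so this is still part of the if's statement */

    return executed;
}
```

Here count_executed_with_comma(0) returns 0 and count_executed_with_comma(1) returns 2. With a semicolon in place of the comma, the second increment would run unconditionally.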

Aviary answered 17/1, 2016 at 16:25 Comment(4)
Indeed! I also see your sneaky attempt to disarm my trigraph but to no end ;-) – Abattoir
Trigraphs should never have been squirted darkly from whichever orifice whence they came... – Aviary
I've heard people call them "one of the most clever/useful things about C" but I don't see how they serve a purpose aside from confusing programmers. – Abattoir
The standards group eventually realized how stupid they are and came up with the less-evil digraphs. Now if they would only banish trigraphs. – Aviary
