Return value from loop expression with break

Example 1

fn five() -> i32 {
    5   // ; not allowed I understand why
}

fn main() {
    let x = five();
    println!("The value of x is: {x}");
}

Example 2 (from https://doc.rust-lang.org/stable/book/ch03-05-control-flow.html)

fn main() {
    let mut counter = 0;

    let result = loop {
        counter += 1;

        if counter == 10 {
            break counter * 2;
        }
    };
    println!("The result is {result}");
}

I understand why in Example 1 it must be 5 and not 5;, but I am confused with Example 2, and have a few questions.

Question 1:

Why do we have ; here? It will work without ;, so why is it there? Is it some Rust convention or is there some technical reason?

Question 2:

If I do break; counter * 2; it will not return a value. What is the difference between break; counter * 2; and break counter * 2;?
Why does the second one work?

Question 3:

If I do:

break counter * 2
println!("After break");

compile error is: error: expected ;, found println
If I do:

break counter * 2;
println!("After break");

there is no more compile error, but:

15 |             println!("After break");
   |             ^^^^^^^^^^^^^^^^^^^^^^^ unreachable statement

But at least I understand this.
What I do not understand is why break counter * 2 works fine on its own, but adding something after it causes a compile error.

To be honest, I am confused by Example 2. My understanding is that if we want to return a value from an expression, the last line should be without ";" (like in Example 1), but Example 2 clearly shows otherwise.

Someone answered 6/8, 2022 at 18:34 Comment(1)
In addition to the answers, I'd suggest keeping The Rust Reference open in another tab. It has a good amount of information about the language: doc.rust-lang.org/reference/statements-and-expressions.html. It doesn't mention that break is diverging, though. - Pasty
D
20

Rust is a very expression-oriented language. And expressions, crucially, return values. When you write a function, you're writing an expression. That expression can consist of several statements separated by semicolons.

This is where Rust diverges from most other C-derived languages. Expressions are the driver in Rust, and semicolons separate statements. So a valid expression is { a ; b ; c ; d }, where d is the eventual result. a, b, and c are mere side effects. In C, by contrast, a function is a sequence of statements terminated by semicolons, and statements contain expressions. So in C, a function body might look like { a ; b ; c ; d ; }, where each statement is executed for side effects, and one of them might happen to be a return statement, but it's still a statement.

If a sequence of expressions in Rust ends in a semicolon, Rust assumes you meant to insert an extra () at the end, so { a ; b ; c ; d ; } translates to { a ; b ; c ; d ; () }. This is why we don't have to write () at the end of all of our unit-returning functions. It's just the default.
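A minimal sketch of the two cases, with and without a trailing expression. The function names with_tail and without_tail are invented for this illustration.

```rust
// Hypothetical helper: the block ends in an expression, so that is its value.
fn with_tail() -> i32 {
    let a = 1;
    let b = 2;
    a + b // no semicolon: this is the value of the block
}

// Hypothetical helper: every line ends in `;`, so the block's value is `()`.
fn without_tail() {
    let a = 1;
    let b = 2;
    let _sum = a + b; // computed, then discarded
}

fn main() {
    let x: i32 = with_tail();
    let y: () = without_tail();
    println!("{x} {y:?}"); // prints "3 ()"
}
```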

It's a more functional way of looking at things. A function returns a value, and whatever else happens is a side effect. The "usual" return value at the end of a function is simply that value, as an expression.

Now, because Rust supports a more imperative style (and because it's often useful and convenient), Rust also supports break and return, which break out of the usual flow of control early. These are typically written as statements, followed by a semicolon. They carry a value out of the enclosing loop or function as a side effect, but that value is not the "usual" result of the block they appear in.
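As a small illustration of an early exit sitting next to the ordinary tail value, here is a sketch. The function classify is made up for this example.

```rust
// Hypothetical function: `return` leaves early, while the last line is the
// ordinary value of the function body.
fn classify(n: i32) -> &'static str {
    if n < 0 {
        return "negative"; // early exit, written as a statement
    }
    // Tail expression: no semicolon, so this is the function's value.
    if n == 0 { "zero" } else { "positive" }
}

fn main() {
    println!("{} {} {}", classify(-1), classify(0), classify(7));
    // prints "negative zero positive"
}
```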

let result = loop {
  counter += 1;
  if counter == 10 {
    break counter * 2;
  }
};

The body of the loop, like most things in Rust, is an expression, so it has a value. For a loop body that value has to be (), and it is simply discarded, since the loop is just going to run again. In this case, the block is equivalent to

let result = loop {
  counter += 1;
  if counter == 10 {
    break counter * 2;
    ()
  } else {
    () // 'if' can also insert () in the else block when used as an expression
  }
};

and we return () explicitly. If you remove the semicolon, you get

let result = loop {
  counter += 1;
  if counter == 10 {
    break counter * 2
  } else {
    () // 'if' can also insert () in the else block when used as an expression
  }
};

break, as an expression, "returns" a value as well. That value is of the diverging type, called never or !. Since break is guaranteed to diverge (i.e. to exit the usual flow of control), it returns !, which is the only type in Rust that is compatible with every other type. So the result of this if expression is still (), since ! can convert to (). This is all moot, of course, since the loop will just run again if not broken, but that's what Rust is reasoning about internally.
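The same coercion is easier to see where a non-unit type is expected. The sketch below is only an illustration (the function sum_until_invalid is hypothetical): one match arm yields an i32, the other is a break, and the compiler accepts both because the break arm has type !.

```rust
// Hypothetical function: adds up parsed numbers and stops at the first
// input that is not a valid integer.
fn sum_until_invalid(inputs: &[&str]) -> i32 {
    let mut sum = 0;
    let mut i = 0;
    loop {
        if i == inputs.len() {
            break sum;
        }
        // The first arm has type i32. The second arm is a `break`, whose
        // type `!` is accepted where an i32 is expected.
        let n: i32 = match inputs[i].parse() {
            Ok(n) => n,
            Err(_) => break sum,
        };
        sum += n;
        i += 1;
    }
}

fn main() {
    println!("{}", sum_until_invalid(&["1", "2", "x", "4"])); // prints "3"
}
```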

In summary, you're not trying to return from the last line of the { ... } block of the loop. You're trying to break out of the loop, which is not a normal return; it breaks the usual rules that the loop would follow, so it uses break, written here as a statement, and statements in Rust are separated from one another by a semicolon. The fact that you can end the statement sequence without a semicolon is incidental here, since the block's result is () either way and the loop discards it.
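For Question 2 in particular, a sketch comparing the three spellings may help. The function names are invented for illustration. A plain break; makes the whole loop evaluate to (), and a counter * 2; written after it would be a separate statement that is never reached.

```rust
// `break value;` with a semicolon: the loop evaluates to the value.
fn with_semicolon() -> i32 {
    let mut counter = 0;
    loop {
        counter += 1;
        if counter == 10 {
            break counter * 2;
        }
    }
}

// Same thing without the semicolon: `break` has type `!`, so the `if`
// block still fits where `()` is expected.
fn without_semicolon() -> i32 {
    let mut counter = 0;
    loop {
        counter += 1;
        if counter == 10 {
            break counter * 2
        }
    }
}

// Plain `break;` carries no value, so the loop evaluates to `()`.
fn plain_break() {
    let mut counter = 0;
    let result: () = loop {
        counter += 1;
        if counter == 10 {
            break;
        }
    };
    result
}

fn main() {
    println!("{} {} {:?}", with_semicolon(), without_semicolon(), plain_break());
    // prints "20 20 ()"
}
```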

Dixon answered 6/8, 2022 at 18:47 Comment(4)
This is a wonderful explanation. When I read the questions I thought I knew the answers, but then I read the answers by Silvio and I definitely learnt quite a few things from them myself. Let me just add that loop is the only loop where a break expression can return a value. - Andri
And the reason is that loop is the only loop which is guaranteed to execute at least once. - Fenestra
@Fenestra Not quite; loop is the only loop which is guaranteed not to exit except by break, so the type of the loop value can be controlled by the break. "At least once" wouldn't help if it exited on a non-break condition after the first iteration. - Heavyset
A wonderful answer. One small nit: technically, if expr { stmt; } is not like if expr { stmt; () } else { () }. The former can be chained in statements (stmt; if expr { stmt; } stmt;) without a trailing semicolon, while the latter can't. Internally, expressions are divided into "needs trailing semicolon" and "doesn't need" (and I think this is actually "needs", "accepts" and "rejects", but I'm unsure). An if whose block does not end with a trailing expression does not need a trailing semicolon, and thus it is more of a standalone statement. It can be easier to view this the way you explained it, though :) - Laughingstock
D
-3

Question 1: Why do we have ; here? It will work without ;, so why is it there? Is it some Rust convention or is there some technical reason?

If you are talking about break counter * 2;, then ; is required because it's not the last code in that function. If it was the last code, like in this example, you could omit the semicolon. (image not available)

Drawee answered 23/1, 2023 at 8:0 Comment(2)
Please do not upload images of code/data/errors. - Detrital
OP could also omit the semicolon in their example; that's why they ask. Also, rustfmt would insert a semicolon in your example, too. - Detrital
