Semantics of lifetime parameters

Consider the following example from the Book:

fn main() {
    let string1 = String::from("abcd");
    let string2 = "xyz";

    let result = longest(string1.as_str(), string2);
    println!("The longest string is {}", result);
}

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

It is said that (emphasis mine)

The function signature now tells Rust that for some lifetime 'a, the function takes two parameters, both of which are string slices that live at least as long as lifetime 'a. The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a. In practice, it means that the lifetime of the reference returned by the longest function is the same as the smaller of the lifetimes of the references passed in. These constraints are what we want Rust to enforce.

Shouldn't the bolded sentence instead read "The function signature also tells Rust that the string slice returned from the function will live at most as long as lifetime 'a"? That way, we are assured that as long as both x and y are alive, the return value is also valid, because the latter references the former.

To paraphrase, if x, y and the return value all live at least as long as lifetime 'a, then the compiler can simply let 'a be an empty scope (which any item can outlive) to satisfy the restriction, rendering the annotation useless. This doesn't make sense, right?

Discrepant answered 2/6, 2021 at 9:32 Comment(11)
No, it should be "at least", because if it lives longer it is still a valid lifetime. - Rightminded
@Rightminded but it won't be valid if it lives too long, e.g. outliving both x and y. - Discrepant
Aaah, OK, I misunderstood. In that case both x and y are bound by the same lifetime, so the returned reference could live at most that long, as you say. Makes sense, yes. - Rightminded
No, the returned value will live at least as long as both x and y are valid. It may become invalid as soon as one of x or y is no longer valid, but it may also live longer (in particular if the other one is still valid). - Exalt
As a consequence, the caller is guaranteed that they can safely use the returned value so long as both x and y are still valid. - Exalt
@Exalt I see, but I suppose in that case x and y can live at most as long as 'a? If all of x, y, and the return value can outlive 'a, then the annotation doesn't seem extremely helpful to me, because it cannot function as a "lifetime barrier". - Discrepant
x and y must (not "can") live at least as long as 'a. Yes, x, y and the return value can outlive 'a. Forget about 'a: the point of the annotation is to link the lifetime of the return value to the lifetimes of x and y. What the annotation means is literally: "you can use the return value as long as both x and y remain valid. As soon as either x or y becomes invalid, the return value may be invalid and so can no longer be used safely". 'a is just a placeholder that represents the lifetime during which all three values are guaranteed to be valid. - Exalt
@Exalt Yeah, I'm exactly trying to understand how the annotation links the lifetime of the return value to the lifetimes of the arguments. I mean, a < b and a < c tells us nothing about the relationship between b and c, so it's weird that lifetime annotations can establish a link from the fact that both the argument and the return value can outlive some arbitrary lifetime. - Discrepant
Expressed in formal language, the annotation translates to: for all 'a, 'a≤'x and 'a≤'y implies 'a≤'r (with 'x, 'y and 'r the lifetimes of x, y, and the return value respectively). For that relation to hold for all 'a, you must necessarily have 'x≤'r or 'y≤'r. - Exalt
@Exalt This is a really good explanation. I love it! There is still one problem though: according to the Book, "Note that the longest function doesn’t need to know exactly how long x and y will live, only that some scope can be substituted for 'a that will satisfy this signature." Do you think the "some scope" here should be "all scopes"? - Discrepant
I felt the same confusion when reading this paragraph! Really happy to see that someone else had the same feeling :) - Crinite

Expressed in formal language, the annotation translates to:

for all 'a, 'a≤'x and 'a≤'y implies 'a≤'r

With 'x, 'y and 'r the lifetimes of x, y, and the return value respectively.

This links the lifetime of the return value to the lifetimes of the parameters because for that relation to hold for all 'a, then you must necessarily have 'x≤'r or 'y≤'r.

The compiler uses that annotation at two points:

  1. When compiling the annotated function, the compiler doesn't know the actual lifetimes of x and y and it doesn't know 'a (since 'a will be chosen at the call site, like all generic parameters). But it knows that when the function gets called, the caller will use some lifetime 'a that matches the input constraints 'a≤'x and 'a≤'y and it checks that the code of the function respects the output constraint 'a≤'r.

  2. When calling the annotated function, the compiler will add to its constraint solver an unknown scope 'a in which the return value can be accessed, along with the constraints that 'a≤'x and 'a≤'y plus whatever extra constraints are required due to the surrounding code and in particular where x and y come from and how the return value is used. If the compiler is able to find some scope 'a that matches all the constraints, then the code compiles using that scope. Otherwise compilation fails with a "does not live long enough" error.
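Both checks can be seen in a small sketch. The helper `or_default` is hypothetical and only illustrates that a 'static reference also satisfies the output constraint; the rejected `foo` is left commented out so the block still compiles.

```rust
// Definition site: the body is checked once, for every 'a a caller might choose.
// Both branches return a reference that is valid for 'a, so this is accepted.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

// Also accepted: a 'static reference outlives any 'a, so the output constraint
// holds even though the literal borrows from neither argument.
// (`or_default` is a hypothetical helper, not part of the original example.)
fn or_default<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.is_empty() && y.is_empty() { "<empty>" } else { longest(x, y) }
}

// Rejected at the definition site: nothing relates 'b to 'a, so returning `y`
// cannot be shown to satisfy the output constraint.
// fn foo<'a, 'b>(x: &'a str, y: &'b str) -> &'a str { y }

fn main() {
    let string1 = String::from("abcd");
    let string2 = String::from("xyz");
    // Call site: the compiler only needs some 'a that both borrows cover
    // and that spans every use of the result.
    let result = longest(string1.as_str(), string2.as_str());
    println!("The longest string is {}", result);
    println!("{}", or_default("", ""));
}
```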

Exalt answered 3/6, 2021 at 6:51 Comment(7)
I think this is backwards. It should be: for all 'a, 'a <= 'r implies 'a <= 'x and 'a <= 'y. With "for all 'a, 'a <= 'x and 'a <= 'y implies 'a <= 'r", 'r would be at least as long as the intersection of 'x and 'y, but could be longer. - Ascertain
In the case of the longest function used in the question, 'r == 'x or 'r == 'y depending on which value is returned, so 'r could indeed be longer than the intersection of 'x and 'y. - Exalt
Rereading your answer carefully, I see 2. is what I expect. For number 1., how could the code of the function not respect 'a<='r? - Ascertain
fn foo<'a, 'b> (x: &'a str, y: &'b str) -> &'a str { return y; } doesn't guarantee that 'a <= 'r (and doesn't compile for this reason). - Exalt
Thank you. I feel embarrassed that I didn't see that example. - Ascertain
I didn't quite get this: "you must necessarily have 'x≤'r or 'y≤'r". This says the return value lives at least as long as the function arguments. Shouldn't the relationship be the other way around? - Allheal
@Allheal no, it's the right way. The return value may live longer than the arguments (returning a 'static reference is allowed), but any returned value must live at least as long as one of the arguments or the function will fail to compile. - Exalt

When I first read that section of the Book, I had the same reaction as you. I finally understood what the Book means after reading the Rustonomicon.

Every reference in Rust has a lifetime. A lifetime is a region (span) of code during which the reference borrows the value. This implies that the value must live 'at least as long as' the span for which we borrow it.

When we have a function definition:
fn longest(x: &str, y: &str) -> &str

This function returns a reference to the caller. To use the returned reference correctly, the caller needs to know how long the referenced value lives. The callee needs to guarantee that the value behind the reference lives 'at least as long as' the span of code in which the reference is used. In the example below I name two such spans lifetime1 and lifetime2 and annotate them explicitly for visualization (this is not valid Rust).

fn longest<'a, 'b, 'c>(x: &'a str, y: &'b str) -> &'c str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
fn main() {
    'lifetime1: {
        let string1 = String::from("abcd");
        let string2 = "xyz";

        'lifetime2: {
            let result: &'lifetime2 str = longest::<'lifetime1, 'lifetime1, 'lifetime2>(string1.as_str(), string2);
            println!("The longest string is {}", result);
        }
    }
}

In the code above, the caller guarantees that the value behind x lives at least as long as 'a (here 'lifetime1) and that the value behind y lives at least as long as 'b (also 'lifetime1). The callee guarantees that the value behind the returned reference lives at least as long as 'c (here 'lifetime2).
So 'at least as long as' has a different guarantor on each side: for the inputs, the caller must guarantee it; for the output, the callee must.

However, the code above is still wrong. longest can be called from anywhere, so the function body knows nothing about the relation between 'a, 'b and 'c. Without such a relation, a reference with lifetime 'a or 'b cannot be returned as one with lifetime 'c: we cannot assign a reference with a disjoint or shorter lifetime to one with a longer lifetime.

So this is one possible fix:

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
fn main() {
    'lifetime1: {
        let string1 = String::from("abcd");
        let string2 = "xyz";

        'lifetime2: {
            let result: &'lifetime2 str = longest::<'lifetime2>(string1.as_str(), string2);
            println!("The longest string is {}", result);
        }
    }
}

Because string1 and string2 live longer than 'lifetime2, we can borrow them with references that have lifetime 'lifetime2. So the above code is valid.

Another solution is to declare, in our original three-lifetime version, that 'a and 'b each outlive (are supersets of) 'c, using the bounds

'a: 'c and 'b: 'c

fn longest<'a: 'c, 'b: 'c, 'c>(x: &'a str, y: &'b str) -> &'c str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
fn main() {
    'lifetime1: {
        let string1 = String::from("abcd");
        let string2 = "xyz";

        'lifetime2: {
            let result: &'lifetime2 str = longest::<'lifetime1, 'lifetime1, 'lifetime2>(string1.as_str(), string2);
            println!("The longest string is {}", result);
        }
    }
}

Now we can return a &'a str (or a &'b str) as a &'c str, since 'c is a subset of both 'a and 'b.
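For reference, here is a compilable sketch of the bounded version. Real Rust cannot name regions such as 'lifetime1 and 'lifetime2 at the call site, so the compiler infers them. As an assumption for illustration, string2 is an owned String in an inner block, so the two arguments really do have different lifetimes.

```rust
// `'a: 'c` reads "'a outlives 'c", so a &'a str can be used where a &'c str
// is expected; the same holds for 'b.
fn longest<'a: 'c, 'b: 'c, 'c>(x: &'a str, y: &'b str) -> &'c str {
    if x.len() > y.len() { x } else { y }
}

fn main() {
    let string1 = String::from("abcd"); // borrowed for the longer, outer region
    {
        let string2 = String::from("xyz"); // borrowed for the shorter, inner region
        // 'a and 'b may differ here; 'c is inferred as a region that is
        // no longer than either borrow.
        let result = longest(string1.as_str(), string2.as_str());
        println!("The longest string is {}", result);
    }
}
```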

Tedie answered 13/3, 2023 at 11:52 Comment(0)

We can consider the case from your example code with a slight scope modification

fn main() {
    let string1 = String::from("abcd");

    {
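        // An owned String rather than a 'static literal, so the borrow passed
        // to longest really does end with this inner block.
        let string2 = String::from("xyz");
        let result = longest(string1.as_str(), string2.as_str());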
        println!("The longest string is {}", result);
    }
}

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

Here, for the call to longest above, the lifetime 'a ends up being (at most) the lifetime of string2, because both parameters x and y must live at least as long as 'a. If 'a were the lifetime of string1, then the second argument, which borrows from string2, would not live as long as 'a, and the statement "both parameters must live at least as long as 'a" would be false.

So take 'a to be the lifetime of string2. The string slice returned by longest could be either string1 or string2. Since the declaration requires that the return value also live at least as long as 'a, we are really saying that the return value lives at least as long as string2, the string with the shorter of the two lifetimes.

If longest returned string2, the returned string slice would live exactly as long as 'a. If longest returned string1, however, the returned slice would live as long as string1, which is longer than 'a (the lifetime of string2). Either way, the string slice returned from the function lives at least as long as 'a.

An important thing to note here is that we don't know which slice longest is going to return, so the caller may only use the returned reference during the shorter of the two lifetimes, since during that span both strings are certainly still alive.
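The effect on the caller can be sketched as follows, assuming string2 is an owned String (as in the Book's variant of this example) so that its borrow ends with the inner block. The use inside the block compiles; the commented-out use after the block is the one the borrow checker is expected to reject.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

fn main() {
    let string1 = String::from("long string is long");
    let result;
    {
        let string2 = String::from("xyz");
        result = longest(string1.as_str(), string2.as_str());
        // Accepted: this use lies inside the span where both borrows are valid.
        println!("The longest string is {}", result);
    }
    // Uncommenting the next line should make compilation fail with
    // "`string2` does not live long enough": the result would then have to be
    // valid past the end of the shorter of the two lifetimes.
    // println!("The longest string is {}", result);
}
```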

Forras answered 2/6, 2021 at 10:31 Comment(0)

Initially, I shared your confusion: why is it "at least" instead of "at most"? "At least" seems to guarantee nothing at all. Only after reading Jmb's and Kevin's answers and several related discussions did I finally understand (hopefully).

Preliminaries:

  1. "xxx (reference) lives/outlives 'a" means that the subject of reference xxx lives/outlives 'a, rather than xxx itself (scope) or its use (see 4).

  2. Slice is an ambiguous term; e.g., &str and str are sometimes both referred to as string slices. In "... the string slice returned from the function will live ...", it likely means &str, but due to point 1, this distinction does not really matter here.

  3. Given "xxx (reference) lives at least as long as 'a", we can safely use xxx for at most 'a.

  4. A function signature is a contract that contains both constraints and guarantees, varying based on whether you're the caller or the callee. Specifically, parameters are a guarantee to the callee but a constraint to the caller, while the return value is a guarantee to the caller but a constraint to the callee.

  5. Like generic type parameters, lifetime parameters are part of the function signature; they are unknown at the definition and determined only at the call site.

Now let's apply these concepts to longest. Hereafter, we denote the returned reference as r. The lifetime annotations represent a contract: 'a is an unknown lifetime determined at each function call. For every such 'a, if x and y outlive 'a, then r will also outlive 'a. (PS: As mentioned in another answer, this implies that r will always live at least as long as x or y.)

This contract can be viewed from two perspectives:

  • Callee's perspective: It's guaranteed that x and y outlive 'a. The callee itself must ensure that r will also outlive 'a.

  • Caller’s perspective: It's guaranteed that r outlives 'a. The caller itself must ensure that the actual parameters x and y outlive 'a.

Then the borrow checker can verify it from two perspectives:

  • For the callee: It verifies that the reference returned by the function body indeed outlives the abstract lifetime parameter 'a. Note that this needs to hold for all possible concrete 'a, which in practice might be ensured through subtyping.

  • For the caller: It tries to find some (not necessarily all) concrete lifetime for 'a, which encompasses all the uses of the returned reference, while ensuring that the passed-in x and y both outlive 'a.
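A minimal sketch of the "some concrete lifetime per call site" point: the same signature is instantiated with a different 'a at each call. The variable names are illustrative only.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

fn main() {
    // Call site 1: both arguments are 'static, so 'a = 'static is a valid
    // choice and the result can be stored as a &'static str.
    let from_literals: &'static str = longest("abcd", "xyz");

    // Call site 2: one argument borrows a local String, so the compiler has
    // to pick a shorter 'a that ends no later than `owned` does.
    let owned = String::from("hello");
    let from_local = longest(owned.as_str(), "xyz");

    println!("{} {}", from_literals, from_local);
}
```

The function body is checked only once, against an abstract 'a, yet it supports both instantiations, which is why the callee-side check has to hold for every possible 'a.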

Crinite answered 13/8 at 9:25 Comment(0)
