Why is a lifetime required for a reference to a type in a trait bound?
I think I understood how lifetimes work with function parameters/outputs and with structs, since those cases are explained in the book (and further in the nomicon), but I do not understand how lifetimes are used for type bounds. For example:

trait Min<Elmt> {
    fn get_min(&self) -> Elmt;
}

impl<T, Elmt> Min<Elmt> for T
where
    &T: IntoIterator<Item = Elmt>,
    Elmt: Ord,
{
    fn get_min(&self) -> Elmt {
        let mut iter = self.into_iter();
        match iter.next() {
            None => panic!("Empty List"),
            Some(x) => iter.fold(x, |acc, y| acc.min(y)),
        }
    }
}

Here I require the type T to fulfill the following property: any reference to it must implement IntoIterator with an Item type that implements Ord. This code does not compile because Rust wants &T to have a lifetime, and I cannot figure out why. The where clause should only ensure that something is of a certain type, which is checked at compile time. How do lifetimes/references come into this? Ensuring that &T of an implementing type T implements IntoIterator has nothing to do with dangling pointers or memory leaks.

Someone from the Rust Discord helped me by making this working implementation:

trait Min<'a, Elmt> {
    fn get_min(&'a self) -> Elmt;
}

impl<'a, T, Elmt> Min<'a, Elmt> for T
where
    &'a T: IntoIterator<Item = Elmt> + 'a,
    Elmt: Ord,
{
    fn get_min(&'a self) -> Elmt {
        let mut iter = self.into_iter();
        match iter.next() {
            None => panic!("Empty List"),
            Some(x) => iter.fold(x, |acc, y| acc.min(y)),
        }
    }
}

(Playground variant)

They tried to explain it to me, but I could not understand, so I am looking for another explanation to find its way into my brain.

Let me go through it one by one:

  1. &'a T: IntoIterator<Item = Elmt> + 'a, we give the generic reference &T a lifetime 'a. Why is this needed?
  2. Why do we have to restrict &'a T to the lifetime 'a? I mean isn't that kind of a tautology?

    Since we used a lifetime in the where clause, we have to add it to some signature. That is where impl<'a, T, Elmt> comes from - fine

  3. Why do the trait Min and the get_min() function need a lifetime parameter now? Lifetime elision should give every input element a lifetime, and the output does not have a lifetime anyway.

I think it might help if I knew what traits are/how they are implemented. Objects are basically a list of variable and method pointers. Do the methods from traits simply get appended to the method pointers of every implementing object? Are they separate objects somehow? Do they actually have a lifetime? Which lifetime? I am never assigning the "general Min trait" by using let a = Trait::new() syntax, so how can it have a lifetime?


Addressing the for<>-implementation comments:

While the trait and implementation would compile with a for<'a> lifetime bound on &T, the result is not usable, since &Vec apparently does not fulfill the trait bounds then (cf. this playground). This was also brought up in the Discord discussion, and I also do not understand the for<> syntax yet, but I thought that this would maybe be out of scope for this question.

Evangelize answered 13/5, 2020 at 19:50 Comment(18)
That may not even be the right solution. See How do I write the lifetimes for references in a type constraint when one of them is a local reference? and How does for<> syntax differ from a regular lifetime bound?.Consign
I think your question is answered by the answers of Why are explicit lifetimes needed in Rust?. If you agree, we can mark this question as already answered.Consign
TL;DR the proposed duplicate: because explicit is usually better for understanding.Consign
Okay no, "why are explicit lifetimes needed in Rust" is explicitly NOT answering my question. I do understand the point of them for function parameters/outputs and structs (as said in the question). But I do not understand them in the context of traits and type restrictionsEvangelize
The first comment is orthogonal to the thrust of this question (only matters for the solution you got) , and it may not even be applicable. The second comment is much more relevant.Consign
See also Why is adding a lifetime to a trait with the plus operator (Iterator<Item = &Foo> + 'a) needed?Consign
@Consign that particular + 'a seems unrelated to this, since there Box actually defines the lifetime as 'static by default. In this case we literally name the lifetime of &T 'a, but for some reason have to restrict &T to the lifetime which it already hasEvangelize
I think you are getting confused because you're trying to write code that is confusing, not because of anything to do with lifetimes or traits in general. In particular, it is not at all obvious to me why Elmt should be a parameter to Min. If you allow Elmt to be an associated type instead, it's much easier to write Min in a way that makes your main compile.Sayers
@trentcl well, the trait was not my idea, it is the trait from the statrs crate: github.com/boxtown/statrs/blob/master/src/statistics/traits.rs. If you search for statistical crates, you would likely come across this: reddit.com/r/rust/comments/5qv11u/… where 3 packages (statrs, statistical, streaming-stats) are named, of which statrs has the most downloads. So in some sense it is the default statistical crate for rust. I just wanted to play with it and maybe try to extend it. Right now I do not want to obtain working code, but understand itEvangelize
@trentcl but on that note: what is the difference between trait<T> and trait { type Elmt...? I am guessing that the first version would result in multiple versions of the trait, so you could implement Min<i64> for Vec<i64> as well as Min<f64> for Vec<i64>, while the second variant would have you chose the output type when implementing Min once and for all. Correct?Evangelize
Yes, in a sense generics (trait Min<Elmt>) are input types and associated types (trait Min { type Elmt; }) are output types. There is another Q&A on this topic: When is it appropriate to use an associated type versus a generic type? but the answers may be a bit dense. Your summary is essentially correct.Sayers
Just my humble observation: the main reason for the issue is that "something" of type Elmt is coming out of a shared reference (fn get_min(&self) -> Elmt), and that is what Rust is worrying about: Elmt could have the lifetime of &self (like the 'a lifetime) as well as a 'static lifetime... It seems a trait has a 'static lifetime by default, while it would be 'a for a function, so it is confusing to elide. And there is no problem if self moves into get_min without a reference, like get_min(self): then Rust doesn't worry that Elmt outlives self.Birgit
@AlexanderFadeev that sounds plausible, but I am not sure why it is related to &T. If you remove the & from the &T (i.e. implement it for types which implement IntoIter, instead of for types whose references implement IntoIter) Rust does not complain about lifetimes. But the &self and Elmt issue stays.Evangelize
@FelixB. Yeah, because if self moves its whole life into get_min, there is no reason to control its lifetime: it is going to be dead in any case. So whatever comes out of the function will be legitimate: either something newly created inside the function, or a part of self.Birgit
@AlexanderFadeev you could do that without changing &self to self, cf. play.rust-lang.org/…Evangelize
@FelixB. Oh, this is because you added a Copy trait. Remove it, and you will see that Rust tries to make a *self + copy at the place of into_iter, but it can't. P.S.: with copying it's safe, right, but it differs from your original example.Birgit
@AlexanderFadeev yes I added the Copy trait since I wanted it to compile, and then we have a function with an &self in and a (potential reference) Elmt out. So the situation you referred to as problematic in your first comment. But it compiles fine. But if I add a & back at the T it stops compiling: play.rust-lang.org/… That is why I wonder why this is related.Evangelize
@Felix B. As I noticed, when you put T: it tries to deref *self and copy, and when you have &T: it doesn't deref and you have to put a lifetime, and we come back to the initial situation. And how Rust knows that it should get derefed and copied in the first case: I don't really know, though I can guess, but that is another story... :)Birgit

What went wrong

This problem is actually pretty subtle, and I had to break it down step by step to work out what was going wrong.

Let's see why the original fails for Vec<i32>. We're calling the function like this:

let foo = vec![5i32, 6, 12, 0, -55]; // `foo` is a `Vec<i32>`
let res = foo.get_min();

get_min takes an &self parameter, so when it's used as a method, it'll be called on a reference to the given argument (assuming it isn't a reference already). The second line will be desugared by the compiler into this:

let res = <Vec<i32> as Min<???>>::get_min(&foo);

where ??? is the type of the result we're returning. So what is that type? It needs to be the type produced by <&Vec<i32> as IntoIterator>. But what that type is depends on the lifetime of the reference you're iterating over. The problem is that the start of the impl block says impl<T, Elmt> Min<Elmt> for T, and the way Rust's trait syntax works, the angle-bracketed generics immediately after impl are the only things that Rust is allowed to generalise over when determining whether Vec<i32> implements Min.

Let's work out what the generic Elmt should be in Min<Elmt>, for Vec<i32>. A &Vec<i32> does support IntoIterator, and it returns &i32 elements with the same lifetime as the vector itself. (Generally speaking, iterating over a reference to a collection returns references to the elements inside, and this is true even if the elements are Copy; you can ask for copies of the elements inside using .copied() but this isn't done by default.) So the trait we're trying to implement is Min<&i32>.

But this isn't any old &i32 – because we're dealing with a reference, Rust needs to know that the reference will last long enough to be valid. What we need is some specific &i32 that lasts long enough that we can read it from the return value of our actual call to get_min(). So in order to prove that the trait implementation is valid, the compiler needs to know the type of T (which is easy – it's Vec<i32>); the type of Elmt (which is &i32 with some specific lifetime that is valid for the return value of get_min); and the lifetime of self in the call to get_min (which it isn't allowed to take into account because that isn't specified as a generic at the start of the impl block).

Because one of the components is missing, the compiler can't prove the trait implementation valid. An implementation of Min<&'a i32> for some specific lifetime 'a would have to allow its get_min method to be called with any appropriate &self (which by default means a value that is valid for the duration of the call to get_min), but nothing in the trait generics says that the lifetime of the self in the call to get_min has to match the lifetime of the Elmt. Since the IntoIterator implementation on &Vec<i32> can only return &i32s with the same lifetime as the &Vec<i32>, this leads to a lifetime error: the trait isn't expressing the requirement that the lifetime of the self type in get_min and the lifetime of the Elmt match.

The simplest (and best) fix

The underlying problem here is that you are trying to implement a trait Min<&i32> (which returns elements of some specific lifetime) on Vec<i32> (which doesn't have any particular lifetime involved). It makes much more sense to design the trait so that it implements Min<i32> on Vec<i32> and Min<&'a i32> on &'a Vec<i32> – in other words, we match up between the trait generics and the type the trait is implemented on. That implementation looks like this, which is very easy to read and has no explicit lifetimes:

trait Min<Elmt> {
    fn get_min(self) -> Elmt;
}

impl<T, Elmt> Min<Elmt> for T
where
    T: IntoIterator<Item = Elmt>,
    Elmt: Ord,
{
    fn get_min(self) -> Elmt {
        let mut iter = self.into_iter();
        match iter.next() {
            None => panic!("Empty List"),
            Some(x) => iter.fold(x, |acc, y| acc.min(y)),
        }
    }
}

and it works fine on both borrowed and owned types:

fn main() {
    let foo = vec![5i32, 6, 12, 0, -55];

    let res = (&foo).get_min();
    dbg!(res);
    let res2 = foo.get_min();
    dbg!(res2);
}

Here, res is an &i32 that references the vector foo – because we converted an &Vec<i32> into an iterator and received an &i32 iterating over it – and res2 is an i32 that was moved out of the vector foo, because if you call into_iter() on a Vec rather than &Vec, you get an iterator that moves out of the vector.

The only significant change I made was to make get_min take self rather than &self. The reason is that &self is basically just a shortcut for saying "if this trait is defined on a T, this method takes an &T" – but we can get the same result by defining the trait on an &T instead, in which case the syntax to take an &T is self rather than &self. It's common to see self rather than &self on trait methods when the trait is usually meant to be implemented on references, as iterator-based traits are (and this gives the bonus that you get moving/consuming versions of the trait for free).

Why the Min<'a, Elmt> version works

With the Min<'a, Elmt> version, there's one important subtle change in the signature of get_min, which changed from sugar for this:

impl<T, Elmt> … {
    fn get_min<'a>(&'a self) -> Elmt { … }
}

to this:

impl<'a, T, Elmt> … {
   fn get_min(&'a self) -> Elmt { … }
}

In the non-working code, the lifetime of the self parameter is a generic on get_min. That means that, for any implementation of Min, that implementation has to be able to work with any lifetime that's valid for the time that get_min is running.

With the more complicated version, we have a Min<'a, Elmt> that has one specific lifetime, and get_min is now locked to that one specific lifetime. In other words, it can only be called with a reference of that one valid lifetime, and doesn't have to work in other situations.

That's enough to make the code valid because an &'a Vec<i32> is now able to implement Min<'a, Elmt> – the iterator doesn't have to be able to produce an &'a i32 regardless of when get_min is called, it just has to be able to produce an &'a i32 if it's called with an &'a Vec<i32>, and it's able to do that. Hopefully that answers your questions 1 and 3 (why a lifetime parameter is needed).

The rest of the changes are not really fundamental, but just there to make the rest of the code typecheck. For your question 2, you were asking about this line:

&'a T: IntoIterator<Item = Elmt> + 'a,

and why the 'a at the end is needed. The problem is that Rust doesn't know that an &'a T will stay alive for the entire lifetime 'a: although the reference itself will be alive, it's only useful if the thing it references is also alive, and so the compiler needs to guard against the possibility that T might also be a reference, but with a shorter lifetime (which would make it impossible to perform the iteration safely). (The suggestion made by rustc is to have a T: 'a requirement in the impl block, i.e. impl <'a, T: 'a, Elmt> Min<'a, Elmt> for T – this expresses the same requirement more clearly and is probably preferable to the + 'a version.)

Although this version works, it's unnecessarily complicated and less general than it needs to be – I would prefer the get_min(self) example in the previous section, which is much easier to read.

How to make the for<'a> version work

It is also possible to make the code work using the for<'a> syntax (although there is no advantage to doing this, compared to the get_min(self) version). It doesn't work if you try to return an Elmt directly, but the trick here is to make get_min explicitly return a reference:

trait Min<Elmt> {
    fn get_min(&self) -> &Elmt;
}

impl<T: ?Sized, Elmt> Min<Elmt> for T
where
    for<'a> &'a T: IntoIterator<Item = &'a Elmt>,
    Elmt: Ord,
{
    fn get_min(&self) -> &Elmt {
        let mut iter = self.into_iter();
        match iter.next() {
            None => panic!("Empty List"),
            Some(x) => iter.fold(x, |acc, y| acc.min(y)),
        }
    }
}

fn main() {
    let foo = vec![5i32, 6, 12, 0, -55];

    // sugar for `let res = <Vec<i32> as Min<i32>>::get_min(&foo);`
    let res = foo.get_min();
    dbg!(res);
}

Here, for<'a> &'a T: IntoIterator<Item=&'a Elmt> means "for every lifetime 'a, a reference &'a T can be converted into an iterator that produces &'a Elmt". This fact is true of vectors, and we can make use of it in get_min by iterating over our &self to produce items with the same lifetime as self.

It probably didn't work for you because you most likely wrote for<'a> &'a T: IntoIterator<Item=Elmt>. This means "for every lifetime 'a, a reference &'a T can be converted into an iterator that produces Elmt, and it's the same type Elmt for every 'a." This fact is not true for vectors (because for vectors, the lifetime of the produced item depends on the lifetime of the reference being iterated over). As such, Rust refused to compile the code because it required vectors to have a property that they don't actually have.

Smother answered 30/6 at 2:33 Comment(0)
