Rust newbie questions

I have been learning Rust on and off for a while and I have a lot of questions about explicit lifetimes. Some of the answers might be found in the documentation, but I still want to make sure I really get it.

  1. Given that only references, as opposed to objects, can be borrowed in argument passing, does that mean explicit lifetime annotations are only associated with parameters of reference type?

  2. Following from the above, if a function has no reference types in its signature (parameters and return value), then it doesn't need explicit lifetime annotations at all. Right?

  3. A lifetime variable is meaningful only if it creates a relation between two or more values/references. That is, a lifetime variable 'a must associate something with an unknown lifetime to something else with a known lifetime in order to propagate the lifetime information. Something like fn foo<'a>(x: &'a str) -> i32 or fn bar<'a>(x: &str) -> &'a str doesn't convey any useful lifetime information, much like a variable that is defined but never used. Am I correct?

  4. In struct declarations, do the lifetime annotations describe the lifetime relationship between the containing struct object and its fields (of reference type)?

  5. If lifetime annotations create a relation between two references/objects, can I say something like the following:
    a. Define a binary operator ~
    b. A ~ B means A and B belong to the same lifetime (in other words, have the same lifetime value).
    c. Then fn foo<'a>(x: &'a str) -> &'a str can be interpreted as:
    Given that x has an initialized lifetime L denoted by the variable 'a, and the returned str reference has an uninitialized (unknown) lifetime 'b,
    we declare 'a ~ 'b, so the compiler can assign L to both 'a and 'b (in other words, propagate L from 'a to 'b).

This is all I have imagined about lifetime annotations. :smile:
Do I get something wrong?

Meta comment: make sure you understand the lifetime elision rules (see "Lifetime Elision" in The Rustonomicon). This syntactic sugar is not calorie-free: sometimes your function itself compiles fine, but you get lifetime errors at the call site because lifetime elision resulted in the wrong signature.
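To make the elision pitfall concrete, here is a minimal sketch. The names Parser, rest_elided, rest_explicit, whole_input and len_via_elided are made up for illustration. Both methods compile on their own, but only the explicit one can be used by a caller that wants to return the result after the Parser itself is gone.

```rust
// Hypothetical type for illustration: it borrows its input text for 'a.
struct Parser<'a> {
    text: &'a str,
}

impl<'a> Parser<'a> {
    // Elision ties the result to the borrow of `self`:
    // fn rest_elided<'s>(&'s self) -> &'s str
    fn rest_elided(&self) -> &str {
        self.text
    }

    // Explicit annotation ties the result to the underlying text instead.
    fn rest_explicit(&self) -> &'a str {
        self.text
    }
}

// Works: the returned &str only depends on `input`, not on the local `parser`.
fn whole_input(input: &str) -> &str {
    let parser = Parser { text: input };
    // parser.rest_elided() // rejected here (E0515): result would borrow the local `parser`
    parser.rest_explicit()
}

// The elided version is still fine as long as the result stays local.
fn len_via_elided(input: &str) -> usize {
    let parser = Parser { text: input };
    parser.rest_elided().len()
}
```

Note that rest_elided itself never produces an error; the problem only shows up in the caller, which is exactly what makes this kind of mistake confusing.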

  1. Not exactly. You can have a generic type, parametrized over lifetimes:
struct Holder<'a> {
    field: &'a i32
}

// Lifetime elided: the compiler infers the same signature as make_holder_explicit below.
fn make_holder(thing: &i32) -> Holder {
    Holder { field: thing }
}

fn make_holder_explicit<'a>(thing: &'a i32) -> Holder<'a> {
    Holder { field: thing }
}

In some sense, the opposite is true: you can think of &'a T syntax as a sugar for a generic type Reference<'a, T: 'a>.
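As a rough sketch of that mental model (Reference is not a real standard library type, and the field and method names here are invented), a hand-written wrapper could look like this. It only illustrates that the lifetime is an ordinary generic parameter of the type:

```rust
use std::ops::Deref;

// Hypothetical stand-in for what `&'a T` means: a type generic over a lifetime and a type.
struct Reference<'a, T: 'a> {
    target: &'a T,
}

impl<'a, T: 'a> Reference<'a, T> {
    fn new(target: &'a T) -> Self {
        Reference { target }
    }

    // The result is tied to 'a, not to the borrow of the wrapper itself.
    fn get(&self) -> &'a T {
        self.target
    }
}

impl<'a, T: 'a> Deref for Reference<'a, T> {
    type Target = T;

    fn deref(&self) -> &T {
        self.target
    }
}

// The wrapper can be read through just like a plain reference.
fn read_through(r: Reference<'_, i32>) -> i32 {
    *r + 1
}
```

The real &'a T is built into the language, so this is only an analogy, but the way 'a appears in Reference<'a, T> is the same way it appears in Holder<'a>.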

  2. See the example above :slight_smile:

  3. Yes: lifetimes are for relating things to each other. The only lifetime you can get out of thin air is 'static.

  4. Not sure how to answer this question properly. I think the relation is the same as with plain references. Imagine this wrapper around a reference: struct Holder<'a> { r: &'a i32 }. It behaves exactly like &i32.

  5. Yes, this sounds right! Keep in mind though that it is the compiler that decides the lifetime L. That is, if the result of foo is required to live long, then it adds the constraint that the original L must live that long as well.
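Points 3 and 5 can be illustrated with a small sketch (the function names first and unrelated are made up). In first, 'a relates the result to x only, so the compiler lets the result outlive y. In unrelated, the output lifetime is not connected to any input, so the only thing the body can return is 'static data.

```rust
// 'a relates the return value to `x` only; `_y` gets its own, independent lifetime.
fn first<'a>(x: &'a str, _y: &str) -> &'a str {
    x
}

// The output lifetime is unrelated to the input, so the caller may pick any 'a,
// including 'static. The body can therefore only return 'static data.
fn unrelated<'a>(_x: &str) -> &'a str {
    "literal"
}

// Usage sketch: the result of `first` may outlive the second argument.
fn keeps_only_first() -> String {
    let kept = String::from("kept");
    let result;
    {
        let temp = String::from("temp");
        result = first(&kept, &temp);
    } // `temp` is dropped here, but `result` only depends on `kept`
    result.to_string()
}
```

If first instead declared both parameters as &'a str, the call in keeps_only_first would be rejected, because the compiler would have to pick a single lifetime that fits both arguments.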

Thanks for your explanations.

I have to digest this subject carefully again later when I have had better sleep. Yes, I am a little confused.

Regarding a type parametrized over lifetimes, it sounds like a type in Rust is not just a 'scalar' but a tuple in which one of its (two?) dimensions is a lifetime.
I will probably post follow-up questions after cleaning up my thoughts.

(The following is my view toward my experience of learning Rust, a general self-centered ranting from a noob's perspective :wink:)

I really feel like I need to learn the internal data model and representation of lifetimes to fully grasp the idea, rather than just trying to get used to it or to appease the compiler errors (although I haven't encountered too many so far). This reminds me of how I finally learned Git and confidently adopted it without backing up my repo every time I had to use a new combo of commands. I had to read the materials about the things under the hood rather than reciting recipes copied from other people. Even for a much simpler language like Go, it helps a lot to read the blog posts about the internal representation of slices and interfaces. Is there something similar for Rust, especially about the implementation of lifetimes? Hopefully it's not monstrously complicated.

Yes, in a sense, though only references use the extra bit. (I think most call these "kinds").

And you can think of the former as constraining the type spatially (or topologically :slight_smile: ) and the latter as constraining it temporally.

This. (Now that I do have better sleep.)
I think the clarification about Reference<'a, T: 'a> is gold. Syntax sugar is really the enemy for noobs :smile:
It's probably worth adding to the very beginning of the introduction of lifetimes in the docs at which the strange &'a notations suddenly start to appear everywhere.

I am absolutely new to Rust, but have been programming and learning new languages for going on 40 years. Sigh, is it really that long? Anyway, I have been working my way through the Rust book and doing all of the problems, and working on my own as well. In the section on closures there is a construct "Cacher" that implements some basic memoization, and the author puts up the challenge to A) make it more general with additional type parameters, and B) make it more correct by implementing it with a HashMap to cache any new value. I have done so and my implementation works, but I think I am not doing this optimally. The errors I got and read about along the way drove me to doing it this way:

use std::collections::HashMap;
use std::hash;

struct Cacher<T,A,B>
    where T: Fn(A) -> B, A: Eq + hash::Hash
{
    calculation: T,
    value: HashMap<A,B>,
}

impl<T,A,B> Cacher<T,A,B>
    where T: Fn(A) -> B, A: Eq + hash::Hash + Clone, B: Clone
{
    fn new(calculation: T) -> Cacher<T,A,B> {
        Cacher {
            calculation,
            value: HashMap::new(),
        }
    }

    fn value(&mut self, arg: A) -> B {
        if !self.value.contains_key(&arg) {
            let result = (self.calculation)(arg.clone());
            self.value.insert(arg, result.clone());
            result
        } else {
            self.value.get(&arg).unwrap().clone()
        }
    }
}

It seems to me that clone is not the best way but my failure to understand how borrowing works probably drove this. Anyway, my question is "What would be the right way to perform the calculation AND be able to pass the argument into the map? " Every way I tried resulted in the error that the argument was already borrowed in some way. Any help would be appreciated.

Thanks,
BTW whoever wrote the Rust Book online ebook did an outstanding job, I have never seen a programming book so clearly written!

First, I'd change this to use the entry API to avoid double searches into the hashmap:

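impl<T,A,B> Cacher<T,A,B>
    where T: Fn(A) -> B, A: Eq + hash::Hash + Clone, B: Clone
{
    fn value(&mut self, arg: A) -> B {
        // Borrow only the `calculation` field so the closure does not capture all of `self`.
        let calculation = &self.calculation;
        self.value.entry(arg.clone())
            .or_insert_with(|| calculation(arg))
            .clone()
    }
}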

I had to borrow calculation separately because borrows in closures are a bit dumb (at least before the 2021 edition, which taught closures to capture individual fields): if I wrote (self.calculation)(arg) in there, it would have thought the closure borrows all of self instead of just one field.

Removing the clone of arg

In this case the clone of arg is not necessary and is something the caller could easily do on their own if you took T: Fn(&A) -> B instead. But how does one do this with the Entry API?

Unfortunately, we can't use .or_insert_with for this because it consumes the Entry and gives us no way to recover the argument to our computation! Thankfully, if you look at the docs for Entry you'll see there's actually a lot you can do with the output of entry. To eliminate the clone of the key, you'll probably have the easiest time using match:

use std::collections::hash_map::{HashMap, Entry};
use std::hash;

impl<T,A,B> Cacher<T,A,B>
    where T: Fn(&A) -> B, A: Eq + hash::Hash, B: Clone
{
    fn value(&mut self, arg: A) -> B {
        match self.value.entry(arg) {
            Entry::Occupied(entry) => entry.get().clone(),
            Entry::Vacant(entry) => {
                let value = (self.calculation)(entry.key());
                entry.insert(value).clone()
            },
        }
    }
}

A bit trickier: Removing the clone of B

Now one more thing: The user only needs to receive back a borrow of B. But here's where things get a bit tricky. If you simply remove the clone()s and return &B, you'll get

error[E0515]: cannot return value referencing local variable `entry`
  --> src/main.rs:23:39
   |
23 |             Entry::Occupied(entry) => entry.get(),
   |                                       -----^^^^^^
   |                                       |
   |                                       returns a value referencing data owned by the current function
   |                                       `entry` is borrowed here

This is because, if you look at the signatures of your function, HashMap::entry, and OccupiedEntry::get, and take lifetime elision into account on all of them, you'll find that they are:

impl<T, A, B> Cacher<T, A, B> {
    fn value<'a>(&'a mut self, arg: A) -> &'a B;
}

impl<K, V> HashMap<K, V> {
    fn entry<'a>(&'a mut self, key: K) -> Entry<'a, K, V>;
}

// the match produces an OccupiedEntry<'a, K, V> from an Entry<'a, K, V>

impl<'a, K, V> OccupiedEntry<'a, K, V> {
    fn get<'b>(&'b self) -> &'b V;
}

Here the lifetime of the &mut Cacher is encoded in the type of the Entry as a lifetime parameter 'a, but the &V returned by get() does not use this lifetime; it uses the lifetime of a borrow of the entry. The entry is local to the function, resulting in the error message we see.

Looking at the other methods on OccupiedEntry, we see there is one that returns something bound to the 'a lifetime:

impl<'a, K, V> OccupiedEntry<'a, K, V> {
    fn into_mut(self) -> &'a mut V;
}
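The difference between the two signatures can be reproduced outside of HashMap with a small stand-in. This is only a sketch: Slot and escape are hypothetical names, and Slot merely mimics the shape of OccupiedEntry's get and into_mut, not its implementation.

```rust
// Hypothetical stand-in for an entry: it holds a mutable borrow that lives for 'a.
struct Slot<'a> {
    value: &'a mut i32,
}

impl<'a> Slot<'a> {
    // Like OccupiedEntry::get: the result is tied to the borrow of the Slot itself.
    fn get(&self) -> &i32 {
        &*self.value
    }

    // Like OccupiedEntry::into_mut: consumes the Slot and hands back the 'a borrow.
    fn into_mut(self) -> &'a mut i32 {
        self.value
    }
}

fn escape<'a>(target: &'a mut i32) -> &'a i32 {
    let slot = Slot { value: target };
    // `get` is fine for local use...
    let seen = *slot.get();
    debug_assert!(seen == *slot.get());
    // ...but `return slot.get();` would be rejected (E0515), because it borrows the local `slot`.
    // `into_mut` returns &'a mut i32, which coerces to &'a i32 in return position.
    slot.into_mut()
}
```

Because into_mut takes self by value, nothing borrowed from the local slot remains after the call, so the 'a reference can leave the function.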

You can just call into_mut instead; Rust will coerce the &mut B output into a &B and the function will typecheck:

impl<T,A,B> Cacher<T,A,B>
    where T: Fn(&A) -> B, A: Eq + hash::Hash,
{
    fn value(&mut self, arg: A) -> &B {
        match self.value.entry(arg) {
            Entry::Occupied(entry) => entry.into_mut(),
            Entry::Vacant(entry) => {
                let value = (self.calculation)(entry.key());
                entry.insert(value)
            },
        }
    }
}
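For reference, here is how the pieces might fit together as one self-contained sketch. The struct and the value method are taken from the snippets above; the new constructor mirrors the one in the question, and the demo function with its Cell counter is a hypothetical addition to show that the closure only runs once per key.

```rust
use std::cell::Cell;
use std::collections::hash_map::{Entry, HashMap};
use std::hash::Hash;

struct Cacher<T, A, B>
where
    T: Fn(&A) -> B,
    A: Eq + Hash,
{
    calculation: T,
    value: HashMap<A, B>,
}

impl<T, A, B> Cacher<T, A, B>
where
    T: Fn(&A) -> B,
    A: Eq + Hash,
{
    fn new(calculation: T) -> Self {
        Cacher {
            calculation,
            value: HashMap::new(),
        }
    }

    // No Clone bounds needed: the key is moved into the map and the result is borrowed.
    fn value(&mut self, arg: A) -> &B {
        match self.value.entry(arg) {
            Entry::Occupied(entry) => entry.into_mut(),
            Entry::Vacant(entry) => {
                let value = (self.calculation)(entry.key());
                entry.insert(value)
            }
        }
    }
}

// Hypothetical demo: returns (first result, second result, number of closure calls).
fn demo() -> (u64, u64, u32) {
    let calls = Cell::new(0u32);
    let mut cacher = Cacher::new(|n: &u64| {
        calls.set(calls.get() + 1);
        n * n
    });
    let first = *cacher.value(12);
    let second = *cacher.value(12); // served from the cache
    (first, second, calls.get())
}
```

Callers that need an owned B can still clone or copy the returned reference themselves, as demo does with the dereference.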

WOW, ok, and now the borrow is more understandable (not saying I fully understand yet :slight_smile:), but that's a great explanation. I am going to play around with all of this, but I am starting to see how this works. Thanks for taking the time to make such a great response.