Separate &'special self lifetime reason?

Hello,

I'm trying to understand the logic behind the lifetime definitions in this piece of the xmas-elf crate, used for a self-referential iterator.

My understanding is that this means that the ElfFile reference stored in the iterator must not outlive the "data" of the ElfFile, which kinda makes sense.

The iterator is created using this method where I'm really confused about the &'b self lifetime.

What exactly is the logic here? I'm hitting issues with this in my code that uses this library, but I'm trying to fully understand how it's supposed to work wrt. lifetimes in general first.

If you look at the definition of the iterator type itself and compare it with the signature of section_iter(), you can see what's going on:

pub struct SectionIter<'b, 'a: 'b> {
    pub file: &'b ElfFile<'a>,
    pub next_index: u16,
}

impl<'a> ElfFile<'a> {
    pub fn section_iter<'b>(&'b self) -> SectionIter<'b, 'a> {
        SectionIter {
            file: self,
            next_index: 0,
        }
    }
}

That 'a: 'b is basically saying that 'a, the lifetime of the file being iterated over, should be at least as long as 'b, the lifetime of *self. This also makes sense: the iterator must not outlive the view it is iterating over, else it would be dangling.

Now why is there a separate lifetime 'b? Why isn't the field defined as just file: &'a ElfFile<'a>?

The reason is that a single lifetime identifier can only designate the exact same lifetime in a given context. Therefore, writing &'a ElfFile<'a> would tie any eventual borrows to the lifetime of the whole ELF file, forever. This would in turn make the iterator useless, because it couldn't then be borrowed mutably more than once, since the first mutable borrow would lock it for its entire lifetime.

So, this whole &'b Foo<'a> where 'a: 'b construct is actually a common idiom for avoiding such an API design mistake, while also ensuring that references and iterators won't be dangling.
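To make the difference concrete, here is a minimal sketch of the same idiom on a made-up type. File, ByteIter, byte_iter and first_two are hypothetical names used only for illustration; they are not part of xmas-elf. Because the iterator yields items tied to 'a rather than 'b, the collected items can outlive the short borrow of the wrapper:

```rust
// Hypothetical stand-in for ElfFile<'a>: a thin view over borrowed bytes.
struct File<'a> {
    data: &'a [u8],
}

// Same shape as SectionIter: a short borrow ('b) of a view over long-lived data ('a).
struct ByteIter<'b, 'a: 'b> {
    file: &'b File<'a>,
    next_index: usize,
}

impl<'a> File<'a> {
    fn byte_iter<'b>(&'b self) -> ByteIter<'b, 'a> {
        ByteIter { file: self, next_index: 0 }
    }
}

impl<'b, 'a: 'b> Iterator for ByteIter<'b, 'a> {
    // Items borrow from the underlying data ('a), not from the File value ('b).
    type Item = &'a u8;

    fn next(&mut self) -> Option<&'a u8> {
        let data: &'a [u8] = self.file.data;
        let item = data.get(self.next_index)?;
        self.next_index += 1;
        Some(item)
    }
}

// The local `file` is dropped at the end of this function, yet the returned
// references stay valid because they point into `data`.
fn first_two(data: &[u8]) -> Vec<&u8> {
    let file = File { data };
    let items: Vec<&u8> = file.byte_iter().take(2).collect();
    items
}
```

If the field were declared as file: &'a File<'a> instead, the borrow of the local file would have to last as long as the yielded items, so a function like first_two should no longer compile.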


As an exercise, try to implement the Display trait in terms of io::Write, by means of writing a wrapper around the formatter, and note what lifetime annotations you need to make it compile:

struct TextIo(&mut fmt::Formatter);

impl io::Write for TextIo {
    ...
}

struct Foo;

impl Display for Foo {
    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        serde_json::to_writer(
            &mut TextIo(formatter),
            "example string",
        ).map_err(
            |_| fmt::Error
        )
    }
}
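For reference, one possible answer to the exercise is sketched below (skip it if you want to try first). It is only a sketch under a few assumptions: serde_json::to_writer is replaced by a plain write_all call so that the example needs nothing beyond std, and the write implementation assumes every chunk it receives is valid UTF-8 on its own. The point is the two separate lifetimes on TextIo; a single-lifetime &'a mut fmt::Formatter<'a> should be rejected inside fmt(), because the short borrow of the formatter cannot be stretched to match the formatter's own lifetime parameter.

```rust
use std::fmt;
use std::io::{self, Write};

// 'a is the short borrow of the formatter; 'b is whatever the formatter
// itself borrows from. Keeping them apart is what makes this usable.
struct TextIo<'a, 'b>(&'a mut fmt::Formatter<'b>);

impl<'a, 'b> Write for TextIo<'a, 'b> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Simplification: assumes each chunk is valid UTF-8 by itself.
        let text = std::str::from_utf8(buf)
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
        self.0
            .write_str(text)
            .map_err(|e| io::Error::new(io::ErrorKind::Other, e))?;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

struct Foo;

impl fmt::Display for Foo {
    fn fmt(&self, formatter: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Stand-in for serde_json::to_writer, to keep the sketch std-only.
        TextIo(formatter)
            .write_all(b"example string")
            .map_err(|_| fmt::Error)
    }
}
```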

Thanks, that makes sense. I'm still stuck on "boxing" these, though.

What I'm trying to achieve is to wrap the ElfFile and SectionIter in my own boxed (trait object) iterator, which provides some additional logic and returns my own types, using ElfFile/SectionIter as the source.

I've come up with something like Rust Playground but I can't figure out how to get this to like the lifetimes.

I suspect my problem lies with the Box<... + 'l> but I'm not quite sure why it's causing a problem here.

Ok, let's handle this. First of all, a few tricks to help diagnose and understand the problem at hand:

  1. I have reordered the type definitions and impls a bit so that they "chain" nicely;

  2. I have replaced the &'lt self shorthand with the full syntax: self: &'lt Self, replacing Self with the actual type whenever possible. This is surprisingly not mentioned or suggested enough in the code and guides out there, and yet it is one of the first things to do: why add the cognitive burden of replacing Self with the type involved "from memory" when the type can just be explicitly spelled out?

  3. When reordering, I have put the impl before the trait definition, since I think it's hard to get a trait definition 100% correct, whereas when implementing for a specific type it is easier to come up with the right signature.

These things yield the following start:

#[derive(Clone)]
pub
struct Section<'data> {
    pub
    src: &'data [u8],
}

pub
struct R1<'data> {
    sections: Vec<Section<'data>>,
}

impl<'data> R1<'data> {
    pub
    fn new (s: Section<'data>)
      -> Self
    {
        R1 {
            sections: vec![s],
        }
    }
}

impl<'data> /* Relocatable<'data> for */ R1<'data> {
    fn sections (self: &'data R1<'data>)
      -> BSI<'data>
   // -> Box<dyn Iterator<Item = Section<'data>> + 'data>
    {
        Box::new(self.sections.iter().cloned())
    }
}

By now, there is a "code smell" that is easier to spot, which @H2CO3 has already talked about: we are having a &'data Stuff<'data> pattern. Let's try to loosen that a bit:

impl<'data> R1<'data> {
    fn sections<'sections> (self: &'sections R1<'data>)
      -> impl 'sections + Iterator<Item = Section<'data>>
    {
        self.sections
            .iter() // impl 'sections + Iterator<Item = &'sections Section<'data>>
            .cloned() // … Item = Section<'data>
    }
}

So, as you can see, it is important not to unnecessarily tie the lifetime of the obtained / yielded items, with the lifetime used during the iteration itself. Remember that iterators are often used as:

//               -------- 'iteration --------     |
let collection = iterable.iterate().collect(); // |
… //                                              | 'items / 'collection
drop(collection) //                               |

So, as you can see, while it may be important to be able to keep access to the original big lifetime for the 'items ('data in your example) so that the collected items can be kept around for as long as the original data can, it is also important to make sure we allow using a shorter lifetime for the iteration.

Now, let's trait-ify the whole thing:

  • impl … becomes Box<dyn …> (with a wrapping Box::new(…) in the function's body);

  • Since the return type of the impl involves two lifetimes, so will that of the trait.

  • This, in turn, implies that the type alias will also involve two lifetimes.

type BSI<'iter, 'data> =
    Box<dyn 'iter + Iterator<Item = Section<'data>>>
;

trait Relocatable<'data> {
    fn sections<'sections> (self: &'sections Self)
      -> BSI<'sections, 'data>
    ;
}

impl<'data> Relocatable<'data> for R1<'data> {
    fn sections<'sections> (self: &'sections R1<'data>)
      -> BSI<'sections, 'data>
    {
        Box::new(self.sections.iter().cloned())
    }
}

This already suffices to make the code compile 🙂
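Putting the pieces together, here is a self-contained sketch of the trait-ified version, using the Section, R1, BSI and Relocatable names from this thread. The collect_sections helper is a hypothetical addition; it only demonstrates that the collected Section<'data> values can outlive both the boxed iterator and the R1 they came from:

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Section<'data> {
    pub src: &'data [u8],
}

pub struct R1<'data> {
    sections: Vec<Section<'data>>,
}

// 'iter: how long the boxed iterator may be used; 'data: how long items live.
type BSI<'iter, 'data> = Box<dyn Iterator<Item = Section<'data>> + 'iter>;

trait Relocatable<'data> {
    fn sections<'sections>(self: &'sections Self) -> BSI<'sections, 'data>;
}

impl<'data> Relocatable<'data> for R1<'data> {
    fn sections<'sections>(self: &'sections R1<'data>) -> BSI<'sections, 'data> {
        Box::new(self.sections.iter().cloned())
    }
}

// Hypothetical helper: `r1` lives only inside this function, but the
// returned sections borrow from `src`, so they remain valid afterwards.
fn collect_sections<'data>(src: &'data [u8]) -> Vec<Section<'data>> {
    let r1 = R1 { sections: vec![Section { src }] };
    let collected: Vec<Section<'data>> = r1.sections().collect();
    collected
}
```

With the earlier &'data R1<'data> signature, collect_sections should fail to compile, since the local r1 cannot be borrowed for all of 'data.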

I'll just add a final nit, regarding:

impl<'data> LinkState<'data> {
    pub
    fn rels<'l> (self: &'l LinkState<'data>)
      -> impl Iterator<
            Item = &'l Box<dyn Relocatable<'data> + 'data>,
        >
    {
        self.relocatables.iter()
    }
}
  • Oh, by the way, I forgot to mention that I had also removed that : 'data bound on 'l. Try to avoid such explicit bounds: when they are used right they only add a bit of obvious stuff, but when misused, they hurt a lot. Remember, the : "operator" on lifetimes must be viewed as ≥. And when having stuff such as &'outer Thing<'inner>, the natural bound is 'inner ≥ 'outer, and thus 'inner : 'outer. This is automagically deduced by Rust 95% of the time, let's say. So, when you go and add the incorrect 'outer ≥ 'inner bound, such as with 'l : 'data, the double inequality yields 'inner == 'outer, and we are back at the &'same Thing<'same> antipattern.

I hope you agree our yielded Items have too much indirection in them.

  • On top of that, the signature is also a bit "lifetime heavy / redundant", so for aesthetic purposes, let's try to reduce it a bit. Generally, when &'lifetime dyn Trait is involved, it implicitly expresses &'lifetime (dyn Trait + 'lifetime), and this is very often the right bound. The main exceptions I can think of are the Any trait and its special relation with 'static (see, for instance, the deprecated Error::cause() vs. the correct Error::source()), and object-safe Clone-like methods.

This hints at Item = &'l dyn Relocatable<'data> being a more idiomatic yielded object type, so let's use it:

impl<'data> LinkState<'data> {
    pub
    fn rels<'rels> (self: &'rels LinkState<'data>)
      -> impl 'rels + Iterator<Item = &'rels dyn Relocatable<'data>>
    {
        self.relocatables
            .iter()
            .map(|ref_to_box: &'rels Box<_>| {
                &**ref_to_box // flatten the indirection
            })
    }
}

  • Note how I've also used a more readable lifetime name, and how, in practice (cf. your main()), this 'rels will be kind of the same as 'sections 🙂
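The same "flatten the Box" trick can be tried in isolation. The following is a minimal sketch with made-up names (Named, Cat, Holder and names are all hypothetical), assuming a container of Box<dyn Named>. The yielded &'h dyn Named is shorthand for &'h (dyn Named + 'h), which is the default object bound described above:

```rust
trait Named {
    fn name(&self) -> String;
}

struct Cat;

impl Named for Cat {
    fn name(&self) -> String {
        "cat".to_string()
    }
}

struct Holder {
    // Box<dyn Named> is shorthand for Box<dyn Named + 'static> here.
    items: Vec<Box<dyn Named>>,
}

impl Holder {
    // Yields &'h dyn Named, i.e. &'h (dyn Named + 'h), instead of &'h Box<...>.
    fn iter<'h>(&'h self) -> impl Iterator<Item = &'h dyn Named> + 'h {
        self.items
            .iter()
            .map(|boxed: &'h Box<dyn Named>| -> &'h dyn Named {
                &**boxed // flatten the indirection
            })
    }
}

// Callers no longer need to know that the items are boxed.
fn names(holder: &Holder) -> Vec<String> {
    holder.iter().map(|item| item.name()).collect()
}
```

The explicit closure signature is there only to make the coercion obvious; in practice it can often be left to inference.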

