Question about lifetimes

I'm trying to understand lifetimes, but I have trouble understanding why the following works.
I have a simple struct:

pub struct XMLReader<B: BufRead> {
    reader: B,
}

And then I have the following implementation:

impl<'a, 'b> XMLReader<&'a [u8]> {
    /// Constructs a new [`XMLReader`].
    pub fn from_str(input: &'a str) -> Self {
        Self {
            reader: input.as_bytes(),
        }
    }

    pub fn read(&self) -> &'b str {
        "HELLO"
    }
}

And here's my main function:

fn main() {
    let mut str = "";

    {
        let reader = XMLReader::from_str("INPUT STRING");
        str = reader.read();
    }

    println!("This is the contents of the reader: {}", str);
}

I don't get why this code works. I thought I would see an error saying that str outlives the XMLReader.
Can anyone explain what's going on?

String literals hard-coded in the source code are compiled as references to immutable global data, so they have the type &'static str. The str variable can outlive reader because its lifetime is bounded by that global data, not by reader.
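A minimal sketch of that point, assuming two hypothetical helpers (`greeting` and `literal_from_inner_scope`) that are not part of the original code: the literal does not borrow from anything created in the inner scope, so it can be used after that scope ends.

```rust
// Hypothetical helper for illustration: a string literal lives in the
// program's static data, so its type is `&'static str`.
fn greeting() -> &'static str {
    "HELLO"
}

// Hypothetical helper: obtains the literal inside an inner scope and
// returns it after that scope has ended.
fn literal_from_inner_scope() -> &'static str {
    let s;
    {
        // `local` is dropped at the end of this block...
        let local = String::from("temporary");
        let _ = local.len();
        // ...but the literal does not borrow from `local`.
        s = greeting();
    }
    s
}

fn main() {
    println!("{}", literal_from_inner_scope());
}
```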

  1. The method as written:

    impl<'a, 'b> XMLReader<&'a [u8]> {
        fn read(&self) -> &'b str {
            "HELLO"
        }
    }
    
  2. Desugaring &self:

    impl<'a, 'b> XMLReader<&'a [u8]> {
        fn read(self: &'_ XMLReader<&'a [u8]>) -> &'b str {
            "HELLO"
        }
    }
    
  3. Uneliding the '_ lifetime:

    impl<'a, 'b> XMLReader<&'a [u8]> {
        fn read<'c>(self: &'c XMLReader<&'a [u8]>) -> &'b str {
            "HELLO"
        }
    }
    
  4. Moving from an outer (impl block) lifetime generic parameter to an inner (fn def) one, since the parameter is otherwise unused:

    impl<'a> XMLReader<&'a [u8]> {
        fn read<'b, 'c> (self: &'c XMLReader<&'a [u8]>) -> &'b str {
            "HELLO"
        }
    }
    
    • Note: this transformation does not exactly showcase the same semantics, but the differences don't matter for the OP's snippet

    Here, we can see that 'b is a generic parameter of the method, and thus chosen by the caller. Nothing constrains it: it is not tied to self, neither to the lifetime 'a of the slice contained in *self nor to the lifetime 'c of the borrow of *self itself.

  5. Thus, there is nothing preventing the caller from picking 'b = 'static (and, FWIW, since the lifetime of a reference can shrink (this is called covariance), a &'static str can in turn be used as a &'b str for any 'b, so the two signatures are interchangeable):

    impl<'a> XMLReader<&'a [u8]> {
        fn read<'c> (self: &'c XMLReader<&'a [u8]>) -> &'static str {
            "HELLO"
        }
    }
    
  6. Eliding 'c again, since it is not used anywhere else:

    impl<'a> XMLReader<&'a [u8]> {
        fn read (self: &'_ XMLReader<&'a [u8]>) -> &'static str {
            "HELLO"
        }
    }
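The steps above can be checked with a small sketch. It reuses the struct and impl from the question; `read_as_static` is a hypothetical helper added only for illustration. Because 'b appears only in the return type, the caller can ask for &'static str even though the reader and its input are locals.

```rust
use std::io::BufRead;

#[allow(dead_code)] // the field is never read in this sketch
pub struct XMLReader<B: BufRead> {
    reader: B,
}

impl<'a, 'b> XMLReader<&'a [u8]> {
    pub fn from_str(input: &'a str) -> Self {
        Self {
            reader: input.as_bytes(),
        }
    }

    // `'b` appears only in the return type, so the caller may pick it freely.
    pub fn read(&self) -> &'b str {
        "HELLO"
    }
}

// Hypothetical helper: the return type forces `'b = 'static`, although
// both `input` and `reader` are dropped when the function returns.
fn read_as_static() -> &'static str {
    let input = String::from("INPUT STRING");
    let reader = XMLReader::from_str(&input);
    reader.read()
}

fn main() {
    println!("{}", read_as_static());
}
```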
    

That is why Rust lets you compile that code. In practice, however, you may choose to return a &'a str if you were borrowing directly from the input bytes. (By the way, giving these lifetimes meaningful names greatly improves readability: let's call 'a 'input_bytes instead, and call 'c either 'read, because it is the lifetime of the borrow of *self created by the .read() call, or 'reader, because the borrowee is the XMLReader itself.)

impl<'input_bytes> XMLReader<&'input_bytes [u8]> {
    /// Constructs a new [`XMLReader`].
    fn from_str (input: &'input_bytes str)
      -> XMLReader<&'input_bytes [u8]>
    {
        Self {
            reader: input.as_bytes(),
        }
    }

    /// Option 1: we borrow from the input bytes _directly_,
    /// never mind the `XMLReader` in the middle.
    fn read<'reader> (self: &'reader XMLReader<&'input_bytes [u8]>)
      -> &'input_bytes str
    {
        "HELLO"
    }

    /// Option 2 (an alternative to Option 1, not to be defined
    /// alongside it): we borrow the `XMLReader` itself:
    fn read<'reader> (self: &'reader XMLReader<&'input_bytes [u8]>)
      -> &'reader str
    {
        "HELLO"
    }
}

Note that the constrained cases are only constrained because the lifetime used, be it 'input_bytes or 'reader, appears in input position of some function at some point. That is what makes picking 'input_bytes = 'static or 'reader = 'static impossible for the caller in general, since:

  • to pick 'input_bytes = 'static, the input bytes themselves would need to be alive for the whole 'static lifetime, i.e., alive forever, which in practice is only true when borrowing a string literal;

  • to pick 'reader = 'static, the XMLReader itself would need to be alive forever as well [1], which in practice is almost never true.

[1] Since 'reader ≤ 'input_bytes (for self: &'reader XMLReader<&'input_bytes [u8]> to be well-formed), having 'reader = 'static, the maximum lifetime, implies that 'input_bytes = 'static.
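The two constrained options can be compared in a short sketch. It makes some assumptions that are not in the original: the methods are given distinct hypothetical names (`read_input` and `read_borrowed`) so both fit in one impl, they return the real input instead of "HELLO", and `outlives_reader` and `length_via_reader` are illustrative helpers.

```rust
use std::io::BufRead;

pub struct XMLReader<B: BufRead> {
    reader: B,
}

impl<'input_bytes> XMLReader<&'input_bytes [u8]> {
    pub fn from_str(input: &'input_bytes str) -> Self {
        Self {
            reader: input.as_bytes(),
        }
    }

    // Option 1 (hypothetical name): the result borrows from the input bytes.
    pub fn read_input(&self) -> &'input_bytes str {
        // The bytes came from a `&str`, so they are valid UTF-8.
        std::str::from_utf8(self.reader).expect("input was a valid str")
    }

    // Option 2 (hypothetical name): the result borrows from the reader itself.
    pub fn read_borrowed<'reader>(&'reader self) -> &'reader str {
        std::str::from_utf8(self.reader).expect("input was a valid str")
    }
}

// Accepted: the result is tied to `input`, not to the local `reader`.
fn outlives_reader(input: &str) -> &str {
    let reader = XMLReader::from_str(input);
    reader.read_input()
    // Returning `reader.read_borrowed()` here instead would be rejected,
    // since the result would borrow from the local `reader`.
}

// With Option 2 the result must be used while `reader` is still alive.
fn length_via_reader(input: &str) -> usize {
    let reader = XMLReader::from_str(input);
    let s = reader.read_borrowed();
    s.len()
}

fn main() {
    println!("{}", outlives_reader("INPUT STRING"));
    println!("{}", length_via_reader("INPUT STRING"));
}
```

With Option 1 the caller gets the longest borrow the data allows. Option 2 is the more conservative signature, and it is what lifetime elision would produce for `fn read(&self) -> &str`.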

Thanks for the very detailed explanation.
