Iterator: lifetime may not live long enough: method was supposed to return data with lifetime `'a` but it is returning data with lifetime `'1`

I'm trying to build a struct that gradually accumulates lines read from a file, but can also return a reference to each line via an iterator. The idea is that the caller can request the next line whenever it's needed, and only then is the next line read. However, I want to store all the lines in a String for later use. I keep getting stuck on the lifetimes, though. Although my struct holds on to the String for 'a and the iterator returns a reference to the String with lifetime 'a, it seems like the reference only refers to the iterator itself rather than the String. How can I fix this? This is what I have so far:

use std::io::{self, BufRead, Lines};

pub struct SourceHandler<R>
where
    R: BufRead,
{
    lines: Lines<R>,
    source: String,
}

impl<R> SourceHandler<R>
where
    R: BufRead,
{
    pub fn new(lines: Lines<R>) -> Self {
        Self {
            lines,
            source: String::new(),
        }
    }
}

pub struct SourceHandlerIter<'a, R>
where
    R: BufRead,
{
    source_handler: &'a mut SourceHandler<R>,
}

impl<'a, R> Iterator for SourceHandlerIter<'a, R>
where
    R: BufRead,
{
    type Item = Result<&'a str, io::Error>;

    fn next(&mut self) -> Option<Self::Item> {
        let line_start = self.source_handler.source.len();
        let next_line = self.source_handler.lines.next();
        match next_line {
            Some(line) => match line {
                Ok(line) => {
                    self.source_handler.source.push_str(&line);
                    Some(Ok(&self.source_handler.source[line_start..]))
                }
                Err(err) => Some(Err(err)),
            },
            None => None,
        }
    }
}

An instance of an iterator should outlive the references returned for each of its items during the iteration (i.e. every time the next method is called), but you are giving those two concepts the same lifetime 'a:

impl<'a, R> Iterator for SourceHandlerIter<'a, R>
where
    R: BufRead,
{
    type Item = Result<&'a str, io::Error>;

Storing references within structs is a common anti-pattern in Rust. If you don't feel comfortable working with lifetimes, I recommend storing owned types instead of reference types (e.g. String rather than &str).

Make the item type a String, so that next returns Some(Ok(line)).
Also, your source buffer does not include the newline characters.

Returning a reference is too complicated. It would need a fixed-capacity allocation up front and unsafe code to hand out the references.


That's what I don't understand. I'm storing a String, not a reference. I'm only returning a reference to the owned String. Or that's what I'm trying to do.

> Also your source does not include added new lines.

Yes it does.

self.source_handler.source.push_str(&line); <<--

The newline byte is missing: the strings that lines() returns do not include it.
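To make the newline point concrete, here is a small sketch comparing the two std calls. The helper names `via_lines` and `via_read_line` are made up for this illustration, and a byte slice stands in for the file:

```rust
use std::io::BufRead;

/// Collects lines using `BufRead::lines`, which strips the trailing newline.
pub fn via_lines(input: &[u8]) -> Vec<String> {
    input
        .lines()
        .map(|line| line.expect("input should be valid UTF-8"))
        .collect()
}

/// Collects lines using `BufRead::read_line`, which keeps the trailing newline.
pub fn via_read_line(mut input: &[u8]) -> Vec<String> {
    let mut out = Vec::new();
    loop {
        let mut buf = String::new();
        match input.read_line(&mut buf) {
            Ok(0) => break,         // end of input
            Ok(_) => out.push(buf), // `buf` still ends with '\n' if one was read
            Err(err) => panic!("read failed: {err}"),
        }
    }
    out
}
```

So if the accumulated buffer should match the file contents, either push the '\n' yourself after each `lines()` item or read with `read_line` directly.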

The pattern is called a lending iterator, and the solution in stable Rust is via GATs (generic associated types):

use ::lending_iterator::prelude::*;
use std::io::{BufRead, Result};

pub struct SourceHandler<R>
where
    R: BufRead,
{
    reader: R,
    source: String,
}

impl<R> SourceHandler<R>
where
    R: BufRead,
{
    pub fn new(reader: R) -> Self {
        Self {
            reader,
            source: String::new(),
        }
    }
}

pub struct SourceHandlerIter<'a, R>
where
    R: BufRead,
{
    source_handler: &'a mut SourceHandler<R>,
}

#[gat]
impl<'a, R> LendingIterator for SourceHandlerIter<'a, R>
where
    R: BufRead,
{
    type Item<'next> = Result<&'next str> where Self: 'next;

    fn next(&mut self) -> Option<Result<&str>> {
        let source = &mut self.source_handler.source;
        let line_start = source.len();
        match self.source_handler.reader.read_line(source) {
            Ok(0) => None, // end of input
            Err(err) => Some(Err(err)),
            Ok(_) => Some(Ok(source.split_at(line_start).1)), // only the line just appended
        }
    }
}

fn main() {
    let s = b"aaa
        bbb
        ccc";
    let mut source = SourceHandler::new(s.as_slice());
    let iter = SourceHandlerIter {
        source_handler: &mut source,
    };
    iter.for_each(|s| println!("{s:?}"));
    dbg!(source.source);
}
// output:
//Ok("aaa\n")
//Ok("        bbb\n")
//Ok("        ccc")
//[src/main.rs:58] source.source = "aaa\n        bbb\n        ccc"
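If you'd rather not pull in a crate, the same idea can probably be sketched with a hand-written GAT trait. Everything below is an assumption for illustration: this `LendingIterator` trait is a minimal stand-in, not the crate's trait, it is implemented directly on `SourceHandler` instead of a separate iterator struct, and `source()` and `collect_lines` are hypothetical helpers.

```rust
use std::io::{self, BufRead};

/// Minimal hand-written lending iterator trait (a stand-in, not the crate's API).
pub trait LendingIterator {
    type Item<'next>
    where
        Self: 'next;

    /// Each item borrows from `self`, so it must be dropped before the next call.
    fn next(&mut self) -> Option<Self::Item<'_>>;
}

pub struct SourceHandler<R: BufRead> {
    reader: R,
    source: String,
}

impl<R: BufRead> SourceHandler<R> {
    pub fn new(reader: R) -> Self {
        Self { reader, source: String::new() }
    }

    /// Everything read so far.
    pub fn source(&self) -> &str {
        &self.source
    }
}

impl<R: BufRead> LendingIterator for SourceHandler<R> {
    type Item<'next> = io::Result<&'next str> where Self: 'next;

    fn next(&mut self) -> Option<Self::Item<'_>> {
        let line_start = self.source.len();
        match self.reader.read_line(&mut self.source) {
            Ok(0) => None, // end of input
            Ok(_) => Some(Ok(&self.source[line_start..])),
            Err(err) => Some(Err(err)),
        }
    }
}

/// Hypothetical driver: copies each lent line, then returns the full buffer too.
pub fn collect_lines(input: &[u8]) -> (Vec<String>, String) {
    let mut handler = SourceHandler::new(input);
    let mut seen = Vec::new();
    while let Some(line) = handler.next() {
        seen.push(line.expect("reading from a byte slice should not fail").to_owned());
    }
    (seen, handler.source().to_owned())
}
```

Note that a `for` loop won't work here, since `for` only understands the std Iterator trait; `while let` is the usual way to drive a lending iterator.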

Update: Note that this way you're using a separate trait instead of the Iterator trait in std, because the lending pattern is incompatible with it. For more on the lending iterator pattern, I'd recommend the blog post written last month by Niko: Giving, lending, and async closures.

Since you are unconditionally allocating a new String upon each iteration anyway, there's nothing to be gained from returning a reference as the iterator item, instead of just returning the freshly-read String. Just return line by-value after having appended it to the buffer.
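A rough sketch of that by-value approach with the ordinary std Iterator, under a few assumptions of mine: Iterator is implemented directly on `SourceHandler`, a '\n' is re-added to the buffer after each line (since `lines()` strips it), and `source()` and `read_all_owned` are hypothetical helpers.

```rust
use std::io::{self, BufRead, Lines};

pub struct SourceHandler<R: BufRead> {
    lines: Lines<R>,
    source: String,
}

impl<R: BufRead> SourceHandler<R> {
    pub fn new(lines: Lines<R>) -> Self {
        Self { lines, source: String::new() }
    }

    /// Everything read so far.
    pub fn source(&self) -> &str {
        &self.source
    }
}

impl<R: BufRead> Iterator for SourceHandler<R> {
    // Owned items: nothing borrows from the iterator, so no lifetime is needed.
    type Item = io::Result<String>;

    fn next(&mut self) -> Option<Self::Item> {
        let line = self.lines.next()?;
        if let Ok(text) = &line {
            self.source.push_str(text);
            // Assumption: the buffer should keep line breaks, which `lines()` strips.
            self.source.push('\n');
        }
        Some(line)
    }
}

/// Hypothetical driver: drains the handler, then returns the lines and the buffer.
pub fn read_all_owned(input: &[u8]) -> (Vec<String>, String) {
    let mut handler = SourceHandler::new(input.lines());
    let lines: Vec<String> = handler
        .by_ref()
        .map(|line| line.expect("reading from a byte slice should not fail"))
        .collect();
    (lines, handler.source().to_owned())
}
```

The line has already been allocated by `lines()`, so handing it out by value adds no extra allocation; the only copy is the one into the buffer.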


From a compiler point of view, it's as if through self you have access to a &'self mut &'a mut String (where 'self is shorter than 'a), which however you can only reborrow to get a &'self String, but not a &'a String.
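The reborrow rule can be reduced to a tiny sketch. The function names are invented for this example; the commented-out variant is the shape the borrow checker rejects, which mirrors what `next` was trying to do.

```rust
/// Accepted: the returned `&str` is tied to the short outer borrow `'s`.
pub fn reborrow_short<'s, 'a>(outer: &'s mut &'a mut String) -> &'s str {
    outer.as_str()
}

// Rejected by the borrow checker: going through `&'s mut` only allows a
// reborrow that lasts for `'s`, not for the longer `'a`.
//
// pub fn reborrow_long<'s, 'a>(outer: &'s mut &'a mut String) -> &'a str {
//     outer.as_str()
// }

/// Hypothetical usage of the accepted variant.
pub fn demo_reborrow() -> String {
    let mut owned = String::from("line");
    let mut inner: &mut String = &mut owned;
    let view = reborrow_short(&mut inner);
    view.to_owned()
}
```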

From a more practical point of view:

The push_str may reallocate the backing buffer of source, invalidating every outstanding reference to it. However, in the contract of Iterator there's nothing preventing the caller from calling next again while holding a reference to a previous line. Thus, if this were allowed, you would end up with a use-after-free bug.
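As a small illustration of that part of the Iterator contract (the function name is made up), items from earlier calls to next stay usable after later calls, which is also what `collect` relies on:

```rust
/// Holds two items from a std iterator at the same time.
pub fn hold_two_items() -> (i32, i32) {
    let data = vec![10, 20, 30];
    let mut iter = data.iter();

    let first: &i32 = iter.next().expect("has a first element");
    // `first` is still alive here, and calling `next` again is allowed.
    let second: &i32 = iter.next().expect("has a second element");

    // Both references borrow from `data`, not from `iter`, so both remain valid.
    (*first, *second)
}
```

This works for `slice::Iter` because the underlying Vec is never modified during iteration. An iterator that appends to a String on every call cannot make the same promise for references into that String.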
