Striving to implement an Iterator to read some text-file - totally lost with annotations and lifetimes

I’d like to implement an Iterator that reads a text file line by line, examines the lines, modifies them as needed and returns slices that are not owned. Although I have read the advice that all data must be owned by someone in the end, I thought such an implementation should be possible if one knows how to handle lifetimes and annotations.

If I’m wrong, please tell me, because I’m totally lost: my first attempt does no IO at all and still won’t compile:

Is it worth amending this?

// rustc 1.88.0 (6b00bc388 2025-06-23)

use std::fs::File;
use std::io::{BufReader, Error};
use std::path::Path;

fn main() {
    let tfuf = TextfileUnfolder::new(Path::new("../iCalendaR/radicale/calendar_many.ics"), 1024).unwrap();
    assert_eq!(&tfuf.next().unwrap(), "Hello");
}

struct TextfileUnfolder<'a> {
    file_name: &'a str,
    reader: BufReader<File>,
    buffer: String,
    lines_read: usize,
}

impl TextfileUnfolder<'_> {
    pub fn new<'a>(p: &'a Path, init_buffer_size: usize) -> Result<TextfileUnfolder<'a>, Error> {
        Ok(TextfileUnfolder {
            file_name: p.file_name().unwrap().to_str().unwrap(),
            reader: BufReader::new(File::open(p)?),
            buffer: String::with_capacity(init_buffer_size),
            lines_read: 0,
        })
    }
}

impl<'a> Iterator for &'a TextfileUnfolder<'_> {
    type Item = &'a str;
    fn next(&mut self) -> Option<Self::Item> {
        self.lines_read += 1;
        Some("Hello")
    }
}
This seems to compile.

use std::fs::File;
use std::io::{BufReader, Error};
use std::path::Path;
fn main() {
    let mut tfuf = TextfileUnfolder::new(Path::new("../iCalendaR/radicale/calendar_many.ics"), 1024).unwrap();
    assert_eq!((&mut tfuf).next().unwrap(), "Hello");
}
struct TextfileUnfolder<'a> {
    file_name: &'a str,
    reader: BufReader<File>,
    buffer: String,
    lines_read: usize,
}
impl TextfileUnfolder<'_> {
    pub fn new<'a>(p: &'a Path, init_buffer_size: usize) -> Result<TextfileUnfolder<'a>, Error> {
        Ok(TextfileUnfolder {
            file_name: p.file_name().unwrap().to_str().unwrap(),
            reader: BufReader::new(File::open(p)?),
            buffer: String::with_capacity(init_buffer_size),
            lines_read: 0,
        })
    }
}
impl<'a> Iterator for &'a mut TextfileUnfolder<'_> {
    type Item = &'a str;
    fn next(&mut self) -> Option<Self::Item> {
        self.lines_read += 1;
        Some("Hello")
    }
}

So if next requires &mut self, you cannot call it through &tfuf; it has to be &mut tfuf, since the inner state changes.

Also, I suspect the expression needs parentheses: not &tfuf but (&tfuf), which ends up as (&mut tfuf) in the code as I changed it.


Iterator can only be implemented if all the items can be held simultaneously, which is not possible when the items are borrows of a buffer that gets overwritten every time you read a new line.

For that pattern you need a "lending iterator". There's no trait for that in std, though there are some crates. Or just write an inherent method:

fn next(&mut self) -> Option<&str> { ... }

(The error in your playground can be fixed by implementing for TextfileUnfolder<'a> instead, but that won't help with your ultimate goal.)
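A minimal sketch of that inherent-method approach, in case it helps. The name LineLender, the lines_read getter and the io::Result wrapper around the &str are assumptions made for illustration (the suggestion above returns a plain Option<&str>), and a Cursor stands in for the real file:

```rust
use std::io::{self, BufRead, Cursor};

/// Hypothetical reader that reuses one String buffer for every line.
struct LineLender<R> {
    reader: R,
    buffer: String,
    lines_read: usize,
}

impl<R: BufRead> LineLender<R> {
    fn new(reader: R, init_buffer_size: usize) -> Self {
        LineLender {
            reader,
            buffer: String::with_capacity(init_buffer_size),
            lines_read: 0,
        }
    }

    /// Inherent method, not `Iterator::next`: the returned `&str` borrows
    /// from `self`, so it has to be released before the next call.
    fn next(&mut self) -> Option<io::Result<&str>> {
        self.buffer.clear(); // overwrite the previous line
        match self.reader.read_line(&mut self.buffer) {
            Ok(0) => None, // end of input
            Ok(_) => {
                self.lines_read += 1;
                // Strip the line terminator before lending the slice out.
                Some(Ok(self.buffer.trim_end_matches(&['\r', '\n'][..])))
            }
            Err(e) => Some(Err(e)),
        }
    }

    fn lines_read(&self) -> usize {
        self.lines_read
    }
}

fn main() -> io::Result<()> {
    // In-memory stand-in for `BufReader::new(File::open(path)?)`.
    let input = Cursor::new("BEGIN:VCALENDAR\r\nEND:VCALENDAR\r\n");
    let mut lender = LineLender::new(input, 1024);
    while let Some(line) = lender.next() {
        println!("{}", line?);
    }
    println!("lines read: {}", lender.lines_read());
    Ok(())
}
```

A while let loop works here because each borrowed line is dropped before next is called again; a for loop would not, since for needs the std Iterator trait.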


Thank you for your gentle help.

To get going, I dropped the lending-iterator approach and turned to a pretty simple way: loading the entire text file into a String (now all data is present and owned) and iterating over it, passing out &strs.

The beginner’s lifetime curse returns with this:

use std::fs::read_to_string;
use std::io::Error;
use std::path::Path;
fn main() {
    let mut tfuf = TextfileUnfolder::new(Path::new("../iCalendaR/radicale/calendar_many.ics"), 1024).unwrap();
    assert_eq!((&tfuf).next().unwrap(), "Hello");
}
pub struct TextfileUnfolder {
    file_content: String,
    buffer_size: usize,
}
impl TextfileUnfolder {
    pub fn new(p: &Path, init_buffer_size: usize) -> Result<TextfileUnfolder, Error> {
        Ok(TextfileUnfolder {
            file_content: read_to_string(&p)?,
            buffer_size: init_buffer_size,
        })
    }
}
impl Iterator for TextfileUnfolder {
    type Item = &str;
    fn next(&mut self) -> Option<&str> {
        Some("Hello")
    }
}

I bet I cannot get rid of the Item and instead have to annotate it, next, and impl Iterator for TextfileUnfolder as well. I tried various combinations but still can’t get it to compile. Can somebody help me with this?

You could get the playground to compile with a &'static str as the Item, but it still doesn't get you closer to implementing an iterator over lines.

Let's try taking a step back. The reason why a lending iterator (not the Iterator in std) is called a lending iterator is that it allows the Item<'_> to be a borrow of some data the iterator owns. It allows that by allowing the returned Item<'_> to capture the borrow from the &mut self parameter.

trait LendingIterator {
    type Item<'a> where Self: 'a;
    //              vvvvv------- same lifetime ----------vvvvv
    fn next<'this>(&'this mut self) -> Option<Self::Item<'this>>;
}

This method signature means "uses of the returned value keep Self exclusively borrowed" -- that allows the Item<'_> to have a reference into owned data.
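As a rough illustration, here is one way the trait above could be implemented for a buffer that is overwritten on every call. BufferedLines is a made-up name, the Cursor stands in for a real file, and read errors are simply treated as end of input to keep the sketch short:

```rust
use std::io::{BufRead, Cursor};

trait LendingIterator {
    type Item<'a> where Self: 'a;
    fn next<'this>(&'this mut self) -> Option<Self::Item<'this>>;
}

/// Hypothetical line reader that owns a single reusable buffer.
struct BufferedLines<R> {
    reader: R,
    buffer: String,
}

impl<R: BufRead> LendingIterator for BufferedLines<R> {
    // The item borrows from the iterator itself.
    type Item<'a> = &'a str where Self: 'a;

    fn next<'this>(&'this mut self) -> Option<Self::Item<'this>> {
        self.buffer.clear();
        match self.reader.read_line(&mut self.buffer) {
            // Lend out a slice of the buffer, minus trailing whitespace.
            Ok(n) if n > 0 => Some(self.buffer.trim_end()),
            // Assumption for brevity: EOF and I/O errors both end iteration.
            _ => None,
        }
    }
}

fn main() {
    let mut lines = BufferedLines {
        reader: Cursor::new("first\nsecond\n"),
        buffer: String::new(),
    };
    // Each `line` must be dropped before `next` can be called again.
    while let Some(line) = lines.next() {
        println!("{line}");
    }
}
```

Trying to keep two lines alive at once (for example pushing them into a Vec<&str>) is rejected by the borrow checker, which is exactly the restriction the signature expresses.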

The std Iterator trait has no such lifetime/borrowing relationship in its signature. This is what allows consumers of iterators to

  • Collect all items simultaneously
    • Items don't keep the iterator exclusively borrowed, so you can call next again while holding onto the items
  • Keep items around after dropping the iterator
    • Items don't keep the iterator borrowed at all, so you can drop the iterator while holding onto the items
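Both properties can be seen with a plain std iterator. In this small sketch (collect_fields is a hypothetical helper), the items borrow from the input string only, never from the iterator:

```rust
/// Hypothetical helper: the returned slices borrow from `data`,
/// not from the `Split` iterator created inside the function.
fn collect_fields(data: &str) -> Vec<&str> {
    let mut iter = data.split(',');
    let mut all = Vec::new();
    if let Some(first) = iter.next() {
        // `first` does not keep `iter` borrowed, so `next` can be
        // called again while `first` is still held.
        if let Some(second) = iter.next() {
            all.push(first);
            all.push(second);
        } else {
            all.push(first);
        }
    }
    all.extend(iter); // consumes the iterator
    all
} // the iterator is gone here; the items live on

fn main() {
    let data = String::from("1,2,3");
    let items = collect_fields(&data);
    assert_eq!(items, ["1", "2", "3"]);
}
```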

Now, consider the second point in combination with your data structure:

pub struct TextfileUnfolder {
    file_content: String,
    buffer_size: usize,
}

If this was an Iterator with Items that were &str pointing into file_content, and you held on to those items after you dropped the TextfileUnfolder, you would have dangling references.

So this TextfileUnfolder can't be an Iterator that returns such &strs.


You could instead create an iterator that borrows TextfileUnfolder -- a second, lifetime-carrying struct. In general that's how borrowing iterators work for collections -- search the std docs to find many structs called Iter and IterMut.

In this case, you would basically be recreating Lines.
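A sketch of what such a second struct might look like. UnfolderIter, from_string and iter are names invented for this example, and the line splitting is deliberately simple:

```rust
pub struct TextfileUnfolder {
    file_content: String,
}

/// Hypothetical borrowing iterator, in the style of std's `Iter` structs.
pub struct UnfolderIter<'a> {
    rest: &'a str, // the part of the content not yet handed out
}

impl TextfileUnfolder {
    pub fn from_string(file_content: String) -> Self {
        TextfileUnfolder { file_content }
    }

    /// The iterator borrows `self`; the owner must outlive it.
    pub fn iter(&self) -> UnfolderIter<'_> {
        UnfolderIter { rest: &self.file_content }
    }
}

impl<'a> Iterator for UnfolderIter<'a> {
    // Items carry the lifetime of the owner, not of the iterator.
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        let rest: &'a str = self.rest;
        if rest.is_empty() {
            return None;
        }
        let (line, tail) = match rest.split_once('\n') {
            Some((line, tail)) => (line, tail),
            None => (rest, ""), // last line without a terminator
        };
        self.rest = tail;
        Some(line.strip_suffix('\r').unwrap_or(line))
    }
}

fn main() {
    let unfolder = TextfileUnfolder::from_string(String::from("a\r\nb\nc"));
    // The items outlive the iterator, but not `unfolder`.
    let lines: Vec<&str> = unfolder.iter().collect();
    println!("{lines:?}");
}
```

The owner stays alive for as long as any item is in use, which is what rules out the dangling references described above.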


By this point we've given up on buffering and still haven't reached a satisfying conclusion with regards to Iterator. So for the goals of your OP, I still suggest just going with the lending iterator pattern (with or without a trait, depending on other needs).


Hello, I totally agree that one fine day I should implement some lending iterator.

For now I'm not getting basic concepts right and I'd love to know what's really going on. So I'd like to get an iterator going that can refer to a struct that has all of the data present (as an owned String) - not a solution for production software, I know.

The example declares the next method for std::iter in a different block than new. And there is a new compile error: "the method next exists for mutable reference &mut TextfileUnfolder, but its trait bounds were not satisfied".

This sounds like I need to link the second method next to the struct and its new. I reckon that's not done by moving both methods into the same block labelled impl TextfileUnfolder. How do I tell the compiler that next from iter belongs and refers to the struct TextfileUnfolder?

The error you get is because you implemented the Iterator trait for a shared reference but are trying to call it on a mutable reference, and so it's the wrong type. However, fixing that doesn't work either.

The issue here is that you can't have an Iterator that returns references to data that belongs to the iterator. Your code will compile if you instead only store a reference to the original string in TextfileUnfolder: Rust Playground - doing this means that each of the returned references can have the same lifetime as the original reference: a lifetime that is not related to the iterator.

(but the tests still don't pass because it looks like your logic isn't doing what you wanted)

Looks like you didn't update the Playground link?

Thank you for the hints. Having the TextfileUnfolder not own the String to iterate over is a good idea. But going this way yields new errors - while logic issues still persist.

The first thing I still don't get is the two different impl blocks.

First, new is implemented for any TextfileUnfolder with an arbitrary lifetime (<'_>).

Second, next introduces another lifetime (<'a>) and should be implemented for TextfileUnfolders as well. This next should turn the TextfileUnfolder into something the compiler treats like an Iter that next can be called on.

But the main can not find next in scope?! Even more astonishing: I need to implement fn ne<I> - why is that? Isn't there some default-implementation plus my next never calls some ne itself?

OK, following this (and getting rid of arbitrary lifetimes and logic-failures) the following finally works:

use std::io::Error;
fn main() {
    let data = "1,2,3";
    let mut data_iter = data.split(",");
    while let Some(n) = data_iter.next() {
        println!("{}", &n);
    }
    let mut data_iter = data.split(",");
    assert_eq!(data_iter.next().unwrap(), "1");
    assert_eq!(data_iter.next().unwrap(), "2");
    assert_eq!(data_iter.next().unwrap(), "3");
    let mut textfile_iter = TextfileUnfolder::new(data, 1024, 0).unwrap();
    assert_eq!(textfile_iter.next().unwrap(), "1");
    assert_eq!(textfile_iter.next().unwrap(), "2");
    assert_eq!(textfile_iter.next().unwrap(), "3");
}
pub struct TextfileUnfolder<'a> {
    file_content: &'a str,
    buffer_size: usize,
    pos: usize,
}
impl TextfileUnfolder<'_> {
    pub fn new(p: &str, init_buffer_size: usize, pos: usize) -> Result<TextfileUnfolder, Error> {
        Ok(TextfileUnfolder {
            file_content: p,
            buffer_size: init_buffer_size,
            pos: pos,
        })
    }
}
impl<'a> Iterator for TextfileUnfolder<'a> {
    type Item = &'a str;
    fn next(&mut self) -> Option<&'a str> {
        if self.pos >= self.file_content.len() {
            None
        } else {
            let end = self.pos + 1;
            let result = Some(&self.file_content[self.pos..end]);
            self.pos = end + 1;
            result
        }
    }
}

You're almost there... oh, you just got there, so maybe this is useless.

Do you have any outstanding questions?


(Original reply)

+// The canonical borrowed string data is a
+// &str not a &String
 pub struct TextfileUnfolder<'a> {
-    file_content: &'a String,
+    file_content: &'a str,
     buffer_size: usize,
     pos: usize,
 }
-impl TextfileUnfolder<'_> {
-    pub fn new(p: &'a str, init_buffer_size: usize, pos: usize) -> Result<TextfileUnfolder, Error> {
+impl<'a> TextfileUnfolder<'a> {
+    // Now you can just take the &str directly
+    // (Before this change, who was responsible for dropping the new String?)
+    pub fn new(p: &'a str, init_buffer_size: usize, pos: usize) -> Result<Self, Error> {
         Ok(TextfileUnfolder {
-            file_content: &p.to_string(),
+            file_content: p,
-impl<'a> Iterator for &'a TextfileUnfolder<'_> {
+// Implement for the struct, not references to
+// the struct
+impl<'a> Iterator for TextfileUnfolder<'a> {

Wow, wonderful support, thank you.
One thing to explore is how lifetimes affect resolving names of structs and methods as well. Plus the syntax (these trailing lifetimes <'_> still are a mystery to me). But there is no question about this example any more, the 'a-lifetimes do make some sense now.
Still a long way to go to get a lending iterator ...

Those errors were (in part) due to implementing Iterator for the reference and not the struct, and not due to lifetimes. Method resolution can add one implicit outer reference ("autoref"), but no more -- and you would have needed a &mut &TextfileUnfolder to call next.
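To make the autoref point concrete, here is a small sketch with two made-up counter types: Direct implements Iterator on the struct, ViaRef implements it on a shared reference (using a Cell so it can still count):

```rust
use std::cell::Cell;

/// Hypothetical counter with `Iterator` implemented on the struct itself.
struct Direct {
    n: u32,
}

impl Iterator for Direct {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {
        self.n += 1;
        Some(self.n)
    }
}

/// Hypothetical counter with `Iterator` implemented on `&ViaRef` instead.
struct ViaRef {
    n: Cell<u32>,
}

impl Iterator for &ViaRef {
    type Item = u32;
    // Here `self` has type `&mut &ViaRef`.
    fn next(&mut self) -> Option<u32> {
        self.n.set(self.n.get() + 1);
        Some(self.n.get())
    }
}

fn main() {
    let mut direct = Direct { n: 0 };
    // One implicit `&mut` is enough: this is `Iterator::next(&mut direct)`.
    assert_eq!(direct.next(), Some(1));

    let via_ref = ViaRef { n: Cell::new(0) };
    // `via_ref.next()` is rejected: it would need two references
    // (`&mut &ViaRef`), and autoref only adds one.
    assert_eq!((&via_ref).next(), Some(1));

    // Equivalent: bind the reference first, then let autoref add `&mut`.
    let mut r = &via_ref;
    assert_eq!(r.next(), Some(2));
}
```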

The ne suggestions were just because the names are similar and the compiler was trying to be helpful in case you made a typo or misspelled the name, but that was not actually useful in this case.

On another note: although this was a good exercise, from the perspective of the original goal

I’d like to implement an Iterator that does read a text-file line-wise, examine the read lines, modify them as needed and return some slices

The most practical implementation would be to return String, since modifying the lines would probably require a new String allocation anyhow.

Trying to avoid allocation at any cost can complicate code significantly, and it usually doesn't matter unless you are doing low-level systems programming.

Consider Java, where strings are immutable - many String operations require allocations and nobody cares.
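Returning owned lines could look roughly like this. OwnedLines is a hypothetical wrapper around std's Lines, and trimming trailing whitespace is only a placeholder for whatever modification is actually needed:

```rust
use std::io::{self, BufRead, Cursor, Lines};

/// Hypothetical iterator that yields owned, already modified lines.
struct OwnedLines<R> {
    lines: Lines<R>,
}

impl<R: BufRead> OwnedLines<R> {
    fn new(reader: R) -> Self {
        OwnedLines { lines: reader.lines() }
    }
}

impl<R: BufRead> Iterator for OwnedLines<R> {
    // Owned items: no lifetime ties them to the iterator or a buffer.
    type Item = io::Result<String>;

    fn next(&mut self) -> Option<Self::Item> {
        let line = self.lines.next()?;
        // Placeholder modification; real processing would go here.
        Some(line.map(|l| l.trim_end().to_string()))
    }
}

fn main() -> io::Result<()> {
    // In-memory stand-in for `BufReader::new(File::open(path)?)`.
    let input = Cursor::new("SUMMARY:Test  \r\nEND:VEVENT\n");
    let lines: Vec<String> = OwnedLines::new(input).collect::<io::Result<_>>()?;
    println!("{lines:?}");
    Ok(())
}
```

Because the items are plain Strings, this works with for loops, collect and every other std iterator adapter, at the price of one allocation per line.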
