Reusable/borrowed buffer within an `Iterator`

I have been trying to convert a (seemingly) straightforward loop:

let mut input = Input::new();
loop {
  // fn read_line(&mut self) -> &str
  let line = input.read_line();
  if !line.is_empty() { ... }
}

into an `Iterator`, so that it can be written as `for line in input { ... }`.

The simplest/most intuitive way about it seems to (necessarily) involve unsafe.

impl Input {
    pub fn lines(&mut self) -> impl Iterator<Item = &str> {
        struct Lines<'i>(&'i mut Input);
        impl<'i> Iterator for Lines<'i> {
            type Item = &'i str;
            fn next(&mut self) -> Option<Self::Item> {
                match self.0.read_line() {
                    Err(e) => {
                        eprintln!("{e}");
                        None
                    }
                    Ok(line) => {
                        let ptr = line as *const str;
                        // SAFETY: the `&str` references the underlying `&'this mut Input`,
                        // not the local `&mut self`; extending the lifetime
                        // of the reference is safe by definition
                        Some(unsafe { &*ptr })
                    }
                }
            }
        }
        Lines(self)
    }
    pub fn read_line(&mut self) -> Res<&str> {
        self.line.clear();
        self.stdin.read_line(&mut self.line)?;
        Ok(self.line.trim())
    }
    pub fn new() -> Self {
        Self {
            stdin: std::io::stdin(),
            line: String::new(),
        }
    }
}

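// Assumed definitions, not shown in the original post.
use std::io::Stdin;
type Res<T> = std::io::Result<T>;

pub struct Input {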
    stdin: Stdin,
    line: String,
}

The closest safe alternative, on the other hand, appears to involve GATs and wrappers and lending iterators and god knows what else. Is there any safe alternative for this particular case?

Depends on your definition of “safe”, ultimately.

Yup. Because your “iterator” can no longer be used in the following program:

let bwa_ha_ha: &str = input.next().expect("One line");
for line in input {
   …
}
println!("bwa_ha_ha is {bwa_ha_ha}");

Any “normal” iterator has to support such use… where any element yielded by the iterator stays valid to use for as long as the iterator exists. How do you propose to do that?
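For comparison, here is a small sketch of that contract with a sound iterator from the standard library, `str::lines`. The helper name `first_and_rest` is made up for illustration. It works because every item borrows from the underlying text, not from a buffer that the iterator overwrites on each call:

```rust
/// Returns the first line plus all remaining lines.
/// Each item borrows from `text`, not from the iterator itself.
fn first_and_rest(text: &str) -> (&str, Vec<&str>) {
    let mut lines = text.lines();
    // Hold on to the first item while the iterator keeps advancing.
    let first = lines.next().expect("One line");
    let rest: Vec<&str> = lines.collect();
    // `first` is still valid here: no later call to `next` touched its data.
    (first, rest)
}
```

The `Iterator` trait cannot express "this item is only valid until the next call", so callers are free to keep or collect items exactly like this.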

Your implementation doesn't work with that, AFAICS… it's “safe” only in the C/C++ sense: read the documentation, use everything correctly – and your program will work… step aside – and it'll explode.

Well… what do you expect? Landmines are normally planted with unsafe in Rust.

No, this isn’t safe. You should be able to get this to do very weird stuff if you play with it, maybe even crash.

Suppose the caller reads a line and stores the resulting &str in a local variable.

Then the caller calls iter.next(), reading the next line. At best, the value of the first &str will have changed, which is UB. Shared references aren’t supposed to do that in Rust.

If the second line is longer, so that self.stdin.read_line(&mut self.line) grows the String, then the first buffer is freed and a larger one is allocated. Then, when the caller tries to print the first string again, it accesses memory that was freed. This is also UB and typically worse in practice.

Even if the memory is reused, the first &str may end up pointing to an invalid non-UTF-8 byte sequence, which is also UB.
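One safe way to still get a real `Iterator` is to hand out owned `String`s, much as the standard `BufRead::lines` does. The following is only a sketch under assumptions: `OwnedLines` and `collect_lines` are hypothetical names, the reader is generalized from `Stdin` to any `BufRead` so it can be exercised without a terminal, and unlike the original `read_line` it stops at end of input.

```rust
use std::io::BufRead;

/// Hypothetical safe counterpart of `Input::lines`: every item is an owned
/// `String`, so earlier items stay valid while later lines are read.
pub struct OwnedLines<R> {
    reader: R,
}

impl<R: BufRead> Iterator for OwnedLines<R> {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        let mut line = String::new();
        match self.reader.read_line(&mut line) {
            // Zero bytes read means end of input.
            Ok(0) => None,
            // Allocate a fresh trimmed copy for the caller to keep.
            Ok(_) => Some(line.trim().to_owned()),
            // Mirror the original post: report the error and stop.
            Err(e) => {
                eprintln!("{e}");
                None
            }
        }
    }
}

/// Collects every trimmed line from any `BufRead` source.
pub fn collect_lines<R: BufRead>(reader: R) -> Vec<String> {
    OwnedLines { reader }.collect()
}
```

The cost is one allocation per line, which is the price of letting callers keep lines around after the next read.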


Fair points: @khimru and @jorendorff alike. Appreciated the breakdown. Note to self: just because things appear not as terse or concise as they might be, doesn't mean they should be made so.

That's the short-term resolution. Long-term, of course, these “lending iterators, GATs, `Pin`s and other things” are there to make things with buffers inside ergonomic… but it's not easy to make them both sound and ergonomic (in the absence of GC).
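For reference, a minimal lending iterator built on GATs might look like the sketch below. The trait `LendingIterator`, the type `LendingLines` and the helper `total_len` are hypothetical names, none of them part of the standard library, and the reader is again assumed to be any `BufRead` in place of `Stdin`.

```rust
use std::io::BufRead;

/// Hypothetical lending-iterator trait: each item may borrow from the
/// iterator itself, so it must be dropped before the next call to `next`.
pub trait LendingIterator {
    type Item<'a>
    where
        Self: 'a;

    fn next(&mut self) -> Option<Self::Item<'_>>;
}

/// Reads lines into one reusable buffer and lends out a view of it.
pub struct LendingLines<R> {
    reader: R,
    line: String,
}

impl<R: BufRead> LendingLines<R> {
    pub fn new(reader: R) -> Self {
        Self { reader, line: String::new() }
    }
}

impl<R: BufRead> LendingIterator for LendingLines<R> {
    type Item<'a> = &'a str where Self: 'a;

    fn next(&mut self) -> Option<Self::Item<'_>> {
        self.line.clear();
        match self.reader.read_line(&mut self.line) {
            // Stop at end of input or on an I/O error.
            Ok(0) | Err(_) => None,
            // The returned slice borrows `self`, so the borrow checker
            // rejects a second `next` call while it is still in use.
            Ok(_) => Some(self.line.trim()),
        }
    }
}

/// Sums the trimmed line lengths without holding two lines at once.
pub fn total_len<R: BufRead>(reader: R) -> usize {
    let mut lines = LendingLines::new(reader);
    let mut total = 0;
    // `for` requires `Iterator`, so a lending iterator is driven by `while let`.
    while let Some(line) = lines.next() {
        total += line.len();
    }
    total
}
```

This keeps the single reusable buffer and needs no `unsafe`, but it gives up `for` loops and the `Iterator` adapters, which is the ergonomic gap mentioned above.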
