Argument requires that '1 must outlive 'a

I am trying to learn Rust and understand how it works in depth. For that purpose I tried to create a simple utility struct that could read and parse from stdin, similar to C++'s cin.

Here is the code

use std::str::SplitWhitespace;
use std::str::FromStr;
use std::fmt::Debug;

struct InputReader<'a> {
    buf: String,
    iterator: SplitWhitespace<'a>,
}

impl<'a> InputReader<'a> {
    fn new() -> Self {
        InputReader {
            iterator: "".split_whitespace(),
            buf: String::new(),
        }
    }

    // all these traits were required just for calling .unwrap() :))
    fn parse_string<T>(s: &str) -> T
    where
        T: FromStr + std::fmt::Display + Debug,
        <T as FromStr>::Err: Debug,
    {
        s.parse::<T>().unwrap()
    }

    fn read<T>(&mut self) -> T where
        T: std::str::FromStr + std::fmt::Display + std::fmt::Debug,
        <T as FromStr>::Err: Debug,
    {
        while self.iterator.next() == None {
            // if there are no more words in the current line get another line
            std::io::stdin().read_line(&mut self.buf);
            // save the iterator of the current line
            self.iterator = self.buf.split_whitespace();
        }

        // check if there are any words left in the current line
        match self.iterator.next() {
            Some(s) => {
                // if yes parse the string into the type T
                Self::parse_string::<T>(s)
            }
            None => {
                // this can't happen
                panic!("Bad :(");
            }
        }
    }
}

fn main() {
    let mut reader = InputReader::new();
    let firstInteger = reader.read::<i32>();
    println!("Read value: {}", firstInteger);
}

I have multiple questions regarding this code and Rust in general.

  1. I get this error:
10 | impl<'a> InputReader<'a> {
   |      -- lifetime `'a` defined here
...
26 |     fn read<T>(&mut self) -> T where
   |                - let's call the lifetime of this reference `'1`
...
34 |                 self.iterator = self.buf.split_whitespace();
   |                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^ argument requires that `'1` must outlive `'a`

How should I fix it? I don't really understand what is going on and why the lifetimes of buf and the iterator are different.
As far as I understand, the Rust borrow checker is trying to prevent memory flaws. When would such a memory vulnerability or issue happen here?

  2. How can I know what SplitWhitespace<'_> refers to? Is it trying to borrow the string from which it creates the iterator?

  3. What does '_ actually mean? I did not find anything in the Rust book.

  4. How can I understand the Rust borrow rules so that I don't have to fight the borrow checker anymore? I read the book and watched YouTube videos, but it still feels quite hard for me to understand the compile errors that I get and how to fix them.
    I think I am used to thinking in Java/C++ terms, and maybe applying those patterns to Rust is not possible.

Thank you for taking the time to answer my questions :)

That’s a pattern that Rust doesn’t support, commonly referred to as a “self-referencing” data structure: one field of a struct contains references to another field of the same struct. Rust’s borrow checker has no support for that. Structs that contain references, like your struct InputReader<'a>, are meant to only contain references to data outside of that struct.
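
To illustrate why the borrow checker cares, here is a minimal sketch using a local buffer instead of a struct. The function name take_first_word is made up for this example. It shows that the words produced by split_whitespace() point into the String's own memory, so the String may only be mutated once nothing borrowed from it is still in use:

```rust
/// Returns an owned copy of the first word, then empties the buffer.
/// (Hypothetical helper, only for illustration.)
fn take_first_word(buf: &mut String) -> Option<String> {
    // split_whitespace() borrows `buf`; the &str items it yields point
    // into buf's memory. Copying the word into a new String ends that
    // borrow at the end of this statement.
    let first = buf.split_whitespace().next().map(String::from);

    // Mutation is allowed only because nothing borrowed from `buf` is
    // still in use. Had `first` stayed a `&str` that is used after this
    // call, the compiler would reject it, because clearing or growing the
    // String could invalidate that slice.
    buf.clear();
    first
}
```

A struct that stores both the String and an iterator over it would keep such a borrow alive indefinitely, which is why every later mutation of buf is rejected.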

That means there are generally three approaches to deal with such a pattern:

  • One way is to split up the struct into two. The owned String is left by itself or as part of one struct, and the references, such as those inside the SplitWhitespace iterator, become part of another struct with some kind of fn new(buf: &'a str) -> Struct<'a> method.
  • Another way is to not use references. You can store indices that indicate relevant subslices or the like. This way you can reference parts of one field’s contents from another field in the same struct, without relying on types that involve the borrow checker.
  • The third way is to forgo relying on what’s supported by the language itself and to seek out crates that offer macro solutions for defining self-referencing data structures after all. One such crate is self_cell.
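
A minimal sketch of the first approach, assuming the whole input is already in one string before parsing starts. The names WordReader, next_value and sum_of_first_two are hypothetical and not part of the original code:

```rust
use std::fmt::Debug;
use std::str::{FromStr, SplitWhitespace};

/// Borrows from a buffer that is owned elsewhere, so the struct itself
/// is not self-referencing.
struct WordReader<'a> {
    words: SplitWhitespace<'a>,
}

impl<'a> WordReader<'a> {
    fn new(buf: &'a str) -> Self {
        WordReader { words: buf.split_whitespace() }
    }

    /// Parses the next whitespace-separated word, or returns None when
    /// the buffer is exhausted. Panics if the word does not parse as T.
    fn next_value<T>(&mut self) -> Option<T>
    where
        T: FromStr,
        T::Err: Debug,
    {
        self.words.next().map(|word| word.parse::<T>().unwrap())
    }
}

/// The caller owns the String; the reader only borrows it.
fn sum_of_first_two(input: &str) -> i32 {
    let owned = String::from(input);
    let mut reader = WordReader::new(&owned);
    let a: i32 = reader.next_value().unwrap();
    let b: i32 = reader.next_value().unwrap();
    a + b
}
```

The owned String has to outlive the WordReader and cannot be modified while the reader exists, which is the restriction that makes this approach awkward for reading line by line.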

In your case, I haven’t reasoned through the whole code yet, but it looks like your use case might not fit the first approach of splitting up the data structure, since the point at which your read method releases the borrow of buf and mutates it is determined dynamically. It does seem possible to use the second or third approach, though.

Uf. Reading your message it seems that things are complicated...

The SplitWhitespace iterator is from the standard library, and it makes my life easier because I don't have to store indexes by hand; I can just iterate over the elements.

I don't want to store buf in the InputReader; I only did that to escape another borrow checker error:

while self.iterator.next() == None {
    let mut buf = String::new();
    // if there are no more words in the current line get another line
    std::io::stdin().read_line(&mut buf);

    // save the iterator of the current line
    // (error: buf is dropped at the end of this iteration, so the iterator
    // would keep a reference to something that died)
    self.iterator = buf.split_whitespace();
}

I also checked the documentation to see whether I could make the iterator take ownership of the string, but no luck with that.
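
One way to get owning behaviour, which is not among the approaches discussed in this thread, is to copy each word into its own String and keep those in a queue. The sketch below assumes that panicking at end of input is acceptable. The names TokenReader and pending are made up, and the reader is generic over BufRead only so that it can be tried without stdin:

```rust
use std::collections::VecDeque;
use std::fmt::Debug;
use std::io::BufRead;
use std::str::FromStr;

/// Hypothetical reader that owns its words instead of borrowing them.
struct TokenReader<R> {
    source: R,
    // words of the current line that have not been handed out yet
    pending: VecDeque<String>,
}

impl<R: BufRead> TokenReader<R> {
    fn new(source: R) -> Self {
        TokenReader { source, pending: VecDeque::new() }
    }

    fn read<T>(&mut self) -> T
    where
        T: FromStr,
        T::Err: Debug,
    {
        loop {
            if let Some(word) = self.pending.pop_front() {
                return word.parse::<T>().unwrap();
            }
            // no words left: read another line and queue owned copies
            let mut line = String::new();
            let n = self.source.read_line(&mut line).unwrap();
            assert!(n > 0, "unexpected end of input");
            self.pending.extend(line.split_whitespace().map(String::from));
        }
    }
}
```

With stdin this could be constructed as TokenReader::new(std::io::stdin().lock()). The cost is one small allocation per word, in exchange for having no lifetimes in the struct at all.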

Using an index:

use std::str::FromStr;
use std::fmt::Debug;

struct InputReader {
    buf: String,
    // index of end of section of the input that has been processed
    processed: usize,
}

impl InputReader {
    fn new() -> Self {
        InputReader {
            processed: 0,
            buf: String::new(),
        }
    }

    // all these traits were required just for calling .unwrap() :))
    fn parse_string<T>(s: &str) -> T
    where
        T: FromStr + std::fmt::Display + Debug,
        <T as FromStr>::Err: Debug,
    {
        s.parse::<T>().unwrap()
    }

    fn read<T>(&mut self) -> T where
        T: std::str::FromStr + std::fmt::Display + std::fmt::Debug,
        <T as FromStr>::Err: Debug,
    {
        let s = loop {
            let mut iterator = self.buf[self.processed..].split_whitespace();
            match iterator.next() {
                None => {
                    self.buf.clear();
                    self.processed = 0;
                    // if there are no more words in the current line get another line
                    std::io::stdin().read_line(&mut self.buf).unwrap();
                }
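                Some(s) => {
                    // address of the first byte of buf
                    let start = &self.buf[..] as *const str as *const u8 as usize;
                    // address one past the last byte of the word s
                    let end = &s[s.len()..] as *const str as *const u8 as usize;
                    // byte offset in buf at which the word ends
                    self.processed = end - start;
                    break s;
                }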
            }
        };
        Self::parse_string::<T>(s)
    }
}

fn main() {
    let mut reader = InputReader::new();
    let first_integer = reader.read::<i32>();
    println!("Read value: {}", first_integer);
}

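
The same index idea can also be written without the pointer casts, by computing the word boundaries from string lengths. This is only a sketch under a few assumptions: the reader is made generic over BufRead (with a made-up source field) so it can be tried on in-memory input, and it panics at end of input instead of looping:

```rust
use std::fmt::Debug;
use std::io::BufRead;
use std::str::FromStr;

struct InputReader<R> {
    source: R,
    buf: String,
    // byte index in buf up to which the input has been processed
    processed: usize,
}

impl<R: BufRead> InputReader<R> {
    fn new(source: R) -> Self {
        InputReader { source, buf: String::new(), processed: 0 }
    }

    fn read<T>(&mut self) -> T
    where
        T: FromStr,
        T::Err: Debug,
    {
        loop {
            let rest = &self.buf[self.processed..];
            let trimmed = rest.trim_start();
            if trimmed.is_empty() {
                // current line is used up: discard it and read the next one
                self.buf.clear();
                self.processed = 0;
                let n = self.source.read_line(&mut self.buf).unwrap();
                assert!(n > 0, "unexpected end of input");
                continue;
            }
            // skip the leading whitespace that trim_start() removed
            let start = self.processed + (rest.len() - trimmed.len());
            // the word ends at the next whitespace, or at the end of buf
            let len = trimmed.find(char::is_whitespace).unwrap_or(trimmed.len());
            self.processed = start + len;
            return self.buf[start..start + len].parse::<T>().unwrap();
        }
    }
}
```

For stdin, the source could be std::io::stdin().lock(). Only byte offsets are stored between calls, so the struct needs no lifetime parameter.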

Using self_cell:

/*
[dependencies]
self_cell = "1"
*/

use self_cell::self_cell;

use std::fmt::Debug;
use std::ops::ControlFlow::*;
use std::str::FromStr;
use std::str::SplitWhitespace;

struct InputReader {
    internal: InputReaderInternals,
}

self_cell! {
    struct InputReaderInternals {
        owner: String,
        #[not_covariant]
        dependent: SplitWhitespace,
    }
}

impl InputReader {
    fn new() -> Self {
        InputReader {
            internal: InputReaderInternals::new(String::new(), |s| s.split_whitespace()),
        }
    }

    // all these traits were required just for calling .unwrap() :))
    fn parse_string<T>(s: &str) -> T
    where
        T: FromStr + std::fmt::Display + Debug,
        <T as FromStr>::Err: Debug,
    {
        s.parse::<T>().unwrap()
    }

    fn read<T>(&mut self) -> T
    where
        T: std::str::FromStr + std::fmt::Display + std::fmt::Debug,
        <T as FromStr>::Err: Debug,
    {
        loop {
            if let Break(s) =
                self.internal
                    .with_dependent_mut(|_, iterator| match iterator.next() {
                        None => Continue(()),
                        Some(s) => Break(Self::parse_string::<T>(s)),
                    })
            {
                break s;
            }
            let mut buf = std::mem::replace(self, Self::new()).internal.into_owner();
            buf.clear();
            // if there are no more words in the current line get another line
            std::io::stdin().read_line(&mut buf).unwrap();
            // save the iterator of the current line
            self.internal = InputReaderInternals::new(buf, |s| s.split_whitespace());
        }
    }
}

fn main() {
    let mut reader = InputReader::new();
    let first_integer = reader.read::<i32>();
    println!("Read value: {}", first_integer);
}

Edit: The above approach of using self_cell does incur some overhead: every time buf needs to be modified, the InputReaderInternals struct has to be taken apart and re-created to get mutable access, and constructing a self_cell-based struct involves a memory allocation. Thus the code above probably serves more as an illustrative example, and the index-based solution seems more lightweight… and more straightforward, anyway.

The code examples above also fix two bugs in the original read method:

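First, the condition of the original while loop, self.iterator.next() == None, would eventually have evaluated self.iterator.next() to Some(next_item) and left the while loop; the match below would then have called .next() again and thus skipped one item.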

It’s fixed by using a single loop with match, so that the Some case is directly handled.
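
The skipped item can be reproduced on a plain string. In this small sketch both function names are invented for the comparison; the first mirrors the original control flow and the second the fixed one:

```rust
/// Mimics the original control flow: test with next(), then call next() again.
fn read_like_original(line: &str) -> Option<&str> {
    let mut words = line.split_whitespace();
    // The check calls next() and throws away the word it returned...
    if words.next().is_none() {
        return None;
    }
    // ...so this second call yields the word after it.
    words.next()
}

/// Handles the word in the same match that fetched it.
fn read_fixed(line: &str) -> Option<&str> {
    let mut words = line.split_whitespace();
    match words.next() {
        Some(word) => Some(word),
        None => None,
    }
}
```

For the input "1 2 3", read_like_original returns Some("2") while read_fixed returns Some("1").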

Second, read_line appends to buf rather than overwriting it. The original code would have appended more input to buf and then re-created the iterator to start from the very beginning of buf, so that everything that was already processed would be processed again.

This is fixed by calling buf.clear().
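
The appending behaviour of read_line is easy to check on in-memory input. The helper name words_after_second_read below is hypothetical; it relies on the fact that a byte slice implements BufRead:

```rust
use std::io::BufRead;

/// Reads two lines into the same buffer, optionally clearing it in
/// between, and returns the words found in the buffer afterwards.
fn words_after_second_read(input: &str, clear_between: bool) -> Vec<String> {
    let mut source = input.as_bytes(); // &[u8] implements BufRead
    let mut buf = String::new();
    source.read_line(&mut buf).unwrap();
    if clear_between {
        buf.clear();
    }
    // read_line appends to buf; it does not overwrite the old contents
    source.read_line(&mut buf).unwrap();
    buf.split_whitespace().map(String::from).collect()
}
```

Without the clear, the words of the first line are still in the buffer and would be handed out a second time.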

Another option is to build on top of the language's support for self-reference inside async. I hear that genawaiter is a library that does that (and lets you define iterators that way); I haven't yet used it myself.