Lifetime for slice in struct used in struct

This is a shortened example of what I tried to do. It doesn't make much sense in this shortened form, but I hope it's easier to understand.
What I'm trying to do is store a String, parse it, and keep some slices of that String for easy access. I don't really need the original String, but I wanted to avoid copying the data, and I can't get it to compile.
And no, I can't just store the slices directly in the Sentence struct, because the Words struct is more complex in reality.

struct Words<'a> {
    word: &'a str,
}

struct Sentence<'a> {
    text: String,
    words: Vec<Words<'a>>,
}

impl<'a> Sentence<'a> {
    fn new(text: String) -> Self {
        Sentence {
            text,
            words: Vec::new(),
        }
    }
    
    fn parse(&mut self) {
        let mut pos = 0;
        while let Some(s) = self.text[pos..].find(' ') {
            self.words.push(Words{ word: &self.text[pos..pos+s] });
            pos += s + 1; // skip past the space, otherwise the loop never advances
        }
    }

    fn print_words(&self) {
        for w in &self.words {
            println!("{}", w.word);
        }
    }
}

fn main() {
    let mut s = Sentence::new("This is a test".to_string());
    s.parse();
    s.print_words();
}

Self-referential structs are notoriously difficult to represent in Rust - see this Stack Overflow answer for a good overview of why. You're almost certainly better off restructuring your code to avoid it.
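
One common way to restructure, sketched here under the assumption that the caller can keep the String alive for as long as the parsed result is needed: let `Sentence` borrow the text instead of owning it. The `Word` name and the `parse` and `words` helpers are illustrative, not taken from the original code.

```rust
// Hypothetical restructuring: the text is owned by the caller and
// `Sentence` only borrows from it, so nothing refers to itself.
struct Word<'a> {
    word: &'a str,
}

struct Sentence<'a> {
    words: Vec<Word<'a>>,
}

impl<'a> Sentence<'a> {
    // Borrow the text for 'a and keep slices into it; no data is copied.
    fn parse(text: &'a str) -> Self {
        let words = text
            .split_whitespace()
            .map(|word| Word { word })
            .collect();
        Sentence { words }
    }

    // Hand the stored slices back out, still tied to the original text.
    fn words(&self) -> Vec<&'a str> {
        self.words.iter().map(|w| w.word).collect()
    }
}
```

The trade-off is that the String has to outlive the `Sentence`, so the two can no longer be bundled and moved around as a single value.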

Thanks for your answer.
It's too bad to hear that this is so difficult. In that case I have to rethink my code.

A good workaround for self-referential structs is to use other kinds of "pointers" (in the most general meaning possible). That is, instead of using compile-time checked absolute addresses (Rust references), you can use offsets and indices:

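use std::ops::Index;

#[derive(Clone, Copy)] // needed so `self.text[word.range]` can copy the range out of `&Words`
struct Range {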
    start: usize,
    len: Option<usize>, // None for infinite (or use a custom enum)
}

impl Index<Range> for str {
    type Output = str;

    #[inline]
    fn index (self: &'_ Self, Range { start, len }: Range)
        -> &'_ Self::Output
    {
        let start = &self[start ..];
        if let Some(len) = len {
            &start[.. len]
        } else {
            start
        }
    }
}

struct Words {
    range: Range,
}

struct Sentence {
    text: String,
    words: Vec<Words>,
}

impl Sentence {
    fn new (text: impl Into<String>)
        -> Self
    {
        Sentence {
            text: text.into(),
            words: Vec::new(),
        }
    }
    
    fn parse (self: &'_ mut Self)
    {
        self.words.clear();
        let mut iterator = self.text.char_indices();
        while let Some((start, c)) = iterator.next() {
            if c.is_whitespace() {
                continue;
            }
            let len =
                iterator
                    .by_ref()
                    .find(|&(_, c)| c.is_whitespace())
                    .map(|(i, _)| i - start)
            ;
            self.words.push(Words {
                range: Range { start, len },
            });
        }
    }

    fn print_words (self: &'_ Self)
    {
        for word in &self.words {
            println!("{}", &self.text[word.range]);
        }
    }
}

fn main ()
{
    let mut s = Sentence::new("This is a test");
    s.parse();
    s.print_words();
}

Thanks. That's really great.
While struggling with Rust lifetimes I completely forgot that I can just store the positions.
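
For reference, storing positions does not strictly need a custom range type. Here is a minimal sketch that uses the standard `std::ops::Range<usize>` instead; the `Word` struct, the `span` field and the `word` accessor are made-up names for illustration, and parsing is done eagerly in `new` just to keep it short.

```rust
use std::ops::Range;

struct Word {
    span: Range<usize>, // byte offsets into `Sentence::text`
}

struct Sentence {
    text: String,
    words: Vec<Word>,
}

impl Sentence {
    fn new(text: impl Into<String>) -> Self {
        let text = text.into();
        let mut words = Vec::new();
        let mut start = None; // byte offset where the current word began
        for (i, c) in text.char_indices() {
            match (c.is_whitespace(), start) {
                // First non-whitespace character: a word starts here.
                (false, None) => start = Some(i),
                // Whitespace after a word: the word ends just before `i`.
                (true, Some(s)) => {
                    words.push(Word { span: s..i });
                    start = None;
                }
                _ => {}
            }
        }
        // A word that runs up to the end of the text.
        if let Some(s) = start {
            words.push(Word { span: s..text.len() });
        }
        Sentence { text, words }
    }

    // Resolve a stored span back into a slice of the owned text.
    fn word(&self, index: usize) -> Option<&str> {
        self.words.get(index).map(|w| &self.text[w.span.clone()])
    }
}
```

`Range<usize>` is not `Copy`, hence the `clone()` when indexing. Since only offsets are stored, `Sentence` has no lifetime parameter and can be moved freely.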

@Yandros and @17cupsofcoffee are right: most of the time it is either not worth it or impossible to make it work. Depending on what you are doing with this, I thought this might be helpful: a Playground version that works with a few adjustments, and some hopefully accurate comments.

Do note, though, that it is technically possible to make a self-referential struct; it just "pins" itself in its scope:

struct X<'a>(usize, Option<&'a usize>);
{   //'a
    let mut foo = X(2, None); //Currently of type X::<'?> 
    foo.1 = Some(&foo.0);     //Here we clarify: '? == 'a
    //foo is now of type X::<'a> and borrows from 'a
    //we cannot give anyone outside of 'a access to 
    //foo because then they'd have access to shorter
    //lived data. In other words: foo is borrowing
    //for its own lifetime, and can therefore not be
    //moved.
}
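
A compilable variant of the same idea, as a rough sketch: the wrapper function `self_borrow_demo` is a made-up name, and the commented-out line is what I would expect the borrow checker to reject.

```rust
struct X<'a>(usize, Option<&'a usize>);

fn self_borrow_demo() -> usize {
    let mut foo = X(2, None);
    // From here on, `foo` holds a shared borrow of its own first field.
    foo.1 = Some(&foo.0);

    // Shared reads are still fine, both directly and through the reference.
    let via_reference = *foo.1.unwrap();
    assert_eq!(foo.0, via_reference);

    // Uncommenting the next line should fail to compile: moving `foo`
    // would invalidate the reference stored in `foo.1`.
    // let moved = foo;

    via_reference
}
```

Returning the plain `usize` works because it is copied out; returning `foo` itself would be a move out of the scope it borrows from.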

Great example to understand some lifetime problems