Making a reference/slice refer to the original?

I'm trying to write a simple tokenizing program, and I want a Scanner struct to have a method that produces a Vec of Tokens, each holding a slice of a source code char array. The Scanner also holds a slice of the source char array and uses it to create the subslices for each Token. However, the borrow checker won't let me scan multiple tokens in a loop, or even put a single token into a Vec. I think this is because each Token ends up borrowing from the Scanner's slice, which is possibly mutable, instead of from the original char array, which the Scanner doesn't own and cannot mutate. Is there any way to resolve this, or will I have to either remove the char slice from the Scanner struct and pass it in to functions manually, or just allocate a new string for each Token?

Playground

One of the problems is fairly simple: the lifetime of self and the return value need to be the same. So just add the 'a lifetime to self, i.e. take &'a mut self.

The other problem is different; I think it may be the problem solved by Polonius.

FWIW, a char array is a very unusual thing to use. You really should be using strings.

A problem is on line 51, where you're eliding the lifetime on Token. By default that gets filled with the lifetime of the input references, which in this case is &mut self. Changing that to Token<'a> may fix your problem.

I just tested, and it does.
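For anyone reading without the playground open, here is a minimal sketch of what the Token<'a> fix looks like. The struct layout, the field names (text, source, pos) and the methods next_token and scan_all are assumptions made up for illustration, not the original poster's code; the only part taken from the thread is that the method returns Token<'a> rather than an elided Token.

```rust
/// A token that borrows a sub-slice of the original source array.
#[derive(Debug, PartialEq)]
struct Token<'a> {
    text: &'a [char],
}

/// Hypothetical scanner: a shared slice of the source plus a cursor.
struct Scanner<'a> {
    source: &'a [char],
    pos: usize,
}

impl<'a> Scanner<'a> {
    fn new(source: &'a [char]) -> Self {
        Scanner { source, pos: 0 }
    }

    // Returning `Token<'a>` ties the token to the source data. With a bare
    // `Token`, elision would tie it to the `&mut self` borrow instead, and
    // the scanner would stay mutably borrowed while the token is alive.
    fn next_token(&mut self) -> Option<Token<'a>> {
        // Skip leading whitespace.
        while self.pos < self.source.len() && self.source[self.pos].is_whitespace() {
            self.pos += 1;
        }
        if self.pos >= self.source.len() {
            return None;
        }
        let start = self.pos;
        while self.pos < self.source.len() && !self.source[self.pos].is_whitespace() {
            self.pos += 1;
        }
        // Copy the `&'a [char]` out of `self`, then slice the copy.
        let source: &'a [char] = self.source;
        Some(Token { text: &source[start..self.pos] })
    }

    // Plain `&mut self` is enough here; no `&'a mut self` is needed.
    fn scan_all(&mut self) -> Vec<Token<'a>> {
        let mut tokens = Vec::new();
        while let Some(token) = self.next_token() {
            tokens.push(token);
        }
        tokens
    }
}
```

With this shape, calling scan_all on a Scanner built from a Vec<char> gives back tokens that stay valid for as long as the Vec does, independent of the Scanner itself.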


To clarify: [char] is 4 bytes per character (well, per Unicode code point, but that's a whole thing). If you want to parse the encoded bytes, you can use [u8], or if you want Unicode but also to pop off the 'next' character you can use:

let c = input.chars().next()?;
input = &input[c.len_utf8()..];

This is a starting point for recognizing a run of several characters and then slicing it all off at once. Don't forget to use len_utf8() (unless you know the character is ASCII).
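The pop-one-character idea can be wrapped up roughly like this. The helper names pop_char and take_digits are made up for the example, and digits are just one possible token class; treat it as a sketch rather than a recommended API.

```rust
/// Splits the first character off `input`, returning it and the remainder.
fn pop_char(input: &str) -> Option<(char, &str)> {
    let c = input.chars().next()?;
    // `len_utf8()` gives the character's encoded width in bytes (1 to 4).
    Some((c, &input[c.len_utf8()..]))
}

/// Hypothetical helper: splits a leading run of ASCII digits off `input`.
/// Returns (digits, rest); both are slices of `input`, nothing is allocated.
fn take_digits(input: &str) -> (&str, &str) {
    let mut rest = input;
    let mut len = 0; // byte length of the run recognized so far
    while let Some((c, tail)) = pop_char(rest) {
        if !c.is_ascii_digit() {
            break;
        }
        len += c.len_utf8();
        rest = tail;
    }
    // Slice the whole run off at once.
    input.split_at(len)
}
```

Because len only ever grows by whole characters, the final split_at always lands on a character boundary, even when a multi-byte character follows the digits.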

Alternatively, just use input.char_indices() and let it keep track of the position for you.
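A rough sketch of the char_indices() route, assuming whitespace-separated words are the tokens (the function name words is invented here):

```rust
/// Splits `input` into whitespace-separated words, each a slice of `input`.
fn words(input: &str) -> Vec<&str> {
    let mut tokens = Vec::new();
    let mut start = None; // byte offset where the current word began, if any

    // `char_indices` yields (byte offset, char) pairs, so the offsets can be
    // used directly for slicing without any manual `len_utf8` bookkeeping.
    for (i, c) in input.char_indices() {
        match (c.is_whitespace(), start) {
            // First character of a new word.
            (false, None) => start = Some(i),
            // Whitespace right after a word: emit the word.
            (true, Some(s)) => {
                tokens.push(&input[s..i]);
                start = None;
            }
            _ => {}
        }
    }
    // Emit a trailing word that runs to the end of the input.
    if let Some(s) = start {
        tokens.push(&input[s..]);
    }
    tokens
}
```

The returned slices borrow from input only, so they can be collected into a Vec without any borrow-checker trouble.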


Don't do that (adding 'a to self). It's a &'a mut Scanner<'a> in disguise, which you never want: it keeps the Scanner mutably borrowed for the rest of its existence.

Incidentally, if you add the following to your code:

#![deny(elided_lifetimes_in_paths)]

It will error when lifetime parameters like that on Token are completely elided.
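As a small illustration of what the lint accepts, here is a sketch that applies it to a single module with an outer attribute instead of the crate-level form above. The module tokens and the function first_word are hypothetical names used only for this example.

```rust
// Same lint as above, scoped to one module instead of the whole crate.
#[deny(elided_lifetimes_in_paths)]
mod tokens {
    pub struct Token<'a> {
        pub text: &'a str,
    }

    // Writing `-> Token` here would be rejected by the lint, because the
    // lifetime parameter is left out entirely. `Token<'_>` makes it visible
    // that the token borrows from `input`.
    pub fn first_word(input: &str) -> Token<'_> {
        let end = input.find(' ').unwrap_or(input.len());
        Token { text: &input[..end] }
    }
}
```

On a method that takes &mut self, seeing the Token<'_> spelled out is what makes it obvious that the token borrows from self, and that Token<'a> is probably what was meant.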


Thanks for the correction. You're right of course.
