I'm trying to write a simple tokenizing program, and I want a Scanner struct to have a method that produces a Vec of Tokens holding slices of a source code char array. The scanner also has a slice of the source char array and uses it to create the subslices for each Token. However, the borrow checker won't allow me to scan multiple tokens in a loop, or even to put a single token into a Vec. I think this is because the Tokens hold a slice of the Scanner's slice, which is possibly mutable, instead of a slice of the original char array, which the Scanner doesn't own and cannot mutate. Is there any way to resolve this, or will I have to either remove the char slice from the Scanner struct and pass it in to functions manually, or just allocate a new string for each Token?
One of the problems is fairly simple: the lifetime of self and the return value need to be the same. So just add the 'a lifetime to self, i.e. take &'a mut self.
The other problem is different; I think it may be the one that Polonius solves.
FWIW, a char array is a very unusual thing to use. You really should be using strings.
A problem is on line 51, where you're eliding the lifetime on Token. By default that gets filled with the lifetime of the input references, which in this case is &mut self. Changing that to Token<'a> may fix your problem.
I just tested, and it does.
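For anyone reading along without the original code, here is a minimal sketch of what that fix looks like. Only the names Scanner and Token come from the question; the fields (text, source, pos), the methods (new, next_token, scan_all) and the whitespace-splitting rule are made up for illustration, and it uses &str rather than a char slice.

```rust
// Hypothetical stand-ins for the types in the question.
#[derive(Debug, PartialEq)]
struct Token<'a> {
    text: &'a str,
}

struct Scanner<'a> {
    source: &'a str,
    pos: usize, // byte offset of the next unread character
}

impl<'a> Scanner<'a> {
    fn new(source: &'a str) -> Self {
        Scanner { source, pos: 0 }
    }

    // Writing `Token<'a>` ties the token to the source text. Writing plain
    // `Token` would tie it to the `&mut self` borrow instead, which keeps
    // the scanner mutably borrowed for as long as the token is alive.
    fn next_token(&mut self) -> Option<Token<'a>> {
        // Copy the `&'a str` out of `self` so the slices below borrow from
        // the source, not from `self`.
        let source: &'a str = self.source;
        let rest = &source[self.pos..];

        // Skip leading whitespace (placeholder tokenizing rule).
        let trimmed = rest.trim_start();
        self.pos += rest.len() - trimmed.len();
        if trimmed.is_empty() {
            return None;
        }

        // A token is everything up to the next whitespace character.
        let end = trimmed.find(char::is_whitespace).unwrap_or(trimmed.len());
        self.pos += end;
        Some(Token { text: &trimmed[..end] })
    }

    fn scan_all(&mut self) -> Vec<Token<'a>> {
        let mut tokens = Vec::new();
        while let Some(token) = self.next_token() {
            tokens.push(token);
        }
        tokens
    }
}
```

With this signature, calling next_token in a loop and pushing into a Vec is accepted, and the tokens can outlive the scanner as long as the source string is still around.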
To clarify: [char] is 4 bytes per character (well, per Unicode code point, but that's a whole thing). If you want to parse the encoded bytes, you can use [u8], or if you want Unicode but also to pop off the 'next' character you can use:
let c = input.chars().next()?;
input = &input[c.len_utf8()..];
as a start to recognizing multiple characters then slicing it all off at once. Don't forget to use len_utf8() (unless you know that character is ASCII).
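Wrapped up as a helper, the two lines above might look roughly like this. The function name pop_char and the choice to pass the remaining input as &mut &str are my own assumptions, not something from the original code.

```rust
// Hypothetical helper: removes the first character from `input` and returns
// it, or returns None if the input is empty.
fn pop_char<'a>(input: &mut &'a str) -> Option<char> {
    let s: &'a str = *input;
    let c = s.chars().next()?;
    // Advance by the character's encoded length in bytes, not by 1, so the
    // new slice always starts on a character boundary.
    *input = &s[c.len_utf8()..];
    Some(c)
}
```

For example, starting from "éa", the first call returns 'é' and leaves "a", because 'é' takes two bytes in UTF-8.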
Alternatively, just use input.char_indices() and let it keep track of the position for you.
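A rough sketch of the char_indices() approach, assuming you want to split off a leading run of alphanumeric characters. The function name take_word and the "alphanumeric" rule are just for illustration.

```rust
// Hypothetical helper: splits `input` into its leading run of alphanumeric
// characters and the remainder.
fn take_word(input: &str) -> (&str, &str) {
    // char_indices() yields (byte_offset, char) pairs, so the offset of the
    // first non-matching character is already a valid place to slice.
    let end = input
        .char_indices()
        .find(|&(_, c)| !c.is_alphanumeric())
        .map(|(i, _)| i)
        .unwrap_or(input.len());
    input.split_at(end)
}
```

Since the offsets come from the iterator, there is no need to add up len_utf8() by hand.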
Don't do that. It's a &'a mut Scanner<'a> in disguise, which you never want.
Incidentally, if you add the following to your code:
#![deny(elided_lifetimes_in_paths)]
it will error when lifetime parameters like that on Token are completely elided.
Thanks for the correction. You're right of course.