Struct containing CharIndices

To avoid certain unstable APIs, I have been trying to use CharIndices in my lexer for R7RS Scheme. Until now I had managed to sidestep the dreaded lifetime/borrowing issues, but I can no longer run away from my ignorance. I feel like this is the one great struggle I have with Rust, and if I can finally understand this, all will be well. Below is a very simple piece of code that does not compile.

use std::str::CharIndices;

struct Lexer<'a> {
    iter: CharIndices<'a>,
}

impl<'a> Lexer<'a> {

    fn new(input: String) -> Lexer<'a> {
        let ci = input.char_indices();
        Lexer {
            iter: ci
        }
    }
}

fn main() {
    Lexer::new("input string".to_string());
}

I have tried all sorts of things, but none of them compile. My goal is simple: have a lexer that uses an instance of CharIndices to move forward and backward through the input text.

Any tips on where I've gone wrong? I've read the Rust book, the reference, just about every Rust tutorial I could find, as well as the excellent Rust by Example book. I sort of think I know what's going on, but I can't for the life of me find a solution.

Thanks


More than happy to be wrong, but I don't think this is going to work. As written, `input` is dropped at the end of `new`, so the CharIndices borrowing from it cannot be returned. Keeping the String alive would mean storing it in the Lexer next to the iterator, and Rust isn't great at "store a value and a reference to it together". This creates a sort of self-reference that the system doesn't really know how to deal with (and it's not clear there is a way to deal with it). The simplest solution to me would be to change your API to take an &str and not make the Lexer totally self-contained:

use std::str::CharIndices;

struct Lexer<'a> {
    iter: CharIndices<'a>,
}

impl<'a> Lexer<'a> {

    fn new(input: &'a str) -> Lexer<'a> {
        Lexer {
            iter: input.char_indices(),
        }
    }
}

fn main() {
    Lexer::new("input string");
}
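
If the text does start out as a String, the caller can keep owning it and lend the lexer a slice. A minimal sketch, assuming the same Lexer as above; `count_chars` is a hypothetical helper added only to show the borrow in use:

```rust
use std::str::CharIndices;

struct Lexer<'a> {
    iter: CharIndices<'a>,
}

impl<'a> Lexer<'a> {
    fn new(input: &'a str) -> Lexer<'a> {
        Lexer {
            iter: input.char_indices(),
        }
    }
}

// Hypothetical helper (not in the original code): counts the chars
// a freshly built lexer would visit.
fn count_chars(input: &str) -> usize {
    Lexer::new(input).iter.count()
}

fn main() {
    // The caller owns the String and keeps it alive...
    let owned: String = "input string".to_string();
    // ...while the lexer only borrows it (&String coerces to &str).
    let mut lexer = Lexer::new(&owned);
    assert_eq!(lexer.iter.next(), Some((0, 'i')));
    assert_eq!(count_chars(&owned), 12);
}
```

The String has to outlive the Lexer, which the borrow checker enforces through the `'a` lifetime.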

Is there a reason you want to have a full-blown String?

I am willing to accept that this just can't be done. I'll happily use an &str instead. I had tried something similar, but without the lifetime in the parameter list, it wouldn't compile. Thanks for the working sample, now I can move forward.

As for why I had String, it has been there for a while, so I can't remember exactly. I think it may have something to do with the lexer running in a separate thread and emitting tokens via a channel. Probably at the time I couldn't get &str to work and found that using String would get it to compile.
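
For the threaded case, one arrangement that should work with the &str version is to move the String into the thread and create the Lexer there, so the borrow never leaves the thread. This is only a rough sketch: `lex_in_thread` is a made-up name, and the (byte offset, char) pairs stand in for real tokens.

```rust
use std::str::CharIndices;
use std::sync::mpsc;
use std::thread;

struct Lexer<'a> {
    iter: CharIndices<'a>,
}

impl<'a> Lexer<'a> {
    fn new(input: &'a str) -> Lexer<'a> {
        Lexer {
            iter: input.char_indices(),
        }
    }
}

// Hypothetical function: the thread owns `input`, and the Lexer
// borrows it only inside the thread.
fn lex_in_thread(input: String) -> mpsc::Receiver<(usize, char)> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // `input` was moved into the closure, so it outlives `lexer`.
        let lexer = Lexer::new(&input);
        for item in lexer.iter {
            // Stop quietly if the receiving side has gone away.
            if tx.send(item).is_err() {
                break;
            }
        }
    });
    rx
}

fn main() {
    let rx = lex_in_thread("(a)".to_string());
    // The receiver's iterator ends once the sender is dropped.
    let tokens: Vec<(usize, char)> = rx.iter().collect();
    assert_eq!(tokens, vec![(0, '('), (1, 'a'), (2, ')')]);
}
```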

Thanks for your help.

In my lexer for a Rust-like language, I'm also passing it a &str and storing a CharIndices instance:

pub struct Lexer<'a> {
    source: &'a str,
    iter: CharIndices<'a>,
    pos: usize,
    curr: Option<char>,
    lineno: usize
}

(Full source)
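
To give an idea of how those fields can work together, here is a simplified sketch. It is not the code from the linked source: the `new` and `bump` bodies below are assumptions written for illustration.

```rust
use std::str::CharIndices;

pub struct Lexer<'a> {
    source: &'a str,
    iter: CharIndices<'a>,
    pos: usize,
    curr: Option<char>,
    lineno: usize,
}

impl<'a> Lexer<'a> {
    // Assumed constructor: borrows the source and loads the first char.
    pub fn new(source: &'a str) -> Lexer<'a> {
        let mut lexer = Lexer {
            source,
            iter: source.char_indices(),
            pos: 0,
            curr: None,
            lineno: 1,
        };
        lexer.bump();
        lexer
    }

    // Assumed helper: advance by one char, tracking byte position and line.
    fn bump(&mut self) {
        // Leaving a newline means the next char is on a new line.
        if self.curr == Some('\n') {
            self.lineno += 1;
        }
        match self.iter.next() {
            Some((pos, ch)) => {
                self.pos = pos;
                self.curr = Some(ch);
            }
            None => {
                // End of input: park the position just past the last byte.
                self.pos = self.source.len();
                self.curr = None;
            }
        }
    }
}

fn main() {
    let mut lexer = Lexer::new("a\nb");
    assert_eq!((lexer.curr, lexer.pos, lexer.lineno), (Some('a'), 0, 1));
    lexer.bump(); // now on '\n'
    lexer.bump(); // now on 'b', second line
    assert_eq!((lexer.curr, lexer.pos, lexer.lineno), (Some('b'), 2, 2));
}
```

Keeping `source` around next to `iter` is fine here because both are borrows of the same outside string; nothing in the struct owns the text.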

@msiemens Thanks for the tip, that was helpful. I appreciate your sharing your example as well; it has given me more food for thought.