Iterator of replaceable input


#1

Last winter I wrote a syntactic analysis of an input vector of tokens. The input of the iterator is replaceable in order to allow adding lines during unfinished syntactic analysis in a command-line session. I took the following approach:

use std::rc::Rc;

struct Token;

struct TokenIterator {
    index: usize,
    a: Rc<[Token]>
}

fn replace(i: &mut TokenIterator) {
    let v = vec![Token,Token,Token,Token];
    i.a = Rc::from(v);
    i.index = 0;
}

fn proceed(i: &mut TokenIterator) {
    replace(i);
    let p = i.a.clone();
    let t = &p[i.index];
    match t {_ => {}}
}

fn main() {
    let v = vec![Token,Token,Token,Token];
    let mut i = TokenIterator{index: 0, a: Rc::from(v)};
    proceed(&mut i);
}

Now my question is, whether this could be solved more elegantly. I know of the possibility to use a TypedArena. But firstly, TypedArena itself would involve an unsafe block and secondly, the memory in many cases is freed at a later point of time. In practical world this is acceptable, but to me not for the sake of the art.

Could it be made more abstract by taking an iterator ADT? I have some input stream in mind, like a dripping faucet where it’s not known from where the water comes. The problem is, that some variable has to take ownership of the allocated input, as long as the input is aliased.

If a new iterator is created, the iterator must be passed upwards. Thus you might think the iterator could be stored in a struct that is taken as a mutable reference. But the old iterator is aliased in this situation. Thus the old iterator must be moved to waste while being aliased. In the approach above this is exactly what Rc is for.

If no other approach is taken, could it at least be made a little bit more pleasant? Or is it actually possible to make the program faster? In non command-line mode the performance matters. By more abstraction it might be possible to switch some internal function pointers.


#2

Any reason not to inject tokens into the original Vec? Or if the tokens are large and replacement/insertion is frequent, perhaps using a LinkedList.

How many tokens are there typically?

Also, when you mention aliasing, what do you mean exactly? If you create a new iterator, who is keeping the old one alive and do the new and old iterators need to be used at the same time? Maybe you can flesh out the example code a bit more to demonstrate it.


#3

Any reason not to inject tokens into the original Vec?

In my situation the iterator is aliased by immutable reference to other tokens. Changing index would be unproblematic, but the vector needs to stay the same.

Or if the tokens are large and replacement/insertion is frequent, perhaps using a LinkedList.

The tokens are small and there may be many of them. Ok, the size of a token is 18 bytes and there could 40 up to 108 of them (a computer algebra system or a theorem prover could generate that amount of code, if not generating symbolic trees directly).

If you create a new iterator, who is keeping the old one alive and do the new and old iterators need to be used at the same time?

The program refers to tokens while “creating a new iterator”. Thus some struct field or an Rc needs to keep the input vector or the iterator alive. Changing the iterator to new input happens usually when the old one is exhausted, but temporary indirection to a included file would also be interesting.


#4

Or 20 bytes by struct padding.


#5

What do you think? To me this seems to be the canonical approach:

use std::rc::Rc;
use std::ops::Deref;

struct Token{value: u8}

struct TokenIterator {
    index: usize,
    a: Rc<[Token]>
}
impl TokenIterator {
    fn next(&self) -> TokenView {
        return TokenView{index: self.index, a: self.a.clone()};
    }
}

struct TokenView {
    index: usize,
    a: Rc<[Token]>
}
impl Deref for TokenView {
    type Target = Token;
    fn deref(&self) -> &Token {
        return &self.a[self.index];
    }    
}

fn replace(i: &mut TokenIterator) {
    let v = vec![Token{value: 0},Token{value: 1}];
    i.a = Rc::from(v);
    i.index = 0;
}

fn proceed(i: &mut TokenIterator) {
    let t: &Token = &*i.next();
    replace(i);
    match t.value {_ => {}}
}

fn main() {
    let v = vec![Token{value: 0},Token{value: 1}];
    let mut i = TokenIterator{index: 0, a: Rc::from(v)};
    proceed(&mut i);
}

In Mecklenburg it’s at 02:52 a.m. and it started pouring rain.